assembly-x86

x86-64 Assembly

Purpose

Guide agents through x86-64 assembly: reading compiler output, understanding the ABI, writing inline asm, and common patterns.

Triggers

  • "How do I read the assembly GCC generated?"
  • "What are the x86-64 registers?"
  • "What is the calling convention on Linux/macOS?"
  • "How do I write inline assembly in C?"
  • "How do I use SSE/AVX intrinsics?"
  • "This assembly uses %rsp / %rbp. What does it mean?"

Workflow

1. Generate and read assembly

bash
# AT&T syntax (GCC default)
gcc -S -O2 -fverbose-asm foo.c -o foo.s

# Intel syntax
gcc -S -masm=intel -O2 foo.c -o foo.s

# From GDB
(gdb) disassemble /s main    # with source
(gdb) x/20i $rip

# From objdump
objdump -d -M intel -S prog  # Intel + source (needs -g)
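
To try the commands above, any small C file will do. The sketch below is a hypothetical foo.c; the function name sum_to is an assumption made for illustration and does not come from the original. The exact instructions you see will depend on the compiler version and flags.

```c
/* foo.c: hypothetical input file for the gcc -S commands above. */

/* Sums the integers 1..n. A loop is used so that the generated assembly
 * contains a compare and a conditional jump (at least at -O0). */
long sum_to(long n) {
    long total = 0;
    for (long i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

Generating foo.s once with -O0 and once with -O2 and comparing the two listings is a quick way to see how much the optimizer rewrites even a simple loop.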

2. x86-64 registers

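| 64-bit | 32-bit | 16-bit | 8-bit high | 8-bit low | Purpose |
|---|---|---|---|---|---|
| %rax | %eax | %ax | %ah | %al | Return value / accumulator |
| %rbx | %ebx | %bx | %bh | %bl | Callee-saved |
| %rcx | %ecx | %cx | %ch | %cl | 4th arg / count |
| %rdx | %edx | %dx | %dh | %dl | 3rd arg / 2nd return |
| %rsi | %esi | %si | | %sil | 2nd arg |
| %rdi | %edi | %di | | %dil | 1st arg |
| %rbp | %ebp | %bp | | %bpl | Frame pointer (callee-saved) |
| %rsp | %esp | %sp | | %spl | Stack pointer |
| %r8–%r9 | %r8d–%r9d | %r8w–%r9w | | %r8b–%r9b | 5th and 6th args (caller-saved) |
| %r10–%r11 | %r10d–%r11d | %r10w–%r11w | | %r10b–%r11b | Caller-saved scratch |
| %r12–%r15 | %r12d–%r15d | %r12w–%r15w | | %r12b–%r15b | Callee-saved |
| %rip | | | | | Instruction pointer |
| %rflags | %eflags | | | | Status flags |

| SIMD registers | Purpose |
|---|---|
| %xmm0–%xmm7 | FP/SIMD args and return |
| %xmm8–%xmm15 | Caller-saved SIMD |
| %ymm0–%ymm15 | AVX 256-bit |
| %zmm0–%zmm31 | AVX-512 512-bit |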
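
The sub-register names in the table alias the same physical register, but narrow writes behave differently depending on width. The following is a minimal sketch, assuming GCC or Clang on x86-64 (other targets take a plain C fallback). The helper names clear_low8 and rewrite_low32 are made up for this example; %b0 and %k0 are GCC operand modifiers that print the 8-bit and 32-bit names of operand 0.

```c
#include <stdint.h>

/* Writes 0 through the 8-bit name of the register (for example %al).
 * An 8-bit write leaves the upper 56 bits unchanged. */
static inline uint64_t clear_low8(uint64_t v) {
#if defined(__x86_64__)
    __asm__("movb $0, %b0" : "+r"(v));
#else
    v &= ~(uint64_t)0xFF;          /* portable fallback, same result */
#endif
    return v;
}

/* Copies the 32-bit name of the register onto itself (for example
 * movl %eax, %eax). On x86-64 a 32-bit write zero-extends into the
 * full 64-bit register, so the upper 32 bits become 0. */
static inline uint64_t rewrite_low32(uint64_t v) {
#if defined(__x86_64__)
    __asm__("movl %k0, %k0" : "+r"(v));
#else
    v &= 0xFFFFFFFFu;              /* portable fallback, same result */
#endif
    return v;
}
```

Called with UINT64_MAX, clear_low8 returns 0xFFFFFFFFFFFFFF00 and rewrite_low32 returns 0x00000000FFFFFFFF, which is why compilers can zero a 64-bit register with a 32-bit instruction.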

3. System V AMD64 ABI (Linux, macOS, FreeBSD)

Integer/pointer argument registers (in order): %rdi, %rsi, %rdx, %rcx, %r8, %r9
Floating-point argument registers: %xmm0–%xmm7
Return values:
  • Integer: %rax (low), %rdx (high if 128-bit)
  • Float: %xmm0 (low), %xmm1 (high)
Caller-saved (scratch): %rax, %rcx, %rdx, %rsi, %rdi, %r8–%r11, %xmm0–%xmm15
Callee-saved (must preserve): %rbx, %rbp, %r12–%r15
Stack: %rsp must be 16-byte aligned immediately before call. The call pushes an 8-byte return address, so at function entry %rsp is 8 bytes off a 16-byte boundary; a prologue that pushes %rbp restores 16-byte alignment.
Red zone: the 128 bytes below %rsp may be used by leaf functions without adjusting %rsp. Kernel code cannot rely on it, because an interrupt would overwrite it; kernels are typically built with -mno-red-zone.
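
One way to see the argument and return registers in action is a function written directly against the ABI. The sketch below assumes x86-64 Linux (ELF) with GCC or Clang and falls back to plain C elsewhere; the name sysv_sum6 is hypothetical. It is a leaf function that touches only caller-saved registers, so it needs no prologue.

```c
/* Prototype visible to C callers; the body is supplied below. */
long sysv_sum6(long a, long b, long c, long d, long e, long f);

#if defined(__x86_64__) && defined(__linux__)
/* Top-level (basic) asm: register names are written with a single %. */
__asm__(
    ".pushsection .text\n"
    ".globl sysv_sum6\n"
    ".type sysv_sum6, @function\n"
    "sysv_sum6:\n"
    "    leaq (%rdi,%rsi), %rax\n"   /* rax = 1st arg + 2nd arg   */
    "    addq %rdx, %rax\n"          /* + 3rd arg                 */
    "    addq %rcx, %rax\n"          /* + 4th arg                 */
    "    addq %r8, %rax\n"           /* + 5th arg                 */
    "    addq %r9, %rax\n"           /* + 6th arg                 */
    "    ret\n"                      /* result is returned in rax */
    ".size sysv_sum6, .-sysv_sum6\n"
    ".popsection\n"
);
#else
/* Portable fallback so the example still builds on other targets. */
long sysv_sum6(long a, long b, long c, long d, long e, long f) {
    return a + b + c + d + e + f;
}
#endif
```

A call such as sysv_sum6(1, 2, 3, 4, 5, 6) places the six values in %rdi, %rsi, %rdx, %rcx, %r8, %r9 in that order, and the caller reads the sum back from %rax.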

4. Common instruction patterns

| Pattern | Meaning |
|---|---|
| mov %rdi, %rax | Copy rdi to rax |
| mov (%rdi), %rax | Load 8 bytes from the address in rdi |
| mov %rax, 8(%rdi) | Store rax to the address rdi+8 |
| lea 8(%rdi), %rax | Load effective address rdi+8 into rax (no memory access) |
| push %rbx | Push rbx; rsp -= 8 |
| pop %rbx | Pop into rbx; rsp += 8 |
| call foo | Push return address; jump to foo |
| ret | Pop return address; jump to it |
| xor %eax, %eax | Zero rax (the 32-bit write zero-extends; smaller encoding than mov $0, %rax) |
| test %rax, %rax | Set ZF if rax == 0 (shorter encoding than cmp $0, %rax) |
| cmp $5, %rdi | Set flags for rdi - 5 |
| jl label | Jump if signed less than |
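
Two of the patterns above are easy to misread: lea performs arithmetic without touching memory, and test sets flags without storing a result. The following is a small sketch, assuming GCC or Clang on x86-64 with a C fallback for other targets; the helper names times5 and is_zero are illustrative only.

```c
/* lea as arithmetic: computes x + x*4 with the addressing unit.
 * No memory is accessed and the flags are not modified. */
static inline long times5(long x) {
    long r;
#if defined(__x86_64__)
    __asm__("leaq (%1,%1,4), %0" : "=r"(r) : "r"(x));
#else
    r = x * 5;                     /* portable fallback */
#endif
    return r;
}

/* test + sete: test ANDs x with itself and sets ZF when the result
 * is 0; sete then writes ZF (1 or 0) into a byte register. */
static inline int is_zero(long x) {
    unsigned char z;
#if defined(__x86_64__)
    __asm__("testq %1, %1\n\t"
            "sete %0"
            : "=q"(z)
            : "r"(x)
            : "cc");               /* test modifies the flags */
#else
    z = (x == 0);                  /* portable fallback */
#endif
    return z;
}
```

In compiler output, a lea with a scaled index is often a multiplication by 3, 5, or 9 in disguise rather than an address computation.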

5. AT&T vs Intel syntax

| Feature | AT&T | Intel |
|---|---|---|
| Operand order | source, dest | dest, source |
| Register prefix | %rax | rax |
| Immediate prefix | $42 | 42 |
| Memory operand | 8(%rdi) | [rdi+8] |
| Size suffix | movl, movq | none (inferred) |

GCC emits AT&T by default. Use -masm=intel for Intel syntax.
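
The operand-order difference matters as soon as inline asm has to build under both -masm=att and -masm=intel. GCC's asm templates accept dialect alternatives written as {att|intel}; the sketch below uses that feature for a single add. It assumes GCC or Clang on x86-64 (plain C elsewhere), and the function name add_both_dialects is hypothetical.

```c
/* Adds b to a with one instruction that is valid in either dialect.
 * AT&T expansion:  addq %1, %0    (source, dest, size suffix)
 * Intel expansion: add  %0, %1    (dest, source, no suffix) */
static inline long add_both_dialects(long a, long b) {
#if defined(__x86_64__)
    __asm__("add{q %1, %0| %0, %1}"
            : "+r"(a)              /* read-write destination */
            : "r"(b)               /* source */
            : "cc");               /* add modifies the flags */
#else
    a += b;                        /* portable fallback */
#endif
    return a;
}
```

Inspecting the output of gcc -S with and without -masm=intel should show the same instruction printed with its operands swapped.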

6. Inline assembly (GCC extended asm)

c
#include <stdint.h>

// Basic: increment a register
int x = 5;
__asm__ volatile (
    "incl %0"
    : "=r"(x)   // outputs: =r means write-only register
    : "0"(x)    // inputs: "0" ties this input to the same register as output 0
    : "cc"      // clobbers: incl modifies the flags
);

// CPUID example
uint32_t eax, ebx, ecx, edx;
__asm__ volatile (
    "cpuid"
    : "=a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
    : "a"(1)    // input: leaf 1
);

// Atomic increment: returns the new value
static inline int atomic_inc(volatile int *p) {
    int ret;
    __asm__ volatile (
        "lock; xaddl %0, %1"   // ret receives the old *p; *p becomes old + 1
        : "=r"(ret), "+m"(*p)
        : "0"(1)
        : "memory", "cc"
    );
    return ret + 1;
}
Constraint codes:
  • "r": any general register
  • "m": memory operand
  • "i": immediate integer
  • "a", "b", "c", "d": specific registers (%rax, %rbx, %rcx, %rdx)
  • "=" prefix: output (write-only)
  • "+" prefix: read-write
  • "memory" clobber: tells the compiler that memory may be modified (acts as a compiler barrier)
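
The constraint codes are easier to remember when each one appears in a tiny function. The following sketch, assuming GCC or Clang on x86-64 with a C fallback elsewhere, exercises "+r", "i", and "m"; the names shl3 and add_from_mem are invented for the example.

```c
#include <stdint.h>

/* "+r": v is read and written in a register.
 * "i":  3 is emitted as an immediate, so the instruction becomes
 *       something like shlq $3, %rax. */
static inline uint64_t shl3(uint64_t v) {
#if defined(__x86_64__)
    __asm__("shlq %1, %0" : "+r"(v) : "i"(3) : "cc");
#else
    v <<= 3;                       /* portable fallback */
#endif
    return v;
}

/* "m": *p is passed as a memory operand, so the add reads it directly
 * from memory, for example addl (%rdi), %eax. */
static inline int add_from_mem(int acc, const int *p) {
#if defined(__x86_64__)
    __asm__("addl %1, %0" : "+r"(acc) : "m"(*p) : "cc");
#else
    acc += *p;                     /* portable fallback */
#endif
    return acc;
}
```

Because the "m" operand names the exact object being read, no "memory" clobber is needed here; the clobber is for asm that touches memory the operands do not describe.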

7. SSE/AVX intrinsics (preferred over inline asm)

c
#include <immintrin.h>   // includes all x86 SIMD headers

// Add 8 floats at once with AVX
__m256 a = _mm256_loadu_ps(arr_a);   // load 8 floats (unaligned)
__m256 b = _mm256_loadu_ps(arr_b);
__m256 c = _mm256_add_ps(a, b);
_mm256_storeu_ps(result, c);
The compiler only emits AVX instructions when they are enabled at compile time, for example with -mavx2 or -march=native. To check what the CPU supports at run time, use __builtin_cpu_supports("avx2").
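
The two checks can be combined so that one binary runs on CPUs with and without AVX. The sketch below is one possible arrangement, assuming GCC or Clang: the target("avx") function attribute enables AVX for a single function without -mavx on the command line, and __builtin_cpu_supports selects the path at run time. The names add_f32, add_f32_avx, and add_f32_scalar are hypothetical.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* AVX path: 8 floats per iteration, scalar loop for the remainder. */
__attribute__((target("avx")))
static void add_f32_avx(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);   /* unaligned loads */
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
#endif

/* Scalar path: used on non-x86 targets or when AVX is unavailable. */
static void add_f32_scalar(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

/* Dispatcher: picks the AVX path only if the running CPU reports it. */
void add_f32(const float *a, const float *b, float *out, size_t n) {
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx")) {
        add_f32_avx(a, b, out, n);
        return;
    }
#endif
    add_f32_scalar(a, b, out, n);
}
```

The example checks "avx" rather than "avx2" because _mm256_add_ps on floats only needs AVX; integer 256-bit operations would need the AVX2 check.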
For a full register and instruction reference, see references/reference.md.

Related skills

  • Use skills/low-level-programming/assembly-arm for AArch64/ARM assembly
  • Use skills/compilers/gcc for -S -masm=intel flag details
  • Use skills/debuggers/gdb for stepping through assembly (si, ni, x/i)