llvm

LLVM IR and Tooling

Purpose

Guide agents through the LLVM IR pipeline: generating IR, running optimisation passes with `opt`, lowering to assembly with `llc`, and inspecting IR for debugging or performance work.

Triggers

  • "Show me the LLVM IR for this function"
  • "How do I run an LLVM optimisation pass?"
  • "What does this LLVM IR instruction mean?"
  • "How do I write a custom LLVM pass?"
  • "Why isn't auto-vectorisation happening in LLVM?"

Workflow

1. Generate LLVM IR

```bash
# Emit textual IR (.ll)
clang -O0 -emit-llvm -S src.c -o src.ll

# Emit bitcode (.bc)
clang -O2 -emit-llvm -c src.c -o src.bc

# Disassemble bitcode to text
llvm-dis src.bc -o src.ll
```
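
The commands above can be tried end to end on a throwaway source file. The sketch below is illustrative only: the file `add.c` and the function `add` are hypothetical names, and the script skips the compile step when `clang` is not on the `PATH`. It also shows `-Xclang -disable-O0-optnone`, which is worth knowing because clang at `-O0` marks functions `optnone`, and `opt` then leaves them largely untouched.

```bash
# Hypothetical demo: add.c and add() are illustrative names, not part of any project.
workdir=$(mktemp -d)
cat > "$workdir/add.c" <<'EOF'
int add(int a, int b) { return a + b; }
EOF

if command -v clang >/dev/null 2>&1; then
  # Plain -O0 IR: parameters are spilled to stack slots (alloca/store/load).
  clang -O0 -emit-llvm -S "$workdir/add.c" -o "$workdir/add.ll"

  # Same IR without the optnone attribute, so later opt runs can transform it.
  clang -O0 -Xclang -disable-O0-optnone -emit-llvm -S \
    "$workdir/add.c" -o "$workdir/add.opt-ready.ll"

  # Show the function definitions that were emitted.
  grep -n '^define' "$workdir/add.ll" "$workdir/add.opt-ready.ll"
else
  echo "clang not found; skipping IR generation"
fi
```

Comparing the attribute groups at the bottom of the two `.ll` files should show whether `optnone` is present.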

2. Run optimisation passes with `opt`

```bash
# Apply specific passes
opt -passes='mem2reg,instcombine,simplifycfg' src.ll -S -o out.ll

# Standard optimisation pipelines
opt -passes='default<O2>' src.ll -S -o out.ll
opt -passes='default<O3>' src.ll -S -o out.ll

# List available passes
opt --print-passes 2>&1 | less

# Print IR before and after a pass
opt -passes='instcombine' --print-before=instcombine --print-after=instcombine src.ll -S -o out.ll 2>&1 | less
```
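
To see what a single pass does, it can help to feed `opt` a hand-written module. The following is a minimal sketch, assuming LLVM 15 or newer (the IR uses the opaque `ptr` type); the file names and the function `add` are made up for illustration. It counts `alloca` instructions before and after `mem2reg`.

```bash
# Hypothetical demo module: the names below are illustrative only.
workdir=$(mktemp -d)
cat > "$workdir/slots.ll" <<'EOF'
define i32 @add(i32 %a, i32 %b) {
entry:
  %a.addr = alloca i32, align 4
  %b.addr = alloca i32, align 4
  store i32 %a, ptr %a.addr, align 4
  store i32 %b, ptr %b.addr, align 4
  %0 = load i32, ptr %a.addr, align 4
  %1 = load i32, ptr %b.addr, align 4
  %sum = add nsw i32 %0, %1
  ret i32 %sum
}
EOF

if command -v opt >/dev/null 2>&1; then
  # Promote the stack slots to SSA values.
  opt -passes='mem2reg' "$workdir/slots.ll" -S -o "$workdir/promoted.ll"
  echo "stack slots before: $(grep -c 'alloca' "$workdir/slots.ll")"
  echo "stack slots after:  $(grep -c 'alloca' "$workdir/promoted.ll")"
else
  echo "opt not found; skipping"
fi
```

If the pass ran, the second count should drop, since both stack slots in this function are only loaded and stored.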

3. Lower IR to assembly with `llc`

```bash
# Compile IR to object file
llc -filetype=obj src.ll -o src.o

# Compile to assembly (Intel syntax, x86 targets only)
llc -filetype=asm --x86-asm-syntax=intel src.ll -o src.s

# Target a specific CPU
llc -mcpu=skylake -mattr=+avx2 src.ll -o src.s

# Show version and registered targets
llc --version
```
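
A small self-contained run of the lowering step might look like the sketch below. The module and file names are hypothetical, and the second `llc` call assumes the X86 backend is among the targets listed by `llc --version`; if it is not, that call simply reports an error and the script carries on.

```bash
# Hypothetical demo module: names are illustrative only.
workdir=$(mktemp -d)
cat > "$workdir/add.ll" <<'EOF'
define i32 @add(i32 %a, i32 %b) {
entry:
  %sum = add nsw i32 %a, %b
  ret i32 %sum
}
EOF

if command -v llc >/dev/null 2>&1; then
  # Lower for the host target (no triple in the module, so llc uses its default).
  llc -O2 "$workdir/add.ll" -o "$workdir/add.host.s"

  # Lower for an explicit triple and CPU; needs the X86 backend to be built in.
  llc -O2 -mtriple=x86_64-unknown-linux-gnu -mcpu=skylake \
    "$workdir/add.ll" -o "$workdir/add.x86.s" \
    || echo "x86-64 lowering failed (backend may not be registered)"

  # Show the generated instructions without assembler directives.
  grep -v '^[[:space:]]*\.' "$workdir/add.host.s"
else
  echo "llc not found; skipping"
fi
```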

4. Inspect IR

Key IR constructs to understand:

| Construct | Meaning |
|---|---|
| `alloca` | Stack allocation (pre-SSA; `mem2reg` promotes to registers) |
| `load` / `store` | Memory access |
| `getelementptr` (GEP) | Pointer arithmetic / field access |
| `phi` | SSA φ-node: merges values from predecessor blocks |
| `call` / `invoke` | Function call (`invoke` has exception edges) |
| `icmp` / `fcmp` | Integer/float comparison |
| `br` | Branch (conditional or unconditional) |
| `ret` | Return |
| `bitcast` | Reinterpret bits (no-op in codegen) |
| `ptrtoint` / `inttoptr` | Pointer↔integer (avoid where possible) |
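
Several of these constructs appear together in even a very small loop. The sketch below writes a hand-written, hypothetical module (a function `sum` over an `i32` array, assuming LLVM 15 or newer for the `ptr` type) and, if `opt` is available, runs the verifier over it to confirm the IR is well formed.

```bash
# Hypothetical annotated module: @sum and its value names are illustrative only.
workdir=$(mktemp -d)
cat > "$workdir/sum.ll" <<'EOF'
; Sums n 32-bit integers starting at %data.
define i32 @sum(ptr %data, i32 %n) {
entry:
  %empty = icmp sle i32 %n, 0                ; integer comparison
  br i1 %empty, label %exit, label %loop     ; conditional branch

loop:
  ; Each incoming pair is [ value, predecessor block ].
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
  %idx = zext i32 %i to i64
  %slot = getelementptr inbounds i32, ptr %data, i64 %idx   ; address of data[i]
  %val = load i32, ptr %slot, align 4                       ; memory read
  %acc.next = add i32 %acc, %val
  %i.next = add i32 %i, 1
  %more = icmp slt i32 %i.next, %n
  br i1 %more, label %loop, label %exit

exit:
  %result = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
  ret i32 %result
}
EOF

if command -v opt >/dev/null 2>&1; then
  # Run only the IR verifier and discard the output module.
  opt -passes='verify' "$workdir/sum.ll" -disable-output && echo "IR verified"
else
  echo "opt not found; skipping verification"
fi
```

Reading the `phi` nodes against the two `br` instructions is usually the quickest way to reconstruct the control flow by hand.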

5. Key passes

| Pass | Effect |
|---|---|
| `mem2reg` | Promote `alloca` to SSA registers |
| `instcombine` | Instruction combining / peephole |
| `simplifycfg` | CFG cleanup, dead block removal |
| `loop-vectorize` | Auto-vectorisation |
| `slp-vectorize` | Superword-level parallelism (straight-line vectorisation) |
| `inline` | Function inlining |
| `gvn` | Global value numbering (common subexpression elimination) |
| `licm` | Loop-invariant code motion |
| `loop-unroll` | Loop unrolling |
| `argpromotion` | Promote pointer args to values |
| `sroa` | Scalar Replacement of Aggregates |
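
As a rough illustration of one entry in this table, the sketch below runs `instcombine` on a tiny hypothetical function containing a redundant `add` and a multiply by a power of two, then diffs the result against the input. The names are made up, and the exact rewritten form depends on the LLVM version, so no output is shown.

```bash
# Hypothetical demo module: @scale and its value names are illustrative only.
workdir=$(mktemp -d)
cat > "$workdir/fold.ll" <<'EOF'
define i32 @scale(i32 %x) {
entry:
  %t = add i32 %x, 0
  %r = mul i32 %t, 8
  ret i32 %r
}
EOF

if command -v opt >/dev/null 2>&1; then
  opt -passes='instcombine' "$workdir/fold.ll" -S -o "$workdir/folded.ll"
  # diff exits non-zero when the files differ, which is the expected case here.
  diff "$workdir/fold.ll" "$workdir/folded.ll" || true
else
  echo "opt not found; skipping"
fi
```

One would expect the add of zero to disappear and the multiply to be strength-reduced, but the diff is the authority on what a given build actually does.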

6. Debugging missed optimisations

```bash
# Why was a loop not vectorised?
clang -O2 -Rpass-missed=loop-vectorize -Rpass-analysis=loop-vectorize src.c

# Dump pass pipeline (-debug-pass is a legacy pass manager option, so on
# recent LLVM it mainly reports the codegen pipeline)
clang -O2 -c -mllvm -debug-pass=Structure src.c -o /dev/null 2>&1 | less

# Print IR after each pass (very verbose)
opt -passes='default<O2>' -print-after-all src.ll -S 2>&1 | less
```
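
The remark flags are easiest to understand on a file with one loop that is a good vectorisation candidate and one that is not. The sketch below is an assumption-laden example: `loops.c`, `scale` and `prefix` are hypothetical names, and whether each loop is actually vectorised depends on the clang version and target, so the script only prints whatever remarks clang emits.

```bash
# Hypothetical demo source: loops.c, scale() and prefix() are illustrative names.
workdir=$(mktemp -d)
cat > "$workdir/loops.c" <<'EOF'
/* Independent iterations and restrict-qualified pointers: a likely candidate. */
void scale(float *restrict out, const float *restrict in, int n) {
  for (int i = 0; i < n; ++i)
    out[i] = in[i] * 2.0f;
}

/* Each iteration reads the previous result: a loop-carried dependence. */
void prefix(float *a, int n) {
  for (int i = 1; i < n; ++i)
    a[i] += a[i - 1];
}
EOF

if command -v clang >/dev/null 2>&1; then
  # -Rpass reports successes; -Rpass-missed and -Rpass-analysis explain failures.
  clang -O2 -c "$workdir/loops.c" -o "$workdir/loops.o" \
    -Rpass=loop-vectorize \
    -Rpass-missed=loop-vectorize \
    -Rpass-analysis=loop-vectorize
else
  echo "clang not found; skipping"
fi
```

Remarks are printed with source locations, so the line numbers can be matched back to the two loops.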

7. Useful LLVM tools

| Tool | Purpose |
|---|---|
| `llvm-dis` | Bitcode → textual IR |
| `llvm-as` | Textual IR → bitcode |
| `llvm-link` | Link multiple bitcode files |
| `llvm-lto` | Standalone LTO |
| `llvm-nm` | Symbols in bitcode/object |
| `llvm-objdump` | Disassemble objects |
| `llvm-profdata` | Merge/show PGO profiles |
| `llvm-cov` | Coverage reporting |
| `llvm-mca` | Machine code analyser (throughput/latency) |

For binutils equivalents, see `skills/binaries/binutils`.
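
A few of these tools chain together naturally. The sketch below, using a hypothetical one-function module, round-trips IR through bitcode with `llvm-as` and `llvm-dis`, lists its symbols with `llvm-nm`, and pipes `llc` output into `llvm-mca`. Each step is skipped when the tool is missing, and the `llvm-mca` step assumes the host target is one it supports.

```bash
# Hypothetical demo module: @answer is an illustrative name.
workdir=$(mktemp -d)
cat > "$workdir/answer.ll" <<'EOF'
define i32 @answer() {
entry:
  ret i32 42
}
EOF

if command -v llvm-as >/dev/null 2>&1 && command -v llvm-dis >/dev/null 2>&1; then
  llvm-as "$workdir/answer.ll" -o "$workdir/answer.bc"       # textual IR -> bitcode
  llvm-dis "$workdir/answer.bc" -o "$workdir/roundtrip.ll"   # bitcode -> textual IR
  if command -v llvm-nm >/dev/null 2>&1; then
    llvm-nm "$workdir/answer.bc"                             # symbols in the bitcode
  fi
else
  echo "llvm-as/llvm-dis not found; skipping round trip"
fi

if command -v llc >/dev/null 2>&1 && command -v llvm-mca >/dev/null 2>&1; then
  # llc writes assembly to stdout; llvm-mca reads it from stdin.
  llc -O2 "$workdir/answer.ll" -o - | llvm-mca \
    || echo "llvm-mca could not analyse the host target"
fi
```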

Related skills

  • Use `skills/compilers/clang` for source-level Clang flags
  • Use `skills/binaries/linkers-lto` for LTO at link time
  • Use `skills/profilers/linux-perf` combined with `llvm-mca` for micro-architectural analysis