llvm
LLVM IR and Tooling
Purpose
Guide agents through the LLVM IR pipeline: generating IR, running optimisation passes with `opt`, lowering to assembly with `llc`, and inspecting IR for debugging or performance work.
Triggers
- "Show me the LLVM IR for this function"
- "How do I run an LLVM optimisation pass?"
- "What does this LLVM IR instruction mean?"
- "How do I write a custom LLVM pass?"
- "Why isn't auto-vectorisation happening in LLVM?"
Workflow
1. Generate LLVM IR
```bash
# Emit textual IR (.ll)
clang -O0 -emit-llvm -S src.c -o src.ll

# Same, but without the optnone attribute that -O0 adds to every function;
# use this form if opt passes should transform the IR later
clang -O0 -Xclang -disable-O0-optnone -emit-llvm -S src.c -o src.ll

# Emit bitcode (.bc)
clang -O2 -emit-llvm -c src.c -o src.bc

# Disassemble bitcode to text
llvm-dis src.bc -o src.ll
```
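As a minimal end-to-end sketch of the IR generation step, the script below writes a one-function C file and emits its textual IR. The file name `square.c` and the temporary working directory are illustrative assumptions, not part of any real project layout. The script assumes `clang` is on `PATH` and skips the compile step otherwise.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical scratch directory and source file, used only for illustration.
workdir="$(mktemp -d)"
cat > "$workdir/square.c" <<'EOF'
int square(int x) { return x * x; }
EOF

if command -v clang >/dev/null 2>&1; then
  # -S together with -emit-llvm writes human-readable IR instead of native assembly.
  clang -O0 -emit-llvm -S "$workdir/square.c" -o "$workdir/square.ll"
  # Show only the function definition line(s) from the emitted IR.
  grep '^define' "$workdir/square.ll"
else
  echo "clang not found; skipping IR generation" >&2
fi
```

Reading the `define` line first is a quick way to confirm that the function made it into the module before inspecting its body.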
2. Run optimisation passes with `opt`
```bash
# Apply specific passes, in the order listed
opt -passes='mem2reg,instcombine,simplifycfg' src.ll -S -o out.ll

# Standard optimisation pipelines
opt -passes='default<O2>' src.ll -S -o out.ll
opt -passes='default<O3>' src.ll -S -o out.ll

# List available passes
opt --print-passes 2>&1 | less

# Print IR before and after a pass
opt -passes='instcombine' --print-before=instcombine --print-after=instcombine src.ll -S -o out.ll 2>&1 | less
```
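To see what a single pass does, one option is to compare a simple property of the IR before and after running it. The sketch below counts `alloca` instructions around `mem2reg`. The file name `add.c` is a hypothetical example, and the script assumes both `clang` and `opt` are on `PATH`.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical scratch directory and source file, used only for illustration.
workdir="$(mktemp -d)"
cat > "$workdir/add.c" <<'EOF'
int add(int a, int b) { int sum = a + b; return sum; }
EOF

if command -v clang >/dev/null 2>&1 && command -v opt >/dev/null 2>&1; then
  # -disable-O0-optnone keeps clang from tagging functions optnone,
  # which would otherwise make opt skip them.
  clang -O0 -Xclang -disable-O0-optnone -emit-llvm -S \
    "$workdir/add.c" -o "$workdir/add.ll"
  opt -passes='mem2reg' "$workdir/add.ll" -S -o "$workdir/add.mem2reg.ll"
  # grep -c exits non-zero on a count of 0, hence the "|| true".
  echo "alloca before: $(grep -c 'alloca' "$workdir/add.ll" || true)"
  echo "alloca after:  $(grep -c 'alloca' "$workdir/add.mem2reg.ll" || true)"
else
  echo "clang or opt not found; skipping pass demonstration" >&2
fi
```

The second count should normally be lower than the first, since `mem2reg` promotes stack slots whose address does not escape. Running `diff` on the two `.ll` files shows the full change.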
3. Lower IR to assembly with `llc`
```bash
# Compile IR to object file
llc -filetype=obj src.ll -o src.o

# Compile to assembly (Intel syntax; x86 targets only)
llc -filetype=asm --x86-asm-syntax=intel src.ll -o src.s

# Target a specific CPU
llc -mcpu=skylake -mattr=+avx2 src.ll -o src.s

# Show available targets
llc --version
```
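The lowering step can also be tried on hand-written IR, without going through a C front end. The sketch below is one way to do that. The function `@add` and the file names are hypothetical, and the script assumes `llc` is on `PATH`.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical scratch directory and IR file, used only for illustration.
workdir="$(mktemp -d)"
cat > "$workdir/add.ll" <<'EOF'
; Hand-written IR with no target triple, so llc uses its default target.
define i32 @add(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b
  ret i32 %sum
}
EOF

if command -v llc >/dev/null 2>&1; then
  # Lower the module to textual assembly for the default target.
  llc -O2 -filetype=asm "$workdir/add.ll" -o "$workdir/add.s"
  cat "$workdir/add.s"
else
  echo "llc not found; skipping lowering" >&2
fi
```

The exact assembly depends on the host target and LLVM version, so the script prints it rather than checking for specific instructions.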
4. Inspect IR
Key IR constructs to understand:
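| Construct | Meaning |
|---|---|
| `alloca` | Stack allocation (pre-SSA; promoted to registers by `mem2reg`) |
| `load` / `store` | Memory access |
| `getelementptr` | Pointer arithmetic / field access |
| `phi` | SSA φ-node: merges values from predecessor blocks |
| `call` / `invoke` | Function call (`invoke` when the callee may unwind) |
| `icmp` / `fcmp` | Integer/float comparison |
| `br` | Branch (conditional or unconditional) |
| `ret` | Return |
| `bitcast` | Reinterpret bits (no-op in codegen) |
| `ptrtoint` / `inttoptr` | Pointer↔integer (avoid where possible) |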
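Several of these constructs can be seen together in one small function. The sketch below writes a hand-written, annotated IR file and asks `llvm-as` to check that it parses and verifies. The function `@sum` is a hypothetical example. The IR uses the opaque `ptr` type, so it assumes LLVM 15 or newer, and the script reports rather than aborts if an older assembler rejects it.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical scratch directory and IR file, used only for illustration.
workdir="$(mktemp -d)"
cat > "$workdir/sum.ll" <<'EOF'
; Sums the first %n elements of the i32 array at %a.
define i32 @sum(ptr %a, i32 %n) {
entry:
  %empty = icmp sle i32 %n, 0                         ; integer comparison
  br i1 %empty, label %exit, label %loop              ; conditional branch

loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]      ; induction variable
  %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]  ; running total
  %idx = sext i32 %i to i64
  %p = getelementptr inbounds i32, ptr %a, i64 %idx   ; address of a[i]
  %v = load i32, ptr %p, align 4                      ; memory access
  %acc.next = add nsw i32 %acc, %v
  %i.next = add nsw i32 %i, 1
  %done = icmp eq i32 %i.next, %n
  br i1 %done, label %exit, label %loop

exit:
  %r = phi i32 [ 0, %entry ], [ %acc.next, %loop ]    ; merge both paths
  ret i32 %r
}
EOF

if command -v llvm-as >/dev/null 2>&1; then
  # llvm-as parses the text and runs the IR verifier before writing bitcode.
  if llvm-as "$workdir/sum.ll" -o /dev/null; then
    echo "sum.ll parsed and verified"
  else
    echo "llvm-as rejected sum.ll (possibly an LLVM older than 15)" >&2
  fi
else
  echo "llvm-as not found; skipping verification" >&2
fi
```

Each `phi` lists one incoming value per predecessor block, which is why `%r` in `exit` has entries for both `%entry` and `%loop`.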
5. Key passes
| Pass | Effect |
|---|---|
| `mem2reg` | Promote alloca to SSA registers |
| `instcombine` | Instruction combining / peephole |
| `simplifycfg` | CFG cleanup, dead block removal |
| `loop-vectorize` | Auto-vectorisation |
| `slp-vectorizer` | Superword-level parallelism (straight-line vectorisation) |
| `inline` | Function inlining |
| `gvn` | Global value numbering (common subexpression elimination) |
| `licm` | Loop-invariant code motion |
| `loop-unroll` | Loop unrolling |
| `argpromotion` | Promote pointer args to values |
| `sroa` | Scalar Replacement of Aggregates |
6. Debugging missed optimisations
```bash
# Why was a loop not vectorised?
clang -O2 -Rpass-missed=loop-vectorize -Rpass-analysis=loop-vectorize src.c

# Dump pass pipeline (-debug-pass is a legacy pass manager option, so on
# recent LLVM releases this mostly shows the code generation passes)
clang -O2 -mllvm -debug-pass=Structure src.c -o /dev/null 2>&1 | less

# Print IR after each pass (very verbose)
opt -passes='default<O2>' -print-after-all src.ll -S 2>&1 | less
```
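A quick way to get familiar with the remark output is to compile a loop that is expected to resist vectorisation. The sketch below uses a hypothetical `prefix_sum` function with a loop-carried dependence. It assumes `clang` is on `PATH`, and the exact wording of the remarks varies between LLVM versions.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical scratch directory and source file, used only for illustration.
workdir="$(mktemp -d)"
cat > "$workdir/prefix.c" <<'EOF'
/* Each iteration reads the value written by the previous one, so the loop
   carries a dependence that normally blocks vectorisation. */
void prefix_sum(float *a, const float *b, int n) {
  for (int i = 1; i < n; ++i)
    a[i] = a[i - 1] + b[i];
}
EOF

if command -v clang >/dev/null 2>&1; then
  # -c stops after compilation, so no main() or linker is needed.
  # Optimisation remarks are diagnostics and go to stderr.
  clang -O2 -c -Rpass-missed=loop-vectorize -Rpass-analysis=loop-vectorize \
    "$workdir/prefix.c" -o "$workdir/prefix.o"
else
  echo "clang not found; skipping remark demonstration" >&2
fi
```

Adding `-Rpass=loop-vectorize` to the same command also reports the loops that were vectorised, which helps when a file contains several loops.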
7. Useful LLVM tools
| Tool | Purpose |
|---|---|
| `llvm-dis` | Bitcode → textual IR |
| `llvm-as` | Textual IR → bitcode |
| `llvm-link` | Link multiple bitcode files |
| `llvm-lto` | Standalone LTO |
| `llvm-nm` | Symbols in bitcode/object |
| `llvm-objdump` | Disassemble objects |
| `llvm-profdata` | Merge/show PGO profiles |
| `llvm-cov` | Coverage reporting |
| `llvm-mca` | Machine code analyser (throughput/latency) |
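The first few tools in the table compose into a simple round trip, sketched below: assemble textual IR to bitcode, disassemble it back, and list its symbols. The function `@answer` and the file names are hypothetical, and the script assumes `llvm-as`, `llvm-dis`, and `llvm-nm` are on `PATH`.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical scratch directory and IR file, used only for illustration.
workdir="$(mktemp -d)"
cat > "$workdir/answer.ll" <<'EOF'
define i32 @answer() {
entry:
  ret i32 42
}
EOF

if command -v llvm-as >/dev/null 2>&1 &&
   command -v llvm-dis >/dev/null 2>&1 &&
   command -v llvm-nm >/dev/null 2>&1; then
  # Textual IR -> bitcode
  llvm-as "$workdir/answer.ll" -o "$workdir/answer.bc"
  # Bitcode -> textual IR
  llvm-dis "$workdir/answer.bc" -o "$workdir/roundtrip.ll"
  # List the symbols defined in the bitcode file; report instead of aborting
  # if this particular llvm-nm build cannot read it.
  if ! llvm-nm "$workdir/answer.bc"; then
    echo "llvm-nm could not read answer.bc" >&2
  fi
else
  echo "llvm-as, llvm-dis or llvm-nm not found; skipping round trip" >&2
fi
```

Comparing `answer.ll` with `roundtrip.ll` shows what the disassembler adds, such as a module identifier comment.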
For binutils equivalents, see `skills/binaries/binutils`.
Related skills
- Use `skills/compilers/clang` for source-level Clang flags
- Use `skills/binaries/linkers-lto` for LTO at link time
- Use `skills/profilers/linux-perf` combined with `llvm-mca` for micro-architectural analysis