lattner-compiler-infrastructure
Chris Lattner Style Guide
Overview
Chris Lattner created LLVM (the compiler infrastructure behind many modern compilers), Clang (the C/C++/Objective-C frontend), Swift (Apple's systems language), and MLIR (Multi-Level Intermediate Representation). His work fundamentally changed how compilers are built and how languages evolve.
Core Philosophy
"The key insight of LLVM is that compiler infrastructure should be reusable."
"Good IR design is about finding the right level of abstraction."
"Languages should evolve based on real-world usage, not theoretical purity."
Lattner believes in building robust, reusable infrastructure that enables an ecosystem of tools, rather than one-off solutions.
Design Principles
- Modular Infrastructure: Build reusable components, not monolithic systems.
- Progressive Lowering: Transform through well-defined IR levels.
- Library-First Design: Compilers are libraries, not just executables.
- Pragmatic Evolution: Languages improve through real usage feedback.
When Writing Compiler Code
Always
- Design IRs with clear semantics and invariants
- Make passes composable and reusable
- Provide excellent diagnostics and error messages
- Build infrastructure others can extend
- Think about the entire compilation pipeline
- Document design decisions and tradeoffs
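The first item, an IR with clear invariants, is usually backed by a verifier that can run between passes. The sketch below is a toy illustration rather than LLVM's verifier; `Inst`, `Block`, and `verifyBlock` are hypothetical names. It checks a single invariant: a block ends in exactly one terminator, and no terminator appears earlier.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical toy IR: an instruction is an opcode plus a terminator flag.
struct Inst {
  std::string opcode;
  bool isTerminator;
};

struct Block {
  std::string name;
  std::vector<Inst> insts;
};

// Returns an error message if the block violates the invariant
// "exactly one terminator, and it is the last instruction".
std::optional<std::string> verifyBlock(const Block &b) {
  if (b.insts.empty())
    return "block '" + b.name + "' is empty";
  for (std::size_t i = 0; i + 1 < b.insts.size(); ++i)
    if (b.insts[i].isTerminator)
      return "block '" + b.name + "' has a terminator before its end";
  if (!b.insts.back().isTerminator)
    return "block '" + b.name + "' does not end in a terminator";
  return std::nullopt;
}
```

Running a check like this after every transformation turns a silent miscompile into an immediate, attributable failure.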
Never
- Build closed, monolithic compiler architectures
- Sacrifice usability for implementation convenience
- Ignore error recovery and diagnostics
- Let optimization passes have hidden dependencies
- Couple frontend concerns with backend concerns
- Design IRs without considering transformations
Prefer
- SSA form for optimization IRs
- Explicit type systems over implicit
- Library APIs over command-line tools
- Incremental compilation where possible
- Clear phase ordering over ad-hoc passes
- Compositional design over special cases
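The SSA preference can be made concrete with a small checker. This is a minimal sketch for straight-line code only; `Def` and `isValidSSA` are hypothetical names, and a real SSA check would also need dominance information across the control-flow graph, which is omitted here.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical straight-line instruction: one result name, some operand names.
struct Def {
  std::string result;
  std::vector<std::string> operands;
};

// Checks the two SSA properties that matter inside a single block:
// each name is defined once, and every operand is defined before its use.
// `params` lists names that are live on entry (function arguments).
bool isValidSSA(const std::vector<std::string> &params,
                const std::vector<Def> &body) {
  std::set<std::string> defined(params.begin(), params.end());
  for (const Def &d : body) {
    for (const std::string &op : d.operands)
      if (defined.count(op) == 0)
        return false;  // use before definition
    if (!defined.insert(d.result).second)
      return false;  // second definition of the same name
  }
  return true;
}
```

Because each name has a single definition, an optimizer can follow a use straight to its definition without a separate reaching-definitions analysis.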
Code Patterns
LLVM IR Philosophy
llvm
; LLVM IR: explicit, typed, SSA form
; Every value has exactly one definition
; Control flow is explicit
define i32 @factorial(i32 %n) {
entry:
  %cmp = icmp sle i32 %n, 1
  br i1 %cmp, label %base, label %recurse

base:
  ret i32 1

recurse:
  %n_minus_1 = sub i32 %n, 1
  %fact_sub = call i32 @factorial(i32 %n_minus_1)
  %result = mul i32 %n, %fact_sub
  ret i32 %result
}

; Key properties:
; - SSA: each %variable defined exactly once
; - Typed: every operation has explicit types
; - Explicit control flow: br, ret, etc.
; - No hidden state: calls and memory accesses appear as explicit instructions
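For comparison, here is a C++ function with the same control flow as `@factorial`. It is a hand-written counterpart for illustration only; IR that Clang actually emits for it would differ in detail (attributes, overflow flags, and unoptimized stack slots).

```cpp
// Source-level counterpart of @factorial, assuming 32-bit signed int.
int factorial(int n) {
  if (n <= 1)                   // entry: icmp sle, br
    return 1;                   // base: ret i32 1
  return n * factorial(n - 1);  // recurse: sub, call, mul, ret
}
```

Each source expression maps to one named SSA value in the IR, which is what makes the lowered form easy to analyze.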
Pass Infrastructure Design
cpp
// LLVM-style pass infrastructure
// Passes are modular, composable, declarative
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/PassManager.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Passes/PassPlugin.h"

using namespace llvm;

class MyOptimizationPass : public PassInfoMixin<MyOptimizationPass> {
public:
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {
    // Get required analyses
    auto &DT = AM.getResult<DominatorTreeAnalysis>(F);
    auto &LI = AM.getResult<LoopAnalysis>(F);

    bool Changed = false;
    for (auto &BB : F) {
      Changed |= optimizeBlock(BB, DT, LI);
    }

    if (!Changed)
      return PreservedAnalyses::all();

    // Declare what we preserved
    // (valid only if optimizeBlock never changes the CFG)
    PreservedAnalyses PA;
    PA.preserve<DominatorTreeAnalysis>();
    return PA;
  }

private:
  // Defined elsewhere; returns true if the block was modified
  bool optimizeBlock(BasicBlock &BB, DominatorTree &DT, LoopInfo &LI);
};

// Register the pass
extern "C" LLVM_ATTRIBUTE_WEAK ::llvm::PassPluginLibraryInfo
llvmGetPassPluginInfo() {
  return {
    LLVM_PLUGIN_API_VERSION, "MyPass", "v0.1",
    [](PassBuilder &PB) {
      PB.registerPipelineParsingCallback(
        [](StringRef Name, FunctionPassManager &FPM,
           ArrayRef<PassBuilder::PipelineElement>) {
          if (Name == "my-opt") {
            FPM.addPass(MyOptimizationPass());
            return true;
          }
          return false;
        });
    }
  };
}
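The value of declaring preserved analyses is easier to see in a model that does not depend on LLVM. The following is a toy sketch, not LLVM's pass manager: `ToyFunction`, `Preserved`, `ToyPassManager`, and `removeNops` are all hypothetical, and cached analyses are represented only by their names.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function body: just a list of opcodes.
using ToyFunction = std::vector<std::string>;

// Names of analyses a pass leaves valid; mirrors the idea of PreservedAnalyses.
struct Preserved {
  bool all;
  std::set<std::string> names;
};

class ToyPassManager {
public:
  using Pass = std::function<Preserved(ToyFunction &)>;

  void addPass(Pass p) { passes.push_back(std::move(p)); }

  // Marks an analysis as computed and cached (the result itself is omitted).
  void cacheAnalysis(const std::string &name) { cached.insert(name); }
  bool isCached(const std::string &name) const { return cached.count(name) != 0; }

  // Runs passes in order; after each one, drops every cached analysis
  // that the pass did not declare as preserved.
  void run(ToyFunction &f) {
    for (const Pass &p : passes) {
      Preserved pa = p(f);
      if (pa.all)
        continue;
      std::set<std::string> kept;
      for (const std::string &name : cached)
        if (pa.names.count(name) != 0)
          kept.insert(name);
      cached = std::move(kept);
    }
  }

private:
  std::vector<Pass> passes;
  std::set<std::string> cached;
};

// Example pass: deletes "nop" instructions. It changes no control flow,
// so it claims to preserve "domtree" but nothing else.
Preserved removeNops(ToyFunction &f) {
  std::size_t before = f.size();
  ToyFunction kept;
  for (const std::string &op : f)
    if (op != "nop")
      kept.push_back(op);
  f = std::move(kept);
  if (f.size() == before)
    return Preserved{true, {}};
  return Preserved{false, {"domtree"}};
}
```

The pass never talks to other passes directly. Its only contract with the rest of the pipeline is the set it returns, which is what keeps dependencies between passes visible.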
Diagnostic Excellence
cpp
// Swift/Clang-style diagnostics
// Errors should be helpful, not cryptic
class DiagnosticEngine {
public:
  // Structured diagnostics with fix-its
  void diagnose(SourceLoc Loc, Diagnostic Diag) {
    emitDiagnostic(Loc, Diag.getKind(), Diag.getMessage());

    // Show the source location
    emitSourceSnippet(Loc);

    // Provide fix-its when possible
    for (auto &FixIt : Diag.getFixIts()) {
      emitFixIt(FixIt);
    }

    // Add educational notes
    for (auto &Note : Diag.getNotes()) {
      emitNote(Note);
    }
  }
};

// Example diagnostic output:
// error: cannot convert value of type 'String' to expected type 'Int'
//   let x: Int = "hello"
//                ^~~~~~~
// fix-it: did you mean to use Int(_:)?
//   let x: Int = Int("hello") ?? 0
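The rendering step of such an engine can be sketched without any compiler types. This is a minimal, self-contained illustration; `FixIt` and `renderDiagnostic` are hypothetical names, and it handles only a single-line source range with one replacement.

```cpp
#include <cstddef>
#include <string>

// Hypothetical fix-it: replace `length` characters at `column` with `text`.
struct FixIt {
  std::size_t column;
  std::size_t length;
  std::string text;
};

// Renders a one-line diagnostic: message, source line, caret range, and
// the line as it would look with the fix-it applied.
std::string renderDiagnostic(const std::string &message,
                             const std::string &sourceLine,
                             const FixIt &fix) {
  std::string out = "error: " + message + "\n";
  out += sourceLine + "\n";
  out += std::string(fix.column, ' ') + "^";
  if (fix.length > 1)
    out += std::string(fix.length - 1, '~');
  out += "\n";
  std::string fixed = sourceLine;
  fixed.replace(fix.column, fix.length, fix.text);
  out += "fix-it: " + fixed + "\n";
  return out;
}
```

Keeping the fix-it as structured data (a range plus replacement text) rather than prose is what lets editors apply it automatically.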
Progressive Lowering (MLIR Style)
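mlir
// MLIR: Multi-Level IR for progressive lowering
// High-level ops → Mid-level ops → Low-level ops → LLVM IR
// Schematic: index constants such as %c0 are assumed to be defined,
// and the vector.contract attribute lists are abbreviated.

// High-level: domain-specific operations
%result = linalg.matmul ins(%A, %B : tensor<4x8xf32>, tensor<8x16xf32>)
                        outs(%C : tensor<4x16xf32>) -> tensor<4x16xf32>

// After tiling transformation:
%tiled = scf.for %i = %c0 to %c4 step %c2
    iter_args(%acc = %C) -> (tensor<4x16xf32>) {
  %slice_a = tensor.extract_slice %A[%i, 0] [2, 8] [1, 1]
      : tensor<4x8xf32> to tensor<2x8xf32>
  %slice_c = tensor.extract_slice %acc[%i, 0] [2, 16] [1, 1]
      : tensor<4x16xf32> to tensor<2x16xf32>
  %computed = linalg.matmul ins(%slice_a, %B : tensor<2x8xf32>, tensor<8x16xf32>)
                            outs(%slice_c : tensor<2x16xf32>) -> tensor<2x16xf32>
  // Write the computed tile back into the accumulated result
  %updated = tensor.insert_slice %computed into %acc[%i, 0] [2, 16] [1, 1]
      : tensor<2x16xf32> into tensor<4x16xf32>
  scf.yield %updated : tensor<4x16xf32>
}

// After vectorization:
%vec = vector.contract {indexing_maps = [...], kind = #vector.kind<add>}
         %vec_a, %vec_b, %vec_c
         : vector<2x8xf32>, vector<8x16xf32> into vector<2x16xf32>

// Finally: LLVM IR
// Each level has clear semantics and transformations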
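What makes a lowering step such as tiling legal is that it leaves the result unchanged. The sketch below checks that property for the same 4x8 by 8x16 shapes using plain loops in C++; it is an illustration of the idea, not of how MLIR implements tiling, and every name in it is hypothetical.

```cpp
#include <array>

constexpr int kM = 4, kK = 8, kN = 16;
using MatA = std::array<std::array<float, kK>, kM>;
using MatB = std::array<std::array<float, kN>, kK>;
using MatC = std::array<std::array<float, kN>, kM>;

// Accumulates A * B into rows [rowBegin, rowEnd) of C.
void matmulRows(const MatA &a, const MatB &b, MatC &c, int rowBegin, int rowEnd) {
  for (int i = rowBegin; i < rowEnd; ++i)
    for (int j = 0; j < kN; ++j)
      for (int k = 0; k < kK; ++k)
        c[i][j] += a[i][k] * b[k][j];
}

// "High-level" form: one operation over the whole 4x16 result.
void matmul(const MatA &a, const MatB &b, MatC &c) {
  matmulRows(a, b, c, 0, kM);
}

// "Tiled" form: the same work split into 2-row tiles, as in the scf.for step.
void matmulTiled(const MatA &a, const MatB &b, MatC &c) {
  for (int i = 0; i < kM; i += 2)
    matmulRows(a, b, c, i, i + 2);
}
```

Both functions perform the same multiply-accumulate steps in the same order per element, so their outputs match exactly; only the loop structure exposed to later stages differs.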
Type System Design
swift
// Swift-style type system: expressive, safe, pragmatic
// The protocols and Result below are simplified versions of
// standard library declarations, shown for illustration.

// Protocol-oriented design
protocol Numeric {
    static var zero: Self { get }  // needed by sum(_:) below
    static func +(lhs: Self, rhs: Self) -> Self
    static func *(lhs: Self, rhs: Self) -> Self
}

// Associated types for flexibility
protocol Collection {
    associatedtype Element
    associatedtype Index: Comparable
    var startIndex: Index { get }
    var endIndex: Index { get }
    subscript(position: Index) -> Element { get }
}

// Generics with constraints
func sum<T: Numeric>(_ values: [T]) -> T {
    values.reduce(.zero, +)
}

// Optionals as explicit nullability
func find<T: Equatable>(_ value: T, in array: [T]) -> Int? {
    for (index, element) in array.enumerated() {
        if element == value {
            return index
        }
    }
    return nil  // Explicit absence
}

// Result types for error handling
enum Result<Success, Failure: Error> {
    case success(Success)
    case failure(Failure)
}
Compiler as Library
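cpp
// Clang as a library, not just a tool
// Enable building custom tools on compiler infrastructure
#include "clang/AST/ASTConsumer.h"
#include "clang/AST/RecursiveASTVisitor.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Frontend/FrontendActions.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/Support/CommandLine.h"

using namespace clang;
using namespace clang::tooling;

static llvm::cl::OptionCategory MyCategory("function-finder options");

// Custom AST visitor
class FunctionFinder : public RecursiveASTVisitor<FunctionFinder> {
public:
  bool VisitFunctionDecl(FunctionDecl *FD) {
    if (FD->hasBody()) {
      // getNameAsString() also covers operators and constructors,
      // for which getName() would assert
      llvm::outs() << "Found function: " << FD->getNameAsString() << "\n";
      analyzeComplexity(FD);
    }
    return true;
  }
private:
  void analyzeComplexity(FunctionDecl *FD);  // defined elsewhere
};

// Glue that runs the visitor over each translation unit
class FunctionFinderConsumer : public ASTConsumer {
public:
  void HandleTranslationUnit(ASTContext &Ctx) override {
    FunctionFinder().TraverseDecl(Ctx.getTranslationUnitDecl());
  }
};

class MyFrontendAction : public ASTFrontendAction {
public:
  std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &,
                                                 StringRef) override {
    return std::make_unique<FunctionFinderConsumer>();
  }
};

// Build custom tools using Clang's libraries
int main(int argc, const char **argv) {
  auto ExpectedParser = CommonOptionsParser::create(argc, argv, MyCategory);
  if (!ExpectedParser) {
    llvm::errs() << ExpectedParser.takeError();
    return 1;
  }
  ClangTool Tool(ExpectedParser->getCompilations(),
                 ExpectedParser->getSourcePathList());
  return Tool.run(newFrontendActionFactory<MyFrontendAction>().get());
}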
Memory Ownership in Swift
swift
// Swift's ownership model: safe by default, explicit when needed

// Default: automatic reference counting
class Node {
    var value: Int
    var children: [Node]

    init(value: Int) {
        self.value = value
        self.children = []
    }
}

// Explicit ownership for performance-critical code
func processBuffer(_ buffer: borrowing [UInt8]) -> Int {
    // borrowing: read-only access, no copy
    // Widen each byte to Int so the sum matches the return type
    buffer.reduce(0) { $0 + Int($1) }
}

func consumeBuffer(_ buffer: consuming [UInt8]) -> [UInt8] {
    // consuming: takes ownership, no copy
    var result = buffer
    result.append(0)
    return result
}

// Reference-typed backing store (illustrative; isKnownUniquelyReferenced
// requires a class instance)
final class Storage {
    var data: [Int]
    init(data: [Int]) { self.data = data }
    func copy() -> Storage { Storage(data: data) }
}

// Copy-on-write for value semantics with efficiency
struct LargeData {
    private var storage: Storage

    mutating func modify() {
        // Copy only if shared
        if !isKnownUniquelyReferenced(&storage) {
            storage = storage.copy()
        }
        storage.data[0] = 42
    }
}
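The copy-on-write pattern is not specific to Swift. As a rough C++ analogue, a reference-counted pointer can play the role of the uniqueness check. `CowBuffer` is a hypothetical name, and the sketch assumes single-threaded use, since `use_count()` is not a reliable uniqueness test when other threads may copy the buffer concurrently.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Copy-on-write wrapper: copies share storage until one of them mutates.
class CowBuffer {
public:
  explicit CowBuffer(std::vector<int> data)
      : storage(std::make_shared<std::vector<int>>(std::move(data))) {}

  int get(std::size_t i) const { return (*storage)[i]; }

  void set(std::size_t i, int value) {
    // Copy only if another CowBuffer still shares this storage.
    if (storage.use_count() > 1)
      storage = std::make_shared<std::vector<int>>(*storage);
    (*storage)[i] = value;
  }

  bool sharesStorageWith(const CowBuffer &other) const {
    return storage == other.storage;
  }

private:
  std::shared_ptr<std::vector<int>> storage;
};
```

Copying a `CowBuffer` costs one reference-count increment; the element copy is deferred until the first write to a shared buffer.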
IR Design Principles
Intermediate Representation Design
══════════════════════════════════════════════════════════════
Level         Abstraction     Purpose
────────────────────────────────────────────────────────────
Source        Syntax trees    Parsing, early semantic
AST/HIR       Typed trees     Type checking, inference
MIR/SIL       Typed CFG       Optimization, ownership
LLVM IR       Typed SSA       Machine-independent opt
Machine IR    Target ops      Instruction selection
Assembly      Text            Final output

Key principles:
• Each level has ONE clear purpose
• Lowering is progressive and well-defined
• Analyses valid at one level may not be at another
• Transformations declare their requirements
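The last principle, transformations declaring their requirements, can be checked mechanically. The sketch below is a toy model under stated assumptions: the `Level` enum collapses the table's rows into six values, and `Stage` and `isWellFormed` are hypothetical names. A pipeline is accepted only if every stage receives the level it declares and no stage moves back up the abstraction ladder.

```cpp
#include <string>
#include <vector>

// IR levels from the table, ordered from most to least abstract.
enum class Level { Source, AST, SIL, LLVMIR, MachineIR, Assembly };

// A stage declares the level it consumes and the level it produces.
struct Stage {
  std::string name;
  Level input;
  Level output;
};

// A pipeline is well-formed if it starts at `start`, each stage receives
// the level it requires, and no stage raises the abstraction level.
bool isWellFormed(Level start, const std::vector<Stage> &pipeline) {
  Level current = start;
  for (const Stage &s : pipeline) {
    if (s.input != current)
      return false;  // the stage's declared requirement is not met
    if (static_cast<int>(s.output) < static_cast<int>(s.input))
      return false;  // lowering must not go back up
    current = s.output;
  }
  return true;
}
```

A stage whose input and output levels match models an optimization that stays within one IR, while a stage that changes level models a lowering step.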
Mental Model
Lattner approaches compiler design by asking:
- What's the right abstraction level? Different problems need different IRs
- Is this reusable? Build infrastructure, not one-off tools
- What's the user experience? Diagnostics, error recovery, tooling
- How will this evolve? Design for change and extension
- Can others build on this? Library-first, composable design
Signature Lattner Moves
- LLVM's pass manager: Modular, composable optimization passes
- Clang's diagnostics: The gold standard for helpful error messages
- Swift's optionals: Explicit nullability without verbosity
- MLIR's dialect system: Multi-level IR with extensible operations
- Library-first design: Compilers as reusable infrastructure
- Progressive lowering: Clear transformation stages
- SwiftUI's result builders: Compiler magic that feels natural