torvalds-kernel-pragmatism

Linus Torvalds Style Guide

Overview

Linus Torvalds created the Linux kernel and Git, managing one of the largest collaborative software projects in history. His approach combines deep technical excellence with pragmatic decision-making and famously direct code review.

Core Philosophy

"Talk is cheap. Show me the code."

"Bad programmers worry about the code. Good programmers worry about data structures and their relationships."

"Given enough eyeballs, all bugs are shallow." (Linus's Law, as formulated by Eric S. Raymond)

Torvalds believes in practical excellence: code that works, performs well, and can be maintained by a distributed team of thousands.

Design Principles

Data Structures First: Get the data structures right; the code follows.
Performance Matters: Understand cache, branches, and memory.
Pragmatism Over Purity: Working code beats elegant theory.
Code Review Is Essential: Every patch must withstand scrutiny.

When Writing Code

Always

Design data structures before algorithms
Think about cache locality
Profile before optimizing
Write clear commit messages
Keep patches small and focused
Test on real hardware

Never

Submit untested code
Ignore performance implications
Use abstractions that hide costs
Write clever code that obscures intent
Break userspace API/ABI
Ignore reviewer feedback

Prefer

Arrays over linked lists (cache friendly)
Simple loops over recursion
Inline functions over macros
Explicit state over hidden magic
Measured optimizations over speculative
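
The preference for inline functions over macros is easiest to see when an argument has a side effect. The following is a minimal userspace sketch, not kernel code; `MAX_MACRO`, `max_inline`, `next_value`, and the two `calls_with_*` helpers are hypothetical names introduced only for this illustration.

```c
#include <assert.h>

/* Hypothetical macro: arguments are pasted in textually, so one may run twice. */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* Inline function: each argument is evaluated exactly once and type-checked. */
static inline int max_inline(int a, int b)
{
    return a > b ? a : b;
}

/* Helper with a visible side effect: counts how often it is called. */
static int call_count;

static int next_value(void)
{
    call_count++;
    return 10;
}

/* Returns how many times next_value() ran when passed through the macro. */
static int calls_with_macro(void)
{
    call_count = 0;
    int m = MAX_MACRO(next_value(), 5);
    assert(m == 10);
    (void)m;
    return call_count;
}

/* Returns how many times next_value() ran when passed to the inline function. */
static int calls_with_inline(void)
{
    call_count = 0;
    int m = max_inline(next_value(), 5);
    assert(m == 10);
    (void)m;
    return call_count;
}
```

Both versions compute the same maximum, but the macro expands `next_value()` into the comparison and again into the selected branch, so the side effect happens twice. That hidden cost is the kind of surprise the "explicit state over hidden magic" item warns about.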

Code Patterns

Linux Kernel Style

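// kernel style: tabs (8 columns), ~80-column lines, spaces around operators
// DEVICE_BASE, DEVICE_SIZE, and device_handler are placeholders defined elsewhere

#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/list.h>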

struct device_data {
        struct list_head list;
        unsigned long flags;
        void __iomem *base;
        int irq;
};

static int device_init(struct device_data *dev)
{
        int ret;

        dev->base = ioremap(DEVICE_BASE, DEVICE_SIZE);
        if (!dev->base) {
                pr_err("Failed to map device memory\n");
                return -ENOMEM;
        }

        ret = request_irq(dev->irq, device_handler, 0, "mydev", dev);
        if (ret) {
                iounmap(dev->base);
                return ret;
        }

        return 0;
}

Data Structures Matter

// BAD: Linked list for frequently traversed data
struct node {
    struct node *next;
    int value;
};

// Traversal: terrible cache behavior
// Each node is a cache miss

// GOOD: Array-based for cache locality
struct array {
    int *values;
    size_t count;
    size_t capacity;
};

// Traversal: sequential memory access
// Prefetcher works, cache is happy


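// When you need linked lists, use the kernel's
#include <linux/list.h>

struct my_item {
    struct list_head list;  // Embed the list node
    int data;
};

static LIST_HEAD(my_list);  // Declares and initializes an empty list

// Iterate; use list_for_each_entry_safe() if entries are removed in the loop
// process() stands in for whatever work the caller does per item
static void process_all(void)
{
    struct my_item *item;

    list_for_each_entry(item, &my_list, list)
        process(item->data);
}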
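
The array-based layout above only declares the fields. A minimal userspace sketch of how such an array might grow and be traversed follows; `array_push`, `array_sum`, `array_free`, the starting capacity of 8, and the doubling policy are assumptions made for illustration, not something the guide prescribes.

```c
#include <assert.h>
#include <stdlib.h>

struct array {
    int *values;
    size_t count;
    size_t capacity;
};

/* Append one value, doubling capacity when full. Returns 0, or -1 if allocation fails. */
static int array_push(struct array *a, int value)
{
    assert(a != NULL);
    if (a->count == a->capacity) {
        size_t new_cap = a->capacity ? a->capacity * 2 : 8;
        int *p = realloc(a->values, new_cap * sizeof(*p));

        if (!p)
            return -1;      /* The old buffer is still valid and still owned by a */
        a->values = p;
        a->capacity = new_cap;
    }
    a->values[a->count++] = value;
    return 0;
}

/* Sequential traversal: every element sits next to the previous one in memory. */
static long array_sum(const struct array *a)
{
    long sum = 0;

    for (size_t i = 0; i < a->count; i++)
        sum += a->values[i];
    return sum;
}

/* Release the buffer and reset the array to its empty state. */
static void array_free(struct array *a)
{
    free(a->values);
    a->values = NULL;
    a->count = 0;
    a->capacity = 0;
}
```

A zero-initialized `struct array` is a valid empty array here, so callers can start with `struct array a = {0};` and push values without a separate init call.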

Error Handling Patterns

// Single exit point with goto for cleanup
int complex_init(struct device *dev)
{
        int ret;

        dev->buffer = kmalloc(BUF_SIZE, GFP_KERNEL);
        if (!dev->buffer) {
                ret = -ENOMEM;
                goto err_buffer;
        }

        dev->workqueue = create_workqueue("mydev");
        if (!dev->workqueue) {
                ret = -ENOMEM;
                goto err_workqueue;
        }

        ret = register_device(dev);
        if (ret)
                goto err_register;

        return 0;

err_register:
        destroy_workqueue(dev->workqueue);
err_workqueue:
        kfree(dev->buffer);
err_buffer:
        return ret;
}

// Cleanup in reverse order of initialization
// One error path, easy to audit
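
The same unwinding pattern can be exercised outside the kernel. The sketch below is a userspace analogue under stated assumptions: `struct ctx`, `try_alloc`, `fail_at`, `ctx_init`, and `ctx_destroy` are hypothetical names, and `fail_at` is a test hook added only so the error path can be triggered on demand.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct ctx {
    char *rx_buf;
    char *tx_buf;
};

/* Test hook (hypothetical): make the Nth allocation fail; 0 means never fail. */
static int fail_at;
static int alloc_calls;

static void *try_alloc(size_t size)
{
    if (++alloc_calls == fail_at)
        return NULL;
    return malloc(size);
}

/* Acquire two buffers; on failure, release whatever was already acquired. */
static int ctx_init(struct ctx *c)
{
    int ret;

    assert(c != NULL);
    alloc_calls = 0;

    c->rx_buf = try_alloc(256);
    if (!c->rx_buf) {
        ret = -ENOMEM;
        goto err_rx;
    }

    c->tx_buf = try_alloc(256);
    if (!c->tx_buf) {
        ret = -ENOMEM;
        goto err_tx;
    }

    return 0;

err_tx:
    free(c->rx_buf);        /* Undo step 1 because step 2 failed */
    c->rx_buf = NULL;
err_rx:
    return ret;
}

/* Normal teardown, also in reverse order of acquisition. */
static void ctx_destroy(struct ctx *c)
{
    free(c->tx_buf);
    free(c->rx_buf);
    c->tx_buf = NULL;
    c->rx_buf = NULL;
}
```

Each label undoes exactly the steps that succeeded before the failing one, so adding a third resource means adding one acquisition, one label, and one release, without touching the earlier paths.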

Commit Message Excellence

subsystem: short summary (50 chars or less)

More detailed explanatory text, if necessary. Wrap it to about 72
characters. The blank line separating the summary from the body is
critical.

Explain the problem that this commit is solving. Focus on why you
are making this change as opposed to how. The code shows the how.

If there are any side effects or other unintuitive consequences of
this change, explain them here.

Fixes: abc123def456 ("commit that introduced bug")
Reported-by: Someone <someone@example.com>
Signed-off-by: Your Name <you@example.com>

Performance-Conscious Code

// Branch prediction: common case first
if (likely(fast_path_condition)) {
    // Common case
    return quick_result;
}
// Slow path
return handle_slow_case();


// Cache-friendly iteration
// BAD: strided access
for (int j = 0; j < cols; j++)
    for (int i = 0; i < rows; i++)
        process(matrix[i][j]);  // Column-by-column = cache misses

// GOOD: sequential access
for (int i = 0; i < rows; i++)
    for (int j = 0; j < cols; j++)
        process(matrix[i][j]);  // Row-major = cache friendly


// READ_ONCE/WRITE_ONCE keep the compiler from tearing or re-reading
// accesses to shared data; they add no ordering, so use barriers only
// where ordering is actually required
int value = READ_ONCE(shared_variable);
WRITE_ONCE(shared_variable, new_value);
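
Outside the kernel, the branch hints and the row-major traversal can be tried together. This is a minimal sketch assuming GCC or Clang, since `__builtin_expect` is a compiler builtin; the `likely`/`unlikely` definitions mirror the usual kernel shape, and `matrix_sum` is a hypothetical helper written for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel macros; __builtin_expect is GCC/Clang-specific. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sum a rows x cols matrix stored row-major in one flat buffer. */
static long matrix_sum(const int *m, size_t rows, size_t cols)
{
    long sum = 0;

    if (unlikely(m == NULL))    /* Rare error path, hinted as cold */
        return 0;

    /* Precondition: rows * cols must not overflow size_t. */
    assert(cols == 0 || rows <= SIZE_MAX / cols);

    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            sum += m[i * cols + j];     /* Inner loop walks adjacent memory */
    return sum;
}
```

The hint changes only how the compiler lays out the branches, never the result, so it is worth adding only where profiling shows a hot path with a clearly dominant case.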

Git Usage

# Torvalds Git workflow

# Commit often, commit small
git add -p  # Stage hunks, not files
git commit -m "subsystem: specific change"

# Rebase for clean history (before sharing)
git rebase -i HEAD~5  # Clean up local commits

# Never rebase published history
# History is sacred once pushed

# Bisect to find bugs
git bisect start
git bisect bad HEAD
git bisect good v5.10
# Git finds the breaking commit

# Blame to understand code
git blame -w -C -C file.c  # Ignore whitespace, track moves

Subsystem Design

// Define clear boundaries between subsystems
// Each subsystem has:
// 1. Public API (exported symbols)
// 2. Internal implementation
// 3. Data structures

// Public API
int subsystem_init(void);
void subsystem_cleanup(void);
int subsystem_do_thing(struct thing *t);

// Internal - not exported
static int internal_helper(void);
static struct cache internal_cache;

// Use proper namespacing
// subsystem_verb_noun() (the names below illustrate the pattern)

int netdev_register_device(struct net_device *dev);
int netdev_unregister_device(struct net_device *dev);
int blkdev_read_sector(struct block_device *bdev, sector_t sector);
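
The boundary between public API and internals can be sketched in a single userspace C file, where `static` plays the role of "not exported". Everything here is hypothetical: the `stats` subsystem, its four public functions, and the fixed `STATS_MAX_SAMPLES` limit exist only to illustrate the split.

```c
#include <assert.h>
#include <stddef.h>

/* ---- Public API: what a header for this subsystem would declare ---- */
int stats_init(void);
int stats_add_sample(int value);
long stats_total(void);
void stats_cleanup(void);

/* ---- Internal: static, so invisible outside this file ---- */
#define STATS_MAX_SAMPLES 64

static struct {
    int samples[STATS_MAX_SAMPLES];
    size_t count;
    int ready;
} stats_state;

static int stats_is_full(void)
{
    return stats_state.count == STATS_MAX_SAMPLES;
}

/* ---- Public API implementation ---- */
int stats_init(void)
{
    stats_state.count = 0;
    stats_state.ready = 1;
    return 0;
}

/* Returns 0 on success, -1 if the subsystem is uninitialized or full. */
int stats_add_sample(int value)
{
    if (!stats_state.ready || stats_is_full())
        return -1;
    stats_state.samples[stats_state.count++] = value;
    return 0;
}

long stats_total(void)
{
    long total = 0;

    assert(stats_state.count <= STATS_MAX_SAMPLES);
    for (size_t i = 0; i < stats_state.count; i++)
        total += stats_state.samples[i];
    return total;
}

void stats_cleanup(void)
{
    stats_state.count = 0;
    stats_state.ready = 0;
}
```

Callers can only reach the state through the four prefixed functions, so the storage could later change from a fixed array to something else without touching any caller.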

Reference Counting

#include <linux/kref.h>
#include <linux/slab.h>

struct my_object {
    struct kref refcount;  // Set to 1 with kref_init() when the object is created
    // ... other fields
};

static void my_object_release(struct kref *kref)
{
    struct my_object *obj = container_of(kref, struct my_object, refcount);
    kfree(obj);
}

// Get reference
struct my_object *my_object_get(struct my_object *obj)
{
    if (obj)
        kref_get(&obj->refcount);
    return obj;
}

// Release reference
void my_object_put(struct my_object *obj)
{
    if (obj)
        kref_put(&obj->refcount, my_object_release);
}
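
The get/put discipline can be tried in userspace with C11 atomics. This is a sketch of the idea rather than a reimplementation of `kref`: `struct object`, `object_create`, `object_get`, `object_put`, and the `released` counter are hypothetical names, and the default sequentially consistent atomics are used for simplicity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct object {
    atomic_int refcount;
    int payload;
};

/* Counts frees so a caller can observe when the last reference went away. */
static int released;

static struct object *object_create(int payload)
{
    struct object *obj = malloc(sizeof(*obj));

    if (!obj)
        return NULL;
    atomic_init(&obj->refcount, 1);     /* The creator holds the first reference */
    obj->payload = payload;
    return obj;
}

/* Take an additional reference; NULL is passed through unchanged. */
static struct object *object_get(struct object *obj)
{
    if (obj) {
        int old = atomic_fetch_add(&obj->refcount, 1);

        assert(old > 0);    /* Taking a reference on a dead object is a bug */
        (void)old;
    }
    return obj;
}

/* Drop a reference. Returns 1 if this call freed the object, otherwise 0. */
static int object_put(struct object *obj)
{
    if (!obj)
        return 0;
    if (atomic_fetch_sub(&obj->refcount, 1) == 1) {
        released++;
        free(obj);
        return 1;
    }
    return 0;
}
```

The rule that matters is ownership: every `object_get` is paired with exactly one `object_put`, and no one touches the object after dropping their own reference.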

Mental Model

Torvalds approaches systems code by asking:

What are the data structures? Design these first
What's the cache behavior? Memory access patterns matter
What's the common case? Optimize for it
Can I review this easily? Clear code, small patches
What breaks if this is wrong? Systems code must be reliable

Signature Torvalds Moves

Data structures before algorithms
goto for cleanup (in kernel code)
likely/unlikely for branch hints
Cache-conscious data layout
Small, focused commits
Direct, honest code review
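
Cache-conscious data layout often starts with field order. The sketch below uses two hypothetical structs, `struct padded` and `struct compact`, holding the same fields; exact sizes depend on the ABI, so the code asserts only the relative result. On a typical 64-bit Linux ABI the first usually occupies 32 bytes and the second 24.

```c
#include <assert.h>
#include <stddef.h>

/* Small fields wedged between large ones force padding after each of them. */
struct padded {
    char flag_a;
    long counter;
    char flag_b;
    long total;
};

/* Same fields, largest first: the small fields share one padded tail. */
struct compact {
    long counter;
    long total;
    char flag_a;
    char flag_b;
};

/* Bytes saved per object by reordering; the exact value is ABI-dependent. */
static size_t bytes_saved(void)
{
    assert(sizeof(struct compact) <= sizeof(struct padded));
    return sizeof(struct padded) - sizeof(struct compact);
}
```

Smaller objects mean more of them fit in each cache line, which matters most for arrays of structs that are traversed often. Measuring with `sizeof` and `offsetof` on the target platform is more reliable than guessing.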
