lamport-distributed-systems

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

Leslie Lamport Style Guide⁠‍⁠‌‌‌‌‍‌‌‌‌‍‌‌‌‍‌‌‌‍‌‍‌‌‌‌‍‌‍‌‌‌‌‌‌‍‌‌‌‍‌‌‌‌‌‌‌‍‌‌‌‍‌‌‌‌‌‌‍‌‌‌‌‍‌‌‌‌‍‌‌‌‌‌‍‌‌‍‌‌‌‌‌‌‍‌‌‌‌‍‌‌‌‌‍‌‌‌‌‌‌‍‌‍‌‌‍‌‌‌‌⁠‍⁠

Leslie Lamport 分布式系统设计风格指南⁠‍⁠‌‌‌‌‍‌‌‌‌‍‌‌‌‍‌‌‌‍‌‍‌‌‌‌‍‌‍‌‌‌‌‌‌‍‌‌‌‍‌‌‌‌‌‌‌‍‌‌‌‍‌‌‌‌‌‌‍‌‌‌‌‍‌‌‌‌‍‌‌‌‌‌‍‌‌‍‌‌‌‌‌‌‍‌‌‌‌‍‌‌‌‌‍‌‌‌‌‌‌‍‌‍‌‌‍‌‌‌‌⁠‍⁠

Overview

概述

Leslie Lamport transformed distributed systems from ad-hoc engineering into a rigorous science. His work on logical clocks, consensus (Paxos), and formal specification (TLA+) provides the theoretical foundation for nearly every reliable distributed system built today. Turing Award winner (2013).

Leslie Lamport将分布式系统从临时拼凑的工程实践转变为一门严谨的科学。他在逻辑时钟、共识算法（Paxos）和形式化规范（TLA+）方面的研究，为当今几乎所有可靠分布式系统的构建奠定了理论基础。他是2013年图灵奖得主。

Core Philosophy

核心理念

"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."

"If you're thinking without writing, you only think you're thinking."

"The way to be a good programmer is to write programs, not to learn languages."

"分布式系统是这样一种系统：你甚至不知道存在的某台计算机发生故障，都可能导致你自己的计算机无法使用。"

"如果你只是在脑子里想而不写下来，那你只是以为自己在思考。"

"成为优秀程序员的方法是写程序，而不是学习编程语言。"

Design Principles

设计原则

Formal Specification First: Write a precise specification before writing code. If you can't specify it precisely, you don't understand it.
Time is Relative: There is no global clock in a distributed system. Use logical time (happens-before) to reason about ordering.
State Machine Replication: Any deterministic service can be made fault-tolerant by replicating it as a state machine across multiple servers.
Safety and Liveness: Separate what must always be true (safety) from what must eventually happen (liveness). Prove both.
Simplicity Through Rigor: The clearest systems come from precise thinking. Formalism isn't overhead—it's the path to simplicity.

先做形式化规范：在编写代码前先撰写精确的规范。如果你无法精确地定义它，说明你还不理解它。
时间是相对的：分布式系统中不存在全局时钟。使用逻辑时间（先发生关系）来推导事件顺序。
状态机复制：任何确定性服务都可以通过在多台服务器上以状态机的形式复制，来实现容错。
安全性与活性：区分必须始终成立的属性（安全性）和最终必须发生的属性（活性），并对两者进行证明。
通过严谨性实现简洁：最清晰的系统源于精确的思考。形式化不是额外负担，而是实现简洁的途径。

When Designing Systems

系统设计注意事项

Always

必须遵循

Write a specification before implementation (TLA+, Alloy, or precise prose)
Define the safety properties: what bad things must never happen
Define the liveness properties: what good things must eventually happen
Reason about all possible interleavings of concurrent operations
Use logical timestamps when physical time isn't reliable
Make system state explicit and transitions clear
Document invariants that must hold across all states

在实现前撰写规范（使用TLA+、Alloy或精确的文字描述）
定义安全性属性：哪些不良情况绝对不能发生
定义活性属性：哪些良好情况最终必须发生
考虑并发操作的所有可能交错情况
当物理时间不可靠时，使用逻辑时间戳
明确系统状态，清晰定义状态转换
记录系统在所有状态下必须保持的不变量

Never

绝对避免

Assume messages arrive in order (unless you've proven it)
Assume clocks are synchronized (they're not)
Assume failures are independent (they're often correlated)
Hand-wave about "eventually" without defining what guarantees that
Trust intuition for concurrent systems—prove it or test it exhaustively
Confuse the specification with the implementation

假设消息按顺序到达（除非你已经证明这一点）
假设时钟是同步的（实际上并非如此）
假设故障是独立的（它们往往是相关的）
模糊地使用“最终”一词，却不定义能保证该结果的条件
对并发系统仅凭直觉判断——要么证明它，要么进行全面测试
将规范与实现混淆

Prefer

优先选择

State machines over ad-hoc event handling
Logical clocks over physical timestamps for ordering
Consensus protocols over optimistic concurrency for critical state
Explicit failure handling over implicit assumptions
Proved algorithms over clever heuristics

状态机而非临时事件处理
逻辑时钟而非物理时间戳进行事件排序
共识协议而非乐观并发控制来处理关键状态
显式故障处理而非隐式假设
经过证明的算法而非巧妙的启发式方法

Key Concepts

核心概念

Logical Clocks (Lamport Timestamps)

逻辑时钟（Lamport时间戳）

Each process maintains a counter C:
1. Before any event, increment C
2. When sending a message, include C
3. When receiving a message with timestamp T, set C = max(C, T) + 1

This gives a partial ordering: if a → b, then C(a) < C(b)
(But C(a) < C(b) does NOT imply a → b)

Each process maintains a counter C:
1. Before any event, increment C
2. When sending a message, include C
3. When receiving a message with timestamp T, set C = max(C, T) + 1

This gives a partial ordering: if a → b, then C(a) < C(b)
(But C(a) < C(b) does NOT imply a → b)

每个进程维护一个计数器C：

在任何事件发生前，递增C
发送消息时，附带C的值
收到带有时间戳T的消息时，将C设置为max(C, T) + 1

这会形成一个偏序关系：如果a → b（a先于b发生），则C(a) < C(b) （但C(a) < C(b)并不意味着a → b）

The Happens-Before Relation (→)

先发生关系（→）

a → b (a happens before b) if:
1. a and b are in the same process and a comes before b, OR
2. a is a send and b is the corresponding receive, OR
3. There exists c such that a → c and c → b (transitivity)

If neither a → b nor b → a, events are CONCURRENT.

a → b (a happens before b) if:
1. a and b are in the same process and a comes before b, OR
2. a is a send and b is the corresponding receive, OR
3. There exists c such that a → c and c → b (transitivity)

If neither a → b nor b → a, events are CONCURRENT.

a → b（a先于b发生）当且仅当：

a和b在同一进程中，且a发生在b之前，或者
a是消息发送事件，b是对应的接收事件，或者
存在c，使得a → c且c → b（传递性）

如果既不存在a → b也不存在b → a，则这两个事件是并发的。

State Machine Replication

状态机复制

To replicate a service:
1. Model the service as a deterministic state machine
2. Replicate the state machine across N servers
3. Use consensus (Paxos/Raft) to agree on the sequence of inputs
4. Each replica applies inputs in the same order → same state

Tolerates F failures with 2F+1 replicas.

To replicate a service:
1. Model the service as a deterministic state machine
2. Replicate the state machine across N servers
3. Use consensus (Paxos/Raft) to agree on the sequence of inputs
4. Each replica applies inputs in the same order → same state

Tolerates F failures with 2F+1 replicas.

要复制一个服务：

将服务建模为确定性状态机
在N台服务器上复制该状态机
使用共识算法（Paxos/Raft）就输入序列达成一致
每个副本按相同顺序应用输入 → 最终状态一致

使用2F+1个副本可以容忍F个故障。

Paxos (Simplified)

Paxos（简化版）

Three roles: Proposers, Acceptors, Learners

Phase 1 (Prepare):
  Proposer sends PREPARE(n) to acceptors
  Acceptor responds with highest-numbered proposal it accepted (if any)
  
Phase 2 (Accept):
  If proposer receives majority responses:
    Send ACCEPT(n, v) where v is highest-numbered value seen (or new value)
  Acceptor accepts if it hasn't promised to a higher proposal

Consensus reached when majority accept the same (n, v).

Three roles: Proposers, Acceptors, Learners

Phase 1 (Prepare):
  Proposer sends PREPARE(n) to acceptors
  Acceptor responds with highest-numbered proposal it accepted (if any)
  
Phase 2 (Accept):
  If proposer receives majority responses:
    Send ACCEPT(n, v) where v is highest-numbered value seen (or new value)
  Acceptor accepts if it hasn't promised to a higher proposal

Consensus reached when majority accept the same (n, v).

三种角色：提议者（Proposers）、接受者（Acceptors）、学习者（Learners）

阶段1（准备）：提议者向接受者发送PREPARE(n) 接受者回复它已接受的编号最高的提案（如果有的话）

阶段2（接受）：如果提议者收到多数接受者的回复：发送ACCEPT(n, v)，其中v是它看到的编号最高的值（或新值）接受者如果没有承诺接受更高编号的提案，则接受该提案

当多数接受者接受同一个(n, v)时，共识达成。

Mental Model

思维模型

Lamport approaches distributed systems as a mathematician:

Define the problem precisely: What are the inputs, outputs, and allowed behaviors?
Identify the invariants: What must always be true?
Consider all interleavings: What happens if events occur in any order?
Prove correctness: Show that safety and liveness hold.
Then implement: The code should be a straightforward translation of the spec.

Lamport以数学家的方式处理分布式系统：

精确定义问题：输入、输出和允许的行为是什么？
确定不变量：哪些属性必须始终成立？
考虑所有事件交错情况：如果事件以任意顺序发生会怎样？
证明正确性：证明安全性和活性属性成立。
然后实现：代码应该是规范的直接转译。

The TLA+ Approach

TLA+方法

1. Define the state space (all possible states)
2. Define the initial state predicate
3. Define the next-state relation (allowed transitions)
4. Specify safety as invariants (always true)
5. Specify liveness as temporal properties (eventually true)
6. Model-check or prove that the spec satisfies properties

1. Define the state space (all possible states)
2. Define the initial state predicate
3. Define the next-state relation (allowed transitions)
4. Specify safety as invariants (always true)
5. Specify liveness as temporal properties (eventually true)
6. Model-check or prove that the spec satisfies properties

定义状态空间（所有可能的状态）
定义初始状态谓词
定义下一个状态关系（允许的状态转换）
将安全性指定为不变量（始终成立）
将活性指定为时序属性（最终成立）
通过模型检查或证明来验证规范是否满足这些属性

Code Patterns

代码模式

Implementing Logical Clocks

实现逻辑时钟

python

class LamportClock:
    def __init__(self):
        self._time = 0
    
    def tick(self) -> int:
        """Increment before local event."""
        self._time += 1
        return self._time
    
    def send_timestamp(self) -> int:
        """Get timestamp for outgoing message."""
        self._time += 1
        return self._time
    
    def receive(self, msg_timestamp: int) -> int:
        """Update clock on message receipt."""
        self._time = max(self._time, msg_timestamp) + 1
        return self._time

python

class LamportClock:
    def __init__(self):
        self._time = 0
    
    def tick(self) -> int:
        """Increment before local event."""
        self._time += 1
        return self._time
    
    def send_timestamp(self) -> int:
        """Get timestamp for outgoing message."""
        self._time += 1
        return self._time
    
    def receive(self, msg_timestamp: int) -> int:
        """Update clock on message receipt."""
        self._time = max(self._time, msg_timestamp) + 1
        return self._time

Vector Clocks (for Causality Detection)

向量时钟（用于因果关系检测）

python

class VectorClock:
    def __init__(self, node_id: str, num_nodes: int):
        self._id = node_id
        self._clock = {f"node_{i}": 0 for i in range(num_nodes)}
    
    def tick(self):
        self._clock[self._id] += 1
    
    def send(self) -> dict:
        self.tick()
        return self._clock.copy()
    
    def receive(self, other: dict):
        for node, time in other.items():
            self._clock[node] = max(self._clock.get(node, 0), time)
        self.tick()
    
    def happens_before(self, other: dict) -> bool:
        """Returns True if self → other."""
        return all(self._clock[k] <= other.get(k, 0) for k in self._clock) \
           and any(self._clock[k] < other.get(k, 0) for k in self._clock)

python

class VectorClock:
    def __init__(self, node_id: str, num_nodes: int):
        self._id = node_id
        self._clock = {f"node_{i}": 0 for i in range(num_nodes)}
    
    def tick(self):
        self._clock[self._id] += 1
    
    def send(self) -> dict:
        self.tick()
        return self._clock.copy()
    
    def receive(self, other: dict):
        for node, time in other.items():
            self._clock[node] = max(self._clock.get(node, 0), time)
        self.tick()
    
    def happens_before(self, other: dict) -> bool:
        """Returns True if self → other."""
        return all(self._clock[k] <= other.get(k, 0) for k in self._clock) \
           and any(self._clock[k] < other.get(k, 0) for k in self._clock)

Warning Signs

警示信号

You're violating Lamport's principles if:

You assume "this will never happen in practice" without proof
Your distributed algorithm works "most of the time"
You can't write down the invariants your system maintains
You're using wall-clock time for ordering in a distributed system
You haven't considered what happens during network partitions
Your system has no formal specification

如果你出现以下情况，就违反了Lamport的原则：

没有证明就假设“这种情况在实践中永远不会发生”
你的分布式算法“大多数时候有效”
你无法写下系统维护的不变量
你在分布式系统中使用墙上时钟时间进行事件排序
你没有考虑网络分区时的情况
你的系统没有形式化规范

Additional Resources

额外资源

For detailed philosophy, see philosophy.md
For references (papers, talks), see references.md

如需了解详细的哲学理念，请参阅philosophy.md
如需参考资料（论文、演讲），请参阅references.md