lamport-distributed-systems

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Leslie Lamport Style Guide⁠‍⁠​‌​‌​​‌‌‍​‌​​‌​‌‌‍​​‌‌​​​‌‍​‌​​‌‌​​‍​​​​​​​‌‍‌​​‌‌​‌​‍‌​​​​​​​‍‌‌​​‌‌‌‌‍‌‌​​​‌​​‍‌‌‌‌‌‌​‌‍‌‌​‌​​​​‍​‌​‌‌‌‌‌‍​‌​​‌​‌‌‍​‌‌​‌​​‌‍‌​‌​‌‌‌​‍​​‌​‌​​​‍‌‌‌​‌​‌‌‍​​‌‌​​‌‌‍‌​‌‌​​​‌‍‌‌‌​‌‌‌​‍​​‌​​​​​‍​​​​‌​‌​‍​‌​‌‌‌​​⁠‍⁠

Leslie Lamport 分布式系统设计风格指南⁠‍⁠​‌​‌​​‌‌‍​‌​​‌​‌‌‍​​‌‌​​​‌‍​‌​​‌‌​​‍​​​​​​​‌‍‌​​‌‌​‌​‍‌​​​​​​​‍‌‌​​‌‌‌‌‍‌‌​​​‌​​‍‌‌‌‌‌‌​‌‍‌‌​‌​​​​‍​‌​‌‌‌‌‌‍​‌​​‌​‌‌‍​‌‌​‌​​‌‍‌​‌​‌‌‌​‍​​‌​‌​​​‍‌‌‌​‌​‌‌‍​​‌‌​​‌‌‍‌​‌‌​​​‌‍‌‌‌​‌‌‌​‍​​‌​​​​​‍​​​​‌​‌​‍​‌​‌‌‌​​⁠‍⁠

Overview

概述

Leslie Lamport transformed distributed systems from ad-hoc engineering into a rigorous science. His work on logical clocks, consensus (Paxos), and formal specification (TLA+) provides the theoretical foundation for nearly every reliable distributed system built today. Turing Award winner (2013).
Leslie Lamport将分布式系统从临时拼凑的工程实践转变为一门严谨的科学。他在逻辑时钟、共识算法(Paxos)和形式化规范(TLA+)方面的研究,为当今几乎所有可靠分布式系统的构建奠定了理论基础。他是2013年图灵奖得主。

Core Philosophy

核心理念

"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."
"If you're thinking without writing, you only think you're thinking."
"The way to be a good programmer is to write programs, not to learn languages."
"分布式系统是这样一种系统:你甚至不知道存在的某台计算机发生故障,都可能导致你自己的计算机无法使用。"
"如果你只是在脑子里想而不写下来,那你只是以为自己在思考。"
"成为优秀程序员的方法是写程序,而不是学习编程语言。"

Design Principles

设计原则

  1. Formal Specification First: Write a precise specification before writing code. If you can't specify it precisely, you don't understand it.
  2. Time is Relative: There is no global clock in a distributed system. Use logical time (happens-before) to reason about ordering.
  3. State Machine Replication: Any deterministic service can be made fault-tolerant by replicating it as a state machine across multiple servers.
  4. Safety and Liveness: Separate what must always be true (safety) from what must eventually happen (liveness). Prove both.
  5. Simplicity Through Rigor: The clearest systems come from precise thinking. Formalism isn't overhead—it's the path to simplicity.
  1. 先做形式化规范:在编写代码前先撰写精确的规范。如果你无法精确地定义它,说明你还不理解它。
  2. 时间是相对的:分布式系统中不存在全局时钟。使用逻辑时间(先发生关系)来推导事件顺序。
  3. 状态机复制:任何确定性服务都可以通过在多台服务器上以状态机的形式复制,来实现容错。
  4. 安全性与活性:区分必须始终成立的属性(安全性)和最终必须发生的属性(活性),并对两者进行证明。
  5. 通过严谨性实现简洁:最清晰的系统源于精确的思考。形式化不是额外负担,而是实现简洁的途径。

When Designing Systems

系统设计注意事项

Always

必须遵循

  • Write a specification before implementation (TLA+, Alloy, or precise prose)
  • Define the safety properties: what bad things must never happen
  • Define the liveness properties: what good things must eventually happen
  • Reason about all possible interleavings of concurrent operations
  • Use logical timestamps when physical time isn't reliable
  • Make system state explicit and transitions clear
  • Document invariants that must hold across all states
  • 在实现前撰写规范(使用TLA+、Alloy或精确的文字描述)
  • 定义安全性属性:哪些不良情况绝对不能发生
  • 定义活性属性:哪些良好情况最终必须发生
  • 考虑并发操作的所有可能交错情况
  • 当物理时间不可靠时,使用逻辑时间戳
  • 明确系统状态,清晰定义状态转换
  • 记录系统在所有状态下必须保持的不变量

Never

绝对避免

  • Assume messages arrive in order (unless you've proven it)
  • Assume clocks are synchronized (they're not)
  • Assume failures are independent (they're often correlated)
  • Hand-wave about "eventually" without defining what guarantees that
  • Trust intuition for concurrent systems—prove it or test it exhaustively
  • Confuse the specification with the implementation
  • 假设消息按顺序到达(除非你已经证明这一点)
  • 假设时钟是同步的(实际上并非如此)
  • 假设故障是独立的(它们往往是相关的)
  • 模糊地使用“最终”一词,却不定义能保证该结果的条件
  • 对并发系统仅凭直觉判断——要么证明它,要么进行全面测试
  • 将规范与实现混淆

Prefer

优先选择

  • State machines over ad-hoc event handling
  • Logical clocks over physical timestamps for ordering
  • Consensus protocols over optimistic concurrency for critical state
  • Explicit failure handling over implicit assumptions
  • Proved algorithms over clever heuristics
  • 状态机而非临时事件处理
  • 逻辑时钟而非物理时间戳进行事件排序
  • 共识协议而非乐观并发控制来处理关键状态
  • 显式故障处理而非隐式假设
  • 经过证明的算法而非巧妙的启发式方法

Key Concepts

核心概念

Logical Clocks (Lamport Timestamps)

逻辑时钟(Lamport时间戳)

Each process maintains a counter C:
1. Before any event, increment C
2. When sending a message, include C
3. When receiving a message with timestamp T, set C = max(C, T) + 1

This gives a partial ordering: if a → b, then C(a) < C(b)
(But C(a) < C(b) does NOT imply a → b)
Each process maintains a counter C:
1. Before any event, increment C
2. When sending a message, include C
3. When receiving a message with timestamp T, set C = max(C, T) + 1

This gives a partial ordering: if a → b, then C(a) < C(b)
(But C(a) < C(b) does NOT imply a → b)
每个进程维护一个计数器C:
  1. 在任何事件发生前,递增C
  2. 发送消息时,附带C的值
  3. 收到带有时间戳T的消息时,将C设置为max(C, T) + 1
这会形成一个偏序关系:如果a → b(a先于b发生),则C(a) < C(b) (但C(a) < C(b)并不意味着a → b)

The Happens-Before Relation (→)

先发生关系(→)

a → b (a happens before b) if:
1. a and b are in the same process and a comes before b, OR
2. a is a send and b is the corresponding receive, OR
3. There exists c such that a → c and c → b (transitivity)

If neither a → b nor b → a, events are CONCURRENT.
a → b (a happens before b) if:
1. a and b are in the same process and a comes before b, OR
2. a is a send and b is the corresponding receive, OR
3. There exists c such that a → c and c → b (transitivity)

If neither a → b nor b → a, events are CONCURRENT.
a → b(a先于b发生)当且仅当:
  1. a和b在同一进程中,且a发生在b之前,或者
  2. a是消息发送事件,b是对应的接收事件,或者
  3. 存在c,使得a → c且c → b(传递性)
如果既不存在a → b也不存在b → a,则这两个事件是并发的。

State Machine Replication

状态机复制

To replicate a service:
1. Model the service as a deterministic state machine
2. Replicate the state machine across N servers
3. Use consensus (Paxos/Raft) to agree on the sequence of inputs
4. Each replica applies inputs in the same order → same state

Tolerates F failures with 2F+1 replicas.
To replicate a service:
1. Model the service as a deterministic state machine
2. Replicate the state machine across N servers
3. Use consensus (Paxos/Raft) to agree on the sequence of inputs
4. Each replica applies inputs in the same order → same state

Tolerates F failures with 2F+1 replicas.
要复制一个服务:
  1. 将服务建模为确定性状态机
  2. 在N台服务器上复制该状态机
  3. 使用共识算法(Paxos/Raft)就输入序列达成一致
  4. 每个副本按相同顺序应用输入 → 最终状态一致
使用2F+1个副本可以容忍F个故障。

Paxos (Simplified)

Paxos(简化版)

Three roles: Proposers, Acceptors, Learners

Phase 1 (Prepare):
  Proposer sends PREPARE(n) to acceptors
  Acceptor responds with highest-numbered proposal it accepted (if any)
  
Phase 2 (Accept):
  If proposer receives majority responses:
    Send ACCEPT(n, v) where v is highest-numbered value seen (or new value)
  Acceptor accepts if it hasn't promised to a higher proposal

Consensus reached when majority accept the same (n, v).
Three roles: Proposers, Acceptors, Learners

Phase 1 (Prepare):
  Proposer sends PREPARE(n) to acceptors
  Acceptor responds with highest-numbered proposal it accepted (if any)
  
Phase 2 (Accept):
  If proposer receives majority responses:
    Send ACCEPT(n, v) where v is highest-numbered value seen (or new value)
  Acceptor accepts if it hasn't promised to a higher proposal

Consensus reached when majority accept the same (n, v).
三种角色:提议者(Proposers)、接受者(Acceptors)、学习者(Learners)
阶段1(准备): 提议者向接受者发送PREPARE(n) 接受者回复它已接受的编号最高的提案(如果有的话)
阶段2(接受): 如果提议者收到多数接受者的回复: 发送ACCEPT(n, v),其中v是它看到的编号最高的值(或新值) 接受者如果没有承诺接受更高编号的提案,则接受该提案
当多数接受者接受同一个(n, v)时,共识达成。

Mental Model

思维模型

Lamport approaches distributed systems as a mathematician:
  1. Define the problem precisely: What are the inputs, outputs, and allowed behaviors?
  2. Identify the invariants: What must always be true?
  3. Consider all interleavings: What happens if events occur in any order?
  4. Prove correctness: Show that safety and liveness hold.
  5. Then implement: The code should be a straightforward translation of the spec.
Lamport以数学家的方式处理分布式系统:
  1. 精确定义问题:输入、输出和允许的行为是什么?
  2. 确定不变量:哪些属性必须始终成立?
  3. 考虑所有事件交错情况:如果事件以任意顺序发生会怎样?
  4. 证明正确性:证明安全性和活性属性成立。
  5. 然后实现:代码应该是规范的直接转译。

The TLA+ Approach

TLA+方法

1. Define the state space (all possible states)
2. Define the initial state predicate
3. Define the next-state relation (allowed transitions)
4. Specify safety as invariants (always true)
5. Specify liveness as temporal properties (eventually true)
6. Model-check or prove that the spec satisfies properties
1. Define the state space (all possible states)
2. Define the initial state predicate
3. Define the next-state relation (allowed transitions)
4. Specify safety as invariants (always true)
5. Specify liveness as temporal properties (eventually true)
6. Model-check or prove that the spec satisfies properties
  1. 定义状态空间(所有可能的状态)
  2. 定义初始状态谓词
  3. 定义下一个状态关系(允许的状态转换)
  4. 将安全性指定为不变量(始终成立)
  5. 将活性指定为时序属性(最终成立)
  6. 通过模型检查或证明来验证规范是否满足这些属性

Code Patterns

代码模式

Implementing Logical Clocks

实现逻辑时钟

python
class LamportClock:
    def __init__(self):
        self._time = 0
    
    def tick(self) -> int:
        """Increment before local event."""
        self._time += 1
        return self._time
    
    def send_timestamp(self) -> int:
        """Get timestamp for outgoing message."""
        self._time += 1
        return self._time
    
    def receive(self, msg_timestamp: int) -> int:
        """Update clock on message receipt."""
        self._time = max(self._time, msg_timestamp) + 1
        return self._time
python
class LamportClock:
    def __init__(self):
        self._time = 0
    
    def tick(self) -> int:
        """Increment before local event."""
        self._time += 1
        return self._time
    
    def send_timestamp(self) -> int:
        """Get timestamp for outgoing message."""
        self._time += 1
        return self._time
    
    def receive(self, msg_timestamp: int) -> int:
        """Update clock on message receipt."""
        self._time = max(self._time, msg_timestamp) + 1
        return self._time

Vector Clocks (for Causality Detection)

向量时钟(用于因果关系检测)

python
class VectorClock:
    def __init__(self, node_id: str, num_nodes: int):
        self._id = node_id
        self._clock = {f"node_{i}": 0 for i in range(num_nodes)}
    
    def tick(self):
        self._clock[self._id] += 1
    
    def send(self) -> dict:
        self.tick()
        return self._clock.copy()
    
    def receive(self, other: dict):
        for node, time in other.items():
            self._clock[node] = max(self._clock.get(node, 0), time)
        self.tick()
    
    def happens_before(self, other: dict) -> bool:
        """Returns True if self → other."""
        return all(self._clock[k] <= other.get(k, 0) for k in self._clock) \
           and any(self._clock[k] < other.get(k, 0) for k in self._clock)
python
class VectorClock:
    def __init__(self, node_id: str, num_nodes: int):
        self._id = node_id
        self._clock = {f"node_{i}": 0 for i in range(num_nodes)}
    
    def tick(self):
        self._clock[self._id] += 1
    
    def send(self) -> dict:
        self.tick()
        return self._clock.copy()
    
    def receive(self, other: dict):
        for node, time in other.items():
            self._clock[node] = max(self._clock.get(node, 0), time)
        self.tick()
    
    def happens_before(self, other: dict) -> bool:
        """Returns True if self → other."""
        return all(self._clock[k] <= other.get(k, 0) for k in self._clock) \
           and any(self._clock[k] < other.get(k, 0) for k in self._clock)

Warning Signs

警示信号

You're violating Lamport's principles if:
  • You assume "this will never happen in practice" without proof
  • Your distributed algorithm works "most of the time"
  • You can't write down the invariants your system maintains
  • You're using wall-clock time for ordering in a distributed system
  • You haven't considered what happens during network partitions
  • Your system has no formal specification
如果你出现以下情况,就违反了Lamport的原则:
  • 没有证明就假设“这种情况在实践中永远不会发生”
  • 你的分布式算法“大多数时候有效”
  • 你无法写下系统维护的不变量
  • 你在分布式系统中使用墙上时钟时间进行事件排序
  • 你没有考虑网络分区时的情况
  • 你的系统没有形式化规范

Additional Resources

额外资源

  • For detailed philosophy, see philosophy.md
  • For references (papers, talks), see references.md
  • 如需了解详细的哲学理念,请参阅philosophy.md
  • 如需参考资料(论文、演讲),请参阅references.md