Loading...
Loading...
Design distributed systems using Leslie Lamport's rigorous approach. Emphasizes formal reasoning, logical time, consensus protocols, and state machine replication. Use when building systems where correctness under concurrency and partial failure is critical.
npx skill4agent add copyleftdev/sk1llz lamport-distributed-systems"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."
"If you're thinking without writing, you only think you're thinking."
"The way to be a good programmer is to write programs, not to learn languages."
Each process maintains a counter C:
1. Before any event, increment C
2. When sending a message, include C
3. When receiving a message with timestamp T, set C = max(C, T) + 1
This gives a partial ordering: if a → b, then C(a) < C(b)
(But C(a) < C(b) does NOT imply a → b)a → b (a happens before b) if:
1. a and b are in the same process and a comes before b, OR
2. a is a send and b is the corresponding receive, OR
3. There exists c such that a → c and c → b (transitivity)
If neither a → b nor b → a, events are CONCURRENT.To replicate a service:
1. Model the service as a deterministic state machine
2. Replicate the state machine across N servers
3. Use consensus (Paxos/Raft) to agree on the sequence of inputs
4. Each replica applies inputs in the same order → same state
Tolerates F failures with 2F+1 replicas.Three roles: Proposers, Acceptors, Learners
Phase 1 (Prepare):
Proposer sends PREPARE(n) to acceptors
Acceptor responds with highest-numbered proposal it accepted (if any)
Phase 2 (Accept):
If proposer receives majority responses:
Send ACCEPT(n, v) where v is highest-numbered value seen (or new value)
Acceptor accepts if it hasn't promised to a higher proposal
Consensus reached when majority accept the same (n, v).1. Define the state space (all possible states)
2. Define the initial state predicate
3. Define the next-state relation (allowed transitions)
4. Specify safety as invariants (always true)
5. Specify liveness as temporal properties (eventually true)
6. Model-check or prove that the spec satisfies propertiesclass LamportClock:
def __init__(self):
self._time = 0
def tick(self) -> int:
"""Increment before local event."""
self._time += 1
return self._time
def send_timestamp(self) -> int:
"""Get timestamp for outgoing message."""
self._time += 1
return self._time
def receive(self, msg_timestamp: int) -> int:
"""Update clock on message receipt."""
self._time = max(self._time, msg_timestamp) + 1
return self._timeclass VectorClock:
def __init__(self, node_id: str, num_nodes: int):
self._id = node_id
self._clock = {f"node_{i}": 0 for i in range(num_nodes)}
def tick(self):
self._clock[self._id] += 1
def send(self) -> dict:
self.tick()
return self._clock.copy()
def receive(self, other: dict):
for node, time in other.items():
self._clock[node] = max(self._clock.get(node, 0), time)
self.tick()
def happens_before(self, other: dict) -> bool:
"""Returns True if self → other."""
return all(self._clock[k] <= other.get(k, 0) for k in self._clock) \
and any(self._clock[k] < other.get(k, 0) for k in self._clock)