Chapter 10: Consistency and Consensus
ddia-2e consistency consensus linearizability raft paxos logical-clocks
Status: Notes complete
Overview
Chapter 10 addresses the most theoretically deep questions in distributed systems: what does “correct behavior” even mean when data is spread across multiple nodes? It progresses from the strongest consistency guarantee (linearizability) through weaker models, then introduces the ID generation and logical clock problem as a bridge to the central topic — consensus. The chapter culminates in practical consensus algorithms (Raft, Paxos), their coordination service implementations (ZooKeeper, etcd), and the equivalence between consensus and total order broadcast.
This is the 2nd edition’s equivalent of 1st edition Chapter 9, with two major structural additions:
- ID Generators and Logical Clocks is now a standalone section covering Lamport clocks, vector clocks, Hybrid Logical Clocks, and practical ID generation strategies (Snowflake IDs, ULIDs, TSIDs)
- Linearizable ID Generators — a new section connecting logical clock theory to practical distributed ID systems
Core thesis: Many distributed system problems reduce to consensus. If you can solve consensus, you can solve leader election, atomic broadcast, atomic commit, and distributed locking. But consensus is expensive — not every problem requires it, and identifying the minimum consistency level needed is a key design skill.
Key Concepts
Linearizability
What Makes a System Linearizable?
Linearizability (also called atomic consistency or strong consistency): The strongest consistency guarantee for individual objects. An operation appears to execute instantaneously and atomically at some single point in real time between its invocation and its completion. Once any operation completes, all subsequent operations by any client on any node must observe the new state.
Informal definition: Make a distributed system behave “as if” there’s only a single copy of the data, served by a single, infinitely fast CPU.
Formal property (recency guarantee): If a write W completes before a read R begins, R must return the value written by W (or a later write). There is no “going back” to a previous value.
Linearizability illustrated with a timeline:
CLIENT 1: ├── write(x=1) ──┤
CLIENT 2: ├─────── read(x) ─────────┤
CLIENT 3: ├── read(x) ──┤
If write(x=1) completes before CLIENT 2's read STARTS:
CLIENT 2 MUST see x=1 (not x=0)
CLIENT 3 MUST see x=1 (its read starts after the write completed)
LINEARIZABLE: All clients see the effect of the write immediately after it completes.
NON-LINEARIZABLE: CLIENT 2 reads x=0 even though write(x=1) completed before it started.
Register operations (the building block of linearizability proofs):
- read(x): returns the current value
- write(x, val): sets the value; returns OK
- compare-and-swap(x, expected, new): atomic CAS; returns success/fail
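The three register operations can be sketched on a single machine, where a lock stands in for the "single copy of the data" illusion. This is a local illustration only, not a distributed implementation; the class name `Register` and the username example are assumptions made for the sketch.

```python
import threading


class Register:
    """Single-copy register; a lock makes every operation atomic."""

    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def write(self, value):
        with self._lock:
            self._value = value
        return "OK"

    def compare_and_swap(self, expected, new):
        # Succeeds only if the current value still equals `expected`.
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


# Two nodes race to claim the same username; exactly one CAS can win.
username = Register(None)
first = username.compare_and_swap(None, "claimed-by-node1")
second = username.compare_and_swap(None, "claimed-by-node2")
```

The second CAS fails because the register no longer holds the expected value, which is the property a uniqueness constraint needs.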
Linearizability vs serializability:
- Serializability: Transaction isolation property. Concurrent transactions appear to execute in some serial order. That order can be in the “past” (transactions reordered even if they ran concurrently).
- Linearizability: Recency guarantee for individual operations. The “linearization point” must be within the real-time span of the operation.
- Strict serializability = serializability + linearizability = the strongest possible guarantee (used by Spanner, FoundationDB)
| Model | Linearizable? | Serializable? | Used by |
|---|---|---|---|
| Single-threaded execution | Yes | Yes | Single-node DB |
| Snapshot isolation | No | No (SI ≠ serializable) | PostgreSQL REPEATABLE READ |
| Serializability | No | Yes | CockroachDB SSI |
| Linearizability | Yes | No (per-object) | etcd, ZooKeeper |
| Strict serializability | Yes | Yes | Spanner, FoundationDB |
Relying on Linearizability
When linearizability is required:
- Locking and leader election: All nodes must agree on exactly one holder of the lock at any moment. Without linearizability, two nodes might both believe they hold the lock.
- Uniqueness constraints: Enforcing “only one user can register with this username” requires that the check and the reservation are atomic. Non-linearizable storage allows two nodes to see “username not taken” simultaneously and both register it.
- Cross-channel timing: A user uploads a photo, then sends a link to a friend via email. The email and the photo go through different systems. If the friend clicks the link before the photo is replicated, they see a 404. Linearizability guarantees that after the upload completes, all subsequent reads return the photo.
- Bank account invariants: “Account balance must never go negative” means checking and debiting must appear atomic. Non-linearizable reads allow two concurrent withdrawals to each check the balance independently and both succeed.
Implementing Linearizable Systems
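Single-leader replication with synchronous writes: The leader handles all writes; reads go to the leader or to synchronously updated followers. Potentially linearizable, but only if the node serving reads really is the current leader (or fully caught up) and leadership is stable. A deposed leader that still believes it is in charge can serve stale reads.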
Multi-leader replication: NOT linearizable — multiple leaders accept writes concurrently; conflicts are possible. (By definition, if two nodes can both accept writes, the “single timeline” property is broken.)
Leaderless replication (Dynamo-style): NOT inherently linearizable. Even with quorum reads/writes (r + w > n), clock skew and concurrent writes can result in stale reads. “Last write wins” with timestamps is not linearizable.
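The condition r + w > n can be checked by brute force: it holds exactly when every possible read quorum shares at least one node with every possible write quorum. The helper name below is hypothetical. Overlap is only a necessary ingredient; as noted above, it does not by itself make the system linearizable.

```python
from itertools import combinations


def quorums_always_overlap(n, r, w):
    """True if every r-node read quorum intersects every w-node write quorum."""
    nodes = range(n)
    return all(
        set(read_q) & set(write_q)
        for read_q in combinations(nodes, r)
        for write_q in combinations(nodes, w)
    )


overlap_3_2_2 = quorums_always_overlap(3, 2, 2)   # r + w = 4 > 3
overlap_3_1_2 = quorums_always_overlap(3, 1, 2)   # r + w = 3, not > 3
overlap_5_3_3 = quorums_always_overlap(5, 3, 3)   # r + w = 6 > 5
```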
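Consensus algorithms (Raft, Paxos): Linearizable for writes, since every write goes through a consensus-elected leader that replicates to a majority before acknowledging. Reads are linearizable only if the leader first confirms it is still the leader (for example with a quorum round trip or a lease) or if the read itself goes through the log.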
RAM on a single machine: Linearizable for single-core CPUs. Multi-core CPUs with cache coherence are only linearizable if you use memory barriers / fences — without them, CPU instruction reordering breaks linearizability even locally.
The Cost of Linearizability
CAP theorem (Brewer, 2000):
- C = Consistency (actually means Linearizability in CAP)
- A = Availability (every request gets a non-error response)
- P = Partition tolerance (system continues operating during network partition)
The real choice: Partition tolerance is not optional — network partitions happen. So the real decision during a partition is: CP or AP?
- CP systems (choose consistency): Return an error/timeout during partition. No stale reads. ZooKeeper, etcd, HBase, CockroachDB.
- AP systems (choose availability): Return a response (potentially stale) during partition. Cassandra, DynamoDB (eventually consistent mode), Riak.
CAP is often misunderstood:
- CAP only applies during a partition — not during normal operation (when you can have both C and A)
- CAP’s “consistency” = linearizability — not ACID consistency (an unrelated concept)
- CAP doesn’t say anything about latency during normal operation
PACELC theorem (Daniel Abadi, 2012) — more complete than CAP:
- During Partition: choose Availability vs Consistency
- Else (no partition): choose Latency vs Consistency
- Even without partitions, strong consistency requires coordination → latency cost
- Spanner is PC/EC: it gives up availability on the minority side of a partition, and pays coordination latency for consistency even during normal operation
PACELC Placement:
| System | Partition (A/C) | Else (L/C) | Notes |
|---|---|---|---|
| Spanner | PC | EC | TrueTime; global consistency |
| CockroachDB | PC | EC | Raft-based; serializable isolation |
| etcd / ZooKeeper | PC | EC | Linearizable; blocks during partition |
| DynamoDB | PA | EL | Configurable; AP by default |
| Cassandra | PA | EL | Tunable consistency; AP by default |
| MySQL async replica | PA | EL | Stale reads from replicas OK |
ID Generators and Logical Clocks
(New in 2nd Edition) This section bridges the clock discussion from Chapter 9 with the practical problem of generating unique, ordered identifiers in distributed systems.
Logical Clocks
Why logical clocks exist: Physical clocks (wall clocks) are unreliable for ordering events across nodes (as established in Ch9). Logical clocks capture causal ordering — “if A happened before B, A’s timestamp is smaller than B’s” — without requiring synchronized physical time.
Lamport clocks (Leslie Lamport, 1978):
- Each node maintains a counter L
- Rule 1: Before sending a message, increment L; include L in the message
- Rule 2: On receiving a message with clock value m: L = max(L, m) + 1
- Rule 3: Local events increment L by 1
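The three rules translate almost directly into code. A minimal sketch, with class and method names chosen for illustration:

```python
class LamportClock:
    """One counter per node, updated by the three Lamport rules."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        # Rule 3: a local event increments the counter by 1.
        self.time += 1
        return self.time

    def send(self):
        # Rule 1: increment, then attach the new value to the message.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Rule 2: jump past both the local and the received value.
        self.time = max(self.time, msg_time) + 1
        return self.time


node1, node2 = LamportClock(), LamportClock()
node1.local_event()              # node1 is at 1
message = node1.send()           # node1 is at 2; the message carries 2
node2.local_event()              # node2 is at 1
received_at = node2.receive(message)   # max(1, 2) + 1
```

To turn these timestamps into a total order, ties between equal counters are typically broken by comparing `(time, node_id)` pairs.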
Lamport clock properties:
- If A → B (A happened before B causally): L(A) < L(B) is guaranteed
- If L(A) < L(B): we cannot conclude A → B (A might be concurrent with B)
- Lamport clocks yield a total order once ties are broken by node ID (compare (L, node ID) pairs), but they cannot tell causally related events apart from concurrent ones
Lamport Clock Example:
─────────────────────────────────────────────────────
Node 1: L=1 L=2 L=3 L=4 L=6
send → ─────────────── recv ←─ send →
↑ ↓
Node 2: L=1 L=3 ─────── ───── L=5 recv
recv ← send →
─────────────────────────────────────────────────────
Message from Node 1 (sent at L=2) arrives at Node 2 (currently at L=1): L = max(1, 2) + 1 = 3
Vector clocks (Fidge/Mattern, 1988):
- Each node maintains a vector V of length N, where N is the number of nodes
- Rule: Node i increments V[i] on each local event. On receiving a message carrying vector M, node i sets V[j] = max(V[j], M[j]) for every j, then increments V[i]
- Property: V(A) < V(B) if and only if A causally precedes B (A → B)
- Can detect concurrent events: if neither V(A) ≤ V(B) nor V(B) ≤ V(A), A and B are concurrent
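A small sketch of the vector clock rules and the comparison used to detect concurrency. On receive, the node takes the element-wise maximum of its own vector and the received one, then increments its own entry. Names are illustrative.

```python
class VectorClock:
    def __init__(self, node, n):
        self.node = node          # index of this node
        self.v = [0] * n          # one counter per node

    def tick(self):
        # Local event (or send): increment only our own entry.
        self.v[self.node] += 1
        return list(self.v)

    def receive(self, other):
        # Element-wise max with the received vector, then count the receive.
        self.v = [max(x, y) for x, y in zip(self.v, other)]
        self.v[self.node] += 1
        return list(self.v)


def happened_before(a, b):
    """a -> b: every entry of a is <= the matching entry of b, and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a, b):
    return a != b and not happened_before(a, b) and not happened_before(b, a)


n0, n1, n2 = (VectorClock(i, 3) for i in range(3))
e1 = n0.tick()           # event on node 0
e2 = n1.receive(e1)      # node 1 receives node 0's message
e3 = n2.tick()           # independent event on node 2
```

Here `e1` causally precedes `e2`, while `e3` is concurrent with both.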
Comparison of clock types:
| Clock type | Total order | Causal precision | Size | Physical time |
|---|---|---|---|---|
| Lamport clocks | Yes | No (imprecise) | O(1) | No |
| Vector clocks | No | Yes (precise) | O(N) | No |
| Wall clocks (NTP) | Yes | No (unreliable) | O(1) | Yes (±10ms) |
| Hybrid Logical Clocks (HLC) | Yes | Partial | O(1) | Approximate |
| TrueTime | Yes | Yes | O(1) | Bounded (±7ms) |
Hybrid Logical Clocks (HLC):
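- Combines wall clock with a logical counter: HLC = (physical_time, logical_counter)
- Always monotonically increasing
- Stays within bounded skew of physical wall time
- On send or local event: new_physical = max(HLC.physical, wall_clock). If new_physical == HLC.physical, HLC.logical++; otherwise HLC.logical = 0. Then HLC.physical = new_physical
- On receive: new_physical = max(HLC.physical, msg.physical, wall_clock). If HLC.physical and msg.physical both equal new_physical, HLC.logical = max(HLC.logical, msg.logical) + 1; if only HLC.physical does, HLC.logical++; if only msg.physical does, HLC.logical = msg.logical + 1; otherwise HLC.logical = 0. Then HLC.physical = new_physical
- Used by: CockroachDB (for transaction timestamps), YugabyteDB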
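A minimal HLC sketch. It follows the commonly published form of the algorithm, in which the logical counter resets to 0 whenever the physical component advances and only grows while the physical component is stuck. The class name, the method names, and the injected `wall_clock` callable are assumptions made for illustration.

```python
class HLC:
    def __init__(self, wall_clock):
        self.wall_clock = wall_clock   # callable returning physical time as an int
        self.physical = 0
        self.logical = 0

    def now(self):
        """Timestamp for a local event or an outgoing message."""
        pt = self.wall_clock()
        if pt > self.physical:
            self.physical, self.logical = pt, 0
        else:
            self.logical += 1
        return (self.physical, self.logical)

    def update(self, msg):
        """Merge the (physical, logical) timestamp carried by a received message."""
        msg_physical, msg_logical = msg
        pt = self.wall_clock()
        new_physical = max(self.physical, msg_physical, pt)
        if new_physical == self.physical and new_physical == msg_physical:
            self.logical = max(self.logical, msg_logical) + 1
        elif new_physical == self.physical:
            self.logical += 1
        elif new_physical == msg_physical:
            self.logical = msg_logical + 1
        else:
            self.logical = 0           # the wall clock alone is ahead
        self.physical = new_physical
        return (self.physical, self.logical)


# A node whose wall clock is 5 units behind still produces increasing timestamps.
fast = HLC(lambda: 105)
slow = HLC(lambda: 100)
t1 = fast.now()
t2 = slow.update(t1)
t3 = slow.now()
```

Timestamps compare as ordinary tuples, so `t1 < t2 < t3` holds even though the receiving node's wall clock is behind the sender's.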
Linearizable ID Generators
The problem: Many systems need globally unique, monotonically increasing IDs for ordering events, database rows, or messages. In a distributed system, no single node can be the sole ID authority without becoming a bottleneck and single point of failure.
ID generation strategies compared:
| Strategy | Example | Globally Unique? | Monotonic? | Sortable? | DB-Index Friendly? | Distributed? |
|---|---|---|---|---|---|---|
| Auto-increment | MySQL AUTO_INCREMENT | No (per-node) | Yes (per-node) | Yes | Yes (sequential) | No |
| UUID v4 | 550e8400-e29b-41d4-... | Yes | No | No | No (random insert) | Yes |
| UUID v7 | Time-ordered UUID | Yes | Approximate | Yes | Good | Yes |
| Snowflake ID | Twitter Snowflake | Yes | Approximate | Yes | Yes | Yes (per-node) |
| ULID | 01ARZ3NDEKTSV4RRFFQ69 | Yes | Approximate | Yes | Yes | Yes |
| TSID | 64-bit time+sequence | Yes | Approximate | Yes | Yes | Yes |
| Logical clock ID | Lamport-based | Yes (if unique node IDs) | Causal | Causal | Depends | Yes |
Snowflake ID (Twitter, 2010 — widely adopted):
64-bit structure (the top bit is an unused sign bit, followed by):
┌──────────────────────────────┬─────────────────┬──────────────────┐
│ 41 bits: milliseconds since  │ 10 bits: node   │ 12 bits: seq no. │
│ custom epoch (Nov 4, 2010)   │ ID              │ per millisecond  │
└──────────────────────────────┴─────────────────┴──────────────────┘
- Each node generates up to 4096 IDs/ms without coordination
- IDs from the same node are monotonic within each millisecond
- IDs from different nodes are approximately time-ordered (within clock skew)
- Used by: Twitter, Discord, Instagram, LinkedIn (Snowflake variants)
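A sketch of a Snowflake-style generator using the 41/10/12 layout, with the top bit left at 0 so the ID stays positive in a signed 64-bit integer. The epoch constant is the value commonly cited for Twitter's original implementation and is treated here as an assumption; the class name, the injectable `clock` parameter, and the `decode` helper are illustrative.

```python
import threading
import time

EPOCH_MS = 1288834974657  # assumed custom epoch (early Nov 2010), in Unix ms


class SnowflakeGenerator:
    def __init__(self, node_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= node_id < 1024:
            raise ValueError("node_id must fit in 10 bits")
        self.node_id = node_id
        self.clock = clock
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards; refusing to issue IDs")
            if now == self.last_ms:
                self.seq = (self.seq + 1) & 0xFFF
                if self.seq == 0:
                    # 4096 IDs already issued this millisecond: wait for the next.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.seq = 0
            self.last_ms = now
            return ((now - EPOCH_MS) << 22) | (self.node_id << 12) | self.seq


def decode(snowflake):
    """Split an ID back into (unix_ms, node_id, sequence)."""
    return ((snowflake >> 22) + EPOCH_MS, (snowflake >> 12) & 0x3FF, snowflake & 0xFFF)


# A frozen clock makes the output deterministic for the example.
gen = SnowflakeGenerator(5, clock=lambda: EPOCH_MS + 1000)
first_id = gen.next_id()
second_id = gen.next_id()
```

Refusing to issue IDs when the clock steps backwards is one possible policy; real deployments may instead wait or fall back to a different node ID.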
ULID (Universally Unique Lexicographically Sortable Identifier):
26 characters: 10 chars timestamp (ms) + 16 chars random
01ARZ3NDEK TSV4RRFFQ69G5FAV   (split here for readability; written without the space)
^^^^^^^^^^ ^^^^^^^^^^^^^^^^
48-bit ms  80-bit random
- Lexicographically sortable (unlike UUID v4)
- No coordination required between nodes
- Approximately time-ordered (random within same millisecond)
- Same timestamp uniqueness: uses random bits (collision risk with very high throughput)
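The ULID layout can be sketched as a 128-bit integer (48-bit timestamp in the high bits, 80 random bits below) rendered as 26 Crockford base32 characters. Function names are illustrative, and this sketch does not implement the optional monotonic mode for IDs created within the same millisecond.

```python
import os
import time

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # base32 without I, L, O, U


def encode_ulid(timestamp_ms, randomness):
    """Encode a 48-bit timestamp plus 10 random bytes as a 26-char string."""
    if not 0 <= timestamp_ms < (1 << 48) or len(randomness) != 10:
        raise ValueError("need a 48-bit timestamp and exactly 80 random bits")
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):               # 26 chars x 5 bits covers all 128 bits
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def new_ulid():
    return encode_ulid(int(time.time() * 1000), os.urandom(10))


earlier = encode_ulid(1000, b"\xff" * 10)   # maximum random part
later = encode_ulid(1001, bytes(10))        # minimum random part, 1 ms later
```

Because the timestamp occupies the leading characters, `earlier < later` holds under plain string comparison regardless of the random bits.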
TSID (Time-Sorted ID):
64-bit structure (similar to Snowflake but without fixed node ID):
┌──────────────────────────────┬──────────────────────────────────┐
│ 42 bits: timestamp │ 22 bits: random or node+sequence │
└──────────────────────────────┴──────────────────────────────────┘
- Smaller than ULID (64-bit int vs 128-bit string)
- Fits in PostgreSQL bigint or MySQL bigint
- Implemented by libraries such as Fabio Lima's tsid-creator and Vlad Mihalcea's Hypersistence TSID; adopted in some cloud services
When to use each:
Low-throughput, single DB: AUTO_INCREMENT (simplest)
Distributed, any throughput: UUID v7 or ULID (good default)
High-throughput, single DC: Snowflake ID (max performance)
Need causal ordering: Logical clock IDs (Lamport-based)
Need physical time + order: HLC-based IDs (CockroachDB approach)
Consensus
The Many Faces of Consensus
Consensus definition: A group of nodes must agree on a single value. Once they decide, the decision is final.
Formal properties:
- Uniform agreement: No two nodes decide differently
- Validity: The decided value was proposed by some node (no value appears from nowhere)
- Integrity: Each node decides at most once (no changing your mind)
- Termination (liveness): All non-faulty nodes eventually decide
Problems that reduce to consensus (all equivalent in power):
- Leader election: Agree on one node as leader
- Total order broadcast (atomic broadcast): All nodes deliver messages in the same order
- Atomic commit (2PC): All nodes agree whether to commit or abort a transaction
- Distributed locks: Agree on which node holds the lock
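Why “equivalent”: If you can solve consensus, you can implement total order broadcast (each message = one consensus round). Conversely, if you can implement total order broadcast, you can solve consensus: every node broadcasts its proposal and all nodes decide on the first proposal delivered. Total order broadcast also gives you a replicated state machine (replay the same messages in the same order on each replica).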
Total Order Broadcast (TOB):
- All nodes receive all messages in the same order
- No message is skipped (reliable delivery)
- Used by: ZooKeeper (Zab), Kafka (Raft in KRaft mode), etcd (Raft)
- Equivalence to consensus: TOB = consensus where each message position is one consensus decision
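A toy illustration of why an agreed message order is enough to build a lock: each replica is a deterministic state machine that applies the same log in the same order, so all replicas reach the same conclusion about who holds the lock. The in-memory `TotalOrderLog` is only a stand-in for a real total order broadcast; all names are hypothetical.

```python
class TotalOrderLog:
    """Stand-in for total order broadcast: one agreed sequence of messages."""

    def __init__(self):
        self.entries = []

    def broadcast(self, msg):
        self.entries.append(msg)


class LockReplica:
    """Deterministic state machine that applies log entries strictly in order."""

    def __init__(self):
        self.holder = None
        self.applied = 0   # number of log entries applied so far

    def catch_up(self, log):
        for op, node in log.entries[self.applied:]:
            if op == "acquire" and self.holder is None:
                self.holder = node        # first acquire in log order wins
            elif op == "release" and self.holder == node:
                self.holder = None
            self.applied += 1


log = TotalOrderLog()
log.broadcast(("acquire", "B"))
log.broadcast(("acquire", "A"))    # loses: B is already the holder

r1, r2 = LockReplica(), LockReplica()
r1.catch_up(log)
holder_midway = r1.holder

log.broadcast(("release", "B"))
log.broadcast(("acquire", "A"))
r1.catch_up(log)
r2.catch_up(log)                   # r2 replays everything from the start
```

Replica `r2` lags behind and replays the whole log later, yet ends in the same state as `r1`.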
Consensus in Practice
Why Paxos matters historically:
- Paxos (Lamport, 1998, based on a 1989 tech report): Among the first consensus algorithms with a correctness proof
- Notoriously difficult to understand and implement correctly
- Single-decree Paxos: agree on one value (one round)
- Multi-Paxos: extend to a log of values (add a leader optimization for multiple rounds)
- Industrial implementations: Google Chubby (Paxos); Apache ZooKeeper uses Zab, a conceptually similar protocol
Raft (Ongaro and Ousterhout, 2014):
- Designed explicitly for understandability: the paper is literally titled “In Search of an Understandable Consensus Algorithm”
- Leader-based: all writes go through the leader
- Three sub-problems: leader election, log replication, safety
- More commonly implemented than Paxos in new systems (2026)
Raft leader election:
States: Follower → Candidate → Leader
Follower: Receives heartbeats from leader.
If no heartbeat for election_timeout: → Candidate
Candidate: Increments term. Votes for self. Sends RequestVote RPC.
If majority responds "yes": → Leader
If it hears from a valid leader (term ≥ its own): → Follower
If split vote (timeout): → Candidate again (new election)
Leader: Sends heartbeats (AppendEntries with no entries) to all followers.
If follower's term > leader's term: → Follower (step down)
Election safety: Only ONE leader can be elected per term.
Each node votes at most once per term and a candidate needs a majority of votes, so two candidates can't both win the same term.
Raft log replication:
Client → Leader → AppendEntries RPC → Followers
← Acknowledge (majority)
Leader marks entry COMMITTED
Leader applies to state machine; sends response to client
Followers apply on next AppendEntries (via commit index)
COMMITTED = written to majority of nodes. Even if leader crashes now,
the committed entry WILL survive (majority still has it).
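The "written to a majority" rule can be expressed as a small calculation: given the highest log index known to be stored on each follower, the commit index is the highest index present on a majority, counting the leader itself. The function name is hypothetical, and the sketch omits Raft's additional requirement that the entry at that index belong to the leader's current term.

```python
def commit_index(follower_match_indexes, leader_last_index):
    """Highest log index that is stored on a majority of the cluster."""
    indexes = sorted(follower_match_indexes + [leader_last_index], reverse=True)
    majority = len(indexes) // 2 + 1
    # The majority-th largest value is held by at least `majority` nodes.
    return indexes[majority - 1]


five_node = commit_index([7, 5, 3, 2], leader_last_index=7)
three_node = commit_index([1, 0], leader_last_index=4)
```

In the five-node case, index 5 is on three nodes (7, 7, 5), so entries up to 5 are committed even though two followers lag.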
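Raft safety guarantee: A leader always has all committed entries. Each voter refuses to vote for a candidate whose log is less up-to-date than its own, so a winning candidate's log is at least as up-to-date as the logs of a majority of nodes, and every committed entry lives on at least one node in any majority.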
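A sketch of the voter side of an election, combining the two checks described above: at most one vote per term, and no vote for a candidate whose log is behind the voter's own. "Up-to-date" is compared by last entry term first, then by log length. The `Node` class and its field names are illustrative and omit persistence and RPC plumbing.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    current_term: int = 0
    voted_for: object = None
    log: list = field(default_factory=list)   # term of each log entry, in order

    def last_log(self):
        # (term of last entry, index of last entry); (0, 0) for an empty log
        return (self.log[-1], len(self.log)) if self.log else (0, 0)

    def handle_request_vote(self, term, candidate_id, last_log_term, last_log_index):
        if term < self.current_term:
            return False                       # stale candidate
        if term > self.current_term:
            # Newer term: adopt it and forget any vote cast in an older term.
            self.current_term, self.voted_for = term, None
        log_ok = (last_log_term, last_log_index) >= self.last_log()
        if log_ok and self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False


voter = Node(current_term=2, log=[1, 1, 2])
behind = voter.handle_request_vote(3, "A", last_log_term=1, last_log_index=5)
granted = voter.handle_request_vote(3, "B", last_log_term=2, last_log_index=3)
second = voter.handle_request_vote(3, "C", last_log_term=2, last_log_index=4)
```

Candidate A is rejected despite having a longer log, because its last entry comes from an older term. Candidate C is rejected because the voter already voted for B in term 3.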
Paxos vs Raft comparison:
| Property | Paxos (Multi-Paxos) | Raft |
|---|---|---|
| Understandability | Very hard | Designed to be easy |
| Leader | Optional optimization | Required |
| Log structure | Multiple "slots" | Single append-only log |
| Membership changes | Requires separate mechanism | Joint consensus or single-server changes |
| Real implementations | Few (Chubby, Spanner) | Many (etcd, CockroachDB, Kafka KRaft, TiKV; MongoDB's protocol is Raft-like) |
| Election safety | Equivalent | Equivalent |
| Throughput | Roughly equivalent | Roughly equivalent |
| Preferred in 2026 | No (legacy) | Yes (new systems) |
Consensus algorithm limitations:
- Requires a majority of nodes to be healthy and reachable to make progress
- If < majority available: blocks (refuses to process writes) — liveness sacrificed for safety
- Synchronous replication: Every write waits for majority acknowledgment → latency is set by the slowest node in the fastest majority (the follower whose acknowledgment completes the quorum)
- Cross-datacenter latency: ~100ms round-trip → 100ms minimum write latency for cross-DC consensus
- Single-datacenter: ~1ms write latency (excellent)
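The quorum arithmetic behind these limitations is small enough to write down. The helpers below are hypothetical and assume the leader counts toward the majority and that follower round-trip times are the only source of latency.

```python
def majority(n):
    return n // 2 + 1


def tolerated_failures(n):
    # A cluster of n nodes keeps a majority with up to this many nodes down.
    return (n - 1) // 2


def write_latency_ms(follower_rtts_ms):
    """Latency until a majority (leader included) has acknowledged the entry."""
    n = len(follower_rtts_ms) + 1
    needed_followers = majority(n) - 1
    if needed_followers == 0:
        return 0
    # The k-th fastest follower completes the quorum.
    return sorted(follower_rtts_ms)[needed_followers - 1]


same_dc_quorum = write_latency_ms([1, 2, 100, 100])   # two nearby followers suffice
cross_dc_quorum = write_latency_ms([100, 100])        # every quorum crosses datacenters
```

Going from 3 to 4 nodes does not raise the number of tolerated failures, which is why odd cluster sizes are the usual choice.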
Coordination Services
ZooKeeper (Apache) and etcd (CNCF):
- Distributed coordination services built on consensus (Zab and Raft respectively)
- Provide: leader election, distributed locks, distributed configuration, service discovery, barrier synchronization
ZooKeeper core primitives:
- znodes: Files in a hierarchical tree; persistent or ephemeral
- Ephemeral znodes: Automatically deleted when creating client session ends → used for locks and leader registration
- Sequential znodes: ZooKeeper appends a monotonically increasing number to the name → natural fencing token source
- Watches: Clients register to be notified when a znode changes → reactive leader election without polling
Leader election with ZooKeeper:
1. All candidates create sequential ephemeral znode: /election/candidate-00000001
2. Each node reads all candidates; sort by sequence number
3. Node with LOWEST sequence number = current leader
4. Other nodes watch the node with the NEXT lower sequence number
5. When a node's watched-node is deleted:
- Check if it's now the lowest → become leader
- Or watch the next-lower node → wait for turn
6. If leader crashes: its ephemeral znode is deleted;
the next-lowest node's watch fires → it becomes leader
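The recipe above can be simulated in memory to check the logic. This is not ZooKeeper client code; `ElectionDir` and its methods are hypothetical stand-ins for a parent znode, sequential ephemeral children, and session expiry.

```python
class ElectionDir:
    """In-memory stand-in for a parent znode such as /election."""

    def __init__(self):
        self.next_seq = 1
        self.znodes = {}              # sequence number -> owning client

    def create_sequential_ephemeral(self, owner):
        seq = self.next_seq
        self.next_seq += 1
        self.znodes[seq] = owner
        return seq

    def session_expired(self, owner):
        # Ephemeral znodes disappear together with their owner's session.
        self.znodes = {s: o for s, o in self.znodes.items() if o != owner}

    def leader(self):
        return self.znodes[min(self.znodes)] if self.znodes else None

    def watch_target(self, my_seq):
        """Sequence number to watch: the immediate predecessor, if any."""
        lower = [s for s in self.znodes if s < my_seq]
        return max(lower) if lower else None   # None: this node is the leader


d = ElectionDir()
a = d.create_sequential_ephemeral("A")
b = d.create_sequential_ephemeral("B")
c = d.create_sequential_ephemeral("C")
initial_leader = d.leader()
c_watches_first = d.watch_target(c)

d.session_expired("B")                # C's watched node vanishes
c_watches_next = d.watch_target(c)    # C is not lowest, so it re-watches

d.session_expired("A")                # the leader crashes
final_leader = d.leader()
```

Watching only the immediate predecessor means a single deletion wakes a single client, which avoids a thundering herd when the leader fails.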
etcd usage patterns (2026 standard):
- Kubernetes: etcd stores all cluster state (pods, services, configmaps, secrets)
- Leader election: Election.Campaign() / Election.Observe() in the Go client's concurrency package
- Distributed lock: Mutex.Lock() / Mutex.Unlock() in the same package (built on a lease with a TTL plus a compare-and-set transaction)
- Watch for config changes: Watch() on a key prefix
Why not implement consensus yourself:
- Paxos and Raft are extremely complex with many subtle edge cases
- Membership changes (adding/removing nodes) are particularly tricky
- Leader election during network partitions requires careful handling
- Use etcd or ZooKeeper: battle-tested, widely audited implementations that have been through Jepsen testing
Comparison Tables
Consistency Models (Weakest to Strongest)
| Model | Guarantee | Concurrent writes | Read from stale replica? | Coordination required? |
|---|---|---|---|---|
| Eventual consistency | All replicas converge eventually | Last-write-wins or merge | Yes | No |
| Monotonic reads | Once you have read a value, later reads never return older state | Possible concurrent | No (per-client) | No |
| Read-your-writes | You see your own writes | Possible concurrent | No (per-client) | Per-session |
| Consistent prefix | See causally related ops in order | Possible concurrent | No | No |
| Causal consistency | Causally related ops in order for all | Concurrent events may reorder | No | Vector clocks |
| Linearizability | All ops appear instantaneously | Atomic CAS | Never | Yes (quorum) |
| Strict serializability | Linearizable + serializable | Serialized | Never | Yes (2PL/SSI) |
Consensus Algorithm Comparison
| Property | Paxos | Raft | Zab (ZooKeeper) | PBFT | Tendermint |
|---|---|---|---|---|---|
| Fault model | Crash-recovery | Crash-recovery | Crash-recovery | Byzantine | Byzantine |
| Leader required | Optional (Multi-Paxos) | Yes | Yes | Yes (primary) | Yes |
| Max faulty nodes | f < n/2 | f < n/2 | f < n/2 | f < n/3 | f < n/3 |
| Latency (normal) | 2 round trips | 2 round trips | 2 round trips | 3 round trips | 2+ round trips |
| Understandability | Very hard | Moderate | Hard | Very hard | Hard |
| Use in 2026 | Legacy | Standard | ZooKeeper only | Blockchain | Blockchain |
ID Generation Strategies
| Strategy | Bits | Unique? | Sortable? | Coordination? | Throughput | Best for |
|---|---|---|---|---|---|---|
| Auto-increment | 32/64 | Per-node | Yes | Single node | Very high | Single-node DB |
| UUID v4 | 128 | Yes | No | None | Unlimited | Any distributed |
| UUID v7 | 128 | Yes | Yes (time) | None | Unlimited | Distributed, sortable |
| Snowflake | 64 | Yes | Approx. | Per-node | 4096/ms/node | High-throughput |
| ULID | 128 | Yes | Yes (time) | None | Unlimited | Human-readable |
| TSID | 64 | Yes | Yes (time) | None | Unlimited | SQL (fits bigint) |
| Lamport ID | variable | Yes (+ node ID) | Causal | None | Unlimited | Event sourcing |
Important Points Summary
- Linearizability is the strongest consistency guarantee: all operations appear to execute atomically at a single point in real time. It requires coordination on every operation and is not achievable when the system prefers availability during a partition.
- CAP theorem is a partition-time choice — CP or AP. During normal operation you can have both. “Consistency” in CAP means linearizability, not ACID consistency.
- PACELC extends CAP — even without partitions, strong consistency costs latency. Spanner is PC/EC: pays latency for consistency even without partitions.
- Causal consistency is often sufficient: many user-facing requirements reduce to “if you see B, you must have seen A (which caused B)” — not full linearizability.
- Lamport clocks give a total order that is consistent with causality (with ties broken by node ID); vector clocks track causality precisely, at O(N) space cost. Both are free: no special hardware required.
- Hybrid Logical Clocks (HLC) combine wall time with logical counters: monotonic, causal, and close to physical time. Used by CockroachDB and YugabyteDB.
- Snowflake IDs are a widely used scheme for distributed ID generation: 1 unused sign bit + 41 bits time + 10 bits node + 12 bits sequence = globally unique, approximately time-sorted, fits in a 64-bit integer.
- Raft and Raft-like protocols dominate new distributed systems (2026): etcd, CockroachDB, TiKV, Kafka KRaft, and MongoDB's Raft-inspired replication protocol.
- Consensus requires majority quorum: < majority available → system blocks rather than risk split-brain. This is a deliberate safety-over-liveness trade-off.
- Use etcd/ZooKeeper for coordination rather than implementing Raft yourself. Both are battle-tested, have been through Jepsen analyses, and already handle membership changes.
Modern Context (2026)
Raft is now the universal standard:
- etcd (Kubernetes coordination): Raft-based, ~10,000 writes/sec
- CockroachDB, TiDB, YugabyteDB: Raft per shard/range
- Kafka KRaft: Raft-based controller quorum; production-ready since Kafka 3.3, with ZooKeeper mode removed in Kafka 4.0
- MongoDB: replica set elections use a Raft-inspired protocol (pv1, introduced in 3.2); data still replicates through the oplog
- TiKV (TiDB’s storage layer): Raft per region (similar to CockroachDB’s ranges)
- Virtually all new distributed systems choose Raft over Paxos
Extended Paxos variants (still active in research and specialized use):
- Flexible Paxos: Only leader-election quorums and replication quorums need to intersect, so a larger election quorum permits a smaller replication quorum; used in some geo-distributed systems
- EPaxos (Egalitarian Paxos): Leaderless, parallel consensus for non-conflicting operations; lower latency than leader-based in geo-distributed settings
- CASPaxos: Single-decree Paxos with compare-and-swap semantics; used in distributed state machines
Consensus-as-a-service:
- etcd: Standard Kubernetes control plane; hosted by EKS, GKE, AKS
- Google Cloud Spanner: Fully managed consensus with TrueTime (Paxos internally)
- CockroachDB Serverless: Managed Raft consensus/replication
- TiDB Cloud: Managed Raft + TiFlash columnar engine
ID generation landscape (2026):
- UUID v7 (RFC 9562, 2024): Time-ordered UUIDs are now standardized; expected to replace v4 in new systems
- ULID widespread adoption in event-driven systems and event sourcing (sortable, URL-safe)
- Snowflake variants used at: Discord (custom 2015 epoch, with worker and process ID fields), Instagram (timestamp + logical shard ID + per-shard Postgres sequence), LinkedIn (Li.Li variant)
- TSIDs becoming popular in new microservice stacks (smaller than ULID; fits native SQL integer)
HLC in production databases:
- CockroachDB: Uses HLC for MVCC timestamp assignment and distributed transaction ordering
- YugabyteDB: Similar HLC usage for consistency in geo-distributed transactions
- Spanner: Uses TrueTime, which exposes a bounded clock uncertainty interval and waits it out before commit, rather than attaching a logical counter as HLC does
- This means many production distributed databases no longer use raw Lamport clocks
Questions for Reflection
- A distributed system uses quorum reads and writes (r + w > n). Is this system linearizable? If not, describe a specific scenario where a non-linearizable read occurs.
- A developer argues: “We have three replicas, so we can tolerate one failure and still serve reads from the remaining two.” What does this argument assume about consistency? Under what conditions does it break?
- Explain why total order broadcast is equivalent to consensus. If you had a perfect total order broadcast implementation, how would you use it to implement a distributed lock?
- Why does Raft’s leader election require the candidate’s log to be “at least as up-to-date” as a majority of nodes? What goes wrong if you skip this check?
- Compare Snowflake IDs and ULIDs for a high-throughput event logging system that: (a) needs IDs to fit in a database bigint column, (b) has 1000 nodes generating events. Which would you choose and why?
- A system uses linearizability for all operations but experiences 200ms write latency for cross-datacenter operations. A colleague suggests switching to causal consistency to reduce latency. What specific consistency guarantees would be lost, and which use cases can tolerate this change?
Related Resources
- ch09-trouble-with-distributed-systems — The catalog of failures that make consistency hard
- ch08-transactions — ACID and serializable isolation within a single database
- ch06-replication — How replicas handle leader election and failover
- ch09-consistency-and-consensus — 1st edition equivalent; compare for what’s new
- ch08-trouble-with-distributed-systems — 1st edition Chapter 8
Last Updated: 2026-05-29