Chapter 10: Consistency and Consensus

ddia-2e consistency consensus linearizability raft paxos logical-clocks

Status: Notes complete

Overview

Chapter 10 addresses the most theoretically deep questions in distributed systems: what does “correct behavior” even mean when data is spread across multiple nodes? It progresses from the strongest consistency guarantee (linearizability) through weaker models, then introduces the ID generation and logical clock problem as a bridge to the central topic — consensus. The chapter culminates in practical consensus algorithms (Raft, Paxos), their coordination service implementations (ZooKeeper, etcd), and the equivalence between consensus and total order broadcast.

This is the 2nd edition’s equivalent of 1st edition Chapter 9, with two major structural additions:

ID Generators and Logical Clocks is now a standalone section covering Lamport clocks, vector clocks, Hybrid Logical Clocks, and practical ID generation strategies (Snowflake IDs, ULIDs, TSIDs)
Linearizable ID Generators — a new section connecting logical clock theory to practical distributed ID systems

Core thesis: Many distributed system problems reduce to consensus. If you can solve consensus, you can solve leader election, atomic broadcast, atomic commit, and distributed locking. But consensus is expensive — not every problem requires it, and identifying the minimum consistency level needed is a key design skill.

Key Concepts

Linearizability

What Makes a System Linearizable?

Linearizability (also called atomic consistency or strong consistency): The strongest consistency guarantee for individual objects. An operation appears to execute instantaneously and atomically at some single point in real time between its invocation and its completion. Once any operation completes, all subsequent operations by any client on any node must observe the new state.

Informal definition: Make a distributed system behave “as if” there’s only a single copy of the data, served by a single, infinitely fast CPU.

Formal property (recency guarantee): If a write W completes before a read R begins, R must return the value written by W (or a later write). There is no “going back” to a previous value.

Linearizability illustrated with a timeline:

CLIENT 1: ├── write(x=1) ──┤
CLIENT 2:     ├─────── read(x) ─────────┤
CLIENT 3:                   ├── read(x) ──┤

CLIENT 2's read overlaps the write (it starts before the write completes):
  CLIENT 2 MAY see x=0 or x=1 (the two operations are concurrent)
CLIENT 3's read starts after the write completed:
  CLIENT 3 MUST see x=1 (not x=0)

LINEARIZABLE: Every read that starts after the write completes sees x=1, and once any read has returned x=1, no later read returns x=0.
NON-LINEARIZABLE: CLIENT 3 reads x=0 even though write(x=1) completed before its read started.
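
The timeline can be checked mechanically. The following is a minimal sketch, assuming a single register with initial value 0 and a history small enough for brute force. The names Op and is_linearizable are made up for these notes, and the start/end times in the usage lines are arbitrary numbers chosen to mirror the diagram (CLIENT 2's read overlaps the write, CLIENT 3's read starts after it).

```python
from itertools import permutations
from typing import NamedTuple


class Op(NamedTuple):
    kind: str      # "write" or "read"
    value: int     # value written, or value the read returned
    start: float   # invocation time
    end: float     # completion time


def is_linearizable(history: list[Op], initial: int = 0) -> bool:
    """Brute force: look for one ordering that respects real time and register semantics."""
    for order in permutations(history):
        # Real-time constraint: if b finished before a started, b may not come after a.
        respects_time = all(
            not (b.end < a.start)
            for i, a in enumerate(order)
            for b in order[i + 1:]
        )
        if not respects_time:
            continue
        # Register semantics: every read returns the most recent preceding write.
        current, ok = initial, True
        for op in order:
            if op.kind == "write":
                current = op.value
            elif op.value != current:
                ok = False
                break
        if ok:
            return True
    return False


write = Op("write", 1, start=0, end=10)
concurrent_read_old = Op("read", 0, start=4, end=20)   # CLIENT 2 sees the old value
later_read_new = Op("read", 1, start=12, end=18)       # CLIENT 3 sees the new value
later_read_old = Op("read", 0, start=12, end=18)       # CLIENT 3 sees the old value

print(is_linearizable([write, concurrent_read_old, later_read_new]))  # allowed
print(is_linearizable([write, concurrent_read_old, later_read_old]))  # violation
```

The checker is exponential in the history length, so it only illustrates the definition; real tools use much smarter search.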

Register operations (the building block of linearizability proofs):

read(x) — returns the current value
write(x, val) — sets value; returns OK
compare-and-swap(x, expected, new) — atomic CAS; returns success/fail
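
As a point of reference, here is a single-process sketch of the three register operations. It assumes one Python process, where a lock is enough to make each operation take effect at a single point in time; a distributed register needs consensus or a similar protocol to get the same behavior. The class name LinearizableRegister is hypothetical.

```python
import threading


class LinearizableRegister:
    """In-memory register; the lock gives every operation one atomic effect point."""

    def __init__(self, initial=None):
        self._value = initial
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def write(self, val):
        with self._lock:
            self._value = val
        return "OK"

    def compare_and_swap(self, expected, new) -> bool:
        # The check and the update happen under the same lock, so no other
        # operation can slip in between them.
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


# Usage: claiming a username. Only the first claimant succeeds.
username_owner = LinearizableRegister(initial=None)
print(username_owner.compare_and_swap(None, "alice"))  # first claim wins
print(username_owner.compare_and_swap(None, "bob"))    # second claim is rejected
```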

Linearizability vs serializability:

Serializability: Transaction isolation property. Concurrent transactions appear to execute in some serial order. That order can be in the “past” (transactions reordered even if they ran concurrently).
Linearizability: Recency guarantee for individual operations. The “linearization point” must be within the real-time span of the operation.
Strict serializability = serializability + linearizability = the strongest possible guarantee (used by Spanner, FoundationDB)

                    Linearizable?   Serializable?   Used by
────────────────────────────────────────────────────────────────
Single-threaded     Yes             Yes             Single-node DB
Snapshot Isolation  No              No (SI ≠ Ser.)  PostgreSQL REPEATABLE READ
Serializability     No              Yes             CockroachDB SSI
Linearizability     Yes             No (per-object) etcd, ZooKeeper
Strict Ser.         Yes             Yes             Spanner, FoundationDB
────────────────────────────────────────────────────────────────

Relying on Linearizability

When linearizability is required:

Locking and leader election: All nodes must agree on exactly one holder of the lock at any moment. Without linearizability, two nodes might both believe they hold the lock.
Uniqueness constraints: Enforcing “only one user can register with this username” requires that the check and the reservation are atomic. Non-linearizable storage allows two nodes to see “username not taken” simultaneously and both register it.
Cross-channel timing: A user uploads a photo, then sends a link to a friend via email. The email and the photo go through different systems. If the friend clicks the link before the photo is replicated, they see a 404. Linearizability guarantees that after the upload completes, all subsequent reads return the photo.
Bank account invariants: “Account balance must never go negative” — checking and debiting must appear atomic. Non-linearizable reads allow two concurrent withdrawals to each check the balance independently and both succeed.

Implementing Linearizable Systems

Single-leader replication with synchronous writes: The leader handles all writes. Reads are linearizable only if they are served by the leader (or by a follower known to be fully caught up) and the node serving them really is the current leader; a deposed leader that still believes it is in charge can return stale data.

Multi-leader replication: NOT linearizable — multiple leaders accept writes concurrently; conflicts are possible. (By definition, if two nodes can both accept writes, the “single timeline” property is broken.)

Leaderless replication (Dynamo-style): NOT inherently linearizable. Even with quorum reads/writes (r + w > n), clock skew and concurrent writes can result in stale reads. “Last write wins” with timestamps is not linearizable.

Consensus algorithms (Raft, Paxos): Linearizable — all writes go through a leader elected by consensus; the leader replicates before acknowledging; reads from the leader are linearizable.

RAM on a single machine: Linearizable for single-core CPUs. Multi-core CPUs with cache coherence are only linearizable if you use memory barriers / fences — without them, CPU instruction reordering breaks linearizability even locally.

The Cost of Linearizability

CAP theorem (Brewer, 2000):

C = Consistency (actually means Linearizability in CAP)
A = Availability (every request gets a non-error response)
P = Partition tolerance (system continues operating during network partition)

The real choice: Partition tolerance is not optional — network partitions happen. So the real decision during a partition is: CP or AP?

CP systems (choose consistency): Return an error/timeout during partition. No stale reads. ZooKeeper, etcd, HBase, CockroachDB.
AP systems (choose availability): Return a response (potentially stale) during partition. Cassandra, DynamoDB (eventually consistent mode), Riak.

CAP is often misunderstood:

CAP only applies during a partition — not during normal operation (when you can have both C and A)
CAP’s “consistency” = linearizability — not ACID consistency (an unrelated concept)
CAP doesn’t say anything about latency during normal operation

PACELC theorem (Daniel Abadi, 2012) — more complete than CAP:

During Partition: choose Availability vs Consistency
Else (no partition): choose Latency vs Consistency
Even without partitions, strong consistency requires coordination → latency cost
Spanner is PC/EC: it gives up availability on the minority side of a partition, and it also pays latency for consistency during normal operation

PACELC Placement:
─────────────────────────────────────────────────────
System              P:A/C   E:L/C   Notes
─────────────────────────────────────────────────────
Spanner             PC      EC      TrueTime; global consistency
CockroachDB         PC      EC      Raft-based; serializable
etcd / ZooKeeper    PC      EC      Linearizable; blocks during partition
DynamoDB            PA      EL      Configurable; AP by default
Cassandra           PA      EL      Tunable consistency; AP by default
MySQL async replica PA      EL      Stale reads from replicas OK
─────────────────────────────────────────────────────

ID Generators and Logical Clocks

(New in 2nd Edition) This section bridges the clock discussion from Chapter 9 with the practical problem of generating unique, ordered identifiers in distributed systems.

Logical Clocks

Why logical clocks exist: Physical clocks (wall clocks) are unreliable for ordering events across nodes (as established in Ch9). Logical clocks capture causal ordering — “if A happened before B, A’s timestamp is smaller than B’s” — without requiring synchronized physical time.

Lamport clocks (Leslie Lamport, 1978):

Each node maintains a counter L
Rule 1: Before sending a message, increment L; include L in the message
Rule 2: On receiving a message with clock value m: L = max(L, m) + 1
Rule 3: Local events increment L by 1

Lamport clock properties:

If A → B (A happened before B causally): L(A) < L(B) — guaranteed
If L(A) < L(B): We cannot conclude A → B (A might be concurrent with B)
Lamport clocks provide a total order once ties are broken by node ID (compare (counter, node ID) pairs), but they cannot tell you whether two events are causally related or concurrent

Lamport Clock Example:
─────────────────────────────────────────────────────
Node 1:   L=1     L=2       L=3     L=4       L=6
          local   send m1   local   local     recv m2

Node 2:   L=1     L=3       L=4     L=5
          local   recv m1   local   send m2
─────────────────────────────────────────────────────
m1 carries L=2 to Node 2 (currently at L=1): L = max(1, 2) + 1 = 3
m2 carries L=5 to Node 1 (currently at L=4): L = max(4, 5) + 1 = 6
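
The three rules translate almost directly into code. This is a minimal sketch; the class and method names are chosen for these notes, and the node ID is only used as a tie-breaker so that timestamps from different nodes can be totally ordered.

```python
class LamportClock:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.counter = 0

    def local_event(self) -> int:
        # Rule 3: a local event increments the counter.
        self.counter += 1
        return self.counter

    def send(self) -> int:
        # Rule 1: increment, then attach the counter to the outgoing message.
        self.counter += 1
        return self.counter

    def receive(self, msg_counter: int) -> int:
        # Rule 2: jump past both our own counter and the sender's.
        self.counter = max(self.counter, msg_counter) + 1
        return self.counter

    def timestamp(self) -> tuple[int, int]:
        # (counter, node_id) pairs compare as a total order.
        return (self.counter, self.node_id)


node1, node2 = LamportClock(1), LamportClock(2)
node1.local_event()             # node1 counter: 1
node2.local_event()             # node2 counter: 1
msg = node1.send()              # node1 counter: 2, message carries 2
print(node2.receive(msg))       # max(1, 2) + 1
```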

Vector clocks (Fidge/Mattern, 1988):

Each node maintains a vector V[N] where N is the number of nodes
Rule: Node i increments V[i] on each local event and on each send. On receiving a message that carries vector M, node i sets V[k] = max(V[k], M[k]) for every k, then increments V[i]
Property: V(A) < V(B) if and only if A causally precedes B (A → B)
Can detect concurrent events: if neither V(A) ≤ V(B) nor V(B) ≤ V(A), A and B are concurrent
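
A small sketch of vector clocks and the comparison that detects concurrency. It assumes a fixed, known number of nodes addressed by index; the helper names happened_before and concurrent are made up for these notes.

```python
class VectorClock:
    def __init__(self, node: int, num_nodes: int):
        self.node = node
        self.v = [0] * num_nodes

    def tick(self) -> list[int]:
        # Local event or send: bump only our own entry.
        self.v[self.node] += 1
        return list(self.v)

    def receive(self, other: list[int]) -> list[int]:
        # Element-wise maximum, then count the receive itself as an event.
        self.v = [max(mine, theirs) for mine, theirs in zip(self.v, other)]
        self.v[self.node] += 1
        return list(self.v)


def happened_before(a: list[int], b: list[int]) -> bool:
    # a -> b iff a <= b in every entry and the vectors differ.
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a: list[int], b: list[int]) -> bool:
    return a != b and not happened_before(a, b) and not happened_before(b, a)


n0, n1 = VectorClock(0, 2), VectorClock(1, 2)
a = n0.tick()        # event on node 0
b = n1.tick()        # independent event on node 1
c = n1.receive(a)    # node 1 learns about a
print(concurrent(a, b), happened_before(a, c))
```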

Comparison of clock types:

                    Total Order   Causal Precision   Size    Physical Time
─────────────────────────────────────────────────────────────────────────
Lamport clocks      Yes           No (imprecise)     O(1)    No
Vector clocks       No            Yes (precise)      O(N)    No
Wall clocks (NTP)   Yes           No (unreliable)    O(1)    Yes (±10ms)
Hybrid Logical      Yes           Partial            O(1)    Approximate
Clocks (HLC)
TrueTime            Yes           Yes                O(1)    Bounded (±7ms)
─────────────────────────────────────────────────────────────────────────

Hybrid Logical Clocks (HLC):

Combines wall clock with a logical counter: HLC = (physical_time, logical_counter)
Always monotonically increasing
Stays within bounded skew of physical wall time
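On send or local event: p = max(HLC.physical, wall_clock); if p == HLC.physical then HLC.logical++, else HLC.logical = 0; then HLC.physical = p
On receive: p = max(HLC.physical, msg.physical, wall_clock); HLC.logical = 1 + the largest logical counter among {HLC, msg} whose physical part equals p, or 0 if only wall_clock equals p; then HLC.physical = p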
Used by: CockroachDB (for transaction timestamps), YugabyteDB
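
A sketch of an HLC, following the update rules as they are usually stated in the HLC literature, where the logical counter resets to 0 whenever physical time moves forward. The wall clock is passed in as a function so the behavior can be shown with a fake clock; the class and method names are assumptions for these notes, not any database's API.

```python
from typing import Callable


class HybridLogicalClock:
    def __init__(self, wall_clock: Callable[[], int]):
        self.wall_clock = wall_clock   # returns physical time, e.g. in milliseconds
        self.physical = 0
        self.logical = 0

    def now(self) -> tuple[int, int]:
        """Timestamp a local event or an outgoing message."""
        pt = self.wall_clock()
        if pt > self.physical:
            self.physical, self.logical = pt, 0   # physical time advanced: reset counter
        else:
            self.logical += 1                     # same (or lagging) wall time: bump counter
        return (self.physical, self.logical)

    def update(self, msg: tuple[int, int]) -> tuple[int, int]:
        """Merge the timestamp carried by an incoming message."""
        msg_physical, msg_logical = msg
        new_physical = max(self.physical, msg_physical, self.wall_clock())
        if new_physical == self.physical == msg_physical:
            self.logical = max(self.logical, msg_logical) + 1
        elif new_physical == self.physical:
            self.logical += 1
        elif new_physical == msg_physical:
            self.logical = msg_logical + 1
        else:
            self.logical = 0                      # our wall clock is ahead of both
        self.physical = new_physical
        return (self.physical, self.logical)


fast_time, slow_time = [100], [90]                # node B's wall clock lags node A's
a = HybridLogicalClock(lambda: fast_time[0])
b = HybridLogicalClock(lambda: slow_time[0])
sent = a.now()
print(sent, b.update(sent))                       # b's timestamp is still greater than sent
```

Because timestamps are compared as (physical, logical) tuples, the receiver's timestamp always sorts after the sender's even when the receiver's wall clock is behind.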

Linearizable ID Generators

The problem: Many systems need globally unique, monotonically increasing IDs for ordering events, database rows, or messages. In a distributed system, no single node can be the sole ID authority without becoming a bottleneck and single point of failure.

ID generation strategies compared:

Strategy	Example	Globally Unique?	Monotonic?	Sortable?	DB-Index Friendly?	Distributed?
Auto-increment	MySQL `AUTO_INCREMENT`	No (per-node)	Yes (per-node)	Yes	Yes (sequential)	No
UUID v4	`550e8400-e29b-41d4-...`	Yes	No	No	No (random insert)	Yes
UUID v7	Time-ordered UUID	Yes	Approximate	Yes	Good	Yes
Snowflake ID	Twitter Snowflake	Yes	Approximate	Yes	Yes	Yes (per-node)
ULID	`01ARZ3NDEKTSV4RRFFQ69`	Yes	Approximate	Yes	Yes	Yes
TSID	64-bit time+sequence	Yes	Approximate	Yes	Yes	Yes
Logical clock ID	Lamport-based	Yes (if unique node IDs)	Causal	Causal	Depends	Yes

Snowflake ID (Twitter, 2010 — widely adopted):

64-bit structure (the top bit is an unused sign bit, leaving 63 bits):
┌──────────────────────────────┬─────────────────┬──────────────────┐
│ 41 bits: milliseconds since  │ 10 bits: node   │ 12 bits: seq no. │
│ custom epoch (Nov 4, 2010)   │ ID              │ per millisecond  │
└──────────────────────────────┴─────────────────┴──────────────────┘

- Each node generates up to 4096 IDs/ms without coordination
- IDs from the same node are strictly increasing as long as its clock never moves backwards (the sequence number orders IDs within a millisecond)
- IDs from different nodes are approximately time-ordered (within clock skew)
- Used by: Twitter, Discord, Instagram, LinkedIn (Snowflake variants)
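
The bit layout can be sketched as follows. This is an illustrative generator, not Twitter's code: the default epoch value, the injected clock function, and the choice to raise an error when the clock moves backwards are all assumptions made for this example. Any agreed-upon past instant works as the epoch as long as every node uses the same one.

```python
import threading
import time

DEFAULT_EPOCH_MS = 1288834974657   # assumed custom epoch, in Unix milliseconds
NODE_BITS, SEQ_BITS = 10, 12
MAX_SEQ = (1 << SEQ_BITS) - 1      # 4095


def _wall_clock_ms() -> int:
    return int(time.time() * 1000)


class SnowflakeGenerator:
    def __init__(self, node_id: int, clock=_wall_clock_ms, epoch_ms: int = DEFAULT_EPOCH_MS):
        if not 0 <= node_id < (1 << NODE_BITS):
            raise ValueError("node_id must fit in 10 bits")
        self.node_id, self.clock, self.epoch_ms = node_id, clock, epoch_ms
        self.last_ms, self.seq = -1, 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards; refusing to generate IDs")
            if now == self.last_ms:
                self.seq = (self.seq + 1) & MAX_SEQ
                if self.seq == 0:
                    # 4096 IDs already issued this millisecond: wait for the next one.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.seq = 0
            self.last_ms = now
            elapsed = now - self.epoch_ms
            return (elapsed << (NODE_BITS + SEQ_BITS)) | (self.node_id << SEQ_BITS) | self.seq


def decode(snowflake: int, epoch_ms: int = DEFAULT_EPOCH_MS) -> tuple[int, int, int]:
    """Split an ID back into (unix_ms, node_id, sequence)."""
    seq = snowflake & MAX_SEQ
    node = (snowflake >> SEQ_BITS) & ((1 << NODE_BITS) - 1)
    unix_ms = (snowflake >> (NODE_BITS + SEQ_BITS)) + epoch_ms
    return unix_ms, node, seq


gen = SnowflakeGenerator(node_id=5)
first, second = gen.next_id(), gen.next_id()
print(first < second, decode(first)[1])
```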

ULID (Universally Unique Lexicographically Sortable Identifier):

26 characters: 10 chars timestamp (ms) + 16 chars random
  01ARZ3NDEKTSV4RRFFQ69G5FAV
  |--------||--------------|
  48-bit ms   80-bit random
- Lexicographically sortable (unlike UUID v4)
- No coordination required between nodes
- Approximately time-ordered (random within same millisecond)
- Same timestamp uniqueness: uses random bits (collision risk with very high throughput)
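
To make the encoding concrete, here is a sketch that packs a 48-bit millisecond timestamp and 80 random bits into 26 Crockford Base32 characters. The function names are made up for these notes, and the sketch skips the optional same-millisecond monotonic increment that some ULID libraries implement.

```python
import os
import time

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"   # Base32 without I, L, O, U


def encode_ulid(timestamp_ms: int, randomness: bytes) -> str:
    if not 0 <= timestamp_ms < (1 << 48):
        raise ValueError("timestamp must fit in 48 bits")
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes (80 bits)")
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):                 # 26 chars x 5 bits = 130 bits; the top 2 are zero
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))     # most significant character first


def new_ulid() -> str:
    return encode_ulid(int(time.time() * 1000), os.urandom(10))


earlier = encode_ulid(1, b"\xff" * 10)
later = encode_ulid(2, b"\x00" * 10)
print(new_ulid(), earlier < later)
```

Because the alphabet is in ASCII order and the string has a fixed length, plain string comparison sorts ULIDs by timestamp first.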

TSID (Time-Sorted ID):

64-bit structure (similar to Snowflake but without fixed node ID):
┌──────────────────────────────┬──────────────────────────────────┐
│ 42 bits: timestamp           │ 22 bits: random or node+sequence │
└──────────────────────────────┴──────────────────────────────────┘
- Smaller than ULID (64-bit int vs 128-bit string)
- Fits in PostgreSQL bigint or MySQL bigint
- Implemented by libraries such as Fabio Lima's tsid-creator (Java) and Vlad Mihalcea's Hypersistence TSID; adopted in some cloud services

When to use each:

Low-throughput, single DB:   AUTO_INCREMENT (simplest)
Distributed, any throughput: UUID v7 or ULID (good default)
High-throughput, single DC:  Snowflake ID (max performance)
Need causal ordering:        Logical clock IDs (Lamport-based)
Need physical time + order:  HLC-based IDs (CockroachDB approach)

Consensus

The Many Faces of Consensus

Consensus definition: A group of nodes must agree on a single value. Once they decide, the decision is final.

Formal properties:

Uniform agreement: All non-faulty nodes decide on the same value
Validity: The decided value was proposed by some node (no value appears from nowhere)
Integrity: Each node decides at most once (no changing your mind)
Termination (liveness): All non-faulty nodes eventually decide

Problems that reduce to consensus (all equivalent in power):

Leader election: Agree on one node as leader
Total order broadcast (atomic broadcast): All nodes deliver messages in the same order
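Atomic commit: All nodes agree whether to commit or abort a transaction (2PC is the classic protocol, but it blocks if the coordinator fails; a fault-tolerant version needs consensus)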
Distributed locks: Agree on which node holds the lock

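Why “equivalent”: If you can solve consensus, you can implement total order broadcast (each message = one consensus round). Conversely, if you have total order broadcast, you can solve consensus: every node broadcasts its proposal and all nodes decide on the first proposal delivered. Total order broadcast also lets you build a replicated state machine (replay the same messages in the same order on each replica).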

Total Order Broadcast (TOB):

All nodes receive all messages in the same order
No message is skipped (reliable delivery)
Used by: ZooKeeper (Zab), Kafka (Raft in KRaft mode), etcd (Raft)
Equivalence to consensus: TOB = consensus where each message position is one consensus decision
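
Both directions of use can be sketched with an in-memory list standing in for total order broadcast. This assumes the hard part (getting every node to see the same sequence) is already solved by the log; TotalOrderLog, decide, and LockStateMachine are hypothetical names for this illustration.

```python
class TotalOrderLog:
    """Stand-in for total order broadcast: a single agreed-upon sequence of messages."""

    def __init__(self):
        self.entries = []

    def broadcast(self, msg) -> int:
        self.entries.append(msg)
        return len(self.entries) - 1    # position in the total order


def decide(log: TotalOrderLog):
    """Consensus from TOB: everyone decides on the first proposal delivered."""
    return log.entries[0] if log.entries else None


class LockStateMachine:
    """Replica that applies ("acquire" | "release", node) messages in log order."""

    def __init__(self):
        self.holder = None
        self.applied = 0

    def catch_up(self, log: TotalOrderLog):
        for kind, node in log.entries[self.applied:]:
            if kind == "acquire" and self.holder is None:
                self.holder = node      # first acquire in the order wins
            elif kind == "release" and self.holder == node:
                self.holder = None      # only the holder can release
        self.applied = len(log.entries)
        return self.holder


log = TotalOrderLog()
log.broadcast(("acquire", "B"))
log.broadcast(("acquire", "A"))         # loses: B's request is earlier in the order
replica1, replica2 = LockStateMachine(), LockStateMachine()
print(replica1.catch_up(log), replica2.catch_up(log))
```

Every replica applies the same messages in the same order, so all of them reach the same conclusion about who holds the lock without any further coordination.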

Consensus in Practice

Why Paxos matters historically:

Paxos (Lamport, 1998; based on a 1989 tech report): One of the first consensus algorithms with a correctness proof
Notoriously difficult to understand and implement correctly
Single-decree Paxos: agree on one value (one round)
Multi-Paxos: extend to a log of values (add a leader optimization for multiple rounds)
Industrial implementations: Google Chubby and Spanner; Apache ZooKeeper uses Zab, a related but distinct protocol

Raft (Ongaro and Ousterhout, 2014):

Designed explicitly for understandability: the paper is titled “In Search of an Understandable Consensus Algorithm”
Leader-based: all writes go through the leader
Three sub-problems: leader election, log replication, safety
More commonly implemented than Paxos in new systems (2026)

Raft leader election:

States: Follower → Candidate → Leader

Follower: Receives heartbeats from leader. 
  If no heartbeat for election_timeout: → Candidate

Candidate: Increments term. Votes for self. Sends RequestVote RPC.
  If majority responds "yes": → Leader
  If it receives AppendEntries from a leader whose term is at least its own: → Follower
  If split vote (timeout): → Candidate again (new election)

Leader: Sends heartbeats (AppendEntries with no entries) to all followers.
  If any RPC request or response carries a term higher than the leader's: → Follower (step down)

Election safety: Only ONE leader can be elected per term.
  A candidate needs majority votes; two candidates can't both get majority.

Raft log replication:

Client → Leader → AppendEntries RPC → Followers
                                    ← Acknowledge (majority)
         Leader marks entry COMMITTED
         Leader applies to state machine; sends response to client
         Followers apply on next AppendEntries (via commit index)

COMMITTED = written to majority of nodes. Even if leader crashes now,
            the committed entry WILL survive (majority still has it).
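
How a leader finds the commit point can be sketched with a few lines of arithmetic. This assumes the leader tracks, per follower, the highest log index known to be replicated there (called match_index here). The sketch omits one Raft rule: a leader only commits this way for entries from its own current term.

```python
def commit_index(leader_last_index: int, match_index: dict[str, int]) -> int:
    """Highest log index that is stored on a majority of nodes (leader included)."""
    indexes = sorted([leader_last_index, *match_index.values()], reverse=True)
    majority = len(indexes) // 2 + 1
    # The majority-th largest index is present on at least `majority` nodes.
    return indexes[majority - 1]


# Five-node cluster: leader at index 10, followers at 10, 9, 4 and 3.
print(commit_index(10, {"f1": 10, "f2": 9, "f3": 4, "f4": 3}))
```

In the example, index 9 is on three of five nodes (the leader, f1, f2), so entries up to 9 are committed while entry 10 is not yet.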

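Raft safety guarantee: A leader always has all committed entries. Each voter refuses to vote for a candidate whose log is less up-to-date than its own, so a winning candidate's log is at least as up-to-date as the logs of a majority of nodes, and every committed entry is present on at least one node in that majority.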
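
The voting rule behind this guarantee can be sketched as a RequestVote handler. This is a simplified illustration, not a full Raft node: persistence, timers, and networking are left out, and the field names loosely follow the Raft paper. "Up-to-date" compares the term of the last log entry first and the log length second.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RaftNode:
    node_id: str
    current_term: int = 0
    voted_for: Optional[str] = None
    log: list[tuple[int, str]] = field(default_factory=list)   # (term, command) entries

    def last_log(self) -> tuple[int, int]:
        """(term of last entry, index of last entry); (0, 0) for an empty log."""
        return (self.log[-1][0], len(self.log)) if self.log else (0, 0)

    def handle_request_vote(self, term: int, candidate_id: str,
                            last_log_term: int, last_log_index: int) -> bool:
        if term < self.current_term:
            return False                          # stale candidate
        if term > self.current_term:
            self.current_term = term              # newer term: forget any old vote
            self.voted_for = None
        # Tuple comparison: last term decides, index breaks ties.
        candidate_up_to_date = (last_log_term, last_log_index) >= self.last_log()
        if candidate_up_to_date and self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id         # at most one vote per term
            return True
        return False


voter = RaftNode("n2", current_term=2, log=[(1, "a"), (2, "b")])
print(voter.handle_request_vote(3, "behind", last_log_term=1, last_log_index=5))
print(voter.handle_request_vote(3, "n1", last_log_term=2, last_log_index=2))
```

The first candidate has a longer log but an older last term, so it is rejected; skipping this check would let it win and overwrite committed entries.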

Paxos vs Raft comparison:

Property              Paxos (Multi-Paxos)      Raft
──────────────────────────────────────────────────────────────
Understandability     Very hard                Designed to be easy
Leader                Optional optimization    Required
Log structure         Multiple "slots"         Single append-only log
Membership changes    Requires separate        Joint consensus or
                      mechanism                single-server changes
Real implementations  Few (Chubby, Spanner)    Many (etcd, CockroachDB,
                                               Kafka KRaft, TiKV;
                                               MongoDB is Raft-like)
Election safety       Equivalent               Equivalent
Throughput            Roughly equivalent       Roughly equivalent
Preferred in 2026     No (legacy)              Yes (new systems)

Consensus algorithm limitations:

Requires a majority of nodes to be healthy and reachable to make progress
If < majority available: blocks (refuses to process writes) — liveness sacrificed for safety
Synchronous replication: Every write waits for majority acknowledgment → write latency is set by the slowest node in the fastest majority, not by the slowest node overall
Cross-datacenter latency: ~100ms round-trip → 100ms minimum write latency for cross-DC consensus
Single-datacenter: ~1ms write latency (excellent)
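
The latency rule can be worked through numerically. A small sketch, assuming the leader's own disk write is negligible and that the only cost is the round trip to each follower; the function name and the example round-trip times are invented for illustration.

```python
def majority_commit_latency(follower_rtts_ms: list[float]) -> float:
    """Time until the leader has acknowledgments from a majority of the cluster."""
    cluster_size = len(follower_rtts_ms) + 1      # followers plus the leader
    acks_needed = cluster_size // 2               # majority, minus the leader's own copy
    if acks_needed == 0:
        return 0.0
    # The leader can commit once the acks_needed-th fastest follower has replied.
    return sorted(follower_rtts_ms)[acks_needed - 1]


print(majority_commit_latency([1, 100]))          # 3 nodes, one follower nearby
print(majority_commit_latency([100, 110]))        # 3 nodes, both followers remote
print(majority_commit_latency([1, 2, 100, 120]))  # 5 nodes, two followers nearby
```

This is why placing a majority of replicas close together keeps writes fast even when some replicas are in a distant datacenter.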

Coordination Services

ZooKeeper (Apache) and etcd (CNCF):

Distributed coordination services built on consensus (Zab and Raft respectively)
Provide: leader election, distributed locks, distributed configuration, service discovery, barrier synchronization

ZooKeeper core primitives:

znodes: Files in a hierarchical tree; persistent or ephemeral
Ephemeral znodes: Automatically deleted when creating client session ends → used for locks and leader registration
Sequential znodes: ZooKeeper appends a monotonically increasing number to the name → natural fencing token source
Watches: Clients register to be notified when a znode changes → reactive leader election without polling

Leader election with ZooKeeper:

1. All candidates create sequential ephemeral znode: /election/candidate-00000001
2. Each node reads all candidates; sort by sequence number
3. Node with LOWEST sequence number = current leader
4. Other nodes watch the node with the NEXT lower sequence number
5. When a node's watched-node is deleted:
   - Check if it's now the lowest → become leader
   - Or watch the next-lower node → wait for turn
6. If leader crashes: its ephemeral znode is deleted;
   the next-lowest node's watch fires → it becomes leader
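
The watch chain in steps 2 to 6 can be simulated without a ZooKeeper cluster. This sketch only models the bookkeeping (who is leader, who watches whom) over a list of znode names; it makes no ZooKeeper client calls, and election_view is a hypothetical helper name.

```python
from typing import Optional


def election_view(znodes: list[str]) -> dict[str, Optional[str]]:
    """Map each candidate znode to the znode it should watch (None means it is the leader)."""
    ordered = sorted(znodes, key=lambda name: int(name.rsplit("-", 1)[1]))
    return {
        name: (ordered[i - 1] if i > 0 else None)   # watch the next-lower sequence number
        for i, name in enumerate(ordered)
    }


candidates = ["candidate-00000003", "candidate-00000001", "candidate-00000002"]
print(election_view(candidates))

# The leader's session ends, so its ephemeral znode disappears.
candidates.remove("candidate-00000001")
print(election_view(candidates))
```

Each candidate watches only its immediate predecessor, so a leader failure wakes up one node instead of the whole group.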

etcd usage patterns (2026 standard):

Kubernetes: etcd stores all cluster state (pods, services, configmaps, secrets)
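Leader election: concurrency.NewElection() with Campaign() / Observe() in the Go client (package go.etcd.io/etcd/client/v3/concurrency)
Distributed lock: concurrency.NewMutex() with Lock() / Unlock() (backed by a session lease with a TTL plus a transactional compare on key revisions)
Watch for config changes: the client's Watch() call with the WithPrefix() option on a key prefix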

Why not implement consensus yourself:

Paxos and Raft are extremely complex with many subtle edge cases
Membership changes (adding/removing nodes) are particularly tricky
Leader election during network partitions requires careful handling
Use etcd or ZooKeeper: battle-tested, audited implementations that have been through Jepsen analyses

Comparison Tables

Consistency Models (Weakest to Strongest)

Model	Guarantee	Concurrent writes	Read from stale replica?	Coordination required?
Eventual consistency	All replicas converge eventually	Last-write-wins or merge	Yes	No
Monotonic reads	Once you have read a value, you never later see an older one	Possible concurrent	No (per-client)	No
Read-your-writes	You see your own writes	Possible concurrent	No (per-client)	Per-session
Consistent prefix	See causally related ops in order	Possible concurrent	No	No
Causal consistency	Causally related ops in order for all	Concurrent events may reorder	No	Vector clocks
Linearizability	All ops appear instantaneously	Atomic CAS	Never	Yes (quorum)
Strict serializability	Linearizable + serializable	Serialized	Never	Yes (2PL/SSI)

Consensus Algorithm Comparison

Property	Paxos	Raft	Zab (ZooKeeper)	PBFT	Tendermint
Fault model	Crash-recovery	Crash-recovery	Crash-recovery	Byzantine	Byzantine
Leader required	Optional (Multi-Paxos)	Yes	Yes	Yes (primary)	Yes
Max faulty nodes	f < n/2	f < n/2	f < n/2	f < n/3	f < n/3
Latency (normal)	2 round trips	2 round trips	2 round trips	3 round trips	2+ round trips
Understandability	Very hard	Moderate	Hard	Very hard	Hard
Use in 2026	Legacy	Standard	ZooKeeper only	Blockchain	Blockchain
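
The "Max faulty nodes" row is plain integer arithmetic, sketched here for a few cluster sizes. The function names are made up for these notes.

```python
def max_crash_faults(n: int) -> int:
    """Largest f with f < n/2 (Paxos, Raft, Zab)."""
    return (n - 1) // 2


def max_byzantine_faults(n: int) -> int:
    """Largest f with f < n/3 (PBFT, Tendermint)."""
    return (n - 1) // 3


for n in (3, 4, 5, 7):
    print(n, max_crash_faults(n), max_byzantine_faults(n))
```

Note that going from 3 to 4 nodes does not raise crash tolerance, which is why crash-fault clusters usually have an odd number of members.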

ID Generation Strategies

Strategy	Bits	Unique?	Sortable?	Coordination?	Throughput	Best for
Auto-increment	32/64	Per-node	Yes	Single node	Very high	Single-node DB
UUID v4	128	Yes	No	None	Unlimited	Any distributed
UUID v7	128	Yes	Yes (time)	None	Unlimited	Distributed, sortable
Snowflake	64	Yes	Approx.	Per-node	4096/ms/node	High-throughput
ULID	128	Yes	Yes (time)	None	Unlimited	Human-readable
TSID	64	Yes	Yes (time)	None	Unlimited	SQL (fits bigint)
Lamport ID	variable	Yes (+ node ID)	Causal	None	Unlimited	Event sourcing

Important Points Summary

Linearizability is the strongest consistency guarantee: all operations appear to execute atomically at a single point in real time. It requires coordination on every operation and is not achievable when the system prefers availability during a partition.
CAP theorem is a partition-time choice — CP or AP. During normal operation you can have both. “Consistency” in CAP means linearizability, not ACID consistency.
PACELC extends CAP — even without partitions, strong consistency costs latency. Spanner is PC/EC: pays latency for consistency even without partitions.
Causal consistency is often sufficient: many user-facing requirements reduce to “if you see B, you must have seen A (which caused B)” — not full linearizability.
Lamport clocks provide a total order that is consistent with causality (ties broken by node ID); vector clocks provide precise causal tracking (at O(N) space cost). Both are free: no special hardware required.
Hybrid Logical Clocks (HLC) combine wall time with logical counters: monotonic, causal, and close to physical time. Used by CockroachDB and YugabyteDB.
Snowflake IDs are a widely used pattern for distributed ID generation: 1 unused sign bit + 41 bits time + 10 bits node + 12 bits sequence = globally unique, approximately time-sorted, fits in 64-bit integer.
Raft and Raft-like protocols dominate new distributed systems (2026): etcd, CockroachDB, TiKV, Kafka KRaft; MongoDB's replication protocol is Raft-inspired.
Consensus requires majority quorum: < majority available → system blocks rather than risk split-brain. This is a deliberate safety-over-liveness trade-off.
Use etcd/ZooKeeper for coordination rather than implementing Raft yourself. They are battle-tested, have been analyzed by Jepsen, and handle membership changes correctly.

Modern Context (2026)

Raft is now the universal standard:

etcd (Kubernetes coordination): Raft-based, ~10,000 writes/sec
CockroachDB, TiDB, YugabyteDB: Raft per shard/range
Kafka KRaft (Kafka 3.3+): Eliminated ZooKeeper dependency; Raft-based controller quorum
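MongoDB: Raft-inspired leader election (replication protocol pv1, introduced in 3.2 and the only protocol since 4.0); data is still replicated through the oplog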
TiKV (TiDB’s storage layer): Raft per region (similar to CockroachDB’s ranges)
Virtually all new distributed systems choose Raft over Paxos

Extended Paxos variants (still active in research and specialized use):

Flexible Paxos: Only the leader-election (phase 1) quorum and the replication (phase 2) quorum must intersect, so a larger election quorum allows a smaller replication quorum; used in some geo-distributed systems
EPaxos (Egalitarian Paxos): Leaderless, parallel consensus for non-conflicting operations; lower latency than leader-based in geo-distributed settings
CASPaxos: Single-decree Paxos with compare-and-swap semantics; used in distributed state machines

Consensus-as-a-service:

etcd: Standard Kubernetes control plane; hosted by EKS, GKE, AKS
Google Cloud Spanner: Fully managed consensus with TrueTime (Paxos internally)
CockroachDB Serverless: Managed Raft consensus/replication
TiDB Cloud: Managed Raft + TiFlash columnar engine

ID generation landscape (2026):

UUID v7 (RFC 9562, 2024): Time-ordered UUIDs are now standardized; expected to replace v4 in new systems
ULID widespread adoption in event-driven systems and event sourcing (sortable, URL-safe)
Snowflake variants used at: Discord (same layout idea with a 2015 epoch and worker/process ID fields), Instagram (timestamp + logical shard ID + per-shard PostgreSQL sequence), LinkedIn (Li.Li variant)
TSIDs becoming popular in new microservice stacks (smaller than ULID; fits native SQL integer)

HLC in production databases:

CockroachDB: Uses HLC for MVCC timestamp assignment and distributed transaction ordering
YugabyteDB: Similar HLC usage for consistency in geo-distributed transactions
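Spanner: Uses TrueTime, which exposes physical time as an interval with bounded uncertainty; Spanner waits out the uncertainty before committing, so it does not need a logical counter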
This means many production distributed databases no longer use raw Lamport clocks

Questions for Reflection

A distributed system uses quorum reads and writes (r + w > n). Is this system linearizable? If not, describe a specific scenario where a non-linearizable read occurs.
A developer argues: “We have three replicas, so we can tolerate one failure and still serve reads from the remaining two.” What does this argument assume about consistency? Under what conditions does it break?
Explain why total order broadcast is equivalent to consensus. If you had a perfect total order broadcast implementation, how would you use it to implement a distributed lock?
Why does Raft’s leader election require the candidate’s log to be “at least as up-to-date” as a majority of nodes? What goes wrong if you skip this check?
Compare Snowflake IDs and ULIDs for a high-throughput event logging system that: (a) needs IDs to fit in a database bigint column, (b) has 1000 nodes generating events. Which would you choose and why?
A system uses linearizability for all operations but experiences 200ms write latency for cross-datacenter operations. A colleague suggests switching to causal consistency to reduce latency. What specific consistency guarantees would be lost, and which use cases can tolerate this change?

Related Resources

ch09-trouble-with-distributed-systems — The catalog of failures that make consistency hard
ch08-transactions — ACID and serializable isolation within a single database
ch06-replication — How replicas handle leader election and failover
ch09-consistency-and-consensus — 1st edition equivalent; compare for what’s new
ch08-trouble-with-distributed-systems — 1st edition Chapter 8

Last Updated: 2026-05-29

Study Notes by Niladri & AI
