Chapter 9 Cheat Sheet - Consistency and Consensus

One-Line Summaries

Concept	One-Liner
Linearizability	System behaves as if single-copy; reads always return most recent write
Causal consistency	Causally related events seen in order; concurrent events may differ
Eventual consistency	Replicas converge eventually; no ordering guarantees during lag
CAP theorem	During network partition: choose consistency OR availability, not both
PACELC	CAP + latency trade-off even without partitions
Total order broadcast	All nodes see all messages in same order; equivalent to consensus
Consensus	Nodes agree on one value; once decided, cannot change
Raft	Leader-based consensus algorithm; designed for understandability
Paxos	Original consensus algorithm; complex but proven correct
ZooKeeper/etcd	Distributed coordination services built on consensus
2PC	Two-Phase Commit — atomic commit across multiple nodes; coordinator is SPOF

Consistency Model Spectrum

Strongest ──────────────────────────────────────────── Weakest

Linearizable → Strict Serial → Serializable → Snapshot → Causal → Monotonic → Eventual

Linearizable: Reads always see most recent write; global total order
Strict serial: Linearizable + serializable (strongest)
Serializable: Transactions appear serial; correct outcome
Snapshot:     Each transaction sees consistent point-in-time view
Causal:       Causally related events in order; concurrent can differ
Monotonic:    Read your own writes; no time reversal
Eventual:     Replicas converge when writes stop; anything goes during lag

Linearizability vs Serializability

Linearizability:                    Serializability:
─────────────────────────────       ──────────────────────────────────────────
Recency guarantee                   Isolation guarantee for transactions
On single objects                   On multiple-operation transactions
Real-time ordering required         Any serial order acceptable (even past)
Implemented at: storage level       Implemented at: transaction isolation level
Used for: locks, elections          Used for: concurrent database transactions
Cost: coordination on every op      Cost: locking or abort overhead

STRICT SERIALIZABILITY = both properties together (most expensive, most correct)

CAP Theorem Explained

Network partition happens (you can't prevent this):

CP (Consistent + Partition-tolerant):
  During partition: REFUSE requests (return error)
  After partition heals: respond normally
  Examples: ZooKeeper, etcd, HBase
  Use when: data correctness > availability
  
AP (Available + Partition-tolerant):
  During partition: RESPOND with possibly stale data
  After partition heals: reconcile conflicts
  Examples: Cassandra, DynamoDB, CouchDB
  Use when: availability > consistency
  
"You can't opt out of partition tolerance in real distributed systems"
"CAP consistency = linearizability, NOT ACID consistency"

PACELC Diagram

                    ┌─ Partition? ─── Yes ──→ Latency (L) vs Consistency (C)
                    │                                         ↗        ↘
Does partition      │                              Lower latency   Stronger consistency
happen?             │
                    └─ No partition ──→ Availability (A) vs Consistency (C)
                    
Real systems labeled as PA/EL or PC/EC:
PA/EL: Cassandra (available during partition; low latency normally)
PC/EC: Spanner (consistent during partition; consistent normally, high latency)
PA/EC: Amazon DynamoDB (available during partition; consistent by default)

How Raft Works

Three node states: FOLLOWER, CANDIDATE, LEADER

Normal operation:
  All nodes: FOLLOWER (receiving heartbeats from LEADER)
  Client → LEADER (only leader accepts writes)
  LEADER → send log entry to all FOLLOWERS (AppendEntries RPC)
  FOLLOWERS → send acknowledgment
  Once majority ack'd → entry COMMITTED
  LEADER → send commit to FOLLOWERS

Leader election (leader timeout):
  FOLLOWER timeout (no heartbeat) → becomes CANDIDATE
  CANDIDATE → broadcasts RequestVote to all nodes
  Each node votes once (first-come-first-served)
  If CANDIDATE gets majority votes → becomes new LEADER
  LEADER → immediately sends heartbeat to suppress other elections

Log replication:
  LEADER has most up-to-date log (guaranteed by election voting)
  All writes flow through LEADER
  Committed = written to majority of nodes

Total Order Broadcast vs Consensus

Total Order Broadcast:              Consensus (single-decree):
──────────────────────────         ───────────────────────────────
All nodes agree on message ORDER   All nodes agree on ONE VALUE
Delivered reliably (no loss)       Irreversible once decided
Sequence of decisions              Single decision

They are equivalent:
  TOB → Consensus: each message position = one value agreed on
  Consensus → TOB: each consensus round decides next message in order

Why it matters:
  Consensus = leader election = atomic commit = total order broadcast
  Solve one, you solve them all

ZooKeeper / etcd Primitives

ZooKeeper hierarchy:
/
├── /leaders/
│   └── /leaders/cluster1 (ephemeral, sequential)
└── /locks/
    └── /locks/resource_A (ephemeral)

Ephemeral node: deleted when client session ends
  → Use for: locks, leader presence

Sequential node: auto-numbered (resource-0001, resource-0002)
  → Use for: fencing tokens (ever-increasing), queue ordering

Watch: callback when node changes/created/deleted
  → Use for: detecting leader changes, config updates

Lock implementation:
  1. CREATE /locks/my-lock EPHEMERAL
  2. If success → acquired
  3. If fail → WATCH the existing node
  4. When watch fires (node deleted) → retry step 1

Key Trade-offs

Decision	Pro	Con	When to Use
Linearizable	Correct, no stale reads	High latency, CP (blocks on partition)	Locks, unique constraints, elections
Causal	Preserves causality, available	Weaker than linear	Social feeds, collaborative docs
Eventual	Highest availability/perf	No ordering guarantee	Analytics, caches, non-critical
Raft	Understandable, widely implemented	Leader bottleneck	Strongly consistent distributed state
Paxos	Proven correct, variants optimized	Complex, hard to implement	Foundation for Raft/Zab
2PC	Atomic across nodes	Blocking on coordinator failure	Cross-DB atomic commit

Red Flags

❌ Assuming you can have CAP “all three” (you can’t during partitions)
❌ Using linearizability everywhere (very expensive; often overkill)
❌ Implementing Paxos/Raft yourself (use ZooKeeper/etcd instead)
❌ 2PC coordinator with no failover plan (SPOF → blocking)
❌ Confusing CAP “consistency” with ACID “consistency” (different things!)

Green Flags

✅ Identify what consistency model each operation actually needs
✅ Use causal consistency where linearizability isn’t required
✅ Use etcd/ZooKeeper for distributed locks and leader election
✅ Understand which systems are CP vs AP for your use case
✅ Know the PACELC trade-off: even without partitions, strong consistency costs latency

Modern Additions (2026)

Raft dominance:
├─ etcd (Kubernetes), CockroachDB, TiDB, YugabyteDB all use Raft
├─ Kafka KRaft: eliminated ZooKeeper dependency (Kafka 3.3+)
└─ Raft now has decades of production hardening

Flexible Paxos:
├─ Read and write quorums can be different sizes (w + r > n still required)
├─ Optimizes for specific read/write ratios
└─ Used in some geo-distributed systems

Consensus for cloud coordination:
├─ AWS DynamoDB Global Tables: multi-region replication with conflict resolution
├─ Google Cloud Spanner: external consistency globally
└─ CockroachDB Serverless: managed consensus at global scale

Interview Response Templates

When Asked About CAP Theorem

“CAP says that during a network partition, you can’t have both perfect availability and linearizable consistency. But this is specific to partition scenarios — normally you have both. It’s really a choice between CP (refuse requests to stay consistent, like ZooKeeper) and AP (serve possibly stale data to stay available, like Cassandra). In practice the PACELC theorem is more useful: it also captures the latency-consistency trade-off even without partitions.”

When Asked About Consensus

“Consensus is fundamental — it underlies leader election, distributed locks, atomic commit, and total order broadcast. Raft is the standard implementation today: a leader accepts all writes, replicates to followers, and commits once a majority acknowledge. Once you have total order broadcast (all nodes see all messages in the same order), you can build almost any coordination primitive. Use etcd or ZooKeeper rather than implementing Raft yourself.”

Quick Revision Time: 5 minutes
Interview Prep: 15 minutes
Last Updated: 2026-04-13

Study Notes by Niladri & AI

Explorer

ch09-cheatsheet