Chapter 7 Cheat Sheet - Transactions

One-Line Summaries

ConceptOne-Liner
AtomicityAll-or-nothing: transaction commits fully or rolls back entirely
ConsistencyApplication invariants hold (app’s responsibility, not DB’s)
IsolationConcurrent transactions behave as if they ran serially
DurabilityCommitted data persists through crashes
Read CommittedNo dirty reads/writes; most DBs’ default minimum level
Snapshot IsolationEach transaction sees consistent snapshot; MVCC-based
SerializableStrongest isolation; results identical to serial execution
MVCCMulti-Version Concurrency Control — readers don’t block writers
2PLTwo-Phase Locking — writers block readers; slow but strong
SSISerializable Snapshot Isolation — optimistic, detects conflicts at commit
Write skewTwo transactions each read, each write validly, but together violate invariant

Isolation Level Comparison

LevelDirty ReadNon-Repeatable ReadPhantom ReadWrite SkewLost Update
Read Uncommitted❌ Possible❌ Possible❌ Possible❌ Possible❌ Possible
Read Committed✅ Prevented❌ Possible❌ Possible❌ Possible❌ Possible
Snapshot Isolation✅ Prevented✅ Prevented❌ Possible❌ Possible✅ Prevented*
Serializable (2PL)✅ Prevented✅ Prevented✅ Prevented✅ Prevented✅ Prevented
Serializable (SSI)✅ Prevented✅ Prevented✅ Prevented✅ Prevented✅ Prevented

*Snapshot isolation prevents lost updates via automatic detection (PostgreSQL) but not all DBs do this

Concurrency Bugs Explained

Dirty Read:
  Txn A: WRITE x=2 (not yet committed)
  Txn B: READ x → sees 2 (dirty!)
  Txn A: ROLLBACK
  Txn B: used a value that never existed

Dirty Write:
  Txn A: WRITE invoice (uncommitted)
  Txn B: WRITE invoice address (overwrites A's uncommitted write!)
  Txn A: COMMIT (incorrect invoice address)

Non-Repeatable Read (Read Skew):
  Txn A: READ balance=500
  Txn B: UPDATE balance=0, COMMIT
  Txn A: READ balance=0 (different value in same transaction!)

Lost Update:
  Txn A: READ counter=5
  Txn B: READ counter=5
  Txn A: WRITE counter=6 (5+1)
  Txn B: WRITE counter=6 (5+1, should be 7!)
  Txn A's update is LOST

Write Skew:
  Invariant: at least 2 doctors on call
  Txn A (Dr. Alice): READ count=2 → ok to go off-call → DELETE alice
  Txn B (Dr. Bob):   READ count=2 → ok to go off-call → DELETE bob
  COMMIT both → count=0 (invariant violated!)

Phantom Read:
  Txn A: SELECT * WHERE room=101 AND date='2026-04-01' → 0 rows
  Txn B: INSERT booking(room=101, date='2026-04-01'), COMMIT
  Txn A: INSERT booking(room=101, date='2026-04-01') → double booking!

Lost Update Solutions

1. Atomic DB operations (best):
   ❌  read v=5, v=v+1, write v=6
   ✅  UPDATE counter SET value = value + 1 WHERE id = X

2. SELECT FOR UPDATE (explicit lock):
   BEGIN;
   SELECT balance FROM accounts WHERE id=1 FOR UPDATE;  ← acquires lock
   UPDATE accounts SET balance = balance - 100 WHERE id=1;
   COMMIT;  ← releases lock

3. Compare-and-Set (optimistic):
   UPDATE wiki SET content = 'new' WHERE content = 'old';
   ← Fails if content changed since we read it; app retries

4. Automatic detection (PostgreSQL snapshot isolation):
   Database detects lost update pattern → aborts one transaction

Three Approaches to Serializability

1. Actual Serial Execution (VoltDB, Redis)
   ──────────────────────────────────────
   Execute one transaction at a time, single thread
   ✅ Truly serializable, simple
   ❌ Limited by single CPU core throughput
   Requires: all data in RAM, transactions as short stored procedures

2. Two-Phase Locking (2PL)
   ────────────────────────
   Expanding phase: acquire locks (shared for reads, exclusive for writes)
   Shrinking phase: release all locks at commit
   Writers block readers; readers block writers
   ✅ Strong guarantees, prevents all anomalies
   ❌ Much slower (lock contention), deadlocks possible
   ❌ Predicate locks needed for phantoms (expensive)

3. Serializable Snapshot Isolation (SSI)
   ──────────────────────────────────────
   Built on snapshot isolation + conflict detection at commit
   Optimistic: let transactions run freely, detect conflicts at end
   Readers don't block writers; writers don't block readers
   ✅ Excellent performance when contention is low
   ❌ Higher abort rate under high contention
   Used by: PostgreSQL, CockroachDB

MVCC Mechanics

Row versions:
  user_id=1: [created_txn=1, deleted_txn=∞, balance=1000]  ← v1
  user_id=1: [created_txn=5, deleted_txn=∞, balance=900]   ← v2 (after withdrawal)

Transaction T7 reads:
  Sees rows where: created_txn ≤ 7 AND deleted_txn > 7
  → Sees balance=900 (v2, created by txn 5 < 7, not deleted)

Transaction T3 reads (started before T5):
  Sees rows where: created_txn ≤ 3 AND deleted_txn > 3
  → Sees balance=1000 (v1, created by txn 1 < 3, not deleted by txn 3)

Key: Readers see snapshot from their start time — unaffected by concurrent writes
VACUUM/GC: Background process removes versions no longer needed by any active transaction

Two-Phase Commit (Distributed)

COORDINATOR                 PARTICIPANT A         PARTICIPANT B
    │                            │                     │
    │──── "Prepare" ────────────→│                     │
    │──── "Prepare" ─────────────────────────────────→ │
    │                            │ ← "Yes, ready"      │
    │                            │            "Yes, ready" →
    │ (All said yes: commit!)    │                     │
    │──── "Commit" ─────────────→│                     │
    │──── "Commit" ─────────────────────────────────→ │
    │                            │ ← "Done"            │
    │                            │               "Done" →
    │ 
    
If any says No → send Abort to all
If coordinator fails after Prepare but before Commit → participants stuck holding locks (in-doubt)

Key Trade-offs

DecisionProConWhen to Use
Read CommittedGood performanceAllows read skew, write skewMost web apps (default in many DBs)
Snapshot IsolationGood for reads, MVCCAllows write skewAnalytics, long reads
Serializable (2PL)Prevents all anomaliesSlow, deadlocksLow-concurrency, correctness critical
Serializable (SSI)Fast when low contentionAborts under high contentionHigh-concurrency, correctness critical
Serial executionSimplest to reason aboutSingle-thread limitIn-memory, high-QPS short transactions

Red Flags

❌ Assuming “ACID database” means you’re safe from all concurrency bugs
❌ Read-modify-write without atomic operations or locking (lost update risk)
❌ Snapshot isolation for write skew-sensitive operations without extra precautions
❌ Long-running interactive transactions (hold locks, increase deadlock risk)
❌ Two-phase commit without understanding the coordinator failure modes

Green Flags

✅ Use atomic DB operations (UPDATE x = x + 1) instead of read-modify-write
✅ Use SELECT FOR UPDATE for pessimistic locking when needed
✅ Test your isolation level against your specific concurrency scenarios
✅ Use serializable isolation for correctness-critical operations (SSI in PostgreSQL)
✅ Keep transactions short (reduce lock hold time, reduce deadlock risk)

Modern Additions (2026)

Multi-document transactions (2026 standard):
├─ MongoDB 4.0+: ACID multi-document, multi-collection transactions
├─ DynamoDB: TransactWriteItems (up to 25 items)
└─ Cassandra: Lightweight transactions (LWT) + batch writes

Saga pattern (microservices):
├─ Sequence of local transactions with compensating actions
├─ Orchestration (central saga orchestrator) vs Choreography (event-based)
└─ Eventual consistency across microservice boundaries

Globally distributed serializable:
├─ CockroachDB, Spanner, YugabyteDB: SSI at global scale
└─ Most modern distributed DBs default to serializable (2026)

Interview Response Templates

When Asked About ACID

“Atomicity means all-or-nothing — a transaction either commits fully or rolls back. Consistency is really the application’s job — invariants the business requires. Isolation is the interesting one: the degree to which concurrent transactions see each other’s work. Durability means committed data persists. The most nuanced is isolation — most databases default to a level weaker than full serializability for performance reasons.”

When Asked About Isolation Levels

“Read Committed (default in most DBs) prevents dirty reads/writes but allows read skew. Snapshot Isolation (MVCC-based) gives each transaction a consistent view at its start time, preventing read skew, but write skew is still possible. Serializable is the strongest — SSI (used by PostgreSQL) is optimistic and adds minimal overhead at low contention. For most web apps, snapshot isolation is the right balance.”


Quick Revision Time: 5 minutes
Interview Prep: 15 minutes
Last Updated: 2026-04-13