Chapter 7 Cheat Sheet - Transactions
One-Line Summaries
| Concept | One-Liner |
|---|---|
| Atomicity | All-or-nothing: transaction commits fully or rolls back entirely |
| Consistency | Application invariants hold (app’s responsibility, not DB’s) |
| Isolation | Concurrent transactions behave as if they ran serially |
| Durability | Committed data persists through crashes |
| Read Committed | No dirty reads/writes; most DBs’ default minimum level |
| Snapshot Isolation | Each transaction sees consistent snapshot; MVCC-based |
| Serializable | Strongest isolation; results identical to serial execution |
| MVCC | Multi-Version Concurrency Control — readers don’t block writers |
| 2PL | Two-Phase Locking — writers block readers; slow but strong |
| SSI | Serializable Snapshot Isolation — optimistic, detects conflicts at commit |
| Write skew | Two transactions each read, each write validly, but together violate invariant |
Isolation Level Comparison
| Level | Dirty Read | Non-Repeatable Read | Phantom Read | Write Skew | Lost Update |
|---|---|---|---|---|---|
| Read Uncommitted | ❌ Possible | ❌ Possible | ❌ Possible | ❌ Possible | ❌ Possible |
| Read Committed | ✅ Prevented | ❌ Possible | ❌ Possible | ❌ Possible | ❌ Possible |
| Snapshot Isolation | ✅ Prevented | ✅ Prevented | ❌ Possible | ❌ Possible | ✅ Prevented* |
| Serializable (2PL) | ✅ Prevented | ✅ Prevented | ✅ Prevented | ✅ Prevented | ✅ Prevented |
| Serializable (SSI) | ✅ Prevented | ✅ Prevented | ✅ Prevented | ✅ Prevented | ✅ Prevented |
*Snapshot isolation prevents lost updates via automatic detection (PostgreSQL) but not all DBs do this
Concurrency Bugs Explained
Dirty Read:
Txn A: WRITE x=2 (not yet committed)
Txn B: READ x → sees 2 (dirty!)
Txn A: ROLLBACK
Txn B: used a value that never existed
Dirty Write:
Txn A: WRITE invoice (uncommitted)
Txn B: WRITE invoice address (overwrites A's uncommitted write!)
Txn A: COMMIT (incorrect invoice address)
Non-Repeatable Read (Read Skew):
Txn A: READ balance=500
Txn B: UPDATE balance=0, COMMIT
Txn A: READ balance=0 (different value in same transaction!)
Lost Update:
Txn A: READ counter=5
Txn B: READ counter=5
Txn A: WRITE counter=6 (5+1)
Txn B: WRITE counter=6 (5+1, should be 7!)
Txn A's update is LOST
Write Skew:
Invariant: at least 2 doctors on call
Txn A (Dr. Alice): READ count=2 → ok to go off-call → DELETE alice
Txn B (Dr. Bob): READ count=2 → ok to go off-call → DELETE bob
COMMIT both → count=0 (invariant violated!)
Phantom Read:
Txn A: SELECT * WHERE room=101 AND date='2026-04-01' → 0 rows
Txn B: INSERT booking(room=101, date='2026-04-01'), COMMIT
Txn A: INSERT booking(room=101, date='2026-04-01') → double booking!
Lost Update Solutions
1. Atomic DB operations (best):
❌ read v=5, v=v+1, write v=6
✅ UPDATE counter SET value = value + 1 WHERE id = X
2. SELECT FOR UPDATE (explicit lock):
BEGIN;
SELECT balance FROM accounts WHERE id=1 FOR UPDATE; ← acquires lock
UPDATE accounts SET balance = balance - 100 WHERE id=1;
COMMIT; ← releases lock
3. Compare-and-Set (optimistic):
UPDATE wiki SET content = 'new' WHERE content = 'old';
← Fails if content changed since we read it; app retries
4. Automatic detection (PostgreSQL snapshot isolation):
Database detects lost update pattern → aborts one transaction
Three Approaches to Serializability
1. Actual Serial Execution (VoltDB, Redis)
──────────────────────────────────────
Execute one transaction at a time, single thread
✅ Truly serializable, simple
❌ Limited by single CPU core throughput
Requires: all data in RAM, transactions as short stored procedures
2. Two-Phase Locking (2PL)
────────────────────────
Expanding phase: acquire locks (shared for reads, exclusive for writes)
Shrinking phase: release all locks at commit
Writers block readers; readers block writers
✅ Strong guarantees, prevents all anomalies
❌ Much slower (lock contention), deadlocks possible
❌ Predicate locks needed for phantoms (expensive)
3. Serializable Snapshot Isolation (SSI)
──────────────────────────────────────
Built on snapshot isolation + conflict detection at commit
Optimistic: let transactions run freely, detect conflicts at end
Readers don't block writers; writers don't block readers
✅ Excellent performance when contention is low
❌ Higher abort rate under high contention
Used by: PostgreSQL, CockroachDB
MVCC Mechanics
Row versions:
user_id=1: [created_txn=1, deleted_txn=∞, balance=1000] ← v1
user_id=1: [created_txn=5, deleted_txn=∞, balance=900] ← v2 (after withdrawal)
Transaction T7 reads:
Sees rows where: created_txn ≤ 7 AND deleted_txn > 7
→ Sees balance=900 (v2, created by txn 5 < 7, not deleted)
Transaction T3 reads (started before T5):
Sees rows where: created_txn ≤ 3 AND deleted_txn > 3
→ Sees balance=1000 (v1, created by txn 1 < 3, not deleted by txn 3)
Key: Readers see snapshot from their start time — unaffected by concurrent writes
VACUUM/GC: Background process removes versions no longer needed by any active transaction
Two-Phase Commit (Distributed)
COORDINATOR PARTICIPANT A PARTICIPANT B
│ │ │
│──── "Prepare" ────────────→│ │
│──── "Prepare" ─────────────────────────────────→ │
│ │ ← "Yes, ready" │
│ │ "Yes, ready" →
│ (All said yes: commit!) │ │
│──── "Commit" ─────────────→│ │
│──── "Commit" ─────────────────────────────────→ │
│ │ ← "Done" │
│ │ "Done" →
│
If any says No → send Abort to all
If coordinator fails after Prepare but before Commit → participants stuck holding locks (in-doubt)
Key Trade-offs
| Decision | Pro | Con | When to Use |
|---|---|---|---|
| Read Committed | Good performance | Allows read skew, write skew | Most web apps (default in many DBs) |
| Snapshot Isolation | Good for reads, MVCC | Allows write skew | Analytics, long reads |
| Serializable (2PL) | Prevents all anomalies | Slow, deadlocks | Low-concurrency, correctness critical |
| Serializable (SSI) | Fast when low contention | Aborts under high contention | High-concurrency, correctness critical |
| Serial execution | Simplest to reason about | Single-thread limit | In-memory, high-QPS short transactions |
Red Flags
❌ Assuming “ACID database” means you’re safe from all concurrency bugs
❌ Read-modify-write without atomic operations or locking (lost update risk)
❌ Snapshot isolation for write skew-sensitive operations without extra precautions
❌ Long-running interactive transactions (hold locks, increase deadlock risk)
❌ Two-phase commit without understanding the coordinator failure modes
Green Flags
✅ Use atomic DB operations (UPDATE x = x + 1) instead of read-modify-write
✅ Use SELECT FOR UPDATE for pessimistic locking when needed
✅ Test your isolation level against your specific concurrency scenarios
✅ Use serializable isolation for correctness-critical operations (SSI in PostgreSQL)
✅ Keep transactions short (reduce lock hold time, reduce deadlock risk)
Modern Additions (2026)
Multi-document transactions (2026 standard):
├─ MongoDB 4.0+: ACID multi-document, multi-collection transactions
├─ DynamoDB: TransactWriteItems (up to 25 items)
└─ Cassandra: Lightweight transactions (LWT) + batch writes
Saga pattern (microservices):
├─ Sequence of local transactions with compensating actions
├─ Orchestration (central saga orchestrator) vs Choreography (event-based)
└─ Eventual consistency across microservice boundaries
Globally distributed serializable:
├─ CockroachDB, Spanner, YugabyteDB: SSI at global scale
└─ Most modern distributed DBs default to serializable (2026)
Interview Response Templates
When Asked About ACID
“Atomicity means all-or-nothing — a transaction either commits fully or rolls back. Consistency is really the application’s job — invariants the business requires. Isolation is the interesting one: the degree to which concurrent transactions see each other’s work. Durability means committed data persists. The most nuanced is isolation — most databases default to a level weaker than full serializability for performance reasons.”
When Asked About Isolation Levels
“Read Committed (default in most DBs) prevents dirty reads/writes but allows read skew. Snapshot Isolation (MVCC-based) gives each transaction a consistent view at its start time, preventing read skew, but write skew is still possible. Serializable is the strongest — SSI (used by PostgreSQL) is optimistic and adds minimal overhead at low contention. For most web apps, snapshot isolation is the right balance.”
Quick Revision Time: 5 minutes
Interview Prep: 15 minutes
Last Updated: 2026-04-13