Chapter 8 Cheat Sheet — Transactions

ddia-2e transactions acid isolation cheatsheet

Quick Revision Time: 5 minutes | Interview Prep: 15 minutes


One-Line Summaries

ConceptOne-Liner
TransactionGroup of reads/writes that either all commit or all rollback
AtomicityAll-or-nothing; abortability is the key property
IsolationConcurrent transactions behave as if serial; weaker levels = better performance
DurabilityCommitted data survives crashes (WAL + replication)
MVCCMultiple row versions enable snapshot reads without blocking writers
Snapshot IsolationEach transaction sees consistent DB state from its start time
Lost updateConcurrent read-modify-write where one write clobbers the other
Write skewEach txn’s write is valid alone; combined they violate an invariant
2PLPessimistic locks; readers block writers; deadlocks possible
SSIOptimistic; run freely, detect conflicts at commit, abort if stale
2PCDistributed atomic commit; coordinator is SPOF; blocking on failure

ACID Properties Reference

PropertyKey WordWhat It PreventsWho’s ResponsibleImplementation
AtomicityAbortabilityPartial writes on crash or errorDatabaseWAL (write-ahead log); rollback journal
ConsistencyInvariantsApplication logic violationsApplicationDB enforces constraints (FK, UNIQUE, NOT NULL)
IsolationSerializabilityConcurrency anomaliesDatabase + app level selectionLocks (2PL) or MVCC (Snapshot/SSI)
DurabilityPersistenceData loss on crashDatabase + infraWAL flush before commit; multi-node replication

Isolation Levels Quick Reference

WEAKEST ──────────────────────────────────────────────────── STRONGEST

Read        Read        Repeatable   Snapshot     Serializable  Serializable
Uncommitted Committed   Read         Isolation    (2PL)         (SSI)

Dirty read: ❌ OK        ✅ Prevented ✅ Prevented ✅ Prevented  ✅ Prevented  ✅ Prevented
Non-repeat: ❌ OK        ❌ OK        ✅ Prevented ✅ Prevented  ✅ Prevented  ✅ Prevented
Phantom:    ❌ OK        ❌ OK        ❌ OK*       ✅ Prevented  ✅ Prevented  ✅ Prevented
Write skew: ❌ OK        ❌ OK        ❌ OK        ❌ OK         ✅ Prevented  ✅ Prevented
Lost update:❌ OK        ❌ OK        ❌ OK        Detected(PG)  ✅ Prevented  ✅ Prevented

*MySQL Repeatable Read with gap locks prevents some phantoms but not all write skew

Concurrency Anomaly Diagrams

Dirty Read (prevented by Read Committed):
─────────────────────────────────────────────────────────────
Txn A:  ───WRITE(x=200)──────ROLLBACK──────────
Txn B:          ──READ(x)→200────────────────────
                             ↑ Txn B sees uncommitted value that will vanish!

Lost Update (read-modify-write race):
─────────────────────────────────────────────────────────────
Txn A:  ─READ(ctr=5)──────────WRITE(6)──COMMIT──
Txn B:        ─READ(ctr=5)──────WRITE(6)──COMMIT──
                                                   ↑ ctr=6, should be 7!

Read Skew (prevented by Snapshot Isolation):
─────────────────────────────────────────────────────────────
Txn A:  ─READ(alice=500)──────────────READ(bob=600)──
Txn B:            ──WRITE(alice=400)─WRITE(bob=600)─COMMIT
                  ↑ A sees alice=500 from before B, bob=600 from after B
                    Total=1100, not 1000 → inconsistent snapshot

Write Skew (requires Serializability to prevent):
─────────────────────────────────────────────────────────────
Txn A:  ─READ(oncall_count=2)──────DELETE alice──COMMIT──
Txn B:  ─READ(oncall_count=2)──────DELETE bob───COMMIT──
                                                         ↑ oncall_count=0!
Both reads valid; both writes valid alone; together = invariant violated.

Preventing Lost Updates — Decision Tree

Choose the right lost-update prevention:
│
├─ Simple numeric increment/decrement?
│  └─ Atomic operation: UPDATE t SET val = val + 1 WHERE key = X
│
├─ Complex business logic between read and write?
│  └─ SELECT FOR UPDATE: acquires exclusive lock before read
│     BEGIN; SELECT ... FOR UPDATE; [logic]; UPDATE ...; COMMIT;
│
├─ Optimistic: low conflict expected, retry acceptable?
│  └─ Compare-and-set (CAS):
│     UPDATE t SET val=new WHERE key=X AND val=old_val
│     Check: 0 rows affected → retry
│
└─ Using PostgreSQL serializable or repeatable read?
   └─ Automatic detection: DB aborts conflicting txn; app retries

Two-Phase Locking (2PL) vs SSI

2PL (Pessimistic):                    SSI (Optimistic):
──────────────────────────────        ──────────────────────────────
Acquire lock before reading/writing   Run freely with snapshot isolation
Hold all locks until commit           Track read-write dependencies

Reader acquires SHARED lock           Reader: no lock needed (MVCC)
Writer acquires EXCLUSIVE lock        Writer: check if reads are stale at commit

Readers BLOCK writers                 Readers NEVER block writers
Writers BLOCK readers                 Writers NEVER block readers

Deadlocks POSSIBLE                    Deadlocks IMPOSSIBLE

High contention → lock queues         High contention → abort storms
Low contention → moderate overhead    Low contention → near-snapshot performance

Phantoms: index-range/gap locks       Phantoms: dependency tracking

Used by: MySQL InnoDB, SQL Server     Used by: PostgreSQL, CockroachDB, YugabyteDB

Best when: high contention,           Best when: low-medium contention,
           write-heavy                          read-heavy, modern OLTP

MVCC Mechanics

Row versions for account_balance (alice):
──────────────────────────────────────────────────────────────
 Version | Value | created_by_txn | deleted_by_txn
─────────┼───────┼────────────────┼────────────────
    1    |  500  |      txn_42    |    txn_100
    2    |  400  |      txn_100   |    NULL (current)

Txn_99 (started before txn_100 committed):
  → Reads version 1 (created_by=42 ≤ 99; deleted_by=100 > 99) → sees 500

Txn_101 (started after txn_100 committed):
  → Reads version 2 (created_by=100 ≤ 101; deleted_by=NULL) → sees 400

RESULT: Readers see a consistent snapshot; writers don't block readers
COST:   Storage for old versions; VACUUM/garbage collection needed

Two-Phase Commit (2PC)

Normal flow:
──────────────────────────────────────────────────────────────
Coordinator                Node A           Node B           Node C
    │───prepare txn_99──────►│               │                │
    │───prepare txn_99───────│──────────────►│                │
    │───prepare txn_99───────│───────────────│───────────────►│
    │◄──YES, prepared────────│               │                │
    │◄──YES, prepared────────│───────────────│                │
    │◄──YES, prepared────────│───────────────│────────────────│
    │───COMMIT txn_99────────►│               │                │
    │───COMMIT txn_99─────────────────────────►│               │
    │───COMMIT txn_99──────────────────────────────────────────►│

Failure scenario (coordinator crashes after prepare, before commit):
──────────────────────────────────────────────────────────────
Coordinator  ──── prepare → all YES ──── CRASH
                                              ↑
                              Nodes A, B, C: BLOCKED
                              Holding locks; cannot commit or rollback
                              Must wait for coordinator to recover (minutes-hours)
                              = IN-DOUBT TRANSACTIONS

Isolation Level Default by Database

DatabaseDefault Isolation LevelSerializable Algorithm
PostgreSQLRead CommittedSSI (since 9.1)
MySQL InnoDBRepeatable Read2PL + gap locks
OracleRead CommittedSnapshot Isolation
SQL ServerRead Committed2PL (default) / RCSI (optional)
CockroachDBSerializableSSI (distributed)
SpannerSerializableTrueTime + 2PL/Paxos
MongoDBRead Committed (single doc)No cross-doc by default; optional ACID txns
DynamoDBEventual (default)Serializable via TransactWriteItems API

Write Skew Prevention Requirements

Write skew arises when:
  1. Two transactions read the same data
  2. Each makes a write that is valid given what it read
  3. But combined, the writes violate an invariant

CANNOT prevent with:
  - Read Committed
  - Snapshot Isolation / Repeatable Read
  - (The reads are internally consistent; conflict is cross-transaction)

CAN prevent with:
  - Serializable (2PL): predicate/range locks on the read set
  - Serializable (SSI): dependency tracking detects that the read basis is stale

Workaround without full serializability:
  - Explicit SELECT FOR UPDATE on all rows that form the invariant check
    SELECT count(*) FROM on_call FOR UPDATE; -- locks entire on_call result
  - Only works if the rows exist; doesn't prevent phantoms

Saga Pattern (Microservices Alternative to Distributed Transactions)

Problem: Service A, B, C each have their own DB; no shared transaction
         2PC across services is too expensive and tight coupling

Saga pattern:
  Step 1: Service A local txn → success → emit event
  Step 2: Service B local txn → success → emit event  
  Step 3: Service C local txn → FAILURE
          → Compensating txn in Service B (undo step 2)
          → Compensating txn in Service A (undo step 1)

Properties:
  ✅ Atomicity: eventually (via compensation)
  ❌ Isolation: NO — concurrent sagas can interleave and interfere
  ✅ Durability: each local txn is durable
  ✅ Consistency: eventually (if compensations are correct)

Requires:
  - Idempotent operations (retries must not double-apply)
  - Compensating transactions for every step
  - Persistent workflow state (Temporal, AWS Step Functions)

Not suitable for: financial ledgers, inventory management where isolation is critical
Suitable for: order fulfillment, travel booking, multi-service workflows with loose coupling

Key Trade-offs Table

DecisionProsConsUse When
Read CommittedLow overhead, good concurrencyRead skew, write skew, lost updatesRead-heavy, tolerable occasional staleness
Snapshot IsolationGood concurrency, no read skewWrite skew still possibleMost OLTP reads; analytics snapshots
Serializable (SSI)Full correctness, good throughputAborts under high contentionModern OLTP; correctness-critical apps
Serializable (2PL)Predictable under write-heavy loadDeadlocks, lock contention, slowHigh write contention (legacy systems)
Actual serial executionPerfectly serializableSingle-threaded throughput capSmall in-memory datasets, VoltDB-style
Saga patternWorks across services, high availabilityNo isolation; complex compensationMicroservices long-running workflows
2PCTrue distributed atomicityBlocks on coordinator failure; slowOnly when truly needed + coordinator HA

Modern Additions (2026)

PostgreSQL SSI (mature):
├─ SET TRANSACTION ISOLATION LEVEL SERIALIZABLE → uses SSI, not 2PL
├─ Performance ~10-15% overhead vs snapshot isolation for most OLTP
└─ No longer valid excuse to avoid serializable isolation in PostgreSQL

NoSQL transactions:
├─ MongoDB 4.0+: multi-document ACID transactions
├─ DynamoDB: TransactWriteItems (up to 25 items, serializable)
└─ Cassandra: Lightweight Transactions (Paxos-based, single partition)

CockroachDB / YugabyteDB (NewSQL):
├─ Distributed SSI across all shards, globally
├─ No 2PC coordinator (distributed agreement via Raft + HLC)
└─ Serializable at global scale with ~5-10ms cross-region latency

Temporal (durable execution for sagas):
├─ Persists workflow state in database; resumes after crashes
├─ Makes saga compensation logic explicit and testable
└─ Becoming standard for microservices cross-service workflows

Aurora innovations:
├─ Shared log-structured storage: all writers share one log
├─ Transactions within cluster are local (no distributed coord)
└─ Fast multi-writer ACID without 2PC overhead

Interview Response Templates

”What isolation level should I use in PostgreSQL?”

“For most OLTP applications, I’d recommend starting with SERIALIZABLE in PostgreSQL, which uses SSI — not 2PL. The overhead is typically under 15% vs snapshot isolation, and you get full correctness guarantees. You’d downgrade to REPEATABLE READ (snapshot isolation) only if you’ve profiled and confirmed SSI’s abort rate is unacceptable for your specific workload. I’d avoid READ COMMITTED for business-critical operations that span multiple reads — it allows read skew and write skew."

"How would you handle a distributed transaction across microservices?”

“True ACID across microservice boundaries isn’t practical — 2PC creates tight coupling and blocks on coordinator failure. I’d use the saga pattern: decompose the workflow into local transactions per service, each with a compensating action. The orchestrator (Temporal or Step Functions) drives the sequence and handles failures by triggering compensating transactions. The key trade-off is that sagas lack isolation — concurrent sagas can interfere. For operations where this matters (e.g., inventory reservations), I’d add an explicit reservation/hold mechanism to create a soft isolation boundary.”


Last Updated: 2026-05-29