Chapter 8 Flashcards — Transactions
flashcards ddia-2e chapter8 transactions
ACID Fundamentals
What does each letter of ACID stand for, and what is the most nuanced/misunderstood property?
?
Atomicity: All operations in a transaction succeed or none do; the key word is abortability — the ability to rollback a partial transaction. Not about concurrency.
Consistency: Application-defined invariants (e.g., account balances sum to zero). The application’s responsibility, not the database’s. The DB enforces constraints (FK, UNIQUE, NOT NULL) but cannot enforce arbitrary business logic. Kleppmann argues “C” was added to make the acronym work.
Isolation: Concurrent transactions behave as if they ran serially. Serializability is the ideal; weaker levels are used for performance. The most nuanced property — the rest of the chapter explores its degrees.
Durability: Once committed, data persists through crashes and power failures. Implemented via WAL (write-ahead log) + disk flush before commit acknowledgment, and in distributed systems, replication to multiple nodes.
Most misunderstood: Consistency — it is NOT a database property in the usual sense. Write skew, phantom reads, and dirty reads are isolation problems, not consistency problems.
Why is “Consistency” in ACID misleading, and who is actually responsible for it?
?
- The “C” in ACID refers to application-level invariants: rules the application defines (e.g., “account balance never negative”, “total money conserved”, “at most one doctor on call”)
- These are not things the database can automatically enforce — the database has no knowledge of what these invariants mean
- The database enforces structural constraints (NOT NULL, UNIQUE, FOREIGN KEY) but not semantic consistency
- An application can violate “consistency” even with perfectly-working atomicity, isolation, and durability
- Conclusion: C is the application’s responsibility; ACID transactions help the application maintain consistency by providing A, I, and D, but the actual invariants must be correctly encoded in the application logic
Isolation Anomalies
What is a dirty read, and which isolation level prevents it?
?
- Dirty read: Reading data written by a transaction that has NOT yet committed
- If the writing transaction later rolls back, the reading transaction has seen data that “never existed”
- Example: Txn A writes balance=200 (not committed); Txn B reads balance=200; Txn A rolls back → B saw phantom data
- Prevented by: Read Committed isolation (and all stronger levels)
- Implementation: Database remembers the old committed value until the new transaction commits; other transactions read the old value, not the in-progress write
What is read skew (non-repeatable read), and how does snapshot isolation prevent it?
?
- Read skew: Within a single transaction, reading the same row twice yields different values because another transaction committed a write in between
- Bank backup example: Txn A reads alice=500; Txn B transfers $100 (alice=400, bob=600) and commits; Txn A reads bob=600 → sees total=1100, not 1000. An inconsistent view of the world.
- Prevented by: Snapshot Isolation (and stronger levels)
- How snapshot isolation works: Each transaction gets a consistent snapshot of the database as of its start time. All reads within the transaction see the world as it was when the transaction began. Writes by other transactions that commit after the snapshot point are invisible.
- Implementation: MVCC — multiple row versions with transaction IDs; each transaction reads only versions that were committed before its snapshot timestamp
What is a lost update, and what are the four ways to prevent it?
?
- Lost update: Two concurrent transactions read the same value, both modify it, and one write clobbers the other
- Example: Counter = 5. Txn A reads 5, Txn B reads 5, Txn A writes 6, Txn B writes 6 → counter = 6, should be 7
Four prevention strategies:
- Atomic operations:
UPDATE t SET counter = counter + 1— DB handles atomically, no app-level read needed. Best when the operation is expressible as a single DB operation. - Explicit locking (
SELECT FOR UPDATE): Lock the row before reading; other transactions wait. Best for complex application logic between read and write. - Compare-and-set (CAS):
UPDATE t SET val=new WHERE val=old— fails if value changed; app checks affected rows and retries. Best for optimistic low-contention scenarios. - Automatic detection (PostgreSQL): Snapshot isolation with lost-update detection; DB aborts conflicting transaction; app retries. Works transparently without explicit locking.
What is write skew and why doesn’t snapshot isolation prevent it?
?
- Write skew: Two transactions each read the same data, each makes a write valid when considered alone, but their combined effect violates an application invariant
- Doctor on-call example: Invariant = at least 1 doctor on call. Both doctors read on_call_count=2, both decide they can go off call. Both delete themselves from on_call. Result: 0 doctors. Each individual write was valid; together they violate the invariant.
- Why snapshot isolation fails: Each transaction’s reads are consistent with its own snapshot. The conflict is between the read set of one transaction and the write set of another. Snapshot isolation tracks neither; it only detects that the same row was written by two concurrent transactions (lost update), not that one transaction’s read assumption was invalidated by another’s write.
- Prevention requires serializability: 2PL uses predicate/range locks on the read set; SSI tracks read-write dependencies and aborts transactions whose reads were invalidated
What is a phantom read and what mechanism is required to prevent it?
?
- Phantom read: A transaction queries a set of rows matching a predicate (e.g.,
WHERE room=101 AND time=14:00); another transaction inserts a new row matching that predicate; the first transaction re-executing the same query sees the new “phantom” row - Meeting room booking example: Txn A checks for conflicting bookings (finds 0), Txn B checks for conflicting bookings (finds 0), Txn A inserts booking, Txn B inserts booking → double booking. B inserted a phantom that A didn’t see.
- Why snapshot isolation is insufficient: Snapshot isolation prevents reads of existing rows from changing, but cannot prevent insertion of NEW rows that match a query predicate
- Prevention mechanisms:
- Predicate locks (2PL): Lock the predicate itself — any insert/update matching the predicate must wait
- Index-range locks: Lock a range in an index (MySQL gap locks, PostgreSQL range locks) — coarser than predicate locks but practical
- SSI dependency tracking: Detect that an insert by a concurrent transaction would have affected the current transaction’s read
Isolation Level Comparisons
Compare Read Committed vs Snapshot Isolation — what does each prevent and allow?
?
Read Committed:
- Prevents: dirty reads, dirty writes
- Allows: read skew (non-repeatable reads), phantom reads, write skew, lost updates
- How: Rows store old committed value; readers see last committed value; writers hold row-level locks until commit
- Default in: PostgreSQL, Oracle, SQL Server, DB2
Snapshot Isolation:
- Prevents: dirty reads, dirty writes, read skew (non-repeatable reads), phantom reads (for read queries)
- Allows: write skew, lost updates (unless explicitly detected as in PostgreSQL)
- How: MVCC — each transaction reads from a consistent snapshot at start time; multiple row versions maintained
- NOT in SQL standard (named “Repeatable Read” in PostgreSQL, “Snapshot” in SQL Server RCSI)
Key insight: Snapshot isolation is much stronger than Read Committed for read consistency, but still insufficient for applications with invariants that span multiple transactions (write skew vulnerability).
What is the difference between 2PL and SSI as serializable isolation implementations?
?
Two-Phase Locking (2PL) — pessimistic:
- Acquires locks before reading or writing; holds until commit
- Readers block writers; writers block readers
- Deadlocks possible (detection + transaction abort needed)
- High contention → lock queues → low throughput
- Phantoms prevented via index-range/gap locks
- Used by: MySQL InnoDB, SQL Server
Serializable Snapshot Isolation (SSI) — optimistic:
- Transactions run freely under snapshot isolation (MVCC)
- Database tracks read-write dependencies (which reads were based on data since modified)
- At commit time: if stale reads detected → abort transaction; application retries
- Readers never block writers; writers never block readers
- Deadlocks impossible
- High contention → abort storms → low throughput (but for different reason than 2PL)
- Phantoms prevented via dependency tracking
- Used by: PostgreSQL (since 9.1), CockroachDB, YugabyteDB
When to prefer 2PL: Write-heavy workloads with high contention; when predictable throughput is more important than peak throughput
When to prefer SSI: Read-heavy to balanced workloads with moderate contention; modern OLTP; correctness without lock management complexity
Implementation Mechanisms
How does MVCC (Multi-Version Concurrency Control) work to enable snapshot isolation?
?
- Core idea: Instead of overwriting rows in place, create a new version of the row for each update; keep old versions accessible
- Row version metadata: Each version has
created_by_txn_idanddeleted_by_txn_id(or timestamps) - Snapshot read rule: Transaction with snapshot_txn_id reads a row version if:
created_by_txn_id ≤ snapshot_txn_id(version existed at snapshot time)deleted_by_txn_id > snapshot_txn_idORdeleted_by_txn_id = NULL(not yet deleted at snapshot time)
- Concurrent operation: Writers create new versions; readers read old versions. No blocking between readers and writers.
- Garbage collection: Old versions no longer readable by any active transaction are removed by background process (VACUUM in PostgreSQL, background compaction in others)
- Storage cost: Old versions consume space until GC runs; long-running transactions prevent GC → table bloat (famous PostgreSQL VACUUM issue)
What are predicate locks and index-range locks, and why are index-range locks preferred?
?
Predicate locks:
- Lock ALL data matching a given query condition, including data that doesn’t yet exist
- Example:
SELECT * WHERE room=101 AND time=14:00acquires a predicate lock on that condition - Any transaction attempting to insert/update a row matching the predicate must wait
- Theoretically perfect — prevents all phantoms
- Problem: Every read requires checking against all outstanding predicate locks; very expensive with many active transactions
Index-range locks (practical alternative):
- Lock a range in an index rather than the exact predicate
- Example: Lock the index range for
room=101in the room_time index → blocks all inserts for any time in room 101 - Coarser than necessary (may lock more than the exact predicate) but prevents all relevant phantoms
- Checking requires only inspecting the relevant index range → efficient
- Used by: MySQL InnoDB (gap locks), PostgreSQL (range locks in indexes)
Why index-range locks are preferred: They’re much cheaper to check than scanning all active predicate locks; the over-locking (false positives) rarely matters in practice because conflicting operations in the same time+room combination are exactly what you want to prevent
MVCC and Snapshot Isolation
What is actual serial execution and when does it make sense?
?
- Concept: Execute transactions one at a time, on a single thread, in serial order — truly serial, not simulated-serial
- Why it became viable:
- RAM is cheap enough to hold working dataset entirely in memory (in-memory DBs like VoltDB, H-Store)
- Disk I/O is no longer the bottleneck for many workloads; single-thread CPU can handle millions of simple ops/sec
- Eliminating interactive transactions (user waits mid-transaction) removes the milliseconds-to-seconds latency that made single-threading impractical
- Requirements:
- Entire dataset (or hot working set) fits in RAM
- Transactions are short stored procedures (pre-compiled, sent as single network round trip)
- No external calls or user interaction within transactions
- Throughput: Capped at single-core speed; scale by partitioning data across cores/nodes (each shard runs a single thread)
- Used by: VoltDB, H-Store, Redis (single-threaded command processing), Datomic, FoundationDB (coordinators)
Distributed Transactions
How does two-phase commit (2PC) work and what is its critical failure mode?
?
Protocol:
- Phase 1 (Prepare): Coordinator sends
PREPARE txn_Xto all participant nodes. Each participant writes the transaction to its WAL (promise to commit if asked), then respondsYESorNO - Phase 2 (Commit/Abort): If ALL participants responded YES → coordinator sends
COMMIT; if ANY said NO or timed out → sendsABORT. Participants execute accordingly. - Atomicity guarantee: If coordinator says commit, ALL nodes commit. If abort, ALL nodes abort. No partial outcomes.
Critical failure mode — in-doubt transactions:
- Coordinator crashes AFTER sending
PREPARE(all nodes said YES) but BEFORE sendingCOMMIT - All participants have promised to commit; they are now in-doubt — they cannot decide without the coordinator
- Participants hold locks (and block other transactions) until the coordinator recovers
- Recovery time: minutes to hours; operator may need to manually resolve
- This is the fundamental problem with 2PC: it’s a blocking protocol
Performance cost: 2 network round trips minimum (prepare + commit); participants hold locks during entire prepare phase; geo-distributed = latency amplification per round trip
What is the Saga pattern and what ACID property does it sacrifice?
?
- Saga: A long-running business transaction decomposed into a sequence of local transactions, one per service, each with a corresponding compensating transaction
- Example: Book trip = [reserve flight local txn] + [reserve hotel local txn] + [charge card local txn]. If hotel fails, compensate: [cancel flight] + [uncharge card].
- What it sacrifices: Isolation — there is no isolation between concurrent sagas. While one saga has reserved a flight but not yet booked the hotel, a concurrent saga can see and interact with that reserved flight.
- Why isolation is sacrificed: Sagas span multiple services with independent databases; true isolation would require distributed locking or 2PC across services
- Requirements for correct sagas:
- Idempotency: Compensating and forward transactions must be safe to retry
- Commutativity: Order of compensations should not matter
- Persistent state: Workflow state stored durably (Temporal, AWS Step Functions) so crashes don’t lose progress
- Suitable for: Order fulfillment, travel booking, multi-service workflows; NOT for financial ledgers or inventory where isolation is critical
Modern Context
What is the state of serializable isolation in modern databases (2026)?
?
PostgreSQL: SSI available since 9.1 (2011); mature and widely used. SET TRANSACTION ISOLATION LEVEL SERIALIZABLE uses SSI (not 2PL). Performance overhead ~10-15% vs snapshot isolation for typical OLTP. No longer valid to say “serializable is too slow” for PostgreSQL.
MySQL InnoDB: Still uses 2PL + gap locks for serializable; higher overhead than SSI. Default is Repeatable Read (which in MySQL provides stronger phantom prevention via gap locks than the SQL standard requires).
CockroachDB: Distributed SSI across all nodes as the DEFAULT and ONLY isolation level. Globally serializable. Uses Hybrid Logical Clocks (HLC) for timestamp ordering. No 2PC coordinator.
Google Spanner: TrueTime API (GPS + atomic clocks with bounded uncertainty) enables externally consistent (stronger than serializable) transactions globally. 2PC across Paxos groups, but coordinator failure handled via Paxos leader election.
Key message: By 2026, “serializable is too expensive” is outdated — SSI-based serializable isolation is efficient and available in major databases. Applications that sacrificed correctness for performance due to old 2PL implementations should re-evaluate.
How does CockroachDB achieve distributed serializable transactions without a 2PC coordinator bottleneck?
?
- Hybrid Logical Clocks (HLC): Each node maintains a logical timestamp that combines wall-clock time and a logical counter, synchronized across the cluster. Provides a total ordering of transactions without requiring a central coordinator for timestamp assignment.
- Distributed SSI: Every read and write is tagged with an HLC timestamp. At commit time, CockroachDB checks if any concurrent committed transaction modified data that the current transaction read (SSI-style dependency check), distributed across all nodes.
- Raft per range: Each data range (shard) is managed by a Raft consensus group. Writes are committed through Raft (majority quorum). No single coordinator owns all ranges.
- No 2PC coordinator SPOF: Instead of one coordinator that can fail and leave in-doubt transactions, each range’s Raft leader handles its part of the transaction independently. If a node fails, Raft elects a new leader for its ranges.
- Result: Globally serializable transactions without the 2PC blocking failure mode; cross-shard transactions have higher latency (multiple Raft rounds) but are correct and non-blocking.
Concurrency Scenarios
An application checks “SELECT count(*) FROM seats WHERE flight_id=101 AND status=‘available’” and if > 0, does “UPDATE seats SET status=‘sold’ WHERE id=X AND status=‘available’”. Under snapshot isolation, is this safe from double-booking?
?
- Short answer: The CAS-style UPDATE (
WHERE status='available') provides safety IF the seat has a specific row that gets marked ‘sold’. - Analysis:
- The initial SELECT count check can be stale (snapshot isolation), but that’s just advisory
- The UPDATE with
WHERE status='available'is effectively a compare-and-set: it only succeeds if the seat is still available at update time - Row-level lock is acquired on the specific seat row at update time
- If two transactions both try to UPDATE the same seat, only one will succeed (row-level locking); the other will find
status='sold'and update 0 rows - Application should check affected rows: if 0, retry or return “no seats available”
- Safe from double-booking: Yes, for a specific seat. The CAS-pattern is safe here.
- Write skew risk: Would arise if selecting ANY available seat (not a specific one), where both transactions pick different seats based on the count > 0 check, but both seats were the last available — both proceed to book different seats and now 0 remain but oversold by 2.
Two concurrent transactions both check SELECT COUNT(*) FROM doctors WHERE on_call = TRUE (count=2) and then do UPDATE doctors SET on_call = FALSE WHERE id = ?. Under snapshot isolation, why can this lead to 0 on-call doctors?
?
This is the classic write skew example:
- Both Txn A and Txn B start with the same snapshot; both read on_call count = 2
- Both independently decide: “2 doctors on call, the invariant (min 1) allows me to go off call”
- Txn A updates doctor_alice:
on_call = FALSE→ commits - Txn B updates doctor_bob:
on_call = FALSE→ commits - Result: 0 doctors on call; invariant violated
Why snapshot isolation fails:
- Txn A and Txn B each read a different row (alice vs bob)
- Neither transaction wrote to the row the other transaction read
- No dirty read, no read skew, no lost update
- But both reads were based on the SAME aggregate state that both transactions concurrently invalidated
Why serializability prevents it:
- 2PL:
SELECT COUNT ... FOR UPDATEwould acquire a predicate/range lock on the entire on_call set; the second transaction would wait - SSI: Detects that Txn B’s read of the on_call count was invalidated by Txn A’s commit; aborts Txn B; Txn B retries and sees count=1, decides not to go off call
Fix without serializability: SELECT * FROM doctors WHERE on_call = TRUE FOR UPDATE — locks all on-call doctor rows, preventing concurrent updates to any of them
Quick Facts
What are the isolation level defaults for the four most common databases?
?
| Database | Default Isolation | Notes |
|---|---|---|
| PostgreSQL | Read Committed | Serializable = SSI (excellent); upgrade to SERIALIZABLE for correctness-critical ops |
| MySQL InnoDB | Repeatable Read | 2PL + gap locks; stronger phantom prevention than standard RR |
| Oracle | Read Committed | Own implementation of snapshot isolation as “Serializable” (which is actually SI, not 2PL serializable) |
| SQL Server | Read Committed | RCSI (Read Committed Snapshot Isolation) optional for MVCC-style RC |
Practical advice: PostgreSQL → use SERIALIZABLE (SSI, low overhead). MySQL → use SERIALIZABLE only if you need true phantom prevention; be aware it uses 2PL. Oracle “Serializable” is actually snapshot isolation — doesn’t prevent write skew.
What is the difference between “Repeatable Read” in PostgreSQL vs MySQL?
?
PostgreSQL Repeatable Read:
- Implemented as Snapshot Isolation (MVCC-based)
- Prevents: dirty reads, read skew, most phantom reads (for read-only queries)
- Does NOT prevent: write skew
- Does detect and prevent: lost updates (auto-detects via snapshot)
- No locking; readers never block writers
MySQL InnoDB Repeatable Read:
- Implemented with locking + gap locks
- Prevents: dirty reads, read skew
- Prevents phantoms via gap locks (stronger than standard Repeatable Read)
- Does NOT prevent: write skew
- Uses NEXT-KEY locks (row lock + gap lock) on indexed range queries
- Writers can block readers in some cases
Key difference: MySQL’s Repeatable Read uses gap locks (2PL-style) to prevent phantoms within the read set; PostgreSQL’s uses pure MVCC. Both leave write skew unresolved. The same name (“Repeatable Read”) means different things in different databases — this is why Kleppmann says the SQL standard is ambiguous and confusing.
Total Cards: 20
Review Time: ~30 minutes
Priority: HIGH
Last Updated: 2026-05-29