Chapter 8 Flashcards — Transactions

flashcards ddia-2e chapter8 transactions

ACID Fundamentals

What does each letter of ACID stand for, and what is the most nuanced/misunderstood property?
?
Atomicity: All operations in a transaction succeed or none do; the key word is abortability — the ability to rollback a partial transaction. Not about concurrency.

Consistency: Application-defined invariants (e.g., account balances sum to zero). The application’s responsibility, not the database’s. The DB enforces constraints (FK, UNIQUE, NOT NULL) but cannot enforce arbitrary business logic. Kleppmann argues “C” was added to make the acronym work.

Isolation: Concurrent transactions behave as if they ran serially. Serializability is the ideal; weaker levels are used for performance. The most nuanced property — the rest of the chapter explores its degrees.

Durability: Once committed, data persists through crashes and power failures. Implemented via WAL (write-ahead log) + disk flush before commit acknowledgment, and in distributed systems, replication to multiple nodes.

Most misunderstood: Consistency — it is NOT a database property in the usual sense. Write skew, phantom reads, and dirty reads are isolation problems, not consistency problems.

Why is “Consistency” in ACID misleading, and who is actually responsible for it?
?

The “C” in ACID refers to application-level invariants: rules the application defines (e.g., “account balance never negative”, “total money conserved”, “at most one doctor on call”)
These are not things the database can automatically enforce — the database has no knowledge of what these invariants mean
The database enforces structural constraints (NOT NULL, UNIQUE, FOREIGN KEY) but not semantic consistency
An application can violate “consistency” even with perfectly-working atomicity, isolation, and durability
Conclusion: C is the application’s responsibility; ACID transactions help the application maintain consistency by providing A, I, and D, but the actual invariants must be correctly encoded in the application logic

Isolation Anomalies

What is a dirty read, and which isolation level prevents it?
?

Dirty read: Reading data written by a transaction that has NOT yet committed
If the writing transaction later rolls back, the reading transaction has seen data that “never existed”
Example: Txn A writes balance=200 (not committed); Txn B reads balance=200; Txn A rolls back → B saw phantom data
Prevented by: Read Committed isolation (and all stronger levels)
Implementation: Database remembers the old committed value until the new transaction commits; other transactions read the old value, not the in-progress write

What is read skew (non-repeatable read), and how does snapshot isolation prevent it?
?

Read skew: Within a single transaction, reading the same row twice yields different values because another transaction committed a write in between
Bank backup example: Txn A reads alice=500; Txn B transfers $100 (alice=400, bob=600) and commits; Txn A reads bob=600 → sees total=1100, not 1000. An inconsistent view of the world.
Prevented by: Snapshot Isolation (and stronger levels)
How snapshot isolation works: Each transaction gets a consistent snapshot of the database as of its start time. All reads within the transaction see the world as it was when the transaction began. Writes by other transactions that commit after the snapshot point are invisible.
Implementation: MVCC — multiple row versions with transaction IDs; each transaction reads only versions that were committed before its snapshot timestamp

What is a lost update, and what are the four ways to prevent it?
?

Lost update: Two concurrent transactions read the same value, both modify it, and one write clobbers the other
Example: Counter = 5. Txn A reads 5, Txn B reads 5, Txn A writes 6, Txn B writes 6 → counter = 6, should be 7

Four prevention strategies:

Atomic operations: UPDATE t SET counter = counter + 1 — DB handles atomically, no app-level read needed. Best when the operation is expressible as a single DB operation.
Explicit locking (SELECT FOR UPDATE): Lock the row before reading; other transactions wait. Best for complex application logic between read and write.
Compare-and-set (CAS): UPDATE t SET val=new WHERE val=old — fails if value changed; app checks affected rows and retries. Best for optimistic low-contention scenarios.
Automatic detection (PostgreSQL): Snapshot isolation with lost-update detection; DB aborts conflicting transaction; app retries. Works transparently without explicit locking.

What is write skew and why doesn’t snapshot isolation prevent it?
?

Write skew: Two transactions each read the same data, each makes a write valid when considered alone, but their combined effect violates an application invariant
Doctor on-call example: Invariant = at least 1 doctor on call. Both doctors read on_call_count=2, both decide they can go off call. Both delete themselves from on_call. Result: 0 doctors. Each individual write was valid; together they violate the invariant.
Why snapshot isolation fails: Each transaction’s reads are consistent with its own snapshot. The conflict is between the read set of one transaction and the write set of another. Snapshot isolation tracks neither; it only detects that the same row was written by two concurrent transactions (lost update), not that one transaction’s read assumption was invalidated by another’s write.
Prevention requires serializability: 2PL uses predicate/range locks on the read set; SSI tracks read-write dependencies and aborts transactions whose reads were invalidated

What is a phantom read and what mechanism is required to prevent it?
?

Phantom read: A transaction queries a set of rows matching a predicate (e.g., WHERE room=101 AND time=14:00); another transaction inserts a new row matching that predicate; the first transaction re-executing the same query sees the new “phantom” row
Meeting room booking example: Txn A checks for conflicting bookings (finds 0), Txn B checks for conflicting bookings (finds 0), Txn A inserts booking, Txn B inserts booking → double booking. B inserted a phantom that A didn’t see.
Why snapshot isolation is insufficient: Snapshot isolation prevents reads of existing rows from changing, but cannot prevent insertion of NEW rows that match a query predicate
Prevention mechanisms:
- Predicate locks (2PL): Lock the predicate itself — any insert/update matching the predicate must wait
- Index-range locks: Lock a range in an index (MySQL gap locks, PostgreSQL range locks) — coarser than predicate locks but practical
- SSI dependency tracking: Detect that an insert by a concurrent transaction would have affected the current transaction’s read

Isolation Level Comparisons

Compare Read Committed vs Snapshot Isolation — what does each prevent and allow?
?
Read Committed:

Prevents: dirty reads, dirty writes
Allows: read skew (non-repeatable reads), phantom reads, write skew, lost updates
How: Rows store old committed value; readers see last committed value; writers hold row-level locks until commit
Default in: PostgreSQL, Oracle, SQL Server, DB2

Snapshot Isolation:

Prevents: dirty reads, dirty writes, read skew (non-repeatable reads), phantom reads (for read queries)
Allows: write skew, lost updates (unless explicitly detected as in PostgreSQL)
How: MVCC — each transaction reads from a consistent snapshot at start time; multiple row versions maintained
NOT in SQL standard (named “Repeatable Read” in PostgreSQL, “Snapshot” in SQL Server RCSI)

Key insight: Snapshot isolation is much stronger than Read Committed for read consistency, but still insufficient for applications with invariants that span multiple transactions (write skew vulnerability).

What is the difference between 2PL and SSI as serializable isolation implementations?
?
Two-Phase Locking (2PL) — pessimistic:

Acquires locks before reading or writing; holds until commit
Readers block writers; writers block readers
Deadlocks possible (detection + transaction abort needed)
High contention → lock queues → low throughput
Phantoms prevented via index-range/gap locks
Used by: MySQL InnoDB, SQL Server

Serializable Snapshot Isolation (SSI) — optimistic:

Transactions run freely under snapshot isolation (MVCC)
Database tracks read-write dependencies (which reads were based on data since modified)
At commit time: if stale reads detected → abort transaction; application retries
Readers never block writers; writers never block readers
Deadlocks impossible
High contention → abort storms → low throughput (but for different reason than 2PL)
Phantoms prevented via dependency tracking
Used by: PostgreSQL (since 9.1), CockroachDB, YugabyteDB

When to prefer 2PL: Write-heavy workloads with high contention; when predictable throughput is more important than peak throughput
When to prefer SSI: Read-heavy to balanced workloads with moderate contention; modern OLTP; correctness without lock management complexity

Implementation Mechanisms

How does MVCC (Multi-Version Concurrency Control) work to enable snapshot isolation?
?

Core idea: Instead of overwriting rows in place, create a new version of the row for each update; keep old versions accessible
Row version metadata: Each version has created_by_txn_id and deleted_by_txn_id (or timestamps)
Snapshot read rule: Transaction with snapshot_txn_id reads a row version if:
- created_by_txn_id ≤ snapshot_txn_id (version existed at snapshot time)
- deleted_by_txn_id > snapshot_txn_id OR deleted_by_txn_id = NULL (not yet deleted at snapshot time)
Concurrent operation: Writers create new versions; readers read old versions. No blocking between readers and writers.
Garbage collection: Old versions no longer readable by any active transaction are removed by background process (VACUUM in PostgreSQL, background compaction in others)
Storage cost: Old versions consume space until GC runs; long-running transactions prevent GC → table bloat (famous PostgreSQL VACUUM issue)

What are predicate locks and index-range locks, and why are index-range locks preferred?
?
Predicate locks:

Lock ALL data matching a given query condition, including data that doesn’t yet exist
Example: SELECT * WHERE room=101 AND time=14:00 acquires a predicate lock on that condition
Any transaction attempting to insert/update a row matching the predicate must wait
Theoretically perfect — prevents all phantoms
Problem: Every read requires checking against all outstanding predicate locks; very expensive with many active transactions

Index-range locks (practical alternative):

Lock a range in an index rather than the exact predicate
Example: Lock the index range for room=101 in the room_time index → blocks all inserts for any time in room 101
Coarser than necessary (may lock more than the exact predicate) but prevents all relevant phantoms
Checking requires only inspecting the relevant index range → efficient
Used by: MySQL InnoDB (gap locks), PostgreSQL (range locks in indexes)

Why index-range locks are preferred: They’re much cheaper to check than scanning all active predicate locks; the over-locking (false positives) rarely matters in practice because conflicting operations in the same time+room combination are exactly what you want to prevent

MVCC and Snapshot Isolation

What is actual serial execution and when does it make sense?
?

Concept: Execute transactions one at a time, on a single thread, in serial order — truly serial, not simulated-serial
Why it became viable:
1. RAM is cheap enough to hold working dataset entirely in memory (in-memory DBs like VoltDB, H-Store)
2. Disk I/O is no longer the bottleneck for many workloads; single-thread CPU can handle millions of simple ops/sec
3. Eliminating interactive transactions (user waits mid-transaction) removes the milliseconds-to-seconds latency that made single-threading impractical
Requirements:
- Entire dataset (or hot working set) fits in RAM
- Transactions are short stored procedures (pre-compiled, sent as single network round trip)
- No external calls or user interaction within transactions
Throughput: Capped at single-core speed; scale by partitioning data across cores/nodes (each shard runs a single thread)
Used by: VoltDB, H-Store, Redis (single-threaded command processing), Datomic, FoundationDB (coordinators)

Distributed Transactions

How does two-phase commit (2PC) work and what is its critical failure mode?
?
Protocol:

Phase 1 (Prepare): Coordinator sends PREPARE txn_X to all participant nodes. Each participant writes the transaction to its WAL (promise to commit if asked), then responds YES or NO
Phase 2 (Commit/Abort): If ALL participants responded YES → coordinator sends COMMIT; if ANY said NO or timed out → sends ABORT. Participants execute accordingly.
Atomicity guarantee: If coordinator says commit, ALL nodes commit. If abort, ALL nodes abort. No partial outcomes.

Critical failure mode — in-doubt transactions:

Coordinator crashes AFTER sending PREPARE (all nodes said YES) but BEFORE sending COMMIT
All participants have promised to commit; they are now in-doubt — they cannot decide without the coordinator
Participants hold locks (and block other transactions) until the coordinator recovers
Recovery time: minutes to hours; operator may need to manually resolve
This is the fundamental problem with 2PC: it’s a blocking protocol

Performance cost: 2 network round trips minimum (prepare + commit); participants hold locks during entire prepare phase; geo-distributed = latency amplification per round trip

What is the Saga pattern and what ACID property does it sacrifice?
?

Saga: A long-running business transaction decomposed into a sequence of local transactions, one per service, each with a corresponding compensating transaction
Example: Book trip = [reserve flight local txn] + [reserve hotel local txn] + [charge card local txn]. If hotel fails, compensate: [cancel flight] + [uncharge card].
What it sacrifices: Isolation — there is no isolation between concurrent sagas. While one saga has reserved a flight but not yet booked the hotel, a concurrent saga can see and interact with that reserved flight.
Why isolation is sacrificed: Sagas span multiple services with independent databases; true isolation would require distributed locking or 2PC across services
Requirements for correct sagas:
- Idempotency: Compensating and forward transactions must be safe to retry
- Commutativity: Order of compensations should not matter
- Persistent state: Workflow state stored durably (Temporal, AWS Step Functions) so crashes don’t lose progress
Suitable for: Order fulfillment, travel booking, multi-service workflows; NOT for financial ledgers or inventory where isolation is critical

Modern Context

What is the state of serializable isolation in modern databases (2026)?
?
PostgreSQL: SSI available since 9.1 (2011); mature and widely used. SET TRANSACTION ISOLATION LEVEL SERIALIZABLE uses SSI (not 2PL). Performance overhead ~10-15% vs snapshot isolation for typical OLTP. No longer valid to say “serializable is too slow” for PostgreSQL.

MySQL InnoDB: Still uses 2PL + gap locks for serializable; higher overhead than SSI. Default is Repeatable Read (which in MySQL provides stronger phantom prevention via gap locks than the SQL standard requires).

CockroachDB: Distributed SSI across all nodes as the DEFAULT and ONLY isolation level. Globally serializable. Uses Hybrid Logical Clocks (HLC) for timestamp ordering. No 2PC coordinator.

Google Spanner: TrueTime API (GPS + atomic clocks with bounded uncertainty) enables externally consistent (stronger than serializable) transactions globally. 2PC across Paxos groups, but coordinator failure handled via Paxos leader election.

Key message: By 2026, “serializable is too expensive” is outdated — SSI-based serializable isolation is efficient and available in major databases. Applications that sacrificed correctness for performance due to old 2PL implementations should re-evaluate.

How does CockroachDB achieve distributed serializable transactions without a 2PC coordinator bottleneck?
?

Hybrid Logical Clocks (HLC): Each node maintains a logical timestamp that combines wall-clock time and a logical counter, synchronized across the cluster. Provides a total ordering of transactions without requiring a central coordinator for timestamp assignment.
Distributed SSI: Every read and write is tagged with an HLC timestamp. At commit time, CockroachDB checks if any concurrent committed transaction modified data that the current transaction read (SSI-style dependency check), distributed across all nodes.
Raft per range: Each data range (shard) is managed by a Raft consensus group. Writes are committed through Raft (majority quorum). No single coordinator owns all ranges.
No 2PC coordinator SPOF: Instead of one coordinator that can fail and leave in-doubt transactions, each range’s Raft leader handles its part of the transaction independently. If a node fails, Raft elects a new leader for its ranges.
Result: Globally serializable transactions without the 2PC blocking failure mode; cross-shard transactions have higher latency (multiple Raft rounds) but are correct and non-blocking.

Concurrency Scenarios

An application checks “SELECT count(*) FROM seats WHERE flight_id=101 AND status=‘available’” and if > 0, does “UPDATE seats SET status=‘sold’ WHERE id=X AND status=‘available’”. Under snapshot isolation, is this safe from double-booking?
?

Short answer: The CAS-style UPDATE (WHERE status='available') provides safety IF the seat has a specific row that gets marked ‘sold’.
Analysis:
- The initial SELECT count check can be stale (snapshot isolation), but that’s just advisory
- The UPDATE with WHERE status='available' is effectively a compare-and-set: it only succeeds if the seat is still available at update time
- Row-level lock is acquired on the specific seat row at update time
- If two transactions both try to UPDATE the same seat, only one will succeed (row-level locking); the other will find status='sold' and update 0 rows
- Application should check affected rows: if 0, retry or return “no seats available”
Safe from double-booking: Yes, for a specific seat. The CAS-pattern is safe here.
Write skew risk: Would arise if selecting ANY available seat (not a specific one), where both transactions pick different seats based on the count > 0 check, but both seats were the last available — both proceed to book different seats and now 0 remain but oversold by 2.

Two concurrent transactions both check SELECT COUNT(*) FROM doctors WHERE on_call = TRUE (count=2) and then do UPDATE doctors SET on_call = FALSE WHERE id = ?. Under snapshot isolation, why can this lead to 0 on-call doctors?
?
This is the classic write skew example:

Both Txn A and Txn B start with the same snapshot; both read on_call count = 2
Both independently decide: “2 doctors on call, the invariant (min 1) allows me to go off call”
Txn A updates doctor_alice: on_call = FALSE → commits
Txn B updates doctor_bob: on_call = FALSE → commits
Result: 0 doctors on call; invariant violated

Why snapshot isolation fails:

Txn A and Txn B each read a different row (alice vs bob)
Neither transaction wrote to the row the other transaction read
No dirty read, no read skew, no lost update
But both reads were based on the SAME aggregate state that both transactions concurrently invalidated

Why serializability prevents it:

2PL: SELECT COUNT ... FOR UPDATE would acquire a predicate/range lock on the entire on_call set; the second transaction would wait
SSI: Detects that Txn B’s read of the on_call count was invalidated by Txn A’s commit; aborts Txn B; Txn B retries and sees count=1, decides not to go off call

Fix without serializability: SELECT * FROM doctors WHERE on_call = TRUE FOR UPDATE — locks all on-call doctor rows, preventing concurrent updates to any of them

Quick Facts

What are the isolation level defaults for the four most common databases?
?

Database	Default Isolation	Notes
PostgreSQL	Read Committed	Serializable = SSI (excellent); upgrade to SERIALIZABLE for correctness-critical ops
MySQL InnoDB	Repeatable Read	2PL + gap locks; stronger phantom prevention than standard RR
Oracle	Read Committed	Own implementation of snapshot isolation as “Serializable” (which is actually SI, not 2PL serializable)
SQL Server	Read Committed	RCSI (Read Committed Snapshot Isolation) optional for MVCC-style RC

Practical advice: PostgreSQL → use SERIALIZABLE (SSI, low overhead). MySQL → use SERIALIZABLE only if you need true phantom prevention; be aware it uses 2PL. Oracle “Serializable” is actually snapshot isolation — doesn’t prevent write skew.

What is the difference between “Repeatable Read” in PostgreSQL vs MySQL?
?
PostgreSQL Repeatable Read:

Implemented as Snapshot Isolation (MVCC-based)
Prevents: dirty reads, read skew, most phantom reads (for read-only queries)
Does NOT prevent: write skew
Does detect and prevent: lost updates (auto-detects via snapshot)
No locking; readers never block writers

MySQL InnoDB Repeatable Read:

Implemented with locking + gap locks
Prevents: dirty reads, read skew
Prevents phantoms via gap locks (stronger than standard Repeatable Read)
Does NOT prevent: write skew
Uses NEXT-KEY locks (row lock + gap lock) on indexed range queries
Writers can block readers in some cases

Key difference: MySQL’s Repeatable Read uses gap locks (2PL-style) to prevent phantoms within the read set; PostgreSQL’s uses pure MVCC. Both leave write skew unresolved. The same name (“Repeatable Read”) means different things in different databases — this is why Kleppmann says the SQL standard is ambiguous and confusing.

Total Cards: 20
Review Time: ~30 minutes
Priority: HIGH
Last Updated: 2026-05-29

Study Notes by Niladri & AI

Explorer

ch08-flashcards