Chapter 7 Flashcards - Transactions

flashcards chapter-7 ddia


Basic Concepts

What does each letter of ACID stand for and mean?
?

  • A — Atomicity: All writes in a transaction succeed or none do; transaction is abortable and rolls back on failure. Not about concurrency — about abortability.
  • C — Consistency: Application invariants must hold. Actually the application’s responsibility, not the DB’s — the “C” is misleading and doesn’t really belong in the acronym.
  • I — Isolation: Concurrent transactions behave as if they ran serially. Multiple degrees of isolation exist.
  • D — Durability: Committed data persists through crashes. Achieved via WAL, replication, etc.

Key insight: Isolation is the most nuanced — most databases default to weaker-than-serializable levels

What is a dirty read and how is it prevented?
?

  • Dirty read: Reading data written by a transaction that hasn’t yet committed
    • If that transaction later aborts and rolls back, you read data that “never existed”
  • Example: Txn A writes balance=0 (uncommitted); Txn B reads balance=0; Txn A aborts → balance was never 0
  • Prevention: Read Committed isolation level
    • DB maintains old version of row until writing transaction commits
    • Read requests only see committed data (the old version while new write is in-flight)
  • Found in: All isolation levels above Read Uncommitted

What is read skew (non-repeatable read) and how does snapshot isolation prevent it?
?

  • Read skew: Two reads of the same row within one transaction return different values
    • Can occur in Read Committed: another transaction commits between your two reads
  • Example: Reading your account balance twice during a large transfer — first read sees 0 (the transfer committed in between)
  • Prevention: Snapshot Isolation
    • At transaction start, a consistent snapshot of the DB is taken (via MVCC)
    • All reads within transaction see that snapshot, regardless of concurrent commits
    • Readers don’t block writers; writers don’t block readers

Isolation Levels

What are the four standard SQL isolation levels from weakest to strongest?
?

  1. Read Uncommitted: Can read dirty (uncommitted) writes. Almost never used.
  2. Read Committed: No dirty reads/writes. Default in Oracle 11g, PostgreSQL, SQL Server.
  3. Repeatable Read / Snapshot Isolation: No dirty reads, no non-repeatable reads. MVCC-based. Default in MySQL InnoDB.
  4. Serializable: Strongest. Results identical to running transactions one at a time.

Confusion: MySQL uses “REPEATABLE READ” for snapshot isolation; PostgreSQL uses “REPEATABLE READ” for snapshot too. Names are inconsistent across databases.

What doesn’t improve between 3 and 2: Snapshot isolation doesn’t prevent write skew or phantom reads without additional mechanisms.

What is MVCC and how does it implement snapshot isolation?
?

  • MVCC = Multi-Version Concurrency Control

  • Mechanism:

    • Each write creates a new version of the row, tagged with the writing transaction’s ID
    • Old versions kept until no active transaction needs them
    • Each row has created_by_txn and deleted_by_txn fields
  • Read rule: Transaction T reads a row version where:

    • created_by_txn ≤ T (created before or at T’s start)
    • deleted_by_txn > T (not deleted before T’s start)
  • Key property: Readers never block writers (they read old versions); writers never block readers

  • Cleanup: VACUUM (PostgreSQL) / GC removes versions no longer needed by any active transaction

Concurrency Bugs

What is a lost update and what are the four ways to prevent it?
?

  • Lost update: Two transactions both read a value, both modify it, one write overwrites the other
    • Example: Both read counter=5, both write counter=6 (should be 7)

Prevention methods:

  1. Atomic write operations: UPDATE counter SET value = value + 1 — single operation, no read needed
  2. Explicit locking: SELECT ... FOR UPDATE — acquires row lock; concurrent reads wait
  3. Compare-and-set (CAS): UPDATE WHERE value = old_value — fails if changed; app retries
  4. Automatic detection: Snapshot isolation (PostgreSQL) detects and aborts one conflicting transaction

Best practice: Use atomic operations whenever possible; explicit locks for complex logic; CAS for optimistic concurrency

What is write skew and how does it differ from a lost update?
?

  • Lost update: Two transactions write to the SAME row; one write is lost
  • Write skew: Two transactions read the SAME rows, each writes to DIFFERENT rows, together violating an invariant

Write skew example (on-call doctors):

  1. Both doctors check: SELECT COUNT(*) FROM on_call → both see 2 (safe to go off-call)
  2. Both delete themselves from on_call table
  3. Both commits succeed — now 0 doctors on call (invariant violated!)

Why snapshot isolation doesn’t prevent it: Each transaction’s individual write is valid given its read; the conflict only emerges when both commits succeed

Prevention: Serializable isolation (2PL or SSI); or application-level advisory locks; or re-read before commit

What is a phantom read and why is it hard to prevent?
?

  • Phantom read: Transaction reads a set of rows matching a condition; another transaction inserts a new row matching that condition; first transaction’s re-read sees the new “phantom” row

Example (room booking):

  1. Txn A checks: SELECT * WHERE room=101 AND date='Apr 14' → 0 rows (room available)
  2. Txn B also checks → 0 rows → inserts booking → commits
  3. Txn A inserts booking → DOUBLE BOOKING!

Why hard to prevent: Row-level locks don’t help (the conflicting row didn’t exist when A read)

Solutions:

  • Predicate locks (2PL): Lock all rows matching a condition (even future rows) — expensive
  • Index-range locks: Simplified predicate lock on a range in an index — less precise but cheaper
  • Materializing conflicts: Insert explicit “placeholder” rows to make the phantom lockable
  • SSI: Tracks which snapshots reads were based on; detects if a concurrent write would have changed that read

Serializable Isolation Approaches

Compare actual serial execution, 2PL, and SSI for serializable isolation.
?
Actual serial execution (VoltDB, Redis, Datomic):

  • Single thread executes transactions one at a time
  • ✅ Simplest, no locking overhead, no deadlocks
  • ❌ Single-core throughput limit; requires all data in RAM; transactions must be short stored procedures

Two-Phase Locking (2PL):

  • Writers block readers; readers block writers (pessimistic)
  • Phase 1 (expanding): acquire locks; Phase 2 (shrinking): release at commit
  • ✅ Prevents all anomalies including phantoms (with predicate/index-range locks)
  • ❌ Slow due to lock contention; deadlocks require detection and abort

Serializable Snapshot Isolation (SSI) (PostgreSQL, CockroachDB):

  • Built on snapshot isolation + optimistic conflict detection at commit time
  • Readers don’t block writers; writers don’t block readers
  • ✅ Performance close to snapshot isolation at low contention
  • ❌ Higher abort rate when contention is high; aborted transactions must be retried

How does Two-Phase Locking (2PL) work and what are its failure modes?
?
How it works:

  • Shared lock (for reads): Multiple transactions can hold simultaneously
  • Exclusive lock (for writes): Only one at a time; blocks all other readers and writers
  • Phase 1 (Expanding): Acquire locks as operations proceed (reads: shared; writes: exclusive)
  • Phase 2 (Shrinking): Release all locks only at commit or abort (hold throughout transaction)

Phantoms: Need predicate locks (lock all rows matching a WHERE clause) or index-range locks (lock a range in index)

Deadlocks: Txn A holds lock A, waits for B; Txn B holds lock B, waits for A → deadlock

  • Detection: DB detects cycle in wait-for graph → aborts one transaction
  • Prevention: Transaction ordering (lock in consistent order)

Performance: 2PL is significantly slower than snapshot isolation; lock contention under concurrent writes

Distributed Transactions

How does two-phase commit (2PC) work and what is its main vulnerability?
?
2PC protocol:

  1. Prepare phase: Coordinator sends Prepare to all participants
    • Each participant writes its decision (yes/no) to its own WAL, then responds
    • A yes vote is irrevocable — participant must commit if asked
  2. Commit phase: If ALL voted yes → coordinator writes commit to its WAL → sends Commit to all
    • If any voted no → coordinator sends Abort to all

Main vulnerability: Coordinator failure

  • If coordinator fails AFTER receiving all yes votes but BEFORE sending Commit:
    • All participants are in “in-doubt” state — they promised to commit but haven’t been told to
    • They hold locks; cannot proceed or abort without coordinator’s decision
    • Must wait for coordinator to recover (could be hours)
    • Why 2PC is called “blocking commit protocol”: In-doubt transactions can block indefinitely

Modern alternative: SSI-based distributed transactions (CockroachDB, Spanner) avoid 2PC’s blocking nature

Modern Context (2026)

What is the Saga pattern and when is it used instead of 2PC for distributed transactions?
?

  • Saga: A sequence of local transactions, each in a separate service, with compensating transactions for rollback
  • When used: Across microservice boundaries where distributed ACID transactions (2PC) are too expensive

Two implementation approaches:

  1. Choreography: Each service publishes events; other services react
    • ✅ Decoupled; ❌ Hard to track overall progress
  2. Orchestration: Saga orchestrator calls services in sequence
    • ✅ Clear overall flow; ❌ Central coordinator

Compensating transactions:

  • Cannot “undo” side effects (email sent, payment charged), but can logically reverse them
  • Example: Order cancellation → reverse the shipping, refund payment

Trade-off vs 2PC: Sagas are eventually consistent (compensations can fail); 2PC is atomic but expensive and blocking

When to use Sagas: Cross-service workflows, long-running processes, microservices architecture

How has SSI (Serializable Snapshot Isolation) changed database best practices in 2026?
?

  • 2008: SSI invented (Michael Cahill’s PhD thesis); implemented in PostgreSQL 9.1 (2011)
  • Traditional view: “Serializable is too slow; use lower isolation level + manual fixes”
  • 2026 view: SSI is mature and usable; PostgreSQL SSI adds ~10% overhead at low contention

Practical changes:

  • Many applications that used REPEATABLE READ + manual conflict handling → switched to SERIALIZABLE
  • CockroachDB uses SSI as its only isolation level (everything is serializable)
  • Applications no longer need to manually implement write skew prevention with application locks

Why SSI works well at low contention:

  • No locks held during transaction; reads are free
  • Only at commit time: check if any “stale reads” would have been different
  • Most transactions don’t conflict → commit immediately without overhead

Caveat: High-contention workloads still have high abort rates → 2PL or application-level coordination needed

Interview Scenarios

Design the transaction isolation for a bank account transfer system.
?
Requirements: Move $100 from Alice to Bob atomically; no overdrafts; correct balance always visible

Minimum needed: Serializable isolation (write skew possible if checking balance in snapshot isolation)

Implementation with SSI (PostgreSQL):

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
SELECT balance FROM accounts WHERE id = alice FOR UPDATE;  -- or just SELECT
-- check: balance >= 100
UPDATE accounts SET balance = balance - 100 WHERE id = alice;
UPDATE accounts SET balance = balance + 100 WHERE id = bob;
COMMIT;

Key decisions:

  • Use FOR UPDATE or serializable level to prevent concurrent overdraft
  • Two separate UPDATE statements — both in one transaction (atomic)
  • Check balance within transaction (not before starting)
  • Handle retry logic for SSI aborts

Alternative (higher performance): Single atomic stored procedure with serial execution

You’re seeing phantom booking conflicts in a room reservation system despite using transactions. How do you fix it?
?
Problem: Two users both check availability → both see available → both book → double booking

  • This is a phantom read — the conflict is on a row that didn’t exist when they checked
  • Snapshot isolation won’t prevent it (no row to lock)

Solutions:

Option 1: Serializable isolation (SSI)

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- SSI tracks that this read should detect concurrent inserts
SELECT * FROM bookings WHERE room=101 AND date='2026-04-14';
INSERT INTO bookings VALUES (101, '2026-04-14', user_id);
COMMIT;
-- If concurrent transaction also committed, one will be aborted and retried

Option 2: Materialized conflict row

-- Pre-insert placeholder for each room-date combination
-- Use SELECT FOR UPDATE on the placeholder
SELECT * FROM room_availability WHERE room=101 AND date='2026-04-14' FOR UPDATE;
-- Now there's a real row to lock!
INSERT INTO bookings...;

Option 3: Unique constraint

  • Add UNIQUE(room_id, date) constraint
  • Let DB reject duplicate inserts; handle error in app

Recommended: Unique constraint + application retry (simplest, most robust)

Quick Facts

What isolation level does PostgreSQL default to, and what is the highest available?
?

  • Default: READ COMMITTED (not snapshot/repeatable read)
  • Highest available: SERIALIZABLE (SSI-based since PostgreSQL 9.1)
  • Also available: REPEATABLE READ (snapshot isolation in PostgreSQL)
  • NOT available in PostgreSQL: READ UNCOMMITTED (treated same as READ COMMITTED)

Setting isolation level:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- or for all transactions in session:
SET default_transaction_isolation TO serializable;

Key PostgreSQL implementation detail: PostgreSQL’s SSI does NOT use predicate locking like traditional 2PL — it tracks “anti-dependencies” between transactions and aborts one if a cycle is detected

What is the difference between 2PL and 2PC?
?

  • 2PL = Two-Phase Locking (concurrency control within a single node)

    • Phase 1: Acquire locks as operations execute
    • Phase 2: Release locks at commit/abort
    • Purpose: Prevent concurrent anomalies on single node
    • No network involved
  • 2PC = Two-Phase Commit (atomic commit across multiple nodes/databases)

    • Phase 1 (Prepare): Ask all participants if they can commit
    • Phase 2 (Commit/Abort): Tell all participants to commit or abort based on votes
    • Purpose: Atomic distributed transaction
    • Network protocol between coordinator and participants

Both use “two phases” but serve completely different purposes!

Total Cards: 35
Estimated Review Time: 20-30 minutes
Recommended Frequency: Daily for first week, then spaced repetition
Last Updated: 2026-04-13