Chapter 8: Transactions

ddia-2e transactions acid isolation concurrency

Status: Notes complete

Overview

Transactions are the database’s primary mechanism for managing errors and concurrent access. They bundle multiple read and write operations into a single logical unit: either all operations succeed (commit), or none of them take effect (abort/rollback). The classic framework for describing transaction guarantees is ACID (Atomicity, Consistency, Isolation, Durability), but the 2nd edition is careful to show that each letter has nuances, and that real-world databases offer a spectrum of isolation levels — most of which provide weaker guarantees than the ideal for performance reasons.

Why transactions matter: Without transaction support, concurrent writes can corrupt data in subtle ways: lost updates, dirty reads, and write skew are bugs that appear only under load and are notoriously hard to reproduce. Transactions make these problems tractable by giving developers a clean correctness model to reason about.

2nd Edition updates: This chapter closely parallels 1st edition Ch7, but includes updated examples for modern cloud databases (Aurora, Spanner, CockroachDB), expanded coverage of SSI in production, and discussion of where transactions fit in microservices architectures (saga pattern). The chapter reinforces that most NewSQL databases by 2026 provide serializable isolation efficiently via SSI.

Key Concepts

What Exactly Is a Transaction?

A transaction is a group of reads and writes treated as one atomic operation by the database. The database guarantees that the transaction either commits (all writes persisted) or aborts (all writes rolled back, as if the transaction never happened).

Why this matters: Without transactions, if your application writes to three tables and crashes after the second write, the database is in an inconsistent state. With transactions, the incomplete writes are rolled back automatically.

Transaction boundaries: Delimited by BEGIN/START TRANSACTION and COMMIT or ROLLBACK. Any reads and writes between these markers are part of the transaction.

BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 'alice';
  UPDATE accounts SET balance = balance + 100 WHERE id = 'bob';
COMMIT;
-- Either both writes happen, or neither does

The Meaning of ACID

The ACID acronym was coined in 1983 by Haerder and Reuter. The 2nd edition discusses each property carefully:

Atomicity (A):

All-or-nothing: if any operation in the transaction fails, all operations are rolled back
The key word is abortability — the ability to undo a partial transaction
Not about concurrency (despite the word “atomic” in concurrent programming contexts); about crash recovery
Example: Money transfer — debit fails after credit applied → the credit is rolled back; no money lost

Consistency (C):

The most misleading letter: it refers to application-defined invariants (e.g., account balances never go negative; total money in system is conserved)
The database enforces certain constraints (NOT NULL, FOREIGN KEY, UNIQUE), but it cannot enforce arbitrary application logic
C is the application’s responsibility, not the database’s
Kleppmann argues the C was “bolted on” to make the acronym work; it doesn’t describe a database property per se

Isolation (I):

Concurrently executing transactions are isolated from each other as if they ran serially
Serializability is the strongest form: the database produces results equivalent to some serial ordering of the transactions
In practice, most databases use weaker isolation levels for performance (Read Committed, Snapshot Isolation)
The isolation level is the most important and nuanced property — the rest of the chapter explores its degrees

Durability (D):

Once a transaction commits, the data is persisted and survives hardware failures, crashes, and power outages
Typically implemented via: write-ahead log (WAL) flushed to disk before commit acknowledgment
In distributed databases: data replicated to multiple nodes before commit (stronger durability)
“Perfect durability does not exist”: correlated disk failures, firmware bugs, or datacenter disasters can still lose data; durability is probabilistic

ACID Properties — Concrete Example

Property	What It Prevents	Example Violation Without It
Atomicity	Partial writes on crash	Transfer: debit succeeds, credit fails → money disappears
Consistency	Invariant violations	Double-booking: two users book the last hotel room simultaneously
Isolation	Concurrency anomalies	Two processes both read balance=100, both write 150 (should be 200)
Durability	Data loss on crash	Transaction commits, server crashes before flush → data gone

Single-Object and Multi-Object Operations

Single-object atomicity: Even for a single object, writes can be non-atomic (e.g., writing a large JSON blob where a crash mid-write corrupts the record). Databases handle this with page-level atomicity, WAL, or copy-on-write.

Multi-object transactions: Most interesting application operations touch multiple objects:

Inserting a row into a table and updating a counter in another table
Updating denormalized data (document DB: user document + their post documents)
Maintaining referential integrity across tables

The case against multi-object transactions (NoSQL argument):

Cross-shard transactions require distributed coordination (2PC)
Distributed transactions are expensive, complex, and reduce availability
Many NoSQL databases historically avoided them (MongoDB pre-4.0, early Cassandra)

The case for multi-object transactions:

Single-object atomicity is insufficient for most real applications
Error handling without transactions is deeply complex: partial failures leave data inconsistent
Retrying failed operations can cause duplicate effects without idempotency guarantees
By 2026, most databases offer multi-object transactions: MongoDB 4.0+, Cassandra (lightweight transactions), DynamoDB (transactional APIs)

Weak Isolation Levels

Because full serializability has performance costs, databases offer weaker isolation levels that prevent some — but not all — concurrency anomalies.

Read Committed

The minimum standard: The most widely-used isolation level; the default in PostgreSQL, Oracle, SQL Server, and many others.

What it prevents:

Dirty reads: Reading data written by a transaction that has not yet committed
Dirty writes: Overwriting data written by a transaction that has not yet committed

What it does NOT prevent: Read skew, non-repeatable reads, phantom reads, lost updates, write skew

Dirty Read (prevented by Read Committed):
─────────────────────────────────────────
Time  Transaction A              Transaction B
t1    UPDATE balance=200         
t2                               SELECT balance → sees 200 (dirty read!)
t3    ROLLBACK (error)           
t4                               -- B saw uncommitted data that no longer exists

Implementation:

No dirty reads: DB remembers the old committed value until transaction commits; other transactions read the old value
No dirty writes: Row-level locks held until commit; only one writer per row at a time

Snapshot Isolation and Repeatable Read

What it prevents: Read skew (non-repeatable reads) — the problem where a transaction reads different values for the same row at different times within the same transaction.

Read Skew (prevented by Snapshot Isolation):
──────────────────────────────────────────────
Time  Transaction A (bank backup)   Transaction B (transfer $100 alice→bob)
t1    SELECT balance FROM alice → 500
t2                                   UPDATE alice SET balance=400
t3                                   UPDATE bob SET balance=600
t4                                   COMMIT
t5    SELECT balance FROM bob → 600
t6    -- A sees alice=500, bob=600 → total=1100 (should be 1000!)
      -- A got an inconsistent view of the world

How Snapshot Isolation works: Each transaction sees a consistent snapshot of the database as of the transaction’s start time. Writes by concurrent transactions that commit after the current transaction started are invisible to the current transaction.

MVCC (Multi-Version Concurrency Control): The standard implementation:

Each row stores multiple versions, tagged with transaction IDs (or timestamps)
Each transaction has a snapshot timestamp; it only sees row versions with created_by_txn_id ≤ snapshot_txn_id and not yet deleted at snapshot time
Reads never block writes; writes never block reads
Old versions are garbage-collected by a background process (VACUUM in PostgreSQL)

MVCC row versions:
──────────────────────────────────────────────────────────────────
Row: alice_balance
  Version 1: value=500, created_by=txn_42, deleted_by=txn_100
  Version 2: value=400, created_by=txn_100, deleted_by=NULL (current)

Transaction txn_99 (started before txn_100):
  → Sees Version 1 (value=500) because created_by=42 ≤ 99 and deleted_by=100 > 99

Transaction txn_101 (started after txn_100):
  → Sees Version 2 (value=400) because created_by=100 ≤ 101 and deleted_by=NULL

Naming confusion: “Snapshot Isolation” is often called “Repeatable Read” in PostgreSQL and MySQL — but MySQL’s “Repeatable Read” actually provides slightly different guarantees than PostgreSQL’s. The SQL standard is poorly defined on this point.

Preventing Lost Updates

Lost update: Two transactions concurrently perform a read-modify-write cycle on the same data; one transaction’s write overwrites the other’s without incorporating both updates.

Lost Update (read-modify-write race):
──────────────────────────────────────
Time  Transaction A                 Transaction B
t1    SELECT counter FROM t → 5    
t2                                  SELECT counter FROM t → 5
t3    UPDATE t SET counter = 6     
t4    COMMIT                       
t5                                  UPDATE t SET counter = 6
t6                                  COMMIT
      -- Counter should be 7 (incremented twice), but both see 5 and write 6

Common scenarios: Counter increments, shopping cart updates, wiki page editing (last-write-wins), account balance modifications via read-modify-write.

Solutions (in order of preference):

1. Atomic write operations (best when available):

UPDATE counters SET value = value + 1 WHERE key = 'page_views';
-- Single operation: DB handles it atomically; no application-level read needed

Database serializes concurrent atomic writes on the same row
Works for: increments, appends, set operations

2. Explicit locking (SELECT FOR UPDATE):

BEGIN;
SELECT * FROM accounts WHERE id = 'alice' FOR UPDATE;  -- exclusive lock
-- application logic here
UPDATE accounts SET balance = new_value WHERE id = 'alice';
COMMIT;

Lock held until commit; other transactions wait
Allows complex application logic between read and write

3. Compare-and-set (CAS):

UPDATE wiki_pages
SET content = 'new content', version = 6
WHERE id = 123 AND version = 5;  -- only update if version hasn't changed
-- 0 rows affected → version changed, retry needed

Fails if another transaction modified the value since the read
Application must check affected row count and retry if 0
Caveat: Some databases read from snapshot during CAS check — CAS may not detect the conflict

4. Automatic lost update detection (PostgreSQL):

With REPEATABLE READ / Snapshot Isolation, PostgreSQL automatically detects lost updates
Aborts one of the conflicting transactions; application retries
No explicit locking required

Write Skew and Phantoms

Write skew: Both transactions read the same data, each decides to make a write that would be valid when considered alone, but the combination of both writes violates an application invariant.

Write Skew (doctor on-call example):
──────────────────────────────────────────────────────────
Invariant: At least 1 doctor must be on call at all times

Time  Doctor Alice                    Doctor Bob
t1    SELECT count(*) FROM on_call → 2
t2                                    SELECT count(*) FROM on_call → 2
t3    -- "2 doctors on call, I can go off" --  "2 on call, I can go off"
t4    DELETE FROM on_call WHERE id=alice
t5                                    DELETE FROM on_call WHERE id=bob
t6    COMMIT                          COMMIT
t7    -- 0 doctors on call! Invariant violated!

Write skew is NOT prevented by snapshot isolation because each transaction’s individual write was valid based on what it read — but the combined effect violates the invariant. The conflict is between the reads of one transaction and the writes of another.

Phantom reads: A transaction queries a set of rows matching a condition; another transaction inserts a row matching that condition; the first transaction repeating the query sees new rows that didn’t exist before.

Phantom Read (meeting room booking):
──────────────────────────────────────────────────────────
Time  Transaction A (book room 101, 2pm)   Transaction B (book room 101, 2pm)
t1    SELECT count(*) FROM bookings
      WHERE room=101 AND time=14:00 → 0
t2                                          SELECT count(*) FROM bookings
                                            WHERE room=101 AND time=14:00 → 0
t3    INSERT INTO bookings ...room=101, time=14:00
t4    COMMIT
t5                                          INSERT INTO bookings ...room=101, time=14:00
t6                                          COMMIT
      -- DOUBLE BOOKING: two transactions both see no conflict and both insert

The phantom “doesn’t exist yet” when either transaction starts — snapshot isolation only prevents reads of existing rows from changing; it cannot prevent phantom inserts.

Phantoms require predicate locks or index-range locks to prevent:

Predicate lock: Lock all rows that match a given query condition (including rows that don’t exist yet)
Index-range lock: Lock a range in an index (approximation of predicate locks; simpler to implement)

Serializability

The gold standard: transactions appear as if they executed one at a time, in some serial order. Serializability prevents all concurrency anomalies: dirty reads, dirty writes, read skew, lost updates, write skew, and phantoms.

Three approaches to achieving serializability:

Actual Serial Execution

Run transactions literally one at a time, on a single thread.

Why it’s feasible now:

RAM is cheap enough to hold the entire working dataset in memory (in-memory DBs: VoltDB, Redis)
Network latency (not disk I/O) is the bottleneck
Interactive transactions (waiting for user input mid-transaction) eliminated — stored procedures only
Single-threaded throughput on modern hardware (millions of simple ops/sec) is sufficient for many use cases

Requirements:

Entire dataset fits in RAM
Transactions must be short and fast (no user interaction, no external calls)
Transactions implemented as stored procedures (pre-compiled, sent to DB as a unit)
Throughput limited to single-core throughput — can scale by partitioning data

Used by: VoltDB (H-Store), Redis (single-threaded), Datomic, FoundationDB

Limitations:

Cannot span multiple shards efficiently (cross-shard serial transactions are very expensive)
Throughput capped at single-core capacity; need to partition data to scale

Two-Phase Locking (2PL)

The classic serializable isolation algorithm; used in databases for decades.

Core principle: Writers block readers AND readers block writers (unlike snapshot isolation where readers and writers don’t block each other).

Two phases:

Phase 1 (Expanding/Growing): Acquire locks as needed during transaction execution
Phase 2 (Shrinking): Release locks only at commit or abort; never release a lock during the transaction

Lock types:

Shared lock (S): Multiple transactions can hold simultaneously (for reads); compatible with other S locks
Exclusive lock (X): Only one transaction at a time; incompatible with S or X locks (for writes)
A read requires S lock; a write requires upgrading to X lock

Predicate locks (prevents phantoms):

Lock not just existing rows, but the predicate (condition) used in a query
SELECT ... WHERE room=101 AND time=14:00 → lock the predicate room=101 AND time=14:00
Any insert/update that would match this predicate must wait for the lock

Index-range locks (practical approximation):

Lock a range in an index instead of the exact predicate (coarser lock, but simpler)
May lock more than necessary (false positives) but prevents all phantoms
Used by: PostgreSQL, MySQL InnoDB (gap locks)

MySQL InnoDB Gap Locks:
──────────────────────────────────────────────────────────────────
Index on (room, time): ... [101,13:00] [101,14:00] [101,15:00] ...
                                           ↑
Transaction queries room=101, time=14:00 — acquires gap lock on
the range (101,13:00) to (101,15:00), preventing any insert in that gap

Deadlocks: 2PL can produce deadlocks (A waits for B’s lock; B waits for A’s lock). Database must detect and abort one transaction; application must retry.

Performance: 2PL has significantly lower throughput than weaker isolation levels. Lock contention under high concurrency causes long wait queues. This led to the development of SSI.

Serializable Snapshot Isolation (SSI)

An optimistic concurrency control approach: let transactions run without blocking, detect conflicts at commit time, abort if a conflict is found.

How SSI works:

Every transaction runs under snapshot isolation (MVCC — reads from consistent snapshot)
The database tracks read-write dependencies: which reads relied on data that was subsequently modified by another committed transaction
At commit time: if the transaction’s reads are now “stale” (a committed concurrent transaction modified data the current transaction read), the current transaction is aborted
Application retries the aborted transaction

SSI conflict detection:
──────────────────────────────────────────────────────────────────
Time  Txn A (doctor alice)          Txn B (doctor bob)
t1    reads on_call_count = 2
t2                                   reads on_call_count = 2
t3    deletes alice from on_call
t4    COMMIT (succeeds — no conflict detected yet)
t5                                   deletes bob from on_call
t6                                   COMMIT → ABORT!
      (B's read of on_call_count was based on a snapshot that A has now modified)
      Application retries B

SSI advantages over 2PL:

Readers never block writers; writers never block readers
No lock contention → higher throughput under mixed read/write workloads
Deadlocks impossible (no locks held)
Performance degrades gracefully: only high-contention workloads see many aborts

SSI advantages over actual serial execution:

Transactions can run on multiple cores/nodes simultaneously
Better utilization of hardware; scales to distributed systems

Used by: PostgreSQL (serializable since 9.1, 2011), CockroachDB, Datomic, YugabyteDB

Performance characteristics:

Low contention workloads: SSI ≈ snapshot isolation performance (very low abort rate)
High contention workloads: SSI abort rate increases; may approach 2PL performance or worse
Best for: Read-heavy workloads with occasional writes; low-to-medium contention

Distributed Transactions and 2PC

Problem: How do you atomically commit a transaction that touches multiple nodes (multiple shards, multiple services)?

Two-Phase Commit (2PC)

The classic protocol for distributed atomic commit:

Two-Phase Commit Protocol:
──────────────────────────────────────────────────────────────────
                    Coordinator

Phase 1 (Prepare):
Coordinator → Node A: "Can you commit txn_99?" → Node A: "YES, prepared"
Coordinator → Node B: "Can you commit txn_99?" → Node B: "YES, prepared"
Coordinator → Node C: "Can you commit txn_99?" → Node C: "YES, prepared"

(Each node writes redo log to disk; promises to commit if asked)

Phase 2 (Commit/Abort):
IF all responded YES:
  Coordinator → Node A, B, C: "COMMIT txn_99"
  Each node commits and releases locks

IF any responded NO (or timeout):
  Coordinator → all nodes: "ABORT txn_99"
  Each node rolls back

The 2PC coordinator is a single point of failure:

If coordinator crashes after sending prepare but before sending commit/abort:
- Participants hold locks indefinitely; cannot proceed or rollback
- These are called in-doubt transactions: blocked waiting for coordinator recovery
Recovery: coordinator must recover its commit/abort decision from its WAL; then complete the protocol
Until recovery: participants remain blocked (potentially for minutes or hours)

Performance cost:

2 round trips (prepare + commit) between coordinator and all participants
Participants hold locks during the entire prepare phase
Network round trips add significant latency for geographically distributed nodes

XA Transactions: A standard interface for 2PC across heterogeneous data stores (different DB vendors, message queues). Used in Java EE/Jakarta EE with JTA. Provides vendor-neutral 2PC, but has high latency and is rarely used in modern systems.

Why 2PC is problematic:

Blocking protocol: in-doubt transactions block until coordinator recovers
Amplifies failures: any participant failure during commit blocks all participants
Performance: significant overhead for geo-distributed transactions

Modern Alternatives to 2PC

Saga pattern (for microservices):

Break a long-running transaction into a sequence of local transactions, each with a compensating action
If step 3 fails, execute compensating transactions for steps 2 and 1 to undo their effects
Not ACID: no isolation between sagas running concurrently; requires idempotency
Orchestration: One central orchestrator drives the sequence
Choreography: Services react to events and emit their own events (decentralized)

SSI-based distributed transactions (NewSQL):

CockroachDB, Spanner: distributed SSI without 2PC’s blocking nature
Spanner: TrueTime API (GPS + atomic clocks) provides bounded clock uncertainty → timestamps can safely order cross-node transactions
CockroachDB: hybrid logical clocks (HLC) + SSI across nodes
These are effectively non-blocking distributed transactions with serializable guarantees

Isolation Levels Comparison Table

Isolation Level	Dirty Reads	Non-Repeatable Reads	Phantom Reads	Write Skew	Lost Updates	Notes
Read Uncommitted	Allowed	Allowed	Allowed	Allowed	Allowed	Rarely used; dangerous
Read Committed	Prevented	Allowed	Allowed	Allowed	Allowed	Most common default (PG, Oracle, SQL Server)
Repeatable Read	Prevented	Prevented	Allowed	Allowed	Allowed*	MySQL default; PG implements as Snapshot
Snapshot Isolation	Prevented	Prevented	Prevented**	Allowed	Detected (PG)	MVCC-based; not in SQL standard
Serializable (2PL)	Prevented	Prevented	Prevented	Prevented	Prevented	Predicate/gap locks; slower
Serializable (SSI)	Prevented	Prevented	Prevented	Prevented	Prevented	Optimistic; better throughput than 2PL

*MySQL with gap locks prevents lost updates under Repeatable Read
**Snapshot isolation prevents phantom reads for read-only queries; write-time phantoms still possible without full serializability

2PL vs SSI Comparison

Dimension	Two-Phase Locking (2PL)	Serializable Snapshot Isolation (SSI)
Type	Pessimistic (lock before acting)	Optimistic (act, detect, abort)
Reader/writer blocking	Readers block writers; writers block readers	Readers never block; writers never block
Deadlocks	Possible (detection + abort needed)	Impossible (no locks)
Throughput (low contention)	Moderate	High (near snapshot isolation)
Throughput (high contention)	Low (lock queues)	Very low (many aborts + retries)
Latency per txn	Higher (lock acquisition waits)	Lower (no waiting)
Phantom prevention	Index-range/predicate locks	Tracked read-write dependencies
Suitable for	High contention, write-heavy	Low-medium contention, read-heavy
Used by	MySQL InnoDB, SQL Server	PostgreSQL, CockroachDB, YugabyteDB

When to use 2PL: When contention is high and you want predictable throughput without abort storms; when the workload has many short writes to the same rows.

When to use SSI: Most modern OLTP workloads with mixed reads/writes; when you want serializable guarantees without the operational overhead of lock management; when read-heavy and occasional write conflicts are acceptable.

Important Points Summary

ACID “Consistency” is the application’s job, not the database’s: The database enforces constraints (UNIQUE, NOT NULL, FK), but application-level invariants (account balance conservation, at-most-one booking) require correct transaction design.
Snapshot Isolation is not serializable: It prevents dirty reads and read skew but allows write skew and phantom-based anomalies. Many applications running under “Repeatable Read” assume serializability but don’t have it.
Lost updates are the most common concurrency bug: Every read-modify-write cycle without atomic operations or explicit locking is a potential lost update. They appear rarely in testing, massively at production load.
Write skew requires serializability to prevent: Snapshot isolation cannot detect write skew because the reads and writes happen in different transactions. Only predicate locks (2PL) or dependency tracking (SSI) catch it.
SSI is the modern default for serializable isolation: PostgreSQL’s SSI implementation (since 9.1) is mature; CockroachDB and YugabyteDB also use SSI. The old assumption that serializable = slow is no longer valid for most workloads.
2PC creates in-doubt transactions and blocks on coordinator failure: This makes 2PC unsuitable for high-availability systems. NewSQL databases implement non-blocking distributed serializable transactions (Spanner, CockroachDB) that avoid 2PC’s blocking nature.
The saga pattern trades atomicity for availability in microservices: Long-running cross-service operations are decomposed into local transactions with compensating rollbacks. Sagas are eventually consistent, not isolated — concurrent sagas can interfere.
MVCC enables snapshot isolation without blocking: Readers and writers can proceed concurrently because old row versions are maintained. The cost is storage for multiple versions and background garbage collection.

Modern Context (2026)

SSI is the default serializable isolation (not 2PL):
By 2026, the standard advice for applications that need serializability is SET TRANSACTION ISOLATION LEVEL SERIALIZABLE in PostgreSQL (or equivalent), which uses SSI — not 2PL. The performance gap between SSI and snapshot isolation is small for most OLTP workloads. “Serializable is too slow” is mostly a relic of 2PL-based implementations.

NoSQL now offers transactions:
The era of “NoSQL = no transactions” is over. MongoDB 4.0+ (2018): multi-document ACID transactions. DynamoDB: TransactWriteItems API (2018). Cassandra: Lightweight Transactions (LWT) using Paxos for single-partition conditional writes. By 2026, almost every major database offers some form of multi-object transaction support.

NewSQL: distributed serializable transactions at scale:
CockroachDB, Google Spanner, and YugabyteDB provide globally distributed serializable transactions as a first-class feature:

Spanner: TrueTime API (bounded clock uncertainty from GPS + atomic clocks) allows timestamp-based distributed serializable reads
CockroachDB: Hybrid Logical Clocks + SSI across all shards; no 2PC coordinator bottleneck
These systems absorb the complexity that 2PC exposed to the application

Sagas in microservices:
Across service boundaries, database transactions are impossible. The saga pattern (sequence of local transactions + compensating actions) has become the standard. Tools like Temporal (durable execution engine) automate saga orchestration with persistent workflow state, making compensating logic easier to write and reason about.

Aurora and distributed transaction innovations:
AWS Aurora: shared storage layer (not shared-nothing); transactions within a cluster are purely local without distributed coordination, because all nodes share the same log-structured storage. This enables fast multi-writer transactions for relational workloads without the latency of 2PC.

Questions for Reflection

Which ACID property is most important for preventing data corruption — and which one is the application’s responsibility rather than the database’s?
A user complains that their bank balance appears inconsistent during a transfer: they see the debit but not the credit. Which isolation anomaly is this, and what isolation level prevents it?
Explain why two transactions can each individually pass a “check invariant” query, yet together violate the invariant. What isolation level would prevent this?
An application performs a read-modify-write to update a JSON document stored in a database. Under which conditions does each of the four lost update prevention strategies (atomic ops, SELECT FOR UPDATE, CAS, automatic detection) apply?
Why does SSI outperform 2PL on typical OLTP workloads, and under what conditions would 2PL be preferable?
Why is the saga pattern “not a transaction” — what specific ACID property does it sacrifice, and what does the application need to do to compensate?

ch07-sharding — Cross-shard transactions require distributed coordination (2PC or saga)
ch09-trouble-with-distributed-systems — Network failures affect 2PC; coordinator failure creates in-doubt transactions
ch10-consistency-and-consensus — Consensus protocols (Paxos, Raft) used in CockroachDB/Spanner for distributed transactions
ch07-transactions — 1st edition version for comparison; same core content
ch06-replication — Replication and transactions interact: replication lag affects read-your-own-writes, which looks like a transaction isolation issue

Last Updated: 2026-05-29

Study Notes by Niladri & AI

Explorer

ch08-transactions

Chapter 8: Transactions

Overview

Key Concepts

What Exactly Is a Transaction?

The Meaning of ACID

ACID Properties — Concrete Example

Single-Object and Multi-Object Operations

Weak Isolation Levels

Read Committed

Snapshot Isolation and Repeatable Read

Preventing Lost Updates

Write Skew and Phantoms

Serializability

Actual Serial Execution

Two-Phase Locking (2PL)

Serializable Snapshot Isolation (SSI)

Distributed Transactions and 2PC

Two-Phase Commit (2PC)

Modern Alternatives to 2PC

Isolation Levels Comparison Table

2PL vs SSI Comparison

Important Points Summary

Modern Context (2026)

Questions for Reflection

Graph View

Table of Contents

Backlinks

Study Notes by Niladri & AI

Explorer

ch08-transactions

Chapter 8: Transactions

Overview

Key Concepts

What Exactly Is a Transaction?

The Meaning of ACID

ACID Properties — Concrete Example

Single-Object and Multi-Object Operations

Weak Isolation Levels

Read Committed

Snapshot Isolation and Repeatable Read

Preventing Lost Updates

Write Skew and Phantoms

Serializability

Actual Serial Execution

Two-Phase Locking (2PL)

Serializable Snapshot Isolation (SSI)

Distributed Transactions and 2PC

Two-Phase Commit (2PC)

Modern Alternatives to 2PC

Isolation Levels Comparison Table

2PL vs SSI Comparison

Important Points Summary

Modern Context (2026)

Questions for Reflection

Related Resources

Graph View

Table of Contents

Backlinks