Chapter 9: Data Ownership and Distributed Transactions
saht data-ownership distributed-transactions eventual-consistency joint-ownership 2pc
Status: Notes complete
Overview
In a monolith, transactions are straightforward: a single ACID database handles everything. Once you decompose into microservices — each owning its own database — two hard problems emerge:
- Who owns what data? When a table’s data is written by multiple services, you have a conflict of ownership.
- How do you keep data consistent across service boundaries? Without a shared database, full ACID guarantees disappear.
Chapter 9 systematically works through both problems. It catalogs three data ownership scenarios in order of difficulty, four techniques for resolving the hardest case (joint ownership), and three eventual consistency patterns for managing distributed transactions. The chapter is foundational for ch11-managing-distributed-workflows and ch12-transactional-sagas, which build richer patterns on top of these foundations.
The Sysops Squad Saga application threads throughout: as the monolith is decomposed, the ticket processing subsystem reveals all three ownership scenarios and forces the team to choose among the resolution techniques.
Data Ownership Scenarios
The book establishes a clear vocabulary: the owner of a table or data entity is the service responsible for writing to it, and therefore for controlling it. Read access is a separate concern: a service that reads data it doesn’t own is fine; a service that writes data it doesn’t own is a problem.
Scenario 1 — Single Ownership
Definition: Exactly one service writes to a table. Other services may read it, but they do not write to it.
Why it’s easy: No ownership conflict exists. The owning service is the single source of truth. Other services query it via API or read from a replicated copy.
Example (Sysops Squad): The Survey service is the only service that writes to the survey table. The Reporting service reads survey results but only through a read API — it never writes. Ownership is unambiguous.
┌─────────────────────┐
│ Survey Service │──writes──▶ [survey table]
└─────────────────────┘
│
│ read API
▼
┌─────────────────────┐
│ Reporting Service │ (reads only, no ownership)
└─────────────────────┘
Resolution: None required. Assign the table to the service that writes it.
Scenario 2 — Common Ownership
Definition: Many services need to read from a shared, relatively static reference table (e.g., zip codes, country codes, product categories, lookup lists). No service writes to it frequently; when updates happen, they are administrative or infrequent.
Why it’s manageable: The data is not domain-specific — it belongs to no single business domain. It acts as infrastructure or reference data. Multiple services reading the same reference data is not a conflict; it’s sharing.
Resolution options:
- Shared reference data service: Create a dedicated service (e.g., ReferenceDataService) that owns the table and exposes it via API. All other services query it. Simple, single source of truth, but adds network latency and a dependency.
- Replicate to each service: Each service gets its own copy of the reference data, synchronized periodically. Eliminates cross-service runtime calls, but introduces eventual consistency for the reference data itself.
- Shared schema: Allow multiple services to read from the same database schema for this table (a pragmatic exception to per-service databases, acceptable for truly static data).
Example (Sysops Squad): Zip code lookup data is needed by the Customer service, the Ticket Routing service, and the Billing service. None of them “owns” it in a business sense — it’s reference data. The team creates a small shared LocationReferenceService.
[zip_code table] ◀── owns ── LocationReferenceService
│
read API
┌────┴─────────────────────────────┐
▼ ▼ ▼
Customer Svc TicketRouting Svc Billing Svc
Scenario 3 — Joint Ownership
Definition: Two or more services both need to write to the same table (or overlapping columns of the same table). This is the hardest case.
Why it’s hard: You cannot simply assign ownership to one service without the other losing write access. You cannot share the table without coupling the services at the database level, which defeats the purpose of decomposition. Any solution involves a trade-off.
Example (Sysops Squad): The Ticket service and the Assignment service both write to the ticket table. The Ticket service writes ticket creation data (customer, description, priority). The Assignment service writes assignment data (assigned expert, scheduling). They cannot both “own” the table, and the table holds interleaved data from both domains.
The book offers four resolution techniques, described in the next section.
Joint Ownership Resolution Techniques
Technique 1 — Table Split
Concept: Split the shared table into two separate tables, each owned by one service. Each service writes only to its own table. If the data needs to be joined, it is joined at the API layer or via an event-driven approach.
Mechanism:
- Analyze which columns each service writes to.
- Split the table along those column boundaries.
- Each service gets a dedicated table with a shared primary key (e.g., ticket_id).
BEFORE (shared table):
┌──────────────────────────────────────────────────────┐
│ ticket table │
│ ticket_id | cust_id | desc | priority | expert_id │
│ ← Ticket Svc ──▶ | ← Assignment Svc ──▶ │
└──────────────────────────────────────────────────────┘
AFTER (table split):
┌──────────────────────────────┐ ┌──────────────────────────────┐
│ ticket table (Ticket Svc) │ │ assignment table (Assign Svc) │
│ ticket_id | cust_id | desc │ │ ticket_id | expert_id | sched │
│ | priority │ │ │
└──────────────────────────────┘ └──────────────────────────────┘
shared primary key (ticket_id)
Trade-offs:
| Aspect | Assessment |
|---|---|
| Data coupling | Low — clean separation |
| Service coupling | Low — services are independent |
| Data integrity | Harder — referential integrity across tables requires application-level enforcement |
| Query complexity | Higher — joins must happen at API/aggregation layer |
| When to use | When the columns map cleanly to service boundaries |
Fits well when: Each service writes to a distinct, non-overlapping set of columns. If services write to the same column (e.g., a shared status field), a table split alone is insufficient.
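The column analysis in the mechanism above can be sketched in a few lines. This is a minimal illustration, not the book's procedure: the function names and the `writes` mapping (service name to the set of columns it writes) are assumptions made for the example, and the column names are taken from the diagram.

```python
from itertools import combinations


def find_contested_columns(writes: dict[str, set[str]], key: str) -> dict[tuple[str, str], set[str]]:
    """Return non-key columns that more than one service writes to."""
    contested = {}
    for svc_a, svc_b in combinations(sorted(writes), 2):
        overlap = (writes[svc_a] & writes[svc_b]) - {key}
        if overlap:
            contested[(svc_a, svc_b)] = overlap
    return contested


def plan_table_split(writes: dict[str, set[str]], key: str) -> dict[str, list[str]]:
    """Propose one table per service, or raise if the column boundary is not clean."""
    contested = find_contested_columns(writes, key)
    if contested:
        raise ValueError(f"table split alone is insufficient: {contested}")
    # Every resulting table carries the shared primary key first.
    return {svc: [key] + sorted(cols - {key}) for svc, cols in writes.items()}


# Hypothetical write sets, mirroring the BEFORE diagram.
writes = {
    "ticket": {"ticket_id", "cust_id", "desc", "priority"},
    "assignment": {"ticket_id", "expert_id", "sched"},
}
plan = plan_table_split(writes, key="ticket_id")
```

If both services also wrote a shared `status` column, `plan_table_split` would raise instead of returning a plan, which matches the caveat about overlapping columns.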
Technique 2 — Data Domain
Concept: Rather than splitting the table, create a shared data domain — a dedicated schema or database explicitly shared by the services that need joint write access. This is a controlled exception to the “one database per service” rule.
Mechanism:
- Define a data domain (e.g., TicketDomain) that contains the contested table(s).
- Both Ticket service and Assignment service have read/write access to this domain.
- The domain is explicitly documented as shared; no other services access it.
┌─────────────────────────────────────────────────────┐
│ Ticket Data Domain (shared) │
│ ┌──────────────────────────────────────────────┐ │
│ │ ticket table │ │
│ └──────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
▲ ▲
writes/reads writes/reads
│ │
┌───────────────┐ ┌───────────────────┐
│ Ticket Svc │ │ Assignment Svc │
└───────────────┘ └───────────────────┘
Trade-offs:
| Aspect | Assessment |
|---|---|
| Data coupling | High — both services share the same schema |
| Service coupling | Medium — shared database creates deployment coupling |
| Data integrity | Good — full ACID within the shared domain |
| Query complexity | Low — standard SQL joins work |
| Operational overhead | Low — no cross-service API calls for writes |
| When to use | When data integrity is paramount and the sharing scope is strictly bounded |
Risk: The shared data domain can grow — other services start “borrowing” access, gradually recreating the monolith’s shared database. Governance is required to keep the domain boundary tight.
Fits well when: The two services are tightly related (perhaps candidates for consolidation), data integrity constraints are strict, and you want to avoid eventual consistency complexity.
Technique 3 — Delegate
Concept: Designate one service as the owner of the table. The other service that previously wrote to it must now delegate all writes through the owning service’s API. The non-owning service sends write requests; it never touches the database directly.
Mechanism:
- Assign ownership to the service whose core domain the table best represents.
- The non-owning service calls the owning service’s write API whenever it needs to modify the data.
- The owning service enforces business rules, validation, and consistency.
BEFORE: both services write to table directly
AFTER (Ticket Svc owns the table):
Assignment Svc ──── write request (API call) ────▶ Ticket Svc
│
validates
│
▼
[ticket table]
Trade-offs:
| Aspect | Assessment |
|---|---|
| Data coupling | Low — only owner touches the DB |
| Service coupling | High — non-owner is runtime-dependent on the owner |
| Data integrity | Good — owner enforces all rules |
| Availability | Reduced — if owner is down, delegating service cannot write |
| Performance | Reduced — extra network hop for every write |
| When to use | When one service is clearly the “true” owner semantically |
Fits well when: Ownership is logically clear but historically both services wrote to the table for convenience. Also useful as a migration step — the delegate pattern can be introduced quickly while a longer-term solution (table split or consolidation) is designed.
Failure mode: If the owning service becomes a bottleneck or single point of failure, all dependent services are affected. Circuit breakers and retry logic become essential.
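A minimal in-process sketch of the delegate technique follows. The class and method names are hypothetical, a dict stands in for the ticket table, and a direct method call stands in for what would be an HTTP or gRPC request in a real deployment.

```python
class TicketService:
    """Owner: the only component that touches the ticket table (a dict here)."""

    def __init__(self):
        self._tickets = {}  # stands in for the ticket table

    def create_ticket(self, ticket_id, description):
        self._tickets[ticket_id] = {"description": description, "expert_id": None}

    def assign_expert(self, ticket_id, expert_id):
        """Write API used by delegating services; the owner validates first."""
        if ticket_id not in self._tickets:
            raise KeyError(f"unknown ticket {ticket_id}")
        if not expert_id:
            raise ValueError("expert_id is required")
        self._tickets[ticket_id]["expert_id"] = expert_id

    def get_ticket(self, ticket_id):
        return dict(self._tickets[ticket_id])


class AssignmentService:
    """Non-owner: holds a client for the owner instead of a database handle."""

    def __init__(self, ticket_api):
        self._ticket_api = ticket_api

    def assign(self, ticket_id, expert_id):
        # In a real deployment this would be a network call to the owner.
        self._ticket_api.assign_expert(ticket_id, expert_id)


tickets = TicketService()
tickets.create_ticket(42, "printer offline")
assignments = AssignmentService(tickets)
assignments.assign(42, "expert-7")
```

Because every write passes through `assign_expert`, validation lives in one place. The cost is visible too: `AssignmentService` cannot write at all when the owner is unreachable.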
Technique 4 — Service Consolidation
Concept: If two services cannot cleanly separate their data ownership, merge them into a single service. The “joint ownership” problem disappears because there is only one service.
Mechanism:
- Evaluate whether the two services have high semantic cohesion despite being split.
- If yes, merge them into one service with one database.
- The merged service owns all the data and handles all the writes internally.
BEFORE:
┌───────────────┐ ┌───────────────────┐
│ Ticket Svc │ │ Assignment Svc │
│ (writes to │ │ (writes to │
│ ticket tbl) │ │ ticket tbl) │
└───────────────┘ └───────────────────┘
AFTER (consolidated):
┌────────────────────────────────────────┐
│ Ticket Management Service │
│ (handles creation + assignment │
│ internally; owns ticket table) │
└────────────────────────────────────────┘
│
▼
[ticket table]
Trade-offs:
| Aspect | Assessment |
|---|---|
| Data coupling | None — single owner |
| Service coupling | None — single service |
| Granularity | Increases service size (integrator force) |
| Deployability | Merged service is larger, harder to deploy independently |
| Scalability | Must scale the merged service as a whole |
| When to use | When the two services are more cohesive than they appear; when other techniques are too complex |
Fits well when: The services are so tightly coupled operationally and semantically that keeping them separate creates more problems than it solves. This technique is a recognition that the original decomposition was too granular.
Key insight: Service consolidation is not a failure — it is a valid granularity integrator. See ch07-service-granularity for the disintegrator/integrator framework.
Choosing Among the Four Techniques
Is there a clean column boundary?
YES ──▶ Table Split
NO ──▶
Is data integrity across writes critical?
YES ──▶ Data Domain (controlled shared DB)
NO ──▶
Is one service the semantic owner?
YES ──▶ Delegate
NO ──▶ Service Consolidation
The book also frames it in terms of coupling tolerance: if you can tolerate higher service coupling, Delegate works. If integrity matters most and you can accept a shared schema, Data Domain fits. If neither constraint dominates, choose Table Split or Consolidation depending on column alignment and cohesion.
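The decision tree above can be written as a small function. This is only a sketch of these notes' ordering of the questions; the function and parameter names are made up for illustration.

```python
def choose_joint_ownership_technique(clean_column_boundary: bool,
                                     integrity_critical: bool,
                                     has_semantic_owner: bool) -> str:
    """Walk the questions in the same order as the decision tree."""
    if clean_column_boundary:
        return "Table Split"
    if integrity_critical:
        return "Data Domain"
    if has_semantic_owner:
        return "Delegate"
    return "Service Consolidation"


# Example: columns overlap, integrity is negotiable, Ticket is the natural owner.
choice = choose_joint_ownership_technique(False, False, True)
```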
Distributed Transactions: The Problem
Why ACID Breaks Across Services
In a monolith with a single relational database, ACID guarantees are free:
- Atomicity: All operations in a transaction commit or all roll back.
- Consistency: The database enforces integrity constraints across all tables.
- Isolation: Concurrent transactions don’t see each other’s partial state.
- Durability: Committed data survives failures.
In a distributed architecture, each service has its own database. A business operation spanning two services (e.g., creating a ticket AND updating the customer’s open-ticket count) must touch two databases. There is no native mechanism to make this atomic.
Concrete failure scenario:
Step 1: Ticket Service creates ticket record in tickets DB ✓
Step 2: Customer Service increments open_tickets in customers DB ✗ (crash)
Result: Ticket exists, but customer count is wrong.
System is inconsistent. No rollback of Step 1.
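The failure scenario can be reproduced with two independent SQLite databases. This is a sketch under stated assumptions: the table layouts, the `customer_service_up` flag, and the sample IDs are invented for the example, and in-memory databases stand in for each service's own store.

```python
import sqlite3

tickets_db = sqlite3.connect(":memory:")    # owned by the Ticket service
customers_db = sqlite3.connect(":memory:")  # owned by the Customer service
tickets_db.execute("CREATE TABLE tickets (ticket_id INTEGER PRIMARY KEY, cust_id INTEGER)")
customers_db.execute("CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, open_tickets INTEGER)")
customers_db.execute("INSERT INTO customers VALUES (7, 0)")
customers_db.commit()


def create_ticket(ticket_id, cust_id, customer_service_up=True):
    # Step 1: a local transaction in the tickets database commits on its own.
    with tickets_db:
        tickets_db.execute("INSERT INTO tickets VALUES (?, ?)", (ticket_id, cust_id))
    # Step 2: a second, unrelated transaction in the customers database.
    if not customer_service_up:
        raise ConnectionError("Customer service crashed before updating the count")
    with customers_db:
        customers_db.execute(
            "UPDATE customers SET open_tickets = open_tickets + 1 WHERE cust_id = ?",
            (cust_id,))


try:
    create_ticket(42, 7, customer_service_up=False)
except ConnectionError:
    pass  # nothing rolls back step 1

ticket_rows = tickets_db.execute("SELECT COUNT(*) FROM tickets").fetchone()[0]
open_count = customers_db.execute(
    "SELECT open_tickets FROM customers WHERE cust_id = 7").fetchone()[0]
```

After the simulated crash the ticket row exists while the customer's count is unchanged, because each `with` block only scopes a transaction on its own connection.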
Why Two-Phase Commit (2PC) Doesn’t Scale
2PC is the classical distributed transaction protocol. It works in two phases:
Phase 1 — Prepare:
- A coordinator sends PREPARE to all participant databases.
- Each participant writes the pending changes to a durable log and responds READY or ABORT.
Phase 2 — Commit:
- If all participants sent READY, coordinator sends COMMIT to all.
- If any sent ABORT, coordinator sends ROLLBACK to all.
Coordinator
│
├──PREPARE──▶ DB1 ──READY──▶ │
├──PREPARE──▶ DB2 ──READY──▶ │
│ │
│◀─── all READY ──────────────┘
│
├──COMMIT──▶ DB1
└──COMMIT──▶ DB2
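The two phases can be sketched with toy in-memory participants. This is not an XA implementation: the `Participant` class, its `can_prepare` flag, and the string votes are assumptions chosen to mirror the diagram, and the sketch ignores coordinator crashes and timeouts.

```python
class Participant:
    """Toy resource manager: stages a change on PREPARE, applies it on COMMIT."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.pending = None    # staged change (stands in for a durable log)
        self.committed = []

    def prepare(self, change):
        if not self.can_prepare:
            return "ABORT"
        self.pending = change
        return "READY"

    def commit(self):
        self.committed.append(self.pending)
        self.pending = None

    def rollback(self):
        self.pending = None


def two_phase_commit(participants, change):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare(change) for p in participants]
    # Phase 2: commit only on a unanimous READY, otherwise roll everyone back.
    if all(vote == "READY" for vote in votes):
        for p in participants:
            p.commit()
        return "COMMITTED"
    for p in participants:
        p.rollback()
    return "ROLLED_BACK"


db1, db2 = Participant("DB1"), Participant("DB2")
ok = two_phase_commit([db1, db2], "create ticket 42")

db3 = Participant("DB3", can_prepare=False)
failed = two_phase_commit([db1, db3], "create ticket 43")
```

Between `prepare` and `commit` each participant holds a staged change it cannot resolve on its own. That window is where a coordinator crash leaves participants stuck.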
Problems with 2PC in microservices:
- Blocking protocol: If the coordinator crashes after sending PREPARE but before sending COMMIT, all participants are blocked indefinitely: they hold locks, cannot commit or roll back, and must wait for the coordinator to recover. This is the blocking problem.
- Synchronous coupling: 2PC requires all participants to be available simultaneously. In a microservices environment with many services, the probability that all are healthy at the same instant decreases with the number of participants.
- Performance: All participants hold locks through both phases. Under high concurrency, this creates severe contention.
- Doesn’t cross HTTP boundaries cleanly: 2PC was designed for database-level coordination (XA protocol). Applying it across HTTP-communicating microservices requires XA-aware resource managers in each service, which is extremely rare in practice.
- Tight coupling: The coordinator and all participants are tightly coupled; a failure in any participant can stall the entire transaction.
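The synchronous-coupling point can be made concrete with a little arithmetic. The sketch below assumes every participant has the same availability and that failures are independent; both are simplifications, and the function name is made up.

```python
def all_available_probability(per_service_availability: float, participants: int) -> float:
    """Probability that every participant is up at once, assuming independence."""
    return per_service_availability ** participants


two_services = all_available_probability(0.999, 2)
ten_services = all_available_probability(0.999, 10)
```

With 99.9% availability per service, ten participants are all up together only about 99.0% of the time, so the chance that a 2PC transaction cannot even start grows roughly tenfold compared with a single service.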
Conclusion: 2PC is not a viable general solution for distributed microservices transactions. The book’s answer is eventual consistency managed through one of three patterns.
Eventual Consistency Patterns
The key conceptual shift: instead of trying to make distributed operations atomic (all succeed or all fail instantly), accept that consistency will be achieved eventually — and design the system to detect, handle, and recover from transient inconsistencies.
The three patterns differ in who drives the synchronization, how failures are handled, and what the coupling trade-offs are.
Pattern 1 — Background Synchronization
Core idea: Each service performs its local operation independently. A separate background process (batch job, scheduler, or reconciliation service) periodically checks for inconsistencies and corrects them.
How it works:
Time 0: Ticket Service creates ticket [tickets DB: ticket_id=42, open]
Time 0: Customer Service receives no update [customers DB: open_count still old]
...
Time T: Background Sync Process runs
- Queries tickets DB: find tickets created since last run
- For each ticket, checks customers DB
- Detects open_count discrepancy
- Issues corrective write to customers DB
Time T: customers DB is now consistent [customers DB: open_count correct]
Concrete example (Sysops Squad): When a ticket is created in the Ticket service, the Customer service’s open ticket count should increment. With background synchronization, a nightly (or hourly) reconciliation job scans for tickets created without a corresponding customer count update and applies corrections.
Sequence diagram:
Client ──▶ Ticket Svc ──▶ [tickets DB] (write succeeds)
... time passes ...
BackgroundSync ──▶ [tickets DB] (reads new tickets)
BackgroundSync ──▶ [customers DB] (checks counts)
BackgroundSync ──▶ [customers DB] (corrects discrepancy)
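A reconciliation job along these lines might look like the sketch below. It is a simplified variant: it recomputes counts for all customers rather than only for tickets created since the last run, and the schemas, the `status` column, and the function name are assumptions for the example.

```python
import sqlite3


def reconcile_open_counts(tickets_db, customers_db):
    """Recompute each customer's open-ticket count from the tickets database.

    Setting the count to the recomputed value (instead of incrementing it)
    keeps the correction idempotent: running the job twice changes nothing.
    """
    actual = dict(tickets_db.execute(
        "SELECT cust_id, COUNT(*) FROM tickets WHERE status = 'open' GROUP BY cust_id"))
    corrected = 0
    with customers_db:
        rows = customers_db.execute("SELECT cust_id, open_count FROM customers").fetchall()
        for cust_id, recorded in rows:
            expected = actual.get(cust_id, 0)
            if recorded != expected:
                customers_db.execute(
                    "UPDATE customers SET open_count = ? WHERE cust_id = ?",
                    (expected, cust_id))
                corrected += 1
    return corrected


tickets_db = sqlite3.connect(":memory:")
customers_db = sqlite3.connect(":memory:")
tickets_db.execute(
    "CREATE TABLE tickets (ticket_id INTEGER PRIMARY KEY, cust_id INTEGER, status TEXT)")
tickets_db.execute("INSERT INTO tickets VALUES (42, 7, 'open')")
customers_db.execute(
    "CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, open_count INTEGER)")
customers_db.execute("INSERT INTO customers VALUES (7, 0)")  # stale count

first_run = reconcile_open_counts(tickets_db, customers_db)
second_run = reconcile_open_counts(tickets_db, customers_db)
```

The first run corrects one customer and the second run finds nothing to do, which is the property that makes overlapping or repeated runs safe.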
Failure modes:
- Stale reads: Between the service write and the next sync run, the data is inconsistent. Queries during this window return stale data.
- Sync job failure: If the background job itself fails, inconsistency persists until the job recovers. Requires monitoring and alerting.
- Race conditions: If a customer is updated both by a service and the sync job simultaneously, write conflicts can occur. Requires idempotent corrections.
- Detection lag: The longer the sync interval, the longer inconsistency persists.
Trade-offs:
| Aspect | Assessment |
|---|---|
| Architectural complexity | Low — simple pattern, no changes to services |
| Service coupling | Very low — services know nothing of each other |
| Data consistency | Eventual, with configurable lag (minutes to hours) |
| Fault tolerance | Poor — sync job is a SPOF; inconsistency during failure |
| Scalability | Good — services are independent |
| Responsiveness | High — no synchronous coordination |
| Best for | Low-volume, low-criticality data with high tolerance for staleness |
When to use: Reporting databases, analytics aggregations, non-critical summaries (e.g., nightly reconciliation of financial summaries). Not suitable when users need to see consistent data immediately after an operation.
Pattern 2 — Orchestrated Request-Based Pattern
Core idea: A central orchestrator service calls participating services in sequence over HTTP/gRPC. If a step fails, the orchestrator issues compensating transactions to undo the steps that already succeeded.
How it works:
Orchestrator
│
├──1. Create ticket──▶ Ticket Service ──▶ [tickets DB] ✓
│◀────────────────────── ticket_id=42 ───────────────────
│
├──2. Increment count──▶ Customer Service ──▶ [customers DB] ✗ (fails)
│◀─────────────────────── FAILURE ─────────────────────────
│
├──COMPENSATE: Delete ticket──▶ Ticket Service ──▶ [tickets DB] ✓
│◀────────────────────────────── OK ────────────────────────────
│
└──Return error to client
Concrete example (Sysops Squad): A ticket creation workflow orchestrator:
- Calls Ticket Service, which creates the ticket record.
- Calls Customer Service, which increments the open ticket count.
- Calls Assignment Service, which routes the ticket to the appropriate expert.
If step 2 fails, the orchestrator calls Ticket Service with a delete/cancel compensating transaction to undo step 1. If step 3 fails, it compensates steps 1 and 2.
Compensating transaction design is critical: compensations must be idempotent (can be called multiple times safely) and must handle the case where the original operation partially succeeded.
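The orchestrator's compensate-on-failure loop can be sketched as follows. Plain function calls stand in for HTTP/gRPC requests, and `run_workflow`, the step functions, and the sample data are hypothetical names. The sketch does not persist workflow state or retry failed compensations.

```python
def run_workflow(steps):
    """Run (action, compensation) pairs in order; undo completed steps on failure."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception as error:
            # Undo in reverse order so the most recent step is compensated first.
            for undo in reversed(completed):
                undo()
            return f"failed: {error}"
        completed.append(compensation)
    return "ok"


tickets = {}
open_counts = {"cust-7": 0}


def create_ticket():
    tickets[42] = "open"


def cancel_ticket():
    tickets.pop(42, None)  # idempotent: safe if the ticket is already gone


def increment_count():
    raise RuntimeError("Customer Service unavailable")


def decrement_count():
    open_counts["cust-7"] = max(0, open_counts["cust-7"] - 1)


result = run_workflow([
    (create_ticket, cancel_ticket),
    (increment_count, decrement_count),
])
```

Only the compensations of steps that actually completed are invoked, so the failed increment is never "undone" and the ticket created in step 1 is removed.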
Failure modes:
- Orchestrator failure: If the orchestrator crashes mid-workflow, in-progress transactions are stranded. Requires persistent workflow state (durable orchestration log) to resume. Addressed more fully in ch11-managing-distributed-workflows and ch12-transactional-sagas.
- Compensation failure: If a compensating transaction also fails, the system is in an inconsistent state that requires manual intervention or a retry mechanism.
- Partial compensation: Complex workflows with many steps require many compensating transactions, each of which must be carefully designed.
- Retry storms: Under failure, aggressive retry logic from the orchestrator can overwhelm downstream services.
Trade-offs:
| Aspect | Assessment |
|---|---|
| Architectural complexity | Medium — orchestrator must manage state and compensation logic |
| Service coupling | Medium — orchestrator is coupled to all participating services |
| Data consistency | Eventual, with a short window: inconsistency lasts for the duration of the workflow |
| Fault tolerance | Medium — orchestrator is a SPOF; compensations can fail |
| Scalability | Medium — orchestrator can become a bottleneck |
| Responsiveness | Medium — synchronous calls add latency |
| Best for | Multi-step workflows requiring a defined failure recovery path |
When to use: Business workflows where the sequence of operations is known in advance, failure recovery logic is definable, and some latency is acceptable. More fully developed as the Saga pattern in ch12-transactional-sagas.
Pattern 3 — Event-Based Pattern
Core idea: Services publish domain events to a message broker. Other services subscribe and react to those events asynchronously. No synchronous coordination. Each service maintains its own consistency by consuming events.
How it works:
Ticket Service ──▶ publishes "TicketCreated" event ──▶ [Message Broker]
│
┌────────────────────┤
▼ ▼
Customer Service Assignment Service
(subscribes to (subscribes to
TicketCreated) TicketCreated)
│ │
▼ ▼
increments count routes to expert
[customers DB] [assignments DB]
Concrete example (Sysops Squad): When a ticket is created:
- Ticket Service writes the ticket locally and publishes a TicketCreated event.
- Customer Service consumes the event and increments the customer’s open ticket count.
- Assignment Service consumes the event and triggers routing logic.
Both downstream services process the event independently. If Customer Service is temporarily down, the event remains in the broker queue until the service recovers.
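The publish/subscribe flow can be sketched with an in-memory stand-in for the broker. The `Broker` class and handler names are assumptions. Unlike Kafka or RabbitMQ, this toy delivers synchronously and keeps nothing durable, so it shows only the fan-out, not the buffering.

```python
from collections import defaultdict


class Broker:
    """In-memory stand-in for a message broker: topic -> subscriber callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # A real broker would persist the event and deliver it asynchronously.
        for handler in self._subscribers[topic]:
            handler(event)


broker = Broker()
open_counts = defaultdict(int)  # Customer Service state
routing_queue = []              # Assignment Service state


def customer_on_ticket_created(event):
    open_counts[event["cust_id"]] += 1


def assignment_on_ticket_created(event):
    routing_queue.append(event["ticket_id"])


broker.subscribe("TicketCreated", customer_on_ticket_created)
broker.subscribe("TicketCreated", assignment_on_ticket_created)

# Ticket Service: after its local write, publish the event and move on.
broker.publish("TicketCreated", {"ticket_id": 42, "cust_id": 7})
```

The publisher knows only the topic name and event shape. Adding a third consumer means one more `subscribe` call and no change to Ticket Service.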
Failure modes:
- Event loss: If the broker loses an event (no persistence), downstream services never get notified. Requires a durable message broker (Kafka, RabbitMQ with persistence).
- Out-of-order events: Events may arrive out of sequence. Services must handle idempotency and ordering.
- Consumer failure after processing: The consumer processes the event but crashes before acknowledging. The broker redelivers — the consumer must handle duplicate events idempotently.
- No rollback semantics: If Customer Service processes a TicketCreated event and later the ticket is discovered to be invalid, a compensating TicketCancelled event must be published. Compensations are event-driven, not procedural.
- Debugging complexity: Tracing the cause of an inconsistency across asynchronous event streams is significantly harder than debugging synchronous flows.
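One common way to handle redelivery is for the consumer to remember which events it has already applied. The sketch below assumes each event carries a unique `event_id` field, which these notes do not specify. The in-memory set would need to be durable, and updated together with the count, in a real service.

```python
class CustomerCountConsumer:
    """Applies TicketCreated events at most once per event_id."""

    def __init__(self):
        self.open_counts = {}
        self._seen_event_ids = set()  # in production this would be durable storage

    def handle(self, event):
        """Return True if the event was applied, False if it was a duplicate."""
        if event["event_id"] in self._seen_event_ids:
            return False
        cust_id = event["cust_id"]
        self.open_counts[cust_id] = self.open_counts.get(cust_id, 0) + 1
        self._seen_event_ids.add(event["event_id"])
        return True


consumer = CustomerCountConsumer()
event = {"event_id": "evt-1", "type": "TicketCreated", "ticket_id": 42, "cust_id": 7}

first_delivery = consumer.handle(event)
redelivery = consumer.handle(event)  # broker redelivers after a missed acknowledgement
```

The redelivered event is recognised and skipped, so the count stays at one instead of being incremented twice.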
Trade-offs:
| Aspect | Assessment |
|---|---|
| Architectural complexity | High — event schema design, broker management, idempotency, ordering |
| Service coupling | Very low — services only share event contracts, not direct dependencies |
| Data consistency | Eventual — lag depends on broker throughput and consumer availability |
| Fault tolerance | High — broker buffers events; consumers recover independently |
| Scalability | High — services scale independently; broker scales horizontally |
| Responsiveness | High for publisher — fire and forget; no waiting for downstream |
| Best for | High-throughput, loosely coupled systems where eventual consistency is acceptable |
When to use: High-volume operations, fan-out notifications (one event triggers many consumers), systems where decoupling is paramount, and where the engineering team is equipped to handle asynchronous debugging complexity. See ch11-managing-distributed-workflows for choreography patterns built on events.
Pattern Comparison
| Dimension | Background Sync | Orchestrated Request-Based | Event-Based |
|---|---|---|---|
| Who drives sync | Batch/reconciliation job | Central orchestrator | Message broker + consumers |
| Communication style | Polling (pull) | Synchronous request/response | Asynchronous publish/subscribe |
| Consistency lag | High (minutes–hours) | Low (seconds) | Medium (milliseconds–seconds) |
| Service coupling | Very low | Medium | Very low |
| Architectural complexity | Low | Medium | High |
| Fault tolerance | Low (sync job SPOF) | Medium (orchestrator SPOF; compensations) | High (broker buffers; consumers recover) |
| Failure recovery | Next sync run corrects | Compensating transactions | Compensating events |
| Scalability | Good | Medium (orchestrator bottleneck) | High |
| Debugging | Easy | Medium | Hard (distributed traces needed) |
| Idempotency required | Yes (sync corrections) | Yes (compensations) | Yes (duplicate event delivery) |
| Best use case | Reporting, analytics, low-criticality reconciliation | Multi-step business workflows | High-throughput, fan-out, loosely coupled domains |
| Example scenario | Nightly billing reconciliation | Order creation with payment and inventory steps | Ticket creation notifying multiple downstream services |
No universal winner: The right choice depends on consistency requirements, volume, team capability, and tolerance for complexity. Many real systems combine patterns — e.g., event-based for high-volume paths, orchestrated for critical financial transactions.
Decision Framework
Step 1 — Identify the ownership scenario
Does only one service write to this data?
YES ──▶ Single Ownership. No further action needed.
NO ──▶ Is it shared, mostly static reference data that many services read and that belongs to no single domain?
YES ──▶ Common Ownership. Use shared reference service or replication.
NO ──▶ Multiple services WRITE ──▶ Joint Ownership. Go to Step 2.
Step 2 — Resolve joint ownership
Can you split columns cleanly by service boundary?
YES ──▶ Table Split technique.
NO ──▶ Is data integrity across writes non-negotiable?
YES ──▶ Data Domain technique (shared schema, controlled scope).
NO ──▶ Is one service the semantic owner?
YES ──▶ Delegate technique.
NO ──▶ Are these services more cohesive than they appear?
YES ──▶ Service Consolidation.
NO ──▶ Re-examine domain boundaries (decomposition may be wrong).
Step 3 — Choose an eventual consistency pattern
How long can data be inconsistent?
Hours acceptable ──▶ Background Synchronization.
Minutes/seconds acceptable ──▶ Go to next question.
Is the workflow a defined sequence of steps with known failure recovery?
YES ──▶ Orchestrated Request-Based (Saga).
NO ──▶ Go to next question.
Is high throughput, low coupling, or fan-out notification required?
YES ──▶ Event-Based pattern.
Uncertain ──▶ Evaluate team capability for async complexity first.
Additional factors
| Factor | Favors |
|---|---|
| Team inexperienced with async | Orchestrated or Background Sync |
| High write throughput | Event-based |
| Strict audit trail needed | Orchestrated (explicit state) |
| Many downstream consumers per event | Event-based |
| Hard deadline for consistency | Orchestrated |
| Low operational overhead | Background Sync |
| Broker infrastructure already in place | Event-based |
Sysops Squad Saga
The Sysops Squad case study in Chapter 9 follows the ticket processing workflow as the team works out who owns what.
The Problem
The monolith’s ticket table is written by several components now separated into distinct services:
- Ticket service: creates and closes tickets
- Assignment service: assigns experts and schedules appointments
- Survey service: creates follow-up surveys after ticket resolution
After decomposition, all three services find themselves wanting to write to overlapping data.
Resolution Applied
Survey service — the team identifies that only Survey service writes to the survey table. This is single ownership. No change needed.
Zip code lookup is needed by the Customer, Ticket Routing, and Billing services. Nobody “owns” it. This is common ownership. Resolved by creating a small LocationReferenceService that all three query.
Ticket + Assignment services and the ticket table — this is the hard case: joint ownership. The team analyzes the columns:
- Ticket service writes: ticket_id, cust_id, description, priority, created_at, status
- Assignment service writes: assigned_expert, scheduled_date, completion_notes
Because the column boundaries are relatively clean, the team applies the Table Split technique:
- ticket table stays with Ticket service (creation and lifecycle).
- New ticket_assignment table is created, owned by Assignment service, linked by ticket_id.
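The split can be sketched with two separate SQLite databases, one per service, plus an API-layer join on ticket_id. Column names follow the lists above. The column types, sample rows, and the `get_ticket_view` helper are assumptions for illustration.

```python
import sqlite3

ticket_db = sqlite3.connect(":memory:")      # owned by the Ticket service
assignment_db = sqlite3.connect(":memory:")  # owned by the Assignment service

ticket_db.execute("""CREATE TABLE ticket (
    ticket_id INTEGER PRIMARY KEY, cust_id INTEGER, description TEXT,
    priority TEXT, created_at TEXT, status TEXT)""")
assignment_db.execute("""CREATE TABLE ticket_assignment (
    ticket_id INTEGER PRIMARY KEY,  -- same key as ticket; no cross-database FK
    assigned_expert TEXT, scheduled_date TEXT, completion_notes TEXT)""")

# Sample rows, invented for the example.
ticket_db.execute(
    "INSERT INTO ticket VALUES (42, 7, 'printer offline', 'high', '2024-01-01', 'open')")
assignment_db.execute(
    "INSERT INTO ticket_assignment VALUES (42, 'expert-7', '2024-01-02', NULL)")


def get_ticket_view(ticket_id):
    """API-layer join: combine rows from the two databases by ticket_id."""
    ticket = ticket_db.execute(
        "SELECT cust_id, description, status FROM ticket WHERE ticket_id = ?",
        (ticket_id,)).fetchone()
    if ticket is None:
        return None
    assignment = assignment_db.execute(
        "SELECT assigned_expert FROM ticket_assignment WHERE ticket_id = ?",
        (ticket_id,)).fetchone()
    return {
        "ticket_id": ticket_id,
        "cust_id": ticket[0],
        "description": ticket[1],
        "status": ticket[2],
        "assigned_expert": assignment[0] if assignment else None,
    }


view = get_ticket_view(42)
```

Since the database can no longer enforce the link between the two tables, the join code has to tolerate a ticket with no assignment row, as the `None` fallback does here.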
The Distributed Transaction
After the split, creating a ticket and assigning it becomes a distributed transaction. The team evaluates:
- Background sync is too slow — customers need to see their ticket assignment immediately.
- Event-based is attractive but the team doesn’t yet have a mature broker infrastructure.
- Orchestrated request-based is chosen for the initial implementation: a lightweight orchestrator calls Ticket service then Assignment service, with compensating transactions (delete the ticket if assignment fails).
This is a deliberate pragmatic choice — the team notes they may migrate to event-based as their event infrastructure matures.
Key Takeaways
- Data ownership must be explicit: Every table in a distributed system must have exactly one service responsible for writing to it. Ambiguity causes corruption and coupling. The first step in distributed data design is to assign ownership clearly.
- The three scenarios form a difficulty ladder: Single ownership is trivial; common ownership is manageable; joint ownership is the real challenge and requires deliberate resolution.
- Joint ownership has four resolution techniques: Table Split (column partition), Data Domain (controlled shared schema), Delegate (one service owns, others call its API), and Service Consolidation (merge the services). Each is a trade-off between coupling, integrity, and complexity.
- 2PC does not scale in microservices: The blocking protocol, synchronous coupling, and lock contention make 2PC impractical for HTTP-based distributed services. Eventual consistency is the pragmatic alternative.
- Eventual consistency is a spectrum: Background synchronization has the highest lag but the least complexity. Orchestrated patterns offer tighter consistency windows at the cost of a SPOF orchestrator. Event-based patterns offer the best decoupling and scalability but the highest implementation complexity.
- Compensating transactions are not rollbacks: They are forward-moving operations that logically undo a previous step. They must be idempotent and explicitly designed; they are not automatic.
- The event-based pattern requires idempotent consumers: Because message brokers may redeliver messages (at-least-once delivery), every consumer must safely handle duplicate events without double-counting or corrupting data.
- Chapter 9 patterns are foundational for Chapters 11 and 12: The orchestrated request-based pattern evolves into full Saga patterns in ch12-transactional-sagas. The event-based pattern grounds the choreography workflow style in ch11-managing-distributed-workflows.
- The Sysops Squad saga shows pragmatic evolution: The team doesn’t implement the theoretically ideal solution; they choose what fits their current infrastructure and team capability, with a documented migration path toward event-based when ready.
- No one-size-fits-all: The same distributed system may use all three eventual consistency patterns in different subsystems, selected based on criticality, volume, and consistency requirements of each flow.
Related Resources
- ch07-service-granularity — Granularity disintegrators and integrators; service consolidation as an integrator force
- ch08-reuse-patterns — Sidecar, shared library, and service patterns; context for code vs. data reuse
- ch10-distributed-data-access — How services access data they don’t own (read patterns, API layer, caching)
- ch11-managing-distributed-workflows — Choreography and orchestration; event-based and orchestrator workflow patterns
- ch12-transactional-sagas — Full saga pattern catalog: epic saga, phone tag saga, fairy tale saga, etc.
- ch06-pulling-apart-operational-data — How to decompose a monolith’s database; sets up the ownership problem
- ddia-ch07-transactions — ACID deep dive; isolation levels; why distributed transactions are hard (DDIA context)
- ddia-ch09-consistency-and-consensus — Consensus algorithms, linearizability, eventual consistency theory
Last Updated: 2026-05-30