Chapter 7: Service Granularity
Tags: saht, service-granularity, distributed-architecture, microservices
Status: Notes complete
Overview
Chapter 7 addresses one of the most consequential — and difficult — decisions in distributed architecture: how big should a service be? There is no universal answer; granularity is determined by competing forces. The authors introduce a structured framework of Granularity Disintegrators (reasons to split a service into smaller pieces) and Granularity Integrators (reasons to merge services into a larger piece). Finding the right granularity requires consciously weighing these opposing forces against each other for each specific service.
The chapter’s central insight: getting granularity wrong in either direction is costly. Too fine-grained produces chattiness, distributed transaction nightmares, and operational overhead. Too coarse-grained re-creates the monolith’s change-coupling and scaling problems.
Core Concepts
| Term | Definition |
|---|---|
| Service granularity | The scope and size of a service — how much functionality it encapsulates |
| Granularity disintegrators | Architectural forces that push toward splitting a service into smaller pieces |
| Granularity integrators | Architectural forces that push toward combining services into a larger piece |
| Chatty services | Services that require many fine-grained inter-service calls to complete a business operation; a symptom of over-decomposition |
| Single-purpose principle | A service should do one thing well — have a single, well-defined responsibility |
| Fault tolerance | The ability of a system to continue operating despite partial failure |
| Code volatility | The rate at which code changes; high-volatility code often should be separated from low-volatility code |
Granularity Disintegrators
These are the forces that argue for splitting a service into smaller, more focused services.
1. Service Scope and Function
The most intuitive driver: a service that does too many unrelated things should be split. This aligns with the Single Responsibility Principle at the service level.
Indicators that scope is too broad:
- The service name is vague (e.g., “CustomerService” handles registration, preferences, payment methods, and notification settings)
- Different teams own different parts of the same service
- PRs constantly conflict because different features touch the same codebase
- It is impossible to describe the service’s purpose in one sentence without using “and”
Example: A notification service that handles email, SMS, push notifications, and in-app alerts. These may share some code but are distinct enough (different protocols, different vendors, different failure modes) to warrant separate services.
Trade-off: Splitting by function improves cohesion and team autonomy but increases the number of services to deploy, monitor, and maintain.
2. Code Volatility
Parts of a service that change at different rates should be separated. Mixing high-volatility code with stable code means every change to the volatile part forces a full redeployment of the stable part — increasing risk and deployment frequency unnecessarily.
Indicators of volatility mismatch:
- One area of the service has 10x more commits than another area
- Business rules change frequently while data access code rarely changes
- A new feature in one function causes unintended test failures in another
Example: In an order service, the discount calculation logic changes with every marketing campaign (high volatility), while the order persistence logic changes once a quarter (low volatility). Separating them means discount rule changes no longer risk breaking order persistence.
Trade-off: Decoupling by volatility reduces deployment risk but may create a network call where there was previously a method call.
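One way to check the "10x more commits" indicator is to count commits per area of the codebase. The sketch below is only an illustration: it assumes the commit history has already been exported as one list of changed file paths per commit, and that the first path segment identifies an area. The directory names `discounts` and `persistence` are hypothetical.

```python
from collections import Counter

def commits_per_area(commit_paths):
    """Count how many commits touched each top-level area.

    commit_paths: one list of changed file paths per commit.
    The "area" is the first path segment (an assumption of this sketch).
    """
    counts = Counter()
    for paths in commit_paths:
        # A commit counts once per area, however many files it touched there.
        areas = {path.split("/", 1)[0] for path in paths}
        counts.update(areas)
    return counts

def volatility_ratio(counts, hot, cold):
    """Ratio of commits in the hot area to commits in the cold area."""
    if counts[cold] == 0:
        return float("inf")
    return counts[hot] / counts[cold]

# Hypothetical history: 20 discount-only commits, 2 persistence-only, 1 touching both.
history = (
    [["discounts/rules.py"]] * 20
    + [["persistence/orders.py"]] * 2
    + [["discounts/rules.py", "persistence/orders.py"]]
)
counts = commits_per_area(history)
ratio = volatility_ratio(counts, "discounts", "persistence")
print(counts["discounts"], counts["persistence"], ratio)
```

A high ratio is only a prompt to look closer; it does not by itself justify a split.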
3. Scalability and Throughput
Services with parts that have dramatically different scalability requirements should be split so the hot parts can scale independently without scaling everything.
Indicators:
- One function within a service gets 100x more traffic than the rest
- You are scaling the entire service to support one bottleneck function
- Cost analysis shows you are over-provisioning most of the service to handle a peak in one area
Example: A product catalog service where search is called thousands of times per second but product creation is called a few hundred times per day. Running both in one service means you must scale the entire service for search traffic, paying for compute capacity that the product-creation code never uses.
Trade-off: Independent scalability reduces cost and improves performance, but adds operational complexity (separate deployment pipelines, separate scaling policies).
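The over-provisioning argument can be made concrete with rough arithmetic. Every number in this sketch is an assumption chosen for illustration (request rates, per-instance capacity, and relative instance costs); only the shape of the comparison comes from the catalog example above.

```python
import math

def instances_needed(requests_per_second, capacity_rps):
    """Instances required to serve the load, with at least one always running."""
    return max(1, math.ceil(requests_per_second / capacity_rps))

# All figures below are illustrative assumptions, not measurements.
SEARCH_RPS = 2900.0        # "thousands of times per second"
CREATE_RPS = 300 / 86_400  # "a few hundred times per day"
CAPACITY_RPS = 200.0       # requests one instance can serve per second
COST_COMBINED = 4          # cost units per instance hosting both functions
COST_SEARCH_ONLY = 3       # a search-only instance is assumed to be smaller
COST_CREATE_ONLY = 1

combined_cost = instances_needed(SEARCH_RPS + CREATE_RPS, CAPACITY_RPS) * COST_COMBINED
split_cost = (
    instances_needed(SEARCH_RPS, CAPACITY_RPS) * COST_SEARCH_ONLY
    + instances_needed(CREATE_RPS, CAPACITY_RPS) * COST_CREATE_ONLY
)
print(combined_cost, split_cost)
```

Under these assumptions the split deployment is cheaper, but the result depends entirely on how much smaller a search-only instance really is, so the inputs should come from actual measurements.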
4. Fault Tolerance
Functions with different failure risk profiles should be isolated so that a failure in one does not bring down another. This is especially important for functions that call unreliable external systems.
Indicators:
- One function calls a flaky third-party API
- A memory leak in one function OOMs the entire service process
- A bug in one area causes the entire service to crash, taking down unrelated functionality
Example: In a payment service, the payment processing function (which calls an external payment gateway) should be isolated from the refund calculation function (which is purely internal), so that refund calculation keeps working if the payment gateway goes down.
Trade-off: Fault isolation prevents blast radius expansion but requires implementing retry logic, circuit breakers, and health checks at service boundaries.
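The circuit breaker mentioned in the trade-off can be sketched in a few lines. This is a minimal, single-threaded illustration, not a production implementation; the class name, thresholds, and the `flaky_gateway` function are all hypothetical, and the clock is injectable only so the behavior can be demonstrated without waiting.

```python
import time

class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without attempting it."""

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_seconds = reset_after_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_seconds:
                raise CircuitOpenError("calls suspended while the circuit is open")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result

def flaky_gateway():
    # Stand-in for an external payment gateway that is currently down.
    raise ConnectionError("payment gateway unavailable")

now = [0.0]  # fake clock so the example is deterministic
breaker = CircuitBreaker(failure_threshold=2, reset_after_seconds=10.0, clock=lambda: now[0])
outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky_gateway)
    except CircuitOpenError:
        outcomes.append("rejected")
    except ConnectionError:
        outcomes.append("failed")
print(outcomes)
```

After two real failures the third call is rejected immediately, which keeps the caller from piling more requests onto a gateway that is already failing.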
5. Security
Functions with elevated security requirements (PII, payment data, authentication secrets) should be isolated to minimize the attack surface and allow stricter access controls.
Indicators:
- One function handles PII or PCI data while the rest of the service does not
- Compliance requirements (PCI-DSS, HIPAA, GDPR) apply to only part of a service
- Different authentication/authorization levels are required for different functions
Example: In a customer service, the function that stores and retrieves credit card data should be in a separate service with network isolation, stricter logging, dedicated secrets management, and limited internal access — rather than co-located with general profile management.
Trade-off: Security isolation reduces attack surface and simplifies compliance auditing, but increases the number of service-to-service calls that must be secured.
6. Extensibility
Functions that are likely to change implementation or grow in scope independently should be separated so they can evolve without impacting stable parts of the system.
Indicators:
- A function is a known integration point where new vendors/providers will be plugged in
- Business requirements suggest this area will grow substantially
- The architecture team anticipates replacing the implementation in the next 12-18 months
Example: A shipping service that currently uses a single carrier. If the business plans to add multiple carrier integrations (FedEx, UPS, DHL), separating the carrier-specific logic from the shipping coordination logic allows each carrier adapter to evolve or be replaced without touching the coordination logic.
Trade-off: Extensibility-driven splits make the system more adaptable but add complexity now for benefits that may not materialize.
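The separation of carrier-specific logic from coordination logic can be sketched with a small interface. The names `CarrierAdapter`, `FlatRateCarrier`, and `ShippingCoordinator`, and the pricing model, are hypothetical; a real carrier integration would call that carrier's API instead of computing a flat rate.

```python
from typing import Protocol

class CarrierAdapter(Protocol):
    """What the coordination logic needs from any carrier (hypothetical interface)."""
    name: str

    def quote_cents(self, weight_kg: float) -> int: ...

class FlatRateCarrier:
    """Toy adapter: base fee plus a per-kilogram fee."""

    def __init__(self, name: str, base_cents: int, per_kg_cents: int):
        self.name = name
        self.base_cents = base_cents
        self.per_kg_cents = per_kg_cents

    def quote_cents(self, weight_kg: float) -> int:
        return self.base_cents + round(self.per_kg_cents * weight_kg)

class ShippingCoordinator:
    """Coordination logic that depends only on the adapter interface."""

    def __init__(self, carriers: list[CarrierAdapter]):
        self.carriers = list(carriers)

    def cheapest(self, weight_kg: float) -> CarrierAdapter:
        return min(self.carriers, key=lambda carrier: carrier.quote_cents(weight_kg))

coordinator = ShippingCoordinator([
    FlatRateCarrier("carrier-a", base_cents=500, per_kg_cents=100),
    FlatRateCarrier("carrier-b", base_cents=200, per_kg_cents=250),
])
print(coordinator.cheapest(1.0).name, coordinator.cheapest(4.0).name)
```

Adding a third carrier means writing one more adapter; `ShippingCoordinator` does not change. Whether each adapter then becomes its own service is the granularity question this section is about.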
Granularity Integrators
These are the forces that argue for merging services into a single, larger service.
1. Database Transactions
If two (or more) services must participate in the same ACID transaction, keeping them together in one service is often the right choice. Distributed transactions are notoriously complex (two-phase commit, sagas) and introduce significant latency and failure modes.
Indicators that merging is warranted:
- Services share the same database tables and need atomicity across them
- A business operation must either complete fully across multiple services or not at all
- Rollback logic across service boundaries is currently broken or inconsistently implemented
Example: Customer registration that must atomically create a customer record AND create the initial account balance. If these live in separate services, a failure after the first operation but before the second leaves the system inconsistent. Keeping them together avoids the distributed transaction problem.
Key insight: If you cannot tolerate eventual consistency between two operations, they likely belong in the same service.
Trade-off: Keeping services together maintains transactional integrity at the cost of reduced independent deployability and scalability.
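When both writes live in one service with one database, atomicity comes from an ordinary local transaction. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table names, columns, and the `fail_before_balance` switch are assumptions made up for the demonstration.

```python
import sqlite3

def register_customer(conn, name, fail_before_balance=False):
    """Create the customer and its initial balance in one local transaction."""
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        cursor = conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))
        if fail_before_balance:
            raise RuntimeError("simulated crash between the two writes")
        conn.execute(
            "INSERT INTO account_balance (customer_id, cents) VALUES (?, 0)",
            (cursor.lastrowid,),
        )
    return cursor.lastrowid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE account_balance ("
    "customer_id INTEGER PRIMARY KEY REFERENCES customer(id), "
    "cents INTEGER NOT NULL)"
)

register_customer(conn, "Ada")
try:
    register_customer(conn, "Grace", fail_before_balance=True)
except RuntimeError:
    pass  # the failed registration is rolled back as a whole

customers = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
balances = conn.execute("SELECT COUNT(*) FROM account_balance").fetchone()[0]
print(customers, balances)
```

The failed registration leaves no customer row behind. With two services and two databases, the same guarantee would need a saga or similar coordination.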
2. Workflow and Choreography
If completing a business operation requires too many fine-grained inter-service calls (chatty services), merging some of the services reduces network overhead, latency, and complexity.
Indicators:
- A single user-facing operation requires many inter-service calls (as a rough threshold, more than 5-7)
- Services are constantly calling each other in tight loops
- Latency from network hops is noticeably degrading user experience
- Error handling across many hops is complex and fragile
The “chatty services” problem: Each network call adds latency (typically 10-100ms), potential for failure, and complexity. When services are so fine-grained that every meaningful operation requires orchestrating many of them, the overhead outweighs the benefits of decomposition.
Example: A checkout process that calls seven services in sequence: inventory service (check stock), price service (get price), discount service (apply discounts), tax service (calculate tax), address service (validate shipping), payment service (charge), and notification service (send confirmation). Merging some of these (e.g., price + discount + tax into an “order calculation” service) reduces the chain from seven calls to five.
Trade-off: Merging chatty services reduces latency and network overhead but reduces independent deployability of the merged functions.
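The cost of a sequential call chain can be estimated with simple arithmetic: latency adds up and per-call success rates multiply. The per-call latency (50 ms, inside the 10-100ms range quoted above) and the 99.9% success rate are assumptions for illustration. The call counts are the seven checkout calls listed above and the five that remain once price, discount, and tax are merged.

```python
def chain_cost(calls, latency_ms_per_call, success_rate_per_call):
    """Total latency and end-to-end success probability of a sequential chain."""
    total_latency_ms = calls * latency_ms_per_call
    end_to_end_success = success_rate_per_call ** calls
    return total_latency_ms, end_to_end_success

# Assumed figures: 50 ms per hop, each call succeeds 99.9% of the time.
latency_before, success_before = chain_cost(7, 50, 0.999)
latency_after, success_after = chain_cost(5, 50, 0.999)

print(f"7 calls: {latency_before} ms, success {success_before:.4f}")
print(f"5 calls: {latency_after} ms, success {success_after:.4f}")
```

This model ignores parallel calls, retries, and processing time inside each service, so treat it as a lower bound on the overhead rather than a prediction.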
3. Shared Code
If multiple services share the same business logic (not just infrastructure/utility code), it is a signal that they might belong in the same service — or that the shared logic needs careful ownership.
Important distinction:
- Shared infrastructure code (logging, tracing, retry logic): fine to share as a library — this is not a merging signal
- Shared business logic (discount calculation rules, customer eligibility checks): sharing this often indicates the services are actually part of the same bounded context
Indicators:
- The same business rule is duplicated across multiple services (or extracted into a shared library that many services depend on)
- Changes to the business rule require simultaneous deployment of multiple services
- The shared code is semantically meaningful to the domain (not just utility code)
Example: Two services both implementing the same customer tier eligibility logic (gold/silver/bronze based on spending). If the rules change, all dependent services must be updated in sync. This is a sign they may belong together — or that the logic should be owned by one authoritative service.
Trade-off: Merging eliminates shared-code coupling but reduces the granularity benefits (independent deployment, scaling, team ownership).
4. Data Relationships
If data in two services is tightly related and frequently joined, keeping them in the same service (and therefore the same database) avoids complex distributed joins and maintains referential integrity.
Indicators:
- Cross-service queries require implementing “in-memory joins” or API composition
- Foreign key relationships are maintained manually across services
- Reporting or analytics on the data always needs both datasets together
Example: A customer profile service and a customer address service. If every customer operation requires both profile and address data, and if the data is always accessed together, maintaining them as separate services creates join overhead and potential consistency issues. Merging them simplifies the data model.
Trade-off: Keeping related data together maintains integrity and avoids joins but reduces the ability to scale or deploy data management independently.
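The "in-memory join" indicator refers to code like the following, which a caller has to write once profile and address data live behind separate services. The record shapes and field names (`id`, `customer_id`) are hypothetical, and plain lists stand in for the two API responses.

```python
def join_profiles_with_addresses(profiles, addresses):
    """Hash-join two API responses on customer id in application memory."""
    addresses_by_customer = {}
    for address in addresses:
        addresses_by_customer.setdefault(address["customer_id"], []).append(address)

    joined = [
        {**profile, "addresses": addresses_by_customer.get(profile["id"], [])}
        for profile in profiles
    ]
    # Without a shared database there is no foreign key to prevent these.
    known_ids = {profile["id"] for profile in profiles}
    orphans = [a for a in addresses if a["customer_id"] not in known_ids]
    return joined, orphans

profiles = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
addresses = [
    {"customer_id": 1, "city": "London"},
    {"customer_id": 1, "city": "Paris"},
    {"customer_id": 3, "city": "Oslo"},  # refers to a customer that does not exist
]
joined, orphans = join_profiles_with_addresses(profiles, addresses)
print(len(joined[0]["addresses"]), len(joined[1]["addresses"]), len(orphans))
```

Inside a single service the same result is one SQL join, and the orphaned address could be ruled out by a foreign key constraint.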
Finding the Right Balance
The granularity decision is a balancing act between the disintegrators and integrators. There is no formula — it requires judgment about which forces dominate in the specific context.
Decision Flowchart
Start: Should I split this service?
│
├── YES signals (Disintegrators):
│   ├── Does it violate single purpose? → Consider splitting by function
│   ├── Do parts change at very different rates? → Consider splitting by volatility
│   ├── Do parts need dramatically different scale? → Consider splitting by scalability
│   ├── Does a failure in one part bring down unrelated functions? → Consider splitting by fault tolerance
│   ├── Does one part handle sensitive data others don't? → Consider splitting by security
│   └── Will one part grow/change implementation independently? → Consider splitting by extensibility
│
└── NO signals (Integrators: resist the split or merge back):
    ├── Do the split services need the same ACID transaction? → Keep together (or merge)
    ├── Would the split create too many chatty inter-service calls? → Keep together (or merge)
    ├── Do split services share business logic that changes in sync? → Keep together (or merge)
    └── Is the data always accessed together with tight referential relationships? → Keep together (or merge)
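The questions in the flowchart can be captured as a checklist that lists which forces are present on each side. This sketch deliberately stops short of producing a verdict, since the weighing remains a judgment call; the key names are shorthand invented here for the ten forces.

```python
DISINTEGRATORS = (
    "scope", "volatility", "scalability",
    "fault_tolerance", "security", "extensibility",
)
INTEGRATORS = ("transactions", "workflow", "shared_logic", "data_relationships")

def granularity_signals(answers):
    """Group the forces answered 'yes' into split and keep-together signals."""
    unknown = set(answers) - set(DISINTEGRATORS) - set(INTEGRATORS)
    if unknown:
        raise ValueError(f"unknown forces: {sorted(unknown)}")
    return {
        "split": [force for force in DISINTEGRATORS if answers.get(force, False)],
        "keep_together": [force for force in INTEGRATORS if answers.get(force, False)],
    }

# Hypothetical answers for a service under review.
signals = granularity_signals({
    "volatility": True,
    "scalability": True,
    "security": False,
    "transactions": True,
})
print(signals)
```

The output is a starting point for discussion: one strong integrator, such as a hard ACID requirement, can outweigh several disintegrators.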
Practical Rules of Thumb
| Rule | Guidance |
|---|---|
| Start coarser, refine later | It is easier to split a service than to merge two separate services with separate teams and databases |
| Prefer eventual consistency | If you can tolerate eventual consistency between two operations, they can often be separate services |
| Name test | If you cannot name the service without using “and,” it might be too broad |
| Two-pizza team test | A service should be ownable by one small team (2-8 people) |
| Deployment cadence | Services that always deploy together probably belong together |
| Data test | Services that always query the same tables together might belong together |
Granularity Trade-off Summary
| Force | Split | Keep Together |
|---|---|---|
| Scope/Function | Too many responsibilities | Single, clear purpose |
| Code Volatility | Parts change at different rates | All parts change at the same rate |
| Scalability | Parts have different throughput needs | Throughput is uniform |
| Fault Tolerance | Parts have different failure risk | Failure risk is uniform |
| Security | Parts have different sensitivity | Same security domain |
| Extensibility | Parts will evolve independently | Parts evolve together |
| Transactions | (not a split signal) | ACID required across operations |
| Workflow | (not a split signal) | Too many inter-service calls |
| Shared Code | (not a split signal) | Business logic is shared |
| Data Relationships | (not a split signal) | Data is tightly related |
Disintegrators vs. Integrators: Tension Matrix
When forces conflict, use this guide:
| Scenario | Dominant Force | Recommendation |
|---|---|---|
| Need independent scale AND shared transaction | Integrator (transaction) wins | Keep together; if a split becomes unavoidable, use a saga with compensating transactions |
| Different volatility BUT chatty if split | Depends on how chatty and how volatile | Split if volatility delta is large; keep if chattiness is severe |
| Security isolation needed BUT tight data relationships | Disintegrator (security) wins if compliance required | Split with API gateway; use encryption and strict access control |
| Extensibility desired BUT shared business logic | Shared logic must be resolved first | Extract shared logic to its own service or library, THEN split |
Sysops Squad Saga
Chapter 7 illustrates granularity decisions with two Sysops Squad case studies.
Example 1: Ticket Assignment Granularity
Context: The Sysops Squad system has a service that handles both ticket creation and ticket assignment to technicians. The question: should these be split?
Disintegrators present:
- Code volatility: assignment logic (based on technician skills, availability, geography) changes frequently as routing rules evolve; ticket creation logic is stable
- Scalability: assignment is computationally intensive (runs matching algorithms); ticket creation is lightweight
- Extensibility: the team anticipates adding ML-based routing, which will only affect assignment
Integrators present:
- Workflow: assignment always follows creation; they appear tightly coupled in the workflow
- Transactions: a ticket should not be created without an initial assignment attempt (consistency concern)
Decision: Split — the volatility and scalability differences dominate. The transactional concern is addressed by making initial assignment eventual (a ticket can exist briefly unassigned, with the assignment service picking it up asynchronously).
Lesson: When volatility and scalability differences are significant, accept eventual consistency to enable independent evolution.
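The "ticket can exist briefly unassigned" idea can be shown with a queue between creation and assignment. This is a single-process stand-in under loud assumptions: an in-memory `deque` plays the role of a message broker, a dict plays the role of the ticket store, and matching is reduced to a skill lookup. None of these names come from the book.

```python
from collections import deque

tickets = {}       # stand-in for the ticket store
pending = deque()  # stand-in for a message queue between the two services

def create_ticket(ticket_id, skill):
    """Ticket creation: persist the ticket and publish it for later assignment."""
    tickets[ticket_id] = {"skill": skill, "technician": None}
    pending.append(ticket_id)

def run_assignment(technician_by_skill):
    """Assignment service: drain the queue and assign whatever it can."""
    while pending:
        ticket_id = pending.popleft()
        skill = tickets[ticket_id]["skill"]
        tickets[ticket_id]["technician"] = technician_by_skill.get(skill)

create_ticket("T-1", "networking")
before = tickets["T-1"]["technician"]  # still unassigned at this point

run_assignment({"networking": "tech-42"})
after = tickets["T-1"]["technician"]
print(before, after)
```

The window between the two states is the eventual consistency that the decision accepts in exchange for independent deployment and scaling.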
Example 2: Customer Registration Granularity
Context: The customer registration process creates a customer record and sets up notification preferences. The question: should notification preferences be a separate service?
Disintegrators present:
- Scope: notification preference management seems like a distinct capability (different from identity)
Integrators present:
- Transactions: registration must atomically create the customer AND set up default notification preferences — a customer without notification preferences is an invalid state
- Data relationships: customer identity and notification preferences are always accessed together for most operations
Decision: Keep together — the transactional integrity concern dominates. The risk of inconsistent state (customer exists but has no notification preferences) outweighs the benefit of separating the function.
Lesson: When the merged data represents a single consistent unit that must not be partially created, transactional integrity wins.
Key Takeaways
- Granularity is context-dependent: There is no universally “correct” service size. The right granularity depends on the specific system’s volatility, scalability needs, security requirements, and data consistency needs.
- Two opposing forces: Granularity disintegrators push toward smaller, more focused services; granularity integrators push toward larger, more cohesive services. Both must be evaluated for every service.
- The six disintegrators: Service scope/function, code volatility, scalability/throughput, fault tolerance, security, and extensibility are the main reasons to split a service.
- The four integrators: Database transactions, workflow/choreography chattiness, shared business logic, and tight data relationships are the main reasons to keep services together (or merge them).
- Transactions are the strongest integrator: If two operations must be atomic and you cannot tolerate eventual consistency, keep them in the same service. Distributed transactions are a last resort, not a first choice.
- Chatty services signal over-decomposition: When a business operation requires more than 5-7 inter-service calls to complete, services are likely too fine-grained.
- Start coarser, refine later: Splitting services is operationally easier than merging them. When in doubt, start with a larger service and split when a disintegrating force becomes clearly dominant.
- Volatility is a powerful splitter: Code that changes at very different rates should be separated to reduce deployment risk and frequency for stable code.
- Shared business logic is a merger signal: If the same business rule lives in multiple services and changes must be synchronized, this is a strong sign the services are part of the same bounded context.
- The naming test: If you cannot describe a service’s purpose without the word “and,” it is a candidate for splitting by function.
Related Resources
- ch06-pulling-apart-operational-data — data decomposition that precedes granularity decisions
- ch08-reuse-patterns — what to do with code that is shared across services
- ch09-data-ownership — data ownership decisions closely related to granularity
- ch11-managing-distributed-workflows — choreography and orchestration, directly related to the workflow integrator
- ch12-transactional-sagas — how to handle distributed transactions when you must split despite transactional needs
- README — SAHT book overview and key concepts
Last Updated: 2026-05-30