Chapter 7: Service Granularity

saht service-granularity distributed-architecture microservices

Status: Notes complete


Overview

Chapter 7 addresses one of the most consequential — and difficult — decisions in distributed architecture: how big should a service be? There is no universal answer; granularity is determined by competing forces. The authors introduce a structured framework of Granularity Disintegrators (reasons to split a service into smaller pieces) and Granularity Integrators (reasons to merge services into a larger piece). Finding the right granularity requires consciously weighing these opposing forces against each other for each specific service.

The chapter’s central insight: getting granularity wrong in either direction is costly. Services that are too fine-grained produce chattiness, distributed transaction nightmares, and operational overhead. Services that are too coarse-grained re-create the monolith’s change-coupling and scaling problems.


Core Concepts

Term | Definition
Service granularity | The scope and size of a service: how much functionality it encapsulates
Granularity disintegrators | Architectural forces that push toward splitting a service into smaller pieces
Granularity integrators | Architectural forces that push toward combining services into a larger piece
Chatty services | Services that require many fine-grained inter-service calls to complete a business operation; a symptom of over-decomposition
Single-purpose principle | A service should do one thing well, with a single, well-defined responsibility
Fault tolerance | The ability of a system to continue operating despite partial failure
Code volatility | The rate at which code changes; high-volatility code often should be separated from low-volatility code

Granularity Disintegrators

These are the forces that argue for splitting a service into smaller, more focused services.

1. Service Scope and Function

The most intuitive driver: a service that does too many unrelated things should be split. This aligns with the Single Responsibility Principle at the service level.

Indicators that scope is too broad:

  • The service name is vague (e.g., “CustomerService” handles registration, preferences, payment methods, and notification settings)
  • Different teams own different parts of the same service
  • PRs constantly conflict because different features touch the same codebase
  • It is impossible to describe the service’s purpose in one sentence without using “and”

Example: A notification service that handles email, SMS, push notifications, and in-app alerts. These may share some code but are distinct enough (different protocols, different vendors, different failure modes) to warrant separate services.

Trade-off: Splitting by function improves cohesion and team autonomy but increases the number of services to deploy, monitor, and maintain.


2. Code Volatility

Parts of a service that change at different rates should be separated. Mixing high-volatility code with stable code means every change to the volatile part forces a full redeployment of the stable part — increasing risk and deployment frequency unnecessarily.

Indicators of volatility mismatch:

  • One area of the service has 10x more commits than another area
  • Business rules change frequently while data access code rarely changes
  • A new feature in one function causes unintended test failures in another

Example: In an order service, the discount calculation logic changes with every marketing campaign (high volatility), while the order persistence logic changes once a quarter (low volatility). Separating them means discount rule changes no longer risk breaking order persistence.

Trade-off: Decoupling by volatility reduces deployment risk but may create a network call where there was previously a method call.
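
The "10x more commits" indicator can be checked against version-control history. The following is a minimal sketch, assuming commit counts per code area have already been extracted (for example from the repository log); the area names, the counts, and the 10x threshold are illustrative assumptions, not figures from the book.

```python
from typing import Dict, List, Tuple


def volatility_mismatches(
    commits_per_area: Dict[str, int], ratio_threshold: float = 10.0
) -> List[Tuple[str, str, float]]:
    """Return (hot_area, stable_area, ratio) pairs whose commit ratio meets the threshold."""
    mismatches = []
    for hot, hot_count in commits_per_area.items():
        for stable, stable_count in commits_per_area.items():
            # Skip self-comparison and areas with no commits (ratio undefined).
            if hot == stable or stable_count == 0:
                continue
            ratio = hot_count / stable_count
            if ratio >= ratio_threshold:
                mismatches.append((hot, stable, ratio))
    # Largest mismatch first.
    return sorted(mismatches, key=lambda m: m[2], reverse=True)


# Hypothetical commit counts for one quarter in an order service.
counts = {"discount_rules": 120, "order_persistence": 4, "order_api": 30}
result = volatility_mismatches(counts)
print(result)
```

A large ratio is only a prompt to look closer; it says nothing about whether the split would introduce chattiness or break a transaction.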


3. Scalability and Throughput

Services with parts that have dramatically different scalability requirements should be split so the hot parts can scale independently without scaling everything.

Indicators:

  • One function within a service gets 100x more traffic than the rest
  • You are scaling the entire service to support one bottleneck function
  • Cost analysis shows you are over-provisioning most of the service to handle a peak in one area

Example: A product catalog service where search is called thousands of times per second but product creation is called a few hundred times per day. Running both in one service means you must scale the entire service for search traffic, paying for compute capacity that the product-creation code never uses.

Trade-off: Independent scalability reduces cost and improves performance, but adds operational complexity (separate deployment pipelines, separate scaling policies).
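
The cost argument in the catalog example can be made concrete with a small calculation. This is a sketch under stated assumptions: every number below (peak request rates, per-instance capacity, hourly prices, and the premise that a search-only instance is cheaper than a combined one) is hypothetical and chosen only to show the shape of the arithmetic.

```python
import math


def instances_needed(peak_rps: float, rps_per_instance: float, min_instances: int = 2) -> int:
    """Instances required for peak load, never fewer than a small redundancy floor."""
    return max(min_instances, math.ceil(peak_rps / rps_per_instance))


# All figures are hypothetical. Prices are integer cents per instance-hour.
SEARCH_PEAK_RPS = 3000
CREATE_PEAK_RPS = 1          # "a few hundred per day" is far below 1 request/second
RPS_PER_INSTANCE = 250

COMBINED_CENTS_PER_HOUR = 40     # instance sized to host search and creation code
SEARCH_ONLY_CENTS_PER_HOUR = 25  # assumed smaller footprint
CREATE_ONLY_CENTS_PER_HOUR = 20  # assumed smaller footprint

# Combined service: the whole service scales with search traffic.
combined_instances = instances_needed(SEARCH_PEAK_RPS, RPS_PER_INSTANCE)
combined_cost = combined_instances * COMBINED_CENTS_PER_HOUR

# Split services: each part scales to its own load.
search_instances = instances_needed(SEARCH_PEAK_RPS, RPS_PER_INSTANCE)
create_instances = instances_needed(CREATE_PEAK_RPS, RPS_PER_INSTANCE)
split_cost = (search_instances * SEARCH_ONLY_CENTS_PER_HOUR
              + create_instances * CREATE_ONLY_CENTS_PER_HOUR)

print(combined_cost, split_cost)
```

With these assumed inputs the split is cheaper, but the result flips easily if the two instance types cost about the same, which is why the comparison should be run with real numbers before it is used to justify a split.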


4. Fault Tolerance

Functions with different failure risk profiles should be isolated so that a failure in one does not bring down another. This is especially important for functions that call unreliable external systems.

Indicators:

  • One function calls a flaky third-party API
  • A memory leak in one function OOMs the entire service process
  • A bug in one area causes the entire service to crash, taking down unrelated functionality

Example: In a payment service, the payment processing function (which calls an external payment gateway) should be isolated from the refund calculation function (which is purely internal). If the payment gateway goes down, refund calculation should still work.

Trade-off: Fault isolation prevents blast radius expansion but requires implementing retry logic, circuit breakers, and health checks at service boundaries.
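
The circuit breaker mentioned in the trade-off can be sketched in a few lines. This is a minimal, single-threaded illustration, not a production implementation; the class name, thresholds, and the always-failing `flaky_gateway_charge` function are assumptions made for the example.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; skipping call")
            # Timeout elapsed: allow one trial call; a single failure re-opens.
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success closes the circuit fully
        return result


def flaky_gateway_charge(amount_cents: int) -> str:
    # Hypothetical stand-in for an external payment gateway that is down.
    raise ConnectionError("gateway unavailable")


breaker = CircuitBreaker(failure_threshold=2, reset_timeout=60.0)
outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky_gateway_charge, 1000)
    except CircuitOpenError:
        outcomes.append("skipped")
    except ConnectionError:
        outcomes.append("failed")
print(outcomes)
```

After two failures the third call is skipped without touching the gateway, so the caller fails fast instead of tying up resources waiting on a dependency that is known to be down.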


5. Security

Functions with elevated security requirements (PII, payment data, authentication secrets) should be isolated to minimize the attack surface and allow stricter access controls.

Indicators:

  • One function handles PII or PCI data while the rest of the service does not
  • Compliance requirements (PCI-DSS, HIPAA, GDPR) apply to only part of a service
  • Different authentication/authorization levels are required for different functions

Example: In a customer service, the function that stores and retrieves credit card data should be in a separate service with network isolation, stricter logging, dedicated secrets management, and limited internal access — rather than co-located with general profile management.

Trade-off: Security isolation reduces attack surface and simplifies compliance auditing, but increases the number of service-to-service calls that must be secured.


6. Extensibility

Functions that are likely to change implementation or grow in scope independently should be separated so they can evolve without impacting stable parts of the system.

Indicators:

  • A function is a known integration point where new vendors/providers will be plugged in
  • Business requirements suggest this area will grow substantially
  • The architecture team anticipates replacing the implementation in the next 12-18 months

Example: A shipping service that currently uses a single carrier. If the business plans to add multiple carrier integrations (FedEx, UPS, DHL), separating the carrier-specific logic from the shipping coordination logic allows each carrier adapter to evolve or be replaced without touching the coordination logic.

Trade-off: Extensibility-driven splits make the system more adaptable but add complexity now for benefits that may not materialize.
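
One way to picture the separation in the shipping example is an adapter interface between coordination logic and carrier-specific logic. The sketch below is an assumption about how that boundary might look; the class names, the pricing rules, and the carrier labels are hypothetical and do not correspond to any real carrier API.

```python
from typing import Dict, Protocol, Tuple


class CarrierAdapter(Protocol):
    """Boundary that every carrier-specific adapter is assumed to satisfy."""

    def quote_cents(self, weight_kg: float) -> int: ...


class FlatRateCarrier:
    def __init__(self, flat_cents: int) -> None:
        self.flat_cents = flat_cents

    def quote_cents(self, weight_kg: float) -> int:
        return self.flat_cents


class PerKgCarrier:
    def __init__(self, base_cents: int, per_kg_cents: int) -> None:
        self.base_cents = base_cents
        self.per_kg_cents = per_kg_cents

    def quote_cents(self, weight_kg: float) -> int:
        return self.base_cents + round(weight_kg * self.per_kg_cents)


class ShippingCoordinator:
    """Coordination logic; knows nothing about any specific carrier."""

    def __init__(self) -> None:
        self._carriers: Dict[str, CarrierAdapter] = {}

    def register(self, name: str, adapter: CarrierAdapter) -> None:
        self._carriers[name] = adapter

    def cheapest(self, weight_kg: float) -> Tuple[str, int]:
        if not self._carriers:
            raise ValueError("no carriers registered")
        quotes = {name: a.quote_cents(weight_kg) for name, a in self._carriers.items()}
        best = min(quotes, key=quotes.get)
        return best, quotes[best]


coordinator = ShippingCoordinator()
coordinator.register("carrier_a", FlatRateCarrier(flat_cents=1200))
coordinator.register("carrier_b", PerKgCarrier(base_cents=500, per_kg_cents=200))
light = coordinator.cheapest(2.0)
heavy = coordinator.cheapest(5.0)
print(light, heavy)
```

Adding a third carrier means writing one new adapter and one `register` call; the coordinator is untouched. Whether each adapter then becomes its own deployable service is the granularity question this section weighs.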


Granularity Integrators

These are the forces that argue for merging services into a single, larger service.

1. Database Transactions

If two (or more) services must participate in the same ACID transaction, keeping them together in one service is often the right choice. Distributed transactions are notoriously complex (two-phase commit, sagas) and introduce significant latency and failure modes.

Indicators that merging is warranted:

  • Services share the same database tables and need atomicity across them
  • A business operation must either complete fully across multiple services or not at all
  • Rollback logic across service boundaries is currently broken or inconsistently implemented

Example: Customer registration that must atomically create a customer record AND create the initial account balance. If these live in separate services, a failure after the first operation but before the second leaves the system inconsistent. Keeping them together avoids the distributed transaction problem.

Key insight: If you cannot tolerate eventual consistency between two operations, they likely belong in the same service.

Trade-off: Keeping services together maintains transactional integrity at the cost of reduced independent deployability and scalability.
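
The registration example is easy to demonstrate when both writes live behind one database connection. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table names, columns, and the CHECK constraint used to force a failure are assumptions for illustration.

```python
import sqlite3


def register_customer(conn: sqlite3.Connection, name: str, opening_balance_cents: int) -> int:
    """Create the customer and the initial balance as one atomic unit."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        cur = conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))
        conn.execute(
            "INSERT INTO account_balance (customer_id, balance_cents) VALUES (?, ?)",
            (cur.lastrowid, opening_balance_cents),
        )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE account_balance ("
    " customer_id INTEGER PRIMARY KEY REFERENCES customer(id),"
    " balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0))"
)
conn.commit()

register_customer(conn, "Ada", 0)

try:
    # The second insert violates the CHECK constraint, so the first insert
    # (the customer row for Bob) is rolled back along with it.
    register_customer(conn, "Bob", -5)
except sqlite3.IntegrityError:
    pass

customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
balance_count = conn.execute("SELECT COUNT(*) FROM account_balance").fetchone()[0]
print(customer_count, balance_count)
```

No customer row exists without a balance row. If the two inserts lived in separate services with separate databases, the same guarantee would need a saga or similar coordination instead of a single `with conn:` block.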


2. Workflow and Choreography

If completing a business operation requires too many fine-grained inter-service calls (chatty services), merging some of the services reduces network overhead, latency, and complexity.

Indicators:

  • A single user-facing operation requires 8+ inter-service calls
  • Services are constantly calling each other in tight loops
  • Latency from network hops is noticeably degrading user experience
  • Error handling across many hops is complex and fragile

The “chatty services” problem: Each network call adds latency (typically 10-100ms), potential for failure, and complexity. When services are so fine-grained that every meaningful operation requires orchestrating many of them, the overhead outweighs the benefits of decomposition.

Example: A checkout process that requires calling: inventory service (check stock), price service (get price), discount service (apply discounts), tax service (calculate tax), address service (validate shipping), payment service (charge), and notification service (send confirmation) — sequentially. Merging some of these (e.g., price + discount + tax into an “order calculation” service) can reduce the chain to 4-5 calls.

Trade-off: Merging chatty services reduces latency and network overhead but reduces independent deployability of the merged functions.
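
The overhead of a sequential call chain can be estimated with simple arithmetic. This sketch assumes every call costs the same and that failures are independent, which real systems rarely satisfy exactly; the 50 ms latency and 0.999 success rate are hypothetical values.

```python
from typing import List


def sequential_latency_ms(per_call_ms: List[int]) -> int:
    """Sequential calls add up: total latency is the sum of each hop."""
    return sum(per_call_ms)


def all_calls_succeed_probability(per_call_success: float, calls: int) -> float:
    """With independent failures, the chain succeeds only if every call does."""
    return per_call_success ** calls


PER_CALL_MS = 50          # hypothetical, within the 10-100 ms range noted above
PER_CALL_SUCCESS = 0.999  # hypothetical per-call success rate

# Seven-service checkout chain from the example.
before_ms = sequential_latency_ms([PER_CALL_MS] * 7)
before_ok = all_calls_succeed_probability(PER_CALL_SUCCESS, 7)

# Price, discount, and tax merged into one "order calculation" call: five hops.
after_ms = sequential_latency_ms([PER_CALL_MS] * 5)
after_ok = all_calls_succeed_probability(PER_CALL_SUCCESS, 5)

print(before_ms, after_ms)
print(round(before_ok, 5), round(after_ok, 5))
```

Under these assumptions the merge removes 100 ms of network time per checkout and slightly raises the chance that the whole chain succeeds; calls that can run in parallel would change the latency figure but not the success-probability one.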


3. Shared Code

If multiple services share the same business logic (not just infrastructure/utility code), it is a signal that they might belong in the same service — or that the shared logic needs careful ownership.

Important distinction:

  • Shared infrastructure code (logging, tracing, retry logic): fine to share as a library — this is not a merging signal
  • Shared business logic (discount calculation rules, customer eligibility checks): sharing this often indicates the services are actually part of the same bounded context

Indicators:

  • The same business rule is duplicated across multiple services (or extracted into a shared library that many services depend on)
  • Changes to the business rule require simultaneous deployment of multiple services
  • The shared code is semantically meaningful to the domain (not just utility code)

Example: Two services both implementing the same customer tier eligibility logic (gold/silver/bronze based on spending). If the rules change, all dependent services must be updated in sync. This is a sign they may belong together — or that the logic should be owned by one authoritative service.

Trade-off: Merging eliminates shared-code coupling but reduces the granularity benefits (independent deployment, scaling, team ownership).


4. Data Relationships

If data in two services is tightly related and frequently joined, keeping them in the same service (and therefore the same database) avoids complex distributed joins and maintains referential integrity.

Indicators:

  • Cross-service queries require implementing “in-memory joins” or API composition
  • Foreign key relationships are maintained manually across services
  • Reporting or analytics on the data always needs both datasets together

Example: A customer profile service and a customer address service. If every customer operation requires both profile and address data, and if the data is always accessed together, maintaining them as separate services creates join overhead and potential consistency issues. Merging them simplifies the data model.

Trade-off: Keeping related data together maintains integrity and avoids joins but reduces the ability to scale or deploy data management independently.
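
The "in-memory join" and manually maintained foreign keys from the indicators look roughly like this when profile and address data come from two separate services. The record shapes and field names are assumptions; the two lists stand in for responses from hypothetical profile and address APIs.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def compose_customers(profiles: List[Record], addresses: List[Record]) -> List[Record]:
    """Join profile and address records in memory, keyed by customer id."""
    by_customer: Dict[int, List[Record]] = {}
    for address in addresses:
        by_customer.setdefault(address["customer_id"], []).append(address)
    return [{**p, "addresses": by_customer.get(p["id"], [])} for p in profiles]


def orphaned_addresses(profiles: List[Record], addresses: List[Record]) -> List[Record]:
    """Addresses whose customer no longer exists: the check a foreign key would do."""
    known_ids = {p["id"] for p in profiles}
    return [a for a in addresses if a["customer_id"] not in known_ids]


# Stand-ins for the responses of two separate services.
profiles = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]
addresses = [
    {"customer_id": 1, "city": "London"},
    {"customer_id": 3, "city": "Paris"},  # refers to a customer that does not exist
]

composed = compose_customers(profiles, addresses)
orphans = orphaned_addresses(profiles, addresses)
print(composed)
print(orphans)
```

In a single service with one database, the join is a query and the orphan case is prevented by a foreign key constraint; split across services, both become application code that someone has to write and keep correct.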


Finding the Right Balance

The granularity decision is a balancing act between the disintegrators and integrators. There is no formula — it requires judgment about which forces dominate in the specific context.

Decision Flowchart

Start: Should I split this service?
│
├── YES signals (Disintegrators):
│   ├── Does it violate single purpose? → Consider splitting by function
│   ├── Do parts change at very different rates? → Consider splitting by volatility
│   ├── Do parts need dramatically different scale? → Consider splitting by scalability
│   ├── Does a failure in one part bring down unrelated functions? → Consider splitting by fault tolerance
│   ├── Does one part handle sensitive data others don't? → Consider splitting by security
│   └── Will one part grow/change implementation independently? → Consider splitting by extensibility
│
└── NO signals (Integrators — resist the split or merge back):
    ├── Do the split services need the same ACID transaction? → Keep together (or merge)
    ├── Would the split create too many chatty inter-service calls? → Keep together (or merge)
    ├── Do split services share business logic that changes in sync? → Keep together (or merge)
    └── Is the data always accessed together with tight referential relationships? → Keep together (or merge)
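
The flowchart can be kept as a small checklist helper that records which forces were observed for a given service. This is only a bookkeeping sketch: it sorts the observed forces into the two groups and flags conflicts, and deliberately does not try to pick a winner, since that remains a judgment call. The force identifiers and the wording of the result are assumptions.

```python
from typing import Dict, Iterable

DISINTEGRATORS = (
    "scope", "volatility", "scalability",
    "fault_tolerance", "security", "extensibility",
)
INTEGRATORS = ("transactions", "workflow", "shared_code", "data_relationships")


def summarize_forces(observed: Iterable[str]) -> Dict[str, object]:
    """Group observed forces into split and keep-together signals."""
    observed_set = set(observed)
    unknown = observed_set - set(DISINTEGRATORS) - set(INTEGRATORS)
    if unknown:
        raise ValueError(f"unknown forces: {sorted(unknown)}")

    split_signals = [f for f in DISINTEGRATORS if f in observed_set]
    keep_signals = [f for f in INTEGRATORS if f in observed_set]

    if split_signals and keep_signals:
        leaning = "conflict: weigh the trade-offs"
    elif split_signals:
        leaning = "consider splitting"
    elif keep_signals:
        leaning = "keep together"
    else:
        leaning = "no strong signal"

    return {"split_signals": split_signals, "keep_signals": keep_signals, "leaning": leaning}


# Forces listed for a service that mixes ticket creation and assignment.
summary = summarize_forces(
    ["volatility", "scalability", "extensibility", "workflow", "transactions"]
)
print(summary)
```

The value of writing the forces down this way is that a conflict is made explicit and has to be argued, instead of being settled by whichever force was noticed first.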

Practical Rules of Thumb

Rule | Guidance
Start coarser, refine later | It is easier to split a service than to merge two separate services with separate teams and databases
Prefer eventual consistency | If you can tolerate eventual consistency between two operations, they can often be separate services
Name test | If you cannot name the service without using “and,” it might be too broad
Two-pizza team test | A service should be ownable by one small team (2-8 people)
Deployment cadence | Services that always deploy together probably belong together
Data test | Services that always query the same tables together might belong together

Granularity Trade-off Summary

Force | Split | Keep Together
Scope/Function | Too many responsibilities | Single, clear purpose
Code Volatility | Parts change at different rates | All parts change at the same rate
Scalability | Parts have different throughput needs | Throughput is uniform
Fault Tolerance | Parts have different failure risk | Failure risk is uniform
Security | Parts have different sensitivity | Same security domain
Extensibility | Parts will evolve independently | Parts evolve together
Transactions | (not a split signal) | ACID required across operations
Workflow | (not a split signal) | Too many inter-service calls
Shared Code | (not a split signal) | Business logic is shared
Data Relationships | (not a split signal) | Data is tightly related

Disintegrators vs. Integrators: Tension Matrix

When forces conflict, use this guide:

Scenario | Dominant Force | Recommendation
Need independent scale AND shared transaction | Integrator (transaction) wins | Keep together and use a local ACID transaction; fall back to a saga with compensating transactions only if a split becomes unavoidable
Different volatility BUT chatty if split | Depends on how chatty and how volatile | Split if volatility delta is large; keep if chattiness is severe
Security isolation needed BUT tight data relationships | Disintegrator (security) wins if compliance required | Split with API gateway; use encryption and strict access control
Extensibility desired BUT shared business logic | Shared logic must be resolved first | Extract shared logic to its own service or library, THEN split

Sysops Squad Saga

The book uses two case study examples in Chapter 7 to illustrate granularity decisions.

Example 1: Ticket Assignment Granularity

Context: The Sysops Squad system has a service that handles both ticket creation and ticket assignment to technicians. The question: should these be split?

Disintegrators present:

  • Code volatility: assignment logic (based on technician skills, availability, geography) changes frequently as routing rules evolve; ticket creation logic is stable
  • Scalability: assignment is computationally intensive (runs matching algorithms); ticket creation is lightweight
  • Extensibility: the team anticipates adding ML-based routing, which will only affect assignment

Integrators present:

  • Workflow: assignment always follows creation; they appear tightly coupled in the workflow
  • Transactions: a ticket should not be created without an initial assignment attempt (consistency concern)

Decision: Split — the volatility and scalability differences dominate. The transactional concern is addressed by making initial assignment eventual (a ticket can exist briefly unassigned, with the assignment service picking it up asynchronously).

Lesson: When volatility and scalability differences are significant, accept eventual consistency to enable independent evolution.
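
The "briefly unassigned" state in this decision can be sketched with two small classes and a queue between them. This is an in-process illustration only: the `deque` stands in for a real message queue, the skill lookup stands in for the actual routing logic, and all class, method, and technician names are assumptions.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, Optional


@dataclass
class Ticket:
    ticket_id: int
    required_skill: str
    technician: Optional[str] = None  # None means "not yet assigned"


class TicketService:
    """Creates tickets and publishes a 'ticket created' event."""

    def __init__(self, created_events: Deque[int]) -> None:
        self._tickets: Dict[int, Ticket] = {}
        self._created_events = created_events
        self._next_id = 1

    def create(self, required_skill: str) -> Ticket:
        ticket = Ticket(self._next_id, required_skill)
        self._next_id += 1
        self._tickets[ticket.ticket_id] = ticket
        self._created_events.append(ticket.ticket_id)  # hand off asynchronously
        return ticket

    def get(self, ticket_id: int) -> Ticket:
        return self._tickets[ticket_id]

    def record_assignment(self, ticket_id: int, technician: str) -> None:
        self._tickets[ticket_id].technician = technician


class AssignmentService:
    """Consumes 'ticket created' events and assigns a technician later."""

    def __init__(self, created_events: Deque[int], ticket_service: TicketService,
                 technician_by_skill: Dict[str, str]) -> None:
        self._created_events = created_events
        self._ticket_service = ticket_service
        self._technician_by_skill = technician_by_skill  # placeholder routing rule

    def process_pending(self) -> int:
        assigned = 0
        while self._created_events:
            ticket_id = self._created_events.popleft()
            ticket = self._ticket_service.get(ticket_id)
            technician = self._technician_by_skill.get(ticket.required_skill)
            if technician is not None:
                self._ticket_service.record_assignment(ticket_id, technician)
                assigned += 1
        return assigned


events: Deque[int] = deque()
tickets = TicketService(events)
assignment = AssignmentService(events, tickets, {"networking": "sam"})

ticket = tickets.create("networking")
unassigned_at_creation = ticket.technician is None
assigned_count = assignment.process_pending()
print(unassigned_at_creation, assigned_count, tickets.get(ticket.ticket_id).technician)
```

Ticket creation succeeds even if the assignment side is slow or down; the price is that readers of the ticket must tolerate seeing it with no technician for a short time.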


Example 2: Customer Registration Granularity

Context: The customer registration process creates a customer record and sets up notification preferences. The question: should notification preferences be a separate service?

Disintegrators present:

  • Scope: notification preference management seems like a distinct capability (different from identity)

Integrators present:

  • Transactions: registration must atomically create the customer AND set up default notification preferences — a customer without notification preferences is an invalid state
  • Data relationships: customer identity and notification preferences are always accessed together for most operations

Decision: Keep together — the transactional integrity concern dominates. The risk of inconsistent state (customer exists but has no notification preferences) outweighs the benefit of separating the function.

Lesson: When the merged data represents a single consistent unit that must not be partially created, transactional integrity wins.


Key Takeaways

  1. Granularity is context-dependent: There is no universally “correct” service size. The right granularity depends on the specific system’s volatility, scalability needs, security requirements, and data consistency needs.

  2. Two opposing forces: Granularity disintegrators push toward smaller, more focused services; granularity integrators push toward larger, more cohesive services. Both must be evaluated for every service.

  3. The six disintegrators: Service scope/function, code volatility, scalability/throughput, fault tolerance, security, and extensibility are the main reasons to split a service.

  4. The four integrators: Database transactions, workflow/choreography chattiness, shared business logic, and tight data relationships are the main reasons to keep services together (or merge them).

  5. Transactions are the strongest integrator: If two operations must be atomic and you cannot tolerate eventual consistency, keep them in the same service. Distributed transactions are a last resort, not a first choice.

  6. Chatty services signal over-decomposition: When a single business operation requires a long chain of inter-service calls (roughly 5-8 or more, as a rule of thumb rather than a hard threshold), services are likely too fine-grained.

  7. Start coarser, refine later: Splitting services is operationally easier than merging them. When in doubt, start with a larger service and split when a disintegrating force becomes clearly dominant.

  8. Volatility is a powerful splitter: Code that changes at very different rates should be separated to reduce deployment risk and frequency for stable code.

  9. Shared business logic is a merger signal: If the same business rule lives in multiple services and changes must be synchronized, this is a strong sign the services are part of the same bounded context.

  10. The naming test: If you cannot describe a service’s purpose without the word “and,” it is a candidate for splitting by function.


Last Updated: 2026-05-30