Chapter 5: Component-Based Decomposition Patterns
saht decomposition component-patterns modularity fitness-functions
Status: Notes complete
Overview
When an organization decides to break a monolith apart, the hardest question is always where to cut. Chapter 5 answers that question systematically. The authors present six sequential decomposition patterns — each a concrete analytical step — that together form a migration workflow from monolith to a set of well-bounded services.
The core philosophy is component-based decomposition: a component is the unit of analysis (typically a top-level namespace, package, or module in the codebase). By studying components — their size, their dependencies, their shared elements, their hierarchical depth — an architect can make principled decisions about service boundaries rather than guessing.
The chapter also introduces fitness functions as the governance mechanism: automated checks (metrics, coupling scores, dependency rules) that alert the team when decomposition decisions are violated as the codebase evolves.
The Sysops Squad Saga appears after each pattern: the fictional Sysops Squad team applies that pattern to its support-ticketing monolith, which makes the abstract guidance concrete.
The Six Decomposition Patterns
Pattern 1: Identify and Size Components
What it does
Before any decomposition can happen, the architect must first see the components. This pattern establishes an inventory of all top-level components in the monolith and evaluates whether each one is appropriately sized — neither a monolithic god-component nor an over-granular sliver.
How to apply
- Inventory components: enumerate every top-level namespace/package in the codebase. Each namespace represents a candidate component. Map the namespace to the domain concept it represents (e.g., ss.ticket, ss.customer, ss.billing).
- Measure relative size: measure the percentage of overall codebase each component occupies (line count, class count, or statement count are all valid proxies). Components that are too large will be harder to split into services cleanly; they contain too much mixed responsibility. Components that are too small are likely sub-function fragments that belong inside another component.
- Apply size heuristics: the book suggests that no single component should exceed roughly 10–15% of total codebase size (a rough upper bound). Components below ~1–2% are suspects for being too granular.
- Document component responsibilities: for each component, write a brief statement of what it owns. This statement will be refined in subsequent patterns.
Fitness functions for governance
- Component size fitness function: flag any component whose size exceeds a configurable percentage threshold. Run this as part of the CI pipeline so growth is caught early.
- Component count fitness function: alert when the total component count drifts outside an expected range (too few suggests components collapsed; too many suggests fragmentation).
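A minimal sketch of both checks, assuming statement counts per component have already been collected by some other tool. The thresholds, function names, and sample counts are illustrative assumptions (the counts are chosen only to reproduce the 38% and sub-1% figures from the Sysops Squad example), not values from the book.

```python
from typing import Dict, List


def component_size_report(statement_counts: Dict[str, int],
                          max_pct: float = 15.0,
                          min_pct: float = 1.0) -> Dict[str, List[str]]:
    """Classify each component by its percentage share of the codebase."""
    total = sum(statement_counts.values())
    if total == 0:
        raise ValueError("codebase has no statements to measure")
    report: Dict[str, List[str]] = {"too_large": [], "too_small": [], "ok": []}
    for name, count in sorted(statement_counts.items()):
        pct = 100.0 * count / total
        if pct > max_pct:
            report["too_large"].append(name)
        elif pct < min_pct:
            report["too_small"].append(name)
        else:
            report["ok"].append(name)
    return report


def component_count_in_range(statement_counts: Dict[str, int],
                             low: int, high: int) -> bool:
    """Component count check: True when the inventory size is in bounds."""
    return low <= len(statement_counts) <= high


# Hypothetical inventory: the numbers are made up for illustration.
counts = {
    "ss.ticket": 3800, "ss.customer": 1400, "ss.billing": 1300,
    "ss.expert": 1200, "ss.survey": 1220, "ss.reporting": 1000,
    "ss.login": 80,
}
report = component_size_report(counts)
print(report["too_large"], report["too_small"])
```

In a CI job, a non-empty too_large list (or an out-of-range component count) would fail or warn the build; whether it should fail or only warn is a team decision.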
Trade-offs
| Concern | Detail |
|---|---|
| Line-count as proxy | LOC is a rough proxy; it can be gamed. Using class/file count or architectural fitness tools (e.g., ArchUnit, JDepend) is more reliable. |
| Namespace ≠ domain | Legacy code often has namespace sprawl unrelated to domain boundaries. Cleanup may be required before sizing means anything. |
| Cost of inventory | For large monoliths (millions of LOC), automated tooling is essential. Manual inventory is error-prone. |
Sysops Squad application
The team maps their monolith’s namespaces and discovers their ss.ticket component holds ~38% of total code — a clear god-component. Meanwhile ss.login is under 1%. These findings drive all subsequent decisions. Several namespaces turn out to be pure infrastructure (logging, security, database utilities) that will be handled in Pattern 2.
Pattern 2: Gather Common Domain Components
What it does
Shared infrastructure code — logging frameworks, authentication utilities, database connection pools, error handling libraries, notification helpers — tends to be scattered across multiple components through copy-paste or loose imports. If this code is later split into separate services, each service will either duplicate it or create a tight runtime coupling on a shared library. This pattern identifies those cross-cutting concerns and consolidates them into a common domain component (or set of components), which then becomes a candidate for a shared service or library.
How to apply
- Identify cross-cutting namespaces: scan all component namespaces for classes whose names or purposes are clearly infrastructure/utility, such as Logger, DBConnector, Authenticator, EmailSender, AuditTrail, etc.
- Measure coupling breadth: for each candidate common component, count how many other components import or call it. Any component imported by a majority of other components (say, >50%) is a strong candidate for the common domain.
- Consolidate into a common namespace: move or redirect all shared code to a single ss.common (or similar) top-level namespace. Sub-divide by function: ss.common.logging, ss.common.messaging, ss.common.persistence.
- Decide deployment model: the consolidated common domain can become:
  - A shared library (embedded in each service at build time), which avoids runtime coupling but requires careful versioning.
  - A shared service (deployed once, called at runtime), which eliminates duplication but introduces network coupling and a single point of failure.
Fitness functions for governance
- Common coupling fitness function: alert when any non-common component directly references classes that should live in the common domain. Prevents drift back into scattered shared code.
- Shared library version drift: if using a library model, flag services that depend on different versions of the shared library.
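The coupling-breadth measurement from the steps above can be sketched as follows. This assumes an import map (component to the set of components it imports) has already been extracted by a static analysis tool; the component names and the 0.5 threshold are illustrative assumptions.

```python
from typing import Dict, List, Set


def coupling_breadth(imports: Dict[str, Set[str]], candidate: str) -> float:
    """Fraction of the *other* components that import `candidate`."""
    others = [c for c in imports if c != candidate]
    if not others:
        return 0.0
    importers = sum(1 for c in others if candidate in imports[c])
    return importers / len(others)


def common_domain_candidates(imports: Dict[str, Set[str]],
                             threshold: float = 0.5) -> List[str]:
    """Components imported by more than `threshold` of the other components."""
    return sorted(c for c in imports
                  if coupling_breadth(imports, c) > threshold)


# Hypothetical import map; every component appears as a key,
# even if it imports nothing.
imports = {
    "ss.ticket": {"ss.logging", "ss.customer", "ss.notification"},
    "ss.customer": {"ss.logging", "ss.notification"},
    "ss.billing": {"ss.logging", "ss.customer"},
    "ss.survey": {"ss.logging"},
    "ss.notification": {"ss.logging"},
    "ss.logging": set(),
}
print(common_domain_candidates(imports))
```

Here ss.logging is imported by all five other components, while ss.customer and ss.notification are each imported by two of five, so only ss.logging crosses the majority threshold. A breadth score is a prompt for review, not a verdict: widely imported domain logic is still a coupling signal rather than common infrastructure.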
Trade-offs
| Concern | Detail |
|---|---|
| Library vs. service | Libraries are faster and simpler but create deployment coupling (all services must redeploy when the library changes). Services decouple deployments but add latency and failure modes. |
| Versioning complexity | Shared libraries across many services create diamond-dependency problems over time. Semantic versioning + strict compatibility policies are essential. |
| What qualifies? | Not everything shared belongs in common. Domain logic shared between two components is a signal of coupling, not a reason for a common service — address it in Pattern 5 (domains) instead. |
Sysops Squad application
The team finds that logging, notifications, and database utilities are duplicated across five components. They consolidate into ss.common and decide on a shared library for logging/persistence (simple, stable) and a notification service for email/SMS (because notification channels change frequently and independent deployment is valuable).
Pattern 3: Flatten Components
What it does
Many real-world codebases have deep namespace hierarchies — namespaces nested three, four, or five levels deep. This depth makes it very hard to reason about component boundaries because a “component” at level 2 may contain hundreds of sub-namespaces at levels 3-5 that have their own internal structure. This pattern simplifies analysis by collapsing all code to a single level of nesting below the root namespace — making each second-level namespace a clean, visible component.
How to apply
- Map the current hierarchy: draw or list every namespace path in the codebase. Note the depth at which meaningful domain concepts appear.
- Define the target depth: the goal is <root>.<domain> (two levels). Everything below <domain> is internal implementation detail of that component, not a separate architectural component.
- Merge upward: any code in ss.ticket.domain.entity.Customer should be considered part of ss.ticket. The deep path is internal packaging convention, not a separate architectural boundary.
- Split downward only when necessary: if a second-level namespace is still too large after merging its children (violating Pattern 1’s size heuristic), it may need to be split into two sibling second-level namespaces (e.g., ss.ticket.create and ss.ticket.resolve become ss.ticketcreate and ss.ticketresolve).
- Handle orphan classes: classes that live directly in the root namespace (ss.* without a domain) must be assigned to the most appropriate second-level component or moved to ss.common.
Fitness functions for governance
- Hierarchy depth fitness function: fail the build if any source file sits in a namespace deeper than the two-level <root>.<domain> target. This prevents re-growth of deep hierarchies.
- Orphan class fitness function: flag any class found in the root namespace rather than a named component.
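A rough sketch of the two checks, assuming a list of fully qualified class names is available and that depth is counted in namespace segments including the root (so ss.ticket has depth 2). The root name, the helper names, and the sample classes are assumptions for illustration.

```python
from typing import List, Tuple

ROOT = "ss"      # assumed root namespace
MAX_DEPTH = 2    # <root>.<domain>


def namespace_of(qualified_class: str) -> str:
    """Drop the class name, keeping only the namespace path."""
    return qualified_class.rsplit(".", 1)[0]


def component_of(qualified_class: str) -> str:
    """'Merge upward': the owning component is the first two segments."""
    return ".".join(qualified_class.split(".")[:MAX_DEPTH])


def check_hierarchy(classes: List[str]) -> Tuple[List[str], List[str]]:
    """Return (classes nested too deeply, orphan classes in the root)."""
    too_deep: List[str] = []
    orphans: List[str] = []
    for qualified in classes:
        namespace = namespace_of(qualified)
        if namespace == ROOT:
            orphans.append(qualified)
        elif len(namespace.split(".")) > MAX_DEPTH:
            too_deep.append(qualified)
    return too_deep, orphans


# Hypothetical class inventory.
classes = [
    "ss.ticket.Ticket",
    "ss.ticket.domain.entity.Customer",
    "ss.common.Logger",
    "ss.Util",
]
too_deep, orphans = check_hierarchy(classes)
print(too_deep, orphans, component_of("ss.ticket.domain.entity.Customer"))
```

The component_of helper is useful on its own before any renaming happens: it lets sizing and dependency analysis treat deeply nested code as part of its second-level component.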
Trade-offs
| Concern | Detail |
|---|---|
| Disruption | Flattening requires renaming namespaces — a mechanical refactor but one that touches many files. Use automated refactoring tools. |
| Premature flattening | Merging too aggressively can produce oversized second-level components, contradicting Pattern 1. Pattern 3 should iterate with Pattern 1. |
| Framework conventions | Some frameworks (Spring, Rails, Django) have opinionated directory structures that may conflict. Adapt the rule to the language/framework’s conventions. |
Sysops Squad application
The Sysops Squad finds their codebase has namespaces four levels deep in places (e.g., ss.ticket.domain.repository.impl). After flattening, ss.ticket absorbs its sub-namespaces and emerges as a single (though large) component. This forces them to revisit Pattern 1 and split ss.ticket into ss.ticketmgmt and ss.ticketworkflow to meet size constraints.
Pattern 4: Determine Component Dependencies
What it does
Once components are well-sized and at a uniform depth, the architect must understand how they connect. This pattern uses afferent coupling (incoming dependencies, Ca) and efferent coupling (outgoing dependencies, Ce) to build a dependency graph. The goal is to identify dependency tangles — circular or highly entangled groups of components that cannot be cleanly separated into services.
How to apply
- Build the dependency matrix: for each pair of components (A, B), record whether A imports/calls B, B imports/calls A, or there is no dependency. This produces an N×N matrix.
- Calculate coupling metrics for each component:
  - Afferent coupling (Ca): how many other components depend on this component. High Ca = many callers. Changing this component breaks many things.
  - Efferent coupling (Ce): how many other components this component depends on. High Ce = this component relies on many things.
  - Instability (I) = Ce / (Ca + Ce). Range 0 (stable, many dependents) to 1 (unstable, many dependencies).
- Visualize the dependency graph: draw a directed graph. Look for:
  - Circular dependencies: A → B → C → A. These are tangles that must be broken before components can become separate services.
  - Hub components: components with very high Ca. They are shared by many and are strong candidates for the common domain (Pattern 2 revisited).
  - Leaf components: high Ce, low Ca. These depend on many things but nothing depends on them. Good candidates to extract as services first.
- Break circular dependencies using one of these techniques:
  - Extract a new component: move the shared code that A and B both need into a new ss.shared component; both depend on it, but the cycle is broken.
  - Apply Dependency Inversion: introduce an interface in a neutral package; A and B depend on the interface rather than each other.
  - Merge: if A and B are tightly coupled because they represent one cohesive concept, merge them into a single component.
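The coupling metrics above can be computed directly from a dependency map. The sketch below assumes the map (component to the set of components it depends on) comes from a static analysis step; the sample graph is hypothetical, and treating an isolated component as I = 0 is a choice made here because the formula is undefined when Ca + Ce = 0.

```python
from typing import Dict, NamedTuple, Set


class Coupling(NamedTuple):
    ca: int             # afferent: components that depend on this one
    ce: int             # efferent: components this one depends on
    instability: float  # Ce / (Ca + Ce)


def coupling_metrics(deps: Dict[str, Set[str]]) -> Dict[str, Coupling]:
    """Compute Ca, Ce and instability for every component in the graph."""
    components = set(deps) | {t for targets in deps.values() for t in targets}
    result: Dict[str, Coupling] = {}
    for component in components:
        ce = len(deps.get(component, set()) - {component})
        ca = sum(1 for other, targets in deps.items()
                 if other != component and component in targets)
        total = ca + ce
        # Assumption: an isolated component is reported as fully stable.
        instability = ce / total if total else 0.0
        result[component] = Coupling(ca, ce, instability)
    return result


# Hypothetical dependency map.
deps = {
    "ss.ticketmgmt": {"ss.customer", "ss.common"},
    "ss.customer": {"ss.ticketmgmt", "ss.common"},
    "ss.reporting": {"ss.ticketmgmt", "ss.customer", "ss.common"},
    "ss.common": set(),
}
metrics = coupling_metrics(deps)
for name in sorted(metrics):
    print(name, metrics[name])
```

In this sample, ss.common is a hub (Ca = 3, Ce = 0, I = 0), ss.reporting is a leaf (Ca = 0, Ce = 3, I = 1), and ss.ticketmgmt sits in between with Ca = 2, Ce = 2, I = 0.5.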
Fitness functions for governance
- No circular dependencies fitness function: use ArchUnit (Java), Dependency Cruiser (JavaScript), or similar to fail the build if any circular dependency is introduced. This is one of the most important fitness functions in the chapter.
- Maximum afferent coupling fitness function: alert when a non-common component’s Ca exceeds a threshold (e.g., Ca > 5), suggesting it is taking on shared-domain responsibilities.
- Maximum efferent coupling fitness function: alert when a component’s Ce exceeds a threshold, suggesting it is either doing too much or its boundaries are wrong.
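To illustrate what a circular-dependency check does underneath, here is a small depth-first search over a component dependency map. This is a plain-Python sketch, not how ArchUnit or Dependency Cruiser are invoked; the function name and the sample graphs are assumptions.

```python
from typing import Dict, List, Optional, Set


def find_cycle(deps: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a path (first node repeated at the
    end), or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for nxt in sorted(deps.get(node, ())):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a cycle is closed.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in sorted(deps):
        if color.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


# Hypothetical graphs: one tangled, one after the cycle has been broken.
tangled = {
    "ss.ticketmgmt": {"ss.customer"},
    "ss.customer": {"ss.ticketmgmt"},
    "ss.common": set(),
}
untangled = {
    "ss.ticketmgmt": {"ss.customerprofile"},
    "ss.customer": {"ss.customerprofile"},
    "ss.customerprofile": set(),
}
print(find_cycle(tangled))
print(find_cycle(untangled))
```

A build step would fail when find_cycle returns a path and print that path so the offending imports are easy to locate. The recursive form is fine for component-level graphs, which are small; a class-level graph in a very large codebase might need an iterative traversal.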
Trade-offs
| Concern | Detail |
|---|---|
| Tooling investment | Generating accurate dependency graphs for large codebases requires investment in static analysis tooling. |
| False positives | Some circular dependencies in legacy code are intentional (e.g., framework-generated). Fitness functions need a whitelist/exclusion mechanism. |
| Runtime vs. compile-time coupling | Dependency analysis captures compile-time coupling. Runtime coupling (service calls, events) must be tracked separately. |
| Effort to untangle | Breaking long-standing circular dependencies in a large monolith can take weeks. Prioritize by impact on service extraction order. |
Sysops Squad application
The team discovers a circular dependency between ss.ticketmgmt and ss.customer: ticket management queries customer data directly, and customer data calls back into ticket management to update customer status. They resolve it by extracting an ss.customerprofile component that owns only read-side customer data; ss.ticketmgmt reads from it but never writes back. The write-back moves to an asynchronous event.
Pattern 5: Create Component Domains
What it does
Having well-sized, flattened, dependency-clean components, the architect now groups them into domains — logical clusters of related components that will eventually become service boundaries. A domain corresponds roughly to a DDD bounded context. Components in the same domain are allowed to call each other directly (service-internal calls); components in different domains must communicate only through defined interfaces (cross-service calls).
How to apply
- Group by business capability: assign each component to a domain based on its primary business responsibility. Components that collaborate frequently on the same user-facing feature belong in the same domain.
- Apply the “shared concepts” test: two components belong in the same domain if:
  - They share the same primary entity (e.g., both deal primarily with Ticket).
  - They are always deployed together in practice.
  - Separating them would require constant synchronous cross-service calls.
- Name the domains explicitly: give each domain a name that reflects the business capability, not the technology, such as TicketManagement, CustomerNotifications, BillingAndPayments, ExpertManagement, Reporting.
- Namespace the domain: restructure the codebase so that all components within a domain share a common second-level prefix: ss.ticket.*, ss.customer.*, ss.billing.*. This makes the domain boundary visible in the code.
- Validate domain cohesion: calculate cross-domain dependency ratios. A well-formed domain has most of its dependencies internal (intra-domain) and few external (cross-domain).
Fitness functions for governance
- Cross-domain dependency fitness function: alert when the ratio of cross-domain dependencies to intra-domain dependencies exceeds a threshold. A domain that calls more things outside itself than inside is likely mis-bounded.
- Domain namespace fitness function: enforce that component namespaces match their domain assignment in the architecture decision record. Prevents “shadow” components living outside their declared domain.
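One way to compute the cross-domain dependency ratio is sketched below. It assumes a component dependency map and a component-to-domain assignment are available, and it ignores dependencies on ss.common because the common library sits outside all domains. The component names, the ignored set, and the infinite-ratio convention for domains with no internal dependencies are assumptions.

```python
from typing import Dict, FrozenSet, Set


def cross_domain_ratio(deps: Dict[str, Set[str]],
                       domain_of: Dict[str, str],
                       domain: str,
                       ignored: FrozenSet[str] = frozenset({"ss.common"})
                       ) -> float:
    """Outgoing cross-domain dependencies divided by intra-domain ones."""
    intra = cross = 0
    for source, targets in deps.items():
        if domain_of.get(source) != domain:
            continue
        for target in targets:
            if target in ignored:
                continue
            if domain_of.get(target) == domain:
                intra += 1
            else:
                cross += 1
    if intra == 0:
        # Assumption: no internal dependencies at all is the worst case.
        return float("inf") if cross else 0.0
    return cross / intra


# Hypothetical assignment and dependency map.
domain_of = {
    "ss.ticket.create": "TicketManagement",
    "ss.ticket.assign": "TicketManagement",
    "ss.ticket.history": "TicketManagement",
    "ss.customer.profile": "CustomerManagement",
    "ss.customer.notification": "CustomerManagement",
}
deps = {
    "ss.ticket.create": {"ss.ticket.assign", "ss.ticket.history",
                         "ss.customer.profile", "ss.common"},
    "ss.ticket.assign": {"ss.ticket.history"},
    "ss.ticket.history": set(),
    "ss.customer.profile": {"ss.customer.notification"},
    "ss.customer.notification": {"ss.ticket.history", "ss.ticket.create"},
}
ticket_ratio = cross_domain_ratio(deps, domain_of, "TicketManagement")
customer_ratio = cross_domain_ratio(deps, domain_of, "CustomerManagement")
print(ticket_ratio, customer_ratio)
```

With a threshold of 1.0, TicketManagement (one cross-domain dependency against three internal ones) passes, while the hypothetical CustomerManagement (two against one) would be flagged as likely mis-bounded.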
Trade-offs
| Concern | Detail |
|---|---|
| Domain boundaries are subjective | Two architects may draw domain lines differently. The book recommends using dependency data (Pattern 4) to guide the decision rather than relying solely on intuition. |
| Domains are not yet services | A domain is a grouping concept, not a deployment unit. The team must resist deploying each domain immediately; Pattern 6 governs that decision. |
| Fine-grained vs. coarse-grained | Too-fine domains (one component per domain) produce a nanoservices problem. Too-coarse domains (one domain for everything) recreate the monolith. Aim for 3-8 components per domain. |
Sysops Squad application
The team defines five domains: TicketManagement (ticket creation, workflow, assignment, history), CustomerManagement (customer profile, customer notifications), ExpertManagement (expert profiles, scheduling, availability), BillingAndPayments (billing, invoices, payments), and Reporting (surveys, reporting, analytics). The ss.common library sits outside all domains.
Pattern 6: Create Domain Services
What it does
This is the culminating pattern: each domain identified in Pattern 5 becomes a candidate for a separate, independently deployable service. The pattern provides guidance on which domains to extract first and how to move the domain’s code into a service without breaking the monolith while migration is ongoing.
How to apply
- Rank domains by coupling: calculate each domain’s cross-domain dependency count. The domain with the fewest cross-domain dependencies is the easiest to extract (least impact on remaining monolith). Start with the least-coupled domain.
- Assess data ownership: each domain service will eventually own its own database schema. For each domain, identify the database tables it owns. Domains whose tables are heavily shared with other components present the greatest data-separation challenge (addressed more deeply in later chapters).
- Choose an extraction strategy from two main options:
  - Branch by abstraction: introduce an interface in front of the domain’s functionality; the monolith calls the interface. Then swap the implementation from the in-process class to a remote service call. Keeps the monolith working throughout.
  - Strangler fig: route traffic to the new service for specific endpoints while the monolith handles others. Progressively move endpoints until the monolith’s portion of that domain is dead.
- Define the service API: specify the contract (REST, gRPC, messaging) that the new domain service exposes. This contract replaces all the in-process method calls that other components made to this domain.
- Migrate data: move the domain’s tables to a dedicated database. Use dual-write or event-sourcing patterns to keep data consistent during the migration window.
- Retire the monolith code: once the service is deployed and all callers have been updated, delete the domain’s code from the monolith.
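The branch-by-abstraction option can be sketched in a few lines. Everything named here (the ReportingGateway interface, the two implementations, the URL path, the toggle) is hypothetical and only illustrates the shape of the technique: callers depend on an interface, and the implementation behind it is swapped from in-process to remote. The transport is injected as a plain callable so the sketch runs without a network.

```python
from typing import Callable, Dict, List, Protocol


class ReportingGateway(Protocol):
    """The abstraction the rest of the monolith calls."""
    def ticket_count(self, customer_id: str) -> int: ...


class InProcessReporting:
    """Original implementation: runs inside the monolith."""
    def __init__(self, tickets_by_customer: Dict[str, List[str]]) -> None:
        self._tickets = tickets_by_customer

    def ticket_count(self, customer_id: str) -> int:
        return len(self._tickets.get(customer_id, []))


class RemoteReporting:
    """New implementation: delegates to the extracted service."""
    def __init__(self, transport: Callable[[str], Dict[str, int]]) -> None:
        self._transport = transport  # e.g. a thin HTTP client wrapper

    def ticket_count(self, customer_id: str) -> int:
        payload = self._transport(
            f"/reports/customers/{customer_id}/ticket-count")
        return int(payload["count"])


def monthly_summary(gateway: ReportingGateway, customer_id: str) -> str:
    """Caller code: unchanged regardless of which implementation is live."""
    return f"{customer_id}: {gateway.ticket_count(customer_id)} tickets"


def fake_transport(path: str) -> Dict[str, int]:
    # Stand-in for a real network call, for illustration only.
    return {"count": 2}


use_remote = False  # hypothetical toggle flipped during migration
in_process = InProcessReporting({"c-1": ["t-10", "t-11"]})
remote = RemoteReporting(fake_transport)
gateway: ReportingGateway = remote if use_remote else in_process
print(monthly_summary(gateway, "c-1"))
```

Because the caller only sees the interface, the toggle can be flipped per environment and flipped back if the remote path misbehaves, which is what keeps the monolith working throughout.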
Fitness functions for governance
- Service dependency fitness function: enforce that the extracted service’s code has no compile-time imports pointing back into the monolith codebase. Prevents accidental in-process coupling after extraction.
- Data ownership fitness function: alert when a service’s code accesses database tables declared as owned by another service. This is the data coupling equivalent of a circular import.
- API contract fitness function: use consumer-driven contract tests (e.g., Pact) to ensure the service’s API matches what all consumers expect.
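A minimal sketch of the data ownership check, assuming two inputs exist: a declared table-to-owner map and a per-service list of tables actually accessed (for example, harvested from query logs or ORM mappings). The table and service names are made up for illustration.

```python
from typing import Dict, List, Set, Tuple


def ownership_violations(owner_of: Dict[str, str],
                         accesses: Dict[str, Set[str]]
                         ) -> List[Tuple[str, str, str]]:
    """Return (service, table, declared owner) for every access to a
    table that is declared as owned by a different service."""
    violations: List[Tuple[str, str, str]] = []
    for service in sorted(accesses):
        for table in sorted(accesses[service]):
            owner = owner_of.get(table)
            # Tables with no declared owner are skipped in this sketch;
            # a stricter check could report them as well.
            if owner is not None and owner != service:
                violations.append((service, table, owner))
    return violations


# Hypothetical ownership declarations and observed accesses.
owner_of = {
    "ticket": "TicketManagement",
    "ticket_history": "TicketManagement",
    "customer": "CustomerManagement",
    "survey_result": "Reporting",
}
accesses = {
    "TicketManagement": {"ticket", "ticket_history"},
    "Reporting": {"survey_result", "ticket"},
    "CustomerManagement": {"customer"},
}
violations = ownership_violations(owner_of, accesses)
print(violations)
```

In this sample, Reporting reading the ticket table directly is the single reported violation; the usual fix is to obtain that data through the owning service's API or a replicated read model.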
Trade-offs
| Concern | Detail |
|---|---|
| Extraction order matters | Starting with the most-coupled domain first dramatically increases risk and cost. Always start with low-coupling, low-data-sharing domains. |
| Network latency | In-process calls become network calls. Previously trivial operations may now require caching, timeouts, and retry logic. |
| Data consistency | Moving data ownership from a shared DB to per-service DBs introduces eventual consistency challenges. This is the biggest long-term cost. |
| Operational complexity | Each new service adds CI/CD pipelines, monitoring, alerting, and on-call rotation. Factor this overhead into the extraction decision. |
| Incremental vs. big-bang | The book strongly recommends incremental extraction (one domain at a time). Big-bang rewrites have a very high failure rate. |
Sysops Squad application
The team decides to extract Reporting first — it has the fewest inbound dependencies, owns its own data tables, and can tolerate slight data staleness (read-only analytics). TicketManagement is extracted last because it has the most cross-domain dependencies and the most shared data. They use the strangler fig pattern throughout.
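The extraction order can be derived from the same dependency data used in earlier patterns. The sketch below counts, for each domain, every dependency edge that crosses its boundary in either direction and sorts ascending. The dependency map is hypothetical and only arranged so that Reporting comes out first and TicketManagement last; ties are broken alphabetically, which is an arbitrary choice.

```python
from collections import Counter
from typing import Dict, List, Set


def extraction_order(deps: Dict[str, Set[str]],
                     domain_of: Dict[str, str]) -> List[str]:
    """Domains sorted from fewest to most cross-domain dependencies."""
    crossing = Counter({domain: 0 for domain in set(domain_of.values())})
    for source, targets in deps.items():
        src = domain_of.get(source)
        for target in targets:
            dst = domain_of.get(target)
            if src is None or dst is None or src == dst:
                continue  # unassigned (e.g. common) or intra-domain edge
            crossing[src] += 1
            crossing[dst] += 1
    return sorted(crossing, key=lambda d: (crossing[d], d))


# Hypothetical component-to-domain assignment and dependencies.
domain_of = {
    "ss.ticketmgmt": "TicketManagement",
    "ss.ticketworkflow": "TicketManagement",
    "ss.customerprofile": "CustomerManagement",
    "ss.expert": "ExpertManagement",
    "ss.billing": "BillingAndPayments",
    "ss.reporting": "Reporting",
}
deps = {
    "ss.ticketmgmt": {"ss.ticketworkflow", "ss.customerprofile", "ss.expert"},
    "ss.ticketworkflow": {"ss.expert", "ss.billing"},
    "ss.billing": {"ss.customerprofile"},
    "ss.reporting": {"ss.ticketmgmt"},
    "ss.expert": set(),
    "ss.customerprofile": set(),
}
order = extraction_order(deps, domain_of)
print(order)
```

A count like this covers only code coupling. Data sharing and tolerance for staleness, which the team also weighed, would need to be layered on top before committing to an order.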
Sequencing the Patterns
The six patterns form a pipeline — each produces output that the next pattern consumes. They are not fully sequential (some iteration is needed), but the general flow is:
[Raw Monolith Codebase]
|
v
Pattern 1: Identify and Size Components
→ Inventory + size assessment of all components
|
v
Pattern 2: Gather Common Domain Components
→ Common infrastructure extracted to ss.common
|
v
Pattern 3: Flatten Components
→ All components at uniform depth (root.domain)
|
v
Pattern 4: Determine Component Dependencies
→ Dependency graph built; circular deps broken
|
v
Pattern 5: Create Component Domains
→ Components grouped into business-capability domains
|
v
Pattern 6: Create Domain Services
→ Each domain extracted as a deployable service
|
v
[Distributed Architecture]
Iteration points:
- After Pattern 3 (Flatten), the team may discover some components need re-sizing → loop back to Pattern 1.
- After Pattern 4 (Dependencies), breaking a circular dependency may create a new component → loop back to Pattern 2 and Pattern 3.
- After Pattern 5 (Domains), a domain may be too large → split it and re-run Pattern 4 to verify the split.
The patterns are designed to be applied in order but revisited as needed. Real migrations rarely go through each pattern exactly once.
Fitness Functions for Governance
Fitness functions are automated architectural checks — the term is borrowed from evolutionary computation. An architectural fitness function is any mechanism that evaluates a structural or behavioral property of the system and produces a pass/fail (or metric) result. They are run in CI/CD pipelines to catch drift.
Summary Table
| Pattern | Fitness Function | Tool Examples |
|---|---|---|
| 1 — Identify & Size | Component size % threshold | ArchUnit, custom metrics scripts |
| 1 — Identify & Size | Component count bounds | ArchUnit, SonarQube |
| 2 — Common Domain | No non-common refs to shared utilities | ArchUnit, Dependency Cruiser |
| 2 — Common Domain | Shared library version alignment | Dependabot, Renovate |
| 3 — Flatten | Max namespace depth = 2 | ArchUnit, custom script |
| 3 — Flatten | No orphan classes in root namespace | ArchUnit |
| 4 — Dependencies | No circular dependencies | ArchUnit, JDepend, Dependency Cruiser |
| 4 — Dependencies | Max afferent coupling threshold | JDepend, ArchUnit |
| 4 — Dependencies | Max efferent coupling threshold | JDepend, ArchUnit |
| 5 — Domains | Cross-domain dependency ratio | Custom ArchUnit rules |
| 5 — Domains | Namespace matches domain assignment | ArchUnit |
| 6 — Domain Services | No imports back into monolith | ArchUnit |
| 6 — Domain Services | Data ownership (no cross-schema queries) | Custom DB audit |
| 6 — Domain Services | API consumer-driven contracts | Pact |
Why fitness functions matter: Without automated governance, decomposition decisions degrade over time. Engineers under deadline pressure take shortcuts — a quick import from outside a domain boundary, a helper class left in the root namespace, a utility copied rather than placed in common. Fitness functions catch these violations in CI before they accumulate into the same tangle the team started with.
Key characteristics of good fitness functions:
- Automated — must run without human judgment.
- Fast — should complete within the normal CI build window.
- Specific — each function tests one architectural property.
- Threshold-based — where possible, express as a metric with a configurable threshold rather than a binary rule, so thresholds can be tightened incrementally.
Trade-off Summary
| Dimension | Component-Based Decomposition | Alternative: Domain-Driven Design alone | Alternative: Big-bang rewrite |
|---|---|---|---|
| Basis for decisions | Structural analysis of existing code | Conceptual modeling of business domains | Greenfield design |
| Risk | Low to medium (incremental) | Medium (alignment with actual code may be poor) | Very high (all-or-nothing) |
| Time to first service | Weeks to months | Months | 6-18+ months |
| Legacy compatibility | Preserves monolith during migration | Requires parallel implementation | Monolith frozen during rewrite |
| Automation potential | High (fitness functions throughout) | Medium | Low during build, high post-launch |
| Handles tech debt | Forces confrontation of coupling/naming debt | May hide tech debt behind clean domain model | Leaves tech debt in old code |
| Team skill required | Static analysis, architectural metrics | Domain modeling, event storming | Full-stack redesign |
| Failure mode | Slow migration, partial decomposition | Misaligned service boundaries | Total failure, monolith retained |
The component-based approach is particularly strong when:
- The monolith has years of accumulated code without clear boundaries.
- The team cannot afford a rewrite but needs to progressively modernize.
- There is organizational pressure to show incremental progress.
It is less appropriate when:
- The monolith is being retired entirely in favor of a purchased SaaS product.
- The existing codebase is so degraded (no package structure, everything in one namespace) that analysis produces no useful signal.
Sysops Squad Saga: End-to-End
The Sysops Squad is a fictional company running a support-ticketing platform as a monolith. Customers report problems, tickets are created, experts are assigned, work is done, billing occurs, and surveys are sent. The team wants to migrate to a distributed architecture to improve scalability and independent deployability.
Starting state: A single deployable JAR with namespace chaos, a god-component for ticketing, scattered utility code, and at least one circular dependency between ticketing and customer management.
After Pattern 1: Component inventory reveals 12 namespaces. ss.ticket is 38% of the codebase (too large). ss.login is 0.8% (too small, will be absorbed).
After Pattern 2: Logging, DB utilities, and notification helpers move to ss.common. The team decides ss.common.notification will eventually become a standalone notification service.
After Pattern 3: Deep namespace paths (e.g., ss.ticket.domain.repository.impl) are collapsed. ss.ticket is still too large, so it is split into ss.ticketmgmt and ss.ticketworkflow.
After Pattern 4: The circular dependency between ss.ticketmgmt and ss.customer is found and broken. The dependency graph now has no cycles. Afferent coupling analysis confirms ss.common is the most-depended-upon component (expected).
After Pattern 5: Five domains are defined (TicketManagement, CustomerManagement, ExpertManagement, BillingAndPayments, Reporting), each containing 2-4 components.
After Pattern 6: Reporting is extracted first (fewest dependencies, read-only data). ExpertManagement is second. CustomerManagement and BillingAndPayments follow. TicketManagement is last. The notification service emerges from ss.common.notification as a cross-domain shared service.
End state: Five domain services + one shared notification service, each with its own deployment pipeline. The monolith is retired.
Key Takeaways
- Components are the unit of decomposition: work at the component (namespace/package) level first, not at the class or method level and not at the service level. The component is the right granularity for architectural decision-making.
- The six patterns are sequential but iterative: each pattern builds on the previous, but findings in later patterns often require revisiting earlier ones. Plan for at least two passes through the full sequence.
- Size matters before structure: a component that is too large cannot be cleanly analyzed for dependencies or domain assignment. Get sizing right (Pattern 1) before anything else.
- Common infrastructure is a first-class concern: failing to identify and consolidate shared code early (Pattern 2) leads to either duplication (maintenance nightmare) or hidden shared-service coupling discovered late in the migration.
- Hierarchy depth is an enemy of clarity: deep namespace hierarchies make it impossible to see component boundaries. Flatten first, then analyze.
- Circular dependencies are the most dangerous finding: they are the single biggest blocker to clean service extraction. Pattern 4’s circular dependency fitness function should be enabled permanently, not just during migration.
- Domain grouping is informed by data, not just intuition: use the dependency graph from Pattern 4 to validate domain groupings (Pattern 5). Two components with very high mutual coupling should be in the same domain; if they are not, the boundary is wrong.
- Start extraction with the least-coupled domain: Pattern 6’s most important guidance is sequencing. The wrong extraction order dramatically increases risk. Always work from the periphery inward.
- Fitness functions make governance scalable: without automation, decomposition decisions erode. A CI-enforced fitness function costs hours to set up and saves weeks of re-tangling over the life of the project.
- The strangler fig beats the big bang: the book consistently advocates incremental extraction over wholesale rewrites. Each incremental step can be validated, and the monolith continues to operate throughout.
Related Resources
- ch04-architectural-decomposition — the broader decomposition decision framework (tactical vs. strategic forking, migration patterns)
- ch06-pulling-apart-operational-data — the data-side companion to this chapter; service extraction requires data decomposition too
- fitness-functions-overview — cross-chapter reference on fitness functions as an evolutionary architecture tool
- ddd-bounded-contexts — the conceptual foundation for “domain” in Pattern 5
- strangler-fig-pattern — migration pattern used in Pattern 6
- coupling-metrics-reference — afferent/efferent coupling, instability metric (Robert Martin)
Last Updated: 2026-05-30