Chapter 5: Component-Based Decomposition Patterns

saht decomposition component-patterns modularity fitness-functions

Status: Notes complete


Overview

When an organization decides to break a monolith apart, the hardest question is always where to cut. Chapter 5 answers that question systematically. The authors present six sequential decomposition patterns — each a concrete analytical step — that together form a migration workflow from monolith to a set of well-bounded services.

The core philosophy is component-based decomposition: a component is the unit of analysis (typically a top-level namespace, package, or module in the codebase). By studying components — their size, their dependencies, their shared elements, their hierarchical depth — an architect can make principled decisions about service boundaries rather than guessing.

The chapter also introduces fitness functions as the governance mechanism: automated checks (metrics, coupling scores, dependency rules) that alert the team when decomposition decisions are violated as the codebase evolves.

The Sysops Squad Saga appears after each pattern: the team behind the fictional Sysops Squad application (a support-ticketing monolith) applies each pattern in sequence, which makes the abstract concrete.


The Six Decomposition Patterns

Pattern 1: Identify and Size Components

What it does

Before any decomposition can happen, the architect must first see the components. This pattern establishes an inventory of all top-level components in the monolith and evaluates whether each one is appropriately sized — neither a monolithic god-component nor an over-granular sliver.

How to apply

  1. Inventory components — enumerate every top-level namespace/package in the codebase. Each namespace represents a candidate component. Map the namespace to the domain concept it represents (e.g., ss.ticket, ss.customer, ss.billing).

  2. Measure relative size — measure the percentage of overall codebase each component occupies (line count, class count, or statement count are all valid proxies). Components that are too large will be harder to split into services cleanly; they contain too much mixed responsibility. Components that are too small are likely sub-function fragments that belong inside another component.

  3. Apply size heuristics — the book suggests that no single component should exceed roughly 10–15% of total codebase size (a rough upper bound). Components below ~1–2% are suspects for being too granular.

  4. Document component responsibilities — for each component, write a brief statement of what it owns. This statement will be refined in subsequent patterns.

Fitness functions for governance

  • Component size fitness function: flag any component whose size exceeds a configurable percentage threshold. Run this as part of the CI pipeline so growth is caught early.
  • Component count fitness function: alert when the total component count drifts outside an expected range (too few suggests components collapsed; too many suggests fragmentation).
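The sizing steps and the size fitness function above can be sketched as follows. This is a minimal illustration, not the book's tooling: the statement counts are hypothetical numbers chosen to reproduce the 38% and 0.8% figures in these notes, and the 15% / 1% thresholds are the rough bounds from step 3, passed as configurable parameters.

```python
def component_shares(statement_counts):
    """Return each component's share of the codebase as a percentage."""
    total = sum(statement_counts.values())
    return {name: 100.0 * count / total for name, count in statement_counts.items()}


def size_violations(statement_counts, max_pct=15.0, min_pct=1.0):
    """Return (too_large, too_small) component names for the given size band."""
    shares = component_shares(statement_counts)
    too_large = sorted(name for name, pct in shares.items() if pct > max_pct)
    too_small = sorted(name for name, pct in shares.items() if pct < min_pct)
    return too_large, too_small


# Hypothetical statement counts per top-level namespace (illustrative only).
counts = {
    "ss.ticket": 3800,
    "ss.customer": 1400,
    "ss.billing": 1300,
    "ss.expert": 1200,
    "ss.reporting": 1220,
    "ss.survey": 1000,
    "ss.login": 80,
}

too_large, too_small = size_violations(counts)
print("too large:", too_large)
print("too small:", too_small)
```

In a CI pipeline, a non-empty `too_large` list would typically fail the build or raise an alert, while `too_small` would be reported for review.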

Trade-offs

Concern | Detail
Line-count as proxy | LOC is a rough proxy; it can be gamed. Using class/file count or architectural fitness tools (e.g., ArchUnit, JDepend) is more reliable.
Namespace ≠ domain | Legacy code often has namespace sprawl unrelated to domain boundaries. Cleanup may be required before sizing means anything.
Cost of inventory | For large monoliths (millions of LOC), automated tooling is essential. Manual inventory is error-prone.

Sysops Squad application

The team maps their monolith’s namespaces and discovers their ss.ticket component holds ~38% of total code — a clear god-component. Meanwhile ss.login is under 1%. These findings drive all subsequent decisions. Several namespaces turn out to be pure infrastructure (logging, security, database utilities) that will be handled in Pattern 2.


Pattern 2: Gather Common Domain Components

What it does

Shared infrastructure code (logging frameworks, authentication utilities, database connection pools, error handling libraries, notification helpers) tends to be scattered across multiple components through copy-paste or loose imports. If this code is later split into separate services, each service will either duplicate it or end up coupled to a shared library (at build time) or a shared service (at runtime). This pattern identifies those cross-cutting concerns and consolidates them into a common domain component (or set of components), which then becomes a candidate for a shared service or library.

How to apply

  1. Identify cross-cutting namespaces — scan all component namespaces for classes whose names or purposes are clearly infrastructure/utility: Logger, DBConnector, Authenticator, EmailSender, AuditTrail, etc.

  2. Measure coupling breadth — for each candidate common component, count how many other components import or call it. Any component imported by a majority of other components (say, >50%) is a strong candidate for the common domain.

  3. Consolidate into a common namespace — move or redirect all shared code to a single ss.common (or similar) top-level namespace. Sub-divide by function: ss.common.logging, ss.common.messaging, ss.common.persistence.

  4. Decide deployment model — the consolidated common domain can become:

    • A shared library (embedded in each service at build time) — avoids runtime coupling but requires careful versioning.
    • A shared service (deployed once, called at runtime) — eliminates duplication but introduces network coupling and a single point of failure.
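Step 2 (coupling breadth) can be made concrete with a small calculation. The sketch below assumes an import map has already been extracted by some static-analysis step; the component names `ss.logging` and `ss.dbutil` and the import edges are hypothetical, and the 50% threshold is the one suggested in step 2.

```python
def coupling_breadth(imports, candidate):
    """Fraction of the *other* components that import `candidate`."""
    others = [c for c in imports if c != candidate]
    if not others:
        return 0.0
    users = [c for c in others if candidate in imports[c]]
    return len(users) / len(others)


def common_candidates(imports, candidates, threshold=0.5):
    """Components imported by more than `threshold` of the other components."""
    return sorted(c for c in candidates if coupling_breadth(imports, c) > threshold)


# Hypothetical import map: component -> set of components it imports.
imports = {
    "ss.ticket": {"ss.logging", "ss.dbutil", "ss.customer"},
    "ss.customer": {"ss.logging", "ss.dbutil"},
    "ss.billing": {"ss.logging", "ss.dbutil", "ss.customer"},
    "ss.reporting": {"ss.logging"},
    "ss.logging": set(),
    "ss.dbutil": {"ss.logging"},
}

candidates = common_candidates(imports, imports.keys())
print(candidates)
```

Here `ss.customer` is imported by two of the five other components, so it stays below the threshold. That matches the caveat that shared domain logic is a coupling signal, not automatically common infrastructure.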

Fitness functions for governance

  • Common coupling fitness function: alert when any non-common component directly references classes that should live in the common domain. Prevents drift back into scattered shared code.
  • Shared library version drift: if using a library model, flag services that depend on different versions of the shared library.

Trade-offs

Concern | Detail
Library vs. service | Libraries are faster and simpler but create deployment coupling (all services must redeploy when the library changes). Services decouple deployments but add latency and failure modes.
Versioning complexity | Shared libraries across many services create diamond-dependency problems over time. Semantic versioning + strict compatibility policies are essential.
What qualifies? | Not everything shared belongs in common. Domain logic shared between two components is a signal of coupling, not a reason for a common service; address it in Pattern 5 (domains) instead.

Sysops Squad application

The team finds that logging, notifications, and database utilities are duplicated across five components. They consolidate into ss.common and decide on a shared library for logging/persistence (simple, stable) and a notification service for email/SMS (because notification channels change frequently and independent deployment is valuable).


Pattern 3: Flatten Components

What it does

Many real-world codebases have deep namespace hierarchies — namespaces nested three, four, or five levels deep. This depth makes it very hard to reason about component boundaries because a “component” at level 2 may contain hundreds of sub-namespaces at levels 3-5 that have their own internal structure. This pattern simplifies analysis by collapsing all code to a single level of nesting below the root namespace — making each second-level namespace a clean, visible component.

How to apply

  1. Map the current hierarchy — draw or list every namespace path in the codebase. Note the depth at which meaningful domain concepts appear.

  2. Define the target depth — the goal is <root>.<domain> (two levels). Everything below <domain> is internal implementation detail of that component, not a separate architectural component.

  3. Merge upward — any code in ss.ticket.domain.entity.Customer should be considered part of ss.ticket. The deep path is internal packaging convention, not a separate architectural boundary.

  4. Split downward only when necessary — if a second-level namespace is still too large after merging its children (violating Pattern 1’s size heuristic), it may need to be split into two sibling second-level namespaces (e.g., ss.ticket.create and ss.ticket.resolve become ss.ticketcreate and ss.ticketresolve).

  5. Handle orphan classes — classes that live directly in the root namespace (ss.* without a domain) must be assigned to the most appropriate second-level component or moved to ss.common.

Fitness functions for governance

  • Hierarchy depth fitness function: fail the build if any source file is found at nesting depth > 2 (below root). This prevents re-growth of deep hierarchies.
  • Orphan class fitness function: flag any class found in the root namespace rather than a named component.
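The "merge upward" and "orphan class" rules can be sketched as a simple mapping from fully qualified class names to `<root>.<domain>` components. This is an illustrative sketch only: the class names other than `ss.ticket.domain.entity.Customer` are hypothetical, and a real check would read names from the codebase rather than a hard-coded list.

```python
def component_of(namespace, root="ss"):
    """Map a namespace to its <root>.<domain> component; None if it is the bare root."""
    parts = namespace.split(".")
    if parts[0] != root:
        raise ValueError(f"{namespace!r} is outside root namespace {root!r}")
    if len(parts) < 2:
        return None  # class sits directly in the root namespace
    return ".".join(parts[:2])


def flatten(class_names, root="ss"):
    """Group fully qualified class names by component and collect orphans."""
    components, orphans = {}, []
    for qualified_name in class_names:
        namespace, _, _class_name = qualified_name.rpartition(".")
        component = component_of(namespace, root)
        if component is None:
            orphans.append(qualified_name)
        else:
            components.setdefault(component, []).append(qualified_name)
    return components, orphans


classes = [
    "ss.ticket.domain.entity.Customer",
    "ss.ticket.domain.repository.impl.TicketRepositoryImpl",  # hypothetical
    "ss.customer.Profile",                                    # hypothetical
    "ss.AppConfig",                                           # hypothetical orphan
]

components, orphans = flatten(classes)
print(components)
print(orphans)
```

A non-empty `orphans` list corresponds to a violation of the orphan class fitness function; each orphan needs a home in a named component or in ss.common.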

Trade-offs

Concern | Detail
Disruption | Flattening requires renaming namespaces, a mechanical refactor but one that touches many files. Use automated refactoring tools.
Premature flattening | Merging too aggressively can produce oversized second-level components, contradicting Pattern 1. Pattern 3 should iterate with Pattern 1.
Framework conventions | Some frameworks (Spring, Rails, Django) have opinionated directory structures that may conflict. Adapt the rule to the language/framework’s conventions.

Sysops Squad application

The Sysops Squad finds their codebase has namespaces four levels deep in places (e.g., ss.ticket.domain.repository.impl). After flattening, ss.ticket absorbs its sub-namespaces and emerges as a single (though large) component. This forces them to revisit Pattern 1 and split ss.ticket into ss.ticketmgmt and ss.ticketworkflow to meet size constraints.


Pattern 4: Determine Component Dependencies

What it does

Once components are well-sized and at a uniform depth, the architect must understand how they connect. This pattern uses afferent coupling (incoming dependencies, Ca) and efferent coupling (outgoing dependencies, Ce) to build a dependency graph. The goal is to identify dependency tangles — circular or highly entangled groups of components that cannot be cleanly separated into services.

How to apply

  1. Build the dependency matrix — for each pair of components (A, B), record whether A imports/calls B, B imports/calls A, or there is no dependency. This produces an N×N matrix.

  2. Calculate coupling metrics — for each component:

    • Afferent coupling (Ca): how many other components depend on this component. High Ca = many callers. Changing this component breaks many things.
    • Efferent coupling (Ce): how many other components this component depends on. High Ce = this component relies on many things.
    • Instability (I) = Ce / (Ca + Ce). Range 0 (stable, many dependents) to 1 (unstable, many dependencies).

  3. Visualize the dependency graph: draw a directed graph. Look for:

    • Circular dependencies: A → B → C → A. These are tangles that must be broken before components can become separate services.
    • Hub components: components with very high Ca — they are shared by many and are strong candidates for the common domain (Pattern 2 revisited).
    • Leaf components: high Ce, low Ca. These depend on many things but nothing depends on them. Good candidates to extract as services first.

  4. Break circular dependencies. Techniques:

    • Extract a new component: move the shared code that A and B both need into a new ss.shared component; both depend on it, but the cycle is broken.
    • Apply Dependency Inversion: introduce an interface in a neutral package; A and B depend on the interface rather than each other.
    • Merge: if A and B are tightly coupled because they represent one cohesive concept, merge them into a single component.
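As a worked example of step 2, the sketch below computes Ca, Ce, and instability I = Ce / (Ca + Ce) from a dependency map. The dependency edges are hypothetical and chosen only to show a hub (ss.common) and a leaf (ss.reporting); a component with no dependencies in either direction is assigned I = 0 here by convention, which is an assumption of this sketch.

```python
def coupling_metrics(deps):
    """Return {component: (Ca, Ce, I)} for a map of component -> dependencies."""
    metrics = {}
    for component, targets in deps.items():
        ce = len(targets - {component})  # outgoing: components this one uses
        ca = sum(
            1
            for other, other_targets in deps.items()
            if other != component and component in other_targets
        )  # incoming: components that use this one
        instability = ce / (ca + ce) if (ca + ce) else 0.0
        metrics[component] = (ca, ce, instability)
    return metrics


# Hypothetical dependency map (component -> components it depends on).
deps = {
    "ss.ticketmgmt": {"ss.customer", "ss.common"},
    "ss.customer": {"ss.ticketmgmt", "ss.common"},
    "ss.reporting": {"ss.ticketmgmt", "ss.customer", "ss.common"},
    "ss.common": set(),
}

metrics = coupling_metrics(deps)
for name, (ca, ce, instability) in sorted(metrics.items()):
    print(f"{name}: Ca={ca} Ce={ce} I={instability:.2f}")
```

With these edges, ss.common comes out as a hub (Ca = 3, I = 0) and ss.reporting as a leaf (Ca = 0, I = 1), which lines up with the hub and leaf descriptions in step 3.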

Fitness functions for governance

  • No circular dependencies fitness function: use ArchUnit (Java), Dependency Cruiser (JavaScript), or similar to fail the build if any circular dependency is introduced. This is one of the most important fitness functions in the chapter.
  • Maximum afferent coupling fitness function: alert when a non-common component’s Ca exceeds a threshold (e.g., Ca > 5), suggesting it is taking on shared-domain responsibilities.
  • Maximum efferent coupling fitness function: alert when a component’s Ce exceeds a threshold, suggesting it is either doing too much or its boundaries are wrong.
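Tools such as ArchUnit or Dependency Cruiser provide cycle checks out of the box. To show what such a check does, here is a tool-independent sketch using a depth-first search over a component dependency map. The "before" and "after" graphs are hypothetical simplifications of the Sysops Squad tangle, not data from the book.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of components, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, fully explored
    color = {node: WHITE for node in deps}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for target in sorted(deps.get(node, ())):
            state = color.get(target, WHITE)
            if state == GREY:  # back edge: target is already on the current path
                return path[path.index(target):] + [target]
            if state == WHITE:
                found = visit(target)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in sorted(deps):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical graph before the fix: ticket management and customer call each other.
before = {
    "ss.ticketmgmt": {"ss.customer"},
    "ss.customer": {"ss.ticketmgmt"},
    "ss.reporting": {"ss.ticketmgmt"},
}
# Hypothetical graph after extracting a read-only ss.customerprofile component.
after = {
    "ss.ticketmgmt": {"ss.customerprofile"},
    "ss.customer": {"ss.customerprofile"},
    "ss.reporting": {"ss.ticketmgmt"},
    "ss.customerprofile": set(),
}

cycle_before = find_cycle(before)
cycle_after = find_cycle(after)
print(cycle_before)
print(cycle_after)
```

A fitness function built on this would fail the build whenever `find_cycle` returns anything other than `None`, with an exclusion list for known, accepted cycles.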

Trade-offs

Concern | Detail
Tooling investment | Generating accurate dependency graphs for large codebases requires investment in static analysis tooling.
False positives | Some circular dependencies in legacy code are intentional (e.g., framework-generated). Fitness functions need a whitelist/exclusion mechanism.
Runtime vs. compile-time coupling | Dependency analysis captures compile-time coupling. Runtime coupling (service calls, events) must be tracked separately.
Effort to untangle | Breaking long-standing circular dependencies in a large monolith can take weeks. Prioritize by impact on service extraction order.

Sysops Squad application

The team discovers a circular dependency between ss.ticketmgmt and ss.customer: ticket management queries customer data directly, and customer data calls back into ticket management to update customer status. They resolve it by extracting an ss.customerprofile component that owns only read-side customer data; ss.ticketmgmt reads from it but never writes back. The write-back moves to an asynchronous event.


Pattern 5: Create Component Domains

What it does

Having well-sized, flattened, dependency-clean components, the architect now groups them into domains — logical clusters of related components that will eventually become service boundaries. A domain corresponds roughly to a DDD bounded context. Components in the same domain are allowed to call each other directly (service-internal calls); components in different domains must communicate only through defined interfaces (cross-service calls).

How to apply

  1. Group by business capability — assign each component to a domain based on its primary business responsibility. Components that collaborate frequently on the same user-facing feature belong in the same domain.

  2. Apply the “shared concepts” test — two components belong in the same domain if:

    • They share the same primary entity (e.g., both deal primarily with Ticket).
    • They are always deployed together in practice.
    • Separating them would require constant synchronous cross-service calls.

  3. Name the domains explicitly: give each domain a name that reflects the business capability, not the technology: TicketManagement, CustomerNotifications, BillingAndPayments, ExpertManagement, Reporting.

  4. Namespace the domain — restructure the codebase so that all components within a domain share a common second-level prefix: ss.ticket.*, ss.customer.*, ss.billing.*. This makes the domain boundary visible in the code.

  5. Validate domain cohesion — calculate cross-domain dependency ratios. A well-formed domain has most of its dependencies internal (intra-domain) and few external (cross-domain).

Fitness functions for governance

  • Cross-domain dependency fitness function: alert when the ratio of cross-domain dependencies to intra-domain dependencies exceeds a threshold. A domain that calls more things outside itself than inside is likely mis-bounded.
  • Domain namespace fitness function: enforce that component namespaces match their domain assignment in the architecture decision record. Prevents “shadow” components living outside their declared domain.
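The cohesion check in step 5 and the cross-domain dependency fitness function can be sketched as a ratio per domain. Everything in the example data is hypothetical: the component `ss.ticketassign`, the dependency edges, and the threshold of 1.0 (more cross-domain than intra-domain dependencies) are assumptions made for illustration. ss.common is left out because it sits outside all domains.

```python
def cross_domain_ratio(deps, domain_of, domain):
    """Ratio of cross-domain to intra-domain dependencies for one domain."""
    intra = cross = 0
    for component, targets in deps.items():
        if domain_of[component] != domain:
            continue
        for target in targets:
            if domain_of[target] == domain:
                intra += 1
            else:
                cross += 1
    if intra == 0:
        return float("inf") if cross else 0.0
    return cross / intra


def misbounded_domains(deps, domain_of, threshold=1.0):
    """Domains whose cross/intra ratio exceeds the threshold."""
    domains = set(domain_of.values())
    return sorted(d for d in domains if cross_domain_ratio(deps, domain_of, d) > threshold)


# Hypothetical domain assignment and dependency map.
domain_of = {
    "ss.ticketmgmt": "TicketManagement",
    "ss.ticketworkflow": "TicketManagement",
    "ss.ticketassign": "TicketManagement",
    "ss.customerprofile": "CustomerManagement",
    "ss.billing": "BillingAndPayments",
}
deps = {
    "ss.ticketmgmt": {"ss.customerprofile"},
    "ss.ticketworkflow": {"ss.ticketmgmt"},
    "ss.ticketassign": {"ss.ticketmgmt", "ss.ticketworkflow"},
    "ss.customerprofile": set(),
    "ss.billing": {"ss.customerprofile", "ss.ticketmgmt"},
}

ticket_ratio = cross_domain_ratio(deps, domain_of, "TicketManagement")
flagged = misbounded_domains(deps, domain_of)
print(ticket_ratio, flagged)
```

A single-component domain with only outward dependencies, like the hypothetical billing domain here, is flagged. That is the kind of finding that prompts a second look at the boundary rather than an automatic failure.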

Trade-offs

Concern | Detail
Domain boundaries are subjective | Two architects may draw domain lines differently. The book recommends using dependency data (Pattern 4) to guide the decision rather than relying solely on intuition.
Domains are not yet services | A domain is a grouping concept, not a deployment unit. The team must resist deploying each domain immediately; Pattern 6 governs that decision.
Fine-grained vs. coarse-grained | Too-fine domains (one component per domain) produce a nanoservices problem. Too-coarse domains (one domain for everything) recreate the monolith. Aim for 3-8 components per domain.

Sysops Squad application

The team defines five domains: TicketManagement (ticket creation, workflow, assignment, history), CustomerManagement (customer profile, customer notifications), ExpertManagement (expert profiles, scheduling, availability), BillingAndPayments (billing, invoices, payments), and Reporting (surveys, reporting, analytics). The ss.common library sits outside all domains.


Pattern 6: Create Domain Services

What it does

This is the culminating pattern: each domain identified in Pattern 5 becomes a candidate for a separate, independently deployable service. The pattern provides guidance on which domains to extract first and how to move the domain’s code into a service without breaking the monolith while migration is ongoing.

How to apply

  1. Rank domains by coupling — calculate each domain’s cross-domain dependency count. The domain with the fewest cross-domain dependencies is the easiest to extract (least impact on remaining monolith). Start with the least-coupled domain.

  2. Assess data ownership — each domain service will eventually own its own database schema. For each domain, identify the database tables it owns. Domains whose tables are heavily shared with other components present the greatest data-separation challenge (addressed more deeply in later chapters).

  3. Choose an extraction strategy — two main options:

    • Branch by abstraction: introduce an interface in front of the domain’s functionality; the monolith calls the interface. Then swap the implementation from the in-process class to a remote service call. Keeps the monolith working throughout.
    • Strangler fig: route traffic to the new service for specific endpoints while the monolith handles others. Progressively move endpoints until the monolith’s portion of that domain is dead.

  4. Define the service API: specify the contract (REST, gRPC, messaging) that the new domain service exposes. This contract replaces all the in-process method calls that other components made to this domain.

  5. Migrate data — move the domain’s tables to a dedicated database. Use dual-write or event-sourcing patterns to keep data consistent during the migration window.

  6. Retire the monolith code — once the service is deployed and all callers have been updated, delete the domain’s code from the monolith.
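Step 1 (rank domains by coupling) reduces to counting the dependency edges that cross a domain boundary and sorting. The sketch below counts each cross-domain edge against both the source and the target domain, which is one reasonable reading of "cross-domain dependency count" but an assumption of this example. The dependency edges and the component `ss.customernotify` are hypothetical, chosen so the result matches the extraction order described in these notes.

```python
def cross_domain_counts(deps, domain_of):
    """Count edges crossing a domain boundary, charged to both domains involved."""
    counts = {domain: 0 for domain in set(domain_of.values())}
    for component, targets in deps.items():
        for target in targets:
            src, dst = domain_of[component], domain_of[target]
            if src != dst:
                counts[src] += 1
                counts[dst] += 1
    return counts


def extraction_order(deps, domain_of):
    """Domains sorted from least to most cross-domain coupling (ties by name)."""
    counts = cross_domain_counts(deps, domain_of)
    return sorted(counts, key=lambda domain: (counts[domain], domain))


# Hypothetical domain assignment and dependency map.
domain_of = {
    "ss.reporting": "Reporting",
    "ss.expert": "ExpertManagement",
    "ss.customerprofile": "CustomerManagement",
    "ss.customernotify": "CustomerManagement",
    "ss.billing": "BillingAndPayments",
    "ss.ticketmgmt": "TicketManagement",
    "ss.ticketworkflow": "TicketManagement",
}
deps = {
    "ss.reporting": {"ss.ticketmgmt"},
    "ss.ticketmgmt": {"ss.expert", "ss.customerprofile"},
    "ss.ticketworkflow": {"ss.expert"},
    "ss.billing": {"ss.customerprofile", "ss.customernotify",
                   "ss.ticketmgmt", "ss.ticketworkflow"},
    "ss.expert": set(),
    "ss.customerprofile": set(),
    "ss.customernotify": set(),
}

order = extraction_order(deps, domain_of)
print(order)
```

The count is only the first filter; data ownership (step 2) can still move a low-coupling domain later in the order if its tables are heavily shared.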

Fitness functions for governance

  • Service dependency fitness function: enforce that the extracted service’s code has no compile-time imports pointing back into the monolith codebase. Prevents accidental in-process coupling after extraction.
  • Data ownership fitness function: alert when a service’s code accesses database tables declared as owned by another service. This is the data coupling equivalent of a circular import.
  • API contract fitness function: use consumer-driven contract tests (e.g., Pact) to ensure the service’s API matches what all consumers expect.
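The data ownership fitness function can be sketched as a comparison between a declared table-ownership map and observed table accesses. How the accesses are collected (query logs, ORM metadata, static scanning of SQL) is outside this sketch; the table names, owners, and access list below are hypothetical.

```python
def ownership_violations(table_owner, accesses):
    """Return (service, table, declared_owner) for each access to a table
    that the accessing service does not own. Unknown tables report owner None."""
    violations = []
    for service, table in accesses:
        owner = table_owner.get(table)
        if owner != service:
            violations.append((service, table, owner))
    return sorted(violations, key=lambda v: (v[0], v[1]))


# Hypothetical ownership declarations: table -> owning service.
table_owner = {
    "ticket": "TicketManagement",
    "customer": "CustomerManagement",
    "survey": "Reporting",
}
# Hypothetical observed accesses: (service, table).
accesses = [
    ("Reporting", "survey"),
    ("Reporting", "ticket"),
    ("TicketManagement", "ticket"),
]

violations = ownership_violations(table_owner, accesses)
print(violations)
```

During the migration window some violations are expected, so a check like this is more useful as an alert with a shrinking allow-list than as a hard build failure.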

Trade-offs

Concern | Detail
Extraction order matters | Starting with the most-coupled domain first dramatically increases risk and cost. Always start with low-coupling, low-data-sharing domains.
Network latency | In-process calls become network calls. Previously trivial operations may now require caching, timeouts, and retry logic.
Data consistency | Moving data ownership from a shared DB to per-service DBs introduces eventual consistency challenges. This is the biggest long-term cost.
Operational complexity | Each new service adds CI/CD pipelines, monitoring, alerting, and on-call rotation. Factor this overhead into the extraction decision.
Incremental vs. big-bang | The book strongly recommends incremental extraction (one domain at a time). Big-bang rewrites have a very high failure rate.

Sysops Squad application

The team decides to extract Reporting first: it has the fewest cross-domain dependencies, owns its own data tables, and can tolerate slight data staleness (read-only analytics). TicketManagement is extracted last because it has the most cross-domain dependencies and the most shared data. They use the strangler fig pattern throughout.


Sequencing the Patterns

The six patterns form a pipeline — each produces output that the next pattern consumes. They are not fully sequential (some iteration is needed), but the general flow is:

[Raw Monolith Codebase]
        |
        v
Pattern 1: Identify and Size Components
   → Inventory + size assessment of all components
        |
        v
Pattern 2: Gather Common Domain Components
   → Common infrastructure extracted to ss.common
        |
        v
Pattern 3: Flatten Components
   → All components at uniform depth (root.domain)
        |
        v
Pattern 4: Determine Component Dependencies
   → Dependency graph built; circular deps broken
        |
        v
Pattern 5: Create Component Domains
   → Components grouped into business-capability domains
        |
        v
Pattern 6: Create Domain Services
   → Each domain extracted as a deployable service
        |
        v
[Distributed Architecture]

Iteration points:

  • After Pattern 3 (Flatten), the team may discover some components need re-sizing → loop back to Pattern 1.
  • After Pattern 4 (Dependencies), breaking a circular dependency may create a new component → loop back to Pattern 2 and Pattern 3.
  • After Pattern 5 (Domains), a domain may be too large → split it and re-run Pattern 4 to verify the split.

The patterns are designed to be applied in order but revisited as needed. Real migrations rarely go through each pattern exactly once.


Fitness Functions for Governance

Fitness functions are automated architectural checks — the term is borrowed from evolutionary computation. An architectural fitness function is any mechanism that evaluates a structural or behavioral property of the system and produces a pass/fail (or metric) result. They are run in CI/CD pipelines to catch drift.

Summary Table

Pattern | Fitness Function | Tool Examples
1: Identify & Size | Component size % threshold | ArchUnit, custom metrics scripts
1: Identify & Size | Component count bounds | ArchUnit, SonarQube
2: Common Domain | No non-common refs to shared utilities | ArchUnit, Dependency Cruiser
2: Common Domain | Shared library version alignment | Dependabot, Renovate
3: Flatten | Max namespace depth = 2 | ArchUnit, custom script
3: Flatten | No orphan classes in root namespace | ArchUnit
4: Dependencies | No circular dependencies | ArchUnit, JDepend, Dependency Cruiser
4: Dependencies | Max afferent coupling threshold | JDepend, ArchUnit
4: Dependencies | Max efferent coupling threshold | JDepend, ArchUnit
5: Domains | Cross-domain dependency ratio | Custom ArchUnit rules
5: Domains | Namespace matches domain assignment | ArchUnit
6: Domain Services | No imports back into monolith | ArchUnit
6: Domain Services | Data ownership (no cross-schema queries) | Custom DB audit
6: Domain Services | API consumer-driven contracts | Pact

Why fitness functions matter: Without automated governance, decomposition decisions degrade over time. Engineers under deadline pressure take shortcuts — a quick import from outside a domain boundary, a helper class left in the root namespace, a utility copied rather than placed in common. Fitness functions catch these violations in CI before they accumulate into the same tangle the team started with.

Key characteristics of good fitness functions:

  • Automated — must run without human judgment.
  • Fast — should complete within the normal CI build window.
  • Specific — each function tests one architectural property.
  • Threshold-based — where possible, express as a metric with a configurable threshold rather than a binary rule, so thresholds can be tightened incrementally.

Trade-off Summary

Dimension | Component-Based Decomposition | Alternative: Domain-Driven Design alone | Alternative: Big-bang rewrite
Basis for decisions | Structural analysis of existing code | Conceptual modeling of business domains | Greenfield design
Risk | Low to medium (incremental) | Medium (alignment with actual code may be poor) | Very high (all-or-nothing)
Time to first service | Weeks to months | Months | 6-18+ months
Legacy compatibility | Preserves monolith during migration | Requires parallel implementation | Monolith frozen during rewrite
Automation potential | High (fitness functions throughout) | Medium | Low during build, high post-launch
Handles tech debt | Forces confrontation of coupling/naming debt | May hide tech debt behind clean domain model | Leaves tech debt in old code
Team skill required | Static analysis, architectural metrics | Domain modeling, event storming | Full-stack redesign
Failure mode | Slow migration, partial decomposition | Misaligned service boundaries | Total failure, monolith retained

The component-based approach is particularly strong when:

  • The monolith has years of accumulated code without clear boundaries.
  • The team cannot afford a rewrite but needs to progressively modernize.
  • There is organizational pressure to show incremental progress.

It is less appropriate when:

  • The monolith is being retired entirely in favor of a purchased SaaS product.
  • The existing codebase is so degraded (no package structure, everything in one namespace) that analysis produces no useful signal.

Sysops Squad Saga: End-to-End

The Sysops Squad is a fictional company running a support-ticketing platform as a monolith. Customers report problems, tickets are created, experts are assigned, work is done, billing occurs, and surveys are sent. The team wants to migrate to a distributed architecture to improve scalability and independent deployability.

Starting state: A single deployable JAR with namespace chaos, a god-component for ticketing, scattered utility code, and at least one circular dependency between ticketing and customer management.

After Pattern 1: Component inventory reveals 12 namespaces. ss.ticket is 38% of the codebase (too large). ss.login is 0.8% (too small, will be absorbed).

After Pattern 2: Logging, DB utilities, and notification helpers move to ss.common. The team decides ss.common.notification will eventually become a standalone notification service.

After Pattern 3: Deep namespace paths (e.g., ss.ticket.domain.repository.impl) are collapsed. ss.ticket is still too large, so it is split into ss.ticketmgmt and ss.ticketworkflow.

After Pattern 4: The circular dependency between ss.ticketmgmt and ss.customer is found and broken. The dependency graph now has no cycles. Afferent coupling analysis confirms ss.common is the most-depended-upon component (expected).

After Pattern 5: Five domains are defined (TicketManagement, CustomerManagement, ExpertManagement, BillingAndPayments, Reporting), each containing 2-4 components.

After Pattern 6: Reporting is extracted first (fewest dependencies, read-only data). ExpertManagement is second. CustomerManagement and BillingAndPayments follow. TicketManagement is last. The notification service emerges from ss.common.notification as a cross-domain shared service.

End state: Five domain services + one shared notification service, each with its own deployment pipeline. The monolith is retired.


Key Takeaways

  1. Components are the unit of decomposition — work at the component (namespace/package) level first, not at the class or method level and not at the service level. The component is the right granularity for architectural decision-making.

  2. The six patterns are sequential but iterative — each pattern builds on the previous, but findings in later patterns often require revisiting earlier ones. Plan for at least two passes through the full sequence.

  3. Size matters before structure — a component that is too large cannot be cleanly analyzed for dependencies or domain assignment. Get sizing right (Pattern 1) before anything else.

  4. Common infrastructure is a first-class concern — failing to identify and consolidate shared code early (Pattern 2) leads to either duplication (maintenance nightmare) or hidden shared-service coupling discovered late in the migration.

  5. Hierarchy depth is an enemy of clarity — deep namespace hierarchies make it impossible to see component boundaries. Flatten first, then analyze.

  6. Circular dependencies are the most dangerous finding — they are the single biggest blocker to clean service extraction. Pattern 4’s circular dependency fitness function should be enabled permanently, not just during migration.

  7. Domain grouping is informed by data, not just intuition — use the dependency graph from Pattern 4 to validate domain groupings (Pattern 5). Two components with very high mutual coupling should be in the same domain; if they are not, the boundary is wrong.

  8. Start extraction with the least-coupled domain — Pattern 6’s most important guidance is sequencing. The wrong extraction order dramatically increases risk. Always work from the periphery inward.

  9. Fitness functions make governance scalable — without automation, decomposition decisions erode. A CI-enforced fitness function costs hours to set up and saves weeks of re-tangling over the life of the project.

  10. The strangler fig beats the big bang — the book consistently advocates incremental extraction over wholesale rewrites. Each incremental step can be validated, and the monolith continues to operate throughout.


Last Updated: 2026-05-30