Chapter 4: Architectural Decomposition

saht decomposition coupling instability abstractness component-decomposition tactical-forking

Status: Notes complete

Overview

Chapter 4 moves from why to decompose (Chapter 3’s modularity drivers) to how to decompose. It begins with a critical prerequisite question: is the codebase even decomposable? Not every codebase can be cleanly separated; some are so entangled at the code level that they call for a fundamentally different decomposition approach. The chapter provides tools from software metrics (afferent/efferent coupling, abstractness, instability, distance from the main sequence) to diagnose decomposability.

The chapter then presents and compares two primary decomposition strategies: component-based decomposition (a careful, incremental approach that identifies component boundaries from existing code structure) and tactical forking (a “clone and prune” approach that starts from a copy of the whole system). Each strategy suits a different type of codebase and organizational context, and the authors provide explicit criteria for choosing between them.

The chapter’s central argument is that decomposition is not a free design activity — it has a cost proportional to the degree of existing coupling in the codebase. Before choosing a strategy, architects must measure coupling to understand what they are working with.

Core Concepts

Afferent coupling (Ca): The number of external components that depend on a given component. High afferent coupling means many things depend on this component — it is “widely used” and changes to it have broad impact. Also called “incoming coupling.”

Efferent coupling (Ce): The number of external components that a given component depends on. High efferent coupling means this component depends on many things — it is fragile and brittle because its behavior is contingent on many other components. Also called “outgoing coupling.”

Abstractness (A): The ratio of abstract classes and interfaces to total classes in a component. Ranges 0.0 to 1.0. A=0 means all concrete implementations; A=1 means all abstract definitions.

Instability (I): A derived metric: Ce / (Ca + Ce). Ranges 0.0 to 1.0. I=0 is maximally stable (it depends on nothing, and other components depend on it); I=1 is maximally unstable (it depends on other components, and nothing depends on it). Measures how susceptible a component is to change: the lower the value, the more it resists change.

Distance from the main sequence (D): A derived metric: |A + I - 1|. Ranges 0.0 to 1.0. D=0 means the component is on the ideal “main sequence” (balanced abstractness and instability). D approaching 1 means the component is in a “zone of pain” or “zone of uselessness.”

Zone of pain: High stability (low I) + low abstractness (low A) — concrete classes that many things depend on. Difficult to change because changes ripple everywhere, yet difficult to extend because nothing is abstract.

Zone of uselessness: High abstractness (high A) + high instability (high I) — abstract classes/interfaces that nothing actually uses. Dead weight in the codebase.

Main sequence: The ideal diagonal line from (A=1, I=0) to (A=0, I=1) — abstract components are stable (widely depended on), concrete components are unstable (free to change because fewer things depend on them).

Component-based decomposition: A systematic, incremental strategy that uses existing code structure (packages, namespaces, modules) as the basis for identifying service boundaries, then carefully moves components across those boundaries one at a time.

Tactical forking: A “clone and prune” strategy that starts by duplicating the entire monolith for each intended service, then progressively deletes the code that doesn’t belong in each copy until distinct services emerge.

Is the Codebase Decomposable? Coupling Analysis

Before choosing a decomposition strategy, architects must assess how entangled the codebase actually is. This assessment uses the software metrics framework from Robert C. Martin’s work on package coupling principles.

Afferent and Efferent Coupling

             Afferent Coupling (Ca)
             "What depends on me?"
                      ^
                      |
             [Component X]  ---------> Dependencies it uses
                                        (Efferent Coupling, Ce)
                                        "What do I depend on?"

Measuring Ca and Ce gives a picture of each component’s role:

High Ca, Low Ce: A foundation/utility component. Many things depend on it; it depends on little. Stable but risky to change (wide blast radius).
Low Ca, High Ce: A leaf/feature component. Few things depend on it; it depends on many. Easy to change in isolation, but fragile because it can be broken by changes in its dependencies.
High Ca, High Ce: A hub component — the most dangerous kind. Many things depend on it AND it depends on many things. Changes here are both risky and difficult. These are the components that make decomposition hard.
Low Ca, Low Ce: An isolated component. Easy to decompose — it’s nearly already independent.
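The four roles above can be derived mechanically once the dependencies are known. The following is a minimal sketch, assuming a hand-written dependency map; the component names, the `coupling_counts` and `role` helpers, and the cutoff of 2 for "high" are all hypothetical choices made for illustration, not values from the chapter.

```python
from collections import defaultdict

# Hypothetical dependency map: component -> set of components it depends on.
DEPENDS_ON = {
    "util": set(),
    "orders": {"util", "users", "billing"},
    "users": {"util"},
    "billing": {"util", "users"},
    "audit": set(),
}


def coupling_counts(depends_on):
    """Return {component: (Ca, Ce)} for a component -> dependencies map."""
    ca = defaultdict(int)
    for source, targets in depends_on.items():
        for target in targets:
            if target != source:  # a self-reference is not coupling
                ca[target] += 1
    return {
        name: (ca[name], len(targets - {name}))
        for name, targets in depends_on.items()
    }


def role(ca, ce, high=2):
    """Label a component with one of the four roles; `high` is an arbitrary cutoff."""
    if ca >= high and ce >= high:
        return "hub"
    if ca >= high:
        return "foundation"
    if ce >= high:
        return "leaf"
    return "isolated"


counts = coupling_counts(DEPENDS_ON)
for name, (ca, ce) in sorted(counts.items()):
    print(f"{name:8s} Ca={ca} Ce={ce} -> {role(ca, ce)}")
```

In a real codebase the map would come from a dependency analysis tool rather than being typed by hand, and the cutoff would need tuning to the size of the system.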

The Instability Metric (I)

I = Ce / (Ca + Ce)

where:
  Ce = efferent (outgoing) coupling
  Ca = afferent (incoming) coupling

Range: 0.0 (maximally stable) to 1.0 (maximally unstable)

Interpretation:

I ≈ 0: Very stable. Many things depend on it; it depends on few things. Changes propagate outward to many dependents. This component should be abstract (to allow extension without modification).
I ≈ 1: Very unstable. Few things depend on it; it depends on many things. Changes don’t propagate to many dependents. This component can afford to be concrete.

The stability principle (from Martin): Depend in the direction of stability. Components that are concrete (low A) should be unstable (high I) — they are free to change. Components that are stable (low I) must be abstract (high A) — their stability is achieved through abstraction, allowing extension without modification.
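As a worked example of the formula, the sketch below computes I for two hypothetical components. Returning 0.0 when a component has no coupling at all is an assumption of this sketch, since the formula itself is undefined when Ca + Ce = 0.

```python
def instability(ca: int, ce: int) -> float:
    """I = Ce / (Ca + Ce).

    Returns 0.0 for a component with no coupling (an assumption of this
    sketch; the formula is undefined when Ca + Ce == 0).
    """
    total = ca + ce
    return ce / total if total else 0.0


# A widely used component: nine dependents, one dependency.
print(instability(ca=9, ce=1))  # 0.1

# A leaf component: one dependent, nine dependencies.
print(instability(ca=1, ce=9))  # 0.9
```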

The Abstractness Metric (A)

A = (Number of abstract classes + interfaces) / (Total number of classes)

Range: 0.0 (fully concrete) to 1.0 (fully abstract)

Abstractness measures the degree to which a component relies on abstraction vs. implementation. High abstractness means behavior is primarily defined through interfaces/abstract classes that can be extended; low abstractness means behavior is in concrete classes that cannot be extended without modification.
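For a Python codebase, the ratio could be approximated by counting classes that still have unimplemented abstract methods. This is a sketch under that assumption; the four classes are hypothetical, and languages with explicit `interface` or `abstract` keywords would be counted differently.

```python
import inspect
from abc import ABC, abstractmethod


class PaymentGateway(ABC):
    """Abstract: declares a method that subclasses must implement."""

    @abstractmethod
    def charge(self, cents: int) -> bool: ...


class CardGateway(PaymentGateway):
    """Concrete: implements every abstract method."""

    def charge(self, cents: int) -> bool:
        return cents > 0


class Invoice:
    pass


class Receipt:
    pass


def abstractness(classes) -> float:
    """A = abstract classes / total classes; 0.0 for an empty component."""
    classes = list(classes)
    if not classes:
        return 0.0
    abstract = sum(1 for cls in classes if inspect.isabstract(cls))
    return abstract / len(classes)


a = abstractness([PaymentGateway, CardGateway, Invoice, Receipt])
print(a)  # 0.25: one abstract class out of four
```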

Distance from the Main Sequence (D)

D = |A + I - 1|

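Range: 0.0 (on the main sequence) to 1.0 (maximally distant, at A=0, I=0 or A=1, I=1)
Note: this is the normalized form. The perpendicular geometric distance to the line A + I = 1 is |A + I - 1| / sqrt(2), which ranges from 0.0 to ~0.71.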

The main sequence is the ideal relationship between abstractness and instability. Components on the main sequence are either:

Abstract and stable (frameworks, APIs, interfaces that many things depend on)
Concrete and unstable (implementations that few things depend on, free to change)
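Evaluating D = |A + I - 1| at the corners of the A/I square makes the range concrete. The sketch below is a direct transcription of the formula; the `distance` function name is an assumption.

```python
import math


def distance(a: float, i: float) -> float:
    """D = |A + I - 1|, the normalized distance from the main sequence."""
    return abs(a + i - 1.0)


# Both ends of the main sequence have D = 0.
print(distance(a=1.0, i=0.0))  # abstract and stable
print(distance(a=0.0, i=1.0))  # concrete and unstable

# The two far corners have D = 1.
print(distance(a=0.0, i=0.0))  # concrete and stable
print(distance(a=1.0, i=1.0))  # abstract and unstable

# A balanced component in the middle of the diagonal.
assert math.isclose(distance(a=0.5, i=0.5), 0.0)
```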

The Instability/Abstractness Graph

Abstractness (A)
1.0 |*                    Zone of
    |  *                  Uselessness
    |    *
0.7 |      *
    |        *   [Main Sequence]
0.5 |          *
    |            *
0.3 |              *
    | Zone of        *
    | Pain              *
0.0 |_____________________*___
    0.0   0.3   0.5   0.7  1.0
                              Instability (I)

Zone of Pain        (I≈0, A≈0): Concrete + stable. Hard to change, hard to extend.
                                Example: utility classes everything depends on.
Zone of Uselessness (I≈1, A≈1): Abstract + unstable. Abstractions that nothing depends on.
                                Example: abandoned framework interfaces.
Main Sequence       (D≈0):      Ideal. Abstract things are stable; concrete things are free.
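The three regions in the legend can be told apart from A and I alone: D says how far a component is from the diagonal, and the sign of A + I - 1 says on which side it falls. A minimal sketch follows; the `zone` helper and its `tolerance` of 0.3 are assumptions chosen for illustration, not thresholds from the chapter.

```python
def zone(a: float, i: float, tolerance: float = 0.3) -> str:
    """Classify a component by its position on the A/I graph.

    `tolerance` is an arbitrary cutoff for "close enough to the main sequence".
    """
    offset = a + i - 1.0
    if abs(offset) <= tolerance:
        return "main sequence"
    # Below the diagonal: too concrete for how stable it is.
    # Above the diagonal: too abstract for how unstable it is.
    return "zone of pain" if offset < 0 else "zone of uselessness"


print(zone(a=0.1, i=0.1))  # zone of pain
print(zone(a=0.9, i=0.9))  # zone of uselessness
print(zone(a=0.2, i=0.8))  # main sequence
```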

What D Tells You About Decomposability

D Score      Meaning                          Decomposition Implication
0.0 – 0.2    On or near main sequence         Clean component, good candidate for extraction
0.2 – 0.5    Some distance from ideal         May be extractable with refactoring
0.5 – 0.7    Significant structural issues    Decomposition will be painful; consider tactical forking
0.7 – 1.0    Zone of pain or uselessness      Very hard to decompose incrementally

A codebase where most components have D > 0.5 is a strong indicator that tactical forking may be more practical than incremental component-based decomposition — there’s no clean structure to work with.
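To turn "most components have D > 0.5" into a number, one option is to compute the share of components above the cutoff. The sketch below assumes per-package D scores have already been measured; the package names and scores are invented for illustration.

```python
def share_above(d_scores, cutoff=0.5):
    """Fraction of components whose D exceeds the cutoff."""
    if not d_scores:
        return 0.0
    return sum(1 for d in d_scores.values() if d > cutoff) / len(d_scores)


# Hypothetical per-package D scores, invented for illustration.
SCORES = {"pkg.a": 0.10, "pkg.b": 0.65, "pkg.c": 0.80, "pkg.d": 0.30, "pkg.e": 0.55}

share = share_above(SCORES)
print(f"{share:.0%} of components have D > 0.5")  # 60% of components have D > 0.5
```

A share well above one half would point toward tactical forking under the rule of thumb above; what counts as "most" remains a judgment call.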

Decomposition Approach 1: Component-Based Decomposition

What It Is

Component-based decomposition is an incremental, structure-preserving approach. It uses the existing package/namespace/module hierarchy as the starting point for identifying service candidates, then systematically moves cohesive groups of classes across service boundaries. The underlying assumption is that the codebase has some recognizable structure that maps to business domains, even if imperfectly.

This is the approach detailed extensively in Chapter 5 (Component-Based Decomposition Patterns). Chapter 4 introduces it as one of two strategic options.

How It Works (High Level)

Step 1: Identify existing components
        (packages, namespaces, modules in the monolith)
             |
             v
Step 2: Measure coupling between components
        (Ca, Ce, D scores per component)
             |
             v
Step 3: Find clusters of high cohesion / low coupling
        (components that mainly talk to each other)
             |
             v
Step 4: Define service boundaries around clusters
             |
             v
Step 5: Extract one service at a time, incrementally
        (strangler fig pattern)
             |
             v
Step 6: Verify independence before extracting the next

Advantages of Component-Based Decomposition

Preserves existing structure: Reuses the work already done in organizing the codebase. Developers are familiar with the structure.
Incremental risk: Each extraction step is independently testable and reversible. Mistakes affect one service, not the whole system.
Clear progress tracking: You can measure coupling metrics before and after each step to confirm improvement.
Naturally discovers domain boundaries: Following cohesion clusters often reveals domain structure that matches business ownership.
Lower wasted effort: Only the code actually needed for each service is written/tested; nothing is written twice.

Disadvantages of Component-Based Decomposition

Requires existing structure: If the codebase is a Big Ball of Mud (high coupling everywhere, no recognizable structure), there are no clusters to follow. The prerequisite simply isn’t there.
Slower: Incremental extraction takes more calendar time than forking. Teams must maintain and evolve the monolith while simultaneously extracting services.
Requires significant architectural knowledge: Someone must deeply understand the codebase to identify the correct boundaries. This is often lost knowledge in aging codebases.
Risk of creating distributed monolith: Without careful coupling analysis, “extracted” services may still depend tightly on the remaining monolith — producing all the costs of distribution without the independence.

When to Use Component-Based Decomposition

The codebase has recognizable structure (packages/namespaces align with domains)
D scores are mostly below 0.5 (components are reasonably well-placed)
The team has good knowledge of the existing codebase
Time is available for careful, incremental work
Business continuity requires maintaining the monolith during migration

Decomposition Approach 2: Tactical Forking

What It Is

Tactical forking (sometimes called “clone and prune” or “strangler by subtraction”) starts from the opposite end. Instead of building up services from parts of the monolith, it duplicates the entire monolith for each intended service and then deletes everything that doesn’t belong in that service. The result is that each service starts as a full copy of the system and is progressively pruned until only the relevant code remains.

Monolith (full)
     |
     +-------> Copy 1 (intended: Order Service)
     |              Delete: User, Inventory, Reporting, ...
     |              Keep: Order processing logic
     |              Result: Order Service
     |
     +-------> Copy 2 (intended: User Service)
     |              Delete: Order, Inventory, Reporting, ...
     |              Keep: User management logic
     |              Result: User Service
     |
     +-------> Copy 3 (intended: Inventory Service)
                    Delete: Order, User, Reporting, ...
                    Keep: Inventory management logic
                    Result: Inventory Service

Why Deletion Is Easier Than Extraction

In a tightly coupled codebase, removing code that calls into a component is mechanically simpler than extracting that component while keeping the rest working:

Deletion is easy to validate: Deleting code that “belongs to another service” can be checked by building and running the remaining tests. If something breaks, the deleted code was still needed and can be restored.
Extraction requires re-wiring: Extracting a component means re-wiring all its callers to use the new service boundary, a complex refactoring task in a tightly coupled codebase.

This asymmetry is the key insight behind tactical forking. In a poorly structured codebase, deletion is the more tractable operation.
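One way to propose deletion candidates in a fork is to walk the dependency graph from the code the service must keep and treat everything unreachable as prunable. This is a sketch of that idea, not a procedure from the chapter; it assumes a static dependency map is available, and the module names are hypothetical. The test run described above remains the actual check.

```python
def reachable(depends_on, roots):
    """Return every module reachable from `roots` by following dependencies."""
    seen = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(depends_on.get(node, ()))
    return seen


# Hypothetical monolith: module -> modules it depends on.
MONOLITH = {
    "order": {"pricing", "util"},
    "pricing": {"util"},
    "user": {"util"},
    "inventory": {"util"},
    "reporting": {"order", "inventory"},
    "util": set(),
}

keep = reachable(MONOLITH, {"order"})   # what the Order fork still needs
prune = set(MONOLITH) - keep            # candidates for deletion in that fork

print("keep: ", sorted(keep))    # ['order', 'pricing', 'util']
print("prune:", sorted(prune))   # ['inventory', 'reporting', 'user']
```

Static reachability misses dependencies created through reflection, configuration, or shared database tables, so the candidate list is a starting point only.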

Advantages of Tactical Forking

Works on any codebase: Even a Big Ball of Mud can be forked. You don’t need existing structure.
Faster initial progress: Each team can immediately start working on their service copy without waiting for architectural analysis to complete.
Familiar environment: Each team continues working in a copy of the codebase they already know.
No strangler phase required: The monolith doesn’t need to be kept alive while migration happens; each fork is immediately a candidate deployment unit.
Parallelizable: Multiple teams can work on their respective forks simultaneously.

Disadvantages of Tactical Forking

Massive code duplication: Every fork starts as a full copy. Common infrastructure, utilities, and shared logic exist in N copies simultaneously. Changes to shared logic must be synchronized across all forks.
Risk of forking shared code: If shared libraries and utility code are simply included in each fork, those forks are not truly independent: they are separate deployables that still carry the same shared code. They may end up as a distributed monolith with duplicated code rather than as independent services.
Technical debt multiplication: Any pre-existing technical debt in the monolith is replicated into every fork. Pruning removes code but not the quality issues in the remaining code.
No inherent cleanup: Tactical forking doesn’t require fixing underlying coupling issues — it just hides them. The resulting services may have the same internal quality problems as the monolith.
Test suite duplication: All tests exist in all forks. Test maintenance burden multiplies.

When to Use Tactical Forking

The codebase is a Big Ball of Mud (D scores mostly above 0.5, coupling is pervasive)
The team has little understanding of the existing codebase structure (high developer turnover)
Speed of initial progress is critical (competitive pressure, funding milestones)
The plan includes significant rewriting of the duplicated code (so duplication is temporary)
Domain boundaries are reasonably understood even if the code doesn’t reflect them

Comparing the Two Approaches

Trade-off Table

Dimension                   Component-Based Decomposition             Tactical Forking
Prerequisite                Requires identifiable structure (low D)   Works on any codebase
Initial speed               Slower (analysis first)                   Faster (start immediately)
Code duplication            None (code moves, not copies)             High (N full copies initially)
Technical debt              Cleaned up during extraction              Replicated into each fork
Risk profile                Lower per-step risk; incremental          Higher initial risk; big-bang
Domain knowledge needed     High (need to understand structure)       Lower (deletion is mechanical)
Distributed monolith risk   Medium (shared DB is the main risk)       High (shared code in forks)
Suitable for                Moderately structured codebases           Big Ball of Mud codebases
Resulting code quality      Generally higher                          Depends on post-fork investment
Parallelizability           Lower (sequential extraction)             High (teams work independently)

Decision Tree

Is the codebase decomposable?
(Run coupling analysis: Ca, Ce, D scores)
         |
         v
Are D scores mostly < 0.5?
    YES                      NO
     |                        |
     v                        v
Do we have time for      Use TACTICAL FORKING
incremental migration?   Plan for post-fork
    YES          NO      rewrite/cleanup
     |            |
     v            v
COMPONENT-BASED  Consider
DECOMPOSITION    TACTICAL
                 FORKING
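The decision tree above has two questions and three outcomes, so it can be written as a small function. The function and parameter names below are assumptions; the branch logic follows the tree.

```python
def choose_strategy(d_mostly_below_half: bool, time_for_incremental: bool) -> str:
    """Encode the two-question decision tree as a function."""
    if not d_mostly_below_half:
        # Right-hand branch: no clean structure to work with.
        return "tactical forking (plan for post-fork rewrite/cleanup)"
    if time_for_incremental:
        return "component-based decomposition"
    return "consider tactical forking"


print(choose_strategy(d_mostly_below_half=True, time_for_incremental=True))
print(choose_strategy(d_mostly_below_half=True, time_for_incremental=False))
print(choose_strategy(d_mostly_below_half=False, time_for_incremental=True))
```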

The Role of Coupling Metrics in Practice

Using Metrics to Guide Extraction Order

When using component-based decomposition, coupling metrics determine the sequence of extractions:

Extract low-Ce components first: Components with few dependencies can be extracted with minimal disruption to the remaining codebase.
Extract high-D components cautiously: Components far from the main sequence will require refactoring before or during extraction.
Defer high-Ca components: Components that many others depend on should be extracted last (or become shared libraries) because extracting them early creates a ripple of re-wiring.
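The three rules above can be combined into a sort key. The sketch below ranks by Ca first (so high-Ca components land last), then Ce, then D. That priority among the three metrics is an assumption of this sketch, and the component names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    ca: int    # afferent coupling
    ce: int    # efferent coupling
    d: float   # distance from the main sequence


def extraction_order(components):
    """Order candidates: low Ca first, then low Ce, then low D."""
    return sorted(components, key=lambda c: (c.ca, c.ce, c.d))


# Hypothetical measurements, invented for illustration.
CANDIDATES = [
    Component("core", ca=5, ce=2, d=0.65),
    Component("search", ca=1, ce=3, d=0.30),
    Component("notify", ca=0, ce=1, d=0.10),
]

order = [c.name for c in extraction_order(CANDIDATES)]
print(order)  # ['notify', 'search', 'core']
```

A weighted score would be an equally reasonable alternative; the point is that the ordering is computed from measurements rather than from a sense of which service matters most.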

Tools for Measuring These Metrics

JDepend (Java): Computes Ca, Ce, A, I, D per package
NDepend (.NET): Similar metrics for .NET assemblies
ArchUnit: Enforces coupling rules in test suites
Lattix: Dependency structure matrix visualization
Custom scripts parsing import statements in Python/Go/JavaScript
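For the custom-script option in Python, the standard library's `ast` module can extract import statements without executing the code. The following is a minimal sketch; the `imported_modules` helper and the sample source are hypothetical, and it records only the top-level package of each absolute import.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level packages imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0): they stay inside the package.
            if node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
    return found


SAMPLE = """
import os
import billing.invoices
from orders.models import Order
from . import helpers
"""

print(sorted(imported_modules(SAMPLE)))  # ['billing', 'orders', 'os']
```

Running this over every file in a package and counting distinct targets gives an approximation of Ce; inverting the resulting map gives Ca.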

Coupling Clusters as Service Boundaries

The authors recommend building a dependency structure matrix (DSM) or coupling graph to visualize which components cluster together naturally:

Component Matrix (X = dependency):

         Auth  Order  Inv  Report  User  Notify
Auth       -     X     -     -      X     -
Order      X     -     X     -      X     X
Inv        -     X     -     X      -     -
Report     -     X     X     -      -     -
User       X     X     -     -      -     X
Notify     -     -     -     -      X     -

Clusters visible: {Auth, User, Notify} and {Order, Inv, Report}. Order also depends on Auth, User, and Notify, so it is the main link between the two clusters.

Components with dense internal connections and sparse external connections are natural service candidates.
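"Dense internal, sparse external" can be checked by counting edges for a proposed cluster. The sketch below transcribes the matrix above, assuming each row lists what that component depends on; the `edge_counts` helper is hypothetical.

```python
# The matrix above, read row by row: component -> components it depends on.
DSM = {
    "Auth":   {"Order", "User"},
    "Order":  {"Auth", "Inv", "User", "Notify"},
    "Inv":    {"Order", "Report"},
    "Report": {"Order", "Inv"},
    "User":   {"Auth", "Order", "Notify"},
    "Notify": {"User"},
}


def edge_counts(dsm, cluster):
    """Return (internal, external) dependency counts for a proposed cluster.

    Internal edges have both ends inside the cluster; external edges cross
    its boundary in either direction.
    """
    internal = external = 0
    for source, targets in dsm.items():
        for target in targets:
            source_in, target_in = source in cluster, target in cluster
            if source_in and target_in:
                internal += 1
            elif source_in != target_in:
                external += 1
    return internal, external


print(edge_counts(DSM, {"Auth", "User", "Notify"}))  # (4, 5)
print(edge_counts(DSM, {"Order", "Inv", "Report"}))  # (5, 5)
```

In this matrix all five boundary-crossing dependencies involve Order, which suggests that component would need attention before either cluster could be extracted cleanly.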

Sysops Squad Saga

Context: After Penelope makes the business case (Chapter 3), the team must choose a decomposition approach for the Sysops Squad monolith — 800K lines, 47,000+ classes, built over 15 years.

Analysis the team performs:

They run coupling analysis on the major packages: ss.ticket, ss.expert, ss.notification, ss.reporting, ss.billing, ss.user
They find that ss.ticket has high afferent coupling (many things call into it) and is concrete (low A) — it sits in the zone of pain with D ≈ 0.65
ss.notification has low Ca and low Ce — D ≈ 0.1, very close to the main sequence, and already nearly independent
ss.billing has complex coupling to external payment providers AND to internal ticket state — mixed results

Decision: The team chooses component-based decomposition because:

Several packages (notification, user management, reporting) have identifiable structure and D scores below 0.4
The team has a core set of long-tenure developers who understand the codebase architecture
Business continuity requires the monolith to keep running during migration

Extraction order decided:

ss.notification first — lowest D, nearly independent
ss.reporting second — read-only dependencies, can be separated cleanly
ss.user third — moderate coupling, clear domain boundary
ss.expert fourth — coupled to ticket but separable
ss.ticket last — highest Ca, most other services depend on it; must remain until others are migrated

What the saga demonstrates: Coupling metrics directly inform the extraction order. You don’t extract the most important service first — you extract the most independent service first to reduce risk and build team confidence.

Key Takeaways

Before choosing a decomposition approach, architects must diagnose codebase decomposability using coupling metrics — afferent coupling (Ca), efferent coupling (Ce), abstractness (A), instability (I), and distance from main sequence (D).
Instability (I = Ce / (Ca + Ce)) measures how susceptible a component is to change: I≈0 is stable (many depend on it; it depends on little); I≈1 is unstable (it depends on many things; little depends on it). Dependencies should point in the direction of stability.
The zone of pain (concrete + stable, low I + low A) is the most problematic region: components there are hard to change AND hard to extend. They are the primary obstacles to clean decomposition.
The zone of uselessness (abstract + unstable, high A + high I) contains dead weight — abstract structure that nothing actually uses. These should be deleted or collapsed.
Component-based decomposition is the preferred approach when the codebase has identifiable structure (D scores mostly below 0.5) and the team has codebase knowledge. It is incremental, lower-risk, and produces cleaner results.
Tactical forking (“clone and prune”) is the pragmatic choice when the codebase is a Big Ball of Mud — pervasive coupling makes incremental extraction impractical. It trades code duplication for tractability.
The primary risk of tactical forking is code duplication and technical debt multiplication — every quality problem in the monolith is copied into every fork. Post-fork investment in cleanup is essential.
Coupling metrics determine the extraction order in component-based decomposition: extract lowest-D, lowest-Ce components first; defer high-Ca components until dependent services are already extracted.
A dependency structure matrix (DSM) or coupling graph visualizes natural service clusters — groups of components with dense internal coupling and sparse external coupling are natural service candidates.
Both approaches risk creating a distributed monolith if shared state (especially the database) is not also decomposed. Application-tier decomposition must eventually be accompanied by data decomposition (covered in Ch. 6).

Related Resources

ch03-architectural-modularity — The modularity drivers that justify decomposition
ch05-component-decomposition-patterns — Specific patterns for executing component-based decomposition
ch06-pulling-apart-operational-data — Data decomposition required alongside service decomposition
ch02-architectural-coupling — Coupling types and coupling taxonomy
saht-README — Book overview and chapter index

Last Updated: 2026-05-30

Study Notes by Niladri & AI
