Chapter 2: Measuring Performance

Why Measuring Software Performance Is Hard

Unlike in manufacturing, inventory in software is invisible, the breakdown of work is largely arbitrary, and (especially in Agile) design and delivery happen at the same time. Previous attempts to measure performance have failed in two ways:

  1. They focus on outputs rather than outcomes
  2. They focus on individual/local rather than team/global measures

Three Flawed Measurement Approaches

Lines of Code

  • Rewards bloat over elegance
  • Incentivizes writing more code rather than solving business problems
  • Minimizing LOC isn’t right either (unreadable one-liners)

Velocity

  • Relative and team-dependent — teams can’t be compared
  • Easily gamed: teams inflate estimates and focus on completing their own stories at the expense of collaboration
  • Once gamed, velocity loses its usefulness for its intended purpose, capacity planning

Utilization

  • High utilization is only good up to a point
  • Queueing theory: as utilization approaches 100%, lead times approach infinity
  • No slack = no capacity to absorb unplanned work or improvement
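
The utilization point can be made concrete with a small calculation. The sketch below assumes the simplest textbook queueing model, an M/M/1 queue (one server, random arrivals, exponentially distributed service times), in which the mean time an item spends in the system is the mean service time divided by (1 − utilization). The model choice and the function name are assumptions made for illustration; the notes above only state the general result.

```python
def mean_time_in_system(utilization: float, mean_service_time: float = 1.0) -> float:
    """Mean time (waiting plus service) for one work item in an M/M/1 queue.

    Assumes a single server with random (Poisson) arrivals and exponential
    service times, so the result is mean_service_time / (1 - utilization).
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1); at 1.0 the queue never drains")
    return mean_service_time / (1.0 - utilization)


# Lead time as a multiple of the time the work itself takes.
for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    multiple = mean_time_in_system(rho)
    print(f"utilization {rho:.0%}: lead time is about {multiple:.0f}x the service time")
```

Under this model, going from 50% to 90% utilization raises lead time from 2x to 10x the service time, and 99% utilization raises it to 100x, which is why some slack is needed to absorb unplanned work.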

The Four DORA Metrics

A valid performance measure must (1) focus on a global outcome rather than a local one, so that teams are not pitted against each other, and (2) measure outcomes rather than output.

Metric                       Description                                     Why
---------------------------  ----------------------------------------------  -----------------------------------------------------------------
Delivery Lead Time           Code committed → code in production             From Lean theory; shorter = faster feedback and course correction
Deployment Frequency         How often code is deployed to production        Proxy for batch size (smaller batches = better)
Mean Time to Restore (MTTR)  How long to restore service after an incident   Failure is inevitable; resilience matters more than MTBF
Change Failure Rate          % of changes that cause degraded service        Key quality metric; Lean’s “percent complete and accurate”

Note: Deployment frequency is the reciprocal of batch size. More frequent deploys = smaller batches.
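
As a rough illustration of how the four metrics could be computed, the sketch below derives them from a list of deployment records. The record shape (`Deployment`), the function name, the use of a median for lead time, and the choice to measure restore time from the moment of deployment are all assumptions made for this example; the sample data is made up.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean, median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it degrade service?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def delivery_metrics(deployments: list[Deployment], period_days: float) -> dict:
    """Compute the four metrics for the deployments made during one period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    lead_times = [hours(d.committed_at, d.deployed_at) for d in deployments]
    # Assumption: the incident starts when the failing change is deployed.
    restore_times = [hours(d.deployed_at, d.restored_at)
                     for d in deployments
                     if d.failed and d.restored_at is not None]
    failures = sum(1 for d in deployments if d.failed)
    return {
        "median_lead_time_hours": median(lead_times),
        "deploys_per_day": len(deployments) / period_days,
        "mean_time_to_restore_hours": mean(restore_times) if restore_times else None,
        "change_failure_rate": failures / len(deployments),
    }


# Made-up sample: four deployments in one week, one of which failed.
sample = [
    Deployment(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 10)),
    Deployment(datetime(2024, 1, 2, 9), datetime(2024, 1, 2, 12),
               failed=True, restored_at=datetime(2024, 1, 2, 12, 30)),
    Deployment(datetime(2024, 1, 3, 9), datetime(2024, 1, 3, 11)),
    Deployment(datetime(2024, 1, 4, 9), datetime(2024, 1, 4, 13)),
]
metrics = delivery_metrics(sample, period_days=7)
print(metrics)
```

All four numbers come from the same stream of deployment events, so they describe the delivery process as a whole rather than any individual's output.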

Performance Tiers (Cluster Analysis)

The book uses cluster analysis (hierarchical clustering) — a data-driven approach with no built-in concept of “good” or “bad.” Three clusters emerge naturally every year.
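
To show what "no built-in concept of good or bad" means in practice, here is a minimal sketch of agglomerative (bottom-up) hierarchical clustering in plain Python. Everything specific in it is an assumption: the average-linkage rule, the two features (log10 lead time in hours, log10 time to restore in hours), and the made-up team data. The notes above do not say which linkage or features the book's analysis used.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b):
    """Mean pairwise Euclidean distance between two clusters of points."""
    return sum(dist(p, q) for p in a for q in b) / (len(a) * len(b))


def agglomerate(points, k):
    """Repeatedly merge the two closest clusters until only k remain."""
    clusters = [[p] for p in points]  # start with every point on its own
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


# Made-up teams: (log10 lead time in hours, log10 time to restore in hours).
teams = [
    (-0.5, -0.4), (2.3, 1.0), (3.3, 1.9),
    (-0.3, -0.2), (2.5, 1.2), (3.5, 2.1),
    (0.0, -0.5), (2.6, 0.9), (3.2, 1.7),
]
groups = agglomerate(teams, k=3)
for group in groups:
    print(sorted(group))
```

The algorithm only ever sees distances between points; labels such as "high" or "low" are applied afterwards by looking at what the resulting groups have in common. The number of clusters is passed in here for brevity, whereas a real analysis would typically inspect the merge distances to decide where to stop.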

2016 Performance Benchmarks

Metric            High                      Medium          Low
----------------  ------------------------  --------------  ------------------
Deploy Frequency  On-demand (multiple/day)  1/week–1/month  1/month–1/6 months
Lead Time         < 1 hour                  1 week–1 month  1 month–6 months
MTTR              < 1 hour                  < 1 day         < 1 day*
Change Fail Rate  0–15%                     31–45%          16–30%

2017 Performance Benchmarks

Metric            High                      Medium          Low
----------------  ------------------------  --------------  ---------------
Deploy Frequency  On-demand (multiple/day)  1/week–1/month  1/week–1/month*
Lead Time         < 1 hour                  1 week–1 month  1 week–1 month*
MTTR              < 1 hour                  < 1 day         1 day–1 week
Change Fail Rate  0–15%                     0–15%           31–45%

*Low performers were lower on average but had the same median as medium performers.

The Big Finding: Speed AND Stability Are Correlated

“There is no tradeoff between improving performance and achieving higher levels of stability and quality. Rather, high performers do better at all of these measures.”

This refutes the dogma behind “bimodal IT” — the idea that fast systems must be less stable.

Trend: The Gap Is Growing

  • High performers maintained or improved 2016→2017
  • Low performers tried to increase tempo without addressing underlying obstacles
  • 2017: Low performers lost ground in stability while trying to match tempo

Impact on Organizational Performance

High-performing organizations were twice as likely as low performers to exceed their goals on:

  • Profitability, market share, productivity (commercial)
  • Quantity/quality of goods/services, operating efficiency, customer satisfaction, mission goals (non-commercial)

Implication: Don’t Outsource Strategic Software

“The fact that software delivery performance matters provides a strong argument against outsourcing the development of software that is strategic to your business.”

Software that differentiates your business should be built in-house. Use SaaS for commodity functions (payroll, office productivity).

Warning: Use Metrics Carefully

In pathological cultures, measurement becomes a tool of control and people hide information. “Whenever there is fear, you get the wrong numbers” (Deming).

Metrics only work in a learning culture. Fix the culture first; then apply measurement rigorously.