Chapter 18: Build Systems and Build Philosophy

seg build-systems bazel artifact-based distributed-builds dependencies hermeticity

Status: Notes complete


Overview

Chapter 18 is one of the book’s most technically dense chapters. It examines build systems — the tools that transform source code into executable artifacts — through the lens of engineering at scale. The chapter argues that build system design is not merely a developer-convenience concern: at Google’s scale, the build system is a critical piece of infrastructure whose design determines whether continuous integration is feasible, whether builds are reproducible, and whether engineers can make cross-cutting changes across the codebase with confidence.

The chapter begins by asking what a build system is for and what goes wrong without one. It then traces the evolution from shell scripts through task-based systems (Make, Ant, Gradle) to artifact-based systems (Bazel, Pants, Buck), identifying the fundamental architectural distinction between the two paradigms and explaining why artifact-based systems are superior at scale. The chapter closes with guidance on managing modules and dependencies, including Google’s “1:1:1 rule” and the philosophy of hermetic builds.

A central thesis: at scale, the problems that build systems solve are not only performance problems; they are correctness and reproducibility problems. A fast build that sometimes produces wrong outputs is more dangerous than a slow build that is always correct.


Core Concepts

Build system: A tool that takes source code and other inputs and produces deployable artifacts (binaries, libraries, packages, container images). A build system manages the dependency graph between source files and outputs, ensuring that the minimum necessary set of actions is performed to produce correct outputs.

Task-based build system: A build system in which the fundamental unit is a task (shell command or script). Engineers define what commands to run and in what order. Examples: Make, Ant, Maven (partially), Gradle, shell scripts.

Artifact-based build system: A build system in which the fundamental unit is an artifact (a named output). Engineers declare what to produce and what inputs it depends on; the system determines how to produce it. Examples: Bazel, Pants, Buck, Please.

Hermetic build: A build that is fully isolated from the local environment — same inputs always produce same outputs regardless of the machine, user, or state of the local filesystem. Hermeticity is a prerequisite for reliable caching and distributed execution.

Distributed build: A build that executes actions on a pool of remote machines rather than (or in addition to) the local machine. Enables parallelism beyond what a single machine can provide and enables shared caching across all engineers in an organization.

1:1:1 rule: Google’s convention that each directory should contain at most one build target, producing at most one package/library, with one purpose. Encourages fine-grained modules with minimal, explicit dependency declarations.


What Happens Without a Build System

Shell Scripts

The simplest build system is a shell script: compile.sh. For tiny projects, this works. As projects grow, shell scripts encounter fundamental problems:

  • No dependency tracking: the script runs all steps every time, regardless of what changed. Build times grow with project size, not with the scope of changes.
  • No parallelism: shell scripts execute steps sequentially (without explicit parallelization effort)
  • Fragile ordering: engineers must manually specify the correct order of compilation steps; misordering causes mysterious failures
  • No reproducibility: the script depends on whatever tools happen to be installed on the engineer’s machine, meaning builds differ between machines

Makefiles

Make improves on shell scripts by adding dependency tracking: rules specify what a target depends on, and Make skips rebuilding targets whose dependencies have not changed. But Make has fundamental limitations that make it unsuitable at scale:

  • File-based dependency model: Make tracks dependencies between files, but source code dependencies are logical (module A uses module B), not purely file-based. Engineers must manually keep the dependency declarations consistent with the actual code — a maintenance burden that grows with codebase size.
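  • Timestamp-based change detection: Make decides whether a file has changed by comparing filesystem timestamps. This is fragile: checking out an older version of a file gives it a new timestamp; touch updates the timestamp without changing the contents, forcing needless rebuilds; a file restored with its original timestamp can change without looking newer than its target; and network filesystems can report inconsistent timestamps.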
  • Global state sensitivity: a Makefile rule can read environment variables, query the filesystem, or invoke arbitrary programs — meaning the same Makefile can produce different outputs in different environments.
  • Scaling problems: in a large project with hundreds of modules, Make must evaluate the entire dependency graph on every invocation — which can itself take significant time.
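
The timestamp problem can be made concrete with a minimal sketch. It assumes a simplified Make-style rule ("rebuild when the source is newer than the target") and compares it with a content digest; the file names and the fixed base timestamp are hypothetical and chosen only for the demonstration.

```python
import hashlib
import os
import tempfile
from pathlib import Path

SECOND_NS = 10**9
BASE_NS = 1_700_000_000 * SECOND_NS  # arbitrary fixed instant for the demo


def stale_by_mtime(source: Path, target: Path) -> bool:
    """Simplified Make-style rule: rebuild when the source is newer than the target."""
    return source.stat().st_mtime_ns > target.stat().st_mtime_ns


def digest(path: Path) -> str:
    """Content-based identity: equal bytes always give an equal digest."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def set_mtime(path: Path, mtime_ns: int) -> None:
    """Set both access and modification time to a known value."""
    os.utime(path, ns=(mtime_ns, mtime_ns))


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "util.c"  # hypothetical file names
    target = Path(tmp) / "util.o"
    source.write_text("int add(int a, int b) { return a + b; }\n")
    target.write_text("object code placeholder\n")
    set_mtime(source, BASE_NS)
    set_mtime(target, BASE_NS + 10 * SECOND_NS)
    built_digest = digest(source)

    # Case 1: `touch` moves the timestamp forward but leaves the bytes alone.
    set_mtime(source, BASE_NS + 20 * SECOND_NS)
    touch_rebuilds_by_mtime = stale_by_mtime(source, target)   # needless rebuild
    touch_rebuilds_by_digest = digest(source) != built_digest  # correctly skipped

    # Case 2: the contents change but the file keeps an old timestamp.
    source.write_text("int add(int a, int b) { return a - b; }\n")
    set_mtime(source, BASE_NS)
    edit_rebuilds_by_mtime = stale_by_mtime(source, target)    # change missed
    edit_rebuilds_by_digest = digest(source) != built_digest   # change caught
```

In both cases the timestamp rule gives the wrong answer and the digest gives the right one, which is why content hashing reappears later in these notes as the basis for caching.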

Modern Build Systems: All About Dependencies

The chapter argues that the fundamental problem build systems solve is dependency management, not merely compilation orchestration. Given a set of source files with dependencies between them, the build system must:

  1. Determine the correct order in which to process them
  2. Determine the minimal set to reprocess after a change
  3. Ensure that each processing step uses the correct, stable versions of its inputs
  4. Detect and report cycles in the dependency graph

This dependency perspective explains why task-based systems fail at scale: when the unit of work is a task (arbitrary shell command), the build system cannot understand what a task produces or consumes — it can only observe the order in which tasks are run. This opacity makes safe parallelism, reliable caching, and reproducibility nearly impossible.
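
Two of the responsibilities listed above, ordering and cycle detection, can be sketched with Python's standard graphlib module. This is only an illustration of the idea, not how any particular build tool is implemented; the target names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical graph: each target maps to the targets it depends on.
deps = {
    "app": {"server", "util"},
    "server": {"proto", "util"},
    "proto": set(),
    "util": set(),
}


def build_order(graph):
    """Return targets so that every dependency appears before its dependents."""
    return list(TopologicalSorter(graph).static_order())


def find_cycle(graph):
    """Return the targets forming a cycle, or None if the graph is acyclic."""
    try:
        TopologicalSorter(graph).prepare()
    except CycleError as err:
        # graphlib stores the cycle as the second element of the exception args.
        return err.args[1]
    return None


order = build_order(deps)

# Introduce a cycle: util now depends on app, which already depends on util.
cyclic_deps = dict(deps, util={"app"})
cycle = find_cycle(cyclic_deps)
```

The exact order among unrelated targets (here proto and util) is not fixed; only the dependency-before-dependent constraint is guaranteed.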


Task-Based vs. Artifact-Based Build Systems

This is the chapter’s central conceptual distinction and the most important thing to understand about modern build system design.

Task-Based Systems

In a task-based system, the engineer writes a task — a description of what commands to run. The system executes tasks in the specified order.

// Gradle (task-based)
task compileJava {
    doFirst {
        // arbitrary code: read files, call APIs, delete things
    }
    doLast {
        exec { commandLine 'javac', ... }
    }
}

The fundamental problem: because tasks can execute arbitrary code, the build system has no way to know:

  • What files a task reads (without executing it)
  • What files a task writes (without executing it)
  • Whether a task’s output is deterministic given its inputs
  • Whether two tasks can safely run in parallel

This opacity forces the system to choose among three unattractive options:

  • Execute tasks serially in declared order (safe but slow — no parallelism)
  • Trust engineer-declared dependencies (fast but incorrect if declarations are wrong)
  • Execute all tasks every time (correct but wastes time on unchanged inputs)

Scale consequences:

  • Engineers must keep dependency declarations accurate manually — increasingly difficult as the codebase grows
  • Parallel execution requires trusting declarations that are rarely verified
  • Caching is difficult because task outputs cannot be proven to depend only on declared inputs
  • Non-deterministic tasks (reading clocks, network calls) produce inconsistent outputs that break caching

Artifact-Based Systems

In an artifact-based system, the engineer declares what to produce and what it depends on. The system determines how to produce it.

# Bazel (artifact-based)
java_library(
    name = "mylib",
    srcs = glob(["*.java"]),
    deps = [
        "//other/package:util",
        "@maven//:com_google_guava_guava",
    ],
)

What changes:

  • The system controls execution: the java_library rule, not the engineer writing the BUILD file, decides how the compile action runs. A target that uses the rule cannot inject arbitrary shell commands into that action.
  • Inputs are fully declared: all source files and dependencies must be declared; undeclared inputs cause build failures (not silent reads from the filesystem)
  • Outputs are fully declared: Bazel knows exactly what files a build action produces; it can cache them by content hash
  • Hermeticity is enforced: build actions are executed in sandboxed environments that cannot read undeclared inputs or write to undeclared outputs

Scale benefits:

  • Safe parallelism: from declared inputs and outputs, Bazel can determine which actions are independent (neither consumes the other’s outputs; sharing read-only inputs is fine) and run them in parallel safely, not just optimistically
  • Reliable caching: because outputs are determined only by declared inputs (content-hashed), a cached output from yesterday’s build can be safely reused today if the inputs match
  • Distributed execution: the same properties that enable local caching enable remote caching and remote execution — the build can fan out to a build farm
  • Incremental correctness: when a file changes, Bazel can prove exactly which targets are affected and rebuild only those
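
The safe-parallelism point can be illustrated with a small sketch: once dependencies are fully declared, a scheduler can group actions into "waves" whose members do not depend on each other. The graph and the helper name parallel_waves are hypothetical, and real schedulers are considerably more dynamic than this.

```python
from graphlib import TopologicalSorter

# Hypothetical declared dependencies: target -> targets it needs first.
declared_deps = {
    "app": {"server", "util"},
    "server": {"proto", "util"},
    "proto": set(),
    "util": set(),
}


def parallel_waves(graph):
    """Group targets into batches; everything in one batch can run concurrently."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all dependencies already built
        waves.append(ready)
        sorter.done(*ready)                 # unblock the dependents
    return waves


waves = parallel_waves(declared_deps)
# waves == [['proto', 'util'], ['server'], ['app']]
```

The grouping is only as trustworthy as the declarations, which is why an artifact-based system pairs it with sandboxing rather than relying on engineers to keep the graph accurate.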

Comparison Table

Property | Task-Based (Make/Gradle) | Artifact-Based (Bazel/Buck)
Unit of work | Task (shell command) | Artifact (named output)
Dependency declaration | Manual, unverified | Enforced by sandbox
Parallelism | Optimistic (trust declarations) | Proven safe (input/output analysis)
Caching reliability | Fragile (timestamp-based or trust-based) | Reliable (content-hash-based)
Distributed execution | Difficult or impossible | Natural extension of caching
Hermeticity | Not enforced | Enforced by design
Build rules | Arbitrary code | Restricted DSL
Engineer flexibility | High (can do anything in a task) | Low (must fit into declared rule types)
Correctness guarantee | Best-effort | Strong (given correct declarations)

Distributed Builds

Why Distributed Builds Are Necessary

At Google’s scale — a billion-line codebase where a single service may have thousands of transitive dependencies — building everything on a single machine is not a performance optimization question; it is a feasibility question. The authors estimate that building Google’s entire codebase from scratch on a single modern machine would take on the order of days. Distributed builds reduce this to minutes.

How Distributed Builds Work

Artifact-based systems enable distributed builds naturally through two mechanisms:

Remote caching:

  • Every build action is identified by a content hash of its inputs (source files + build rule + tool versions)
  • When an action completes, its outputs are stored in a remote cache keyed by that hash
  • When any engineer (or CI machine) runs a build, it first checks the remote cache; if a matching entry exists, the action’s outputs are downloaded rather than recomputed
  • Result: an engineer pulling a fresh checkout and running a build for the first time gets near-instant results if a colleague already built the same inputs
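
The caching steps above can be sketched as follows. This is a minimal in-memory illustration, assuming the cache key is a SHA-256 over the command, the input digests, and the tool versions; the RemoteCache class, the placeholder digests, and the version strings are hypothetical and do not reflect any real cache protocol.

```python
import hashlib
import json


def action_key(command, input_digests, tool_versions):
    """Derive a cache key from everything that can influence the output."""
    payload = json.dumps(
        {"command": command, "inputs": input_digests, "tools": tool_versions},
        sort_keys=True,  # stable serialization so equal inputs give equal keys
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class RemoteCache:
    """In-memory stand-in for a shared cache service (hypothetical)."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_build(self, key, build):
        """Return the cached output for key, running build() only on a miss."""
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        output = build()
        self._store[key] = output
        return output


cache = RemoteCache()
command = ["javac", "Greeter.java"]
inputs = {"Greeter.java": "digest-aaa"}  # placeholder digest
tools = {"javac": "17.0.9"}              # hypothetical tool version

# Two engineers with identical inputs compute the same key independently.
key_first = action_key(command, inputs, tools)
key_second = action_key(list(command), dict(inputs), dict(tools))

out_first = cache.get_or_build(key_first, lambda: "Greeter.class bytes")
out_second = cache.get_or_build(key_second, lambda: "never computed")

# A different tool version changes the key, so the old entry is not reused.
key_other_tool = action_key(command, inputs, {"javac": "21.0.1"})
```

Including tool versions in the key is what prevents a cache hit from returning an output produced by a different compiler.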

Remote execution:

  • Instead of executing build actions locally, the build client sends action descriptions to a build farm
  • The build farm executes actions in parallel across many machines
  • Results (both cached and newly computed) are returned to the client
  • The build client assembles the final outputs from distributed results
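
A rough sketch of the client side of remote execution, under a strong simplifying assumption: a local thread pool stands in for the build farm, and each "action" just hashes its declared inputs. The names run_action and execute_on_farm are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def run_action(action):
    """Execute one action; the output depends only on the declared inputs."""
    name, inputs = action
    digest = hashlib.sha256("\n".join(inputs).encode("utf-8")).hexdigest()
    return name, digest[:12]


def execute_on_farm(actions, workers=4):
    """Send independent actions to a worker pool and assemble the results."""
    with ThreadPoolExecutor(max_workers=workers) as farm:
        return dict(farm.map(run_action, actions))


# Hypothetical independent actions: (output name, declared input contents).
actions = [
    ("util.o", ["int add(int a, int b);"]),
    ("proto.o", ["message Greeting {}"]),
    ("log.o", ["void log(const char*);"]),
]

remote_outputs = execute_on_farm(actions)
local_outputs = dict(map(run_action, actions))
```

Because each action is a pure function of its declared inputs, where it runs does not matter: the farm result and the local result are identical.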

Both mechanisms depend critically on hermeticity: if build actions could read from the local environment, remote execution would produce different outputs than local execution — defeating the purpose.

When Distributed Builds Are Needed

The chapter is explicit that distributed builds are not needed at every scale:

Scale | Appropriate Build Approach
Small project (< 100K LOC) | Local build, Make or similar
Medium project (< 1M LOC) | Local build, possibly artifact-based for caching
Large project (> 10M LOC) | Artifact-based mandatory; distributed beneficial
Very large (> 100M LOC) | Distributed builds required for CI feasibility
The transition to distributed builds is motivated by CI requirements as much as by developer productivity. When builds take hours locally, CI cannot provide timely feedback. Distributed builds bring CI build times back into the range where they provide useful signal.


Hermetic Builds

Definition

A build is hermetic if:

  1. All inputs (source files, dependencies, tools) are fully declared
  2. The build action cannot access anything not in its declared inputs
  3. The same inputs always produce the same outputs, on any machine, at any time

Hermeticity is enforced in Bazel through sandboxing: each build action runs in an isolated environment (filesystem namespace, network namespace, or container) that can only access files listed as inputs. Accessing an undeclared file results in a build error, not a silent incorrect build.
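
The sandboxing idea can be approximated in a few lines. This sketch copies only the declared inputs into a scratch directory and runs the action there; it is not real isolation (a Python function could still open absolute paths), and run_in_sandbox and the file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_sandbox(workspace, declared_inputs, action):
    """Run action in a scratch directory that holds only the declared inputs."""
    with tempfile.TemporaryDirectory() as scratch:
        root = Path(scratch)
        for relative in declared_inputs:
            destination = root / relative
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(Path(workspace) / relative, destination)
        return action(root)


def concat_action(root):
    """Hypothetical action that needs two files to produce its output."""
    return (root / "lib.txt").read_text() + (root / "config.txt").read_text()


with tempfile.TemporaryDirectory() as ws:
    workspace = Path(ws)
    (workspace / "lib.txt").write_text("lib\n")
    (workspace / "config.txt").write_text("cfg\n")

    # Both inputs declared: the action succeeds.
    complete_output = run_in_sandbox(
        workspace, ["lib.txt", "config.txt"], concat_action
    )

    # config.txt exists in the workspace but was not declared: the read fails
    # loudly instead of silently picking the file up.
    try:
        run_in_sandbox(workspace, ["lib.txt"], concat_action)
        undeclared_read_failed = False
    except FileNotFoundError:
        undeclared_read_failed = True
```

The useful property is the failure mode: a missing declaration becomes an immediate error rather than a build that happens to work on one machine.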

Why Hermeticity Matters

Without hermeticity:

  • A build that works on an engineer’s laptop may fail on CI because the engineer has a local dependency installed that CI does not
  • A build that succeeds today may fail tomorrow because a tool version changed on the machine
  • Cached outputs cannot be trusted: a cache hit might return an output built with different tool versions than the current build

With hermeticity:

  • “It works on my machine” is a reliable statement — not a harbinger of CI failures
  • Caching is trustworthy: if the content hash of inputs matches, the cached output is correct
  • Builds are reproducible: given the same source, anyone can reproduce the same binary at any point in the future (critical for security incident response)

Anti-pattern — Environment Leakage: Build actions that read from $HOME, $PATH, ambient environment variables, or local tool installations. These create invisible dependencies on the engineer’s machine state, breaking hermeticity silently. The build appears to work but produces non-reproducible results.
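
One way to catch environment leakage is a differential check: run the same action under deliberately different ambient environments and compare the outputs. The sketch below illustrates the idea with plain functions; the action names, the USER variable, and the version string are hypothetical.

```python
DECLARED_INPUTS = {"version": "1.4.2"}  # hypothetical declared input


def leaky_stamp(env):
    """Reads the ambient USER variable, so the output varies by machine."""
    return f"{DECLARED_INPUTS['version']} built by {env.get('USER', 'unknown')}"


def hermetic_stamp(env):
    """Ignores the ambient environment and uses declared inputs only."""
    return DECLARED_INPUTS["version"]


def is_environment_sensitive(action, environments):
    """Return True if the action's output changes across the environments."""
    outputs = {action(env) for env in environments}
    return len(outputs) > 1


# Two simulated machines with different ambient state.
environments = [
    {"USER": "alice", "HOME": "/home/alice"},
    {"USER": "ci-bot", "HOME": "/var/lib/ci"},
]

leaky_detected = is_environment_sensitive(leaky_stamp, environments)
hermetic_detected = is_environment_sensitive(hermetic_stamp, environments)
```

A check like this only finds leaks along the dimensions that are varied, so it complements sandboxing rather than replacing it.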


Dealing with Modules and Dependencies

Fine-Grained Modules and the 1:1:1 Rule

Google’s convention for structuring code in a monorepo:

1:1:1 rule: Each directory contains:

  • 1 purpose — a single, clearly defined responsibility
  • 1 build target — a single BUILD file entry (e.g., one java_library or cc_library)
  • 1 package — the directory corresponds to a single importable module

Benefits of fine-grained modules:

  • Minimal recompilation: when a leaf module changes, only its direct and transitive dependents rebuild. Coarse-grained modules force rebuilding everything that shares the module boundary, even if only one small part changed.
  • Explicit dependencies: engineers must explicitly declare every dependency. This makes the dependency graph visible and queryable — tooling can answer “what depends on this module?” and “what does this module depend on?” with precision.
  • Easier ownership: a single-purpose module has a natural owner. A multi-purpose module creates ambiguous ownership.
  • Enforced API boundaries: callers can only use the exported interface of a module; internal implementation details cannot be accidentally imported

Trade-off: fine-grained modules require more BUILD file maintenance. The 1:1:1 rule creates many small build targets, which can feel verbose. Google accepts this cost because the benefits in build correctness, parallelism, and ownership clarity outweigh it.
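
The recompilation benefit can be illustrated by computing the set of targets affected by a change under two hypothetical layouts: one with separate strings and net modules, and one where both live in a single coarse common module. The graphs and the function name are assumptions for illustration.

```python
from collections import deque


def affected_targets(deps, changed):
    """Return the changed target plus everything that transitively depends on it."""
    reverse = {}
    for target, direct_deps in deps.items():
        for dep in direct_deps:
            reverse.setdefault(dep, set()).add(target)

    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in reverse.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Fine-grained layout: strings and net are separate targets.
fine = {
    "strings": [],
    "net": [],
    "logging": ["strings"],
    "rpc": ["net"],
    "billing": ["logging"],
    "frontend": ["rpc"],
}

# Coarse layout: strings and net are merged into one "common" target.
coarse = {
    "common": [],
    "logging": ["common"],
    "rpc": ["common"],
    "billing": ["logging"],
    "frontend": ["rpc"],
}

fine_rebuild = affected_targets(fine, "strings")
coarse_rebuild = affected_targets(coarse, "common")
```

With the fine-grained layout, a change to the string helpers rebuilds three targets; with the coarse layout the same edit invalidates all five, including rpc and frontend, which never used the string code.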

Minimizing Module Visibility

Bazel’s visibility attribute controls which other build targets can depend on a given target. Options range from fully public (//visibility:public) to narrowly restricted (specific packages only).

Google’s guidance: default to the most restrictive visibility that satisfies the use case, and expose only what needs to be exposed. Overly broad visibility creates coupling: a module that anyone can depend on tends to accumulate dependents, and every additional dependent makes refactoring harder.

Anti-pattern — Visibility Creep: incrementally widening visibility to satisfy a new caller rather than redesigning the module boundary. Each widening makes the module harder to change because more callers must be accounted for.
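
To make the visibility check concrete, here is a simplified sketch of how a tool might decide whether one package may depend on a target. It handles only the //visibility:public, //visibility:private, :__pkg__, and :__subpackages__ forms and ignores package groups and other details of Bazel's real semantics; the package names are hypothetical.

```python
PKG_SUFFIX = ":__pkg__"
SUBPACKAGES_SUFFIX = ":__subpackages__"


def is_visible(target_package, visibility, consumer_package):
    """Return True if consumer_package may depend on a target in target_package."""
    if consumer_package == target_package:
        return True  # targets in the same package can always see each other
    for spec in visibility:
        if spec == "//visibility:public":
            return True
        if spec.endswith(PKG_SUFFIX):
            if spec[2:-len(PKG_SUFFIX)] == consumer_package:
                return True
        elif spec.endswith(SUBPACKAGES_SUFFIX):
            root = spec[2:-len(SUBPACKAGES_SUFFIX)]
            if consumer_package == root or consumer_package.startswith(root + "/"):
                return True
    return False  # includes //visibility:private for other packages


# Hypothetical target in billing/ledger, exposed to two specific consumers.
ledger_visibility = [
    "//billing/api:__pkg__",
    "//billing/reports:__subpackages__",
]

api_allowed = is_visible("billing/ledger", ledger_visibility, "billing/api")
reports_allowed = is_visible(
    "billing/ledger", ledger_visibility, "billing/reports/monthly"
)
search_allowed = is_visible("billing/ledger", ledger_visibility, "search/frontend")
```

Widening ledger_visibility to //visibility:public would make search_allowed true, which is exactly the visibility creep described above.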

Managing External Dependencies

External dependencies (third-party libraries) present a challenge for hermeticity: they live outside the monorepo and may change over time.

Google’s approach:

  • Vendor or pin: all external dependencies are pinned to specific content-hashed versions in the repository (or in a centrally managed dependency manifest)
  • Single version policy: Google enforces that only one version of any external dependency may be used across the entire monorepo at once (the “One Version Rule”). This prevents diamond dependency conflicts and ensures that security patches are applied uniformly.
  • Explicit updates: updating an external dependency is a deliberate engineering action — it requires a code change that goes through review and testing, not an automatic or implicit update

Trade-off of single-version policy: an external library upgrade must be compatible with all users in the monorepo simultaneously. This can mean that one team’s desire to upgrade is blocked by another team’s incompatibility. The benefit is that the monorepo never has hidden version conflicts, and security patches are applied everywhere at once.
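
A single-version policy is easy to check mechanically. The sketch below assumes a flat list of (target, library, version) declarations, which is a hypothetical format, and reports every library requested at more than one version.

```python
from collections import defaultdict


def version_conflicts(requirements):
    """Map each library requested at several versions to those versions."""
    versions = defaultdict(set)
    for _target, library, version in requirements:
        versions[library].add(version)
    return {
        library: sorted(found)
        for library, found in versions.items()
        if len(found) > 1
    }


# Hypothetical declarations gathered from across a repository.
requirements = [
    ("//search:indexer", "guava", "32.1.3"),
    ("//ads:server", "guava", "32.1.3"),
    ("//maps:tiles", "guava", "31.0"),
    ("//ads:server", "protobuf", "3.25.1"),
]

conflicts = version_conflicts(requirements)
# conflicts == {'guava': ['31.0', '32.1.3']}
```

Run as a presubmit check, a report like this turns a latent diamond-dependency problem into a review-time failure.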

Anti-pattern — Transitive Dependency Opacity: depending on a library’s transitive dependencies without declaring them explicitly. If library A depends on library B, and your code also uses B, you should declare B directly rather than rely on A’s dependency on B. If A later drops its dependency on B, your build breaks even though your own code did not change.
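
A strict-dependency check of this kind can be sketched by comparing what each target actually uses with what it declares directly. The two input mappings are hypothetical; a real tool would derive the "used" side from imports or compiler output.

```python
def missing_direct_deps(declared, used):
    """Report, per target, libraries that are used but not declared directly."""
    problems = {}
    for target, libraries in used.items():
        missing = sorted(set(libraries) - set(declared.get(target, ())))
        if missing:
            problems[target] = missing
    return problems


# Hypothetical graph: the server declares only A, but its code also uses B,
# which it currently reaches through A's own dependency on B.
declared = {
    "//app:server": ["//lib:a"],
    "//lib:a": ["//lib:b"],
    "//lib:b": [],
}
used = {
    "//app:server": ["//lib:a", "//lib:b"],
    "//lib:a": ["//lib:b"],
    "//lib:b": [],
}

report = missing_direct_deps(declared, used)
# report == {'//app:server': ['//lib:b']}
```

Fixing the report means adding //lib:b to the server's own dependency list, after which A is free to drop B without breaking anyone.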


Time, Scale, and Trade-offs in Build System Design

The chapter synthesizes the build system discussion with the book’s broader trade-off framework:

Decision | Small Scale | Large Scale
Build system type | Task-based is fine | Artifact-based required
Hermeticity | Optional (local dev only) | Mandatory (CI, caching)
Distributed builds | Unnecessary | Required for CI feasibility
Module granularity | Coarse (less overhead) | Fine-grained (correctness, parallelism)
External dependency policy | Version ranges acceptable | Pinned versions, single-version rule
Build time budget | Minutes acceptable | Must be seconds (CI gate)

The consistent theme: decisions that seem like premature optimization at small scale become correctness requirements at large scale. Investing in build system discipline early pays dividends that compound over the life of the codebase.


TL;DRs

  • A build system is responsible for transforming source code into deployable artifacts in a reproducible and efficient manner.
  • Task-based build systems give engineers too much power: arbitrarily complex build steps can have unexpected effects; the system cannot understand or verify the actions it’s performing.
  • Artifact-based build systems restrict what engineers can express, but this restriction enables correctness guarantees, safe parallelism, and reliable caching that task-based systems cannot provide.
  • Hermetic builds — builds that depend only on declared inputs — are a prerequisite for reliable caching and distributed execution.
  • At Google’s scale, distributed builds are not an optimization; they are a prerequisite for continuous integration with reasonable cycle times.
  • Fine-grained modules with explicit dependency declarations make the dependency graph visible, queryable, and correct — at the cost of more BUILD file maintenance.
  • The single-version rule for external dependencies prevents diamond dependency conflicts and ensures uniform security patch application across the monorepo.
  • Build system design choices made early in a project’s life are hard to change later — investing in a hermetic, artifact-based build system early avoids costly migrations.

Key Takeaways

  1. The build system is engineering infrastructure, not a convenience tool — at scale, build system design determines whether CI is feasible, whether builds are reproducible, and whether cross-codebase changes can be made with confidence.
  2. Task-based systems fail at scale because arbitrary tasks are opaque — the system cannot determine what a task reads or writes without executing it, preventing safe parallelism, reliable caching, and hermeticity enforcement.
  3. Artifact-based systems trade engineer flexibility for system correctness — by restricting build rules to a controlled DSL, the system gains the ability to prove independence, cache reliably, and execute remotely.
  4. Hermeticity is the foundation of caching and distributed builds — without the guarantee that same inputs produce same outputs, cached results cannot be trusted and remote execution produces inconsistent results.
  5. Distributed builds are enabled by hermeticity, not just engineering — the same content-hash-based caching that enables incremental local builds naturally extends to remote caching and remote execution.
  6. The 1:1:1 rule encodes a philosophy of explicit, fine-grained dependencies — each directory has one purpose, one build target, and one package, making the dependency graph visible and ownership unambiguous.
  7. Minimizing module visibility prevents coupling accumulation — modules with broad visibility attract callers; callers make refactoring harder; the discipline of minimal visibility keeps modules changeable.
  8. The single-version rule for external dependencies eliminates a class of silent failures — diamond dependency conflicts and inconsistent security patch application are impossible when the entire codebase uses exactly one version of every library.
  9. Build system investment at small scale pays off at large scale — a team that builds hermeticity and artifact-based discipline early avoids the painful migration from a legacy task-based system that cannot be incrementally improved.
  10. “Correctness” in builds means reproducibility, not just passing tests — a build that sometimes produces wrong cached outputs because of hermeticity violations is more dangerous than a slow build that is always correct.

Related Chapters

  • ch17-code-search — Kythe cross-references are generated as a build artifact; understanding the build system explains how Kythe works
  • ch19-critique — Code review tooling depends on build results (test status, coverage) surfaced during review
  • ch22-large-scale-changes — Large-scale changes often require build system support to execute and validate changes across millions of files
  • ch23-continuous-integration — CI is the primary consumer of hermetic, distributed builds; build system design determines CI cycle time

Last Updated: 2026-06-02