Chapter 5 Cheat Sheet — Encoding and Evolution

One-Line Summaries

Concept	One-Liner
Encoding	Converting in-memory structures to bytes for storage or transmission
Backward compatibility	Newer code can read data written by older code
Forward compatibility	Older code can read data written by newer code
Field tag (Protobuf)	Numeric field identity in wire format — the canonical, permanent identifier
Avro schema resolution	Writer/reader schemas compared at decode time; fields matched by name
Schema registry	Central versioned schema store; enforces compatibility before bad data reaches consumers
Durable execution	Workflow engine that persists execution state to DB; survives process crashes transparently
Event notification	Minimal event; the consumer calls back to the source for the full data
Event-carried state transfer	Event carries full entity state; consumer is self-sufficient
Event sourcing	State = replay of immutable event log

Encoding Format Comparison

Format	Human-Readable	Schema	Binary	Evolution	Field Identity	Size vs JSON	Primary Use
JSON	Yes	Optional	No	Manual	Field name	100%	REST APIs, config
XML	Yes	Optional	No	Manual	Tag name	~130%	Legacy enterprise
CSV	Yes	None	No	Poor	Column position	~70%	Bulk data export
MessagePack	No	None	Yes	Manual	Field name	~80%	Redis, compact JSON
Protobuf	No	Required	Yes	Good	Field tag number	~40%	gRPC, internal RPC
Thrift	No	Required	Yes	Good	Field tag number	~42-70%	Internal RPC
Avro	No	Required	Yes	Excellent	Field name (by schema match)	~38%	Kafka, Hadoop
FlatBuffers	No	Required	Yes	Good	Field offset	~50%	Games, HFT

Compatibility Rules Quick Reference

PROTOCOL BUFFERS / THRIFT:
  Add optional field (new tag)  → SAFE    (backward + forward)
  Remove optional field          → SAFE    (retire the tag; never reuse it)
  Rename field                   → SAFE    (name not in wire)
  Change tag number              → UNSAFE  (old data misinterpreted)
  Add required field             → UNSAFE  (old data missing it)
  Reuse deleted tag              → UNSAFE  (old data re-interpreted)
  Change field type              → UNSAFE  (usually; widening is sometimes OK)

AVRO:
  Add field with default value   → SAFE    (missing in old data → use default)
  Remove field                   → SAFE    (old data has value → ignored)
  Rename field                   → UNSAFE  (name is identity; a reader-side alias restores backward compat only)
  Add field without default      → UNSAFE  (old data has no value, no default → error)
  Change field type              → UNSAFE  (unless promotion rules allow it)

JSON:
  Everything                     → MANUAL  (no enforcement; discipline required)
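
The Protobuf/Thrift rules above are mechanical enough to check in code. The sketch below is a hypothetical checker, not Buf or any real tool. It assumes a schema is modelled as a dict mapping tag number to (name, type, required), a shape chosen only for illustration.

```python
# Hypothetical schema model: {tag_number: (field_name, field_type, required)}.
# Illustrative only; this is not the API of protoc, Buf, or Thrift.

def check_tag_schema_change(old, new):
    """Classify each difference between two tag-identified schemas."""
    findings = []
    old_tag_by_name = {name: tag for tag, (name, _, _) in old.items()}
    new_names = {name for name, _, _ in new.values()}

    for tag, (name, ftype, required) in sorted(new.items()):
        if tag in old:
            old_name, old_type, _ = old[tag]
            if ftype != old_type:
                findings.append(("UNSAFE", f"tag {tag}: type changed {old_type} -> {ftype}"))
            elif name != old_name:
                # Names are not on the wire, so a rename is harmless.
                findings.append(("SAFE", f"tag {tag}: renamed {old_name} -> {name}"))
        elif name in old_tag_by_name:
            # Same field name under a new number: old data will be misread.
            findings.append(("UNSAFE", f"{name}: tag changed {old_tag_by_name[name]} -> {tag}"))
        elif required:
            findings.append(("UNSAFE", f"tag {tag}: added required field {name}"))
        else:
            findings.append(("SAFE", f"tag {tag}: added optional field {name}"))

    for tag, (name, _, required) in sorted(old.items()):
        if tag not in new and name not in new_names:
            verdict = "UNSAFE" if required else "SAFE"
            findings.append((verdict, f"tag {tag}: removed field {name}"))
    return findings


old_schema = {1: ("user_name", "string", False), 2: ("age", "int32", False)}
new_schema = {
    1: ("name", "string", False),    # rename, same tag
    3: ("age", "int32", False),      # tag number changed
    4: ("email", "string", False),   # new optional field
}
for verdict, message in check_tag_schema_change(old_schema, new_schema):
    print(verdict, message)
```

Comparing two versions cannot catch a reused deleted tag, because that requires the full history of retired tag numbers.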

Avro Schema Resolution

Writer's Schema (v1)              Reader's Schema (v2)
─────────────────────             ────────────────────────
field: userName  string  ──────→  field: userName  string
field: age       int     ──────→  field: age       int, default=0
                                  field: email     string, default=""  ← new (gets default)

Rules:
  Writer has field, Reader doesn't      → value IGNORED
  Reader has field with default,        → reader uses DEFAULT value
  Writer lacks it
  Both have field, same name            → value USED (type promotion if needed)
  Reader has field without default,     → ERROR (cannot decode)
  Writer lacks it
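
The resolution rules above can be sketched as a small function. This is a simplified model, not the Avro library. It assumes a schema is a flat list of field dicts and that the record has already been decoded with the writer's schema. Unions, nested records, aliases, and type promotion are left out.

```python
# Simplified model: a schema is a list of {"name", "type", optional "default"}.
# Real Avro resolution also handles aliases, unions, and type promotion.

_NO_DEFAULT = object()  # sentinel, so that None can be a legitimate default


def resolve_record(writer_schema, reader_schema, decoded_by_writer):
    """Project a record decoded with the writer's schema onto the reader's schema."""
    writer_fields = {field["name"] for field in writer_schema}
    result = {}
    for field in reader_schema:
        name = field["name"]
        if name in writer_fields:
            result[name] = decoded_by_writer[name]      # both sides have it: use value
        else:
            default = field.get("default", _NO_DEFAULT)
            if default is _NO_DEFAULT:
                raise ValueError(f"reader field {name!r} has no default and writer lacks it")
            result[name] = default                      # reader-only field: use default
    return result                                       # writer-only fields are ignored


writer_v1 = [{"name": "userName", "type": "string"}, {"name": "age", "type": "int"}]
reader_v2 = [
    {"name": "userName", "type": "string"},
    {"name": "age", "type": "int", "default": 0},
    {"name": "email", "type": "string", "default": ""},
]
resolved = resolve_record(writer_v1, reader_v2, {"userName": "Martin", "age": 36})
print(resolved)
```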

Protobuf Wire Format

JSON (81 bytes minified):           Protobuf (33 bytes):
{                                   [tag=1, type=string][len][M][a][r][t][i][n]
  "userName": "Martin",             [tag=2, type=varint][1337 as varint]
  "favoriteNumber": 1337,           [tag=3, type=string][len][d][a][y][d][r]...
  "interests": ["daydreaming",      [tag=3, type=string][len][h][a][c][k][i][n][g]
                "hacking"]
}

Key insight: tag number = field identity. Name is only in the .proto file.
Changing name: SAFE. Changing tag: CORRUPTS existing data.
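
The byte counts can be reproduced by hand-encoding the record with only the standard library. This is a minimal sketch of the varint and length-delimited encodings for this one record, not a general Protobuf encoder. The helper names are made up for illustration, and the record is assumed to hold both interests shown in the Protobuf column.

```python
import json

VARINT, LEN = 0, 2  # Protobuf wire types: varint and length-delimited


def encode_varint(n):
    """Encode a non-negative int as a base-128 varint, low 7 bits first."""
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def field_key(tag, wire_type):
    """Each field starts with (tag << 3) | wire_type, itself a varint."""
    return encode_varint((tag << 3) | wire_type)


def string_field(tag, text):
    data = text.encode("utf-8")
    return field_key(tag, LEN) + encode_varint(len(data)) + data


def int_field(tag, value):
    return field_key(tag, VARINT) + encode_varint(value)


message = (
    string_field(1, "Martin")
    + int_field(2, 1337)
    + string_field(3, "daydreaming")  # a repeated field is the same tag, repeated
    + string_field(3, "hacking")
)

record = {
    "userName": "Martin",
    "favoriteNumber": 1337,
    "interests": ["daydreaming", "hacking"],
}
minified = json.dumps(record, separators=(",", ":"))

print(len(minified))   # 81
print(len(message))    # 33
print(message.hex())
```

No field name appears anywhere in `message`, which is why a rename is invisible to existing data while a tag change is not.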

Four Modes of Dataflow

1. THROUGH DATABASES
   ┌────────┐  encode  ┌──────┐  decode  ┌────────┐
   │ App v1 │ ───────→ │  DB  │ ───────→ │ App v2 │
   └────────┘          └──────┘          └────────┘
   ⚠ Unknown field preservation: when App v1 rewrites a record written by v2, it must keep fields it doesn't understand
   ⚠ Rolling upgrades: v1 and v2 run simultaneously, both access same DB

2. THROUGH SERVICES (REST/RPC)
   ┌──────────┐  request   ┌──────────┐
   │  Client  │ ─────────→ │  Server  │
   │  (v1)    │ ←───────── │  (v2)    │
   └──────────┘  response  └──────────┘
   ✓ API versioning: /v2/users or Accept header
   ✓ Clients and servers deploy independently

3. THROUGH MESSAGE BROKERS (Kafka/RabbitMQ)
   ┌──────────┐  publish  ┌────────┐  consume  ┌──────────┐
   │ Producer │ ────────→ │ Kafka  │ ────────→ │ Consumer │
   └──────────┘           └────────┘           └──────────┘
   ⚠ Messages may be consumed days after production — must stay decodable
   ⚠ Schema registry strongly recommended for binary formats (Avro readers need the writer's schema)

4. THROUGH DURABLE EXECUTION (Temporal/Step Functions)
   ┌──────────────────────────────────────────────────────────┐
   │  Temporal Server (event log in DB)                        │
   │   workflow_started → activity_completed → timer_fired ... │
   └──────────────────────────────────────────────────────────┘
   Worker polls, replays event history, resumes from last checkpoint
   ⚠ Workflow code must be DETERMINISTIC — no time.now(), no random()
   ⚠ Activity inputs/outputs must be backward-compatible across versions
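
The unknown-field warning under mode 1 is easiest to see in a read-modify-write cycle during a rolling upgrade. The sketch below uses JSON as a stand-in for the stored encoding. The field names and the two update functions are hypothetical.

```python
import json

KNOWN_FIELDS_V1 = {"id", "name"}  # hypothetical: the fields App v1 was built with


def lossy_update(stored, new_name):
    """Decode into only the known fields, then re-encode: unknown fields are lost."""
    record = json.loads(stored)
    model = {k: v for k, v in record.items() if k in KNOWN_FIELDS_V1}
    model["name"] = new_name
    return json.dumps(model, sort_keys=True)


def preserving_update(stored, new_name):
    """Change only what v1 understands and carry everything else through."""
    record = json.loads(stored)
    record["name"] = new_name
    return json.dumps(record, sort_keys=True)


# App v2 wrote this row; "email" is a field App v1 has never heard of.
written_by_v2 = json.dumps({"id": 7, "name": "Ada", "email": "ada@example.com"})

print(lossy_update(written_by_v2, "Ada L."))
print(preserving_update(written_by_v2, "Ada L."))
```

The lossy variant is what an object mapper does when it loads a row into a fixed model class and writes the whole object back.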

Durable Execution Deep Dive

PROBLEM:
  Step 1: Charge card
  Step 2: Send email
  Step 3: Update inventory
  → Process crashes after Step 1 — what happens to Steps 2 and 3?

WITHOUT durable execution:
  - Track state in DB manually (complex, error-prone)
  - Use saga pattern with compensating transactions (complex)
  - Leave steps partially done (data inconsistency)

WITH Temporal:
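  # Python SDK; assumes: from datetime import timedelta; from temporalio import workflow
  @workflow.defn
  class OrderWorkflow:
      @workflow.run
      async def run(self, order: Order) -> None:
          t = timedelta(seconds=30)  # illustrative value; a timeout is required
          charge_result = await workflow.execute_activity(charge_card, order, start_to_close_timeout=t)       # persisted
          email_result  = await workflow.execute_activity(send_email, order, start_to_close_timeout=t)        # persisted
          inv_result    = await workflow.execute_activity(update_inventory, order, start_to_close_timeout=t)  # persisted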
  
  On crash: replay event history → resume at Step 2 (Step 1 already persisted)
  Guarantee: each activity executes at-least-once (so activities should be idempotent); workflow is effectively-once

DETERMINISM RULE:
  ❌ datetime.now()          → use workflow.now()
  ❌ random.random()         → use workflow.random().random()
  ❌ direct HTTP call        → wrap in execute_activity()
  ❌ global mutable state    → keep state inside the workflow; replay rebuilds it from history
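
Replay can be illustrated without any framework. The sketch below is a toy model and not Temporal's API or storage format. It assumes the "database" is an in-memory list of recorded activity results, and it simulates the crash with an exception raised inside step 2.

```python
class ToyWorkflowEngine:
    """Toy model of durable execution: record activity results, replay on restart."""

    def __init__(self):
        self.history = []  # stands in for the event log persisted in a database

    def run(self, workflow, *args):
        """Run a workflow, or re-run it after a crash, against the stored history."""
        position = 0

        def execute_activity(activity, *activity_args):
            nonlocal position
            if position < len(self.history):
                result = self.history[position]    # replay: reuse the recorded result
            else:
                result = activity(*activity_args)  # first time: really execute
                self.history.append(result)        # persist before moving on
            position += 1
            return result

        return workflow(execute_activity, *args)


calls = []                     # side effects that actually happened
state = {"email_up": False}    # flipping this simulates the outage ending


def charge_card(order):
    calls.append("charge")
    return f"charged {order}"


def send_email(order):
    if not state["email_up"]:
        raise RuntimeError("worker crashed")
    calls.append("email")
    return f"emailed {order}"


def update_inventory(order):
    calls.append("inventory")
    return f"reserved {order}"


def order_workflow(execute_activity, order):
    execute_activity(charge_card, order)
    execute_activity(send_email, order)
    return execute_activity(update_inventory, order)


engine = ToyWorkflowEngine()
try:
    engine.run(order_workflow, "order-1")      # crashes during step 2
except RuntimeError:
    pass
state["email_up"] = True
final = engine.run(order_workflow, "order-1")  # step 1 is replayed, not re-executed
print(calls)   # ['charge', 'email', 'inventory']
```

The card is charged once because step 1's result comes from the history on the second run. This only works if the workflow function makes the same sequence of `execute_activity` calls on every run, which is what the determinism rule protects.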

Event-Driven Architecture Patterns

1. EVENT NOTIFICATION
   ┌──────────┐  {orderId: "123"}  ┌──────────┐
   │  Order   │ ─────────────────→ │  Email   │
   │  Service │                    │  Service │──→ GET /orders/123
   └──────────┘                    └──────────┘
   + Tiny events, low coupling
   - Consumer still depends on source API

2. EVENT-CARRIED STATE TRANSFER
   ┌──────────┐  {orderId:"123", items:[...], total:49.99}  ┌───────────┐
   │  Order   │ ──────────────────────────────────────────→ │ Analytics │
   │  Service │                                              │  Service  │
   └──────────┘                                             └───────────┘
   + Consumer is autonomous (no callback needed)
   - Larger events, data duplication, evolution complexity

3. EVENT SOURCING
   Append-only event log:
     [order.created] [item.removed] [payment.received] [order.shipped]
                              ↓ fold/replay
                       Current Order State
   
   + Complete audit trail, time travel, multiple projections
   + CQRS: separate write model (events) from read models (projections)
   - Eventual consistency of projections
   - Schema evolution needs upcasters for old event formats
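
The fold/replay step and the upcaster idea from pattern 3 can be sketched together. The event shapes here are assumptions made for illustration, including an `item.added` event and a v1-to-v2 change that adds a `qty` field. They do not come from any particular framework.

```python
from functools import reduce


def upcast(event):
    """Hypothetical upcaster: rewrite an old v1 event into the current v2 shape."""
    if event["type"] == "item.added" and event.get("version", 1) == 1:
        # Assumed change: v1 stored a bare "sku"; v2 adds an explicit "qty".
        return {"type": "item.added", "version": 2, "sku": event["sku"], "qty": 1}
    return event


def apply(state, event):
    """Pure fold step: return the next state without mutating the previous one."""
    kind = event["type"]
    if kind == "order.created":
        return {"id": event["id"], "items": {}, "status": "open"}
    if kind == "item.added":
        items = dict(state["items"])
        items[event["sku"]] = items.get(event["sku"], 0) + event["qty"]
        return {**state, "items": items}
    if kind == "item.removed":
        items = {s: q for s, q in state["items"].items() if s != event["sku"]}
        return {**state, "items": items}
    if kind == "payment.received":
        return {**state, "status": "paid"}
    if kind == "order.shipped":
        return {**state, "status": "shipped"}
    raise ValueError(f"unknown event type: {kind}")


def replay(events):
    """Current state = fold of apply over the (upcasted) event log."""
    return reduce(apply, map(upcast, events), None)


log = [
    {"type": "order.created", "id": "123"},
    {"type": "item.added", "sku": "book"},                         # old v1 event
    {"type": "item.added", "version": 2, "sku": "pen", "qty": 3},
    {"type": "item.removed", "sku": "book"},
    {"type": "payment.received"},
    {"type": "order.shipped"},
]

order = replay(log)
print(order)
print(replay(log[:3]))  # "time travel": state as of the third event
```

Because `apply` only ever sees v2 events, the fold logic stays free of version checks, and the stored log is never rewritten.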

REST vs gRPC Decision Tree

Is the API consumed by external clients (browsers, mobile apps, third parties)?
├─ YES → REST + JSON + OpenAPI  (human-readable, universally accessible)
└─ NO  → Is it internal microservice-to-microservice?
          ├─ YES, needs streaming → gRPC (bidirectional streaming)
          ├─ YES, needs efficiency → gRPC (2-3x smaller, typed)
          └─ YES, needs simplicity → REST (easier debugging, no protoc setup)

Schema Registry Flow (Kafka + Avro)

PRODUCER:                           CONSUMER:
  schema = load("person.avsc")        msg = kafka.read()
  id = registry.register(schema)      schema_id = msg[1:5]  # 4-byte ID after magic byte 0x00
  payload = avro.encode(data,         schema = registry.fetch(schema_id)
                        schema)       data = avro.decode(msg[5:],
  kafka.send([0x00][id_bytes]                   writer_schema=schema,
             [payload])                         reader_schema=MY_SCHEMA)
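
The framing step can be sketched with the standard library alone. This sketch assumes the Confluent wire format, a zero magic byte followed by a 4-byte big-endian schema ID and then the payload. The in-memory registry is a hypothetical stand-in for a real registry service, and JSON stands in for the Avro payload because the standard library has no Avro codec.

```python
import json
import struct

MAGIC_BYTE = 0  # Confluent wire format marker


class InMemoryRegistry:
    """Hypothetical stand-in for a schema registry; a real one is a network service."""

    def __init__(self):
        self._by_id = {}
        self._id_by_text = {}

    def register(self, schema):
        """Return the existing ID for a known schema, or assign the next one."""
        text = json.dumps(schema, sort_keys=True)
        if text not in self._id_by_text:
            schema_id = len(self._by_id) + 1
            self._id_by_text[text] = schema_id
            self._by_id[schema_id] = schema
        return self._id_by_text[text]

    def fetch(self, schema_id):
        return self._by_id[schema_id]


def frame(schema_id, payload):
    """Prefix the payload with the magic byte and the big-endian schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def unframe(message):
    """Split a framed message into (schema_id, payload)."""
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError("unknown framing")
    return schema_id, message[5:]


registry = InMemoryRegistry()
person_schema = {
    "type": "record",
    "name": "Person",
    "fields": [{"name": "userName", "type": "string"}],
}

# Producer side
schema_id = registry.register(person_schema)
message = frame(schema_id, json.dumps({"userName": "Martin"}).encode("utf-8"))

# Consumer side
got_id, payload = unframe(message)
writer_schema = registry.fetch(got_id)  # would drive Avro decoding in a real pipeline
data = json.loads(payload)
print(got_id, data)
```

Sending a small ID instead of the full schema keeps the per-message overhead at five bytes, and the consumer can cache fetched schemas by ID.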

Key Trade-offs Summary

Choose	When
JSON	Public API, external clients, human debugging needed
Protobuf	Internal microservices, gRPC, high-performance, typed contracts
Avro	Kafka pipelines, Hadoop, schema registry governance required
Durable Execution	Multi-step workflow, long-running, human-in-loop, crash safety critical
Message Queue	Short async tasks, work distribution, competing consumers
Event Sourcing	Full audit trail needed, time-travel queries, CQRS, multiple projections

Red Flags

Using Java Serializable / Python pickle for inter-service communication
Changing a Protobuf field tag number in an existing schema
Adding a required field to a Protobuf schema with existing data
Avro field renamed without also adding an alias and updating all consumers
Workflow code calling time.now() directly instead of the framework timer API
Not using a schema registry with Avro-encoded Kafka topics
ORM that silently drops unknown fields on update (destroys forward-compat fields)

Green Flags

All new Protobuf/Avro fields are optional with sensible default values
Schema compatibility checked in CI (Buf breaking change detection)
Schema registry with BACKWARD or FULL compatibility mode
Durable execution used for workflows with >1 external call
Event sourcing with upcaster pipeline for schema-evolved events
API versioning via /v2/ URL prefix for breaking changes

Quick Comparison: Durable Execution vs Alternatives

	Cron + DB	Message Queue	Durable Execution
Crash recovery	Manual (poll DB)	Re-delivery	Automatic replay
State management	You write it	You write it	Framework handles
Multi-step	Complex sagas	Complex choreography	Natural code flow
Long timers (days)	Cron jobs	Not designed for	Built-in timers
Visibility	Custom logging	Queue metrics	Full event history
Overhead	Low	Low	Medium (framework)

Quick Revision Time: 5 minutes
Interview Prep: 15 minutes
Last Updated: 2026-05-29

Study Notes by Niladri & AI
