Chapter 6 Cheat Sheet - Partitioning
One-Line Summaries
| Concept | One-Liner |
|---|---|
| Partitioning | Splitting data so each node holds a subset (sharding) |
| Key-range partitioning | Adjacent keys stored together; enables range queries but risks hot spots |
| Hash partitioning | Hash key for uniform distribution; range queries become expensive |
| Hot spot | One partition receives disproportionate reads/writes |
| Consistent hashing | Ring-based; adding nodes only moves adjacent keys |
| Scatter/gather | Query all partitions in parallel (for local secondary indexes) |
| Rebalancing | Moving partitions between nodes when topology changes |
| Request routing | How clients discover which node holds a given key |
Partitioning Strategy Decision Tree
What are your access patterns?
│
├─ Need range queries (e.g., all events from Jan-Feb)?
│ ├─ Use key-range partitioning
│ └─ Watch out for: hot spots from sequential keys (e.g., timestamps)
│ Fix: Prepend random/varied prefix to key
│
├─ Need uniform write/read distribution, no range needed?
│ └─ Use hash partitioning (consistent hashing)
│
├─ Need both range and even distribution?
│ └─ Cassandra compound key:
│ Partition key (hashed) + Clustering key (sorted within partition)
│
└─ Have hot keys (celebrities, trending items)?
└─ Application-level workaround:
Append random suffix to key (splits writes, merges reads)
Key-Range vs Hash Partitioning
Key-Range Partitioning Hash Partitioning
─────────────────────── ──────────────────────
[A-D] → Node 1 hash(key) % N → Node
[E-M] → Node 2
[N-Z] → Node 3
✅ Range queries efficient ✅ Uniform distribution
✅ Sorted data within partition ✅ No hot spots from key patterns
❌ Hot spots from sequential keys ❌ Range queries require scatter/gather
❌ Manual split boundaries ❌ Can't know adjacent keys
Secondary Index Partitioning
Document-Partitioned (Local): Term-Partitioned (Global):
────────────────────────────── ───────────────────────────
Partition 1: Index for "red":
data: {car: red, ...} → partition 1, row 5
local_idx: red → [row_5] → partition 3, row 12
Partition 2: Index for "blue":
data: {car: red, ...} → partition 2, row 7
local_idx: red → [row_12]
WRITE: Update local partition only WRITE: Update global index (cross-partition)
READ: Scatter to ALL partitions READ: Lookup index partition only
Gather results (but index may be slightly stale = eventual)
Used by: MongoDB, Elasticsearch Used by: DynamoDB GSI, Riak Search
Rebalancing Strategies
1. Fixed number of partitions (Riak, Elasticsearch, Couchbase)
Create 1000 partitions for 10 nodes (100/node)
Add node: steal ~50 partitions from each existing node
Con: partition count set at creation, hard to change
2. Dynamic partitioning (HBase, RethinkDB, MongoDB)
Split: partition > 10GB → split into two 5GB partitions
Merge: partition < threshold → merge with neighbor
Pro: Automatic adaptation to data volume
Con: Single node handles all data until first split
3. Proportional to nodes (Cassandra, Ketama)
Each node gets fixed # of partitions (e.g., 256)
Add node: new node randomly steals from existing nodes
Pro: Scales with cluster size naturally
Consistent Hashing Ring
Hash space: 0 ─────────────────────── 2^64
Nodes placed at random positions:
key_B (hash=25)
↓
0 Node A (hash=10) Node B (hash=45) Node C (hash=80) 2^64
──────●─────────────────────●──────────────────●─────────────────●
key_B (hash=25) → goes to Node B (first node clockwise from 25)
Add Node D at position 35:
key_B (hash=25) → now goes to Node D (nearest clockwise)
Only keys between 10-35 are affected!
All other keys unchanged → minimal rebalancing
Request Routing Approaches
Approach 1: Any node (gossip) Approach 2: Routing tier
───────────────────────────── ────────────────────────
Client → Node A Client → Router (ZooKeeper/etcd)
↓ (if not owner) ↓
Node A → Node B Router → Correct Node
↓
Node A ← answer ← Node B
↓
Client ← response
Approach 3: Smart client
─────────────────────────
Client has partition map (from ZooKeeper subscription)
Client → Correct Node directly
Used by: Kafka consumers with topic metadata
Key Trade-offs
| Decision | Pro | Con | When to Use |
|---|---|---|---|
| Key-range partition | Range queries, sorted data | Hot spots from sequential keys | Time-series with composite key |
| Hash partition | Uniform distribution | No range queries | Random keys, uniform load |
| Local secondary index | Single-partition writes | Scatter/gather reads | Write-heavy + range searches |
| Global secondary index | Efficient reads | Cross-partition writes, eventual | Read-heavy secondary lookups |
| Dynamic partitioning | Auto-adapts to data size | Single node starts all data | Unpredictable data growth |
| Manual rebalancing | Safe, controlled | Operational burden | Stability-critical systems |
Hot Spot Mitigation
Hot Key: user_id = "celebrity123" gets 1M reads/sec
Fix 1: Random suffix
Write: key = "celebrity123_0" through "celebrity123_99"
Read: query all 100 keys and combine → extra complexity
Fix 2: Caching layer
Put Redis/Memcached in front
Celebrity's data served from cache, not DB partition
Fix 3: Read replicas for hot partition
Increase replication factor for the hot partition's node
Reads served from multiple replicas
Fix 4: CDN for public data
Celebrity's public posts → CDN → no DB reads at all for most traffic
Red Flags
❌ Using timestamps directly as partition keys (all writes go to today’s partition)
❌ No scatter/gather awareness in queries (undetected performance cliff)
❌ Automatic rebalancing during a failure incident (amplifies the outage)
❌ Ignoring hot spots — assuming hash partitioning magically solves all problems
❌ No partition-aware routing (sending all requests to random node = extra hops)
Green Flags
✅ Test your partitioning scheme with realistic access patterns
✅ Monitor partition sizes and per-partition throughput
✅ Use compound keys (hash + range) when you need both
✅ Global secondary indexes for read-heavy secondary access patterns
✅ Plan for rebalancing before you need it
Modern Additions (2026)
NewSQL auto-sharding:
├─ CockroachDB, Spanner, YugabyteDB
├─ Automatic range splitting and rebalancing
└─ Transparent to application — no shard key design
Kafka KRaft:
├─ Partition metadata in Kafka itself (no ZooKeeper)
├─ KRaft protocol = Raft-based metadata management
└─ Simpler operations, faster partition reassignment
Serverless partitioning:
├─ DynamoDB on-demand: auto-partition scaling with traffic
├─ Neon PostgreSQL: partition at storage layer
└─ No manual sharding decisions needed
Interview Response Templates
When Asked About Sharding Strategy
“I’d first ask what the access patterns are. For sequential writes (like logs or time-series), key-range partitioning risks hot spots — I’d add a prefix (sensor ID, user ID) to distribute across ranges. For random access without range needs, hash partitioning with consistent hashing gives even distribution. If I need both, a compound key (Cassandra-style) gives hash distribution with range within each partition.”
When Asked About Secondary Index Partitioning Trade-off
“Local secondary indexes mean writes touch only one partition (fast) but reads require scatter/gather across all partitions (slow, high tail latency). Global secondary indexes make reads efficient but writes cross partitions and the index may be slightly stale. For a write-heavy system with occasional secondary searches, I’d use local indexes. For a read-heavy system where secondary lookups are critical, global indexes are worth the write cost.”
Quick Revision Time: 5 minutes
Interview Prep: 15 minutes
Last Updated: 2026-04-13