Chapter 5: Consistent Hashing
volume1 consistent-hashing distributed-systems
Status: đźź© Interview ready
Difficulty: Medium-Hard (Core distributed systems concept)
Time to complete: 45-60 min read + practice
Overview
Consistent hashing is a special hashing technique that minimizes key redistribution when hash table size changes. It’s a foundational concept in distributed systems.
Why this matters: Essential for designing scalable distributed caches, databases, and load balancers. Frequently asked in system design interviews.
Real-world uses:
- Amazon DynamoDB (partitioning)
- Apache Cassandra (data distribution)
- Memcached, Redis clusters (cache distribution)
- Akamai CDN (content distribution)
- Discord (server sharding)
The Problem: Rehashing
Naive Hash-Based Distribution
Scenario: Distribute keys across N servers using: serverIndex = hash(key) % N
Example with 4 servers (N=4):
- key1 hash: 1234 → server 1234 % 4 = 2
- key2 hash: 5678 → server 5678 % 4 = 2
- key3 hash: 9012 → server 9012 % 4 = 0
- key4 hash: 3456 → server 3456 % 4 = 0
Works fine… until servers change!
The Rehashing Problem
When a server is added or removed, most keys get redistributed.
Server goes down (N=4 → N=3):
- key1: 1234 % 3 = 1 (was 2) ❌ MOVED
- key2: 5678 % 3 = 2 (was 2) âś… SAME
- key3: 9012 % 3 = 0 (was 0) âś… SAME
- key4: 3456 % 3 = 0 (was 0) âś… SAME
Only lucky keys stay in place!
With many keys, ~75% need to move when going from 4→3 servers.
Why This Is Bad
- Cache stampede: All moved keys miss cache → hit database simultaneously
- Data movement: Massive data transfer between servers
- Downtime: Service degraded during redistribution
- Cascading failures: Remaining servers overwhelmed
Example impact:
- 1M keys in cache
- 1 server fails (4→3 servers)
- 750,000 keys invalidated
- 750,000 cache misses hit database
- Database overload → entire system down
Solution: Consistent Hashing
Core Idea
Instead of hash(key) % N, use a hash ring where both keys and servers are hashed to points on a circle.
Key insight: When a server is added/removed, only keys near that server are affected. Most keys stay on their current server.
How It Works
Step 1: Create a hash ring (0 to 2^32-1)
Imagine a circle with values 0 to 4,294,967,295 (2^32-1)
The circle wraps around: 2^32-1 → 0
Step 2: Hash servers onto the ring
Server A: hash("serverA") = 100
Server B: hash("serverB") = 200
Server C: hash("serverC") = 300
Server D: hash("serverD") = 400
Ring visualization:
0
|
D 400 --- 100 A
| |
C 300 --- 200 B
Step 3: Hash keys onto the ring
key1: hash("key1") = 150
key2: hash("key2") = 250
key3: hash("key3") = 350
key4: hash("key4") = 50
Step 4: Assign each key to the next server clockwise
key1 (150) → goes to Server B (200) [next clockwise]
key2 (250) → goes to Server C (300)
key3 (350) → goes to Server D (400)
key4 (50) → goes to Server A (100)
Ring with keys:
0
key4
| |
D 400 - 100 A
| key1
key3 |
| 200 B
C 300 |
key2
Adding a Server
Add Server E at position 225
Before:
key1 (150) → Server B (200)
key2 (250) → Server C (300)
key3 (350) → Server D (400)
key4 (50) → Server A (100)
After adding Server E (225):
key1 (150) → Server B (200) ✅ NO CHANGE
key2 (250) → Server E (225) ❌ MOVED (was C)
key3 (350) → Server D (400) ✅ NO CHANGE
key4 (50) → Server A (100) ✅ NO CHANGE
Only 1 out of 4 keys moved! (25% vs 75% with naive hashing)
Removing a Server
Remove Server B (200)
Before:
key1 (150) → Server B (200)
key2 (250) → Server C (300)
key3 (350) → Server D (400)
key4 (50) → Server A (100)
After removing Server B:
key1 (150) → Server C (300) ❌ MOVED (was B)
key2 (250) → Server C (300) ✅ NO CHANGE
key3 (350) → Server D (400) ✅ NO CHANGE
key4 (50) → Server A (100) ✅ NO CHANGE
Only key1 affected! Keys from B redistributed to next server (C)
Key property: Only K/N keys need to move (where K = total keys, N = servers)
Problem: Uneven Distribution
With basic consistent hashing, servers might not be evenly distributed on the ring.
Bad distribution example:
Server A: position 10
Server B: position 20
Server C: position 30
Server D: position 300
Result:
- Servers A, B, C handle tiny ranges
- Server D handles huge range (30-300)
- Load imbalance!
Solution: Virtual Nodes (Vnodes)
Idea: Each physical server gets multiple positions on the ring (virtual nodes).
Instead of:
Server A → 1 position
Use:
Server A → vnode-A-1, vnode-A-2, vnode-A-3 (3 positions)
Server B → vnode-B-1, vnode-B-2, vnode-B-3 (3 positions)
...
With 4 servers Ă— 150 vnodes = 600 points on ring
Much more even distribution!
How virtual nodes work:
Physical Server A creates virtual nodes:
- vnode-A-0: hash("A-0") = 1234
- vnode-A-1: hash("A-1") = 5678
- vnode-A-2: hash("A-2") = 9012
- ... (150 virtual nodes)
All vnodes for Server A map back to physical Server A.
Benefits:
- Even distribution: More vnodes = smoother load distribution
- Weighted distribution: Powerful servers can have more vnodes
- Faster rebalancing: Keys spread across many servers when one fails
Trade-off: More memory (need to track all vnodes)
Memory calculation:
- 100 servers Ă— 150 vnodes = 15,000 entries
- Each entry: 32-byte hash + 8-byte pointer = 40 bytes
- Total: 15,000 Ă— 40 = 600 KB (negligible!)
Standard vnode counts:
- Cassandra: 128-256 vnodes per node
- DynamoDB: 100-200 vnodes
- Riak: 64 vnodes (default)
Implementation Details
Hash Function Choice
Requirements:
- Uniform distribution
- Fast computation
- Low collision rate
Popular choices:
- MD5: 128-bit hash, good distribution, widely used
- MurmurHash: Faster than MD5, non-cryptographic
- SHA-1: 160-bit, more secure but slower
- xxHash: Extremely fast, modern choice
Interview answer: “I’d use MD5 or MurmurHash for speed and uniform distribution.”
Data Structure
Store ring as sorted list of (hash, server) pairs
class ConsistentHash:
def __init__(self, vnodes_per_server=150):
self.ring = [] # sorted list of (hash_value, server_id)
self.vnodes_per_server = vnodes_per_server
self.servers = {}
def add_server(self, server_id):
"""Add server with virtual nodes"""
for i in range(self.vnodes_per_server):
vnode_key = f"{server_id}-{i}"
hash_val = self._hash(vnode_key)
self.ring.append((hash_val, server_id))
# Keep ring sorted for binary search
self.ring.sort()
self.servers[server_id] = True
def remove_server(self, server_id):
"""Remove all vnodes for this server"""
self.ring = [(h, s) for h, s in self.ring if s != server_id]
del self.servers[server_id]
def get_server(self, key):
"""Find server for given key"""
if not self.ring:
return None
hash_val = self._hash(key)
# Binary search for first server >= hash_val
idx = self._binary_search(hash_val)
# Wrap around if needed
if idx >= len(self.ring):
idx = 0
return self.ring[idx][1] # return server_id
def _hash(self, key):
"""Hash function (MD5 example)"""
import hashlib
return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)
def _binary_search(self, hash_val):
"""Find insertion point in sorted ring"""
left, right = 0, len(self.ring)
while left < right:
mid = (left + right) // 2
if self.ring[mid][0] < hash_val:
left = mid + 1
else:
right = mid
return leftTime complexity:
add_server: O(V log N) where V = vnodes, N = total vnodesremove_server: O(N) where N = total vnodesget_server: O(log N) binary search
Space complexity: O(S Ă— V) where S = servers, V = vnodes per server
Handling Edge Cases
Empty ring:
if not self.ring:
raise Exception("No servers available")Wrap-around:
# If key hash > all server hashes, wrap to first server
if idx >= len(self.ring):
idx = 0Duplicate hashes (collision):
# Use (hash, server_id) tuples; servers differ even if hash same
# Or use collision resolution in hash functionReal-World Examples
1. Amazon DynamoDB
Usage: Partition data across nodes
- Each node responsible for range of hash values
- Uses consistent hashing with virtual nodes
- 100-200 vnodes per physical node
- Automatic rebalancing when nodes added/removed
Key feature: Preference list for replication
For key K with hash H:
1. Primary: First server clockwise from H
2. Replica 1: Next server clockwise
3. Replica 2: Next server clockwise
If server fails, replicas take over seamlessly
2. Apache Cassandra
Usage: Distribute data across cluster
- Token ring with configurable vnodes (default 256)
- Each node owns multiple ranges on ring
- Read/write requests routed to correct node(s)
Example: 3-node cluster with replication factor 3
Node A: tokens 0-300, owns ranges [0-100, 200-300, ...]
Node B: tokens 100-400, owns ranges [100-200, 300-400, ...]
Node C: tokens 200-500, owns ranges [200-300, 400-500, ...]
Key with hash 250 → stored on Nodes B, C, A (next 3 clockwise)
3. Memcached (with ketama)
Usage: Distribute cache keys across memcached servers
# Client-side consistent hashing
servers = ["cache1:11211", "cache2:11211", "cache3:11211"]
ch = ConsistentHash()
for server in servers:
ch.add_server(server)
# Get key's server
key = "user:12345:profile"
server = ch.get_server(key)
value = memcached_client.get(server, key)Benefit: When cache server fails, only ~1/N keys affected (not all)
4. Load Balancing
Usage: Distribute requests to backend servers
- Hash based on user_id, session_id, or IP
- Ensures same user → same server (sticky sessions)
- Minimal disruption when servers added/removed
Example:
def route_request(user_id):
server = consistent_hash.get_server(user_id)
return proxy_to_server(server, request)Why better than round-robin: Session affinity without sticky sessions
Trade-offs and Limitations
Advantages âś…
- Minimal redistribution: Only K/N keys move (vs all keys)
- Horizontal scalability: Easy to add/remove nodes
- Even distribution: Virtual nodes balance load
- Fault tolerance: Graceful degradation when nodes fail
- No coordination: Each client can compute independently
Disadvantages ❌
- Complexity: More complex than simple
hash % N - Memory overhead: Need to store ring structure
- No range queries: Keys distributed randomly (can’t scan ranges)
- Hotspots possible: Popular keys still create hotspots
- Cascading failure: If node fails, next node gets all its traffic
When to Use
Use consistent hashing when:
- Distributing data across multiple servers
- Servers frequently added/removed (elastic scaling)
- Want to minimize data movement
- Need fault tolerance
- Using distributed cache or database
Don’t use when:
- Single server (not distributed)
- Need range queries (use range-based sharding)
- Stable cluster (simple hash % N is fine)
- Strong ordering requirements
Comparison: Consistent Hashing vs Alternatives
1. Consistent Hashing vs Modulo Hashing
| Aspect | Modulo (hash % N) | Consistent Hashing |
|---|---|---|
| Simplicity | Very simple | More complex |
| Redistribution | ~100% keys move | ~1/N keys move |
| Even distribution | Perfect (if hash good) | Good (with vnodes) |
| Memory | None | O(servers Ă— vnodes) |
| Use case | Stable cluster | Dynamic cluster |
2. Consistent Hashing vs Range-Based Sharding
| Aspect | Range Sharding | Consistent Hashing |
|---|---|---|
| Range queries | Efficient | Inefficient |
| Load balancing | Manual rebalancing | Automatic |
| Hotspots | Common (popular ranges) | Less common |
| Complexity | Simple | Moderate |
| Use case | Ordered data (timestamps) | Random access |
Example: Time-series data (logs, metrics) → Range sharding better
3. Consistent Hashing vs Jump Hash
Jump Hash (Google’s algorithm):
- Simpler than consistent hashing
- Zero memory overhead
- But: Cannot choose which server to add/remove
- Use case: When you control server additions (not failures)
def jump_hash(key, num_buckets):
"""Google's Jump Hash"""
b, j = -1, 0
while j < num_buckets:
b = j
key = ((key * 2862933555777941757) + 1) & 0xffffffffffffffff
j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
return bInterview Tips
Common Questions
Q: “What is consistent hashing?”
A: “Consistent hashing is a distributed hashing technique that minimizes key redistribution when the number of servers changes. Unlike traditional hash % N, which moves most keys when N changes, consistent hashing moves only about 1/N of keys on average.”
Q: “Why do we need virtual nodes?”
A: “Virtual nodes solve the load imbalance problem. With one position per server on the hash ring, you might get unlucky with distribution—one server could get a huge range. Virtual nodes give each server multiple positions, which statistically balances the load. Real systems use 100-200 vnodes per server.”
Q: “What are the trade-offs?”
A: “Pros: minimal redistribution, easy scaling, fault tolerance. Cons: more complex than simple modulo hashing, uses memory for ring structure, doesn’t support range queries efficiently, and hotspot keys can still overload a single server.”
Q: “How would you implement it?”
A: “Store a sorted array of (hash, server) pairs. When adding a server, hash its vnodes and insert into sorted array. When looking up a key, hash it and use binary search to find the next server clockwise. Time complexity is O(log N) for lookups.”
Q: “When would you NOT use consistent hashing?”
A: “Don’t use it for single-server systems, when you need range queries (use range-based sharding), or when your cluster is completely stable (simple modulo is fine). Also avoid if you have strong ordering requirements.”
Whiteboard Practice
Exercise: Draw consistent hashing on whiteboard
1. Draw circle (hash ring)
2. Add 4 servers at random positions
3. Add 3 keys, show which server each goes to
4. Remove 1 server, show which keys move
5. Explain only 25% of keys moved vs 75% with modulo
Time: Should take 5-7 minutes
Code Implementation Questions
Implement these functions (good practice):
add_server(server_id)with vnodesremove_server(server_id)get_server(key)with binary searchget_n_servers(key, n)for replication
Complexity analysis:
- State time/space complexity for each operation
- Mention binary search optimization
Key Takeaways
-
The problem: Traditional
hash % Ncauses massive redistribution when N changes -
The solution: Map both keys and servers to a ring; each key goes to next server clockwise
-
Virtual nodes: Each server has multiple ring positions for even load distribution
-
Performance: O(log N) lookups with binary search, ~1/N keys move per server change
-
Real-world: Used in DynamoDB, Cassandra, Memcached, CDNs, load balancers
-
Interview focus:
- Explain the rehashing problem clearly
- Draw the ring diagram
- Mention virtual nodes
- Discuss trade-offs vs alternatives
- Know time/space complexity
-
When to use: Distributed caching, dynamic clusters, horizontal scaling, fault tolerance
-
When NOT to use: Range queries needed, stable clusters, single server, strong ordering
Practice Problems
Problem 1: Design Distributed Cache
Given: 1000 cache servers, 10M keys, servers frequently fail
Question: How would you use consistent hashing? How many vnodes?
Answer:
- Use consistent hashing with 150 vnodes per server
- Total ring size: 150,000 positions
- When server fails: ~10M/1000 = 10K keys move
- Replication factor 3: store on 3 consecutive servers
- Monitor: key distribution, hotspots, failure impact
Problem 2: Calculate Redistribution
Given: 10 servers, 1M keys uniformly distributed
Question:
- How many keys move with modulo hashing when 1 server fails?
- How many with consistent hashing?
Answer:
Modulo hashing (hash % N):
- 10 → 9 servers
- New positions: hash % 9 (different from hash % 10)
- Estimate: ~90% keys change position
- Keys moved: ~900,000
Consistent hashing:
- Keys only from failed server move
- Each server has 1M/10 = 100K keys
- Keys moved: ~100K (10x less!)
Problem 3: Implement Weighted Consistent Hashing
Question: Some servers are more powerful. How to give them more load?
Answer:
def add_server(self, server_id, weight=1.0):
"""
weight=1.0: normal (150 vnodes)
weight=2.0: 2x powerful (300 vnodes)
weight=0.5: 0.5x powerful (75 vnodes)
"""
num_vnodes = int(self.vnodes_per_server * weight)
for i in range(num_vnodes):
vnode_key = f"{server_id}-{i}"
hash_val = self._hash(vnode_key)
self.ring.append((hash_val, server_id))
self.ring.sort()Related Topics
- Chapter 1 ch01-scale-zero-to-millions: Database sharding (uses consistent hashing)
- Chapter 6 ch06-key-value-store: Distributed key-value store design
- Distributed Systems distributed-system-components: How consistent hashing fits
External Resources
- Consistent Hashing Paper (1997): Original MIT paper by Karger et al.
- Cassandra Architecture: DataStax docs on token rings
- DynamoDB: AWS whitepaper on partitioning
- Blog: Tom White’s visual explanation
Master this chapter before designing any distributed system! Consistent hashing is fundamental for cache distribution, database sharding, and load balancing.
Practice: Draw the ring diagram 10 times until you can do it from memory in an interview.
Last Updated: 2026-04-09
Status: Complete - Ready for interview preparation