Chapter 5: Consistent Hashing

volume1 consistent-hashing distributed-systems

Status: 🟩 Interview ready
Difficulty: Medium-Hard (Core distributed systems concept)
Time to complete: 45-60 min read + practice

Overview

Consistent hashing is a special hashing technique that minimizes key redistribution when hash table size changes. It’s a foundational concept in distributed systems.

Why this matters: Essential for designing scalable distributed caches, databases, and load balancers. Frequently asked in system design interviews.

Real-world uses:

Amazon DynamoDB (partitioning)
Apache Cassandra (data distribution)
Memcached, Redis clusters (cache distribution)
Akamai CDN (content distribution)
Discord (server sharding)

The Problem: Rehashing

Naive Hash-Based Distribution

Scenario: Distribute keys across N servers using: serverIndex = hash(key) % N

Example with 4 servers (N=4):
- key1 hash: 1234 → server 1234 % 4 = 2
- key2 hash: 5678 → server 5678 % 4 = 2
- key3 hash: 9012 → server 9012 % 4 = 0
- key4 hash: 3456 → server 3456 % 4 = 0

Works fine… until servers change!

The Rehashing Problem

When a server is added or removed, most keys get redistributed.

Server goes down (N=4 → N=3):
- key1: 1234 % 3 = 1 (was 2) ❌ MOVED
- key2: 5678 % 3 = 2 (was 2) ✅ SAME
- key3: 9012 % 3 = 0 (was 0) ✅ SAME
- key4: 3456 % 3 = 0 (was 0) ✅ SAME

Only lucky keys stay in place!

With many keys, ~75% need to move when going from 4→3 servers.

Why This Is Bad

Cache stampede: All moved keys miss cache → hit database simultaneously
Data movement: Massive data transfer between servers
Downtime: Service degraded during redistribution
Cascading failures: Remaining servers overwhelmed

Example impact:

1M keys in cache
1 server fails (4→3 servers)
750,000 keys invalidated
750,000 cache misses hit database
Database overload → entire system down

Solution: Consistent Hashing

Core Idea

Instead of hash(key) % N, use a hash ring where both keys and servers are hashed to points on a circle.

Key insight: When a server is added/removed, only keys near that server are affected. Most keys stay on their current server.

How It Works

Step 1: Create a hash ring (0 to 2^32-1)

Imagine a circle with values 0 to 4,294,967,295 (2^32-1)
The circle wraps around: 2^32-1 → 0

Step 2: Hash servers onto the ring

Server A: hash("serverA") = 100
Server B: hash("serverB") = 200
Server C: hash("serverC") = 300
Server D: hash("serverD") = 400

Ring visualization:
    0
    |
 D 400 --- 100 A
    |       |
 C 300 --- 200 B

Step 3: Hash keys onto the ring

key1: hash("key1") = 150
key2: hash("key2") = 250
key3: hash("key3") = 350
key4: hash("key4") = 50

Step 4: Assign each key to the next server clockwise

key1 (150) → goes to Server B (200) [next clockwise]
key2 (250) → goes to Server C (300)
key3 (350) → goes to Server D (400)
key4 (50)  → goes to Server A (100)

Ring with keys:
        0
      key4
    |     |
 D 400 - 100 A
    |    key1
 key3   |
    |   200 B
 C 300  |
       key2

Adding a Server

Add Server E at position 225

Before:
key1 (150) → Server B (200)
key2 (250) → Server C (300)
key3 (350) → Server D (400)
key4 (50)  → Server A (100)

After adding Server E (225):
key1 (150) → Server B (200) ✅ NO CHANGE
key2 (250) → Server E (225) ❌ MOVED (was C)
key3 (350) → Server D (400) ✅ NO CHANGE
key4 (50)  → Server A (100) ✅ NO CHANGE

Only 1 out of 4 keys moved! (25% vs 75% with naive hashing)

Removing a Server

Remove Server B (200)

Before:
key1 (150) → Server B (200)
key2 (250) → Server C (300)
key3 (350) → Server D (400)
key4 (50)  → Server A (100)

After removing Server B:
key1 (150) → Server C (300) ❌ MOVED (was B)
key2 (250) → Server C (300) ✅ NO CHANGE
key3 (350) → Server D (400) ✅ NO CHANGE
key4 (50)  → Server A (100) ✅ NO CHANGE

Only key1 affected! Keys from B redistributed to next server (C)

Key property: Only K/N keys need to move (where K = total keys, N = servers)

Problem: Uneven Distribution

With basic consistent hashing, servers might not be evenly distributed on the ring.

Bad distribution example:
Server A: position 10
Server B: position 20
Server C: position 30
Server D: position 300

Result:
- Servers A, B, C handle tiny ranges
- Server D handles huge range (30-300)
- Load imbalance!

Solution: Virtual Nodes (Vnodes)

Idea: Each physical server gets multiple positions on the ring (virtual nodes).

Instead of:
Server A → 1 position

Use:
Server A → vnode-A-1, vnode-A-2, vnode-A-3 (3 positions)
Server B → vnode-B-1, vnode-B-2, vnode-B-3 (3 positions)
...

With 4 servers × 150 vnodes = 600 points on ring
Much more even distribution!

How virtual nodes work:

Physical Server A creates virtual nodes:
- vnode-A-0: hash("A-0") = 1234
- vnode-A-1: hash("A-1") = 5678
- vnode-A-2: hash("A-2") = 9012
- ... (150 virtual nodes)

All vnodes for Server A map back to physical Server A.

Benefits:

Even distribution: More vnodes = smoother load distribution
Weighted distribution: Powerful servers can have more vnodes
Faster rebalancing: Keys spread across many servers when one fails

Trade-off: More memory (need to track all vnodes)

Memory calculation:
- 100 servers × 150 vnodes = 15,000 entries
- Each entry: 32-byte hash + 8-byte pointer = 40 bytes
- Total: 15,000 × 40 = 600 KB (negligible!)

Standard vnode counts:

Cassandra: 128-256 vnodes per node
DynamoDB: 100-200 vnodes
Riak: 64 vnodes (default)

Implementation Details

Hash Function Choice

Requirements:

Uniform distribution
Fast computation
Low collision rate

Popular choices:

MD5: 128-bit hash, good distribution, widely used
MurmurHash: Faster than MD5, non-cryptographic
SHA-1: 160-bit, more secure but slower
xxHash: Extremely fast, modern choice

Interview answer: “I’d use MD5 or MurmurHash for speed and uniform distribution.”

Data Structure

Store ring as sorted list of (hash, server) pairs

class ConsistentHash:
    def __init__(self, vnodes_per_server=150):
        self.ring = []  # sorted list of (hash_value, server_id)
        self.vnodes_per_server = vnodes_per_server
        self.servers = {}
 
    def add_server(self, server_id):
        """Add server with virtual nodes"""
        for i in range(self.vnodes_per_server):
            vnode_key = f"{server_id}-{i}"
            hash_val = self._hash(vnode_key)
            self.ring.append((hash_val, server_id))
 
        # Keep ring sorted for binary search
        self.ring.sort()
        self.servers[server_id] = True
 
    def remove_server(self, server_id):
        """Remove all vnodes for this server"""
        self.ring = [(h, s) for h, s in self.ring if s != server_id]
        del self.servers[server_id]
 
    def get_server(self, key):
        """Find server for given key"""
        if not self.ring:
            return None
 
        hash_val = self._hash(key)
 
        # Binary search for first server >= hash_val
        idx = self._binary_search(hash_val)
 
        # Wrap around if needed
        if idx >= len(self.ring):
            idx = 0
 
        return self.ring[idx][1]  # return server_id
 
    def _hash(self, key):
        """Hash function (MD5 example)"""
        import hashlib
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)
 
    def _binary_search(self, hash_val):
        """Find insertion point in sorted ring"""
        left, right = 0, len(self.ring)
        while left < right:
            mid = (left + right) // 2
            if self.ring[mid][0] < hash_val:
                left = mid + 1
            else:
                right = mid
        return left

Time complexity:

add_server: O(V log N) where V = vnodes, N = total vnodes
remove_server: O(N) where N = total vnodes
get_server: O(log N) binary search

Space complexity: O(S × V) where S = servers, V = vnodes per server

Handling Edge Cases

Empty ring:

if not self.ring:
    raise Exception("No servers available")

Wrap-around:

# If key hash > all server hashes, wrap to first server
if idx >= len(self.ring):
    idx = 0

Duplicate hashes (collision):

# Use (hash, server_id) tuples; servers differ even if hash same
# Or use collision resolution in hash function

Real-World Examples

1. Amazon DynamoDB

Usage: Partition data across nodes

- Each node responsible for range of hash values
- Uses consistent hashing with virtual nodes
- 100-200 vnodes per physical node
- Automatic rebalancing when nodes added/removed

Key feature: Preference list for replication

For key K with hash H:
1. Primary: First server clockwise from H
2. Replica 1: Next server clockwise
3. Replica 2: Next server clockwise

If server fails, replicas take over seamlessly

2. Apache Cassandra

Usage: Distribute data across cluster

- Token ring with configurable vnodes (default 256)
- Each node owns multiple ranges on ring
- Read/write requests routed to correct node(s)

Example: 3-node cluster with replication factor 3

Node A: tokens 0-300, owns ranges [0-100, 200-300, ...]
Node B: tokens 100-400, owns ranges [100-200, 300-400, ...]
Node C: tokens 200-500, owns ranges [200-300, 400-500, ...]

Key with hash 250 → stored on Nodes B, C, A (next 3 clockwise)

3. Memcached (with ketama)

Usage: Distribute cache keys across memcached servers

# Client-side consistent hashing
servers = ["cache1:11211", "cache2:11211", "cache3:11211"]
ch = ConsistentHash()
for server in servers:
    ch.add_server(server)
 
# Get key's server
key = "user:12345:profile"
server = ch.get_server(key)
value = memcached_client.get(server, key)

Benefit: When cache server fails, only ~1/N keys affected (not all)

4. Load Balancing

Usage: Distribute requests to backend servers

- Hash based on user_id, session_id, or IP
- Ensures same user → same server (sticky sessions)
- Minimal disruption when servers added/removed

Example:

def route_request(user_id):
    server = consistent_hash.get_server(user_id)
    return proxy_to_server(server, request)

Why better than round-robin: Session affinity without sticky sessions

Trade-offs and Limitations

Advantages ✅

Minimal redistribution: Only K/N keys move (vs all keys)
Horizontal scalability: Easy to add/remove nodes
Even distribution: Virtual nodes balance load
Fault tolerance: Graceful degradation when nodes fail
No coordination: Each client can compute independently

Disadvantages ❌

Complexity: More complex than simple hash % N
Memory overhead: Need to store ring structure
No range queries: Keys distributed randomly (can’t scan ranges)
Hotspots possible: Popular keys still create hotspots
Cascading failure: If node fails, next node gets all its traffic

When to Use

Use consistent hashing when:

Distributing data across multiple servers
Servers frequently added/removed (elastic scaling)
Want to minimize data movement
Need fault tolerance
Using distributed cache or database

Don’t use when:

Single server (not distributed)
Need range queries (use range-based sharding)
Stable cluster (simple hash % N is fine)
Strong ordering requirements

Comparison: Consistent Hashing vs Alternatives

1. Consistent Hashing vs Modulo Hashing

Aspect	Modulo (`hash % N`)	Consistent Hashing
Simplicity	Very simple	More complex
Redistribution	~100% keys move	~1/N keys move
Even distribution	Perfect (if hash good)	Good (with vnodes)
Memory	None	O(servers × vnodes)
Use case	Stable cluster	Dynamic cluster

2. Consistent Hashing vs Range-Based Sharding

Aspect	Range Sharding	Consistent Hashing
Range queries	Efficient	Inefficient
Load balancing	Manual rebalancing	Automatic
Hotspots	Common (popular ranges)	Less common
Complexity	Simple	Moderate
Use case	Ordered data (timestamps)	Random access

Example: Time-series data (logs, metrics) → Range sharding better

3. Consistent Hashing vs Jump Hash

Jump Hash (Google’s algorithm):

Simpler than consistent hashing
Zero memory overhead
But: Cannot choose which server to add/remove
Use case: When you control server additions (not failures)

def jump_hash(key, num_buckets):
    """Google's Jump Hash"""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = ((key * 2862933555777941757) + 1) & 0xffffffffffffffff
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b

Interview Tips

Common Questions

Q: “What is consistent hashing?”

A: “Consistent hashing is a distributed hashing technique that minimizes key redistribution when the number of servers changes. Unlike traditional hash % N, which moves most keys when N changes, consistent hashing moves only about 1/N of keys on average.”

Q: “Why do we need virtual nodes?”

A: “Virtual nodes solve the load imbalance problem. With one position per server on the hash ring, you might get unlucky with distribution—one server could get a huge range. Virtual nodes give each server multiple positions, which statistically balances the load. Real systems use 100-200 vnodes per server.”

Q: “What are the trade-offs?”

A: “Pros: minimal redistribution, easy scaling, fault tolerance. Cons: more complex than simple modulo hashing, uses memory for ring structure, doesn’t support range queries efficiently, and hotspot keys can still overload a single server.”

Q: “How would you implement it?”

A: “Store a sorted array of (hash, server) pairs. When adding a server, hash its vnodes and insert into sorted array. When looking up a key, hash it and use binary search to find the next server clockwise. Time complexity is O(log N) for lookups.”

Q: “When would you NOT use consistent hashing?”

A: “Don’t use it for single-server systems, when you need range queries (use range-based sharding), or when your cluster is completely stable (simple modulo is fine). Also avoid if you have strong ordering requirements.”

Whiteboard Practice

Exercise: Draw consistent hashing on whiteboard

1. Draw circle (hash ring)
2. Add 4 servers at random positions
3. Add 3 keys, show which server each goes to
4. Remove 1 server, show which keys move
5. Explain only 25% of keys moved vs 75% with modulo

Time: Should take 5-7 minutes

Code Implementation Questions

Implement these functions (good practice):

add_server(server_id) with vnodes
remove_server(server_id)
get_server(key) with binary search
get_n_servers(key, n) for replication

Complexity analysis:

State time/space complexity for each operation
Mention binary search optimization

Key Takeaways

The problem: Traditional hash % N causes massive redistribution when N changes
The solution: Map both keys and servers to a ring; each key goes to next server clockwise
Virtual nodes: Each server has multiple ring positions for even load distribution
Performance: O(log N) lookups with binary search, ~1/N keys move per server change
Real-world: Used in DynamoDB, Cassandra, Memcached, CDNs, load balancers
Interview focus:
- Explain the rehashing problem clearly
- Draw the ring diagram
- Mention virtual nodes
- Discuss trade-offs vs alternatives
- Know time/space complexity
When to use: Distributed caching, dynamic clusters, horizontal scaling, fault tolerance
When NOT to use: Range queries needed, stable clusters, single server, strong ordering

Practice Problems

Problem 1: Design Distributed Cache

Given: 1000 cache servers, 10M keys, servers frequently fail

Question: How would you use consistent hashing? How many vnodes?

Answer:

Use consistent hashing with 150 vnodes per server
Total ring size: 150,000 positions
When server fails: ~10M/1000 = 10K keys move
Replication factor 3: store on 3 consecutive servers
Monitor: key distribution, hotspots, failure impact

Problem 2: Calculate Redistribution

Given: 10 servers, 1M keys uniformly distributed

Question:

How many keys move with modulo hashing when 1 server fails?
How many with consistent hashing?

Answer:

Modulo hashing (hash % N):
- 10 → 9 servers
- New positions: hash % 9 (different from hash % 10)
- Estimate: ~90% keys change position
- Keys moved: ~900,000

Consistent hashing:
- Keys only from failed server move
- Each server has 1M/10 = 100K keys
- Keys moved: ~100K (10x less!)

Problem 3: Implement Weighted Consistent Hashing

Question: Some servers are more powerful. How to give them more load?

Answer:

def add_server(self, server_id, weight=1.0):
    """
    weight=1.0: normal (150 vnodes)
    weight=2.0: 2x powerful (300 vnodes)
    weight=0.5: 0.5x powerful (75 vnodes)
    """
    num_vnodes = int(self.vnodes_per_server * weight)
    for i in range(num_vnodes):
        vnode_key = f"{server_id}-{i}"
        hash_val = self._hash(vnode_key)
        self.ring.append((hash_val, server_id))
    self.ring.sort()

Chapter 1 ch01-scale-zero-to-millions: Database sharding (uses consistent hashing)
Chapter 6 ch06-key-value-store: Distributed key-value store design
Distributed Systems distributed-system-components: How consistent hashing fits

External Resources

Consistent Hashing Paper (1997): Original MIT paper by Karger et al.
Cassandra Architecture: DataStax docs on token rings
DynamoDB: AWS whitepaper on partitioning
Blog: Tom White’s visual explanation

Master this chapter before designing any distributed system! Consistent hashing is fundamental for cache distribution, database sharding, and load balancing.

Practice: Draw the ring diagram 10 times until you can do it from memory in an interview.

Last Updated: 2026-04-09
Status: Complete - Ready for interview preparation

Study Notes by Niladri & AI

Explorer

ch05-consistent-hashing

Chapter 5: Consistent Hashing

Overview

The Problem: Rehashing

Naive Hash-Based Distribution

The Rehashing Problem

Why This Is Bad

Solution: Consistent Hashing

Core Idea

How It Works

Adding a Server

Removing a Server

Problem: Uneven Distribution

Solution: Virtual Nodes (Vnodes)

Implementation Details

Hash Function Choice

Data Structure

Handling Edge Cases

Real-World Examples

1. Amazon DynamoDB

2. Apache Cassandra

3. Memcached (with ketama)

4. Load Balancing

Trade-offs and Limitations

Advantages ✅

Disadvantages ❌

When to Use

Comparison: Consistent Hashing vs Alternatives

1. Consistent Hashing vs Modulo Hashing

2. Consistent Hashing vs Range-Based Sharding

3. Consistent Hashing vs Jump Hash

Interview Tips

Common Questions

Whiteboard Practice

Code Implementation Questions

Key Takeaways

Practice Problems

Problem 1: Design Distributed Cache

Problem 2: Calculate Redistribution

Problem 3: Implement Weighted Consistent Hashing

Related Topics

External Resources

Graph View

Table of Contents

Backlinks