Chapter 5: Consistent Hashing

volume1 consistent-hashing distributed-systems

Status: đźź© Interview ready
Difficulty: Medium-Hard (Core distributed systems concept)
Time to complete: 45-60 min read + practice


Overview

Consistent hashing is a special hashing technique that minimizes key redistribution when hash table size changes. It’s a foundational concept in distributed systems.

Why this matters: Essential for designing scalable distributed caches, databases, and load balancers. Frequently asked in system design interviews.

Real-world uses:

  • Amazon DynamoDB (partitioning)
  • Apache Cassandra (data distribution)
  • Memcached, Redis clusters (cache distribution)
  • Akamai CDN (content distribution)
  • Discord (server sharding)

The Problem: Rehashing

Naive Hash-Based Distribution

Scenario: Distribute keys across N servers using: serverIndex = hash(key) % N

Example with 4 servers (N=4):
- key1 hash: 1234 → server 1234 % 4 = 2
- key2 hash: 5678 → server 5678 % 4 = 2
- key3 hash: 9012 → server 9012 % 4 = 0
- key4 hash: 3456 → server 3456 % 4 = 0

Works fine… until servers change!

The Rehashing Problem

When a server is added or removed, most keys get redistributed.

Server goes down (N=4 → N=3):
- key1: 1234 % 3 = 1 (was 2) ❌ MOVED
- key2: 5678 % 3 = 2 (was 2) âś… SAME
- key3: 9012 % 3 = 0 (was 0) âś… SAME
- key4: 3456 % 3 = 0 (was 0) âś… SAME

Only lucky keys stay in place!

With many keys, ~75% need to move when going from 4→3 servers.

Why This Is Bad

  1. Cache stampede: All moved keys miss cache → hit database simultaneously
  2. Data movement: Massive data transfer between servers
  3. Downtime: Service degraded during redistribution
  4. Cascading failures: Remaining servers overwhelmed

Example impact:

  • 1M keys in cache
  • 1 server fails (4→3 servers)
  • 750,000 keys invalidated
  • 750,000 cache misses hit database
  • Database overload → entire system down

Solution: Consistent Hashing

Core Idea

Instead of hash(key) % N, use a hash ring where both keys and servers are hashed to points on a circle.

Key insight: When a server is added/removed, only keys near that server are affected. Most keys stay on their current server.

How It Works

Step 1: Create a hash ring (0 to 2^32-1)

Imagine a circle with values 0 to 4,294,967,295 (2^32-1)
The circle wraps around: 2^32-1 → 0

Step 2: Hash servers onto the ring

Server A: hash("serverA") = 100
Server B: hash("serverB") = 200
Server C: hash("serverC") = 300
Server D: hash("serverD") = 400

Ring visualization:
    0
    |
 D 400 --- 100 A
    |       |
 C 300 --- 200 B

Step 3: Hash keys onto the ring

key1: hash("key1") = 150
key2: hash("key2") = 250
key3: hash("key3") = 350
key4: hash("key4") = 50

Step 4: Assign each key to the next server clockwise

key1 (150) → goes to Server B (200) [next clockwise]
key2 (250) → goes to Server C (300)
key3 (350) → goes to Server D (400)
key4 (50)  → goes to Server A (100)

Ring with keys:
        0
      key4
    |     |
 D 400 - 100 A
    |    key1
 key3   |
    |   200 B
 C 300  |
       key2

Adding a Server

Add Server E at position 225

Before:
key1 (150) → Server B (200)
key2 (250) → Server C (300)
key3 (350) → Server D (400)
key4 (50)  → Server A (100)

After adding Server E (225):
key1 (150) → Server B (200) ✅ NO CHANGE
key2 (250) → Server E (225) ❌ MOVED (was C)
key3 (350) → Server D (400) ✅ NO CHANGE
key4 (50)  → Server A (100) ✅ NO CHANGE

Only 1 out of 4 keys moved! (25% vs 75% with naive hashing)

Removing a Server

Remove Server B (200)

Before:
key1 (150) → Server B (200)
key2 (250) → Server C (300)
key3 (350) → Server D (400)
key4 (50)  → Server A (100)

After removing Server B:
key1 (150) → Server C (300) ❌ MOVED (was B)
key2 (250) → Server C (300) ✅ NO CHANGE
key3 (350) → Server D (400) ✅ NO CHANGE
key4 (50)  → Server A (100) ✅ NO CHANGE

Only key1 affected! Keys from B redistributed to next server (C)

Key property: Only K/N keys need to move (where K = total keys, N = servers)


Problem: Uneven Distribution

With basic consistent hashing, servers might not be evenly distributed on the ring.

Bad distribution example:
Server A: position 10
Server B: position 20
Server C: position 30
Server D: position 300

Result:
- Servers A, B, C handle tiny ranges
- Server D handles huge range (30-300)
- Load imbalance!

Solution: Virtual Nodes (Vnodes)

Idea: Each physical server gets multiple positions on the ring (virtual nodes).

Instead of:
Server A → 1 position

Use:
Server A → vnode-A-1, vnode-A-2, vnode-A-3 (3 positions)
Server B → vnode-B-1, vnode-B-2, vnode-B-3 (3 positions)
...

With 4 servers Ă— 150 vnodes = 600 points on ring
Much more even distribution!

How virtual nodes work:

Physical Server A creates virtual nodes:
- vnode-A-0: hash("A-0") = 1234
- vnode-A-1: hash("A-1") = 5678
- vnode-A-2: hash("A-2") = 9012
- ... (150 virtual nodes)

All vnodes for Server A map back to physical Server A.

Benefits:

  1. Even distribution: More vnodes = smoother load distribution
  2. Weighted distribution: Powerful servers can have more vnodes
  3. Faster rebalancing: Keys spread across many servers when one fails

Trade-off: More memory (need to track all vnodes)

Memory calculation:
- 100 servers Ă— 150 vnodes = 15,000 entries
- Each entry: 32-byte hash + 8-byte pointer = 40 bytes
- Total: 15,000 Ă— 40 = 600 KB (negligible!)

Standard vnode counts:

  • Cassandra: 128-256 vnodes per node
  • DynamoDB: 100-200 vnodes
  • Riak: 64 vnodes (default)

Implementation Details

Hash Function Choice

Requirements:

  • Uniform distribution
  • Fast computation
  • Low collision rate

Popular choices:

  • MD5: 128-bit hash, good distribution, widely used
  • MurmurHash: Faster than MD5, non-cryptographic
  • SHA-1: 160-bit, more secure but slower
  • xxHash: Extremely fast, modern choice

Interview answer: “I’d use MD5 or MurmurHash for speed and uniform distribution.”

Data Structure

Store ring as sorted list of (hash, server) pairs

class ConsistentHash:
    def __init__(self, vnodes_per_server=150):
        self.ring = []  # sorted list of (hash_value, server_id)
        self.vnodes_per_server = vnodes_per_server
        self.servers = {}
 
    def add_server(self, server_id):
        """Add server with virtual nodes"""
        for i in range(self.vnodes_per_server):
            vnode_key = f"{server_id}-{i}"
            hash_val = self._hash(vnode_key)
            self.ring.append((hash_val, server_id))
 
        # Keep ring sorted for binary search
        self.ring.sort()
        self.servers[server_id] = True
 
    def remove_server(self, server_id):
        """Remove all vnodes for this server"""
        self.ring = [(h, s) for h, s in self.ring if s != server_id]
        del self.servers[server_id]
 
    def get_server(self, key):
        """Find server for given key"""
        if not self.ring:
            return None
 
        hash_val = self._hash(key)
 
        # Binary search for first server >= hash_val
        idx = self._binary_search(hash_val)
 
        # Wrap around if needed
        if idx >= len(self.ring):
            idx = 0
 
        return self.ring[idx][1]  # return server_id
 
    def _hash(self, key):
        """Hash function (MD5 example)"""
        import hashlib
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)
 
    def _binary_search(self, hash_val):
        """Find insertion point in sorted ring"""
        left, right = 0, len(self.ring)
        while left < right:
            mid = (left + right) // 2
            if self.ring[mid][0] < hash_val:
                left = mid + 1
            else:
                right = mid
        return left

Time complexity:

  • add_server: O(V log N) where V = vnodes, N = total vnodes
  • remove_server: O(N) where N = total vnodes
  • get_server: O(log N) binary search

Space complexity: O(S Ă— V) where S = servers, V = vnodes per server

Handling Edge Cases

Empty ring:

if not self.ring:
    raise Exception("No servers available")

Wrap-around:

# If key hash > all server hashes, wrap to first server
if idx >= len(self.ring):
    idx = 0

Duplicate hashes (collision):

# Use (hash, server_id) tuples; servers differ even if hash same
# Or use collision resolution in hash function

Real-World Examples

1. Amazon DynamoDB

Usage: Partition data across nodes

- Each node responsible for range of hash values
- Uses consistent hashing with virtual nodes
- 100-200 vnodes per physical node
- Automatic rebalancing when nodes added/removed

Key feature: Preference list for replication

For key K with hash H:
1. Primary: First server clockwise from H
2. Replica 1: Next server clockwise
3. Replica 2: Next server clockwise

If server fails, replicas take over seamlessly

2. Apache Cassandra

Usage: Distribute data across cluster

- Token ring with configurable vnodes (default 256)
- Each node owns multiple ranges on ring
- Read/write requests routed to correct node(s)

Example: 3-node cluster with replication factor 3

Node A: tokens 0-300, owns ranges [0-100, 200-300, ...]
Node B: tokens 100-400, owns ranges [100-200, 300-400, ...]
Node C: tokens 200-500, owns ranges [200-300, 400-500, ...]

Key with hash 250 → stored on Nodes B, C, A (next 3 clockwise)

3. Memcached (with ketama)

Usage: Distribute cache keys across memcached servers

# Client-side consistent hashing
servers = ["cache1:11211", "cache2:11211", "cache3:11211"]
ch = ConsistentHash()
for server in servers:
    ch.add_server(server)
 
# Get key's server
key = "user:12345:profile"
server = ch.get_server(key)
value = memcached_client.get(server, key)

Benefit: When cache server fails, only ~1/N keys affected (not all)

4. Load Balancing

Usage: Distribute requests to backend servers

- Hash based on user_id, session_id, or IP
- Ensures same user → same server (sticky sessions)
- Minimal disruption when servers added/removed

Example:

def route_request(user_id):
    server = consistent_hash.get_server(user_id)
    return proxy_to_server(server, request)

Why better than round-robin: Session affinity without sticky sessions


Trade-offs and Limitations

Advantages âś…

  1. Minimal redistribution: Only K/N keys move (vs all keys)
  2. Horizontal scalability: Easy to add/remove nodes
  3. Even distribution: Virtual nodes balance load
  4. Fault tolerance: Graceful degradation when nodes fail
  5. No coordination: Each client can compute independently

Disadvantages ❌

  1. Complexity: More complex than simple hash % N
  2. Memory overhead: Need to store ring structure
  3. No range queries: Keys distributed randomly (can’t scan ranges)
  4. Hotspots possible: Popular keys still create hotspots
  5. Cascading failure: If node fails, next node gets all its traffic

When to Use

Use consistent hashing when:

  • Distributing data across multiple servers
  • Servers frequently added/removed (elastic scaling)
  • Want to minimize data movement
  • Need fault tolerance
  • Using distributed cache or database

Don’t use when:

  • Single server (not distributed)
  • Need range queries (use range-based sharding)
  • Stable cluster (simple hash % N is fine)
  • Strong ordering requirements

Comparison: Consistent Hashing vs Alternatives

1. Consistent Hashing vs Modulo Hashing

AspectModulo (hash % N)Consistent Hashing
SimplicityVery simpleMore complex
Redistribution~100% keys move~1/N keys move
Even distributionPerfect (if hash good)Good (with vnodes)
MemoryNoneO(servers Ă— vnodes)
Use caseStable clusterDynamic cluster

2. Consistent Hashing vs Range-Based Sharding

AspectRange ShardingConsistent Hashing
Range queriesEfficientInefficient
Load balancingManual rebalancingAutomatic
HotspotsCommon (popular ranges)Less common
ComplexitySimpleModerate
Use caseOrdered data (timestamps)Random access

Example: Time-series data (logs, metrics) → Range sharding better

3. Consistent Hashing vs Jump Hash

Jump Hash (Google’s algorithm):

  • Simpler than consistent hashing
  • Zero memory overhead
  • But: Cannot choose which server to add/remove
  • Use case: When you control server additions (not failures)
def jump_hash(key, num_buckets):
    """Google's Jump Hash"""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = ((key * 2862933555777941757) + 1) & 0xffffffffffffffff
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b

Interview Tips

Common Questions

Q: “What is consistent hashing?”

A: “Consistent hashing is a distributed hashing technique that minimizes key redistribution when the number of servers changes. Unlike traditional hash % N, which moves most keys when N changes, consistent hashing moves only about 1/N of keys on average.”

Q: “Why do we need virtual nodes?”

A: “Virtual nodes solve the load imbalance problem. With one position per server on the hash ring, you might get unlucky with distribution—one server could get a huge range. Virtual nodes give each server multiple positions, which statistically balances the load. Real systems use 100-200 vnodes per server.”

Q: “What are the trade-offs?”

A: “Pros: minimal redistribution, easy scaling, fault tolerance. Cons: more complex than simple modulo hashing, uses memory for ring structure, doesn’t support range queries efficiently, and hotspot keys can still overload a single server.”

Q: “How would you implement it?”

A: “Store a sorted array of (hash, server) pairs. When adding a server, hash its vnodes and insert into sorted array. When looking up a key, hash it and use binary search to find the next server clockwise. Time complexity is O(log N) for lookups.”

Q: “When would you NOT use consistent hashing?”

A: “Don’t use it for single-server systems, when you need range queries (use range-based sharding), or when your cluster is completely stable (simple modulo is fine). Also avoid if you have strong ordering requirements.”

Whiteboard Practice

Exercise: Draw consistent hashing on whiteboard

1. Draw circle (hash ring)
2. Add 4 servers at random positions
3. Add 3 keys, show which server each goes to
4. Remove 1 server, show which keys move
5. Explain only 25% of keys moved vs 75% with modulo

Time: Should take 5-7 minutes

Code Implementation Questions

Implement these functions (good practice):

  1. add_server(server_id) with vnodes
  2. remove_server(server_id)
  3. get_server(key) with binary search
  4. get_n_servers(key, n) for replication

Complexity analysis:

  • State time/space complexity for each operation
  • Mention binary search optimization

Key Takeaways

  1. The problem: Traditional hash % N causes massive redistribution when N changes

  2. The solution: Map both keys and servers to a ring; each key goes to next server clockwise

  3. Virtual nodes: Each server has multiple ring positions for even load distribution

  4. Performance: O(log N) lookups with binary search, ~1/N keys move per server change

  5. Real-world: Used in DynamoDB, Cassandra, Memcached, CDNs, load balancers

  6. Interview focus:

    • Explain the rehashing problem clearly
    • Draw the ring diagram
    • Mention virtual nodes
    • Discuss trade-offs vs alternatives
    • Know time/space complexity
  7. When to use: Distributed caching, dynamic clusters, horizontal scaling, fault tolerance

  8. When NOT to use: Range queries needed, stable clusters, single server, strong ordering


Practice Problems

Problem 1: Design Distributed Cache

Given: 1000 cache servers, 10M keys, servers frequently fail

Question: How would you use consistent hashing? How many vnodes?

Answer:

  • Use consistent hashing with 150 vnodes per server
  • Total ring size: 150,000 positions
  • When server fails: ~10M/1000 = 10K keys move
  • Replication factor 3: store on 3 consecutive servers
  • Monitor: key distribution, hotspots, failure impact

Problem 2: Calculate Redistribution

Given: 10 servers, 1M keys uniformly distributed

Question:

  • How many keys move with modulo hashing when 1 server fails?
  • How many with consistent hashing?

Answer:

Modulo hashing (hash % N):
- 10 → 9 servers
- New positions: hash % 9 (different from hash % 10)
- Estimate: ~90% keys change position
- Keys moved: ~900,000

Consistent hashing:
- Keys only from failed server move
- Each server has 1M/10 = 100K keys
- Keys moved: ~100K (10x less!)

Problem 3: Implement Weighted Consistent Hashing

Question: Some servers are more powerful. How to give them more load?

Answer:

def add_server(self, server_id, weight=1.0):
    """
    weight=1.0: normal (150 vnodes)
    weight=2.0: 2x powerful (300 vnodes)
    weight=0.5: 0.5x powerful (75 vnodes)
    """
    num_vnodes = int(self.vnodes_per_server * weight)
    for i in range(num_vnodes):
        vnode_key = f"{server_id}-{i}"
        hash_val = self._hash(vnode_key)
        self.ring.append((hash_val, server_id))
    self.ring.sort()

External Resources


Master this chapter before designing any distributed system! Consistent hashing is fundamental for cache distribution, database sharding, and load balancing.

Practice: Draw the ring diagram 10 times until you can do it from memory in an interview.


Last Updated: 2026-04-09
Status: Complete - Ready for interview preparation