Chapter 7: Design a Unique ID Generator in Distributed Systems

volume1 unique-id distributed-systems snowflake

Status: 🟩 Interview ready
Difficulty: Medium
Time to complete: 40 min read + practice


Overview

Generating unique IDs in a distributed system is surprisingly tricky. Every record in a database needs a unique identifier. In a single server, a simple auto-increment integer works. In distributed systems with multiple servers and sharded databases, auto-increment breaks down.

Why this matters:

  • Very common interview question at top companies
  • Tests understanding of distributed systems constraints and trade-offs
  • Twitter’s Snowflake algorithm is a canonical solution (used by Twitter, Discord, Instagram)
  • Forces discussion of clock synchronization and bit manipulation

Real-world usage:

  • Twitter/X: Tweet IDs, user IDs (Snowflake)
  • Instagram: Photo IDs (modified Snowflake)
  • Discord: Message IDs (Snowflake)
  • Stripe: Payment IDs (ULIDs)
  • MongoDB: ObjectIDs (12-byte ID with timestamp + machine + counter)

Problem Statement

Design a unique ID generator that:

  • Generates unique IDs across a distributed system (multiple machines, multiple datacenters)
  • IDs are 64-bit numbers (can be stored as BIGINT)
  • IDs are sortable by time (chronological ordering)
  • Generates 10,000+ IDs per second
  • Works without central coordination bottleneck

Step 1: Requirements & Scope (5 min)

Functional Requirements

Clarifying questions:

  • What kind of ID? Numbers only, or can include characters? β†’ Numbers only
  • How unique? Globally unique across all machines? β†’ Yes, globally unique
  • Must IDs be sortable? β†’ Yes, sortable by time
  • What size? β†’ 64-bit integers (fits in a BIGINT SQL column)
  • What scale? β†’ 10,000+ IDs per second

Scope:

  • IDs must be globally unique (no collisions anywhere in the system)
  • IDs are 64-bit integers (numeric only)
  • IDs are ordered by time (newer IDs are larger numbers)
  • System generates at minimum 10,000 IDs per second
  • System is highly available

Non-Functional Requirements

  • Uniqueness: Absolute β€” no two IDs can ever collide
  • Sortability: IDs must sort chronologically (first sort key is time)
  • Scalability: 10,000+ IDs per second system-wide
  • Low latency: ID generation must be fast (microseconds)
  • High availability: ID generator itself must never be a single point of failure

Step 2: High-Level Design β€” 4 Approaches (10 min)

Approach 1: Multi-Master Replication (Auto-Increment with Step)

How it works:

  • Each database server auto-increments, but uses a step equal to the number of servers
  • Server 1 generates: 1, 3, 5, 7, …
  • Server 2 generates: 2, 4, 6, 8, …
  • Server 3 generates: 3, 6, 9, 12, …
DB Server 1:  1 β†’ 4 β†’ 7 β†’ 10 ...  (start=1, step=3)
DB Server 2:  2 β†’ 5 β†’ 8 β†’ 11 ...  (start=2, step=3)
DB Server 3:  3 β†’ 6 β†’ 9 β†’ 12 ...  (start=3, step=3)

Pros:

  • βœ… No coordination needed between servers
  • βœ… Numeric IDs
  • βœ… Works with existing databases

Cons:

  • ❌ Not scalable β€” adding or removing a server requires changing the step on all servers (risky migration)
  • ❌ Not globally sortable β€” IDs from different servers interleave but don’t reflect time order
  • ❌ IDs don’t scale across multiple datacenters
  • ❌ Hard to determine which server generated a given ID

When to use: Never for new systems. Legacy databases may already use this.


Approach 2: UUID (128-bit)

How it works:

  • UUID (Universally Unique Identifier) is a 128-bit random or pseudo-random number
  • Can be generated independently on any machine with no coordination
  • Format: 550e8400-e29b-41d4-a716-446655440000
import uuid
id = str(uuid.uuid4())  # "550e8400-e29b-41d4-a716-446655440000"

Pros:

  • βœ… No coordination needed β€” each server generates independently
  • βœ… Simple to implement (one library call)
  • βœ… Works across datacenters, languages, databases
  • βœ… Very low collision probability (~1 in 10^36 with UUIDv4)

Cons:

  • ❌ 128-bit β€” twice the storage of 64-bit IDs, larger indexes
  • ❌ Not sortable β€” random, no time ordering, not numerically sorted
  • ❌ Not numeric β€” UUID is alphanumeric (with dashes), can’t use as BIGINT
  • ❌ Poor database index performance β€” random inserts cause B-tree fragmentation

When to use: When you don’t need sortability and storage is not a concern (e.g., internal trace IDs, session tokens).


Approach 3: Ticket Server (Centralized)

How it works:

  • Dedicated β€œticket server” is the only source of IDs
  • All application servers call the ticket server for a new ID
  • Ticket server uses a single database with auto-increment
App Server 1 ─┐
App Server 2 ─┼──→ [Ticket Server] β†’ DB (auto_increment)
App Server 3 β”€β”˜

Pros:

  • βœ… Numeric IDs
  • βœ… Simple to implement
  • βœ… Works in distributed systems

Cons:

  • ❌ Single point of failure β€” if ticket server goes down, entire system can’t generate IDs
  • ❌ Bottleneck β€” all ID requests serialized through one server
  • ❌ Need multiple ticket servers for HA β†’ synchronization problem between them
  • ❌ Network round trip adds latency to every ID request

When to use: Small-to-medium scale systems with simple requirements (Flickr used this).


Approach 4: Twitter Snowflake (The Winner!)

How it works:

  • Each machine generates its own IDs locally β€” no coordination
  • 64-bit ID structured as: timestamp + datacenter + machine + sequence
  • Timestamp prefix ensures time-sortability
  • Datacenter + Machine IDs ensure uniqueness across servers
64-bit Snowflake ID:
β”Œβ”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚0β”‚         41-bit timestamp                 β”‚ 5   β”‚ 5   β”‚  12-bit    β”‚
β”‚ β”‚    (milliseconds since epoch)            β”‚ DC  β”‚ Machβ”‚  sequence  β”‚
β”‚ β”‚                                          β”‚ ID  β”‚ ID  β”‚  number    β”‚
β””β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
  ↑                                            ↑     ↑       ↑
Sign bit                                   Datacenter Machine  4096 IDs
(always 0)                                  (32 DCs) (32/DC)  per ms

Pros:

  • βœ… 64-bit β€” fits in BIGINT, half the size of UUID
  • βœ… Sortable by time β€” timestamp is the most significant bits
  • βœ… Numeric only β€” just a 64-bit integer
  • βœ… No coordination β€” each machine generates independently
  • βœ… Very high throughput β€” 4096 IDs per millisecond per machine
  • βœ… Decentralized β€” no single point of failure

When to use: Default choice! Twitter, Discord, Instagram all use variants.


4 Approaches Comparison

ApproachUnique64-bitSortableScalableCoordinationUse Case
Multi-Master Auto-Incβœ…βœ…βŒβŒNoneLegacy only
UUIDβœ…βŒ (128-bit)βŒβœ…NoneTrace IDs, sessions
Ticket Serverβœ…βœ…βœ…βŒCentralizedSmall scale
Snowflakeβœ…βœ…βœ…βœ…NoneDefault choice

Winner: Twitter Snowflake β€” meets all requirements.


Step 3: Deep Dive β€” Twitter Snowflake (20 min)

64-Bit Structure (Memorize This!)

Bit position:
63        22 21   17 16  12 11           0
 β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 β”‚   41     β”‚   5   β”‚  5   β”‚     12      β”‚
 β”‚timestamp β”‚  DC   β”‚ Mach β”‚  sequence   β”‚
 β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Wait β€” what about bit 63?

63  62                          22 21   17 16  12 11           0
 β”Œβ”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 β”‚ 1β”‚           41 bits            β”‚  5    β”‚  5   β”‚    12 bits  β”‚
 β”‚ 0β”‚         timestamp            β”‚  DC   β”‚ Mach β”‚  sequence   β”‚
 β”‚  β”‚  (ms since custom epoch)     β”‚  ID   β”‚  ID  β”‚  number     β”‚
 β””β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
Sign                                             Total = 64 bits
bit

Bit-by-Bit Breakdown

Bit 63 β€” Sign bit (1 bit):

  • Always set to 0
  • Ensures IDs are positive numbers (avoids negative BIGINT issues)
  • Wasted bit but necessary for compatibility

Bits 62-22 β€” Timestamp (41 bits):

  • Milliseconds since a custom epoch (not Unix epoch)
  • Twitter uses: November 4, 2010 at 01:42:54 UTC (their launch)
  • Why custom epoch? Maximizes the range of the timestamp field
Max value of 41 bits = 2^41 - 1 = 2,199,023,255,551 ms
                     β‰ˆ 69.7 years

If Twitter started epoch in 2010:
  2010 + 69.7 years β‰ˆ 2079 before overflow!

With Unix epoch (1970):
  1970 + 69.7 years β‰ˆ 2039 (sooner!)

Bits 21-17 β€” Datacenter ID (5 bits):

  • 2^5 = 32 possible datacenter IDs (0-31)
  • Assigned at deployment time β€” all machines in a DC share the same DC ID
  • Set via configuration, not at runtime

Bits 16-12 β€” Machine ID (5 bits):

  • 2^5 = 32 machine IDs per datacenter (0-31)
  • Total unique machines: 32 DCs Γ— 32 machines = 1,024 machines
  • Assigned at startup from a ZooKeeper registry or config management tool
  • Each machine in the fleet has a globally unique (DC ID, Machine ID) pair

Bits 11-0 β€” Sequence Number (12 bits):

  • 2^12 = 4,096 possible sequence values per millisecond per machine
  • Incremented for every ID generated within the same millisecond
  • Reset to 0 at the start of each new millisecond
  • If 4,096 IDs exhausted within 1 ms β†’ wait until next ms

Capacity Calculations

IDs per millisecond per machine = 2^12 = 4,096
IDs per second per machine      = 4,096 Γ— 1,000 = 4,096,000 β‰ˆ 4 million
IDs per second across 1,024 machines = 4 million Γ— 1,024 β‰ˆ 4 billion/sec

For 10,000 IDs/sec:
  Even 1 machine can generate 4,096,000 IDs/sec >> 10,000 requirement

ID Generation Code (Pseudocode)

class SnowflakeIDGenerator:
    EPOCH = 1288834974657  # Nov 4, 2010 (Twitter epoch in ms)
    DC_BITS = 5
    MACHINE_BITS = 5
    SEQUENCE_BITS = 12
 
    MAX_DC_ID = (1 << DC_BITS) - 1       # 31
    MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1  # 31
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1   # 4095
 
    def __init__(self, dc_id, machine_id):
        assert 0 <= dc_id <= self.MAX_DC_ID
        assert 0 <= machine_id <= self.MAX_MACHINE_ID
 
        self.dc_id = dc_id
        self.machine_id = machine_id
        self.sequence = 0
        self.last_timestamp = -1
 
    def generate_id(self):
        timestamp = self.current_ms()
 
        if timestamp < self.last_timestamp:
            raise Exception("Clock moved backwards! Refusing to generate ID.")
 
        if timestamp == self.last_timestamp:
            self.sequence = (self.sequence + 1) & self.MAX_SEQUENCE
            if self.sequence == 0:
                # Sequence exhausted β€” wait for next millisecond
                timestamp = self.wait_next_ms(self.last_timestamp)
        else:
            self.sequence = 0  # New millisecond, reset sequence
 
        self.last_timestamp = timestamp
 
        # Assemble the 64-bit ID
        id = ((timestamp - self.EPOCH) << 22)  # 41 bits timestamp
        id |= (self.dc_id << 17)               # 5 bits DC
        id |= (self.machine_id << 12)           # 5 bits machine
        id |= self.sequence                     # 12 bits sequence
 
        return id
 
    def current_ms(self):
        return int(time.time() * 1000)
 
    def wait_next_ms(self, last_ts):
        ts = self.current_ms()
        while ts <= last_ts:
            ts = self.current_ms()
        return ts

Clock Synchronization Problem

The issue: Snowflake assumes the system clock is monotonically increasing and roughly synchronized across machines. What if the clock goes backward?

Causes of clock skew:

  • NTP (Network Time Protocol) corrections can move the clock backward
  • Server reboots, VM migrations, hardware drift
  • Even a few milliseconds of backward movement causes ID collisions

What happens if clock moves backward?

t=1000ms: generate ID with timestamp 1000
Clock skips back to t=998ms
t=998ms: generate ID with timestamp 998
t=998ms: generate ID with timestamp 998 + same sequence as before
β†’ COLLISION! Same (timestamp, dc_id, machine_id, sequence)

Solutions:

1. Refuse to generate (Snowflake default):

if timestamp < last_timestamp:
    raise Exception("Clock moved backwards!")
    # Wait or throw error β€” don't generate potentially duplicate ID

2. Wait until clock catches up:

while current_ms() < last_timestamp:
    time.sleep(0.001)  # Wait 1ms and check again

3. Use monotonic clock (preferred):

  • Operating systems provide monotonic clocks that never go backward
  • Even if wall clock is corrected, monotonic clock just pauses
  • Python: time.monotonic(), Java: System.nanoTime()

4. Use logical clocks (Hybrid Logical Clocks):

  • Combine wall clock + logical counter
  • Never goes backward, stays loosely synchronized
  • Used by CockroachDB, YugabyteDB

Timestamp Extraction β€” Sortability Explained

Why IDs are sortable:

ID = [41-bit timestamp][5-bit DC][5-bit machine][12-bit sequence]
    ← most significant                         least significant β†’

Two IDs generated at different times:
ID1 (t=1000ms): 01111101000 00001 00010 000000000001
ID2 (t=1001ms): 01111101001 00001 00010 000000000001

ID2 > ID1 because the timestamp bits (most significant) are larger.
Sorting by numeric value = sorting by time. βœ…

Extracting timestamp from ID:

def timestamp_from_id(id, epoch=1288834974657):
    timestamp_ms = (id >> 22) + epoch
    return datetime.fromtimestamp(timestamp_ms / 1000)

This means you can determine when any ID was created β€” useful for debugging and data partitioning.

Assigning Machine IDs

Problem: How does each machine know its unique machine ID?

Option 1: ZooKeeper:

  • Machines register on startup, get a unique ID from ZooKeeper
  • ZooKeeper maintains the registry, ensures no duplicates
  • Used by original Twitter Snowflake

Option 2: Configuration management:

  • Machine IDs assigned via environment variables, config files, Kubernetes labels
  • Simpler, no dependency on ZooKeeper

Option 3: IP-based:

  • Hash the IP address of the machine β†’ machine ID
  • Risk: IP reuse after decommission causes collisions
  • Not recommended for production

Alternatives to Snowflake

ULID (Universally Unique Lexicographically Sortable Identifier)

01ARZ3NDEKTSV4RRFFQ69G5FAV
β””β”€β”€β”€β”€β”€β”€β”˜β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 48-bit    80-bit random
timestamp
(ms)

- 128-bit total (like UUID but sortable)
- Base32 encoded (alphanumeric)
- Monotonically sortable within same ms
- No machine ID required β€” randomness handles uniqueness
- Used by: Stripe (their KSUID variant)

KSUID (K-Sortable Unique Identifier)

- 160-bit: 32-bit timestamp (seconds) + 128-bit random payload
- Sortable but less precise (second-level, not ms-level)
- Used by: Segment (analytics company)

UUID v7 (Draft Standard)

- 128-bit: 48-bit ms timestamp + version bits + random
- Sortable (unlike UUID v4 which is random)
- Standard RFC, widely supported
- Good replacement for UUID v4 when sortability needed

Comparison Table

AlgorithmBitsSortableCoordinationClock DependentUse Case
Snowflake64βœ… ms-levelNone (machine ID at startup)βœ… (skew risk)Twitter, Discord, High throughput
ULID128βœ… ms-levelNoneβœ… (minor)General purpose, no machine ID setup
UUID v4128❌None❌Session tokens, trace IDs
UUID v7128βœ… ms-levelNoneβœ… (minor)Standard replacement for v4
KSUID160βœ… sec-levelNoneβœ… (minor)Analytics, lower resolution ok

Step 4: Summary & Trade-offs

Final Decision Tree

Need unique IDs?
  |
  β”œβ”€β”€ Need 64-bit integers?
  |     β”œβ”€β”€ YES β†’ Need time-sortable?
  |     |           β”œβ”€β”€ YES β†’ Need high throughput (10K+/s)?
  |     |           |           └── YES β†’ Twitter Snowflake βœ… (BEST CHOICE)
  |     |           └── NO β†’ Ticket Server (small scale only)
  |     |
  |     └── NO (128-bit OK) β†’ Need time-sortable?
  |                 β”œβ”€β”€ YES β†’ ULID or UUID v7
  |                 └── NO  β†’ UUID v4
  |
  └── Need no coordination?
            └── YES β†’ Snowflake, ULID, UUID v4/v7 (all work)

Key Design Decisions

DecisionChoiceReasoning
ID size64-bitFits in BIGINT, half the size of UUID
AlgorithmTwitter Snowflake64-bit, sortable, no coordination, high throughput
SortabilityTimestamp prefix (41 bits)Most significant bits = time = numeric sort = time sort
Clock issuesRefuse on backward skewSafety over availability for ID generation
Machine ID assignmentZooKeeper or configOne-time assignment at startup, immutable
EpochCustom (not Unix)Maximizes timestamp range (~70 years from launch)

Design Summary

Snowflake ID Structure (Bit Layout)

63  62                          22 21   17 16  12 11           0
 β”Œβ”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 β”‚ 0β”‚     41-bit timestamp          β”‚  5    β”‚  5   β”‚   12 bits   β”‚
 β”‚  β”‚  ms since custom epoch       β”‚  DC   β”‚ Mach β”‚  sequence   β”‚
 β”‚  β”‚  (~69.7 years)               β”‚(0-31) β”‚(0-31)β”‚  (0-4095)   β”‚
 β””β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Max capacity per machine:  4,096 IDs/ms = 4.096 million IDs/sec
Max datacenters:          32
Max machines per DC:      32
Total unique machines:    1,024
Timestamp range:          ~69.7 years from custom epoch

Full System Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    Deployment Setup                             β”‚
β”‚                                                                 β”‚
β”‚   ZooKeeper / Config Service                                    β”‚
β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                                    β”‚
β”‚   β”‚ Machine Registry       β”‚                                    β”‚
β”‚   β”‚ DC1-Machine1: id=0     β”‚                                    β”‚
β”‚   β”‚ DC1-Machine2: id=1     β”‚                                    β”‚
β”‚   β”‚ DC2-Machine1: id=0     β”‚                                    β”‚
β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                                    β”‚
β”‚        ↓ (one-time at startup)                                  β”‚
β”‚   Each machine learns its (dc_id, machine_id)                   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    Runtime ID Generation                        β”‚
β”‚                                                                 β”‚
β”‚  Client Request β†’ Application Server                            β”‚
β”‚                         |                                       β”‚
β”‚                         v                                       β”‚
β”‚              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                       β”‚
β”‚              β”‚   Snowflake Generator    β”‚                       β”‚
β”‚              β”‚   (in-process library)   β”‚                       β”‚
β”‚              β”‚                          β”‚                       β”‚
β”‚              β”‚  1. Get current ms       β”‚                       β”‚
β”‚              β”‚  2. Check clock skew     β”‚                       β”‚
β”‚              β”‚  3. Increment sequence   β”‚                       β”‚
β”‚              β”‚  4. Assemble 64-bit ID   β”‚                       β”‚
β”‚              β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                       β”‚
β”‚                         |                                       β”‚
β”‚                         v                                       β”‚
β”‚              Return ID (no network round trip!)                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Key insight: ID generation is entirely local to the machine.
No network call, no coordination, no lock. Microsecond latency.

Interview Questions & Answers

Q: Why can’t we use auto-increment IDs in a distributed database?
A: Auto-increment works per-database-instance. When you have multiple shards or replicas, each generates its own sequence starting from 1. Two different shards will independently generate ID=1, ID=2, etc., causing collisions when data is merged or cross-shard queries run. You’d need a central sequencer (single point of failure) or a step-based scheme that’s hard to scale.

Q: Walk me through the 64-bit Snowflake structure. Why those bit counts?
A: Sign bit (1) = always 0 for positive numbers. Timestamp (41 bits) = 2^41 ms β‰ˆ 69.7 years of IDs before overflow, using a custom epoch. Datacenter ID (5 bits) = 32 DCs. Machine ID (5 bits) = 32 machines per DC = 1,024 total machines. Sequence (12 bits) = 4,096 IDs per millisecond per machine. The most significant bits are the timestamp, ensuring that sorting by numeric value = sorting by time.

Q: What is the clock skew problem in Snowflake and how do you handle it?
A: If NTP or another mechanism causes the system clock to go backward, a new ID might get the same (timestamp, machine_id, sequence) as a previously-generated ID β†’ collision. Solutions: (1) Refuse to generate IDs when clock goes backward (throw exception, let caller retry). (2) Wait until the clock catches up past the last timestamp. (3) Use monotonic clocks that never go backward. (4) Hybrid Logical Clocks (HLC) combine physical and logical time and never go backward.

Q: How many IDs per second can one Snowflake machine generate?
A: 2^12 = 4,096 IDs per millisecond = 4,096,000 IDs per second β‰ˆ 4 million IDs/sec. With 1,024 machines total (32 DCs Γ— 32 machines), the system can generate about 4 billion IDs per second globally. The 10,000 IDs/second requirement is trivially met by even a single machine.

Q: How do you ensure machine IDs are unique when machines are added or replaced?
A: Use a coordination service like ZooKeeper. On startup, each machine registers itself and receives a unique ID from ZooKeeper’s sequential node feature. ZooKeeper guarantees atomicity β€” no two machines ever get the same ID. Alternative: Use config management (Kubernetes labels, environment variables) to assign IDs before deployment. The key requirement is that the (datacenter_id, machine_id) pair is globally unique and immutable for the lifetime of the machine.


Key Takeaways

  1. Auto-increment fails in distributed systems because each shard generates its own sequence β€” leading to collisions when shards are queried together.
  2. Twitter Snowflake is the standard answer for 64-bit, time-sortable, numerically unique IDs at scale β€” no coordination needed at runtime.
  3. Memorize the 64-bit structure: 1 sign + 41 timestamp + 5 datacenter + 5 machine + 12 sequence = 64 bits total.
  4. Sortability comes from the timestamp prefix β€” since timestamp occupies the most significant bits, sorting by numeric value is sorting by time.
  5. Each machine generates locally β€” no network round trip, no lock. ID generation takes microseconds. Throughput limited only by the sequence space (4096/ms).
  6. Clock synchronization is a real problem β€” NTP can push clocks backward. The safe solution is to refuse generation when the clock goes backward, and use monotonic clocks.
  7. UUID is not the answer for most production systems β€” 128 bits, not sortable, bad database index performance due to random inserts.


Practice this design! Medium difficulty but requires memorizing the Snowflake bit structure. Be ready to:

  1. Recite the 64-bit Snowflake layout from memory
  2. Calculate max capacity (IDs/sec per machine and total)
  3. Explain clock skew problem and solutions
  4. Compare all 4 approaches with trade-offs
  5. Explain why timestamp prefix guarantees sortability

Last Updated: 2026-04-13
Status: Medium difficulty β€” Must memorize Snowflake structure!