Chapter 13: Design a Search Autocomplete System

volume1 autocomplete trie search typeahead

Status: 🟩 Interview ready
Difficulty: Medium-Hard
Time to complete: 45 min read + practice


Overview

A search autocomplete system (also called typeahead or search suggestions) shows the top-K most popular query completions as a user types each character. Classic examples: Google search bar showing 5 suggestions as you type, Amazon product search, YouTube video search.

Why this matters:

  • Very common interview question (Google, Amazon, LinkedIn all ask this)
  • Teaches trie data structures in a practical context
  • Combines data pipeline design with low-latency query serving
  • Forces discussion of trade-offs: accuracy vs speed vs memory

Problem Statement

Design a search autocomplete system that:

  • Returns top 5 query suggestions as user types each character
  • Suggestions ranked by historical search frequency
  • Responds in < 100ms (feels instant to users)
  • Handles 10M DAU each performing ~10 searches/day
  • Matches from beginning of query (prefix match), not middle

Step 1: Requirements & Scope (5 min)

Functional Requirements

Clarifying questions:

  • How many suggestions to return? β†’ Top 5
  • Match from beginning or middle of query? β†’ Beginning only (prefix match, not substring)
  • Sort by? β†’ Historical search frequency (popularity)
  • How fresh are suggestions? β†’ Weekly update (not real-time, simplification)
  • Personalization? β†’ No (same suggestions for all users, mention as extension)
  • Fuzzy matching? β†’ No (exact prefix only, mention as extension)
  • Multi-language? β†’ English only (mention multi-language as extension)
  • Scale? β†’ 10M DAU, ~10 searches per user = 100M searches/day

Scope:

  • Prefix-based autocomplete (not substring, not fuzzy)
  • Top 5 results sorted by search frequency
  • Suggestions updated weekly based on historical data
  • English only, no personalization

Non-Functional Requirements

  • Low latency: < 100ms response time (fast enough to keep up with typing)
  • High availability: 99.9% uptime (search must always work)
  • Scalability: 10M DAU, 100M searches/day
  • Freshness: Suggestions updated at least weekly
  • Relevance: Show truly popular queries, filter spam/offensive content

Capacity Estimation

DAU: 10M
Searches per user per day: 10
Total searches/day: 10M Γ— 10 = 100M/day
Peak QPS: 100M / 86400 Γ— 2 (peak factor) β‰ˆ 2,300 QPS

Characters per search query: avg 20 characters
Autocomplete requests per search: ~20 (one per character typed)
Total autocomplete requests: 100M Γ— 20 = 2B/day
Autocomplete QPS: 2B / 86400 β‰ˆ 23,000 QPS peak β‰ˆ 50K at peak

Storage:
- 10M daily unique queries Γ— 200 bytes (query + metadata) = 2 GB/day
- Weekly: 14 GB of new query data
- Trie for top 5B unique queries: ~100 GB (estimated)

Step 2: High-Level Design (10 min)

Two Separate Services

The autocomplete system splits into two distinct services with very different requirements:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Search Autocomplete System                 β”‚
β”‚                                                         β”‚
β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚   β”‚  Data Gathering     β”‚   β”‚   Query Service       β”‚  β”‚
β”‚   β”‚  Service            β”‚   β”‚                       β”‚  β”‚
β”‚   β”‚                     β”‚   β”‚  - Serve top-5 for    β”‚  β”‚
β”‚   β”‚  - Log all searches β”‚   β”‚    any prefix         β”‚  β”‚
β”‚   β”‚  - Aggregate counts β”‚   β”‚  - < 100ms response   β”‚  β”‚
β”‚   β”‚  - Update weekly    β”‚   β”‚  - Cache in Redis      β”‚  β”‚
β”‚   β”‚  - Build trie       β”‚   β”‚  - Read-heavy          β”‚  β”‚
β”‚   β”‚  - Write-heavy      β”‚   β”‚                       β”‚  β”‚
β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Why separate?

  • Data gathering is batch/offline (weekly rebuild, can be slow)
  • Query service is online/real-time (must respond in < 100ms)
  • Different scaling needs: gathering is write-heavy batch, query is read-heavy online

Data Flow Overview

User searches "twitter" β†’ Logged
User searches "twitter bootstrap" β†’ Logged
User searches "twitter api" β†’ Logged
...millions of searches logged weekly...

Weekly batch job:
  1. Aggregate raw logs β†’ count per query
  2. Filter spam/offensive content
  3. Build optimized trie from top queries
  4. Load trie into cache (Redis)

User types "tw":
  β†’ Query Service β†’ Redis cache β†’ Returns ["twitter", "twitch", "twitter bootstrap", ...]

Basic Architecture

[User types "tw"]
        β”‚
        β–Ό HTTP GET /autocomplete?prefix=tw
[Load Balancer]
        β”‚
        β–Ό
[Query Servers]
        β”‚
        β”œβ”€β”€β†’ [Redis Trie Cache] ─── HIT ──→ Return top-5 results
        β”‚                           MISS
        └──────────────────────→ [Trie DB (persistent)] β†’ Return top-5 results
                                    (fallback)

Data Gathering Pipeline:

[Search Logs]
     β”‚
     β–Ό (real-time)
[Log Aggregator / Apache Kafka]
     β”‚
     β–Ό (weekly batch)
[Spark / Hadoop MapReduce]
  - Count query frequencies
  - Filter and rank
     β”‚
     β–Ό
[Trie Builder Service]
  - Build optimized trie from top-N queries
  - Serialize to key-value store
     β”‚
     β–Ό
[Trie DB (HBase/DynamoDB)]
     β”‚
     β–Ό (load into cache)
[Redis Trie Cache]

Step 3: Deep Dive - Trie Data Structure (20 min)

What is a Trie?

A trie (prefix tree) is a tree data structure where each node represents a character and paths from root to nodes represent prefixes/words.

Queries: "tree", "trie", "try", "twitter", "twitch"

                root
                 β”‚
                 t
                 β”‚
                 r ─────────────── w
                 β”‚                 β”‚
           ─────────────       ────────────
           e        i   y      i           i
           β”‚        β”‚          β”‚           β”‚
           e        e         tter        tch
           β”‚        β”‚
         [tree]   [trie]    [try]    [twitter]  [twitch]

Node structure:
- children: map of char β†’ child node
- is_word: bool (does a valid query end here?)
- top_queries: list of top-5 queries with this prefix  ← KEY OPTIMIZATION

Naive Trie: How to Find Top-5

Approach: Traverse to prefix node, then do DFS to find all matching queries, sort by frequency.

def get_top_k(trie, prefix, k=5):
    # Step 1: Navigate to prefix node  O(p) where p = prefix length
    node = trie.root
    for char in prefix:
        if char not in node.children:
            return []  # No matches
        node = node.children[char]
 
    # Step 2: DFS to find all queries with this prefix  O(n) where n = descendant nodes
    all_queries = []
    dfs(node, prefix, all_queries)
 
    # Step 3: Sort by frequency and return top-k  O(n log n)
    all_queries.sort(key=lambda q: q.frequency, reverse=True)
    return all_queries[:k]

Naive trie time complexity:

  • Navigate to prefix: O(p)
  • DFS to collect all matches: O(n) where n = nodes with this prefix
  • Sort: O(n log n)
  • Total: O(p + n log n) β€” too slow when n is large!

Optimized Trie: Store Top-K at Each Node

Key insight: Cache the top-k results at every node. No need to DFS or sort at query time.

class TrieNode:
    def __init__(self):
        self.children = {}        # char β†’ TrieNode
        self.is_word = False
        self.frequency = 0        # frequency of this exact query
        self.top_queries = []     # Top-k queries for this PREFIX (cached!)
 
def get_top_k_OPTIMIZED(trie, prefix, k=5):
    # Step 1: Navigate to prefix node  O(p)
    node = trie.root
    for char in prefix:
        if char not in node.children:
            return []
        node = node.children[char]
 
    # Step 2: Return cached top-k (no DFS needed!)  O(1)
    return node.top_queries[:k]

Optimized trie time complexity:

  • Navigate to prefix: O(p) where p = prefix length
  • Return cached results: O(1)
  • Total: O(p) β‰ˆ O(1) for practical prefix lengths (avg 20 chars)

Trade-off: Space vs Speed:

AspectNaive TrieOptimized Trie
Query timeO(p + n log n)O(p)
Build timeO(n)O(n log k) per node
MemoryO(n)O(n Γ— k) β€” stores top-k at every node
UpdateRecompute on insertRecompute all ancestor nodes
When to useToy/small datasetsProduction at scale

Memory cost of optimization:

  • 1M unique prefix nodes Γ— 5 results Γ— 50 bytes/result = 250 MB
  • Very acceptable trade-off for O(1) query performance

Building the Optimized Trie

def build_trie(queries_with_frequency):
    """queries_with_frequency = list of (query_string, count)"""
    root = TrieNode()
 
    # Phase 1: Insert all queries, building the tree
    for query, freq in queries_with_frequency:
        node = root
        for char in query:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_word = True
        node.frequency = freq
 
    # Phase 2: Bottom-up computation of top-k at each node
    # (DFS post-order traversal)
    def compute_top_k(node, current_prefix):
        # Collect top-k from all children
        all_queries = []
        if node.is_word:
            all_queries.append((current_prefix, node.frequency))
        for char, child in node.children.items():
            child_top = compute_top_k(child, current_prefix + char)
            all_queries.extend(child_top)
 
        # Keep only top-k
        all_queries.sort(key=lambda x: x[1], reverse=True)
        node.top_queries = all_queries[:5]
        return node.top_queries
 
    compute_top_k(root, "")
    return root

Trie Serialization to Key-Value Store

Problem: In-memory trie is great for a single server, but:

  • Can’t fit 100 GB trie in one Redis instance
  • Can’t reload trie quickly after server restart
  • Can’t share trie across query servers

Solution: Serialize trie to a distributed key-value store.

Trie Serialization:
Key = prefix string
Value = top-5 queries with frequencies (JSON)

"t"   β†’ [("twitter",9000), ("twitch",8500), ("twitter api",7000), ...]
"tw"  β†’ [("twitter",9000), ("twitch",8500), ("twitter api",7000), ...]
"twi" β†’ [("twitter",9000), ("twitch",8500), ("twitter api",7000), ...]
"twit"β†’ [("twitter",9000), ("twitch",8500), ("twitter api",7000), ...]
"twitt"β†’[("twitter",9000), ("twitter api",7000), ("twitter bootstrap",5000), ...]
# Serialize trie to key-value store
def serialize_to_kv(trie_node, prefix, kv_store):
    kv_store.set(prefix, trie_node.top_queries)
    for char, child in trie_node.children.items():
        serialize_to_kv(child, prefix + char, kv_store)
 
# Query using key-value store (no in-memory trie needed!)
def query_autocomplete(prefix, kv_store):
    return kv_store.get(prefix)  # O(1) lookup!

With serialized trie:

  • Any query server can look up prefix β†’ top-5 in O(1)
  • Redis serves as the trie cache layer
  • DB (HBase/DynamoDB) as persistent backup
  • No need to load entire trie in memory on each server

Data Gathering Pipeline (Detailed)

Step 1: Log Collection
─────────────────────
Every search query logged to Kafka with:
{ query: "twitter", user_id: 123, timestamp: 1640000000 }

Step 2: Weekly Batch Aggregation (Spark/Hadoop)
────────────────────────────────────────────────
Input: Raw search logs from past week
Process:
  - Group by query string
  - Count occurrences
  - Deduplicate (don't count same user twice for same query)
  - Apply decay factor (older queries count less)
Output: (query, weekly_count) pairs

Step 3: Frequency Calculation with Decay
─────────────────────────────────────────
new_frequency = (1 - decay_factor) Γ— old_frequency + decay_factor Γ— weekly_count
decay_factor = 0.1 to 0.5 (tune based on how quickly trends change)
This prevents old popular queries from dominating forever.

Step 4: Filtering
──────────────────
Remove: offensive/profane content (blocklist), spam queries (bot-generated),
        personally identifying info (SSNs, emails in queries)
Method: Blocklist filter + ML classifier for new offensive content

Step 5: Trie Builder
─────────────────────
Take top-N queries (by frequency) after filtering
Build optimized trie: insert all queries, compute top-k at each node
Serialize trie to key-value store (HBase/DynamoDB)

Step 6: Cache Refresh
──────────────────────
New trie data loaded into Redis cache
Old trie data expires (TTL-based or explicit invalidation)

Update frequency: Weekly rebuild is simple and acceptable. Why not real-time?

  • Trie rebuild is expensive (hours of compute)
  • Search trends don’t change hourly β€” weekly is fresh enough
  • Simpler architecture (no complex incremental update logic)

Query Service Architecture (Detailed)

User types "tw" in search box:
Browser sends: GET /autocomplete?prefix=tw

                     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                     β”‚      Query Service            β”‚
                     β”‚                              β”‚
Request "tw" ───────►│  1. Check Redis Trie Cache   β”‚
                     β”‚     key = "tw"               β”‚
                     β”‚     result = ["twitter",...] │──► Return top-5 (hit!)
                     β”‚                              β”‚
                     β”‚  2. Cache MISS?              β”‚
                     β”‚     β†’ Query Trie DB (HBase)  │──► Return top-5 + cache result
                     β”‚       key = "tw"             β”‚
                     β”‚                              β”‚
                     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Performance optimization: Client-side caching + debouncing:

// Client-side optimization
let debounceTimer;
const cache = new Map();  // In-browser cache
 
searchInput.addEventListener('input', (e) => {
    clearTimeout(debounceTimer);
    const prefix = e.target.value;
 
    // Don't query for very short prefixes
    if (prefix.length < 2) return;
 
    // Check browser cache first
    if (cache.has(prefix)) {
        showSuggestions(cache.get(prefix));
        return;
    }
 
    // Debounce: only fire after 100ms of no typing
    debounceTimer = setTimeout(() => {
        fetch(`/autocomplete?prefix=${prefix}`)
            .then(r => r.json())
            .then(results => {
                cache.set(prefix, results);
                showSuggestions(results);
            });
    }, 100);
});

Trie Sharding for Scale

Problem: Trie data is too large for single server (100 GB+).

Option 1: Shard by starting character (Simple):

Shard 1: a, b, c (queries starting with a/b/c)
Shard 2: d, e, f
...
Shard 9: v, w, x, y, z

Problems:

  • Uneven distribution (β€œs” has far more queries than β€œx”)
  • Hot shards (popular letters)

Option 2: Shard by prefix range (Better, but complex):

Analyze query distribution.
Shard by frequency-weighted prefix splits:
  Shard 1: a - am  (balanced by query volume)
  Shard 2: an - az + b
  Shard 3: c - ...
  ...

Option 3: Consistent hashing on prefix (Recommended for simplicity):

# Route prefix to shard
def get_shard(prefix):
    shard_id = hash(prefix) % NUM_SHARDS
    return shard_servers[shard_id]
 
# Query
def get_suggestions(prefix):
    shard = get_shard(prefix)
    return shard.query(prefix)

Benefits:

  • Even distribution if hash function is good
  • Easy to add shards (re-hash subset)
  • Consistent routing (same prefix always goes to same shard)

Note: With serialized trie in Redis, sharding is just Redis Cluster’s built-in sharding of keys β€” the prefix string is the key, Redis Cluster distributes automatically.

Handling Special Cases

Multi-language support:

Separate trie per language:
  - English trie: english_trie_
  - Spanish trie: spanish_trie_
  - Chinese trie: chinese_trie_ (trie on characters, not pinyin)

Route based on user's locale or browser language:
  Accept-Language: zh-CN β†’ query chinese_trie_

Personalization (advanced extension):

Base suggestions = global trie (population-level frequencies)
Personal layer = user's own recent search history (stored per user)

Merge: top-2 personal + top-3 global
Implementation: Redis per-user hash storing recent query counts
                Blend with global trie at query time

Filtering offensive content:

Approach 1: Blocklist filter (for known bad terms)
  - Maintain list of blocked terms
  - Filter during trie build and at query time

Approach 2: ML classifier (for new offensive content)
  - Train classifier on labeled data
  - Run on new queries in data pipeline before trie build

Real-time filter: Block queries containing certain patterns (regex)

Design Summary

Complete Data Flow

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                     DATA GATHERING SERVICE                       β”‚
β”‚                                                                  β”‚
β”‚  [Search Requests]                                               β”‚
β”‚       β”‚                                                          β”‚
β”‚       β–Ό (every search logged)                                    β”‚
β”‚  [Kafka Log Stream]                                              β”‚
β”‚       β”‚                                                          β”‚
β”‚       β–Ό (weekly batch)                                           β”‚
β”‚  [Spark Aggregation]                                             β”‚
β”‚   - Count & rank queries                                         β”‚
β”‚   - Apply decay (old queries fade)                               β”‚
β”‚   - Filter spam/offensive                                        β”‚
β”‚       β”‚                                                          β”‚
β”‚       β–Ό                                                          β”‚
β”‚  [Trie Builder]                                                  β”‚
β”‚   - Build optimized trie (top-5 at each node)                    β”‚
β”‚   - Serialize to key-value store                                 β”‚
β”‚       β”‚                                                          β”‚
β”‚       β–Ό                                                          β”‚
β”‚  [Trie DB: HBase / DynamoDB]  ──────────────────────────────┐   β”‚
β”‚                                                              β”‚   β”‚
β”‚       β–Ό (cache load)                                         β”‚   β”‚
β”‚  [Redis Trie Cache]  ◄───────────────────────────────────── β”‚   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                              β”‚
                              β–Ό (weekly refresh)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                       QUERY SERVICE                              β”‚
β”‚                                                                  β”‚
β”‚  User types "tw"                                                 β”‚
β”‚       β”‚                                                          β”‚
β”‚       β–Ό                                                          β”‚
β”‚  [Load Balancer]                                                 β”‚
β”‚       β”‚                                                          β”‚
β”‚       β–Ό                                                          β”‚
β”‚  [Query Servers]                                                 β”‚
β”‚   1. Check Redis cache (key="tw")  ──HIT──► Return top-5        β”‚
β”‚   2. MISS β†’ query Trie DB          ──────► Return top-5 + cache β”‚
β”‚                                                                  β”‚
β”‚  Response: ["twitter","twitch","twitter api","tween","twilio"]   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Key Design Decisions Summary

DecisionChoiceReasoning
Data structureOptimized trie (top-k at each node)O(p) query time vs O(p + n log n) naive
StorageSerialized to KV store (HBase/DynamoDB)Can’t fit trie in one server’s memory
CacheRedisO(1) lookup by prefix string as key
Update frequencyWeekly batchTrends don’t change hourly; rebuild is expensive
Update mechanismFull rebuild (not incremental)Simpler; avoids stale partial updates
FilteringBlocklist + ML classifierRemove offensive content before trie build
ShardingRedis Cluster (auto-shards by key)Prefix = key, Redis distributes automatically
Client optimizationBrowser cache + 100ms debounceReduce server requests significantly

Interview Questions & Answers

Q: How does the optimized trie achieve O(p) query time?
A: By caching the top-k results at every node during trie construction. Instead of doing a DFS over all descendant nodes and sorting at query time, we precompute the top-k for each prefix during the offline build phase. At query time, we just navigate to the prefix node in O(p) steps and return the cached list in O(1). The trade-off is more memory (store k results at each of n nodes) and longer build time, but query time is O(p) β‰ˆ O(1) in practice.

Q: Why use weekly batch updates instead of real-time trie updates?
A: Trie updates are expensive β€” updating a single query requires recomputing the top-k list at all ancestor nodes (O(p) nodes each requiring re-sorting). At 100M queries/day, real-time updates would cause constant trie rebuilds. Search trends also don’t change meaningfully within hours. Weekly batch is simpler, cheaper, and fresh enough. For truly real-time (like breaking news), a two-tier approach can be used: a fast approximate layer (recent query log) + the slow accurate layer (weekly trie).

Q: How do you handle a query prefix that is not in the trie?
A: If navigating the trie reaches a dead-end (character not in node’s children), return an empty list immediately in O(p). This means the prefix has no matching popular queries. We can also return a fallback: generic popular queries (trending searches), but this usually looks odd to users, so empty is cleaner. The trie only contains the top-N queries (say top 5 billion), so very rare prefixes legitimately have no suggestions.

Q: How would you add personalization to autocomplete?
A: Two-tier approach: (1) Global trie gives population-level popular suggestions. (2) Per-user layer: store each user’s recent search history (last 30 days) in Redis as a sorted set (query β†’ count). At query time, merge personal results with global results β€” show up to 2 personal matches + fill remaining slots from global trie. The personal layer is small and fast. Cold users (no history) fall back entirely to global suggestions.

Q: How do you prevent offensive queries from appearing in autocomplete?
A: Multi-layered approach: (1) Blocklist filter at data pipeline level β€” any query containing blocked terms is removed before trie build. (2) ML classifier for novel offensive content β€” train on labeled data to detect new patterns not in blocklist. (3) Real-time filter at query serving β€” regex/blocklist check on suggestions before returning to client. (4) User reporting β€” flag surfaced suggestions as inappropriate, use as training data. Blocklist handles known terms instantly; ML handles novel cases.


Key Takeaways

  1. Optimized trie stores top-k at every node, achieving O(p) query time at the cost of more memory β€” the right trade-off for production
  2. Two services with different update cadences: data gathering (weekly batch) vs query serving (real-time, < 100ms)
  3. Serialize trie to key-value store (prefix β†’ top-k) to avoid loading giant trie in memory on every server
  4. Redis as trie cache β€” prefix string is the key, top-k list is the value β€” simple and fast
  5. Weekly rebuild is acceptable because search trends don’t change hourly and incremental updates are complex
  6. Client-side debounce + cache reduces server load significantly (100ms debounce can cut requests by 50%+)
  7. Sharding: Redis Cluster handles it automatically since prefix is the key β€” no manual sharding logic needed


Practice this design! Common Medium-Hard interview question. Be ready to:

  1. Draw the trie and explain naive vs optimized with complexity
  2. Explain the two-service split (gathering vs query)
  3. Walk through how weekly batch update pipeline works
  4. Discuss serializing trie to key-value store (the key insight)
  5. Handle edge cases: no results, offensive content, personalization

Last Updated: 2026-04-13
Status: Interview ready - Know the trie optimization cold!