Practice Session 13: Tries (Prefix Trees)
Overview
A Trie is a tree where each path from root to a node spells out a prefix shared by all descendants. Unlike a HashMap that stores complete keys, a Trie shares prefixes structurally — the word “apple” and “app” share three nodes rather than storing 8 characters twice. This makes it the only data structure that answers prefix queries in O(L) time where L is the query length, completely independent of dictionary size. At senior level the skill is not just “use a Trie” but choosing between array-children (fast, fixed alphabet) vs HashMap-children (flexible, sparse), knowing when the O(26×L×W) memory cost is justified, and recognising when the Trie’s DFS structure eliminates entire branches of grid search that a per-word brute-force approach cannot.
Trie Internals
TrieNode Structure: Array[26] vs HashMap Children
| Dimension | TrieNodeArray — children[26] | TrieNodeMap — HashMap<Character, TrieNode> |
|---|---|---|
| Child lookup | O(1) — direct index ch - 'a' | O(1) amortised — hash + equality check |
| Memory per node | 26 × 8 bytes = 208 bytes (64-bit JVM) regardless of actual children | ~48 bytes empty map + 32 bytes per actual entry |
| Memory efficiency | Poor for sparse alphabets (most slots null) | Excellent — only real children allocated |
| Cache behaviour | Array is contiguous in memory; CPU prefetcher friendly | HashMap pointer-chasing; cache misses on traversal |
| Alphabet constraint | Fixed lowercase a-z only | Arbitrary character set (Unicode, digits, symbols) |
| When to prefer | Competitive programming, LeetCode, speed-critical | Production (Redis keys, DNS labels, URL segments, Unicode words) |
| Iteration order | Natural alphabetical (index 0..25) | Insertion order (Java 8+ LinkedHashMap) or undefined |
The isEnd flag: marks that a complete word terminates at this node. Without it, “app” inserted before “apple” would report “apple” found even if only “app” was inserted — because the path exists but the word boundary is never recorded. The flag decouples prefix existence (path reachable) from word existence (path ends here).
Memory Complexity
Array-children Trie:
O(ALPHABET_SIZE × key_length × number_of_keys)
= O(26 × L × W)
Worst case: 1M words of length 10 → 26 × 10 × 1,000,000 = 260M nodes
At 208 bytes each → ~54 GB — not viable for large dictionaries without compression.
HashMap-children Trie:
O(total_characters_across_all_keys) = O(L × W) shared prefixes counted once
Better by a factor of ALPHABET_SIZE in the average case.
Compressed Trie (Patricia / Radix):
O(W) nodes — each internal node stores a whole edge label (substring), not one char.
A single path "automobile" → "auto" is 1 node storing "tomobile" not 8 nodes.
Problem 1: Implement Trie (Prefix Tree)
Link: LeetCode 208
Quick Summary: Design a data structure supporting insert(word), search(word) (exact match), and startsWith(prefix) (prefix existence check), all in O(L) time.
Industry Context: This is the exact ADT powering Google search autocomplete — every keystroke calls startsWith on the query prefix to retrieve the candidate node, then a ranked traversal of descendants produces the suggestion list. The same structure backs T9 phone keyboards (digit sequence → letter sequence prefix matching), DNS authoritative resolvers (hierarchical label lookup from TLD down), and CIDR longest-prefix-match in BGP routing tables (binary trie over IPv4/IPv6 bit strings).
Approach 1: Array[26] Children — Fast, Fixed Alphabet
// All operations: Time O(L) Space O(L) for new nodes inserted
static class TrieArray {
private final TrieNodeArray root = new TrieNodeArray();
public void insert(String word) {
TrieNodeArray curr = root;
for (char ch : word.toCharArray()) {
int idx = ch - 'a';
if (curr.children[idx] == null)
curr.children[idx] = new TrieNodeArray();
curr = curr.children[idx];
}
curr.isEnd = true;
}
public boolean search(String word) {
TrieNodeArray node = walkTo(word);
return node != null && node.isEnd; // path exists AND is a word terminal
}
public boolean startsWith(String prefix) {
return walkTo(prefix) != null; // only path existence required
}
private TrieNodeArray walkTo(String key) {
TrieNodeArray curr = root;
for (char ch : key.toCharArray()) {
int idx = ch - 'a';
if (curr.children[idx] == null) return null;
curr = curr.children[idx];
}
return curr;
}
}- Time: O(L) for all three operations — one node access per character
- Space: O(L) per new word inserted (amortised O(1) for shared prefixes); O(26 × L × W) total
- Tradeoff: 26-pointer array per node is fast but memory-hungry; justified only when the alphabet is fixed a-z and query throughput is high.
Approach 2: HashMap Children — Flexible, Sparse
// All operations: Time O(L) Space O(L) for new nodes, but ~26x less memory per node
static class TrieMap {
private final TrieNodeMap root = new TrieNodeMap();
public void insert(String word) {
TrieNodeMap curr = root;
for (char ch : word.toCharArray()) {
curr.children.putIfAbsent(ch, new TrieNodeMap());
curr = curr.children.get(ch);
}
curr.isEnd = true;
}
public boolean search(String word) {
TrieNodeMap node = walkTo(word);
return node != null && node.isEnd;
}
public boolean startsWith(String prefix) {
return walkTo(prefix) != null;
}
}- Time: O(L) — same asymptotic; constant factor ~2-3x slower per node due to hash computation
- Space: O(L × W) — no wasted null pointers for unpopulated letters; supports any character set
- Key Insight: Choose HashMap children whenever the key alphabet is not constrained to a-z — URLs, Redis key segments, DNS labels, phone numbers. The memory saving is proportional to (ALPHABET_SIZE - average_branching_factor): for a sparse English word trie with average 3 children per node, HashMap uses ~95% less memory per node than array[26].
Problem 2: Add and Search Word
Link: LeetCode 211
Quick Summary: Design addWord(word) and search(word) where word may contain '.' matching any single character. Return true if any added word matches the pattern.
Industry Context: This is the core of regex prefix matching in search engines and shell glob expansion. Elasticsearch uses a generalisation of this trie-DFS to answer fuzzy queries. Redis SCAN pattern matching (e.g., user:*:session) is structurally identical — a trie walk with wildcard branching at each * or ?. Firewall ACL engines use the same technique to match IP prefix patterns with wildcard octets (e.g., 192.168.*.1).
Approach 1: Brute Force with HashSet
// addWord: O(1) search: O(W × L) where W = words stored, L = word length
static class WordDictionaryBrute {
private final Set<String> words = new HashSet<>();
public void addWord(String word) { words.add(word); }
public boolean search(String word) {
if (!word.contains(".")) return words.contains(word); // fast path — no wildcard
for (String stored : words) {
if (stored.length() != word.length()) continue;
boolean match = true;
for (int i = 0; i < word.length(); i++) {
if (word.charAt(i) != '.' && word.charAt(i) != stored.charAt(i)) {
match = false; break;
}
}
if (match) return true;
}
return false;
}
}- Time:
addWordO(1) amortised |searchO(W × L) — must scan every stored word - Space: O(W × L)
- Tradeoff: Works but degrades linearly with dictionary size. For a 200k-word spell-checker dictionary and 1000 queries per second, this is 200M character comparisons per second — unusable without caching.
Approach 2: Trie with DFS Wildcard Search (Optimal)
// addWord: O(L) search: O(26^k × L) worst case where k = number of '.' in pattern
static class WordDictionaryTrie {
private final TrieNodeArray root = new TrieNodeArray();
public void addWord(String word) {
TrieNodeArray curr = root;
for (char ch : word.toCharArray()) {
int idx = ch - 'a';
if (curr.children[idx] == null) curr.children[idx] = new TrieNodeArray();
curr = curr.children[idx];
}
curr.isEnd = true;
}
public boolean search(String word) {
return dfSearch(root, word, 0);
}
private boolean dfSearch(TrieNodeArray curr, String word, int pos) {
if (pos == word.length()) return curr.isEnd;
char ch = word.charAt(pos);
if (ch == '.') {
// wildcard: branch into all 26 possible children
for (TrieNodeArray child : curr.children)
if (child != null && dfSearch(child, word, pos + 1)) return true;
return false;
} else {
int idx = ch - 'a';
if (curr.children[idx] == null) return false; // prune — char absent
return dfSearch(curr.children[idx], word, pos + 1);
}
}
}- Time:
addWordO(L) |searchO(L) for no wildcards; O(26^k × L) worst case for k wildcards — but the Trie prunes null children instantly, making practical performance far better than brute-force full-scan - Space: O(W × L × 26) for the Trie | O(L) recursion stack during search
- Key Insight: The Trie DFS prunes dead ends structurally. A
'.'at depth d branches into at most 26 children, but the Trie already knows which of those 26 letters actually appear at that depth across all stored words — null children are skipped in O(1). Compare to brute-force which must inspect every stored word’s character at position d regardless.
Problem 3: Word Search II
Link: LeetCode 212
Quick Summary: Given an m × n character board and a list of words, find all words that can be constructed by sequentially adjacent (4-directional) cells without reusing any cell in a single path.
Industry Context: This is the algorithm behind Boggle-style word game solvers and Scrabble move generators — the board is a physical tile grid and the word list is the dictionary. More broadly, it models any “multi-pattern simultaneous match over a 2-D state space” problem: malware signature scanning over memory pages (multiple byte patterns matched simultaneously), OCR post-processing (candidate word fragments matched against a lexicon trie on a character-grid), and DNA motif search across a sequence grid.
Approach 1: Brute Force — Independent DFS Per Word
// Time: O(W × R × C × 4^L) where W=words, R×C=board, L=max word length
// Space: O(L) recursion stack
public static List<String> findWordsBrute(char[][] board, String[] words) {
List<String> result = new ArrayList<>();
Set<String> wordSet = new HashSet<>(Arrays.asList(words));
Set<String> found = new HashSet<>();
int rows = board.length, cols = board[0].length;
for (String word : wordSet) {
outer:
for (int r = 0; r < rows; r++) {
for (int c = 0; c < cols; c++) {
if (dfsBrute(board, word, r, c, 0)) { found.add(word); break outer; }
}
}
}
result.addAll(found);
return result;
}
private static boolean dfsBrute(char[][] board, String word, int r, int c, int pos) {
if (pos == word.length()) return true;
int rows = board.length, cols = board[0].length;
if (r < 0 || r >= rows || c < 0 || c >= cols || board[r][c] != word.charAt(pos)) return false;
char saved = board[r][c];
board[r][c] = '*'; // mark visited — prevent reuse in same path
boolean found = dfsBrute(board, word, r+1, c, pos+1) || dfsBrute(board, word, r-1, c, pos+1)
|| dfsBrute(board, word, r, c+1, pos+1) || dfsBrute(board, word, r, c-1, pos+1);
board[r][c] = saved; // backtrack
return found;
}- Time: O(W × R × C × 4^L) — each word triggers an independent DFS sweep of the board
- Space: O(L) recursion stack
- Tradeoff: Correct and simple. Fails completely on large word lists — 1000 words × 12×12 board × 4^10 paths = billions of operations. No sharing of work between words with common prefixes.
Approach 2: Trie + DFS Grid Backtracking with Branch Pruning (Optimal)
// Time: O(W × L + R × C × 4^L) Space: O(W × L) for Trie + O(L) recursion stack
public static List<String> findWordsTrie(char[][] board, String[] words) {
// Build Trie — store full word string at terminal node
TrieNodeWord root = new TrieNodeWord();
for (String word : words) {
TrieNodeWord curr = root;
for (char ch : word.toCharArray()) {
int idx = ch - 'a';
if (curr.children[idx] == null) curr.children[idx] = new TrieNodeWord();
curr = curr.children[idx];
}
curr.word = word; // store word at terminal to avoid path reconstruction
}
List<String> result = new ArrayList<>();
for (int r = 0; r < board.length; r++)
for (int c = 0; c < board[0].length; c++)
dfsWordSearch(board, r, c, root, result);
return result;
}
private static void dfsWordSearch(char[][] board, int r, int c, TrieNodeWord node, List<String> result) {
if (r < 0 || r >= board.length || c < 0 || c >= board[0].length || board[r][c] == '*') return;
char ch = board[r][c];
TrieNodeWord next = node.children[ch - 'a'];
if (next == null) return; // prune — no word in Trie starts with this path
if (next.word != null) {
result.add(next.word);
next.word = null; // prevent duplicate results from different DFS starts
}
board[r][c] = '*'; // mark visited for this path
dfsWordSearch(board, r+1, c, next, result);
dfsWordSearch(board, r-1, c, next, result);
dfsWordSearch(board, r, c+1, next, result);
dfsWordSearch(board, r, c-1, next, result);
board[r][c] = ch; // backtrack
// Prune exhausted branches: if next has no remaining children, unlink from parent.
// This shrinks the active Trie progressively, eliminating already-found words
// from future DFS iterations — the key performance win over brute force.
if (isLeaf(next)) node.children[ch - 'a'] = null;
}- Time: O(W × L) to build the Trie + O(R × C × 4^L) for the DFS — the Trie collapses W independent DFS sweeps into one
- Space: O(W × L) Trie + O(L) recursion stack
- Key Insight: Three layered optimisations make this transformatively faster than brute force:
- Shared prefix walk: words “eat”, “each”, “ear” all share the “ea” trie path — the DFS reaches the ‘e’→‘a’ node once, not three times as in brute force.
- Early termination:
if (next == null) returnprunes any DFS branch where no word in the entire dictionary could match — this eliminates the majority of grid paths. - Trie shrinkage: after a word is found and all its descendants are exhausted,
isLeafpruning unlinks the node from its parent. Subsequent DFS starts from other cells treat those letters as dead ends immediately, avoiding redundant re-traversal of already-resolved dictionary entries.
Trie Variants
Compressed Trie / Patricia Trie / Radix Trie
A standard trie has O(W × L) nodes. A compressed trie merges chains of single-child nodes into a single edge labelled with the full substring. The word “automobile” that shares no prefix with other words becomes one edge, not 10 nodes.
When it matters at senior level: IP routing tables (Linux kernel uses a compressed binary trie called a “LC-trie” or “DIR-24-8” for IPv4 forwarding), DNS resolvers (label-compressed trie), and any trie holding long string keys with sparse branching. Patricia tries are the standard for implementing java.util.TreeMap over bit-keyed data and are the internal structure of many production router FIBs. Mention “compressed trie” when asked “how would you scale this to millions of keys?” — it reduces memory from O(ALPHABET × L × W) to O(W) nodes.
Suffix Trie / Suffix Array
A suffix trie stores all suffixes of a single string S. Given S = "banana", it stores "banana", "anana", "nana", "ana", "na", "a". Querying startsWith(pattern) answers “does pattern appear as a substring of S?” in O(|pattern|) time.
When it matters: Full-text search substring matching (grep, strstr), biological sequence alignment (finding all occurrences of a DNA motif in a genome), and data compression (LZ77/LZ78 use a suffix trie to find the longest previous match). In practice, suffix arrays + LCP arrays replace suffix tries due to better cache behaviour — but the conceptual structure is the same trie with all suffix paths.
Ternary Search Trie (TST)
A TST is a hybrid between a BST and a trie. Each node stores one character and has three children: left (characters less than current), equal (next character in word), and right (characters greater). This achieves near-trie lookup speed with memory closer to a BST — no wasted null pointers for an unused alphabet.
When it matters: TSTs are the preferred structure in embedded systems (router firmware, IoT devices) where memory is severely constrained but the key set is dynamic and the alphabet is large. They also handle non-uniform key distributions better than array tries because the BST property on the character dimension allows short-circuiting via ordering. Mention TSTs when the interviewer pushes back on memory overhead of the array[26] variant.
Key Takeaways
When Trie Beats HashMap
- Prefix queries:
startsWith(prefix)is O(L) in a Trie vs O(W × L) for a HashMap that must scan all keys. This is the single biggest reason to use a Trie. - Wildcard / pattern search: Trie DFS branches only on existing characters; HashMap has no structural shortcut — you must iterate all keys.
- Simultaneous multi-word matching: Word Search II — one DFS pass over the grid replaces W separate passes. This is only possible with a Trie because all words share the same traversal structure.
- Lexicographic enumeration: iterating a Trie in DFS order yields all keys in sorted order for free. A HashMap requires collecting and sorting all keys separately.
Memory Overhead Warning
A Trie with array[26] children uses ~208 bytes per node on a 64-bit JVM. For a 1-million-word dictionary with average word length 7, this is 7M nodes × 208 bytes = ~1.5 GB. Always ask whether this is acceptable before committing to a Trie in a system design answer. Alternatives: compressed trie (Patricia) reduces to O(W) nodes; HashMap-children trie reduces by ×26 on average; suffix array replaces suffix trie with O(L) total space.
When NOT to Use Trie
- Simple exact-match lookup: a
HashSet<String>is O(L) for lookup and uses far less memory. A Trie is overkill when you never query prefixes. - Large Unicode alphabets without compression: array[26] only handles a-z. HashMap trie works but has higher constant factor. For Unicode (65536 code points), a compressed trie or suffix array is the right tool.
- One-time queries over a small word list: the O(W × L) Trie build cost is not amortised if you only run one or two searches. HashSet dominates here.
- Memory-constrained environments: if your budget is kilobytes (embedded system, stack-allocated), a sorted array with binary search is more appropriate than a Trie.
Pattern Recognition Triggers
- “Autocomplete / search suggestion” → Trie + prefix walk + subtree enumeration
- “Find all words matching a pattern with wildcards” → Trie + DFS with wildcard branching
- “Find all words on a 2-D grid from a word list” → Build Trie from word list + DFS grid backtracking
- “Longest common prefix of a list of strings” → Trie: find deepest node with exactly one child
- “IP routing / CIDR longest prefix match” → Binary Trie on bit representation
Common Pitfalls
- Forgetting
isEnd: the path to a node proves a prefix exists, not that a complete word was inserted. Always set and checkisEndfor exact-match queries. - Not clearing
isEnd/wordafter finding a match in Word Search II: failure to null out the terminal marker causes the same word to be added multiple times if it appears on the board via different paths. - Not pruning exhausted Trie branches in Word Search II: omitting
isLeafpruning causes O(W) redundant DFS work for every subsequent grid cell, defeating the key performance advantage over brute force. - Array[26] with non-lowercase input: if the problem allows uppercase, digits, or spaces, you must either normalise the input or switch to a HashMap-children Trie.
ch - 'a'will produce a negative index for uppercase letters and an ArrayIndexOutOfBoundsException at runtime. - Forgetting the backtrack step: in grid DFS,
board[r][c] = savedafter recursion is mandatory. Omitting it permanently marks cells visited, causing false negatives for words reachable via different paths.
System Design Connections
| Trie Use Case | Production System |
|---|---|
| Prefix search / autocomplete | Google / Bing search bar, IDE code completion (IntelliJ, VSCode) |
| Wildcard pattern matching | Redis SCAN MATCH pattern, Elasticsearch fuzzy query, shell glob(3) |
| IP longest-prefix-match | Linux kernel FIB (forwarding information base), BGP routers |
| DNS label hierarchy | Authoritative DNS resolvers, BIND, CoreDNS |
| T9 / keyboard prediction | Phone input method engines, SwiftKey |
| Spell-check suggestions | Hunspell, LibreOffice spell engine, browser spell-check |
| Word game AI | Scrabble/Boggle move generators, Wordle solvers |
| Compressed key storage | Redis cluster keyspace partitioning, etcd key prefix watches |