Google Interview Prep
Interview Profile
- Format: 4–5 technical rounds + Googleyness/Leadership round
- Coding bar: Highest of FAANG — Hard problems are routine, not exceptional; edge cases probed aggressively; expect follow-ups that push O(n²) solutions to O(n log n) or better
- Senior bar: Derive time and space complexity rigorously from first principles (not just state it); discuss scalability without prompting; mathematical elegance matters — brute force followed by optimal is expected, not optional
- What distinguishes a senior answer: Connecting the algorithm to a specific Google product (Maps, Search, Spanner, Bigtable, Ads), voluntarily discussing NP-hardness or approximation algorithms for applicable problems, and demonstrating awareness of constant factors in Big-O (cache locality, branch prediction)
- What Google cares about: “Googleyness” (intellectual curiosity, collaborative problem-solving under pressure), algorithmic thinking at scale, depth of CS fundamentals
Pattern Frequency Breakdown
| Pattern | Frequency | Why Google Emphasizes It |
|---|---|---|
| Graphs (Dijkstra / BFS / DFS / Topo) | 30% | Google Maps, web crawl graph, Knowledge Graph |
| Trees | 20% | DOM tree, query plan trees, Bigtable B-tree structure |
| Dynamic Programming | 20% | Ads bid optimization, spell checker, translation alignment |
| String / Trie | 15% | Google Search autocomplete, tokenization, NLP pipelines |
| Math / Bit Manipulation | 10% | Hashing, cryptographic protocols, distributed ID generation |
| Heap / Greedy | 5% | Calendar scheduling, job queue management |
Problems From This Repo
| Problem | Session Link | Why Google |
|---|---|---|
| Network Delay Time (LC 743) | Network Delay Time (LC 743) | Google Maps SSSP — single-source shortest path for turn-by-turn routing |
| Alien Dictionary (LC 269) | Alien Dictionary (LC 269) | Compiler/language ordering — topological sort on inferred character DAG |
| Word Ladder (LC 127) | Word Ladder (LC 127) | BFS on implicit graphs; Google Search “did you mean?” transformation graph |
| Course Schedule (LC 207) | Course Schedule (LC 207) | Build system dependency graph (Blaze/Bazel) — cycle detection via topological sort |
| Word Search II (LC 212) | Word Search II (LC 212) | Trie + DFS grid — multi-keyword search in document corpus |
| Implement Trie (LC 208) | Implement Trie (LC 208) | Google Search autocomplete prefix tree |
| Binary Tree Maximum Path Sum (LC 124) | Binary Tree Maximum Path Sum (LC 124) | Google classic hard tree problem; query plan optimization |
| Edit Distance (LC 72) | Edit Distance (LC 72) | Spell checker, “did you mean?”, Google Translate alignment |
| N-Queens (LC 51) | N-Queens (LC 51) | Constraint satisfaction — general backtracking template |
| Time-Based Key-Value Store (LC 981) | Time-Based Key-Value Store (LC 981) | Spanner MVCC (Multi-Version Concurrency Control), Bigtable versioned row access |
New Problems (Google-Specific)
Problem 1: Sliding Window Maximum (LC 239) — Hard
Link: LeetCode 239
Quick Summary: Given an integer array nums and a window size k, return the maximum value in each sliding window of size k as the window moves from left to right.
Why Google: Google’s infrastructure monitoring systems compute rolling-window maximums for capacity planning — “what was the peak CPU utilization in any 5-minute window over the last 24 hours?” is this exact query. The monotonic deque solution enables O(1) amortized max queries over a sliding window, which is critical for high-throughput telemetry pipelines processing millions of data points per second.
Sub-pattern note: This introduces the monotonic deque — a double-ended queue that maintains elements in sorted order. It is distinct from the monotonic stack (Session 5), which only supports LIFO. The deque’s head always holds the current window maximum, and stale elements (outside the window) are evicted from the front.
Senior talking point: Discuss how this maps to Google’s Monarch time-series monitoring system — sliding window aggregations (MAX, MIN, P99) over streaming metrics use exactly this deque-based approach. Ask the interviewer: “Should the window be time-based or count-based?” — a real production distinction that demonstrates systems awareness.
Approach 1: Brute Force — Scan Every Window
// Time: O(n * k) Space: O(n - k + 1) output only
public int[] maxSlidingWindowBrute(int[] nums, int k) {
int n = nums.length;
int[] result = new int[n - k + 1];
for (int i = 0; i <= n - k; i++) {
int max = nums[i];
for (int j = i + 1; j < i + k; j++) {
max = Math.max(max, nums[j]);
}
result[i] = max;
}
return result;
}- Time: O(n * k) — for each of n-k+1 windows, scan k elements
- Space: O(1) auxiliary (O(n) for output array)
- Tradeoff: Simple but O(n*k) is unacceptable for large windows. When k ≈ n/2, this degrades to O(n²). The repeated work is that each element participates in up to k windows, yet it is scanned fresh each time.
Approach 2: Monotonic Deque (Optimal)
// Time: O(n) Space: O(k)
public int[] maxSlidingWindow(int[] nums, int k) {
int n = nums.length;
int[] result = new int[n - k + 1];
// Deque stores INDICES; front = index of current window max
// Maintained in monotonically DECREASING order of nums[index]
Deque<Integer> deque = new ArrayDeque<>();
for (int i = 0; i < n; i++) {
// 1. Remove indices outside the current window from the front
while (!deque.isEmpty() && deque.peekFirst() < i - k + 1) {
deque.pollFirst();
}
// 2. Remove indices from the back whose values are <= nums[i]
// They can never be the max for any future window (nums[i] is larger
// and will outlast them in the window)
while (!deque.isEmpty() && nums[deque.peekLast()] <= nums[i]) {
deque.pollLast();
}
// 3. Add current index
deque.offerLast(i);
// 4. Record result once the first full window is formed
if (i >= k - 1) {
result[i - k + 1] = nums[deque.peekFirst()];
}
}
return result;
}- Time: O(n) — each element is added and removed from the deque at most once (amortized O(1) per element)
- Space: O(k) — deque holds at most k indices at any time
- Key Insight: The deque maintains a monotonically decreasing sequence of candidate maximums. When a new element arrives, we evict from the back any elements it dominates (they are smaller and will leave the window sooner — they can never be the max while the new element is present). Eviction from the front handles window expiry. The deque head is always the index of the current window’s maximum. Every element enters and exits the deque exactly once, giving O(n) total.
Problem 2: Data Stream as Disjoint Intervals (LC 352) — Hard
Link: LeetCode 352
Quick Summary: Design a data structure that receives integers from a data stream and returns a list of disjoint intervals representing all integers seen so far, after each addNum call.
Why Google: Google Spanner uses MVCC (Multi-Version Concurrency Control) where each transaction holds a range of committed timestamp intervals. Tracking which timestamp ranges are committed vs. pending is exactly the online interval merging problem. floorKey / higherKey on a TreeMap provides O(log n) lookup of the immediately preceding and succeeding intervals — the same operations Spanner uses to check committed range coverage.
Sub-pattern note: This introduces the TreeMap floor/ceiling/higher/lower navigation API — not covered in the existing sessions, which use HashMap. The TreeMap merge pattern (check left neighbor, check right neighbor, merge if overlapping) is reused in LC 57 (Insert Interval) and LC 715 (Range Module).
Senior talking point: “In Spanner, this interval set is maintained per-shard and stored in a sorted structure backed by a B-tree (not a Java TreeMap, but the algorithmic contract is identical). The merge operation runs on every transaction commit — it must be O(log n) where n is the number of distinct committed ranges, not O(n).”
Approach 1: Sorted List with Linear Scan (Naive)
// Time: O(n) per addNum Space: O(n)
class SummaryRangesNaive {
private List<int[]> intervals = new ArrayList<>();
public void addNum(int val) {
intervals.add(new int[]{val, val});
// Sort and merge — O(n log n) per call
intervals.sort((a, b) -> a[0] - b[0]);
List<int[]> merged = new ArrayList<>();
for (int[] iv : intervals) {
if (merged.isEmpty() || merged.get(merged.size() - 1)[1] < iv[0] - 1) {
merged.add(iv);
} else {
merged.get(merged.size() - 1)[1] =
Math.max(merged.get(merged.size() - 1)[1], iv[1]);
}
}
intervals = merged;
}
public int[][] getIntervals() {
return intervals.toArray(new int[0][]);
}
}- Time: O(n log n) per
addNum(sort dominates) - Space: O(n)
- Tradeoff: Correct but O(n log n) per insert is far too slow for a high-throughput stream. The sort is wasted work — we only need to merge the new value with its immediate neighbors.
Approach 2: TreeMap with Floor/Higher Key Navigation (Optimal)
// Time: O(log n) per addNum Space: O(n)
class SummaryRanges {
// Key = interval start, Value = interval end
// TreeMap keeps intervals sorted by start — enables O(log n) neighbor lookup
private TreeMap<Integer, Integer> map = new TreeMap<>();
// Time: O(log n)
public void addNum(int val) {
if (map.containsKey(val)) return; // already covered
// Check left neighbor: largest start <= val
Integer leftStart = map.floorKey(val);
// Check right neighbor: smallest start > val
Integer rightStart = map.higherKey(val);
boolean mergeLeft = leftStart != null && map.get(leftStart) >= val - 1;
boolean mergeRight = rightStart != null && rightStart <= val + 1;
if (mergeLeft && mergeRight) {
// Merge both: extend left interval to cover right interval's end
map.put(leftStart, Math.max(map.get(leftStart), map.get(rightStart)));
map.remove(rightStart);
} else if (mergeLeft) {
// Extend left interval rightward to include val
map.put(leftStart, Math.max(map.get(leftStart), val));
} else if (mergeRight) {
// Extend right interval leftward to include val
int rightEnd = map.get(rightStart);
map.remove(rightStart);
map.put(val, rightEnd);
} else {
// No neighbors — new standalone interval
map.put(val, val);
}
}
// Time: O(n)
public int[][] getIntervals() {
int[][] result = new int[map.size()][2];
int idx = 0;
for (Map.Entry<Integer, Integer> e : map.entrySet()) {
result[idx][0] = e.getKey();
result[idx][1] = e.getValue();
idx++;
}
return result;
}
}- Time: O(log n) per
addNum— TreeMap operations are O(log n); O(n) pergetIntervals - Space: O(n) — at most n intervals in the map
- Key Insight: The TreeMap stores intervals sorted by start, allowing O(log n) lookup of the immediately left (
floorKey) and right (higherKey) neighboring intervals. The merge logic is four cases: merge both neighbors, merge left only, merge right only, or insert standalone. The>= val - 1(not>= val) condition handles adjacency: [1,3] and 4 should merge to [1,4] because they are consecutive, not just overlapping.
Problem 3: Palindrome Pairs (LC 336) — Hard
Link: LeetCode 336
Quick Summary: Given a list of unique words, find all pairs (i, j) such that words[i] + words[j] is a palindrome. Return all such [i, j] pairs.
Why Google: Google’s NLP pipelines perform extensive string classification and structural analysis. Palindrome pair detection exercises deep understanding of string structure and Trie composition — the same skills needed to implement efficient pattern matchers, tokenizers, and prefix/suffix indexes. It is a Google interview classic specifically because it tests whether candidates can enumerate all structural cases without missing one.
Sub-pattern note: This problem tests the 5-case palindrome pair decomposition — not a mechanical Trie traversal. Understanding and enumerating the five structural cases (split word into prefix+suffix where one half is palindrome and the other half’s reverse is in the dictionary) is the entire challenge. The Trie/HashMap acts as the dictionary lookup, not as the core algorithm.
Senior talking point: “The brute-force O(n² * k) approach (check all pairs) becomes O(n * k²) with the 5-case HashMap approach. At Google’s NLP pipeline scale (indexing billions of documents), even O(n * k) total operations matter — this is why building a reversed-word index and doing targeted lookups beats pairwise comparison.”
Approach 1: Brute Force — Check All Pairs
// Time: O(n² * k) Space: O(n * k) for storing words
// k = average word length
public List<List<Integer>> palindromePairsBrute(String[] words) {
List<List<Integer>> result = new ArrayList<>();
for (int i = 0; i < words.length; i++) {
for (int j = 0; j < words.length; j++) {
if (i != j && isPalindrome(words[i] + words[j])) {
result.add(Arrays.asList(i, j));
}
}
}
return result;
}
private boolean isPalindrome(String s) {
int l = 0, r = s.length() - 1;
while (l < r) {
if (s.charAt(l++) != s.charAt(r--)) return false;
}
return true;
}- Time: O(n² * k) — n² pairs, each concatenation + palindrome check is O(k)
- Space: O(n²) output in worst case
- Tradeoff: Works for n < 1000, but O(n² * k) for n = 10^4, k = 10 is 10^9 operations — too slow.
Approach 2: HashMap Reversed-Word Lookup with 5-Case Analysis (Optimal)
// Time: O(n * k²) Space: O(n * k)
public List<List<Integer>> palindromePairs(String[] words) {
// Map from reversed word → its original index
Map<String, Integer> reverseMap = new HashMap<>();
for (int i = 0; i < words.length; i++) {
reverseMap.put(new StringBuilder(words[i]).reverse().toString(), i);
}
List<List<Integer>> result = new ArrayList<>();
for (int i = 0; i < words.length; i++) {
String word = words[i];
int len = word.length();
for (int cut = 0; cut <= len; cut++) {
String prefix = word.substring(0, cut); // words[i][0..cut)
String suffix = word.substring(cut); // words[i][cut..len)
// Case 1: prefix is palindrome, reversed(suffix) is in dict
// → reversed(suffix) + words[i] forms a palindrome
if (isPalindrome(prefix)) {
Integer j = reverseMap.get(suffix);
// j != i: avoid self-pairing; j != null: exists in dict
if (j != null && j != i) {
result.add(Arrays.asList(j, i));
}
}
// Case 2: suffix is palindrome, reversed(prefix) is in dict
// → words[i] + reversed(prefix) forms a palindrome
// Guard: cut < len avoids re-adding the full-word case when cut == len
// (cut == 0 prefix case already handles "" empty string scenario)
if (cut < len && isPalindrome(suffix)) {
Integer j = reverseMap.get(prefix);
if (j != null && j != i) {
result.add(Arrays.asList(i, j));
}
}
}
}
return result;
}
private boolean isPalindrome(String s) {
int l = 0, r = s.length() - 1;
while (l < r) {
if (s.charAt(l++) != s.charAt(r--)) return false;
}
return true;
}- Time: O(n * k²) — for each of n words, try k+1 cuts; each cut does an O(k) palindrome check and an O(k) map lookup
- Space: O(n * k) — reversed word map
- Key Insight: For
words[i] + words[j]to be a palindrome, one of five structural relationships must hold. Rather than checking all O(n²) pairs, for each word we split it at every cut point and ask: “Is there a word in the dictionary that, placed before or after this word, would complete a palindrome?” The reversed-word map turns the lookup into O(k) (hashing). The guardcut < lenon Case 2 prevents duplicate entries when bothprefix = ""(cut=0 in Case 1) andsuffix = ""(cut=len in Case 2) would add the same pair. This is the hardest part to get right in an interview — acknowledge the deduplication guard explicitly.
Problem 4: Maximum Profit in Job Scheduling (LC 1235) — Hard
Link: LeetCode 1235
Quick Summary: Given n jobs each with a start time, end time, and profit, schedule non-overlapping jobs to maximize total profit.
Why Google: Google Calendar’s “auto-schedule” feature and Google Ads slot assignment are both weighted interval scheduling problems. “Which set of non-overlapping meetings (or ad impressions) maximizes total value?” — this is the canonical senior-level DP + Binary Search combination at Google. It tests whether candidates can identify that sorting by end time plus binary search for the latest non-overlapping predecessor gives O(n log n).
Sub-pattern note: This is the canonical DP + Binary Search on sorted intervals pattern — not present elsewhere in the repo. The combination of interval DP (1D, indexed by sorted end time) with binary search to find the latest non-conflicting job is a recurring Google/FAANG pattern (also appears in LC 646 — Maximum Length of Pair Chain, LC 2008 — Maximum Earnings From Taxi).
Senior talking point: “The weighted interval scheduling problem has a known O(n log n) exact solution via DP+BS. For the unweighted variant, greedy (Earliest Deadline First) suffices. Always clarify: are profits uniform? If so, greedy is optimal and simpler. If profits are heterogeneous (weighted), DP+BS is necessary. This is the kind of problem where a senior candidate asks the clarifying question before writing any code.”
Approach 1: Brute Force Backtracking
// Time: O(2^n) Space: O(n) recursion stack
public int jobSchedulingBrute(int[] startTime, int[] endTime, int[] profit) {
int n = startTime.length;
// Sort by end time for the backtracking to be sensible
Integer[] idx = new Integer[n];
for (int i = 0; i < n; i++) idx[i] = i;
Arrays.sort(idx, (a, b) -> endTime[a] - endTime[b]);
return backtrack(idx, startTime, endTime, profit, 0, 0);
}
private int backtrack(Integer[] idx, int[] start, int[] end, int[] profit,
int i, int currentEnd) {
if (i == idx.length) return 0;
int job = idx[i];
int skip = backtrack(idx, start, end, profit, i + 1, currentEnd);
int take = 0;
if (start[job] >= currentEnd) {
take = profit[job] + backtrack(idx, start, end, profit, i + 1, end[job]);
}
return Math.max(skip, take);
}- Time: O(2^n) — exponential subset choices
- Space: O(n) recursion stack
- Tradeoff: Exponential — illustrates the subproblem structure (take vs. skip each job) before optimizing.
Approach 2: DP + Binary Search on Sorted End Times (Optimal)
// Time: O(n log n) Space: O(n)
public int jobScheduling(int[] startTime, int[] endTime, int[] profit) {
int n = startTime.length;
// Bundle jobs and sort by END time — DP processes jobs in order of completion
int[][] jobs = new int[n][3];
for (int i = 0; i < n; i++) {
jobs[i][0] = startTime[i];
jobs[i][1] = endTime[i];
jobs[i][2] = profit[i];
}
Arrays.sort(jobs, (a, b) -> a[1] - b[1]);
// dp[i] = max profit considering jobs[0..i-1] (1-indexed for convenience)
// dp[0] = 0 (no jobs)
int[] dp = new int[n + 1];
// Store sorted end times for binary search
int[] endTimes = new int[n];
for (int i = 0; i < n; i++) endTimes[i] = jobs[i][1];
for (int i = 0; i < n; i++) {
int jobStart = jobs[i][0];
int jobProfit = jobs[i][2];
// Binary search: find the latest job that ends <= jobStart
// i.e., the largest index j such that endTimes[j] <= jobStart
int lo = 0, hi = i - 1, prev = -1;
while (lo <= hi) {
int mid = lo + (hi - lo) / 2;
if (endTimes[mid] <= jobStart) {
prev = mid;
lo = mid + 1;
} else {
hi = mid - 1;
}
}
// Option 1: skip job i → dp[i]
// Option 2: take job i → dp[prev+1] + jobProfit (prev+1 because dp is 1-indexed)
int take = (prev >= 0 ? dp[prev + 1] : 0) + jobProfit;
dp[i + 1] = Math.max(dp[i], take);
}
return dp[n];
}- Time: O(n log n) — sort O(n log n) + n iterations each with O(log n) binary search
- Space: O(n) — dp array + endTimes array
- Key Insight: Sorting by end time imposes a natural DP order —
dp[i]represents the maximum profit from the firstijobs (by end time). When considering jobi, we binary search for the latest job that ends at or beforejobs[i].start(non-overlapping predecessor).dp[prev+1]gives the best profit achievable before this job starts, without any overlap. The 1-indexed dp array simplifies the “no previous job” base case:dp[0] = 0covers the case where no earlier job is compatible. This DP+BS combination is O(n log n) total — the binary search is the key optimization over an O(n²) naive DP that scans all predecessors linearly.
Problem 5: Design Search Autocomplete System (LC 642) — Hard
Link: LeetCode 642
Quick Summary: Design a search autocomplete system. Given historical sentences and their frequencies, implement input(char c) which returns the top 3 hottest sentences with the given prefix after each character is typed.
Why Google: This is literally the algorithmic core of Google Search Suggest. The problem directly exercises Trie + min-heap composition, incremental prefix construction, and ranking by frequency. At Google, candidates are expected to discuss in-memory Trie vs. disk-backed prefix index (for trillion-query history), and how re-ranking works with freshness signals layered on top of raw frequency.
Sub-pattern note: This tests Trie + min-heap composition — a pattern not addressed individually in the repo. The Trie is extended to store per-node frequency maps, and a size-bounded min-heap (size 3) extracts the top-k efficiently. The incremental input management (appending to a StringBuilder, flushing on #) is also a design-level detail that separates careful implementations from sloppy ones.
Senior talking point: “In Google’s production autocomplete, the Trie is sharded by prefix across many machines (the first 2 characters determine the shard). Each shard serves a subtree. The input() call routes to the correct shard, which maintains the top-k candidates locally. The aggregation of top-k from multiple shards is a distributed merge (LC 23 — Merge K Sorted Lists). The in-memory Trie in this problem is a single-shard model.”
Approach 1: HashMap Prefix Scan (Naive)
// Time: O(n * l + m) per input call Space: O(n * l)
// n = total sentences, l = avg length, m = matching sentences
class AutocompleteSystemNaive {
private Map<String, Integer> freqMap = new HashMap<>();
private StringBuilder current = new StringBuilder();
public AutocompleteSystemNaive(String[] sentences, int[] times) {
for (int i = 0; i < sentences.length; i++) {
freqMap.put(sentences[i], times[i]);
}
}
public List<String> input(char c) {
if (c == '#') {
// Commit current input as a new sentence
String typed = current.toString();
freqMap.merge(typed, 1, Integer::sum);
current.setLength(0);
return new ArrayList<>();
}
current.append(c);
String prefix = current.toString();
// Scan all sentences for matching prefix — O(n * l)
List<String> matches = new ArrayList<>();
for (String sentence : freqMap.keySet()) {
if (sentence.startsWith(prefix)) matches.add(sentence);
}
// Sort: descending frequency, then lexicographic
matches.sort((a, b) -> {
int cmp = freqMap.get(b) - freqMap.get(a);
return cmp != 0 ? cmp : a.compareTo(b);
});
return matches.subList(0, Math.min(3, matches.size()));
}
}- Time: O(n * l) per
input— scans all sentences for prefix match - Space: O(n * l) — HashMap stores all sentence strings
- Tradeoff: Correct but O(n * l) per keystroke is unacceptable for large sentence histories. For Google’s scale (billions of query logs), this would be millions of string comparisons per character typed.
Approach 2: Trie + Min-Heap for Top-3 Retrieval (Optimal)
// Time: O(l * log 3) per input call (l = current prefix length)
// Space: O(total characters across all sentences)
class AutocompleteSystem {
// Trie node: maps char → child node, stores frequency of sentences ending here
private static class TrieNode {
Map<Character, TrieNode> children = new HashMap<>();
Map<String, Integer> counts = new HashMap<>(); // sentence → frequency at this node
}
private TrieNode root;
private TrieNode currentNode; // tracks traversal position as user types
private StringBuilder current = new StringBuilder();
public AutocompleteSystem(String[] sentences, int[] times) {
root = new TrieNode();
currentNode = root;
for (int i = 0; i < sentences.length; i++) {
insert(sentences[i], times[i]);
}
}
private void insert(String sentence, int freq) {
TrieNode node = root;
for (char c : sentence.toCharArray()) {
node.children.putIfAbsent(c, new TrieNode());
node = node.children.get(c);
// Store sentence+freq at every prefix node — enables O(1) top-k at that prefix
node.counts.merge(sentence, freq, Integer::sum);
}
}
// Time: O(l + k log k) per input where l = prefix length, k = candidates at prefix
// In practice, k is bounded by history size; min-heap of size 3 makes extraction O(k log 3)
public List<String> input(char c) {
if (c == '#') {
String typed = current.toString();
insert(typed, 1); // commit typed sentence with frequency +1
current.setLength(0);
currentNode = root; // reset traversal
return new ArrayList<>();
}
current.append(c);
// Advance currentNode by one character — O(1)
if (currentNode != null) {
currentNode = currentNode.children.get(c);
}
if (currentNode == null) {
// No sentences with this prefix
return new ArrayList<>();
}
// Use min-heap of size 3 to extract top-3 efficiently
// Min-heap ordered by: frequency ASC, then lexicographic DESC
// (so the "worst" of the top-3 is always at the top for easy eviction)
PriorityQueue<Map.Entry<String, Integer>> minHeap = new PriorityQueue<>(
(a, b) -> a.getValue().equals(b.getValue())
? b.getKey().compareTo(a.getKey()) // lexicographic desc → smaller lex at top
: a.getValue() - b.getValue() // freq asc → smaller freq at top
);
for (Map.Entry<String, Integer> entry : currentNode.counts.entrySet()) {
minHeap.offer(entry);
if (minHeap.size() > 3) minHeap.poll(); // evict worst candidate
}
// Drain heap into result (heap is min-ordered; reverse for descending output)
LinkedList<String> result = new LinkedList<>();
while (!minHeap.isEmpty()) {
result.addFirst(minHeap.poll().getKey());
}
return result;
}
}- Time: O(l * s) to build Trie for sentence of length l (s sentences); O(k log 3) per
inputwhere k = sentences matching the current prefix - Space: O(total sentence characters × number of prefixes) — each node’s
countsmap stores all sentences passing through it; this is O(n * l²) in the worst case, which is the standard tradeoff for a “fat” Trie that enables O(1) prefix retrieval - Key Insight: Storing the full sentence→frequency map at every prefix node (a “fat” Trie) makes prefix lookup O(1) — you simply read the
countsmap at the current node rather than DFS-collecting all descendant leaves. This trades space (O(n * l²) vs O(n * l) for a standard Trie) for query time. The min-heap of size 3 extracts the top-3 in O(k log 3) ≈ O(k) time, and thecurrentNodepointer eliminates re-traversing the prefix on every keystroke. ResettingcurrentNodetorooton#is the easy-to-miss state management detail — forgetting this causes the traversal pointer to persist into the next query.
System Design Connections
Google Maps → Dijkstra (Session 14)
Turn-by-turn routing is SSSP on a weighted directed graph (road segments as edges, travel time as weight). At Google’s scale, the graph has 10^9+ nodes (road intersections globally). Pre-computation (Contraction Hierarchies, ALT landmarks) reduces query time from O(E log V) Dijkstra to milliseconds. Session 14 LC 743 is the algorithmic core; the follow-up is always “how do you scale this to the entire road graph?”
Google Search → Trie + Heap (Sessions 13, LC 642)
The autocomplete Trie is sharded by prefix — the first 2 characters of a query determine the serving shard. Each shard runs LC 642-style ranking. The aggregation layer merges top-k lists from multiple shards (LC 23 — Merge K Sorted Lists). Freshness signals (recency of query) are weighted into the frequency score using an exponential decay function.
Spanner → TreeMap Intervals (LC 352)
Spanner’s MVCC tracking of committed timestamp ranges is the interval merging problem at massive scale. Each Spanner tablet tracks committed ranges as a sorted interval set (equivalent to a TreeMap). The floorKey/higherKey operations map to B-tree predecessor/successor lookups in Spanner’s implementation.
Bigtable → Binary Search on Sorted Keys (Session 3)
Bigtable stores rows in sorted row-key order within each tablet. Versioned values (multiple timestamps per row-key+column) are stored in sorted timestamp order within each cell. LC 981 (Time-Based Key-Value Store) models exactly this: the get(key, timestamp) operation is a binary search for the largest timestamp ≤ the query timestamp — the same as Bigtable’s get(row, col, maxTimestamp) API.
Google Ads → DP + Binary Search (LC 1235)
Ad slot assignment maximizes revenue from non-overlapping ad impressions. Each impression has a time window and a bid price. The weighted interval scheduling solution (LC 1235) is the exact algorithm used in offline auction clearing. Online variants (real-time bidding) use relaxed greedy approaches, but the DP+BS solution is the benchmark.
2-Week Sprint Plan
Week 1 — Graphs, Trees, and DP
| Day | Focus | Problems |
|---|---|---|
| 1 | SSSP + Topo sort | LC 743 (Dijkstra), LC 269 (Alien Dict), LC 207 (Course Schedule) |
| 2 | Implicit graph BFS | LC 127 (Word Ladder) — bidirectional BFS optimization |
| 3 | Hard trees | LC 124 (Max Path Sum), LC 236 (LCA) |
| 4 | DP 2D | LC 72 (Edit Distance), LC 1143 (LCS) |
| 5 | DP + Binary Search | LC 1235 (Job Scheduling) |
| 6–7 | Review + mock | Rederive Big-O from scratch; timed 45-min sessions |
Week 2 — Strings, Design, and Advanced Patterns
| Day | Focus | Problems |
|---|---|---|
| 8 | Trie problems | LC 208, LC 212 (Word Search II), LC 642 (Autocomplete) |
| 9 | Monotonic deque | LC 239 (Sliding Window Max) — compare with Session 5 mono-stack |
| 10 | TreeMap intervals | LC 352 (Disjoint Intervals), LC 57 (Insert Interval) |
| 11 | Hard strings | LC 336 (Palindrome Pairs) — 5-case decomposition |
| 12 | Backtracking | LC 51 (N-Queens), LC 79 (Word Search) |
| 13–14 | Full mock interviews | 2 × 45-min sessions; practice deriving Big-O verbally |
Senior-Level Talking Points
Google expects you to derive Big-O from first principles, not just state it. For every solution, walk through the reasoning out loud.
-
On Sliding Window Maximum: “The amortized O(1) per element comes from the observation that each element is pushed onto the deque at most once and popped at most once. The total number of push+pop operations across all n elements is therefore at most 2n, giving O(n) total. The deque size is bounded by k because we evict expired indices.”
-
On Job Scheduling DP+BS: “The DP recurrence is:
dp[i] = max(dp[i-1], dp[pred(i)] + profit[i])wherepred(i)is the latest job ending beforejobs[i].start. Without binary search, findingpred(i)for all i is O(n²) total. Since we sort by end time, the end-time array is sorted — binary search reducespred(i)lookup to O(log n) per job, giving O(n log n) total.” -
On NP-hardness awareness: “If you change Job Scheduling to allow jobs with dependencies (a job can only start after all its prerequisite jobs complete), the problem becomes NP-hard (it generalizes the Scheduling with Precedence Constraints problem). Google interviewers sometimes probe this: ‘What if jobs have dependencies?’ — the correct answer is ‘that’s NP-hard in general; for DAG dependencies, we can add topological ordering, but the weighted variant remains NP-hard.’”
-
On Palindrome Pairs edge case: “The deduplication guard (
cut < lenon Case 2) prevents double-counting the pair where one word is the full reverse of the other. If words are['abc', 'cba'],cut=3in Case 1 andcut=0in Case 2 would both add the pair[1, 0]— the guard avoids this. Stating this explicitly in the interview signals careful case analysis, which Google values highly.” -
On Autocomplete system design escalation: “For a follow-up ‘how do you handle 1 billion distinct queries?’, the in-memory Trie breaks — 1B * avg 10 chars = 10GB, infeasible for a single machine. The solution is a prefix-partitioned Trie cluster: shard by first 2 characters (26² = 676 shards), each shard serves ~1.5M unique prefixes. The
input()API becomes a fan-out to the relevant shard, with a top-k merge step. This maps directly to Google’s Suggest serving infrastructure.” -
On deriving Big-O verbally (Google-specific): “Always trace through the algorithm step by step when explaining complexity: ‘The outer loop runs n times. Inside, the binary search runs O(log n). The dp array update is O(1). So the total is O(n) * O(log n) = O(n log n). The sort at the beginning is also O(n log n), which dominates the DP phase only by a constant, so the overall complexity is O(n log n).’ Google interviewers explicitly check that you can produce this derivation, not just recite the answer.”