Practice Session 8: Graph — BFS, DFS, Topological Sort

Overview

Graphs are the universal model for relationships between things. Any problem involving reachability, ordering with dependencies, or spreading through a network is a graph problem. At senior level the key skill is not just “run BFS/DFS” but choosing between them deliberately: BFS guarantees shortest path in unweighted graphs, DFS is cheaper when you only need existence (is there a path? is there a cycle?), and topological sort expresses an ordering constraint that maps directly to build systems and package managers. Recognising the grid-as-graph encoding and the reverse BFS trick for ocean / multi-source problems is what separates a 10-year engineer’s answer from a junior’s.


Problem 1: Number of Islands

Link: LeetCode 200

Quick Summary: Given a 2-D binary grid of '1' (land) and '0' (water), count the number of islands. An island is a group of adjacent (4-directionally) land cells surrounded by water.

Industry Context: Connected-component labelling is the algorithmic core of image segmentation pipelines (medical imaging, autonomous-vehicle perception), geographic information systems (counting contiguous land parcels), and network topology analysis (finding isolated subnets in a data-centre fabric).


Approach 1: DFS Flood-Fill (in-place mark)

// O(m × n) time  O(m × n) space (recursion stack)
public static int numIslandsDFS(char[][] grid) {
    if (grid == null || grid.length == 0) return 0;
    int rows = grid.length, cols = grid[0].length, islands = 0;
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            if (grid[r][c] == '1') {
                islands++;
                dfsFlood(grid, r, c);   // mutates grid: '1' → '0'
            }
        }
    }
    return islands;
}
  • Time: O(m × n) — every cell visited at most once
  • Space: O(m × n) worst-case recursion depth (a snake-shaped island)
  • Tradeoff: Concise and idiomatic. Mutates the input, which is unacceptable if the grid must be preserved. In large grids (satellite imagery, 10⁶ × 10⁶) the recursion depth triggers StackOverflowError.

Approach 2: BFS with Explicit Visited Matrix (non-mutating)

// O(m × n) time  O(m × n) space
public static int numIslandsBFS(char[][] grid) {
    int rows = grid.length, cols = grid[0].length, islands = 0;
    boolean[][] visited = new boolean[rows][cols];
    int[][] dirs = {{1,0},{-1,0},{0,1},{0,-1}};
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            if (grid[r][c] == '1' && !visited[r][c]) {
                islands++;
                Queue<int[]> q = new ArrayDeque<>();
                q.offer(new int[]{r, c});
                visited[r][c] = true;
                while (!q.isEmpty()) {
                    int[] cell = q.poll();
                    for (int[] d : dirs) {
                        int nr = cell[0]+d[0], nc = cell[1]+d[1];
                        if (nr>=0 && nr<rows && nc>=0 && nc<cols
                                && grid[nr][nc]=='1' && !visited[nr][nc]) {
                            visited[nr][nc] = true;
                            q.offer(new int[]{nr, nc});
                        }
                    }
                }
            }
        }
    }
    return islands;
}
  • Time: O(m × n)
  • Space: O(min(m, n)) queue in best case (thin island); O(m × n) worst case
  • Key Insight: Mark visited[nr][nc] = true before enqueuing, not after dequeuing. Marking after dequeue allows the same cell to be enqueued multiple times, which blows up the queue size and gives wrong island counts. BFS also uses heap memory for the queue instead of call-stack memory, making it safe for arbitrarily deep grids in production.

Problem 2: Clone Graph

Link: LeetCode 133

Quick Summary: Given a reference to a node in a connected undirected graph, return a deep copy of the entire graph.

Industry Context: Deep-copying a graph with shared references is exactly what happens when a workflow engine (Apache Airflow, Temporal) snapshots a running DAG for replay, or when a compiler IR clones a function call graph for inlining and optimisation. The HashMap<original, clone> pattern is the standard solution to the object-graph serialisation problem in ORMs and marshalling libraries.


Approach 1: DFS with HashMap

// O(V + E) time  O(V) space
public static GraphNode cloneGraphDFS(GraphNode node) {
    if (node == null) return null;
    return dfsClone(node, new HashMap<>());
}
 
private static GraphNode dfsClone(GraphNode orig, Map<GraphNode, GraphNode> map) {
    if (map.containsKey(orig)) return map.get(orig);
    GraphNode clone = new GraphNode(orig.val);
    map.put(orig, clone);                          // register BEFORE recursing
    for (GraphNode n : orig.neighbors)
        clone.neighbors.add(dfsClone(n, map));
    return clone;
}
  • Time: O(V + E) — each node and edge processed once
  • Space: O(V) for the map + O(V) recursion depth in the worst case
  • Tradeoff: Elegant and minimal code. The critical ordering is: create the clone and register it in the map before recursing into neighbours. Reversing this order causes infinite loops on cyclic graphs.

Approach 2: BFS with HashMap (stack-overflow safe)

// O(V + E) time  O(V) space
public static GraphNode cloneGraphBFS(GraphNode node) {
    if (node == null) return null;
    Map<GraphNode, GraphNode> map = new HashMap<>();
    Queue<GraphNode> queue = new ArrayDeque<>();
    map.put(node, new GraphNode(node.val));
    queue.offer(node);
    while (!queue.isEmpty()) {
        GraphNode curr = queue.poll();
        for (GraphNode neighbor : curr.neighbors) {
            if (!map.containsKey(neighbor)) {
                map.put(neighbor, new GraphNode(neighbor.val));
                queue.offer(neighbor);
            }
            map.get(curr).neighbors.add(map.get(neighbor));
        }
    }
    return map.get(node);
}
  • Time: O(V + E)
  • Space: O(V) for map + queue
  • Key Insight: The HashMap simultaneously acts as the clone registry and the visited set. Creating the clone entry before enqueuing ensures that when the same node is reached via multiple paths (in an undirected graph each undirected edge is two directed edges), we always return the single canonical clone rather than creating duplicates.

Problem 3: Course Schedule

Link: LeetCode 207

Quick Summary: Given numCourses and a list of prerequisite pairs [a, b] meaning “must take b before a”, determine if it is possible to finish all courses (i.e., the prerequisite graph contains no cycle).

Industry Context: Kahn’s algorithm is npm/Maven/Gradle dependency resolution. When you run npm install, the package manager builds a directed graph of package dependencies and runs topological sort; a cycle means two packages each depend on the other — an error that must be surfaced before any installation begins. The same pattern appears in Makefiles, Bazel build graphs, and database migration runners that must apply SQL scripts in dependency order.


Approach 1: DFS 3-Colour Cycle Detection

// O(V + E) time  O(V + E) space
public static boolean canFinishDFS(int numCourses, int[][] prerequisites) {
    List<List<Integer>> adj = buildAdjList(numCourses, prerequisites);
    int[] color = new int[numCourses]; // 0=unvisited 1=visiting 2=done
    for (int i = 0; i < numCourses; i++)
        if (color[i] == 0 && hasCycleDFS(adj, color, i)) return false;
    return true;
}
 
private static boolean hasCycleDFS(List<List<Integer>> adj, int[] color, int u) {
    color[u] = 1;
    for (int v : adj.get(u)) {
        if (color[v] == 1) return true;            // back-edge → cycle
        if (color[v] == 0 && hasCycleDFS(adj, color, v)) return true;
    }
    color[u] = 2;
    return false;
}
  • Time: O(V + E)
  • Space: O(V + E) for adjacency list + O(V) recursion
  • Tradeoff: 3-colour (white/grey/black) is essential for directed graphs. A simple boolean visited array only works for undirected graphs — in a directed graph a grey node reached via a different path is a back-edge (cycle), not just a revisit.

Approach 2: Kahn’s BFS Topological Sort (production standard)

// O(V + E) time  O(V + E) space
public static boolean canFinishKahn(int numCourses, int[][] prerequisites) {
    List<List<Integer>> adj = buildAdjList(numCourses, prerequisites);
    int[] inDegree = new int[numCourses];
    for (int[] pre : prerequisites) inDegree[pre[0]]++;
 
    Queue<Integer> queue = new ArrayDeque<>();
    for (int i = 0; i < numCourses; i++)
        if (inDegree[i] == 0) queue.offer(i);
 
    int processed = 0;
    while (!queue.isEmpty()) {
        int course = queue.poll();
        processed++;
        for (int next : adj.get(course))
            if (--inDegree[next] == 0) queue.offer(next);
    }
    return processed == numCourses;
}
  • Time: O(V + E)
  • Space: O(V + E)
  • Key Insight: If processed < numCourses after the BFS, some nodes were never enqueued — meaning they always had at least one unresolved prerequisite, which is only possible if those nodes form a cycle. This single check replaces all the bookkeeping of DFS coloring. Kahn’s is also iterative and therefore safe for graphs with thousands of nodes where DFS would overflow the stack.

Problem 4: Pacific Atlantic Water Flow

Link: LeetCode 417

Quick Summary: Given an m × n integer grid of heights, find all cells from which water can flow to both the Pacific Ocean (top/left border) and the Atlantic Ocean (bottom/right border). Water flows to adjacent cells with equal or lesser height.

Industry Context: Multi-source reverse-BFS models any problem where you want to find all cells “reachable from a set of sinks”. In infrastructure: finding all data-centre nodes that can route traffic to both of two redundant upstream providers. In ecology: finding terrain cells in a watershed that drain to both river systems. The multi-source BFS pattern also models CDN cache invalidation (propagate invalidation backwards from edge nodes to origin) and wildfire spread simulation from multiple simultaneous ignition points.


Approach 1: Reverse DFS from Both Ocean Borders

// O(m × n) time  O(m × n) space
public static List<List<Integer>> pacificAtlanticDFS(int[][] heights) {
    int rows = heights.length, cols = heights[0].length;
    boolean[][] pac = new boolean[rows][cols];
    boolean[][] atl = new boolean[rows][cols];
    for (int r = 0; r < rows; r++) {
        dfsOcean(heights, pac, r, 0,        Integer.MIN_VALUE);
        dfsOcean(heights, atl, r, cols - 1, Integer.MIN_VALUE);
    }
    for (int c = 0; c < cols; c++) {
        dfsOcean(heights, pac, 0,        c, Integer.MIN_VALUE);
        dfsOcean(heights, atl, rows - 1, c, Integer.MIN_VALUE);
    }
    List<List<Integer>> res = new ArrayList<>();
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            if (pac[r][c] && atl[r][c]) res.add(List.of(r, c));
    return res;
}
  • Time: O(m × n) — each cell visited at most twice (once per ocean)
  • Space: O(m × n) for visited arrays + recursion stack
  • Tradeoff: Conceptually clean once you see the reversal insight. Recursion depth is O(m × n) in worst case — use BFS variant for production with large grids.

Approach 2: Multi-Source BFS (optimal, stack-overflow safe)

// O(m × n) time  O(m × n) space
public static List<List<Integer>> pacificAtlanticBFS(int[][] heights) {
    int rows = heights.length, cols = heights[0].length;
    boolean[][] pac = new boolean[rows][cols];
    boolean[][] atl = new boolean[rows][cols];
    Queue<int[]> pQ = new ArrayDeque<>(), aQ = new ArrayDeque<>();
    for (int r = 0; r < rows; r++) {
        pac[r][0] = true;       pQ.offer(new int[]{r, 0});
        atl[r][cols-1] = true;  aQ.offer(new int[]{r, cols-1});
    }
    for (int c = 0; c < cols; c++) {
        pac[0][c] = true;       pQ.offer(new int[]{0, c});
        atl[rows-1][c] = true;  aQ.offer(new int[]{rows-1, c});
    }
    bfsOcean(heights, pQ, pac);
    bfsOcean(heights, aQ, atl);
    List<List<Integer>> res = new ArrayList<>();
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            if (pac[r][c] && atl[r][c]) res.add(List.of(r, c));
    return res;
}
  • Time: O(m × n)
  • Space: O(m × n)
  • Key Insight: Seeding all border cells of each ocean into the BFS queue simultaneously is the multi-source pattern. It is mathematically equivalent to adding a virtual super-source node connected to all border cells, which collapses what would otherwise be O(border size) separate BFS calls into one. Mark a cell visited before enqueuing to prevent it from being added to the queue multiple times, which would multiply work proportional to in-degree.

Problem 5: Word Ladder

Link: LeetCode 127

Quick Summary: Given a beginWord, endWord, and a dictionary, find the length of the shortest transformation sequence from beginWord to endWord, changing exactly one letter at a time, with each intermediate word in the dictionary. Return 0 if no such sequence exists.

Industry Context: BFS word ladder is the exact algorithm behind spell-checker suggestion distance — the minimum number of single-character edits separating a misspelled word from a dictionary word. It also underpins genetic sequence mutation analysis (minimum SNP hops between two alleles), password-strength analysis (how many single-char changes until a password matches a known weak password), and network routing where “one hop” means one autonomous system boundary crossing.


Approach 1: Naive BFS (26-character brute scan)

// O(M² × N) time  O(M × N) space  where M = word length, N = dict size
public static int ladderLengthNaive(String beginWord, String endWord, List<String> wordList) {
    Set<String> wordSet = new HashSet<>(wordList);
    if (!wordSet.contains(endWord)) return 0;
    Queue<String> queue = new ArrayDeque<>();
    queue.offer(beginWord);
    wordSet.remove(beginWord); // visited
    int level = 1;
    while (!queue.isEmpty()) {
        int size = queue.size();
        for (int i = 0; i < size; i++) {
            char[] word = queue.poll().toCharArray();
            for (int j = 0; j < word.length; j++) {
                char orig = word[j];
                for (char ch = 'a'; ch <= 'z'; ch++) {
                    if (ch == orig) continue;
                    word[j] = ch;
                    String next = new String(word);
                    if (next.equals(endWord)) return level + 1;
                    if (wordSet.contains(next)) {
                        wordSet.remove(next);
                        queue.offer(next);
                    }
                }
                word[j] = orig;
            }
        }
        level++;
    }
    return 0;
}
  • Time: O(M² × N) — for each word (N) × each position (M) × 26 chars, each new String is O(M)
  • Space: O(M × N)
  • Tradeoff: The HashSet membership check is O(M) per probe (hashing the string). Correct and reasonably fast for small dictionaries; fine for interview communication, but the pattern-bucket version has better cache behaviour for large dictionaries.

Approach 2: BFS with Prebuilt Pattern Buckets (optimal)

// O(M² × N) time  O(M² × N) space
public static int ladderLengthOptimal(String beginWord, String endWord, List<String> wordList) {
    Set<String> wordSet = new HashSet<>(wordList);
    if (!wordSet.contains(endWord)) return 0;
    int M = beginWord.length();
 
    // Pre-group: pattern → [words matching that pattern]
    Map<String, List<String>> patternMap = new HashMap<>();
    for (String word : wordList) {
        for (int j = 0; j < M; j++) {
            String pat = word.substring(0, j) + "*" + word.substring(j + 1);
            patternMap.computeIfAbsent(pat, k -> new ArrayList<>()).add(word);
        }
    }
 
    Queue<String> queue = new ArrayDeque<>();
    Set<String> visited = new HashSet<>();
    queue.offer(beginWord);
    visited.add(beginWord);
    int level = 1;
    while (!queue.isEmpty()) {
        int size = queue.size();
        for (int i = 0; i < size; i++) {
            String word = queue.poll();
            for (int j = 0; j < M; j++) {
                String pat = word.substring(0, j) + "*" + word.substring(j + 1);
                for (String neighbor : patternMap.getOrDefault(pat, List.of())) {
                    if (neighbor.equals(endWord)) return level + 1;
                    if (!visited.contains(neighbor)) {
                        visited.add(neighbor);
                        queue.offer(neighbor);
                    }
                }
            }
        }
        level++;
    }
    return 0;
}
  • Time: O(M² × N) — preprocessing O(M²N), BFS O(M²N)
  • Space: O(M² × N) for the pattern map
  • Key Insight: The pattern map is an inverted index — exactly the data structure behind full-text search engines and spell-checker backends. By paying a one-time O(M²N) preprocessing cost, each BFS expansion retrieves all edit-distance-1 neighbours via a single HashMap lookup rather than iterating 26 × M character substitutions per word. The wildcard pattern "h*t" maps "hat", "hit", "hot", "hut" to a shared bucket; a lookup on the current word’s pattern immediately yields all reachable neighbours.

Graph Algorithm Decision Guide

BFS vs DFS vs Topological Sort

SituationAlgorithmWhy
Shortest path (unweighted)BFSBFS explores by layers; first time a node is reached = minimum hops
Reachability / connected componentsDFS or BFS (either)DFS has smaller constant, BFS is stack-overflow safe
Cycle detection (directed graph)DFS 3-colour or Kahn’sDFS: back-edge to grey node = cycle; Kahn’s: unprocessed nodes = cycle
Cycle detection (undirected graph)DFS with parent tracking or Union-FindNever colour 2-colour for directed — use parent parameter instead
Ordering with dependencies (DAG)Topological sort (Kahn’s BFS)Natural fit; Kahn’s is iterative and production-safe
Exhaustive path search / all pathsDFS with backtrackingBFS would need to track full path state; DFS backtracking is natural
Multi-source spreadingBFS with all seeds in initial queueCollapses N separate BFS calls into one by virtual super-source
Detecting if graph is bipartiteBFS with 2-coloringBFS level alternation maps to bipartite partition

Cycle Detection Techniques

TechniqueGraph TypeComplexityNotes
DFS 3-colour (white/grey/black)DirectedO(V + E)Back-edge to grey = cycle; only for directed
DFS with parent parameterUndirectedO(V + E)Skip edge back to direct parent; cross-edges allowed
Union-Find (DSU)UndirectedO(α(V)) per opBest when edges arrive as a stream (online); also detects bridges
Kahn’s BFS in-degreeDirectedO(V + E)Cycle ↔ processed < V; also produces topological order

Adjacency Matrix vs Adjacency List

Adjacency MatrixAdjacency List
SpaceO(V²) — bad for sparse graphsO(V + E) — standard for sparse
Edge lookupO(1)O(degree)
Enumerate neighboursO(V) per nodeO(degree) per node
When to useDense graphs (E ≈ V²), quick edge presence check, Floyd-WarshallAlmost always in interview problems; real-world social/dependency graphs
Real-worldRouting tables in small networks, game state adjacencyWeb graphs, social networks, build graphs

Grid-as-Graph Pattern

A 2-D grid is an implicit adjacency list. The standard encoding:

int[][] dirs = {{1,0},{-1,0},{0,1},{0,-1}}; // 4-directional
// For 8-directional (diagonal): add {1,1},{1,-1},{-1,1},{-1,-1}
 
// Neighbour iteration
for (int[] d : dirs) {
    int nr = row + d[0], nc = col + d[1];
    if (nr >= 0 && nr < rows && nc >= 0 && nc < cols) {
        // process neighbour (nr, nc)
    }
}

Key decisions on grids:

  • Mark visited before enqueuing (not after dequeuing) to prevent the same cell entering the queue O(in-degree) times
  • DFS flood: safe to mutate the grid in-place if a copy is not needed — avoids allocating a separate visited array
  • Multi-source BFS: seed all “origin” cells into the initial queue; equivalent to a virtual super-source with zero-cost edges to all origins

Key Takeaways

Pattern Recognition Triggers

  • “Count connected groups / components” → DFS or BFS flood-fill (islands, friend circles)
  • “Shortest path, fewest steps, minimum transformations” → BFS (guarantees minimum level = minimum cost)
  • “Can all tasks/courses be completed? Is there a cycle?” → Topological sort (Kahn’s BFS or DFS 3-colour)
  • “Find cells reachable from a boundary / multiple sources” → Multi-source BFS (pacific-atlantic, rotting oranges)
  • “Clone / deep-copy a graph with cycles” → DFS/BFS + HashMap<original, clone> — register clone before recursing
  • “Transformation sequence between states (word ladder, puzzle)” → BFS with state as node + pattern/bucket map

Common Pitfalls

  • Forgetting the visited set entirely: BFS/DFS without a visited set loops forever on any cycle. Always mark visited before enqueuing, not after dequeuing — marking after dequeue allows O(in-degree) duplicate entries.
  • Directed vs undirected cycle detection: using a simple boolean visited[] for directed graphs misses the case where a node is reachable via two separate non-cyclic paths. You need 3-colour DFS or Kahn’s for directed graphs.
  • DFS recursion depth on large grids: a 1000×1000 grid yields recursion depth up to 10⁶. Always prefer iterative BFS (or an explicit DFS stack) in production for grid problems.
  • Mutating shared state in DFS flood: if numIslandsDFS is called multiple times on the same grid instance it gives wrong answers after the first call. Use a separate visited[][] or copy the grid.
  • Word Ladder — not seeding endWord check correctly: the BFS returns level + 1 (not level) because level counts the words already expanded, and endWord is the next level.
  • Kahn’s — off-by-one on inDegree: decrement and then check == 0, not check then decrement. A node becomes available exactly when its last prerequisite is processed.

System Design Connections

AlgorithmSystem Design Analogue
BFS shortest pathNetwork routing (hop-count minimisation); Dijkstra generalises BFS to weighted edges
Kahn’s topological sortnpm/Maven/Gradle dependency resolution; database migration runner ordering; Kubernetes init-container DAG
DFS cycle detectionDeadlock detection in transaction managers (wait-for graph cycle = deadlock); circular import detection in Python/Java module loaders
Multi-source BFSWildfire / epidemic spread simulation; CDN cache invalidation wave; distance-to-nearest-facility in logistics
Word Ladder BFS + inverted indexSpell-checker suggestion engine; fuzzy search with edit distance; genetic sequence mutation distance
Clone Graph (HashMap)Object-graph serialisation in ORMs (Hibernate session detach); workflow snapshot for replay in Temporal/Airflow; JVM object deep-copy in serialisation frameworks

Graph in Language / Platform Design

The JVM’s garbage collector runs a graph reachability algorithm (mark-and-sweep) from GC roots to decide which objects are live. Understanding BFS vs DFS traversal order helps explain generational GC tradeoffs (young-gen is a BFS frontier; old-gen is the settled “black” set). Kubernetes controllers reconcile desired vs actual state using topological ordering of dependent resources — the same pattern as Kahn’s algorithm applied to a live cluster graph.