Graphs¶
graph 用于建模关系与连接:从社交网络到道路导航,再到依赖链。本文件覆盖表示方法、BFS、DFS、shortest path、topological sort 与 connected component,并给出图算法面试中最常见的遍历与路径模式。
-
我们已经在第 12 章学习过 graph theory(adjacency matrix、Laplacian、spectral 性质),在第 13 章学习过更偏理论的内容(tree、planarity、colouring)。本文件聚焦算法模式:如何在代码中遍历、搜索与优化 graph。
-
graph 题最核心的两大算法是 BFS 与 DFS。绝大多数 graph 问题都可归约到这两者(或其改造版)。把这两个模式吃透,就能覆盖大部分题目。
Graph 表示方法¶
- adjacency list:为每个 node 存一份邻居列表。空间复杂度 \(O(|V| + |E|)\),最适合 sparse graph(真实世界大多数 graph 都是 sparse)。
# Undirected graph
graph = {
0: [1, 2],
1: [0, 3],
2: [0, 3],
3: [1, 2]
}
# From edge list
def build_graph(n, edges):
graph = {i: [] for i in range(n)}
for u, v in edges:
graph[u].append(v)
graph[v].append(u) # omit for directed
return graph
-
adjacency matrix:\(n \times n\) 矩阵,若边 \((i, j)\) 存在则 \(A[i][j] = 1\)。空间复杂度 \(O(|V|^2)\),适合 dense graph 或需要 \(O(1)\) 判边场景。
-
如何选择:绝大多数情况下优先 adjacency list。只有当 graph 很稠密(\(|E| \approx |V|^2\))或你必须频繁常数时间判断边存在性时,才考虑 matrix。
Pattern: BFS(Breadth-First Search)¶
- BFS 用 queue 逐层扩展(level by level),适用于:
- 无权 graph 的 shortest path
- 层序遍历
- connected component
- 任意“最少步数”问题
from collections import deque
def bfs(graph, start):
visited = {start}
queue = deque([start])
while queue:
node = queue.popleft()
for neighbour in graph[node]:
if neighbour not in visited:
visited.add(neighbour)
queue.append(neighbour)
- 关键细节:
visited应在 enqueue 时标记,而非 dequeue 时。若出队才标记,同一 node 可能被多个前驱重复入队,造成额外开销甚至错误。
Easy: Number of Islands¶
-
问题:给定 2D 网格
'1'(陆地)和'0'(海水),统计岛屿个数。 -
模式:扫描网格,遇到
'1'就发起一次 BFS/DFS,把该连通陆地全部标记为已访问。发起次数即岛屿数。
from collections import deque
def num_islands(grid):
if not grid:
return 0
rows, cols = len(grid), len(grid[0])
count = 0
for r in range(rows):
for c in range(cols):
if grid[r][c] == '1':
count += 1
# BFS to mark entire island
queue = deque([(r, c)])
grid[r][c] = '0' # mark visited
while queue:
cr, cc = queue.popleft()
for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
nr, nc = cr + dr, cc + dc
if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == '1':
grid[nr][nc] = '0'
queue.append((nr, nc))
return count
-
常见坑:
directions = [(0,1),(0,-1),(1,0),(-1,0)]是 4 邻接网格题的高频模板。建议直接记忆;若是 8 邻接再加四个对角方向。 -
常见坑:此解法通过
grid[r][c] = '0'原地标记,省去visited集合,但会修改输入。面试中可用,但要主动说明这一取舍。
Medium: Rotting Oranges¶
-
问题:新鲜橘子与腐烂橘子相邻会被感染。求全部腐烂所需最短时间,若无法全部腐烂返回 -1。
-
模式:多源 BFS(multi-source BFS)。把所有初始腐烂橘子同时入队;每一层 BFS 对应 1 个时间单位。
from collections import deque
def oranges_rotting(grid):
rows, cols = len(grid), len(grid[0])
queue = deque()
fresh = 0
for r in range(rows):
for c in range(cols):
if grid[r][c] == 2:
queue.append((r, c))
elif grid[r][c] == 1:
fresh += 1
if fresh == 0:
return 0
time = 0
while queue and fresh > 0:
time += 1
for _ in range(len(queue)):
cr, cc = queue.popleft()
for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
nr, nc = cr + dr, cc + dc
if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1:
grid[nr][nc] = 2
fresh -= 1
queue.append((nr, nc))
return time if fresh == 0 else -1
- 关键洞察:multi-source BFS 计算的是“每个点到任一源点的最短距离”,正好对应“最后一个新鲜橘子被感染所需时间”。
Pattern: DFS(Depth-First Search)¶
- DFS 会尽可能向深处走,再回溯。可用显式 stack 或递归调用栈实现。典型应用:
- 环检测
- topological sort
- connected component
- 回溯/穷举
- 带约束路径搜索
def dfs(graph, node, visited=None):
if visited is None:
visited = set()
visited.add(node)
for neighbour in graph[node]:
if neighbour not in visited:
dfs(graph, neighbour, visited)
Medium: Course Schedule(Cycle Detection)¶
-
问题:给定课程数与先修关系,判断能否修完全部课程(即有向图中是否无环)。
-
模式:在 directed graph 中做环检测。DFS 使用三色状态:未访问、访问中(当前路径上)、已完成。
def can_finish(num_courses, prerequisites):
graph = {i: [] for i in range(num_courses)}
for course, prereq in prerequisites:
graph[course].append(prereq)
# 0 = unvisited, 1 = in-progress, 2 = completed
state = [0] * num_courses
def has_cycle(node):
if state[node] == 1:
return True # back edge → cycle
if state[node] == 2:
return False # already fully explored
state[node] = 1 # mark in-progress
for neighbour in graph[node]:
if has_cycle(neighbour):
return True
state[node] = 2 # mark completed
return False
for course in range(num_courses):
if has_cycle(course):
return False
return True
- 为什么必须三状态:两状态(visited/unvisited)无法区分“正在当前 DFS 路径中”与“已完全处理完”。碰到状态 1 才是 back edge(有环);碰到状态 2 只是交叉/前向边,不代表有环。
Medium: Course Schedule II(Topological Sort)¶
-
问题:输出一种合法修课顺序(topological order)。
-
模式(Kahn algorithm,BFS 版):把入度为 0 的 node 入队,逐个弹出并削减其邻居入度,新的入度 0 再入队,直到处理结束。
from collections import deque
def find_order(num_courses, prerequisites):
graph = {i: [] for i in range(num_courses)}
indegree = [0] * num_courses
for course, prereq in prerequisites:
graph[prereq].append(course)
indegree[course] += 1
queue = deque([i for i in range(num_courses) if indegree[i] == 0])
order = []
while queue:
node = queue.popleft()
order.append(node)
for neighbour in graph[node]:
indegree[neighbour] -= 1
if indegree[neighbour] == 0:
queue.append(neighbour)
return order if len(order) == num_courses else [] # empty = cycle exists
- 常见坑:若结果长度小于 node 总数,说明有环(某些 node 入度永远无法降到 0)。
Shortest Path¶
Dijkstra Algorithm¶
- 用于非负 weighted graph 中,从单源到其余所有 node 的 shortest path。实现依赖 priority queue(min-heap)。
import heapq
def dijkstra(graph, start):
# graph: {node: [(neighbour, weight), ...]}
dist = {node: float('inf') for node in graph}
dist[start] = 0
heap = [(0, start)]
while heap:
d, node = heapq.heappop(heap)
if d > dist[node]:
continue # stale entry
for neighbour, weight in graph[node]:
new_dist = d + weight
if new_dist < dist[neighbour]:
dist[neighbour] = new_dist
heapq.heappush(heap, (new_dist, neighbour))
return dist
-
时间复杂度:二叉 heap 下为 \(O((|V| + |E|) \log |V|)\)。
-
常见坑:
if d > dist[node]: continue必不可少。否则会反复处理过期堆项,严重拖慢性能,甚至退化。 -
常见坑:Dijkstra 不适用于负权边。只要存在负权,贪心前提(节点一旦出堆即最优)就被破坏,应改用 Bellman-Ford。
Hard: Network Delay Time¶
- 问题:给定 \(n\) 个 node 和有向带权边,求信号从源点到达所有 node 的时间;若有 node 不可达返回 -1。
def network_delay(times, n, k):
graph = {i: [] for i in range(1, n + 1)}
for u, v, w in times:
graph[u].append((v, w))
dist = dijkstra(graph, k)
max_time = max(dist.values())
return max_time if max_time < float('inf') else -1
Strongly Connected Component(SCC)¶
-
在 directed graph 中,SCC 是一个极大 node 集,使得集合内任意两 node 彼此可达。
-
Kosaraju algorithm: 1) 在原图 DFS,记录完成时间顺序; 2) 转置图(反向所有边); 3) 按完成时间逆序在转置图 DFS。第 3 步每棵 DFS 树就是一个 SCC。
-
适用场景:查循环依赖、2-SAT、把有向图压缩为 SCC DAG。
Common Pitfalls Summary¶
| Pitfall | Example | Fix |
|---|---|---|
| 出队时才标记 visited | 同一 node 多次入队 | 入队时立刻标记 |
| 有向图只用两状态 visited | 回边与交叉边混淆 | 用三状态(未访问/访问中/完成) |
| 负权图用 Dijkstra | shortest path 错误 | 换 Bellman-Ford |
| 忘记跳过 stale heap 项 | 重复处理过期路径 | 加 if d > dist[node]: continue |
| 网格越界判断漏写 | 数组越界异常 | 固定写边界条件 |
| 忽略 time=0 情况 | Rotting Oranges 无新鲜橘子 | BFS 前先判 fresh == 0 |
| 有向边错建成双向 | 先修关系方向错误 | 只按题意单向建边 |
Take-Home Problems(NeetCode)¶
BFS Pattern¶
- Number of Islands — 网格 BFS/DFS
- Rotting Oranges — 多源 BFS
- Clone Graph — BFS + hash map 克隆
- Pacific Atlantic Water Flow — 双源反向 BFS
- Word Ladder — 隐式 graph BFS
DFS Pattern¶
- Max Area of Island — DFS 计面积
- Course Schedule — 有向图判环
- Course Schedule II — topological sort
- Number of Connected Components — DFS 或 Union-Find
- Graph Valid Tree — 连通 + 无环
Shortest Path¶
- Network Delay Time — Dijkstra
- Cheapest Flights Within K Stops — 带约束 BFS/Bellman-Ford
- Swim in Rising Water — 二分 + BFS 或 Dijkstra
Advanced¶
- Alien Dictionary — 字符偏序的 topological sort