Chapter 8. Graph#
1. Build#
We can simply use Leetcode conventions to represent graph as nodes and its adjacent matrix, which can be implemented as a list of list in python.
def buildGraph(n_nodes, connections):
graph = {}
for node in range(n_nodes):
graph[node] = []
for node1, node2 in connections:
graph[node1].append(node2)
# if build an undirected graph, add the following line
# graph[node2].append(node1)
return graph
# test
n_nodes = 5
connections = [[0, 1], [0, 2], [1, 2], [1, 3], [2, 3], [2, 4]]
graph = buildGraph(n_nodes, connections)
print(graph)
{0: [1, 2], 1: [2, 3], 2: [3, 4], 3: [], 4: []}
2. Traversal#
Traversal can solve problems like find paths given the source node and target node, and also the topological sorting sequences.
valid-DAG problem
: check if a given DAG is valid
check if the DAG
G
has directed cyclesif no cycle, topological sort is a linear ordering of nodes along a horizontal line so that all directed edges go from left to right.
if cycles are present, topological sorting is not possible.
single-pair all paths problem
:
find all possible paths from
s
tot
These two types of problems can be solved by traversing the graph. There are two different ways of traversing a graph:
Breadth-first Search (BFS)
Depth-first Search (DFS)
2.1. Valid DAG#
Check if a DAG is valid. There should have no directed cycles in the graph.
The following graph is a DAG, because node 0
points to node 2
, node 1
points to node 0
and node 2
, node 2
points to nothing.
Obviously, there is no directed cycle.
G = [[2], [0,2], []]
This is not a valid DAG, because there is a directed cycle, i.e., node 0 -> node 2
, node 2 -> node 1
, and node 1 -> node 0
.
G = [[2], [0], [1]]
Leetcode
2.1.1. BFS#
The first logic that came to mind about using BFS is shown as follows:
use BFS to detect if there is a cycle
initialize a queue, and a
visisted
set to track which node has been visitedgiven node, travere its neighbor
This method, however, is not correct as its implementation.
the
visited
store all the nodes that has been visited.the case [[0,1], [0,2], [2,1]], has a cycle from undirected graph point of view, but no cycle from directed graph point of view
when traversing the node
0
, BFS will enqueue its dependency[1,2]
and mark as visisted.when going to the neighbor node
1
, its neighbor node2
is enqueued. But at previous step the node2
has already been marked asvisited
, which will be identified INCORRETLY as acycle
.
The main reason is the visited
set in this sense didn’t store parent/children information during traversal.
# the following code is only used for detecting cycles in a undirected graph
# for directed graph, this will not work as directed cycle is not the same as undirected cycle
def isValidDAG(graph):
# check if there is a cycle in the graph
def hasCycle(graph, source):
# initialize a set to store the visited nodes
visited = set()
# initialize a queue to store the nodes to be explored
queue = [source]
# loop until the queue is empty
while queue:
# pop the first node in the queue
node = queue.pop(0)
# if the node is already visited, then there is a cycle
if node in visited:
return True
# mark the node as visited
visited.add(node)
# explore all the neighbors of the node
for neighbor in graph[node]:
queue.append(neighbor)
return False
for i in graph.keys():
if hasCycle(graph, i):
return False
return True
# test
# False
nodes = 2
connections = [[0, 1], [1, 0]]
graph = buildGraph(nodes, connections)
print(isValidDAG(graph))
# True
nodes = 5
connections = [[0, 1], [0, 2], [1, 2], [1, 3], [2, 3], [2, 4]]
graph = buildGraph(nodes, connections)
print(isValidDAG(graph))
False
False
A correct implementation
is to use a indegree
set to record the indegree of each node. Indegree
is the number of dependencies a node requires, the number of incoming connections.
When do a topological sorting, we need start from a node with 0 indegree
, and such nodes are the starting point of a topological sort tree.
Algorithm:
step 0: get the indegree for each node in the graph
step 1: find nodes with
indegree of 0
, and enqueue to a queuestep 2: dequeue the queue,and decrease the indegree of the connected nodes from the popped node by 1
可以想象成在graph中把该节点抹去,那么该节点指向的其他节点的indegree就减1了
repeat step 1 and 2
the traversal order is the topological sort if there is no cycle detected
if the number of nodes that has been popped out from the queue equals to the number of total nodes, then there is no cycle.
if there is a cycle, the number of nodes that have been popped out from the queue should be less than the toal number of nodes.
def isValidDAG(graph):
indegree = {}
for node in graph.keys():
indegree[node] = 0
for node in graph.keys():
for edge in graph[node]:
indegree[edge] += 1
def hasCycle(graph, indegree):
# initialize a queue to store the nodes with indegree 0
queue = []
for node in indegree.keys():
if indegree[node] == 0:
queue.append(node)
# loop until the queue is empty
while queue:
# pop the first node in the queue
node = queue.pop(0)
# explore all the neighbors of the node
for neighbor in graph[node]:
indegree[neighbor] -= 1
if indegree[neighbor] == 0:
queue.append(neighbor)
# if there is a cycle, there will be a node with indegree > 0
for node in indegree.keys():
if indegree[node] > 0:
return True
return False
return not hasCycle(graph, indegree)
# test
# False
nodes = 2
connections = [[0, 1], [1, 0]]
graph = buildGraph(nodes, connections)
print(isValidDAG(graph))
# True
nodes = 5
connections = [[0, 1], [0, 2], [1, 2], [1, 3], [2, 3], [2, 4]]
graph = buildGraph(nodes, connections)
print(isValidDAG(graph))
False
True
2.1.2. Depth-first Search#
todo: implement using two approaches
traversal approach -> similar to backtracking
subporblem approach -> recusion with return value
DFS approach is more straight-forward than BFS in terms of detecting a loop, as DFS uses recursions and if a node was visited before in the same recursion, then there is a cycle.
Simply check if a node is visited or not. If it is visited, then there is a loop in the graph.
FOLLOW UP:
if we want to know the path of the loop,
we can use a
parent
dictionary to keep track of the parent of each node. Then we can use the parent dictionary to trace back the path of the loop.add a path parameter to the dfs function, and append the current node to the path. If there is a loop, then the path will contain the loop.
NOTES
think about when and how to use visited
or path
.
visited
: the visited records all the nodes that have been visited, and explored. If during the recursion, the node marked asvisited
is visited again, there is no need to further do the calculation as it’s been done before.or we can understand it as:
visited
is for the global graph point of view to make sure the traversal is not performed twice for the same node.
path
: the traced record all the nodes that are currently on the recursion path, which might not be visited/finished. - or we can understand it as:path
is the current recursion path, and can be used to see if current path has loop.
def isValidDAG(graph):
# initialize a set to store the visited nodes and a set to store the nodes in the current path
visited = set()
path = set()
def hasCycle(graph, source):
# if the node is already in path, then there is a cycle
if source in path:
return True
# if the node is already visited, then there is no cycle
if source in visited:
return False
# add the node to the path
path.add(source)
# add the node to visited
visited.add(source)
# explore all neighbors of the node
for neighbor in graph[source]:
if hasCycle(graph, neighbor):
return True
# remove the node from the path after recursion to restore the path to the state before the function call
path.remove(source)
# loop through all nodes in the graph in case the graph is not connected
for i in graph.keys():
if hasCycle(graph, i):
return False
return True
# test
# False
nodes = 2
connections = [[0, 1], [1, 0]]
graph = buildGraph(nodes, connections)
print(isValidDAG(graph))
# True
nodes = 5
connections = [[0, 1], [0, 2], [1, 2], [1, 3], [2, 3], [2, 4]]
graph = buildGraph(nodes, connections)
print(isValidDAG(graph))
False
True
DFS vs Backtracking DFS aims to operate on the node itself, and backtracking operates on branch (from node to another node). See the following code differences.
// DFS, focus on node
def dfs(root):
if not root:
reutrn
print(f"enter node {root}")
for child in root.children:
dfs(child)
print(f"leave node {root}")
// Backtracking, focus on branch
def backtrack(root):
if not root:
return
for child in root.children:
// make a choice
print(f"from node {root} to node {child}")
backtrack(child)
//undo a choice
print(f"from node {child} to node {root}")
2.2 Single Pair All Paths Problem#
Given a directed acyclic graph (DAG) of n
nodes labeled from 0
to n - 1
, find all possible paths from node 0
to node n - 1
and return them in any order.
Leetcode
2.2.1 BFS#
# BFS
def allPaths(graph, source, target):
# initialize results
paths = []
# initialize a queue to store the paths
queue = [[source]]
# loop until the queue is empty
while queue:
# pop the first path in the queue
path = queue.pop(0)
# get the last node in the path
node = path[-1]
# if the node is the target, then return the path
if node == target:
paths.append(path)
# explore all the neighbors of the node
for neighbor in graph[node]:
queue.append(path + [neighbor])
return paths
graph = [[1,2],[3],[3],[]]
print(allPaths(graph, 0, 3))
[[0, 1, 3], [0, 2, 3]]
2.2.2 DFS#
DFS uses path
to store the nodes in current recursion.
We can add a terminating case, where the targeted end node is reached.
def allPaths(graph, source, target):
# initialize a list to store all the paths
paths =[]
# initialize a list to store the current path
path = []
def dfs(graph, source, target):
if source == target:
# if the source is the target, add the current path to the list of paths
paths.append(path + [source])
return
path.append(source)
for neighbor in graph[source]:
dfs(graph, neighbor, target)
# remove the last node in the path after recursion to restore the path to the state before the function call
path.pop()
dfs(graph, source, target)
return paths
# test
graph = [[1,2],[3],[3],[]]
print(allPaths(graph, 0, 3))
[[0, 1, 3], [0, 2, 3]]
2.3 Topological Sorting#
For a tree,
level-order traversal
orpreorder traversal
, orpost-order traversal
can be treated as a topological sort.inorder traversal
is not.level-order traversal
for a graph is not possiblepreorder traversal
may not give topological sort orderfor example, 0->(1, 2), then 2->3), the preorder traversal in a DFS is [0,1,3,2]
postorder traversal
can give the topological order[1, 3, 2, 0]
reversed will be [0, 2, 3, 1]
Only
DAG
has topological sorting
Leetcode
Therefore, we can simply add a list to record the postorder traversal of the graph, and reverse the list to get the topological order.
check if the graph is DAG
if it is DAG, then return a postorder traversal, and reverse the list to get the topological order
if it is not DAG, then not possible to get a topological order
2.3.1 DFS#
# The following code assumes the graph has no cycle
def topological_sorting(graph):
# initialize a list to store the topological order
order = []
# initialize a set to store the visited nodes
visited = set()
def dfs(graph, source):
# if the node is already visited, then return
if source in visited:
return
# mark the node as visited
visited.add(source)
# explore all the neighbors of the node
for neighbor in graph[source]:
dfs(graph, neighbor)
# postorder: add the node to the topological order after recursion to restore the order to the state before the function call
order.append(source)
# loop through all nodes in the graph in case the graph is not connected
for i in graph.keys():
dfs(graph, i)
# reverse the order to get the topological order
order.reverse()
return order
# test
graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(topological_sorting(graph))
[0, 2, 1, 3]
2.3.2 BFS#
# topological sorting with BFS
def topological_sorting(graph):
# calculate the indegree of all nodes
indegree = {}
for node in graph.keys():
indegree[node] = 0
for node in graph.keys():
for edge in graph[node]:
indegree[edge] += 1
# initialize a queue to store the nodes with indegree 0
queue = []
for node in indegree.keys():
if indegree[node] == 0:
queue.append(node)
# initialize a list to store the topological order
order = []
# loop until the queue is empty
while queue:
# pop the first node in queue
node = queue.pop(0)
# add the node to the topological order
order.append(node)
# explore all the neighbors of the node
for neighbor in graph[node]:
indegree[neighbor] -= 1
if indegree[neighbor] == 0:
queue.append(neighbor)
# if there is a cycle, then return None
if len(order) != len(graph.keys()):
return None
return order
# test
graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(topological_sorting(graph))
[0, 1, 2, 3]
3 Shortest Path Problems#
Shortest path is to find the shortest path between two vertex:
single source shortest path
: find the shortest paths from a given source vertexs
to each vertext
.single destination shortest path
: find the shortest paths from each source vertexs
to the destination vertext
. We can reverse the direction of each edge, and formulate asingle-source shortest path
problem.single pair shortest path
: find the shortest path for a given pair (s
,v
). This problem usually requires to solvesingle source shortest path
first.all pairs shortest paths
: find a shortest path froms
tot
for every pair of verticess
andt
. Although we can solve this problem by running a single source algorithm once from each vertex, we usually can solve it faster using XXX.
BFS is typically used for shortest paths problems.
DFS is typically used for reacheability problems. When used for shortest path problems, DFS need go over all possible paths from given source and target, and then find the shortest one.
3.1 BFS#
def shortestPaths(graph, source):
# initialize shortest paths
paths = []
# initialize a set to store the visited nodes
visited = set()
# initialize path
path = [source]
# initialize a queue to store the paths
queue = [path]
# loop until the queue is empty
while queue:
# pop the first path in the queue
path = queue.pop(0)
# get the last node in the path
node = path[-1]
# save the shortest path between the source and the node
paths.append(path)
# explore all the neighbors of the node
for neighbor in graph[node]:
# if the neighbor is not visited, add it to the visited and the path to the queue
if neighbor not in visited:
visited.add(neighbor)
queue.append(path + [neighbor])
return paths
# test
graph = [[1,2],[3],[3],[]]
print(shortestPaths(graph, 0))
graph = [[1,2,3],[3],[3],[]]
print(shortestPaths(graph, 0))
[[0], [0, 1], [0, 2], [0, 1, 3]]
[[0], [0, 1], [0, 2], [0, 3]]
3.2 DFS#
we can also use DFS, but this might not be efficient:
use DFS to find all reachable paths from given pair
choose the path with minimum weights
4 Shortest Weighted Path#
Dijkstra algorithm
is usually used for detecting the shortest path in a DAG with nonnegative weights
. Dijkstra algorithm extends BFS for weighted DAG by introducing a DP table
to store the distance between vertices.
initialization
a distance table for recording the best distance found so far, and set to
infinity
base case: the distance from the source vertex
s
to itself is set to 0a priority queue to maintain the to-be-explored neighbor vertices, sorted based on their current distance to the given source
s
.
main loop:
pop out the queue, and get the current vertex and its current distance
explore its neighbors:
for each neighbor vertex, calculate its current distance from the source
s
if the current distance < the best distance found so far in the distance table:
update the distance table
enqueue
return the distance table
Note: usually no need to use
visited
set to track which node should be in the queue, because the distance table can be used
How to reconstruct the shortest paths based on distance table?
Leetcode
# This code is generated by Copilot.
def Dijkstra(graph, weights, source):
# initialize a set to store the visited nodes
visited = set()
# initialize a dictionary to store the shortest distance from the source to each node
distance = {}
# initialize a dictionary to store the previous node of each node
previous = {}
# initialize the distance of the source to itself as 0
distance[source] = 0
# initialize the distance of all other nodes to infinity
for node in graph.keys():
if node != source:
distance[node] = float('inf')
# initialize the previous node of the source as None
previous[source] = None
# initialize a queue to store the nodes
queue = [source]
# loop until the queue is empty
while queue:
# pop the first node in the queue
node = queue.pop(0)
visited.add(node)
# explore all the neighbors of the node
for neighbor in graph[node]:
# if the neighbor is not visited, add it to the queue
if neighbor not in visited:
queue.append(neighbor)
# update the distance and previous node of the neighbor
if distance[node] + weights[(node, neighbor)] < distance[neighbor]:
distance[neighbor] = distance[node] + weights[(node, neighbor)]
previous[neighbor] = node
return distance, previous
# test
graph = {0: [1, 2, 3], 1: [3], 2: [3], 3: []}
weights = {(0, 1): 1, (0, 2): 2, (0, 3): 3, (1, 3): 4, (2, 3): 5}
print(Dijkstra(graph, weights, 0))
graph = {0: [2,3], 1: [2,3], 2: [0,1,3], 3: [0,1,2]}
weights = {(0, 2): 1, (0, 3): 1, (1, 2): 1, (1, 3): 1, (2, 0): 1, (2, 1): 1, (2, 3): 1, (3, 0): 1, (3, 1): 1, (3, 2): 1}
print(Dijkstra(graph, weights, 0))
({0: 0, 1: 1, 2: 2, 3: 3}, {0: None, 1: 0, 2: 0, 3: 0})
({0: 0, 1: 2, 2: 1, 3: 1}, {0: None, 2: 0, 3: 0, 1: 2})
# This version outputs the shortest path from the source to each node
def Dijkstra(graph, weights, source):
# initialize a dictionary to store the shortest distance from the source to each node
distance = {}
# initialize a dictionary to store the previous node of each node
previous = {}
# initialize a dictionary to store the shortest path from the source to each node
paths = {}
# initialize the distance of all other nodes to infinity
distance = {node: float('inf') for node in graph.keys()}
# initialize the distance of the source to itself as 0
distance[source] = 0
# initialize the previous node of the source as None
previous[source] = None
# initialize the path of the source to itself as [source]
paths[source] = [source]
# initialize a queue to store the nodes
queue = [(distance[source], paths[source])]
# loop until the queue is empty
while queue:
# pop the path and its last node in the queue
min_dist, path = queue.pop(0)
node = path[-1]
# this is very important, because the node may have been visited before
# if the distance of the node is not the shortest, skip it
if min_dist > distance[node]:
continue
# explore all the neighbors of the node
for neighbor in graph[node]:
# update the distance and previous node of the neighbor
dist_neighbor = distance[node] + weights[(node, neighbor)]
if dist_neighbor < distance[neighbor]:
distance[neighbor] = dist_neighbor
previous[neighbor] = node
paths[neighbor] = path + [neighbor]
queue.append((dist_neighbor, path + [neighbor]))
return distance, previous, paths
# test
graph = {0: [1, 2, 3], 1: [3], 2: [3], 3: []}
weights = {(0, 1): 1, (0, 2): 2, (0, 3): 3, (1, 3): 4, (2, 3): 5}
print(Dijkstra(graph, weights, 0))
graph = {0: [2,3], 1: [2,3], 2: [0,1,3], 3: [0,1,2]}
weights = {(0, 2): 1, (0, 3): 1, (1, 2): 1, (1, 3): 1, (2, 0): 1, (2, 1): 1, (2, 3): 1, (3, 0): 1, (3, 1): 1, (3, 2): 1}
print(Dijkstra(graph, weights, 0))
({0: 0, 1: 1, 2: 2, 3: 3}, {0: None, 1: 0, 2: 0, 3: 0}, {0: [0], 1: [0, 1], 2: [0, 2], 3: [0, 3]})
({0: 0, 1: 2, 2: 1, 3: 1}, {0: None, 2: 0, 3: 0, 1: 2}, {0: [0], 2: [0, 2], 3: [0, 3], 1: [0, 2, 1]})
5 Bipartition Graph#
Check if a graph is bipartition graph. We just need traverse the graph, and color the nodes with two colors. If any two adjacent nodes in the resulting graph have different colors, then the graph is bipartitional.
Algorithm
BFS
maintain a queue that stores nodes at the same breadth yet not visited
pop out the nodes one by one
examine its neighbors
if the neighbor is not visited, color them with a different color than the node itself
if the neighbor is visited, and has the same color as its parent, then not a valid bipartition graph
DFS
traverse all the neighbors of the node
if the neighbor is not visited, color them with a different color than the node itself
if the node is already visited, and has the same color as its parent, then not a valid bipartition graph
Leetcode
0785
5.1 BFS#
def isValidBipartitionGraph(graph):
# initialize a set to store the visited nodes
visited = set()
# initialize a dictionary to store the colors of the nodes
colors = {}
def bfs(graph, source):
# if the source is visited, then the graph is bipartite
if source in visited:
return True
# initialize a queue to store the nodes
queue = [source]
# initialize the color of the source
colors[source] = 0
# loop until the queue is empty
while queue:
# pop the first node in the queue
node = queue.pop(0)
# add the node to the visited set
visited.add(node)
# explore all the neighbors of the node
for neighbor in graph[node]:
# if the neighbor is not visited, add it to the queue
if neighbor not in visited:
queue.append(neighbor)
# color the neighbor with the opposite color of the node
colors[neighbor] = 1 - colors[node]
# if the neighbor is visited and has the same color as the node, then the graph is not bipartite
elif colors[neighbor] == colors[node]:
return False
return True
# loop through all nodes in the graph in case the graph is not connected
for i in graph.keys():
if not bfs(graph, i):
return False
# otherwise, the graph is bipartite
return True
# test
graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
print(isValidBipartitionGraph(graph))
graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
print(isValidBipartitionGraph(graph))
True
False
5.2 DFS#
def isValidBipartitionGraph(graph):
# initialize a set to store the visited nodes
visited = set()
# initialize a dictionary to store the colors of the nodes
colors = {}
# define a recursive function to explore the graph
def dfs(graph, source, color):
# if the source is visited, then the graph is bipartite
if source in visited:
return True
# otherwise add the source to the visited set
visited.add(source)
# color the source
colors[source] = color
# explore all the neighbors of the source
for v in graph[source]:
# if the neighbor is not visited, color it with the opposite color of the source
if v not in visited:
dfs(graph, v, 1 - color)
elif colors[v] == color:
return False
return True
for i in graph.keys():
color = 0 if i not in colors else colors[i]
if not dfs(graph, i, color):
return False
return True
graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
print(isValidBipartitionGraph(graph))
graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
print(isValidBipartitionGraph(graph))
True
False
6 Union Find#
see here.
Leetcode:
class UnionFind():
def __init__(self, n):
# initialize the parent of each node to itself
self.parent = [i for i in range(n)]
# initialize the size of each set to 1
self.size = [1 for i in range(n)]
# initialize the number of disjoint sets to n
self.num_disjoint_sets = n
def find(self, node):
# find the root of the node by traversing the parent
# add path compression to make sure find() is O(1)
# this recursion maintains the height of the tree to be 2 (root<-(node1, node2, ... node n)
if self.parent[node] != node:
# path compression: recusively set the parent of the node to the root
self.parent[node] = self.find(self.parent[node])
return self.parent[node]
def union(self, node1 , node2):
# find the root of each node
root1 = self.find(node1)
root2 = self.find(node2)
# if the two nodes are already in the same set, do nothing
if root1 == root2:
return
# otherwise, merge the two sets
# here we dont merge the smaller set to the larger set for balancing purpose
# because we have compression path in find(), the height of the tree is always 2
self.parent[root1] = root2
self.size[root2] += self.size[root1]
# decrease the number of disjoint sets by 1
self.num_disjoint_sets -= 1
def connected(self, node1, node2):
# check if two nodes are in the same set
return self.find(node1) == self.find(node2)
def count(self):
#
return self.num_disjoint_sets
7 Minimum Spanning Tree#
Suppose we wish to connect all the computers in a new office building using the least amount of cable.
We can model this problem using an undirected, weighted graph G
whose vertices represent the computers, and whose edges represent all the possible pairs (u,v)
of computers, where the weight w(u,v)
of edge(u,v)
is equal to the amount of cable needed to connect computer u
to computer v
.
Rather than computing a shortest-path tree from some particular vertex v
, we are interested instead in finding a tree T
that contains all the vertices of G
and has the minimum total weight over all such trees.
Algorithms for finding such a tree are the focus of this section.
A tree, such as this, that contains every vertex of a connected graph G
is said to be a spanning tree
, and the problem of computing a spanning tree T
with smallest total weight is known as the minimum spanning tree(MST)
problem.
Two algorithms are popular to solve this problem: Prim’s algorithm and Kruskal’s algorithm.
These two algorithms are both greedy
algorithms, as at each step of iteration, they choose the object that is the best at the moment.
Such a strategy doesn’t always ganrantee that it will always find globally optimal solutions to problems.
However, for a MST
problem, we can prove that certain greedy strategies do yield a spanning tree with minimum weight.
Leetcode
7.1 Prim’s Algorithm#
Prim’s algorithm operates similarly as Dijkstra’s algorithm.
def Prim(G, r)
# Initialize a distance table and parent table for all vertice in G
for each u in G.V
u.d = infinity
u.p = None
# Initialize distance for root node
r.d = 0
# Initialize a priority queue for all vertice based on distance
Q = PriorityQueue(G.V)
#
while Q:
# Extract the vertex with the minimum distance in Q
u = Q.extract_min()
# Explore its neighbor
for v in G.adj[u]:
# If the neighbor has not finished and the weight of edge (u,v) is smaller than its current key
if v in Q and G.W[(u,v)] < v.d:
# Track its parent
v.p = u
# Reset distance key using smaller weight
v.d = G.W[(u,v)]
A naive implementation could be:
import heapq
def Prim(graph, weight):
nodes = graph.keys()
n = len(nodes)
# initialize a distanace table and parent table for MST
distance = [float('inf') for node in nodes]
source = list(nodes)[0]
distance[source] = 0
parent = [None for node in nodes]
# initialize a MST
MST = []
# initialize a set to store the visited nodes
visited = set()
# initialize a priority queue to store the nodes
pq = [(distance[source], source)]
# loop until the queue is empty
while pq:
print("==================================")
print(pq)
# pop the node with the smallest distance
dist, node = heapq.heappop(pq)
# if the node is visited, then skip
if node in visited:
continue
# otherwise, add the node to the visited set
visited.add(node)
# add the node to the MST
if parent[node] is not None:
MST.append((parent[node], node))
# explore all the neighbors of the node
for neighbor in graph[node]:
# if the neighbor is not visited and the weight of the edge is smaller than the current distance, update the distance and parent
if neighbor not in visited and weight[(node, neighbor)] < distance[neighbor]:
distance[neighbor] = weight[(node, neighbor)]
parent[neighbor] = node
heapq.heappush(pq, (distance[neighbor], neighbor))
return MST
graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
weight = {(0, 1): 1, (0, 2): 4, (0, 3): 3, (1, 2): 2, (2, 3): 5}
print("MST is:", Prim(graph, weight))
==================================
[(0, 0)]
==================================
[(1, 1), (4, 2), (3, 3)]
==================================
[(2, 2), (4, 2), (3, 3)]
==================================
[(3, 3), (4, 2)]
==================================
[(4, 2)]
MST is: [(0, 1), (1, 2), (0, 3)]
7.2 Kruskal’s Algorithm#
Kruskal’s algorithm is based on union find
algorithm.
While the Prim's algorithm
builds the MST by growing a single treeuntil it spans the graph, Kruskal’s algorithm
maintains a forest of clusters, repeatedly merging pairs of clusters until a single cluster spans the graph.
Initially, each vertex is by itself in a singleton cluster.
The algorithm then considers each edge in turn, ordered by increasing weight.
If an edge e
connects two different clusters, then e
is added to the set of edges of the minimum spanning tree, and the two clusters connected by e
are merged into a single cluster.
If, on the other hand, e
connects two vertices that are already in the same cluster, then e
is discarded.
Once the algorithm has added enough edges to form a spanning tree, it terminates and outputs this tree as the minimum spanning tree.
def Kruskal(G):
# Initialize a forest - more precisely, it should be a forest
T = []
# Initialize a union-find disjoints for the vertices in the graph
UnionFind(G.V):
# Put all edges in `G.E` to a min priority queue sorted by the weight `G.W[e]` for each edge `e (u, v)`
Q = PriorityQueue((G.W[e], e))
# Traverse each edge in the priority queue
for (we, e) in Q:
# Unzip to get the two nodes from an edge
u, v = e
# If two nodes are not connected in the union-find disjoints
# Then connect them and add the edge to the tree
if not UnionFind.connected(u, v):
UnionFind.connect(u, v)
A.append(e)
# return a MST
return A
8 Connectivity#
Tarjan’s algorithm for:
articulation points -> critical vertex
bridhes in a graph -> critical edge
strongly connected components
def buildGraph(n_nodes, connections):
graph = {}
for node in range(n_nodes):
graph[node] = []
for node1, node2 in connections:
graph[node1].append(node2)
# if build an undirected graph, add the following line
graph[node2].append(node1)
return graph
# Tarjan algorithm for finding critical edges in a graph
def criticalConnections(n, connections):
# build the graph
graph = buildGraph(n, connections)
# initialize a list to store the critical edges
critical_edges = []
# initialize a list to store the discovery time of each node
discovery_time = [-1] * n
# initialize a list to store the lowest discovery time of each node
lowest_discovery_time = [-1] * n
# initialize a set to store the visited nodes
visited = set()
# initialize a variable to store the current time
time = 0
def dfs(graph, node, parent):
nonlocal time
# if the node is visited, then return
if node in visited:
return
# mark the node as visited
visited.add(node)
# set the discovery time and lowest discovery time of the node
discovery_time[node] = time
lowest_discovery_time[node] = time
time += 1
# explore all the neighbors of the node
for neighbor in graph[node]:
# if the neighbor is the parent, then skip
if neighbor == parent:
continue
# if the neighbor is not visited, explore it
if neighbor not in visited:
dfs(graph, neighbor, node)
# update the lowest discovery time of the node
lowest_discovery_time[node] = min(lowest_discovery_time[node], lowest_discovery_time[neighbor])
# if the lowest discovery time of the neighbor is greater than the discovery time of the node, then the edge is a critical edge
if lowest_discovery_time[neighbor] > discovery_time[node]:
critical_edges.append([node, neighbor])
# loop through all nodes in the graph in case the graph is not connected
for i in graph.keys():
dfs(graph, i, None)
return critical_edges
# test
n = 4
connections = [[0, 1], [1, 2], [2, 0], [1, 3]]
print(criticalConnections(n, connections))
[[1, 3]]