0207 Course Schedule#
Problem#
There are a total of numCourses courses you have to take, labeled from 0 to numCourses - 1. You are given an array prerequisites where prerequisites[i] = [ai, bi] indicates that you must take course bi first if you want to take course ai.
For example, the pair [0, 1], indicates that to take course 0 you have to first take course 1.
Return true if you can finish all courses. Otherwise, return false.
Examples#
Example 1:
Input: numCourses = 2, prerequisites = [[1,0]]
Output: true
Explanation: There are a total of 2 courses to take.
To take course 1 you should have finished course 0. So it is possible.
Example 2:
Input: numCourses = 2, prerequisites = [[1,0],[0,1]]
Output: false
Explanation: There are a total of 2 courses to take.
To take course 1 you should have finished course 0, and to take course 0 you should also have finished course 1. So it is impossible.
Constraints:#
1 <= numCourses <= 2000
0 <= prerequisites.length <= 5000
prerequisites[i].length == 2
0 <= ai, bi < numCourses
All the pairs prerequisites[i] are unique.
Follow-up#
Analysis#
when is it not possible to finish?
when there are loops in the dependency graph
how to detect if such a dependency graph has loops or not?
traverse the graph using DFS or BFS
mark visited nodes
if already visited, then there is a loop in the directed graph
What is a cycle in Directed Graph?#
DAG: there is a cycle along one direction. It is not possible to do a topological sorting if there is a cycle in the DAG.
[[1,0], [0,2], [2,1]]
Not DAG:
[[0,1], [0,2], [2,1]]
BFS Solution#
Solution 1:#
The following solution simply applies this logic:
build a graph from given dependency
use BFS to detect if there is a cycle
initialize a queue, and a
visistedset to track which node has been visitedgiven node, travere its neighbor
This method, however, is not correct as its implementation.
the
visitedstore all the nodes that has been visited.the case [[0,1], [0,2], [2,1]], has a cycle from undirected graph point of view, but no cycle from directed graph point of view
when traversing the node
0, BFS will enqueue its dependency[1,2]and mark as visisted.when going to the neighbor node
1, its neighbor node2is enqueued. But at previous step the node2has already been marked asvisited, which will be identified INCORRETLY as acycle.
The main reason is the visited set in this sense didn’t store parent/children information during traversal.
# the following code is only used for detecting cycles in a undirected graph
# for directed graph, this will not work as directed cycle is not the same as undirected cycle
def canFinish(numCourses, prerequisites):
# build graph from given number of courses and prerequisites
# here we represent graph as a dictionary
def buildGraph(numCourses, prerequisites):
graph = {}
for i in range(numCourses):
graph[i] = []
for x, y in prerequisites:
graph[x].append(y)
return graph
# check if there is a cycle in the graph
#visited = set()
def hasCycle(graph, source):
visited = set()
queue = [source]
while queue:
node = queue.pop(0)
# if the node is already visited, then there is a cycle
if node in visited:
return True
# mark the node as visited
visited.add(node)
# explore all the neighbors of the node
for neighbor in graph[node]:
queue.append(neighbor)
return False
graph = buildGraph(numCourses, prerequisites)
# add a for loop to check each node in case the graph is not connected
for i in range(numCourses):
if hasCycle(graph, i):
return False
return True
# test
# False
numCourses = 2
prerequisites = [[1,0], [0,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 3
prerequisites = [[0,1],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 3
prerequisites = [[1,0],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 5
prerequisites = [[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 5
prerequisites = [[0,0],[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
False
False
False
False
False
A correct implementation is to use a indegree set to record the indegree of each node. Indegree is the number of dependencies a node requires, the number of incoming connections.
When do a topological sorting, we need start from a node with 0 indegree, and such nodes are the starting point of a topological sort tree.
Algorithm:
step 0: get the indegree for each node in the graph
step 1: find nodes with
indegree of 0, and enqueue to a queuestep 2: dequeue the queue,and decrease the indegree of the connected nodes from the popped node by 1
可以想象成在graph中把该节点抹去,那么该节点指向的其他节点的indegree就减1了
repeat step 1 and 2
the traversal order is the topological sort if there is no cycle detected
if the number of nodes that has been popped out from the queue equals to the number of courses, then there is no cycle.
if there is a cycle, the number of nodes that have been popped out from the queue should be less than the toal number of courses.
def canFinish(numCourses, prerequisites):
def buildGraph(numCourses, prerequisites):
graph = {}
indegree = {}
for i in range(numCourses):
graph[i] = []
indegree[i] = 0
for t, f in prerequisites:
graph[f].append(t)
indegree[t] += 1
return graph, indegree
# bfs
def bfs(graph, indegree):
queue = []
# find indegree 0 and enqueue
for i in range(numCourses):
if indegree[i] == 0:
queue.append(i)
j = 0
while queue:
node = queue.pop(0)
# degree -1 for all the neighbors
# if the neighbor's degree is 0, enqueue
for neighbor in graph[node]:
indegree[neighbor] -= 1
if indegree[neighbor] == 0:
queue.append(neighbor)
# record the number of nodes that have been visited
j += 1
# if j == numCourses, then there is no cycle
# if there is a cycle, j is smaller than numCourses because the indegree of the nodes in a cycle is not 0, and they are not visited.
return j == numCourses
graph, indegree = buildGraph(numCourses, prerequisites)
return bfs(graph, indegree)
# test
# False
numCourses = 2
prerequisites = [[1,0], [0,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 3
prerequisites = [[0,1],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 3
prerequisites = [[1,0],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 5
prerequisites = [[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 5
prerequisites = [[0,0],[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
False
True
False
True
False
DFS Solution#
Simply check if a node is visited or not. If it is visited, then there is a loop in the graph.
follow up:
if we want to know the path of the loop,
we can use a
parentdictionary to keep track of the parent of each node. Then we can use the parent dictionary to trace back the path of the loop.add a path parameter to the dfs function, and append the current node to the path. If there is a loop, then the path will contain the loop.
notes
think about when and how to use visited or traced.
visited: the visited records all the nodes that have been visited, and explored. If during the recursion, the node marked asvisitedis visited again, there is no need to further do the calculation as it’s been done before.or we can understand it as:
visitedis for the global graph point of view to make sure the traversal is not performed twice for the same node.
traced: the traced record all the nodes that are currently on the recursion path, which might not be visited/finished. - or we can understand it as:tracedis the current recursion path, and can be used to see if current path has loop.
import collections
def canFinish(numCourses, prerequisites):
def buildGraph(numCourses, prerequisites):
# define graph as a dictionary with key as course and value as list of prerequisites
graph = collections.defaultdict(list)
for x, y in prerequisites:
graph[x].append(y)
return graph
# store the visited nodes
visited = {}
traced = {}
def hasCycle_dfs(graph, source):
# if source is already in traced, then there is a cycle
if source in traced:
return True
# if source is already visited, then there is no cycle
if source in visited:
return
# add source to visited
visited[source] = True
# add source to traced
traced[source] = True
# explore all the neighbors of the source
for neighbor in graph[source]:
# if we dont find a cycle in the neighbor, then we continue
if hasCycle_dfs(graph, neighbor):
return True
# if we replace the above line with the below line, then we will get the topological sort
# hasCycle_dfs(graph, neighbor)
# remove source from traced
traced.pop(source)
# otherwise there is no cycle
return False
graph = buildGraph(numCourses, prerequisites)
for i in range(numCourses):
hasCycle = hasCycle_dfs(graph, i)
if hasCycle:
return not hasCycle
return True
# test
# False
numCourses = 2
prerequisites = [[1,0], [0,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 3
prerequisites = [[0,1],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 3
prerequisites = [[1,0],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 5
prerequisites = [[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 5
prerequisites = [[0,0],[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
False
True
False
True
False
# implement as a traversal approach
def canFinish(numCourses, prerequisites):
def buildGraph(numCourses, prerequisites):
# define graph as a dictionary with key as course and value as list of prerequisites
graph = collections.defaultdict(list)
for x, y in prerequisites:
graph[x].append(y)
return graph
# traverse the graph
def traverse(graph, source, visited, traced):
global hasCycle
# if source is already in traced, then there is a cycle
if source in traced:
hasCycle = True
return
# if source is already visited, then there is no cycle
if source in visited:
return
# add source to visited
visited[source] = True
# add source to traced
traced[source] = True
# explore all the neighbors of the source
for neighbor in graph[source]:
# if we dont find a cycle in the neighbor, then we continue
traverse(graph, neighbor, visited, traced)
# remove source from traced
traced.pop(source)
# otherwise there is no cycle
return False
global hasCycle
hasCycle = False
visited = {}
traced = {}
graph = buildGraph(numCourses, prerequisites)
for i in range(numCourses):
traverse(graph, i, visited, traced)
if hasCycle:
return not hasCycle
return True
# test
# False
numCourses = 2
prerequisites = [[1,0], [0,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 3
prerequisites = [[0,1],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 3
prerequisites = [[1,0],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 5
prerequisites = [[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 5
prerequisites = [[0,0],[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
False
True
False
True
False
Why the following code is correct??
we removed the
visitedset, and only usetracedset to check if there is a loop in the graph.without using
visitedset, the code will still work, because thetracedset will only contain the nodes that are currently on the recursion path.however, if not using the
visitedset, the code will be slower, because the same node will be visited multiple times, and the same calculation will be performed multiple times.
import collections
def canFinish(numCourses, prerequisites):
def buildGraph(numCourses, prerequisites):
# define graph as a dictionary with key as course and value as list of prerequisites
graph = collections.defaultdict(list)
for x, y in prerequisites:
graph[x].append(y)
return graph
# store the visited nodes
traced = {}
def hasCycle_dfs(graph, source):
if source in traced:
return True
# add source to traced
traced[source] = True
# explore all the neighbors of the source
for neighbor in graph[source]:
if hasCycle_dfs(graph, neighbor):
return True
# if we remove conditional statement, then we will get the topological sort
# hasCycle_dfs(graph, neighbor)
# remove source from traced
traced.pop(source)
graph = buildGraph(numCourses, prerequisites)
for i in range(numCourses):
if hasCycle_dfs(graph, i):
return False
return True
# test
# False
numCourses = 2
prerequisites = [[1,0], [0,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 3
prerequisites = [[0,1],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 3
prerequisites = [[1,0],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 5
prerequisites = [[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 5
prerequisites = [[0,0],[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
False
True
False
True
False
If we also want to know the loop:
BFS
it’s hard to know the loop in BFS. the un-traversed nodes left by BFS is not the loop itself, but contains the loop, for example, 1<=>2->3. The loop is [1,2], but all 3 nodes are left after BFS.
we might use the outdegree information of the nodes after BFS
start from ndoes with outdegree = 0
remove such nodes, and outdegree - 1 for its neighbors
repeat
the nodes left are outdegree > 0 and indegree > 0, which formulates a cycle.
DFS
loop is in the traced or we can maintain a parent/children map. Once a cycle is detected, we can keep finding its parent to get the cycle.
import collections
def canFinish(numCourses, prerequisites):
def buildGraph(numCourses, prerequisites):
# define graph as a dictionary with key as course and value as list of prerequisites
graph = collections.defaultdict(list)
for x, y in prerequisites:
graph[x].append(y)
return graph
def hasCycle_dfs(graph, source, traced):
if source in traced:
# find cycle, then return the cycle
loops.append(traced[:]+[source])
return
# add source to visited
traced.append(source)
# explore all the neighbors of the source
for neighbor in graph[source]:
hasCycle_dfs(graph, neighbor, traced)
# remove from traced
traced.pop()
graph = buildGraph(numCourses, prerequisites)
loops = []
for i in range(numCourses):
traced = []
hasCycle_dfs(graph, i, traced)
return loops
# test
# False
numCourses = 2
prerequisites = [[1,0], [0,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 3
prerequisites = [[0,1],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 3
prerequisites = [[1,0],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 5
prerequisites = [[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 5
prerequisites = [[0,0],[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
[[0, 1, 0], [1, 0, 1]]
[]
[[0, 2, 1, 0], [1, 0, 2, 1], [2, 1, 0, 2]]
[]
[[0, 0]]
Topological Sort
For a tree,
level-order traversalorpreorder traversal, orpost-order traversalcan be treated as a topological sort.inorder traversalis not.level-order traversalfor a graph is not possiblepreorder traversalmay not give topological sort orderfor example, 0->(1, 2)->3, the preorder traversal in a DFS is [0,1,3,2]
postorder traversalcan give the topological order[3, 1, 2, 0]
reversed will be [0, 2, 1, 3]
Only
DAGhas topological sorting