0207 Course Schedule#
Problem#
There are a total of numCourses
courses you have to take, labeled from 0
to numCourses - 1
. You are given an array prerequisites where prerequisites[i] = [ai, bi]
indicates that you must take course bi
first if you want to take course ai
.
For example, the pair [0, 1]
, indicates that to take course 0
you have to first take course 1
.
Return true
if you can finish all courses. Otherwise, return false
.
Examples#
Example 1:
Input: numCourses = 2, prerequisites = [[1,0]]
Output: true
Explanation: There are a total of 2 courses to take.
To take course 1 you should have finished course 0. So it is possible.
Example 2:
Input: numCourses = 2, prerequisites = [[1,0],[0,1]]
Output: false
Explanation: There are a total of 2 courses to take.
To take course 1 you should have finished course 0, and to take course 0 you should also have finished course 1. So it is impossible.
Constraints:#
1 <= numCourses <= 2000
0 <= prerequisites.length <= 5000
prerequisites[i].length == 2
0 <= ai, bi < numCourses
All the pairs prerequisites[i] are unique.
Follow-up#
Analysis#
when is it not possible to finish?
when there are loops in the dependency graph
how to detect if such a dependency graph has loops or not?
traverse the graph using DFS or BFS
mark visited nodes
if already visited, then there is a loop in the directed graph
What is a cycle in Directed Graph?#
DAG: there is a cycle along one direction. It is not possible to do a topological sorting if there is a cycle in the DAG.
[[1,0], [0,2], [2,1]]
Not DAG:
[[0,1], [0,2], [2,1]]
BFS Solution#
Solution 1:#
The following solution simply applies this logic:
build a graph from given dependency
use BFS to detect if there is a cycle
initialize a queue, and a
visisted
set to track which node has been visitedgiven node, travere its neighbor
This method, however, is not correct as its implementation.
the
visited
store all the nodes that has been visited.the case [[0,1], [0,2], [2,1]], has a cycle from undirected graph point of view, but no cycle from directed graph point of view
when traversing the node
0
, BFS will enqueue its dependency[1,2]
and mark as visisted.when going to the neighbor node
1
, its neighbor node2
is enqueued. But at previous step the node2
has already been marked asvisited
, which will be identified INCORRETLY as acycle
.
The main reason is the visited
set in this sense didn’t store parent/children information during traversal.
# the following code is only used for detecting cycles in a undirected graph
# for directed graph, this will not work as directed cycle is not the same as undirected cycle
def canFinish(numCourses, prerequisites):
# build graph from given number of courses and prerequisites
# here we represent graph as a dictionary
def buildGraph(numCourses, prerequisites):
graph = {}
for i in range(numCourses):
graph[i] = []
for x, y in prerequisites:
graph[x].append(y)
return graph
# check if there is a cycle in the graph
#visited = set()
def hasCycle(graph, source):
visited = set()
queue = [source]
while queue:
node = queue.pop(0)
# if the node is already visited, then there is a cycle
if node in visited:
return True
# mark the node as visited
visited.add(node)
# explore all the neighbors of the node
for neighbor in graph[node]:
queue.append(neighbor)
return False
graph = buildGraph(numCourses, prerequisites)
# add a for loop to check each node in case the graph is not connected
for i in range(numCourses):
if hasCycle(graph, i):
return False
return True
# test
# False
numCourses = 2
prerequisites = [[1,0], [0,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 3
prerequisites = [[0,1],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 3
prerequisites = [[1,0],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 5
prerequisites = [[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 5
prerequisites = [[0,0],[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
False
False
False
False
False
A correct implementation is to use a indegree
set to record the indegree of each node. Indegree
is the number of dependencies a node requires, the number of incoming connections.
When do a topological sorting, we need start from a node with 0 indegree
, and such nodes are the starting point of a topological sort tree.
Algorithm:
step 0: get the indegree for each node in the graph
step 1: find nodes with
indegree of 0
, and enqueue to a queuestep 2: dequeue the queue,and decrease the indegree of the connected nodes from the popped node by 1
可以想象成在graph中把该节点抹去,那么该节点指向的其他节点的indegree就减1了
repeat step 1 and 2
the traversal order is the topological sort if there is no cycle detected
if the number of nodes that has been popped out from the queue equals to the number of courses, then there is no cycle.
if there is a cycle, the number of nodes that have been popped out from the queue should be less than the toal number of courses.
def canFinish(numCourses, prerequisites):
def buildGraph(numCourses, prerequisites):
graph = {}
indegree = {}
for i in range(numCourses):
graph[i] = []
indegree[i] = 0
for t, f in prerequisites:
graph[f].append(t)
indegree[t] += 1
return graph, indegree
# bfs
def bfs(graph, indegree):
queue = []
# find indegree 0 and enqueue
for i in range(numCourses):
if indegree[i] == 0:
queue.append(i)
j = 0
while queue:
node = queue.pop(0)
# degree -1 for all the neighbors
# if the neighbor's degree is 0, enqueue
for neighbor in graph[node]:
indegree[neighbor] -= 1
if indegree[neighbor] == 0:
queue.append(neighbor)
# record the number of nodes that have been visited
j += 1
# if j == numCourses, then there is no cycle
# if there is a cycle, j is smaller than numCourses because the indegree of the nodes in a cycle is not 0, and they are not visited.
return j == numCourses
graph, indegree = buildGraph(numCourses, prerequisites)
return bfs(graph, indegree)
# test
# False
numCourses = 2
prerequisites = [[1,0], [0,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 3
prerequisites = [[0,1],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 3
prerequisites = [[1,0],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 5
prerequisites = [[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 5
prerequisites = [[0,0],[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
False
True
False
True
False
DFS Solution#
Simply check if a node is visited or not. If it is visited, then there is a loop in the graph.
follow up:
if we want to know the path of the loop,
we can use a
parent
dictionary to keep track of the parent of each node. Then we can use the parent dictionary to trace back the path of the loop.add a path parameter to the dfs function, and append the current node to the path. If there is a loop, then the path will contain the loop.
notes
think about when and how to use visited
or traced
.
visited
: the visited records all the nodes that have been visited, and explored. If during the recursion, the node marked asvisited
is visited again, there is no need to further do the calculation as it’s been done before.or we can understand it as:
visited
is for the global graph point of view to make sure the traversal is not performed twice for the same node.
traced
: the traced record all the nodes that are currently on the recursion path, which might not be visited/finished. - or we can understand it as:traced
is the current recursion path, and can be used to see if current path has loop.
import collections
def canFinish(numCourses, prerequisites):
def buildGraph(numCourses, prerequisites):
# define graph as a dictionary with key as course and value as list of prerequisites
graph = collections.defaultdict(list)
for x, y in prerequisites:
graph[x].append(y)
return graph
# store the visited nodes
visited = {}
traced = {}
def hasCycle_dfs(graph, source):
# if source is already in traced, then there is a cycle
if source in traced:
return True
# if source is already visited, then there is no cycle
if source in visited:
return
# add source to visited
visited[source] = True
# add source to traced
traced[source] = True
# explore all the neighbors of the source
for neighbor in graph[source]:
# if we dont find a cycle in the neighbor, then we continue
if hasCycle_dfs(graph, neighbor):
return True
# if we replace the above line with the below line, then we will get the topological sort
# hasCycle_dfs(graph, neighbor)
# remove source from traced
traced.pop(source)
# otherwise there is no cycle
return False
graph = buildGraph(numCourses, prerequisites)
for i in range(numCourses):
hasCycle = hasCycle_dfs(graph, i)
if hasCycle:
return not hasCycle
return True
# test
# False
numCourses = 2
prerequisites = [[1,0], [0,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 3
prerequisites = [[0,1],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 3
prerequisites = [[1,0],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 5
prerequisites = [[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 5
prerequisites = [[0,0],[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
False
True
False
True
False
# implement as a traversal approach
def canFinish(numCourses, prerequisites):
def buildGraph(numCourses, prerequisites):
# define graph as a dictionary with key as course and value as list of prerequisites
graph = collections.defaultdict(list)
for x, y in prerequisites:
graph[x].append(y)
return graph
# traverse the graph
def traverse(graph, source, visited, traced):
global hasCycle
# if source is already in traced, then there is a cycle
if source in traced:
hasCycle = True
return
# if source is already visited, then there is no cycle
if source in visited:
return
# add source to visited
visited[source] = True
# add source to traced
traced[source] = True
# explore all the neighbors of the source
for neighbor in graph[source]:
# if we dont find a cycle in the neighbor, then we continue
traverse(graph, neighbor, visited, traced)
# remove source from traced
traced.pop(source)
# otherwise there is no cycle
return False
global hasCycle
hasCycle = False
visited = {}
traced = {}
graph = buildGraph(numCourses, prerequisites)
for i in range(numCourses):
traverse(graph, i, visited, traced)
if hasCycle:
return not hasCycle
return True
# test
# False
numCourses = 2
prerequisites = [[1,0], [0,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 3
prerequisites = [[0,1],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 3
prerequisites = [[1,0],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 5
prerequisites = [[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 5
prerequisites = [[0,0],[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
False
True
False
True
False
Why the following code is correct??
we removed the
visited
set, and only usetraced
set to check if there is a loop in the graph.without using
visited
set, the code will still work, because thetraced
set will only contain the nodes that are currently on the recursion path.however, if not using the
visited
set, the code will be slower, because the same node will be visited multiple times, and the same calculation will be performed multiple times.
import collections
def canFinish(numCourses, prerequisites):
def buildGraph(numCourses, prerequisites):
# define graph as a dictionary with key as course and value as list of prerequisites
graph = collections.defaultdict(list)
for x, y in prerequisites:
graph[x].append(y)
return graph
# store the visited nodes
traced = {}
def hasCycle_dfs(graph, source):
if source in traced:
return True
# add source to traced
traced[source] = True
# explore all the neighbors of the source
for neighbor in graph[source]:
if hasCycle_dfs(graph, neighbor):
return True
# if we remove conditional statement, then we will get the topological sort
# hasCycle_dfs(graph, neighbor)
# remove source from traced
traced.pop(source)
graph = buildGraph(numCourses, prerequisites)
for i in range(numCourses):
if hasCycle_dfs(graph, i):
return False
return True
# test
# False
numCourses = 2
prerequisites = [[1,0], [0,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 3
prerequisites = [[0,1],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 3
prerequisites = [[1,0],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 5
prerequisites = [[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 5
prerequisites = [[0,0],[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
False
True
False
True
False
If we also want to know the loop:
BFS
it’s hard to know the loop in BFS. the un-traversed nodes left by BFS is not the loop itself, but contains the loop, for example, 1<=>2->3. The loop is [1,2], but all 3 nodes are left after BFS.
we might use the outdegree information of the nodes after BFS
start from ndoes with outdegree = 0
remove such nodes, and outdegree - 1 for its neighbors
repeat
the nodes left are outdegree > 0 and indegree > 0, which formulates a cycle.
DFS
loop is in the traced or we can maintain a parent/children map. Once a cycle is detected, we can keep finding its parent to get the cycle.
import collections
def canFinish(numCourses, prerequisites):
def buildGraph(numCourses, prerequisites):
# define graph as a dictionary with key as course and value as list of prerequisites
graph = collections.defaultdict(list)
for x, y in prerequisites:
graph[x].append(y)
return graph
def hasCycle_dfs(graph, source, traced):
if source in traced:
# find cycle, then return the cycle
loops.append(traced[:]+[source])
return
# add source to visited
traced.append(source)
# explore all the neighbors of the source
for neighbor in graph[source]:
hasCycle_dfs(graph, neighbor, traced)
# remove from traced
traced.pop()
graph = buildGraph(numCourses, prerequisites)
loops = []
for i in range(numCourses):
traced = []
hasCycle_dfs(graph, i, traced)
return loops
# test
# False
numCourses = 2
prerequisites = [[1,0], [0,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 3
prerequisites = [[0,1],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 3
prerequisites = [[1,0],[0,2],[2,1]]
print(canFinish(numCourses, prerequisites))
# True
numCourses = 5
prerequisites = [[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
# False
numCourses = 5
prerequisites = [[0,0],[1,4],[2,4],[3,1],[3,2]]
print(canFinish(numCourses, prerequisites))
[[0, 1, 0], [1, 0, 1]]
[]
[[0, 2, 1, 0], [1, 0, 2, 1], [2, 1, 0, 2]]
[]
[[0, 0]]
Topological Sort
For a tree,
level-order traversal
orpreorder traversal
, orpost-order traversal
can be treated as a topological sort.inorder traversal
is not.level-order traversal
for a graph is not possiblepreorder traversal
may not give topological sort orderfor example, 0->(1, 2)->3, the preorder traversal in a DFS is [0,1,3,2]
postorder traversal
can give the topological order[3, 1, 2, 0]
reversed will be [0, 2, 1, 3]
Only
DAG
has topological sorting