DSA: Detect Cycles in Graphs (DFS vs Union-Find)

7 min read6.6k

Solution to Last Week's Challenge

Last week's problem was: DSA: Top-K Elements Using Heaps

The optimal solution uses dsa: top-k elements using heaps pattern with careful state management.

Complexity Analysis:

  • Time Complexity: O(n)
  • Space Complexity: O(1)

Graph theory is the backbone of modern computing. Whether you are managing microservices dependencies, building a social network, or optimizing circuit board layouts, understanding graph connectivity is non-negotiable. In technical interviews at Tier-1 tech companies, cycle detection is the most frequent graph-related question. It serves as a proxy for your ability to manage state, understand recursion, and choose the most efficient data structure for a specific constraint.

Detecting a cycle is not a "one-size-fits-all" problem. The approach changes significantly depending on whether the graph is directed or undirected. For directed graphs, we typically rely on Depth First Search (DFS) and the concept of a recursion stack. For undirected graphs, while DFS works, the Disjoint Set Union (Union-Find) data structure often provides a more elegant and performant solution, especially in dynamic scenarios where edges are added incrementally.

Theory and Core Concepts

To master cycle detection, you must first distinguish between the types of edges and the properties of the graph. In a directed graph, a cycle exists if, during a DFS traversal, we encounter a node that is currently in the active recursion stack. This is known as a back-edge. In an undirected graph, a cycle exists if we encounter an already visited node that is not the immediate parent of the current node.

The complexity of these algorithms is generally linear. For DFS, the time complexity is O(V + E), where V is the number of vertices and E is the number of edges. Union-Find, when optimized with path compression and union by rank, operates in near-constant time per operation, specifically O(E * α(V)), where α is the inverse Ackermann function.

Pattern Visualization

Choosing between DFS and Union-Find depends on the graph's nature and the problem's constraints. If the graph is static and directed, DFS is your best bet. If you are dealing with an undirected graph or a stream of edges where you need to detect a cycle the moment an edge is added, Union-Find is superior.

Implementation: DFS vs. Union-Find

Let's look at the two primary ways to solve this. We will use Python for its readability.

1. Directed Graph Cycle Detection (DFS)

In directed graphs, we use three states: 0 (Unvisited), 1 (Visiting), and 2 (Visited). A cycle is detected if we move to a node that is currently in the "Visiting" state.

python
def has_cycle_directed(num_nodes, adj):
    # State: 0 = unvisited, 1 = visiting, 2 = visited
    state = [0] * num_nodes
    
    def dfs(u):
        state[u] = 1 # Mark as visiting
        for v in adj[u]:
            if state[v] == 1:
                return True # Back-edge found
            if state[v] == 0:
                if dfs(v):
                    return True
        state[u] = 2 # Mark as visited
        return False

    for i in range(num_nodes):
        if state[i] == 0:
            if dfs(i):
                return True
    return False

# Time Complexity: O(V + E)
# Space Complexity: O(V) for state array and recursion stack

2. Undirected Graph Cycle Detection (Union-Find)

Union-Find is exceptionally efficient for undirected graphs. If two nodes of an edge already belong to the same set, adding that edge creates a cycle.

python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, i):
        if self.parent[i] == i:
            return i
        # Path compression
        self.parent[i] = self.find(self.parent[i])
        return self.parent[i]

    def union(self, i, j):
        root_i = self.find(i)
        root_j = self.find(j)
        if root_i != root_j:
            # Union by rank
            if self.rank[root_i] < self.rank[root_j]:
                self.parent[root_i] = root_j
            elif self.rank[root_i] > self.rank[root_j]:
                self.parent[root_j] = root_i
            else:
                self.parent[root_i] = root_j
                self.rank[root_j] += 1
            return True
        return False # Cycle detected

def has_cycle_undirected(num_nodes, edges):
    uf = UnionFind(num_nodes)
    for u, v in edges:
        if not uf.union(u, v):
            return True
    return False

# Time Complexity: O(E * α(V))
# Space Complexity: O(V) to store parents and ranks

Complexity Analysis Table

MethodGraph TypeTime ComplexitySpace ComplexityBest Use Case
DFS (3-State)DirectedO(V + E)O(V)Dependency resolution, Course schedule.
DFS (Parent)UndirectedO(V + E)O(V)Simple connectivity in static graphs.
Union-FindUndirectedO(E * α(V))O(V)Redundant connections, Kruskal's MST.
Kahn's (BFS)DirectedO(V + E)O(V)Detecting cycles while sorting topologically.

Common Patterns

Pattern 1: The "Redundant Connection" In an undirected graph that started as a tree, adding one edge creates a cycle. Union-Find is the standard tool here.

python
# Simplified Union-Find logic for a single edge check
for u, v in edges:
    root_u = find(u)
    root_v = find(v)
    if root_u == root_v:
        return [u, v] # This edge completes the cycle
    union(u, v)

Pattern 2: Dependency Deadlock (Course Schedule) In a directed graph of tasks, if Task A depends on B and B depends on A, you have a deadlock. DFS with a recursion stack is the most intuitive approach.

Practice Problems

To truly master this, you should solve problems across the difficulty spectrum. Focus on identifying whether the graph is directed or undirected first.

This Week's Interview Challenge

Problem: The Critical Server Link

You are maintaining a network of n servers, labeled from 0 to n-1. The servers are connected by undirected communication cables provided as an array edges. A network is considered "unstable" if it contains a cycle, as packets might loop infinitely.

Given the current connections, determine if the network is a "Valid Tree." A graph is a valid tree if it is fully connected and contains no cycles.

Example 1:

  • Input: n = 5, edges = [[0,1], [0,2], [0,3], [1,4]]
  • Output: True

Example 2:

  • Input: n = 5, edges = [[0,1], [1,2], [2,3], [1,3], [1,4]]
  • Output: False

Constraints:

  • 1 <= n <= 2000
  • 0 <= edges.length <= 5000
  • No self-loops or duplicate edges.

Hints:

  1. How many edges must a tree with n nodes have?
  2. If you use Union-Find, what happens if you try to connect two nodes that are already in the same component?
  3. Don't forget to check if all nodes are connected (the graph might be a forest, not a single tree).

Template Code:

python
def is_valid_tree(n, edges):
    # Your implementation here
    pass

# Submit your solution and check back next week for the detailed answer!

Conclusion

Cycle detection is more than just a search algorithm; it’s a fundamental tool for ensuring system stability. When preparing for interviews, remember: Directed = DFS (3-state) and Undirected = Union-Find. Master these two, and you will be able to solve 80% of graph-related interview questions. Focus on the edge cases—specifically disconnected components and empty graphs—to impress your interviewers.