Блог пользователя lqy2014

Автор lqy2014, история, 6 недель назад, По-английски

Notes on Tarjan's Algorithm The Tarjan algorithm is named after its inventor: Robert Tarjan (born 1948). Tarjan proposed many graph algorithms, among which the most famous is the algorithm for connected components: the Tarjan algorithm. This note follows the order in which the author learned the material. Undirected Graphs We first discuss connected components in undirected graphs. Definitions Some terms below may be unfamiliar to readers, so we define them: Graph: A graph is defined as a pair of sets, e.g., a graph G=(V,E) , where V is the vertex set and E is the edge set connecting these vertices. Tree: A tree is a special type of graph with n vertices, n−1 edges, is connected (see below), and has exactly one simple path between any pair of vertices. Subgraph: A subgraph has vertex set V 2 ​ ⊆V 1 ​

and edge set E 2 ​ ⊆E , where all edges in E 2 ​

connect vertices within V 2 ​

. All graphs we discuss are unweighted by default, and we focus on whether the graph is connected. Connected: A graph is connected if for all u,v , u→v , meaning u and v are mutually reachable. We also solve for the following concepts: Cut Vertex: A vertex whose removal increases the number of connected components in the graph. Bridge (Cut Edge): An edge whose removal increases the number of connected components by exactly 1. We will show later: Cut vertices are stronger / more fundamental than bridges. Note: Removing a cut vertex can increase the number of components by an uncertain amount, but removing a bridge always increases it by exactly 1. Proof: Intuitively, a vertex may connect many components, while an edge only connects two. Deleting a vertex can split many parts, leading to an uncertain increase; deleting an edge only splits one pair, so the increase is exactly 1. The next two connected-component concepts rely on a key idea: Maximal: A subgraph is maximal with respect to a property if adding any extra vertex or edge would violate that property. With this understanding, we define: Vertex-Biconnected Component (v‑BCC, or point‑bcc): A maximal subgraph that is connected and contains no cut vertices (within the subgraph). Edge-Biconnected Component (e‑BCC, or edge‑bcc): A maximal subgraph that is connected and contains no bridges (within the subgraph). Note: For a v‑BCC, “no cut vertices” means no cut vertices inside the component, not globally. An e‑BCC may contain global bridges. All algorithms below use a tree structure on the graph: DFS Tree: Starting from any vertex, perform a DFS that visits every vertex exactly once. The tree formed by the visited edges and vertices is called a DFS tree. Many more definitions will follow; these cover about 80% of the basics. Step -1: Deeper Properties Edge-biconnected components can also be described as: After removing all bridges, each remaining connected component is an edge-biconnected component. Now consider: what happens if we condense each e‑BCC / v‑BCC into a single “super node”? For e‑BCCs: From the above definition, condensing each e‑BCC into a node produces a tree! This does not work for v‑BCCs, because cut vertices belong to multiple v‑BCCs, so normal condensation fails. Instead, we use the Block-Cut Tree (Circle-Square Tree) introduced later. Why do we use a DFS tree? Because in a DFS tree, non‑tree edges can only be back edges, never cross edges. If a cross edge existed, DFS would have already traversed it. This property allows us to compute v‑BCCs and e‑BCCs efficiently. Back Edge: An edge connecting a vertex to one of its ancestors. Cross Edge: An edge between two vertices with no ancestor‑descendant relationship. Step 0: Earlier Approaches Before Tarjan’s algorithm, to my knowledge, one method for finding bridges used linear algebra over the cycle space. It is based on: Theorem 0.1: A bridge cannot lie on any cycle. If we assign indices to back edges and tree edges, we can represent cycles formed by exactly one back edge. The vector space spanned by these cycles is the cycle space, and these cycles are linearly independent. From the theorem: an edge is a bridge if and only if its coefficient is 0 in every cycle of this space. We could build this space and query it. However, the size of this space is O(2 m−(n−1) ) , which is infeasible. We observe that if all basis elements give 0, then the entire cycle space gives 0. This leads to methods like tree difference / heavy-light decomposition. We must also handle disconnected graphs: Process each connected component separately — the same applies to Tarjan’s algorithm. This approach requires three DFS traversals and is not suitable as a concise template. Tarjan recognized this and developed the Tarjan algorithm. Step 1: Edge-BCCs + Bridges What made Tarjan’s algorithm for connected components revolutionary is the new concept: low i ​

! The entire algorithm revolves around low i ​

. Without it, Tarjan’s method would have been quickly forgotten, like the earlier linear-algebra approach. Let’s see how it finds bridges and e‑BCCs. First, return to the DFS tree and define: Timestamp: The order in which vertices are first visited by DFS, denoted dfn . A key property: in a DFS tree, every non‑tree edge is a back edge. Thus, for a back edge, if dfn x ​ >dfn u ​

, then x is an ancestor of u . We introduce: Theorem 1.1: If for every v∈subtree(u) , there exists x∈ / subtree(u) (in the DFS tree T ) such that the path x→v does not use the edge u→v , then (u,v) is not a bridge. subtree(u) : the subtree rooted at u . son(u) : a child of u . Intuition: A bridge splits the graph into two components when removed. Thus, (u,v) is a bridge only if no node in the subtree of u can reach above u without using u→v . With Theorem 1.1, we define low i ​

: low i ​

is the smallest timestamp reachable from i using at most one back edge. Why timestamps? They let us compare dfn u ​

and low v ​

to make decisions. Typical definition: low u ​ =min{dfn x ​ ≤dfn u ​

​ y→x,y∈subtree(v),v∈son(u)} The rule for a bridge emerges: (u,v) is a bridge if and only if low v ​ >dfn u ​

. How to compute low i ​

: When u visits a new child v , recursively visit v , then update low u ​ =min(low u ​ ,low v ​ ) . If u→v is a back edge, set low u ​ =min(low u ​ ,dfn v ​ ) . Important: When finding bridges, you must handle multiple edges correctly. Now for e‑BCCs: use a stack. Push every visited vertex onto the stack. When you detect a bridge u→v , pop vertices from the stack up to and including v — but keep u since it belongs to a different component. At the end, you must extract the final e‑BCC containing the root; otherwise, the code is wrong. The resulting code is clean and short — far better than the messy linear-algebra method. It becomes our standard template. Step 2: Vertex-BCCs + Cut Vertices + Menger’s Theorem As mentioned, all Tarjan-based algorithms center on low i ​

. How do we get v‑BCCs and cut vertices? We adapt the logic used for bridges. We had Theorem 1.1 for bridges; is there an analogous statement for cut vertices? Yes. Proposition 2.1: If for every v∈subtree(u) , there exists x∈ / subtree(u) (in DFS tree T ) such that the path x→v does not pass through u , then u is not a cut vertex. This is very similar to the bridge condition. Indeed, Tarjan’s algorithms for cut vertices and bridges come from the same core idea. However, Proposition 2.1 has a flaw: If u is the root, it is a cut vertex only if it has at least two children. We must handle the root as a special case. With the corrected condition, we use the same low i ​

: low i ​

= the smallest timestamp of any ancestor reachable from i . Cut vertex condition: u is a cut vertex if there exists a child v such that low v ​ ≥dfn u ​

Meaning: v cannot escape the subtree of u without going through u itself. For bridges, we required low v ​ >dfn u ​

; here we allow equality. Finally, handle the root separately. For v‑BCCs: When you identify u as a cut vertex, pop vertices from the stack up to v , then add u to the component. Since u may belong to multiple v‑BCCs, we do not remove it from the stack. We maintain components with a stack, similarly to e‑BCCs. At the end, one element (the root) usually remains in the stack — this is fine, even for singleton components. Structure summary: Edge-BCCs are built around vertices. Vertex-BCCs are built around edges and expand until they hit a global cut vertex. Tarjan’s algorithm runs in linear time, so it is efficient. Menger’s Theorem (below) further justifies why v‑BCCs are meaningful. Why study connected components? Why are cut vertices “stronger” than bridges? These questions are answered by Menger’s Theorem, derived from: Theorem 2.2: Inside an edge-biconnected component, for any edge e=(u,v) and any vertex x , there exists a cycle containing both x and e . The analogous property holds for vertex-biconnected components. This is proven by a fundamental result from the Austrian mathematician Karl Menger (1927): Menger’s Theorem: A k 1 ​

-edge-connected graph cannot be disconnected without removing at least k 1 ​

edges. A k 2 ​

-vertex-connected graph cannot be disconnected without removing at least k 2 ​

vertices. Extended: In a k -edge-connected graph, any two vertices have k edge-disjoint simple paths between them. In a k -vertex-connected graph, any two vertices have k internally vertex-disjoint simple paths (excluding endpoints). This theorem is equivalent to the max-flow min-cut theorem. Connectivity is the discrete version of max-flow min-cut. This is why we study connectivity. By Menger’s Theorem, vertex connectivity is stronger than edge connectivity. In e‑BCCs, the only mandatory edges between u and v are bridges. In v‑BCCs, what are the mandatory vertices? This is answered by the circle-square tree. v‑BCCs are manageable because the total number of vertices in all v‑BCCs is bounded by O(m+n) . Step 3: Condensation (Shrinking Components) Edge-BCC Condensation Condensing e‑BCCs is intuitive: Since e‑BCCs are connected only by bridges, shrinking each e‑BCC into a single node produces a tree. A classic example: P2860 [USACO06JAN] Redundant Paths G — Luogu: Make the graph a single cycle. After e‑BCC condensation, the answer is simply: ceil( number_of_leaves / 2 ) Code: cpp 运行

include<bits/stdc++.h>

using namespace std; typedef pair<int, int> P; const int N = 5005; struct Edge{ int to, id; }; int n, m; int dfn[N], low[N], idx, from[N], d[N]; int st[N], col[N], tp, cnum; vector edge[N]; map<P, bool> vi;

void dfs(int u){ dfn[u] = low[u] = ++idx; st[tp++] = u; for(auto e : edge[u]){ if(from[u] == e.id) continue; int v = e.to; if(!dfn[v]){ from[v] = e.id; dfs(v); low[u] = min(low[u], low[v]); }else{ low[u] = min(low[u], dfn[v]); } } if(low[u] == dfn[u]){ col[u] = ++cnum; tp--; while(st[tp] != u){ col[st[tp]] = cnum; tp--; } } }

void tarjan(){ for(int i = 1; i <= n; i++){ if(!dfn[i]) dfs(i); } }

int main(){ cin >> n >> m; for(int i = 1; i <= m; i++){ int u, v; cin >> u >> v; edge[u].push_back({v, i}); edge[v].push_back({u, i}); } tarjan(); for(int i = 1; i <= n; i++){ for(auto e : edge[i]){ int j = e.to; int x = col[i], y = col[j]; if(x == y) continue; P tmp = make_pair(x, y); if(vi.find(tmp) == vi.end()){ d[x]++, d[y]++; vi[tmp] = 1; } } } int ans = 0; for(int i = 1; i <= cnum; i++){ if(d[i] == 1) ans++; } cout << (ans + 1) / 2 << endl; return 0; } Note: Off-by-one stack errors (e.g., confusing tp++ and ++tp) are a common source of RE. Always check for multiple edges in bridge / e‑BCC code! Vertex-BCC Condensation & The Circle-Square Tree Condensing v‑BCCs is tricky because cut vertices belong to multiple v‑BCCs. The solution: Erase all original edges. For each v‑BCC, create a new square node. Connect every original vertex (circle node) in the v‑BCC to this square node. The result is a Block-Cut Tree (Circle-Square Tree). This is the generalized circle-square tree. Properties: The circle-square tree is connected. It is acyclic (a tree). It is bipartite. It is famous in OI because it cleanly represents mandatory vertices: 4. The set of mandatory vertices on any u - v path = the set of circle nodes on the path between u and v in the circle-square tree. In e‑BCC condensation, edges on the spanning tree are mandatory. This is not true for v‑BCCs, because global cut vertices behave differently. The circle-square tree avoids confusion between intra-component and inter-component paths. That is why it is powerful. The set of all feasible vertices for u - v paths is the union of circle nodes on the tree path. This completes v‑BCC condensation. We have now covered biconnected components in undirected graphs. Practice Problems: P4630 [APIO2018] Duathlon — Luogu P4606 [SDOI2018] Strategic Game — Luogu

Полный текст и комментарии »

  • Проголосовать: нравится
  • -27
  • Проголосовать: не нравится