Hi everyone!
There is a well-known result about bipartite graphs which is formulated as follows:
In any bipartite graph, the number of edges in a maximum matching equals the number of vertices in a minimum vertex cover.
Knowing it you can easily find the size of a minimum vertex cover in a given bipartite graph. But what about actually recovering such cover? Well... There is an algorithm to it, but it's lengthy and not very motivated, so I constantly forget its details. However, there is a simple alternative, which is easily reproduced from scratch, even if you forget specific details.
Besides, there is another well-known result which is formulated as follows:
A bipartite graph has a perfect matching if and only if every subset $$$\alpha$$$ of one part has a neighborhood $$$\beta$$$ in the other part such that $$$|\beta| \geq |\alpha|$$$.
This result can be proven with a similar approach. It is likely known to those familiar with the topic, but still might be of interest to someone.
Kőnig's theorem
Recall that given a flow network $$$G=(V, E)$$$, finding minimum cut between $$$s$$$ and $$$t$$$ is done as follows:
- Find a maximum flow $$$f$$$ from $$$s$$$ to $$$t$$$ in the network;
- Find $$$X$$$, the set of all vertices reachable from $$$s$$$ in the residual network;
- Minimum cut is formed by $$$S=X$$$ and $$$T=V \setminus X$$$, that is the minimum cut is formed by the edges going from $$$S$$$ to $$$T$$$.
Now, recall that bipartite matching between sets of vertices $$$A$$$ and $$$B$$$ may be found as maximum flow in the following network:
- There is an edge of capacity $$$1$$$ going from $$$s$$$ to every vertex of $$$A$$$;
- There is an edge of capacity $$$1$$$ going from every vertex of $$$B$$$ to $$$t$$$;
- There is an edge of capacity $$$+\infty$$$ going between some vertices of $$$A$$$ and $$$B$$$, as defined by the bipartite graph.
Now... Is there anything special about minimum cut in such network?
Minimum cut $$$(S, T)$$$ in a flow network induced by a bipartite graph
Generally, some vertices from $$$A$$$ and some vertices from $$$B$$$ will be with the source $$$s$$$ in the cut, while other will be with the sink $$$t$$$.
Let $$$A = A_S \cup A_T$$$ and $$$B = B_S \cup B_T$$$, such that $$$A_S, B_S \subset S$$$ and $$$A_T, B_T \subset T$$$. Then the following is evidently true:
- There are no edges from $$$A_S$$$ to $$$B_T$$$ (otherwise the cut would be infinite);
- Thus, every edge in the bipartite graph is incident to some vertex from $$$A_T$$$ or $$$B_S$$$ (or both);
- Minimum cut is formed only by edges going from $$$S$$$ to $$$A_T$$$ and from $$$B_S$$$ to $$$T$$$ and thus its size is $$$|A_T|+|B_S|$$$.
Putting it all together, the minimum vertex cover is $$$A_T \cup B_S$$$, and it can be easily found from the minimum cut:
- Find a minimum cut $$$(S, T)$$$ in the flow network of the maximum matching on bipartite graph with parts $$$A$$$ and $$$B$$$;
- A minimum vertex cover is comprised of the sets $$$A_T = (A \cap T)$$$ and $$$B_S = (B \cap S)$$$.
(Generalized) Hall's theorem
The approach above is useful in some other cases as well. Consider the following problem:
A bipartite graph on parts $$$A$$$ and $$$B$$$ is a $$$k$$$-expander if the neighborhood (set of adjacent vertices) $$$\beta \subset B$$$ of any $$$\alpha \subset A$$$ is such that
Given a bipartite graph, you need to check whether it is a $$$k$$$-expander.
Note that for $$$k=1$$$ and $$$|A|=|B|$$$, the condition above is, by Hall's theorem, equivalent to graph having a perfect matching.
Let's construct a similar flow network, but the edges from $$$s$$$ to $$$A$$$ will have capacity $$$k$$$, while the edges from $$$B$$$ to $$$t$$$ have capacity $$$1$$$, and the edges between $$$A$$$ and $$$B$$$ still have capacity $$$+\infty$$$. One has to check whether such graph has a maximum flow of size $$$k|A|$$$.
If the graph is not a $$$k$$$-expander, there can't be such flow, as there is a subset $$$\alpha \subset A$$$ through which you can't push flow of size $$$k|\alpha|$$$ no matter what. But how to prove that the graph is a $$$k$$$-expander when there is such flow?
Well, we rather prove that if the minimum cut in the graph has a capacity less than $$$k|A|$$$, the graph is not a $$$k$$$-expander.
Let's split the minimum cut into $$$A_S \cup B_S$$$ and $$$A_T \cup B_T$$$, as above. There still must be no edges from $$$A_S$$$ to $$$B_T$$$, as they have infinite capacities. So, there is a cut of size $$$k|A_T|+|B_S|<k|A|$$$, which translates into $$$|B_S| < k(|A|-|A_T|) = k|A_S|$$$.
On the other hand, edges from $$$A_S$$$ are only directed in $$$B_S$$$, thus $$$\alpha=A_S$$$ is a subset of $$$A$$$ for which $$$|\beta| \geq k|\alpha|$$$ does not hold.
That being said, the algorithm to check whether the graph is a $$$k$$$-expander is as follows:
- Construct a network, in which edges $$$s \to A$$$ have capacity $$$k$$$, edges $$$B \to t$$$ have capacity $$$1$$$ and edges $$$A \to B$$$ have capacity $$$+\infty$$$;
- Check that the maximum flow (minimum cut) in the network is $$$k|A|$$$. If it is, the graph is a $$$k$$$-expander, otherwise it's not.
> There is an algorithm to it, but it's lengthy and not very motivated, so I constantly forget its details.
Well, it's almost the same as you described. Direct the edges of the bipartite graph as in Kuhn (saturated ones are from right to left, unsaturated ones are from left to right); then run DFS from unsaturated vertices of the left part, and split the vertices of both parts into two sets: visited by DFS and not visited by it. Take visited vertices from the right part and unvisited vertices from the left part as the vertex cover. It's almost the same as finding the minimum cut and splitting vertices according to their state in the cut.
Anyway, very nice tutorial!
Yeah, it's exactly the same considering that the thing you build is exactly the residual network and running the DFS just finds the set of vertices reachable from $$$s$$$, which is exactly what finding minimum cut is.
But what I would say is that thinking about this in a flow context is what makes this algorithm actually intuitive and motivated. Compare this to the "constructive proof" section on Wikipedia, where it is described as dry set-theoretic casework.
Yeah, I kinda don't like the proof from Wikipedia. When I described this algorithm to my students, I tried visualizing the four sets of vertices and possible edges between them in a small picture to show its correctness. Usually it worked well enough.
Hm, it seems that the version you describe assumes capacity $$$1$$$ instead of $$$+\infty$$$ on edges between $$$A$$$ and $$$B$$$.
The main difference is that with capacity $$$+\infty$$$ you're still allowed to follow the edge in the initial direction even if it is used in the matching. The main property that there are no edges from $$$A_S$$$ to $$$B_T$$$ in the cut still holds even with capacities of $$$1$$$, but it's a bit more tedious to prove and is not immediately obvious as with $$$+\infty$$$ capacities.
DFS results will be exactly the same in both ways to formulate algorithm. DFS never needs to go through an edge in matching from left to right: we can reach a saturated vertex in the left part only through its match.
So there's no real difference, it's just easier, in my opinion, to remember something like "direct the edges as in Kuhn".
This is easier to remember for me when you realize the vertex cover is exactly the vertices incident to edges in the minimum cut.
The reason for this is simple: in a cut, for every edge A-B you must cut either s-A or B-t, so necessarily the set of edges you cut form a valid vertex cover.
I'm bad in graph theory like that. To me,this tutorial is a very nice one.
Sorry if I misunderstood. Set S is suppose to be the set of vertices reachable from source s in the residual graph. In the picture, how A3 (numbering the set A from top to bottom) is in set S ? Since unit flow is passing through s to A3, isn't the capacity of the edge in residual graph is 0?
You're right. The reason here is that the picture shows just some minimum cut on the network, and not necessarily the one formed by the vertices reachable from $$$s$$$ in the residual network. I think in this network, the only possible cut induced by a residual network is $$$S = \{a_1, a_2, b_1\}$$$.