Motivation
By chance, I stumbled upon the blog post by errorgorn titled Solving Problems with Min Cut Max Flow Duality. When they mentioned using flow (for certain problems) in a similar way to 2SAT, something clicked for me. While I knew how to solve a variety of flow problems along with forming certain flow networks ad hoc, there was this entire subset of flow problems which I previously did not understand how to do.
The blog post does a great job of explaining many of the concepts/applications, but there were certain things that got black-boxed. In order to understand things, my brain works such that truly understanding what is going on under the hood (often through proofs) helps me better intuit how to use and abuse a tool. Naturally, I tried to search the internet for more information on this topic. Unfortunately, I quickly came to realize that not much material as in-depth as the aforementioned blog post exists (or at least, I failed to find it).
In writing this blog post, I hope to help provide the very resource that I failed to find. I hope this makes the topic more accessible to others and serves as a resource for explaining both why and how.
Note that, while I give a brief introduction into what maximum flow is, this blog post is best ingested after having learned the general concepts of flow.
Introduction/Preliminary
What is Maximum Flow?
To start, let $$$G$$$ be a directed graph with edge weights called capacities. This is sometimes referred to as a network or flow network.
For the purposes of finding "flow," we have a start and end vertex where we want to find the maximum amount of "flow" from the start to the end. These are typically called the source and sink, respectively, and denoted $$$s$$$ and $$$t$$$, respectively.
The name maximum flow serves well as a pictorial descriptor of what it means: consider releasing some amount of units of liquid from the source. Our goal is to find out the maximum amount of units which can flow to (reach) the sink without surpassing the capacities of any edges in the flow network. For example, if we find a (simple) path from $$$s$$$ to $$$t$$$ all of which have positive capacities, then the maximum flow possible strictly along this path is the minimum capacity out of all edges along this path.
What Is a Cut?
Suppose we have a connected, undirected graph $$$G$$$ (there exists a path between every pair of vertices). We will denote the set of vertices as $$$V(G)$$$ and the set of edges and $$$E(G)$$$. Then a set of edges, $$$K\subseteq E(G)$$$, is called an edge cut if removing all edges in $$$K$$$ from $$$G$$$, denoted as $$$G\setminus K$$$, disconnects the graph.
% TODO: maybe use X,Y instead to avoid confusion for below
Suppose we have, $$$S,T \subseteq V(G)$$$. We denote the set of edges between them as $$$[S,T]$$$. If $$$T = \overline{S}$$$, then $$$[S,T]$$$ is called an edge cut. What this means is that removing these edges from $$$G$$$ ($$$G\setminus [S,T]$$$) disconnects all vertices in $$$S$$$ from all vertices in $$$T$$$, i.e. there does not exist $$$s \in S, t \in T$$$ such that there exists a path from $$$s$$$ to $$$t$$$ on $$$G$$$ (sometimes called an $$$s,t$$$-path for the individual case and an $$$S,T$$$-path overall).
Note that, up until now, we have been talking about undirected graphs. However, flow networks are directed. Let us slightly redefine things. Now, $$$G$$$ is a flow network. For $$$S,T \subseteq V(G)$$$, we define $$$[S,T]$$$ as the set of edges going from $$$S$$$ to $$$T$$$ (i.e. forming $$$S,T$$$-paths of length 1). We will denote the capacity of an edge, $$$e$$$, as $$$c(e)$$$ and the value of the edge cut (when $$$T = \overline{S}$$$) as $$$v([S,T]) = \displaystyle \sum_{e \in [S,T]} c(e)$$$, i.e. the sum of capacities.
From here on out, the "edge" portion of "*edge* cut" will be omitted. Do note that this is technically ambiguous (and can lead to confusion if context is omitted) as vertex cuts also exist. I am choosing to omit it regardless in favor of common practice/use of terminology in competitive programming for flow.
What Is a Minimum Cut?
When we ask for the minimum cut in the context of a flow network, what we really mean is the minimum value for an $$$s,t$$$-cut. That is, out of all $$$S,T \subseteq V(G)$$$ such that $$$T = \overline{S}, s \in S, t \in T$$$, we want the pair with the smallest value for $$$v([S,T])$$$.
Relationship Between Minimum Cut and Maximum Flow
There is a theorem which states how, in a flow network, the minimum $$$s,t$$$-cut equals the maximum $$$s,t$$$-flow. I will avoid proving this here as it would make a long blog post even longer and it's not the main focus.
Cost-Benefit Flow
What Does This Mean?
Suppose we have a boolean variable, $$$b$$$, with a cost of $$$\begin{cases}T & \phantom{\lnot}b\\ F & \lnot b\end{cases}$$$ and we want to choose the value of $$$b$$$ such that the cost is minimized. Our goal is to represent this using flow.
If we have a flow network consisting solely of an $$$s,t$$$-path, then the maximum flow must be the smallest capacity along this path (otherwise, we would surpass the smallest capacity which is not possible). In turn, the minimum cut is also the smallest capacity along this path. Accordingly, we can represent $$$b$$$ as the flow network $$$s \overset{F}{\longrightarrow} b \overset{T}{\longrightarrow} t$$$. Note that we could have equivalently represented this with the labels swapped. (It is fine to do so, but be sure you are consistent with this for when we build off this later). Also note that $$$\lnot b$$$ corresponds to the cut $$$[\{s\}, \{b,t\}]$$$ and $$$b$$$ corresponds to the cut $$$[\{s,b\}, \{t\}]$$$. (This is because having the left/false edge in the cut means $$$b$$$ is to the right of the cut and vice versa.)
We now have the foundation for cost, but what about benefit?
Suppose we have a boolean variable, $$$b'$$$, with a benefit of $$$\begin{cases}T' & \phantom{\lnot}b'\\ F' & \lnot b\end{cases}$$$ and we want to choose the value of $$$b$$$ such that the benefit is maximized. Once again, we want to represent this with flow.
But hang on, we are now maximizing, so how can use the minimum cut to get our answer?
We can do this by making the benefit values negative (since $$$x \leq y \iff -x \geq -y$$$, i.e. we can convert a maximization problem to a minimization problem). So now we can make a similar graph of the form $$$s \overset{-F'}{\longrightarrow} b \overset{-T'}{\longrightarrow} t$$$, right?
Well, not quite. As you might have noticed, we now have negative edge capacities. This is not a valid flow network (at least, not without redefining aspects of a flow network). How do we fix this?
To figure this out, let us take a step back. Suppose we have the flow network $$$s \overset{F}{\longrightarrow} b \overset{T}{\longrightarrow} t$$$ from earlier again. I claim that we can add $$$k$$$ to both edge capacities so long as we subtract it from the resultant minimum cut found. That is, we now have the flow network $$$s \overset{F+k}{\longrightarrow} b \overset{T+k}{\longrightarrow} t$$$. Note that there are only two possible cuts since $$$b$$$ can either be together with $$$s$$$ or together with $$$t$$$. The cut $$$[\{s,b\}, \{t\}]$$$ has the value $$$T+k$$$ while the cut $$$[\{s\}, \{b,t\}]$$$ has the value $$$F+k$$$. In both cases, the cut's value has increased by $$$k$$$. Since the minimum cut must be one of these (in particular, the smallest one), $$$k$$$ always gets added to the minimum cut. Accordingly, adding $$$k$$$ to both edge capacities is equivalent to adding $$$k$$$ to the minimum cut, so we can also undo it by subtracting $$$k$$$ at the end.
This gets us most of the way to fixing our negative capacities issue; however, what was outlined above assumes we started with a valid flow network in the first place (before adding $$$k$$$). Do we actually need this assumption? No! Suppose we simply work backwards; that is, we start out with a flow network with a sufficiently large $$$k$$$ baked into the capacities. If we run flow on this, we then get the minimum cut, $$$c$$$, which has $$$k$$$ baked into it. We can then subtract $$$k$$$ from $$$c$$$ which will in turn subtract $$$k$$$ from each capacity giving us our desired ("original") capacities. (I will concede that this isn't the most rigorous of proofs, but it ultimately boils down to separating the result from the behavior. We could arguably construct an alternative proof by observing the linearity between the final answer and the supposed capacities even when they don't form a valid flow network.)
Now, we can finally fix the negative flow. Let $$$m = \min(-F', -T')$$$. We now have the flow network of $$$s \overset{-F'-m}{\longrightarrow} b \overset{-T'-m}{\longrightarrow} t$$$ where we add $$$m$$$ to the resultant minimum cut (note that since $$$m \leq 0$$$, $$$-m \geq 0$$$, so we have the same format as $$$+k$$$ above). Note that $$$-F',-T' \geq m$$$, so $$$-F'-m, -T'-m \geq 0$$$.
If you look closely at what we just did, you might notice something really interesting. Since we are subtracting $$$m$$$ from both capacities and $$$m$$$ was equal to one of them, we always make one edge have capacity exactly 0 and the other have the absolute difference. But wait, we can do the same thing back when we were working with cost! Why can we do this? Well, what we found for adding $$$k$$$ also occurs for subtracting $$$k$$$ by using the exact same logic of it impacting all cuts equally. So we can also make cost have one side be 0 and the other side be the absolute difference (of course, which side it is on matters if we want to be consistent; as mentioned prior, sit tight for just a bit longer as to why we want to be consistent).
In other words, after offsetting the final answer, we only need to worry about a cost (equivalently, benefit) on one side. So how does this tie into combining both cost and benefit?
Well, first let us consider $$$b$$$ having both a cost, $$$x$$$, and a benefit, $$$y$$$, for the same state. Note that cost and benefit are opposites. In other words, the cost of this state is $$$x-y$$$ or cost $$$-$$$ benefit... wait, that's part of the title of this blog post! But the connections don't end here (after all, this is an observation which could have been made from the get-go). Notice how, earlier, we converted benefit on its own into a cost by... making it negative! Okay, I know, this is still not that special.
What actually ties this all together is when one state has a cost and the other has a benefit. For example, suppose we have $$$b$$$ with $$$\begin{cases}\text{cost}\phantom{\text{fitt }} T & \phantom{\lnot}b\\ \text{benefit } F & \lnot b\end{cases}$$$. Then we have the flow network of $$$s \overset{-F}{\longrightarrow} b \overset{T}{\longrightarrow} t$$$. Since $$$-F \leq T$$$, we can let $$$m = -F$$$, at which point we get $$$s \overset{-F-m}{\longrightarrow} b \overset{T-m}{\longrightarrow} t$$$ with $$$m$$$ added back to the minimum cut at the end, or equivalently $$$s \overset{0}{\longrightarrow} b \overset{T+F}{\longrightarrow} t$$$ with $$$-F$$$ added back to the minimum cut at the end... wait, $$$T+F$$$ is not cost $$$-$$$ benefit! Okay, I must confess, the title of this blog post is a happy accident in that it describes how cost and benefit are opposites relative to the same state. The real reason I named it cost-benefit is because of the relationship between cost and benefit, nothing more nothing less. However, I wasn't kidding when I said that having cost and benefit on opposing states ties it all together.
Notice how we ended up with $$$0$$$ and $$$T+F$$$. What we did was reframe the entire problem such that it is relative to one state; the true state is $$$T+F$$$ in distance away from the false state, as is the case in the other direction. For example, if both sides are cost (equivalently, benefit) with false being $$$x$$$ cost and true being $$$y$$$ cost, then they are each distance $$$|y-x|$$$ away from each other. We are able to pin one state and say that the other is exactly $$$z$$$ better/worse. The reason why I spent so much time on this is because of how instrumental the ability to shift around values in this manner is in fleshing out other aspects of what I call "cost-benefit flow."
Adding More Variables
So far, all we have is a single boolean variable where we run flow to determine which state has a smaller value. If we stopped here, this would be comically useless.
Suppose we have $$$n$$$ boolean variables, $$$b_1, b_2, \dots, b_n$$$, each with their own cost/benefits for each state. How can we represent all of these in one flow network? Let us start off simple by creating the path flow network which we have been working with individually for each one. Each one has its own copy of $$$s$$$ and $$$t$$$. If we merge all $$$s$$$ nodes into a single $$$s$$$ node and all $$$t$$$ nodes into a single $$$t$$$ node, then notice how each one can behave completely independently. After all, the only paths in this flow network are exactly the paths we started with.
One key thing we must consider is the behavior of fixing negative values within the flow network/making one state relative to the other. Since we haven't formed any new paths, for all other variable states fixed except $$$b_i$$$, there are two possible cuts, both of which would have the aforementioned $$$m_i$$$ subtracted from them. Just like before, this would require us to add $$$m_i$$$ back to the final minimum cut found. Accordingly, we can update all the paths within the flow network to have the form $$$s \overset{F'_i-m_i}{\longrightarrow} b_i \overset{T'_i-m_i}{\longrightarrow} t$$$ (where $$$F'_i, T'_i$$$ have a negative value if $$$F_i,T_i$$$ are benefit, respectively) and add $$$\displaystyle \sum_{i=1}^n m_i$$$ back to the final minimum cut found.




