nor's blog

By nor, 4 months ago, In English

Motivation

On a computer science discord server, someone recently asked the following question:

Is the master theorem applicable for the following recurrence? $$$T(n) = 7T(\lfloor n / 20 \rfloor) + 2T(\lfloor n / 8 \rfloor) + n$$$

There was some discussion related to how $$$T$$$ is monotonically increasing (which is hard to prove), and then someone claimed that there is a solution using induction for a better bound. However, these ad hoc solutions often require some guessing and the Master theorem is not directly applicable.

Similarly, in the linear time median finding algorithm (or the linear time order statistics algorithm in general), the recurrence is $$$T(n) = T(\lfloor 7n/10 \rfloor + \varepsilon) + T(\lfloor n/5 \rfloor) + n$$$ where $$$\varepsilon \in \Theta(1)$$$ Also, in merge sort and creating a segment tree recursively, the time complexities are $$$T(n) = T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + f(n)$$$ where $$$f(n)$$$ is $$$\Theta(n)$$$ for merge sort and $$$\Theta(1)$$$ for segment tree construction.

Even the general form of the recurrence that the master theorem handles is just $$$T(n) = aT(n/b) + f(n)$$$, and it doesn't handle cases of the form $$$T(n) = aT(\lfloor n/b \rfloor) + f(n)$$$ and $$$T(n) = aT(\lceil n/b \rceil) + f(n)$$$, which is what we find in a lot of algorithms.

To cleanly handle all these cases at once, there is a generalization of the Master theorem for recurrences, called the Akra-Bazzi theorem. It is not really well-known, and since it is not trivial to prove (the proof is pretty nice, I recommend looking at it), it is often skipped over in CS courses on algorithms (sometimes even the instructors are not aware of this theorem). This is why I wanted to write this blog — to provide a better way of reasoning about your divide and conquer algorithms.


Statement

Theorem: Consider $$$T(x)$$$ given by a sufficient number of base cases (for instance, bounded for $$$x < x_0$$$ for $$$x_0$$$ defined later), and defined recursively by $$$T(x) = g(x) + \sum_{i=1}^k a_i T(b_i x + h_i(x))$$$ for large enough $$$x$$$ (i.e., for all $$$x \ge x_0$$$ for some $$$x_0$$$).

Then, subject to the following conditions:

  • $$$a_i > 0$$$ and $$$0 < b_i < 1$$$ are constants for each $$$i$$$
  • $$$g$$$ is differentiable and $$$|g'(x)|$$$ has at most polynomial growth (i.e., $$$|g'(x)| \in O(x^c)$$$ for some constant $$$c$$$)
  • $$$|h_i(x)| \in O(x/(\log x)^2)$$$ for all $$$i$$$

We have, if $$$p$$$ is the unique solution to $$$\sum_{i=1}^k a_i b_i^p = 1$$$, then

$$$

T(x) \in \Theta \left(x^p \left(1 + \int_{1}^{x} \frac{g(u)}{u^{p+1}} \mathrm{d}u \right)\right)

$$$

There are some more pedantic conditions on $$$x_0$$$ and $$$g$$$ for this to hold, but for all practical purposes, those conditions (like some definite integrals with finite upper and lower limits are finite) are true. For more, you can refer to this paper.


Advantages and applications

This has the following advantages:

  • Being much more general than the master theorem, and in different cases for $$$p$$$, reduces to the master theorem readily.
  • Being applicable to more back-of-the-envelope calculations and more CS-like recurrences, where you don't want to think about floors/ceils/constant offsets — the "error term" $$$h_i$$$ takes care of that for you.
  • Reduces your dependence on ad hoc methods like guess-and-induct.

Let's apply this theorem to the few examples mentioned above (although there won't be any non-trivial results among the above examples, an important special case will be illustrated in the following examples).

  • The original recurrence: $$$T(n) = 7T(\lfloor n / 20 \rfloor) + 2T(\lfloor n / 8 \rfloor) + n$$$ — note that in this case, we get $$$0 < p < 1$$$. The theorem claims that $$$T(n) \in \Theta(x^p (1 + \frac{x^{1-p} - 1}{1-p})) = \Theta(x)$$$.
  • The same analysis as above holds for the linear-time median finding algorithm.
  • For the merge sort recurrence and the segment tree building recurrence, we have $$$p = 1$$$. The theorem allows us to use the Master theorem directly without caring about the floors.
  • Vote: I like it
  • +67
  • Vote: I do not like it

»
4 months ago, # |
Rev. 2   Vote: I like it +104 Vote: I do not like it

Learned about Akra-Bazzi theorem a year ago. Was fascinated by how useless it is.

  • »
    »
    4 months ago, # ^ |
      Vote: I like it +28 Vote: I do not like it

    Haha, at least it makes derivations "using" the master theorem more rigorous (i.e., ignoring floors/ceils). When I learnt about it, on the other hand, I was searching for a master-theorem-like result that handles multiple recursive terms, so I did find it quite useful.

    • »
      »
      »
      4 months ago, # ^ |
        Vote: I like it +24 Vote: I do not like it

      I almost never saw any direct application of Akra-Bazzi outside of artificial academic exercises actually.

      The only "real-life" example I know of a similar recurrence that actually exists is in the median of medians algorithm for finding $$$k$$$-th order statistics in deterministic $$$\Theta(n)$$$, where the recurrence is

      $$$ T(n) = T(\frac{n}{5}) + T(\frac{7n}{10})+\Theta(n). $$$

      But in this case, it's much easier to prove the result by induction then by invoking the theorem. I'm yet to see any "real-life" example where the resulting complexity is actually $$$O(x^p)$$$ with a non-trivial $$$p$$$.

      • »
        »
        »
        »
        4 months ago, # ^ |
          Vote: I like it +3 Vote: I do not like it

        For me, the main appeal is the presence of error terms. Master theorem as it is doesn't apply to pretty much any recurrence in CS. To get rid of the floors and ceils, you need a stronger version of the master theorem, which is a subcase of the Akra-Bazzi theorem.

        About real life applications: I remember writing an algorithm for some CS problem in a university course which required a non-trivial application of the Akra-Bazzi theorem (induction still worked but it was hard to guess). I unfortunately don't exactly remember the problem, but I might search for it eventually.

        • »
          »
          »
          »
          »
          4 months ago, # ^ |
          Rev. 3   Vote: I like it 0 Vote: I do not like it

          Idk, I feel like "don't be pedantic and ignore small perturbations" approach is more sensible, as it always works on practice...

          I see the appeal for rigorous academic research maybe, but practically I don't think you will ever need Akra-Bazzi, unless you have several distinct $$$b_i$$$, or $$$g(x)$$$ is $$$x^p$$$ multiplied by some unorthodox function, such as $$$\frac{1}{\log x}$$$...

          • »
            »
            »
            »
            »
            »
            4 months ago, # ^ |
            Rev. 2   Vote: I like it 0 Vote: I do not like it

            Ah, about the unorthodox $$$g$$$ comment, you can find many divide and conquer problems where we often have $$$g(n) = n^a \log^b n$$$ for some choice of $$$a, b$$$ due to sqrt decomposition or FFT or some other non-trivial algorithm in the combine step. $$$b$$$ can be negative in cases where you can skip over chunks while doing dp or something.

            The example I was talking about had multiple distinct $$$b_i$$$-s, I'll try to dig that up.

»
4 months ago, # |
Rev. 2   Vote: I like it +11 Vote: I do not like it

Some heuristic explanation for the theorem. Assume that $$$|h_i(x)| \ll x$$$ and $$$T(x) \in O(x^p)$$$ for some $$$p$$$. Then,

$$$ g(x) + \sum\limits_{i=1}^k a_i b_i^p x^p \in O(x^p), $$$

which rewrites as

$$$ \frac{g(x)}{x^p} + \sum\limits_{i=1}^k a_i b_i^p \in O(1). $$$

Now, let $$$p_1$$$ be the solution to $$$1 = \sum\limits_{i=1}^k a_i b_i^p$$$, and $$$p_2$$$ be the infimum $$$p$$$ such that $$$g(x) \in O(x^{p})$$$, and let $$$p = \max(p_1, p_2)$$$.

Unless $$$p_1 = p_2$$$, the larger values will dominate the smaller one, hence $$$T(x) \in O(\max(x^{p}, g(x)))$$$.

In particular, in a similar fashion to the master theorem, it should mean that

  • if $$$g(x) \in \Omega(x^c)$$$ and $$$\sum\limits_{i=1}^k a_i b_i^c < 1$$$, then $$$T(x) \in \Theta(g(x))$$$.
  • if $$$g(x) \in O(x^c)$$$ and $$$\sum\limits_{i=1}^k a_i b_i^c > 1$$$, then $$$T(x) \in \Theta(x^{p})$$$.

I suppose it also means that only when $$$p_1 = p_2$$$, the solution is given by

$$$ \frac{T(x)}{x^p} \in \Theta\left(\int\limits_0^x \frac{g(u)}{u^{p+1}} du\right), $$$

but I fail to provide a confident intuitive explanation for this case. It probably can be done by considering the finite difference $$$G(x) - G(x-1)$$$, where $$$G(x) = \frac{T(x)}{x^p}$$$, and finding out that it's asymptotically equivalent to $$$\frac{g(x)}{x^{p+1}}$$$.