Most of you have probably heard of $$$\omega$$$, the matrix multiplication exponent. Are you sure you fully understand its definition, though? I thought so too, until I started asking myself some uncomfortable questions and realized that I had some subtle misconceptions. To answer these questions, I read a bunch of papers and book chapters, and now I have some peace of mind. Let me try to scare you a bit first and then hopefully rebuild your understanding of fast matrix multiplication.
Let's start with four questions. Try to answer them before reading further.
Q1: Does there exist an $$$O(n^{\omega})$$$-time matrix multiplication algorithm?
Q2: Does there exist an $$$o(n^{\omega})$$$-time matrix multiplication algorithm?
Q3: Does there exist an $$$O(n^{\omega + 0.1})$$$-time algorithm for integer matrix multiplication?
Q4: Does there exist an $$$O(n^{\omega - 0.1})$$$-time algorithm for integer matrix multiplication?
Spoiler: as far as we know today, the answers are "we don't know" to Q1, "we don't know" to Q2, "yes" to Q3, and "we don't know" to Q4. If you had asked me a month ago, I would have been quite surprised by these answers. Now let's take a closer look at the matrix multiplication exponent and make sense of them.
But first: what is $$$\omega$$$? Intuitively, it should be something along the lines of "the smallest value $$$t$$$ such that $$$n \times n$$$ matrices can be multiplied in time $$$O(n^t)$$$". From this definition, the answer to the first question should clearly be "yes". However, the definition is not quite right because the relevant set of exponents may not have a minimum. The correct definition takes the infimum over all valid exponents.
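Written out, the standard formulation is

$$$\omega = \inf \{ t : n \times n \text{ matrices can be multiplied in time } O(n^t) \}.$$$

Keep in mind that "time" here is itself a subtle word; we will return to what exactly is being counted below.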
There may be no single "best" matrix multiplication algorithm, only a sequence of better and better ones. As a result, the definition only guarantees algorithms running in time $$$O(n^{\omega + \varepsilon})$$$ for every fixed $$$\varepsilon \gt 0$$$; it does not tell us whether the infimum is actually attained. So the answer to Q1 is not "yes" by definition. It might be "yes", but we simply don't know.
This is why statements like "using fast matrix multiplication we solve the problem in $$$O(n^{\omega})$$$ time" are not very precise. Even though before this journey I knew that $$$\omega$$$ is an infimum and not a minimum, I never stopped to think through what that really implies. If you want a notation for an actual achievable running time, writing something like $$$\mathrm{MM}(n, n, n)$$$ is cleaner: it stands for the time complexity of whatever specific matrix multiplication algorithm you choose to plug in.
Time complexities of the form $$$O(n^{\omega} \cdot \log n)$$$ make even less sense. What people usually mean is that the algorithm performs $$$O(\log n)$$$ matrix multiplications, plus some lower-order work. But we are measuring small logarithmic factors while being imprecise about a much larger polynomial factor. Once you write the bound as $$$O(n^{\omega + \varepsilon} \log n)$$$, the extra $$$\log n$$$ can be absorbed into a slightly larger exponent, so this is still just $$$O(n^{\omega + \varepsilon'})$$$ for every $$$\varepsilon' \gt 0$$$.
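To make the "$$$O(\log n)$$$ matrix multiplications" pattern concrete, here is a minimal sketch of the classic example: computing $$$A^m$$$ by binary exponentiation. The function names are mine, and `mat_mult` is a naive stand-in for whichever fast multiplication routine you actually use:

```python
def mat_mult(A, B):
    # Stand-in for a fast multiplication routine; here, the naive
    # O(n^3) algorithm for concreteness.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, m):
    # Computes A^m using O(log m) matrix multiplications (binary
    # exponentiation) plus lower-order bookkeeping; this is the
    # typical source of bounds quoted as "O(n^omega log n)".
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    while m > 0:
        if m & 1:
            result = mat_mult(result, A)
        A = mat_mult(A, A)
        m >>= 1
    return result
```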
This brings us to the second question. Even if there is a single "best" algorithm, its running time need not be exactly $$$\Theta(n^\omega)$$$. It could be $$$O(n^\omega \log n)$$$, or even $$$O(n^\omega / \log n)$$$, which is $$$o(n^\omega)$$$. Of course, the second possibility can only happen if $$$\omega \gt 2$$$, since reading the input already takes $$$\Theta(n^2)$$$ time. But until we prove that $$$\omega = 2$$$, an $$$o(n^\omega)$$$-time algorithm cannot be ruled out.
Surprisingly, we are not done yet. Given our current definition of $$$\omega$$$, the third question sounds trivial: if there is an algorithm working in time $$$O(n^{\omega + \varepsilon})$$$ for every $$$\varepsilon \gt 0$$$, we can in particular take $$$\varepsilon = 0.1$$$, right? In reality, our definition of $$$\omega$$$ is still not precise. What exact problem are we solving? What does "matrix multiplication" mean? Is it boolean matrix multiplication? Integer? Real? In what model of computation are we counting the time complexity? It turns out that $$$\omega$$$ only counts the length of the shortest sequence of arithmetic operations computing the product of two $$$n \times n$$$ matrices. Note that the length of this sequence may depend on the set of values from which the matrix entries are taken; in other words, we need to fix the underlying field. It could be $$$\mathbb{R}$$$, $$$\mathbb{Q}$$$, $$$\mathbb{F}_p$$$, or any other field. Each specific field gives its own matrix multiplication exponent: $$$\omega(\mathbb{R})$$$, $$$\omega(\mathbb{Q})$$$, $$$\omega(\mathbb{F}_p)$$$. And when people just say $$$\omega$$$, they (most likely) mean the case we normally care about the most: $$$\omega(\mathbb{R})$$$.
This poses two problems for us. First, there may exist a sequence of $$$O(n^{\omega + \varepsilon})$$$ arithmetic operations that computes the matrix product, but is there an actual algorithm that knows what operations to do in what order? Note that we need to know this sequence for each of the infinitely many values of $$$n$$$. Second, even if we know the list of operations, it may very well include arbitrary real numbers like $$$\sqrt{2}$$$ or $$$\pi$$$. It could be problematic for us to perform these computations exactly in our standard computational model (for example, word-RAM). Even if we use floating point numbers, we may have precision issues due to subtractions and divisions.
Let us now resolve these two issues. The first one is the easier one. Suppose we know how to multiply $$$k \times k$$$ matrices optimally for some constant $$$k$$$. Then we can use the idea behind Strassen's algorithm (the first and simplest algorithm that solves matrix multiplication in time $$$O(n^{3 - \varepsilon})$$$) to get a relatively good algorithm for arbitrarily large matrices. We split an $$$n \times n$$$ matrix into a $$$k \times k$$$ grid of blocks of size roughly $$$(n / k) \times (n / k)$$$ each and apply the scheme recursively on the blocks. It is not hard to show that by picking $$$k$$$ large enough, we can achieve time complexity $$$O(n^{\omega + \varepsilon})$$$ for any fixed $$$\varepsilon \gt 0$$$. This is pretty neat: we have no idea what the value of $$$\omega$$$ is, but we know that Strassen-like algorithms achieve almost-optimal time complexity regardless.
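For concreteness, here is a sketch of the $$$k = 2$$$ case, i.e., Strassen's original scheme, which uses 7 block multiplications instead of 8 and therefore runs in $$$O(n^{\log_2 7}) \approx O(n^{2.81})$$$ time. This is a toy version: it assumes $$$n$$$ is a power of two, and the cutoff constant is arbitrary.

```python
import numpy as np

def strassen(A, B):
    # Strassen's scheme: 7 recursive multiplications of (n/2) x (n/2)
    # blocks instead of 8. Assumes square numpy arrays with n a power
    # of two; real implementations pad and tune the cutoff.
    n = A.shape[0]
    if n <= 64:          # below the cutoff, naive multiplication wins
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7   # C11
    C[:h, h:] = M3 + M5             # C12
    C[h:, :h] = M2 + M4             # C21
    C[h:, h:] = M1 - M2 + M3 + M6   # C22
    return C
```

Replacing the $$$2 \times 2$$$ base identities with a better scheme for some larger $$$k$$$ lowers the exponent; the recursive structure stays exactly the same.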
The harder issue is getting rid of arbitrary real coefficients. This takes a lot of work that goes beyond the scope of this blog post, but the gist is as follows. The matrix multiplication exponent is not too sensitive to the specific field at hand, and in particular $$$\omega(\mathbb{R}) = \omega(\mathbb{Q})$$$. (If you want a precise statement, one can prove that $$$\omega$$$ only depends on the characteristic of the field.) Therefore, in the algorithm described above, it is sufficient to consider only sequences of arithmetic operations for $$$k \times k$$$ matrices using rational coefficients. That's a big relief! Once the coefficients are rational, we can reduce the whole computation modulo some suitable prime $$$p$$$: one that does not divide any coefficient denominator and is large enough that the exact integer answer can be recovered from the result modulo $$$p$$$. (Technically, this is still not enough if divisions are allowed as arithmetic operations, but once again, one can prove that additions, subtractions, and multiplications suffice.) This algorithm can now be easily implemented in $$$O(n^{\omega + \varepsilon})$$$ time on a word-RAM.
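As a toy illustration of the modular step (the function names are mine, and naive multiplication again stands in for a fast routine): if all input entries have absolute value at most $$$M$$$, every entry of the product has absolute value at most $$$n M^2$$$, so any prime $$$p \gt 2 n M^2$$$ lets us recover the exact integer product from the result modulo $$$p$$$ via symmetric residues:

```python
def mat_mult_mod(A, B, p):
    # Multiplication modulo p; all intermediate values stay bounded,
    # so each arithmetic operation is cheap on a word-RAM when p fits
    # in a constant number of machine words.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p
             for j in range(n)] for i in range(n)]

def exact_int_product(A, B, p):
    # Assumes p > 2 * n * M^2, where M bounds the absolute values of
    # the entries of A and B. Every true entry of A*B then lies in
    # (-p/2, p/2), so it equals its symmetric residue modulo p.
    Ap = [[x % p for x in row] for row in A]
    Bp = [[x % p for x in row] for row in B]
    C = mat_mult_mod(Ap, Bp, p)
    return [[c - p if c > p // 2 else c for c in row] for row in C]
```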
So the answer to the third question is indeed "yes". But not for a trivial definitional reason; it relies on several nontrivial facts connecting algebraic complexity over fields to actual algorithms over integers.
Finally, we move to the fourth question. The definition of $$$\omega$$$ lives in the algebraic world: it is about arithmetic complexity over $$$\mathbb{R}$$$. There is no obvious reason why the optimal word-RAM complexity of integer matrix multiplication has to match that exponent. In principle, it may be easier. So we cannot rule out an $$$O(n^{\omega - 0.1})$$$-time algorithm for integer matrix multiplication. (Of course, assuming that $$$\omega \ge 2.1$$$.)
If this blog post challenged your mental model of $$$\omega$$$, I am happy. Hopefully, you now have a more accurate one. As a final exercise, try to prove that there exists an $$$O(n^{\omega + o(1)})$$$-time algorithm for matrix multiplication. This is stronger than having an $$$O(n^{\omega + \varepsilon})$$$-time algorithm for every fixed $$$\varepsilon \gt 0$$$.