Runtime of finding the GCD of an array?

#	User	Rating
1	tourist	3985
2	jiangly	3814
3	jqdai0815	3682
4	Benq	3529
5	orzdevinwang	3526
6	ksun48	3517
7	Radewoosh	3410
8	hos.lyric	3399
9	ecnerwala	3392
9	Um_nik	3392

#	User	Contrib.
1	cry	169
2	maomao90	162
2	Um_nik	162
4	atcoder_official	161
5	djm03178	158
6	-is-this-fft-	157
7	adamant	155
8	awoo	154
8	Dominater069	154
10	luogu_official	150

I started thinking about this during my livestream of the solution to round 731, and discussed this in a comment a bit earlier, but it seems novel enough and might require enough debate that I thought it makes sense to be it's own blog post:

Suppose I have an array $$$a$$$ of $$$n$$$ integers, each up to a million. What is the worst-case runtime of the following code?

int gcd(int a, int b) {
   return b==0?a:gcd(b, a%b);
}

//in main
int gcd=a[0];
for (int i:a) gcd=gcd(i, gcd);

Conventionally, I have always thought it was $$$n*log(MAXVAL)$$$. However, each time the gcd call runs, the answer decreases by an expected factor of about 1.6 (the golden ratio). But really, we only do all $$$log(n)$$$ calls if the gcd gets quite small. In that case, future gcd calls will be $$$O(1)$$$, not $$$O(log)$$$. So do we have a clean potential function like: the number of prime factors in gcd, which can bound the total GCD recursion? I know that this is some very hand-wavey logic, but it seems almost legit enough for me to believe it. Or is this totally not true? I'm secretly wishing it's true because it would be cool, but also I feel like I would have seen this before if it was both commonly known and true.

TLDR: is the above code really $$$O(n + log(MAXVAL))$$$, instead of $$$O(n * log(MAXVAL))$$$?

Comments (8)

Write comment?

12tqian

3 years ago, # |

← Rev. 2 →

+83

I think it is $$$\mathcal O(n + \log(MAXVAL))$$$. If the $$$\gcd$$$ doesn't change, calling $$$\gcd$$$ takes $$$\mathcal O(1)$$$ time. Otherwise, say on the $$$k$$$ times it changes, let's say we recurse $$$r_i$$$ times into $$$\gcd$$$. Then $$$\mathcal O(n + \sum r_i)$$$ is the number of times we recurse into $$$\gcd$$$, which also should be the time complexity. Let $$$g_0 = a[0]$$$, and let $$$g_i$$$ be the $$$\gcd$$$ after the ith time we change the $$$\gcd$$$. So note that $$$\frac{g_i}{g_{i-1}} \geq 2^{r_i}$$$, because each call into $$$\gcd$$$ reduces the maximum of $$$a, b$$$ by a factor of at least $$$2$$$. Then taking the product of this all for all $$$i$$$, we get that $$$\frac{g_0}{g_k} \geq 2^{\sum r_i}$$$, and taking the $$$\log$$$ of this gives our desired bound.

→ Reply

Nson

+14

For me, this is both commonly known and true. I think I first saw this for segtree of gcd.

clyring

+18

In short, yes. The runtime of a single call $$$gcd(a, b) = x$$$ can be bounded above by a constant times $$$1+\log{\frac{\min \{a, b\}}{x}}$$$ arithmetic operations by (for example) noticing that the size of smaller argument must be reduced by at least half every two iterations. From there, it's easy to bound the total runtime by a telescoping series to get $$$O(n + log(MAXVAL))$$$ runtime for the loop.

tch1cherin

-13

Deleted

SecondThread

3 years ago, # ^ |

Yes, but this is only for one gcd call. Since the gcd of two consecutive Fibonacci numbers is 1, all following calls will run in constant time.

Savior-of-Cross

It is $$$O(n + log(maxval))$$$. I remember it being discussed in cf blogs a few times. If you build gcd-segment tree, the query complexity is also $$$O(log(n) + log(maxval))$$$ for the same reason.

sparkles

← Rev. 3 →

Is this theory of gcd applicable to build for something like sparse tables / Segtrees too ? If not, what is the actual complexity on these structures ?

SPyofcode

Yet I think sparse table is $$$O(n \times (\log n + \log a))$$$ whereas $$$n$$$ is the size and $$$a$$$ is maximum value

Since the gcd will be calculated independently to the size of $$$n$$$. And there are $$$n$$$ such value. Then the gcd part should not be calculated more then $$$O(n \times \log a)$$$ as there are not thing more to reduce the gcd

And since the sparse table itself having $$$O(n \times \log n)$$$ complexity. This give us the total of $$$O(n \times (\log n + \log a))$$$ but I think it can be some what smaller as for random cases

SecondThread's blog