By SecondThread, history, 3 years ago,

I started thinking about this during my livestream of the solution to round 731, and discussed this in a comment a bit earlier, but it seems novel enough and might require enough debate that I thought it makes sense to be it's own blog post:

Suppose I have an array $a$ of $n$ integers, each up to a million. What is the worst-case runtime of the following code?

int gcd(int a, int b) {
return b==0?a:gcd(b, a%b);
}

//in main
int gcd=a[0];
for (int i:a) gcd=gcd(i, gcd);


Conventionally, I have always thought it was $n*log(MAXVAL)$. However, each time the gcd call runs, the answer decreases by an expected factor of about 1.6 (the golden ratio). But really, we only do all $log(n)$ calls if the gcd gets quite small. In that case, future gcd calls will be $O(1)$, not $O(log)$. So do we have a clean potential function like: the number of prime factors in gcd, which can bound the total GCD recursion? I know that this is some very hand-wavey logic, but it seems almost legit enough for me to believe it. Or is this totally not true? I'm secretly wishing it's true because it would be cool, but also I feel like I would have seen this before if it was both commonly known and true.

TLDR: is the above code really $O(n + log(MAXVAL))$, instead of $O(n * log(MAXVAL))$?

• +106

 » 3 years ago, # | ← Rev. 2 →   +83 I think it is $\mathcal O(n + \log(MAXVAL))$. If the $\gcd$ doesn't change, calling $\gcd$ takes $\mathcal O(1)$ time. Otherwise, say on the $k$ times it changes, let's say we recurse $r_i$ times into $\gcd$. Then $\mathcal O(n + \sum r_i)$ is the number of times we recurse into $\gcd$, which also should be the time complexity. Let $g_0 = a[0]$, and let $g_i$ be the $\gcd$ after the ith time we change the $\gcd$. So note that $\frac{g_i}{g_{i-1}} \geq 2^{r_i}$, because each call into $\gcd$ reduces the maximum of $a, b$ by a factor of at least $2$. Then taking the product of this all for all $i$, we get that $\frac{g_0}{g_k} \geq 2^{\sum r_i}$, and taking the $\log$ of this gives our desired bound.
 » 3 years ago, # |   +14 For me, this is both commonly known and true. I think I first saw this for segtree of gcd.
 » 3 years ago, # |   +18 In short, yes. The runtime of a single call $gcd(a, b) = x$ can be bounded above by a constant times $1+\log{\frac{\min \{a, b\}}{x}}$ arithmetic operations by (for example) noticing that the size of smaller argument must be reduced by at least half every two iterations. From there, it's easy to bound the total runtime by a telescoping series to get $O(n + log(MAXVAL))$ runtime for the loop.
 » 3 years ago, # | ← Rev. 2 →   -13 Deleted
•  » » 3 years ago, # ^ |   +3 Yes, but this is only for one gcd call. Since the gcd of two consecutive Fibonacci numbers is 1, all following calls will run in constant time.
 » 3 years ago, # |   0 It is $O(n + log(maxval))$. I remember it being discussed in cf blogs a few times. If you build gcd-segment tree, the query complexity is also $O(log(n) + log(maxval))$ for the same reason.
 » 3 years ago, # | ← Rev. 3 →   0 Is this theory of gcd applicable to build for something like sparse tables / Segtrees too ? If not, what is the actual complexity on these structures ?
•  » » 3 years ago, # ^ |   +1 Yet I think sparse table is $O(n \times (\log n + \log a))$ whereas $n$ is the size and $a$ is maximum valueSince the gcd will be calculated independently to the size of $n$. And there are $n$ such value. Then the gcd part should not be calculated more then $O(n \times \log a)$ as there are not thing more to reduce the gcdAnd since the sparse table itself having $O(n \times \log n)$ complexity. This give us the total of $O(n \times (\log n + \log a))$ but I think it can be some what smaller as for random cases