### Fly_37's blog

By Fly_37, history, 2 years ago,

Hello!

Some time ago I created a problem for local programming competition. Unfortunately it turned out that I had incomplete proof of one lemma, that I can not show even to this day.

Lemma: Given an increasing array of $N$ arbitrary large numbers we define its cost as sum of lengths of all non-trivial, maximal arithmetic progressions starting at the first element. The cost of any array is $\mathcal{O}(N\log{N})$.

For example for array $[0, 2, 3, 4, 6, 8, 9]$ — the total cost is $|[0, 2, 4, 6, 8]| + |[0, 3, 6, 9]| + |[0, 4, 8]| + |[0, 6]| + |[0, 8]| + |[0, 9]| = 5 + 4 + 3 + 2 + 2 + 2= 18$.

It is easy to see, that if we simply take $N$ consecutive natural numbers we get $\mathcal{O}(N\log{N})$ cost, but I was not able to prove that this is the worst case scenario.

Best complexity I can show is $\mathcal{o}(N^2)$, but still far from the goal...

Can anyone show if the lemma is true or false?

• +258

 » 2 years ago, # |   +40 Shouldn't the answer for the example be $5 + 4 + 3 + 2 + 2 + 2 = 18$?
•  » » 2 years ago, # ^ |   +8 Sure, fixed. Thanks!
•  » » » 2 years ago, # ^ |   +47 The second part is false then.$F([0, 1, 2, 3, 4, 5]) = 15$ as$|[0, 1, 2, 3, 4, 5]| + |[0, 2, 4]| + |[0, 3]| + |[0, 4]| + |[0, 5]| = 6 + 3 + 2 + 2 + 2 = 15$.On the other hand, $F([0, 1, 2, 3, 4, 6]) = 16$ as$|[0, 1, 2, 3, 4]| + |[0, 2, 4, 6]| + |[0, 3, 6]| + |[0, 4]| + |[0, 6]| = 5 + 4 + 3 + 2 + 2 = 16$.
•  » » » » 2 years ago, # ^ |   +18 That is a fair point :o
 » 2 years ago, # | ← Rev. 2 →   0 Is the array strictly increasing, or just non-decreasing?Edit: Actually, I guess we may assume that the array is just strictly increasing (having duplicate values cannot increase the number of arithmetic progressions).
•  » » 2 years ago, # ^ |   0 Array is strictly increasing
 » 2 years ago, # |   0 Auto comment: topic has been updated by Fly_37 (previous revision, new revision, compare).
 » 2 years ago, # |   +55 Spent about an hour and couldn't solve it. Brilliant, though very hard problem :)
•  » » 2 years ago, # ^ |   -70 Here's what I think will really happened with you. Pick this problem with difficulty rating f(your rating) Pretend to solve it for n minutes, then read the editorial.
 » 2 years ago, # | ← Rev. 2 →   -44 We can try something with differencesdi = di+1 — di d1 d2 d3 -- dn-1 , now our problems become how many different ways are there to divide these differences in different groups of equal sum such that we can't take out this sum from last group of remaining elements anymore first group : d0, d1 + d2 + -- + dl, dl+1 — dr, [dr -- indivisible group]similarly : d0 + d1, ------- , [dr -- indivisible groupI got until here, main observations were : that no. of groups of each length will reduce significantly ,total no. of diff arrays will be n
 » 2 years ago, # | ← Rev. 2 →   +165 $O(N \log N)$ seems hard, but here is a proof for $O(N^{3/2})$ bound:It is easy to see that $a_i,a_j$ can be consecutive elements of at most one maximal arithmetic progression.Let $S_i$ — set of such $j$'s that $a_i  » 2 years ago, # | ← Rev. 2 → 0 Consider the following algorithm that counts the number of arithmetic progressions: for every prime$p$(excluding 1 of course): Consider all elements$a_0+t \cdot p$. Denote their amount by$\ell_p$. Sequences with difference$p,2p,3p,...,\ell \cdot p $are valid options. Iterate over$i=1$to$\ell$and add$\lfloor\frac{\ell}{i}\rfloor$. Since we iterated over all multiples of primes, we iterated over all differences possible (we excluded 1, so the bound will have an extra$n$term). We also know that$ \sum_{p}{\ell_p} = n $. It remains to bound the sum of$\lfloor\frac{\ell}{i}\rfloor$:$ \sum_{i=1}^{\ell}{\lfloor\frac{\ell}{i}\rfloor} \le \sum_{i=1}^{\ell}{\ell\cdot\frac{1}{i}} \le \ell \cdot \log{\ell} $Summing over all primes yield$\sum_{\ell_p}{\ell_p \cdot \log{\ell_p} \le n \cdot \log{n} } $. Edit: As w0nsh said, a single number can contribute to multiple$\ell_p$'s. We know that$x$can have at most$\log{x}$different prime divisors. Suppose$a_i\le M $, then$ \sum_{p}{\ell_p} \le n \cdot \log{M} $, and the bound will be$ n \cdot \log{M} \cdot \log{(n \cdot \log{M})} $. I assumed that for a specific prime number, all differences are possible. Perhaps proving an$O(n \cdot \log{n}) $bound requires analyzing this more carefully. •  » » 2 years ago, # ^ | +51 I don't think$\sum_{p}{\ell_p} = n$is true. One number can contribute to multiple$\ell_p$'s. •  » » 23 months ago, # ^ | +14 I don't see why$p, 2p, 3p, \dots, \ell \cdot p$should be the only valid differences for a given prime. Are you implicitly assuming that$1, 2, \dots, n$(or$p, 2p, \dots, n\cdot p$) is the sequence with highest cost? (I think$\lfloor \frac{\ell}{i}\rfloor$should be$\lfloor \frac{M}{i} \rfloor$, but then considering primes doesn't help.)Here's a simple$O(M \log n)$bound: WLOG suppose the first element is$0$. For every non-zero element$x$, there is a unique maximal arithmetic progression starting with$0, x, \dots$. Its length is at most$\frac{M}{x} + 1$. Summing over all$x$gives$ \sum_{0 < x \in A} \Big(\frac{M}{i}+1\Big) \leq n-1 + \sum_{i=1}^{n-1} \frac{M}{i} = O(n + M \log n) = O(M \log n) $ » 23 months ago, # | +10 Should be noted that consecutive numbers isn't the worst one, but it probably is asymptotically. For example,$[0,1,2,3,4,5,6,7]$has lower cost then$[0,1,2,3,4,5,6,8].$ » 23 months ago, # | ← Rev. 2 → +146 I found a proof for slightly worse bound:$O(n \log^2 n)$.Let$p_i,p_{i+1}$be consecutive primes. Let$E_{p_i}$be a set of pairs$(a_l,a_j)$such that there exists maximal arithmetic progression, such that$a_l$is the$p_i$-th element, and$a_j$is the$p_{i+1}$-th element. WLOG we can assume that$a_0=0$, therefore$\frac{a_l}{a_j}=\frac{p_i}{p_{i+1}}$Notice that from the Prime number theorem it follows that the cost in original problem is$O(\sum_p |E_p| \log n)$.Lets consider initially empty graph$G$with$V=[a_0,a_1,\ldots a_{n-1}]$. We will be adding edges from$E_2, E_3, \ldots$to the graph, in that order.[1] every vertex appears in$E_p$at most twice;[2] If$a_l, a_k$are in the same connected component of$(V, E_2 \cup \ldots \cup E_{p_{i}})$then$\frac{a_l}{a_k}=\frac{A}{B}$for some coprime numbers$A,B$with prime factors$\leq p_{i+1}$. Therefore for every$j>i: \ (a_k,a_l) \not\in E_{p_j}$.From [1] it follows that if we are adding edges from$E_q$and connecting two components$S_1,S_2$, then$|E_q \cap S_1 \times S_2| \leq 2\cdot min(|S_1|,|S_2|)$. Combining smaller-to-bigger argument and [2], we obtain, that the total number of edges in graph$(V, E_2 \cup E_3 \cup \ldots)$is$O(n \log n)$, so the bound for the problem is$O(n \log^2 n)$. •  » » 23 months ago, # ^ | +63 This seems to be a huge improvement! Wow :o •  » » 23 months ago, # ^ | +30 amazing proof  » 23 months ago, # | -38 Hey, if the array is increasing, then it's very easy to prove following:Cost can be calculated in O(Nlog²N) in worst scenario. (Add: maybe one log factor can be reduced, depends on restrictions on A[i])Let's assume our array isa0, a1, a2, a3, ... , aN.We construct another array dd0, d1, ... , dNSuch that d[i] = a[i]-a[0]Now let's see. If array is increasing, we can iterate for each d[i] (ignoring d[0]) and doint k=1; while (map[d*k]) { //while d*k exists in array k++ } In the end, k will be the length of such subarray with d==d[i]//Map adds another log factor, maybe can be reduced1)Note, that k<=n 2)Because array is strictly increasing, d1 •  » » 23 months ago, # ^ | +14 what  » 23 months ago, # | +55 Here's a construction that does$\omega(n \log n)$for infinitely many$n$, disproving the lemma:Choose some$k$and consider the set of elements$i\cdot j$for$1 \leq i,j \leq k$— sorted, after removing duplicates and with a$0$in the beginning.This sequence has cost$\Omega(k^2 \log k)$: any starting element$ij$has a sequence of length at least$\frac{k}{i}$which is enough to show this bound.However, after removing duplicates we have$o(k^2)$elements in our sequence — as found here: https://mathoverflow.net/questions/108912/number-of-elements-in-the-set-1-cdots-n-cdot-1-cdots-n/108939#108939Choosing$k = \sqrt{n}$gives the contradiction. We still don't know the exact bound though. •  » » 23 months ago, # ^ | +81 Won't removing the duplicates affect the total cost estimate as well, since$ij$are not counted multiple times anymore? •  » » » 23 months ago, # ^ | ← Rev. 2 → +20 Right, it might make it asymptotically less than$k^2 \log k\$, my bad. Not sure if this invalidates the construction but I feel like it does.