### PurpleThinker's blog

By PurpleThinker, 3 months ago,

A few days ago, someone much smarter than me shared with me the following problem. He said it involves "nothing else but simple arrays and for-loops, but has some weird observations to make". I couldn't solve it, even though the solution is simple enough that it can probably be understood by higher-rated pupils on CF. I thought it might be interesting to you, so here it is:

You are given an array $a$ of $n$ positive integers (you don't know anything about their maximum value). Find the maximum difference between two adjacent elements if the array $a$ were sorted.

Example:
Input: $n = 4, a = [11, 2, 7, 5]$
Output: $4$, because if we were to sort the array $a$, then we would have $[2, 5, 7, 11]$, so the maximum difference between two adjacent elements is $4$.

Of course, you can just sort the array in $O(n \log n)$ and find the difference that way. Value-based sorting (like radix sort) is not an option because you don't know what the maximum value is. But there is actually a way to find it in $O(n)$, without sorting the array.
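For reference, the $O(n \log n)$ baseline is only a few lines (a minimal Python sketch; the function name is mine):

```python
def max_gap_sorted(a):
    """Sort, then take the largest difference between adjacent elements."""
    b = sorted(a)
    return max((y - x for x, y in zip(b, b[1:])), default=0)


print(max_gap_sorted([11, 2, 7, 5]))  # → 4
```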

Hint
Solution

Hope you enjoyed the problem!

• +274

 » 3 months ago, # |   +154 Cool! Now find minimum. Seriously though.

Hint: The idea with buckets is good. But you don't really have a good bound on the answer until you start calculating it.

Solution: Let's add elements one by one and maintain the answer and a similar bucket structure, with the bucket size set to the current answer. The good thing about this structure is that each bucket contains at most one element, so when adding a new element you only need to check the bucket with the element itself and the two neighbouring ones, which takes $O(1)$ (keep the buckets in a hash table to not waste anything on empty ones). The bad thing is that the answer might change at some point, and then our structure stops being correct. Well, just rebuild it in linear time. "But that's quadratic," you might say. And you would be right, but what if we shuffle the array at the start? Then what is the probability that the newest element updates the answer? Well, since the answer has just been updated, the new distance is unique (or maybe you split some segment exactly in the middle, in which case it's not quite unique, but that doesn't change much). So the new point must be one of the ends of this segment, and the probability of that is $\frac{2}{k}$, where $k$ is the number of processed elements (including the last one).

So, on the $k$-th iteration you update the answer with probability $\frac{2}{k}$, and in that case you rebuild everything in $O(k)$. So the expected time wasted on that iteration is $O(2) = O(1)$, which sums up to just $O(n)$.
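A rough Python sketch of this randomized incremental scheme (names and implementation details are my own, not the commenter's code); the answer is deterministic, only the running time depends on the shuffle:

```python
import random

def min_gap(a):
    """Minimum |a_i - a_j| over all pairs, in expected O(n).

    Keep the current answer d and a hash map of width-d buckets
    (at most one element each); rebuild from scratch whenever the
    answer improves.
    """
    a = list(a)
    if len(a) < 2:
        raise ValueError("need at least two elements")
    random.shuffle(a)
    d = abs(a[0] - a[1])
    if d == 0:
        return 0
    buckets = {a[0] // d: a[0], a[1] // d: a[1]}
    for k in range(2, len(a)):
        x = a[k]
        b = x // d
        best = d
        # only the element's own bucket and its two neighbours
        # can hold anything closer than d
        for nb in (b - 1, b, b + 1):
            if nb in buckets:
                best = min(best, abs(x - buckets[nb]))
        if best < d:
            # the answer improved: rebuild with the new bucket width
            d = best
            if d == 0:
                return 0
            buckets = {y // d: y for y in a[:k + 1]}
        else:
            buckets[b] = x
    return d


print(min_gap([11, 2, 7, 5]))  # → 2
```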
•  » » 3 months ago, # ^ |   +91 The cool thing about it... is that it works in spaces with more dimensions. Let's say you want to find the 2 closest points in 3D. You do the same, except your buckets are cubes now (imagine cutting the space into a mesh with step $d$). It's no longer true that at most one point lies in any bucket, but it's still $O(1)$ points. And you need to check your cube and all its neighbours, which is 27 buckets now. But technically it is still $O(n)$ in total.

I literally don't know the normal algorithm for the 2 closest points on a plane, because I have only used this one.
•  » » » 3 months ago, # ^ | ← Rev. 2 →   +7 Spoiler: I bet umnik uses binary search to find the bucket each point is in
•  » » » » 3 months ago, # ^ |   +22 I get the joke, but I can't just let it slide. The whole point is that it is $O(n)$.
•  » » » 2 months ago, # ^ |   +28 Here is a fun algorithm for finding the 2 closest points on a 2D plane:

- sort all the points along a random direction (by dot product with a randomly chosen vector)
- check all pairs of neighbours (in that order)
- repeat with new random vectors an additional 5 (or 10 if you get WA) times

Yeah, technically it is $O(n \log n)$, but it is significantly faster in practice. And coding it only takes a few minutes. It does not generalize to higher dimensions, though.
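The steps above can be sketched as follows (my own Python; note this is a heuristic, as the commenter's "or 10 if you get WA" suggests, and can in principle miss the true pair):

```python
import math
import random

def closest_pair(points, rounds=10):
    """Heuristic closest-pair distance in 2D.

    Each round: pick a random direction, sort the points by their dot
    product with it, and check only neighbouring pairs in that order.
    """
    best = math.inf
    for _ in range(rounds):
        theta = random.uniform(0.0, 2.0 * math.pi)
        dx, dy = math.cos(theta), math.sin(theta)
        order = sorted(points, key=lambda p: p[0] * dx + p[1] * dy)
        for (x1, y1), (x2, y2) in zip(order, order[1:]):
            best = min(best, math.hypot(x1 - x2, y1 - y2))
    return best
```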
 » 3 months ago, # |   0 This is a famous interview problem: https://www.interviewbit.com/problems/maximum-consecutive-gap/
 » 3 months ago, # | ← Rev. 3 →   -8 nice problem
•  » » 3 months ago, # ^ |   -8 it is wrong, think harder.
•  » » » 3 months ago, # ^ |   0 He's just spitting nonsense. I'm not sure he used even 1% of his brain power.
•  » » » 3 months ago, # ^ |   -21 You probably just implemented it wrong lmao
•  » » » » 3 months ago, # ^ |   0 sorry, I don't see how that solution is correct.

a = [5 2 1 4]
sorted_a = [1 2 4 5]
max = 5, second_max = 4, diff = 1, but max_diff = 2 ???
•  » » 3 months ago, # ^ |   0 found the guessforces expert
 » 3 months ago, # |   +54 A mandatory pedantic comment.

"Value-based sorting (like radix sort) is not an option because you don't know what the maximum value is"

This is a horrible way to phrase it, because you immediately proceed to contradict yourself:

"I will call max the maximum value in a ... You can calculate these in O(n) time"

Maybe you wanted to say that radix sort has pseudopolynomial complexity? However, this is not true, because its complexity is O(wn), where n is the number of keys and w is the key length. It is polynomial because the input size is also O(wn).

Maybe you wanted to say that w is large, in which case O(wn) is polynomial but slow? However, your proposed solution also becomes slow if you account for the increasing cost of arithmetic operations. For example, subtraction becomes O(w), and division by n-1 may be even slower.

Maybe it's possible to construct a computational system where the complexities of primitive operations work out in favor of your proposed solution, but I doubt that it would be practical.
•  » » 3 months ago, # ^ |   -21 I think for $a_i \le 10^9,$ we can assume that the cost of such arithmetic operations is basically constant (because of word size), but it is not feasible to create an array of size $10^9$ in order to perform radix sort.
•  » » » 3 months ago, # ^ | ← Rev. 2 →   +8 You can radix sort by the first bit, then by the second bit, etc. This way, you can sort 32-bit integers in 32 linear-time passes (or be smarter and do 4 passes, each sorting by a group of 8 bits). In general, you can sort $N$ numbers of $W$ bits in $O(WN)$ time.
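A minimal sketch of the byte-at-a-time LSD radix sort described here (my own code, assuming non-negative keys that fit in `key_bits` bits):

```python
def radix_sort(nums, key_bits=32, group=8):
    """LSD radix sort of non-negative integers.

    Runs key_bits // group stable passes, each a counting-style sort
    on one group-bit digit: 4 passes of 8 bits for 32-bit keys.
    """
    mask = (1 << group) - 1
    for shift in range(0, key_bits, group):
        buckets = [[] for _ in range(1 << group)]
        for x in nums:
            buckets[(x >> shift) & mask].append(x)
        # stable: earlier elements stay earlier within each bucket
        nums = [x for b in buckets for x in b]
    return nums


print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))  # → [2, 24, 45, 66, 75, 90, 170, 802]
```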
•  » » » » 3 months ago, # ^ |   +24 I see. For some reason, I mistook radix sort for counting sort. 🤡
•  » » » 3 months ago, # ^ |   +3 You must be thinking about counting sort, not radix sort.
•  » » » » 3 months ago, # ^ |   +37 Yes, I realized that later. I'm a clown
•  » » 3 months ago, # ^ |   -8 For me, it was pretty clear that he meant that you don't know the maximum value at the time of writing the program.
•  » » » 3 months ago, # ^ |   0 As if someone hardcodes key length in radix sort.
•  » » » » 3 months ago, # ^ |   0 Well, radix sort wouldn't be significantly slower even with big numbers, but still. I agree with the things you said, but trying out both versions and measuring their runtimes would be worth more than just talking about them.
•  » » » 2 months ago, # ^ |   -8 I don’t think so. Spoiler: He used the max element when calculating d, and then divided the elements into buckets. If we don’t know the value of the maximum element, then we couldn’t possibly calculate d.
 » 3 months ago, # |   +39 Cool problem! I think there is a small mistake in your explanation.

Explanation: I thought about the problem in a more "pigeonhole principle" type of way: make $n$ equally-sized buckets (this needs some adjusting when $\max-\min+1$ doesn't divide $n$, but it still works).

If each element falls into a different bucket, we just sorted in linear time, so solve normally. Otherwise, there are empty buckets, and the difference across an empty bucket will be bigger than any difference within a bucket, so the answer will be the $\min$ of a bucket minus the $\max$ of the previous non-empty bucket.

This brings me to what might be a mistake in your solution:

"numbers within the same bucket: calculate max value — min value from the bucket"

This is not an actual candidate (there can be 3 or more elements in the bucket), but checking it doesn't break the algorithm, because one of two things always happens: either there is an empty bucket (so the largest cross-bucket difference is bigger than this fake candidate), or every element is in a different bucket (so the same-bucket difference is just zero, and thus smaller than the real answer).
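This pigeonhole/bucket approach can be implemented roughly like so (my own Python sketch, using an exact integer bucket width; not the blog's hidden solution):

```python
def max_gap(a):
    """Largest difference between adjacent elements of sorted(a), in O(n).

    Pigeonhole: with bucket width ceil((max-min)/(n-1)), the answer is
    at least one bucket wide, so it is always realized between buckets,
    never inside one; only each bucket's min and max matter.
    """
    n = len(a)
    if n < 2:
        return 0
    lo, hi = min(a), max(a)
    if lo == hi:
        return 0
    width = (hi - lo + n - 2) // (n - 1)  # integer ceiling division
    nbuckets = (hi - lo) // width + 1
    bmin = [None] * nbuckets
    bmax = [None] * nbuckets
    for x in a:
        i = (x - lo) // width
        bmin[i] = x if bmin[i] is None else min(bmin[i], x)
        bmax[i] = x if bmax[i] is None else max(bmax[i], x)
    ans, prev = 0, lo
    for i in range(nbuckets):
        if bmin[i] is None:  # empty bucket: some gap spans it
            continue
        ans = max(ans, bmin[i] - prev)
        prev = bmax[i]
    return ans


print(max_gap([11, 2, 7, 5]))  # → 4
```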
 » 3 months ago, # |   +2 "with n positive integers (you don't know anything about their maximum value)"

"I will call max the maximum value in a and min — the minimum value. You can calculate these in O(n) time"

Totally not contradictory.
 » 3 months ago, # |   +16 "numbers within the same bucket: calculate max value — min value from the bucket"

Isn't this both not correct and unnecessary?
•  » » 3 months ago, # ^ |   0 I agree: if the lower bound is the box size, you never need to consider 2 elements from the same box. Also, it would only be a valid candidate if the number of elements in the box were exactly 2.
 » 3 months ago, # |   0 Why this? "Value-based sorting (like radix sort) is not an option because you don't know what the maximum value is." If you read the input array, you will definitely know the maximum value.
•  » » 3 months ago, # ^ |   +1 They are saying that it can be really large, so you can't use radix sort, because it takes $\mathcal{O}(n \cdot \ell)$. But as described in nskybytskyi's comment above, if the numbers are really quite large, the comparison and division operations described in this solution take $\mathcal{O}(\ell)$ time anyway (where $\ell$ is the length of the binary representation of the number).
 » 3 months ago, # | ← Rev. 4 →   0 What about using a vector containing ints (or lls) called values? Then, as you traverse the elements of the input array, you can set values[index]=value. Then, couldn't you just loop through 0 to n-1, and find the maximum value of the absolute value of (values[index+1]-values[index]), and output the max? Please let me know if I misunderstood the problem.
•  » » 2 months ago, # ^ |   0 Yeah, you definitely did; the problem asks for the maximum adjacent difference in the sorted array.