Блог пользователя mxm_jv

Автор mxm_jv, 4 года назад, По-английски
For not c++ users

Initial problem

Statement:

Let $$$A$$$ be an array $$$A_0,A_1,\ldots,A_{n-1}$$$ of the size $$$n$$$. Find the maximum sum of continuous subarray(window) of the size $$$k$$$ with possibility to ignore one of the elements.

Naive $$$O(n\cdot k)$$$ solution:

Let's denote $$$sums[i]=A_{i-k+1}+A_{i-k+2}+\ldots+A_i$$$

$$$minimum[i]=min(A_{i-k+1},A_{i-k+2},\ldots,A_i , 0)$$$ (if minimum is positive, we will use 0 instead)

For $$$i \lt k - 1, minimum[i]=sums[i]=0$$$

Key idea: The answer is maximum of $$$sums[i] - minimum[i]$$$, for every $$$k-1 \leq i \lt N$$$

Proof

So, let's go straight forward, and for every $$$i$$$, calculate $$$sums[i]$$$, find $$$minimum[i]$$$, and then pick the maximum for $$$sums[i]$$$ — $$$minimum[i]$$$.

There is a way to calculate $$$sums$$$ in $$$O(n)$$$ using formula: $$$sums[i] = A_i + sums[i-1] - A_{i-k}$$$.

But there is no obvious algorithm to calculate $$$minimum[i]$$$, so for every $$$i$$$ we will need to iterate every value in the k-window and pick the minimum.

Complexity: $$$O(n\cdot k)$$$

Linearithmic solution:

Though, it's a not-straight forward strategy to find minimum in the range, there is well-known problem called RMQ, range minimum queries, especially static range minimum queries, which helps us find the solution in $$$O(n\cdot log(n))$$$ (using sparse table to find minimum in needed range)

Sparse table is really fast, and with a good implementation it might succeed, when linear complexity is needed, though it's not what we are looking for.

CPalgorithms RMQ section

Top coder RMQ and LCA

Another $$$O(n\cdot log(n))$$$ approach:

To find minimum in our k-window, we can use priority queues($$$pq$$$) of minimum.

Let's build $$$sums$$$ with technique we used in our $$$O(n\cdot k)$$$ solution, and instead of going through k-window to find minimum for every $$$i$$$, we will start with $$$i=0$$$ and iterate until the end.

For every $$$i$$$ we will add a pair $$$(value=A_i, index=i)$$$ And then we remove the top of priority queue, if second element is out of the k-window $$$(pq.top().index \leq i-k)$$$, until we find the element that is inside. so for $$$i$$$ answer is $$$sums[i] - pq.top().value$$$

Linear time? Not yet.

Let's take a look on even better ($$$\text{if }k \ll n$$$) linearithmic solution in $$$O(n\cdot log(k))$$$

Instead of priority queue, we can use set($$$set$$$) (in our case multiset)! Although hidden constants of sets are way bigger than priority queue ( because of tree balancing ), set let you keep constant size of k, which let us insert, erase and get minimum if $$$O(log(k))$$$

To make this work we still using our $$$sums$$$ technique, but now you can use the same idea to find minimum,

Insert $$$A_0, A_1,\ldots A_{k-1}$$$ to the set.

To find minimum of the set in c++, use *s.begin()

For $$$i=k-1$$$ answer is $$$sums[k-1] - min(set)$$$

Iterate through $$$i \geq k$$$, and for every $$$i$$$,

set.insert(A[i])

set.erase(A[i-k]) *

*to make sure you erase only one element use set.erase(set.find(A[i-k])) instead.

And the answer again will be $$$sums[i] - min(set)$$$

We are inserting and erasing from the set of constant size $$$k$$$, $$$n$$$ times, so overall complexity: $$$O(n\cdot log(k))$$$

This solution also might be generalized for $$$q$$$ elements to remove, just take $$$q$$$ smallest negative values int the set of every $$$i$$$. For generalized problem overall complexity is: $$$O(n\cdot q\cdot log(k))$$$, but feel free to come up with better solutions for this problem in the comments.

Finally transition to $$$O(n)$$$
Did you skip?

The key observation that you might already found, that for every $$$i$$$, if there is exist $$$j$$$ such, $$$A_i \gt A_j$$$ and $$$i \lt j$$$, then for every $$$j \leq w \leq j+k-1$$$, $$$minimum[w] \neq A_j$$$

Proof

So now, when we have all the things we need, let's take a look on double-ended queue(deque). This data-structure allows to insert(push) and remove(pop) from 2 sides(back and front) in $$$O(1)$$$ Although it doesn't keep the elements in order.

How can we help it? Let's use our observation!

Let's say we want to keep minimum for current window in front of the deque. As priority queue in third solution, we will keep in the queue pair of $$$(value=A_i, index=i)$$$

We iterate through every $$$i$$$ from 0 to n-1,

Let's remove all elements from the front, that outside of current window.

while($$$deque.front().index \leq i-k$$$) — remove $$$deque.front()$$$

And now to insert new $$$A_i$$$, let's go to back of our deque.

We know that for every element in the deque, $$$index \lt i$$$, because we iterate on $$$i$$$. Now because of the observation we made, we can remove all the elements from the back, that are bigger than $$$A_i$$$

Why?

And what we'll get in the front is actual minimum for current window.

When $$$i \geq k-1$$$ we can start to finding the maximum of $$$sums[i] - deque.front().value$$$

Every element is once added (only when iterated through), and at most once removed. Because we used deque, we can't iterate trough an entire deque (only when removing), so overall complexity: $$$O(n)$$$

Conclusion:

By the end we found an algorithm that finds minimum for every k-size subarray in linear time, I don't know if it's actually useful in other problems, but I enjoyed writing this blog. Feel free to correct me if you found any mistakes or you have another ideas in the comments.

Also thanks to TsundereMagic for linking an article on CP algorithms about pretty much same idea.

P.S

My c++ implementation on PasteBin.com

Полный текст и комментарии »

  • Проголосовать: нравится
  • +11
  • Проголосовать: не нравится