Online solution for each query for finding maximum frequent value in a range

LittleMaster_7's blog

By LittleMaster_7, 10 years ago, in English

SPOJ — Most Frequent Value

I solved this problem using Mo's algorithm.

Is there an online solution for this problem, one that answers each query as it arrives?

LittleMaster_7
10 years ago
24

Comments (23)

UnknownNooby

10 years ago

+21

I only know a solution in $\text{[math]}$.

I tried to find a better solution for quite a long time, but no results yet :(

downvoteplz

10 years ago

Can you describe it?

UnknownNooby

10 years ago

+31

Note: I assume all values in the array are ≤ N. If that is not true, you can use a hash map instead of an array, or you can make it true with $\text{[math]}$ precalculation.

Split your array into $\text{[math]}$ blocks of size K. For every interval that runs from one block beginning to another, precalculate the answer along with an array cnt[x] that tells you how many times x occurs in that interval. You can do that in linear time for every interval.

We have spent $\text{[math]}$ time and memory by this point. What can we do now?

Consider a query [L;R] with R - L ≥ 2·K (otherwise we can do a linear scan to count the occurrences of every value in the interval). Now we know for sure that one of the precalculated intervals lies completely inside our query. To be exact, it covers $\text{[math]}$. (From here on I'll call its borders [A;B].)

Notice that A - L ≤ K, and likewise R - B ≤ K. We can simply take the precalculated array for [A;B] and recalculate the value for [L;R] by adding the remaining elements one by one, which takes O(K) per query.

Note

To sum up: this approach works in $\text{[math]}$; by choosing $\text{[math]}$ you get the time complexity I was talking about.
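The block idea above can be sketched roughly as follows. This is only an illustration under my own assumptions: indices are 0-based, values are small non-negative integers, and the names `build`, `query` and `table` are hypothetical, not taken from the comment. Memory use is deliberately large, as in the description.

```python
from collections import Counter

def build(arr, K):
    """For every pair of block boundaries a*K < b*K, store the value counts
    and the best frequency on arr[a*K : b*K]."""
    n = len(arr)
    size = max(arr) + 1               # assumes non-negative integer values
    nb = n // K                       # number of whole blocks
    table = {}
    for a in range(nb):
        cnt = [0] * size
        best = 0
        for b in range(a + 1, nb + 1):
            for x in arr[(b - 1) * K : b * K]:   # append one more block
                cnt[x] += 1
                best = max(best, cnt[x])
            table[(a, b)] = (cnt.copy(), best)
    return table

def query(arr, K, table, L, R):
    """Highest frequency of any value on arr[L..R] (inclusive)."""
    a = -(-L // K)          # first block boundary at or after L
    b = (R + 1) // K        # last block boundary at or before R + 1
    if a >= b:              # no whole block inside: plain linear scan
        return max(Counter(arr[L:R + 1]).values())
    cnt, best = table[(a, b)]
    loose = list(range(L, a * K)) + list(range(b * K, R + 1))
    for i in loose:         # add the fewer than 2K elements outside [A;B]
        cnt[arr[i]] += 1
        best = max(best, cnt[arr[i]])
    for i in loose:         # undo, so the stored array stays valid
        cnt[arr[i]] -= 1
    return best

sample = [1, 2, 1, 3, 3]
print(query(sample, 2, build(sample, 2), 0, 4))
```

The undo loop is one possible design choice: it lets each query reuse the stored count array instead of copying it, which keeps the per-query cost at O(K).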

Note 2

MadhuramJ

6 years ago

The same approach can be used with a different K.

Let's say K = N^z and Q = N^y (y <= 1). The total complexity is O(N^(z*z - 2*z + 2) + N^(z + y)). We want the two exponents to be equal. Equating them, we get z*z - 3*z + (2 - y) = 0,

which gives z = (3 - sqrt(4*y + 1)) / 2.

For y = 1 (Q = N), z ≈ 0.38 and the total complexity is O(N^1.38).
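The arithmetic above is easy to check numerically. The snippet below takes the two exponents exactly as stated in this comment and only solves the quadratic; the function name `balanced_exponent` is made up for the illustration.

```python
import math

def balanced_exponent(y):
    """Smaller root of z*z - 3*z + (2 - y) = 0, i.e. the z for which
    z*z - 2*z + 2 equals z + y."""
    return (3 - math.sqrt(4 * y + 1)) / 2

z = balanced_exponent(1.0)
print(round(z, 2), round(z + 1.0, 2))   # exponent of K, exponent of the total
```

The smaller root is the relevant one because the larger root, (3 + sqrt(4*y + 1)) / 2, exceeds 1 and would make K larger than N.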

nandrewjh

10 years ago

+18

Is there any offline solution that doesn't use Mo's algorithm?

harish_dalal

5 years ago

-7

A solution for the problem https://mirror.codeforces.com/contest/1514/problem/D was proposed by galen_colin in a recent stream on YouTube; it is worth watching.

Enchom

10 years ago

+100

I thought the problem was interesting, so here is what I came up with; it should be $\text{[math]}$.

We have an array $\text{[math]}$ of size $\text{[math]}$, and $\text{[math]}$ queries.

Let's start off by choosing some constant $\text{[math]}$. We will do some heavy precomputation, split into two parts:

First precomputation

Define the following function:

$\text{[math]}$ = the minimum index $\text{[math]}$ such that the most frequent value in $\text{[math]}$ occurs exactly $\text{[math]}$ times.

We want to compute this function for all $\text{[math]}$ and $\text{[math]}$. This can be done in $\text{[math]}$ since we can do something similar to a two-pointer walk for a fixed $\text{[math]}$. I'll omit details, but feel free to ask.
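Since the details are omitted, here is one way the two-pointer walk could look for a single fixed frequency k. It is a sketch under my own assumptions, not the author's code: indices are 0-based, the name `first_reach` is hypothetical, and `len(arr)` is used to mean "no such index". At the minimal right end, "exactly k times" and "at least k times" coincide, which is what the code relies on.

```python
def first_reach(arr, k):
    """res[i] = smallest j such that some value occurs k times in arr[i..j]
    (inclusive), or len(arr) if no such j exists. Assumes k >= 1."""
    n = len(arr)
    res = [n] * n
    cnt = {}
    j = 0            # the current window is arr[i:j]
    full = False     # True while some value has count k inside the window
    for i in range(n):
        while not full and j < n:        # the right end only moves forward
            cnt[arr[j]] = cnt.get(arr[j], 0) + 1
            full = cnt[arr[j]] == k
            j += 1
        if not full:
            break                        # later starts cannot succeed either
        res[i] = j - 1
        if cnt[arr[i]] == k:             # dropping arr[i] loses the k-th copy
            full = False
        cnt[arr[i]] -= 1
    return res

print(first_reach([1, 2, 1, 3, 3], 2))
```

Running this once per k from 1 up to the chosen constant gives the whole table, and a query on arr[L..R] has an answer of at least k exactly when `first_reach(arr, k)[L] <= R`.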

Second precomputation

The first part of our precomputation will help us answer queries whose answer is quite small, so we now have to do something about queries with a large answer. Suppose that we create a set $\text{[math]}$ that contains all values of $\text{[math]}$ which occur in $\text{[math]}$ more than $\text{[math]}$ times, and denote its elements by $\text{[math]}$. Obviously, this set has size at most $\text{[math]}$, that is $\text{[math]}$. Now let's define the function:

$\text{[math]}$ = the minimum index $\text{[math]}$ such that $\text{[math]}$ and $\text{[math]}$

We want to compute this function for all $\text{[math]}$ and $\text{[math]}$. This can again be done in $\text{[math]}$ by using a DP-like approach, moving from the end to the front of the array for every fixed $\text{[math]}$.
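As an illustration, the back-to-front pass might look like the sketch below. I am reading the elided definition as "the first position at or after i that holds the k-th frequent value", which is an assumption on my part; the names `heavy_values` and `build_next`, 0-based indices, and the sentinel `n` for "no such index" are also my own choices.

```python
from collections import Counter

def heavy_values(arr, K):
    """Values that occur more than K times in the whole array."""
    return [v for v, c in Counter(arr).items() if c > K]

def build_next(arr, heavy):
    """nxt[k][i] = smallest j >= i with arr[j] == heavy[k], or n if none.
    Each row is filled from the back: the answer for i is either i itself
    or the answer already computed for i + 1."""
    n = len(arr)
    nxt = []
    for v in heavy:
        row = [n] * (n + 1)           # row[n] = n is the sentinel
        for i in range(n - 1, -1, -1):
            row[i] = i if arr[i] == v else row[i + 1]
        nxt.append(row)
    return nxt

arr = [1, 2, 1, 3, 3]
heavy = heavy_values(arr, 1)
print(heavy, build_next(arr, heavy))
```

Each row costs O(N) to fill, so the whole table costs O(N) times the number of frequent values.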

Answering the queries

Now let's start answering queries. Suppose we get a query $\text{[math]}$ to $\text{[math]}$. Suppose that we want to check if there is some value that occurs at least $\text{[math]}$ times. Well, for $\text{[math]}$ we can simply check whether $\text{[math]}$. If so, there is a value that occurs at least $\text{[math]}$ times; otherwise there isn't one.

So we can immediately check whether $\text{[math]}$. If that's false, we know that the answer is less than K, and we can just iterate over all $\text{[math]}$ and find the largest value that works. That takes $\text{[math]}$.

However, if we have $\text{[math]}$, then the answer is at least $\text{[math]}$, but may be larger. In that case we check each of the numbers in $\text{[math]}$: if the answer is larger than $\text{[math]}$, then surely one of them is the most frequent number.

Using the $\text{[math]}$ function we can easily find the number of occurrences of $\text{[math]}$ in our segment for some $\text{[math]}$ in $\text{[math]}$ ¹. So we can find our answer in $\text{[math]}$ by checking every element of $\text{[math]}$.

Resulting solution and theoretical complexity

We have a total precomputation complexity of $\text{[math]}$, and each query is answered in either $\text{[math]}$ or $\text{[math]}$. The total complexity in the worst case is $\text{[math]}$. It is easy to see that if we set K = $\text{[math]}$ we get a worst-case complexity of $\text{[math]}$.

My experience

My coding and explaining are a bit rusty, so writing the code and this comment took me an hour each. I ended up getting AC on the problem, but only after a lot of Time Limit Exceeded verdicts. I had to optimise the code a bit to get it accepted. An example optimisation is to solve the first case of queries in $\text{[math]}$ by binary search instead of linear search. I also had to pick the constant K in the program depending on the test case in order to make it run quicker, as in practice $\text{[math]}$ may not always be the best.

Obviously, even if the solution is asymptotically as good as the offline Mo's algorithm solution, the constant is much higher, hence it's a few times slower, and the SPOJ problem's time limit is quite tight. Another downside of the solution is that it takes a lot of memory, but luckily the SPOJ problem has a very large limit.

Feel free to ask any questions and sorry if I've omitted too many details. Notify me about any mistakes too, as this comment took way too much time and I don't have time to proofread.

¹ Note: To be able to quickly find the number of occurrences of $\text{[math]}$ in some segment $\text{[math]}$ to $\text{[math]}$ we'll have to precompute another array:

$\text{[math]}$ = the number of indices $\text{[math]}$ such that $\text{[math]}$ and $\text{[math]}$.

For example, the sample array in the SPOJ problem, {1, 2, 1, 3, 3}, would yield an ID array of {2, 1, 1, 2, 1}. In a sense, we're just numbering the values of each kind backwards. Now, let's set $\text{[math]}$ and also set all $\text{[math]}$ if there is no valid index $\text{[math]}$ according to the definition of the $\text{[math]}$ function.

Now, magic! If we want to find the number of occurrences of $\text{[math]}$ in the segment $\text{[math]}$ to $\text{[math]}$ we simply take $\text{[math]}$ and that is our answer.
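To make the trick concrete, here is a small sketch of how I read it: the count of a value on a segment is the ID at its first occurrence at or after the left end, minus the ID at its first occurrence after the right end. The exact formula is elided in the text, so this reading is an assumption, as are the 0-based indices, the extra zero entry at position n, and the names `build_id`, `next_row` and `occurrences`.

```python
def build_id(arr):
    """ident[i] = number of indices j >= i with arr[j] == arr[i].
    ident[n] = 0 covers the 'no valid index' case."""
    n = len(arr)
    seen = {}
    ident = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        seen[arr[i]] = seen.get(arr[i], 0) + 1
        ident[i] = seen[arr[i]]
    return ident

def next_row(arr, v):
    """row[i] = smallest j >= i with arr[j] == v, or n if there is none."""
    n = len(arr)
    row = [n] * (n + 1)
    for i in range(n - 1, -1, -1):
        row[i] = i if arr[i] == v else row[i + 1]
    return row

def occurrences(ident, row, L, R):
    """Occurrences of the row's value in arr[L..R] (inclusive), in O(1)."""
    return ident[row[L]] - ident[row[R + 1]]

arr = [1, 2, 1, 3, 3]
ident = build_id(arr)
print(ident[:5])                                    # [2, 1, 1, 2, 1]
print(occurrences(ident, next_row(arr, 3), 0, 4))   # 2
```

The subtraction works because the IDs of one value form a countdown along the array, so the difference between two of them is the number of copies lying between the two positions.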

shaheen_bd

10 years ago

+13

Well explained. :)

shaheen_bd

10 years ago

+16

If values can also be updated, how can it be solved?

Errichto

10 years ago

+24

And linear memory please.

Seriously, isn't this problem already hard enough? Encho's solution is quite complicated (and btw. I tried to solve it yesterday, spent 40-50 minutes and didn't succeed) and you just casually ask "ok, what if we also change values".

shaheen_bd

10 years ago

+28

I also tried to implement that idea, but failed :(

Azret

10 years ago

+15

Wow, that's really cool :) I liked the magic section a lot.

LittleMaster_7

10 years ago

Thanks a lot, nice idea.

bicsi

10 years ago

Great stuff! Although note that there is no need for the Next matrix, as some prefix sums would do the trick just fine! I wonder if there is some preprocessing that would let us find out information about the "frequent" elements faster than O(numberofelements). I doubt it, but it would surely be interesting to check out!

EDIT: By keeping the prefix matrix as n vectors of size sqrt(n) the solution will be very cache friendly for the big values.
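A rough sketch of the prefix-sum variant, as I understand the suggestion: one short row of counters per array position, covering only the frequent values. The function names and 0-based indices are my own assumptions, and the Python lists only illustrate the layout; any cache benefit would show up in a language with contiguous arrays.

```python
def build_prefix(arr, heavy):
    """pref[i][k] = occurrences of heavy[k] in arr[:i]. Each position gets
    one short row, so a query reads two contiguous rows."""
    index = {v: k for k, v in enumerate(heavy)}
    pref = [[0] * len(heavy)]
    for x in arr:
        row = pref[-1][:]             # copy the previous row
        if x in index:
            row[index[x]] += 1
        pref.append(row)
    return pref

def best_heavy(pref, L, R):
    """Highest frequency among the frequent values on arr[L..R] (inclusive)."""
    lo, hi = pref[L], pref[R + 1]
    return max((h - l for l, h in zip(lo, hi)), default=0)

arr = [1, 2, 1, 3, 3]
pref = build_prefix(arr, [1, 3])
print(best_heavy(pref, 0, 4), best_heavy(pref, 1, 3))
```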

johnchen902

10 years ago

+18

There is an entry on Wikipedia: Range Mode Query.

johnchen902

10 years ago

+38

That page mentions an O(n) space and $\text{[math]}$ method.

Theorem 1 Let A, B be any multisets. $\text{[math]}$.

Proof Trivial

Now assume we have an array A of size n. Split it into $\text{[math]}$ blocks, each of size $\text{[math]}$. Precompute the mode and its frequency for every run of consecutive blocks. This takes O(n) space and $\text{[math]}$ time.

For each query, we have a prefix, a span and a suffix. By Theorem 1, the mode must be the mode of the span, an element of the prefix, or an element of the suffix. For each element in the prefix or the suffix, check if it is more frequent than the current mode. With additional preprocessing and analysis, $\text{[math]}$ per query can be achieved.
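Here is a simplified sketch of that scheme. It is not the refined method from the Wikipedia page: to check a candidate it counts occurrences with binary search over per-value position lists, which adds a log factor per check. Block size, 0-based indices and the class name `RangeMode` are assumptions made for the illustration.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from math import isqrt

class RangeMode:
    def __init__(self, arr):
        self.arr = arr
        n = len(arr)
        self.s = s = max(1, isqrt(n))         # block size
        self.pos = defaultdict(list)          # value -> sorted positions
        for i, x in enumerate(arr):
            self.pos[x].append(i)
        nb = n // s                           # number of whole blocks
        # span[a][b] = (frequency, mode) of arr[a*s : b*s] for a < b
        self.span = [[(0, None)] * (nb + 1) for _ in range(nb + 1)]
        for a in range(nb):
            cnt = defaultdict(int)
            best = (0, None)
            for i in range(a * s, nb * s):
                cnt[arr[i]] += 1
                if cnt[arr[i]] > best[0]:
                    best = (cnt[arr[i]], arr[i])
                if (i + 1) % s == 0:          # reached a block boundary
                    self.span[a][(i + 1) // s] = best

    def count(self, v, L, R):
        """Occurrences of v in arr[L..R], by binary search on its positions."""
        p = self.pos[v]
        return bisect_right(p, R) - bisect_left(p, L)

    def query(self, L, R):
        """Return (frequency, mode) for arr[L..R] (inclusive)."""
        s = self.s
        a, b = -(-L // s), (R + 1) // s       # whole blocks a .. b-1 fit inside
        if a >= b:
            best, loose = (0, None), range(L, R + 1)
        else:
            best = self.span[a][b]
            loose = list(range(L, a * s)) + list(range(b * s, R + 1))
        for i in loose:                       # prefix and suffix candidates
            c = self.count(self.arr[i], L, R)
            if c > best[0]:
                best = (c, self.arr[i])
        return best

print(RangeMode([1, 2, 1, 3, 3]).query(0, 2))
```

The span table has about (n / s)^2 entries, which is O(n) for s near sqrt(n). The span's stored frequency is safe to use as a starting point because any extra copies of the span's mode lie in the prefix or suffix and are recounted there.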