Sorting sliding windows online [Looking for answers]

By div4only, 21 month(s) ago, in English

Given a $$$1$$$-indexed integer array $$$a=[a_1,\,a_2,\,a_3,\,\ldots,\,a_n]$$$ and a fixed window size $$$k$$$, define the sliding window $$$I_j := [a_{j-k+1}, a_{j-k+2}, \ldots, a_j]$$$ for every $$$j \geq k$$$. For each $$$j \geq k$$$, we need to answer a query:

How many sliding windows in $$$\{I_k, I_{k+1}, \ldots, I_{j-1}\}$$$ are less than or equal to $$$I_j$$$ in lexicographic order?

For example, $$$a=[1, 2, 1, 3]$$$ and $$$k=2$$$:

(1) For $$$j=2$$$, we should answer $$$0$$$, since there are no earlier windows.

(2) For $$$j=3$$$, we should answer $$$1$$$, as $$$[1,2] < [2,1]$$$ in lexicographic order.

(3) For $$$j=4$$$, we should answer $$$1$$$, as $$$[1,2] < [1,3]$$$ in lexicographic order, while $$$[2,1] > [1,3]$$$.
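
To pin down the query precisely, here is a minimal brute-force sketch in Python that just follows the definition. The function name `count_le_naive` is made up for illustration, and the running time is roughly $$$O(n^2 k)$$$, so it is only meant as a reference for checking faster solutions, not as an answer to the question.

```python
def count_le_naive(a, k):
    """Answer the query for every j = k..n by direct comparison.

    `a` is a plain Python list (0-indexed storage of a_1..a_n).
    Returns a list whose t-th entry is the answer for j = k + t.
    """
    n = len(a)
    # windows[t] is I_{k+t} = [a_{j-k+1}, ..., a_j] with j = k + t
    windows = [a[j - k:j] for j in range(k, n + 1)]
    answers = []
    for t, current in enumerate(windows):
        # Python compares lists lexicographically, which is the order we want.
        answers.append(sum(1 for earlier in windows[:t] if earlier <= current))
    return answers


print(count_le_naive([1, 2, 1, 3], 2))  # [0, 1, 1]
```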

A suffix array plus an LCP array solves this offline in $$$O(n \log n)$$$. But how can it be solved online? For example, what if $$$a$$$ is an unbounded data stream instead of an array? In the data stream case, you have to process $$$a_j$$$ and answer the query for $$$I_j$$$ before reading $$$a_{j+1}$$$.

strings

Comments (6)

MZuenni

21 month(s) ago

+10

With hashing and an order statistics tree you can solve it in $$$O(n\log(n)\log(k))$$$. The order statistics tree gives you the position of each $$$I_j$$$ in the sorted sequence of all windows seen so far, in $$$O(n\log(n)\cdot\texttt{cmp})$$$ total, where $$$\texttt{cmp}$$$ is the time needed to compare two windows. With hashing you can compare two windows in $$$O(\log(k))$$$ by binary searching for their longest common prefix and then comparing the elements at the next index.
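
A rough Python sketch of this idea, under a few assumptions that are not from the comment itself: the class name `WindowRanker`, the modulus $$$2^{61}-1$$$ and the random base are illustrative choices, and a plain sorted list with `list.insert` stands in for the order statistics tree. That stand-in makes each insertion linear in the worst case, so only the comparison part reflects the stated complexity. Note that all prefix hashes are kept, so memory grows with the stream.

```python
import random

MOD = (1 << 61) - 1                      # assumed modulus (a Mersenne prime)
BASE = random.randrange(10**6, MOD - 1)  # random base to make collisions unlikely


class WindowRanker:
    def __init__(self, k):
        self.k = k
        self.a = [0]      # a[1..] holds the stream, a[0] is padding
        self.pref = [0]   # pref[i] = polynomial hash of a[1..i] modulo MOD
        self.pw = [1]     # pw[i] = BASE**i modulo MOD
        self.order = []   # end indices of the windows seen so far, in sorted order

    def _hash(self, l, r):
        # Hash of a[l..r] (1-indexed, inclusive) in O(1).
        return (self.pref[r] - self.pref[l - 1] * self.pw[r - l + 1]) % MOD

    def _le(self, i, j):
        # True if I_i <= I_j: binary search the longest common prefix by hash.
        si, sj = i - self.k, j - self.k   # window I_i is a[si+1 .. si+k]
        lo, hi = 0, self.k
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if self._hash(si + 1, si + mid) == self._hash(sj + 1, sj + mid):
                lo = mid
            else:
                hi = mid - 1
        if lo == self.k:                  # windows are (very likely) equal
            return True
        return self.a[si + 1 + lo] < self.a[sj + 1 + lo]

    def push(self, x):
        """Read the next element; return the answer for I_j, or None if j < k."""
        self.a.append(x)
        self.pref.append((self.pref[-1] * BASE + x) % MOD)
        self.pw.append(self.pw[-1] * BASE % MOD)
        j = len(self.a) - 1
        if j < self.k:
            return None
        # Upper bound: number of earlier windows that are <= I_j.
        lo, hi = 0, len(self.order)
        while lo < hi:
            mid = (lo + hi) // 2
            if self._le(self.order[mid], j):
                lo = mid + 1
            else:
                hi = mid
        self.order.insert(lo, j)          # stand-in for an order statistics tree insert
        return lo


ranker = WindowRanker(2)
print([ranker.push(x) for x in [1, 2, 1, 3]])  # [None, 0, 1, 1]
```

With a real order statistics tree (or any balanced tree with subtree sizes) in place of the list, each `push` would do $$$O(\log n)$$$ comparisons of $$$O(\log k)$$$ each.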

div4only

21 month(s) ago

But here is a problem: the hash value of a window might be exponentially large in $$$k$$$, and you may need $$$O(k)$$$ time to maintain such a large number.

Svyat

21 month(s) ago

Hash functions do not need to be injective. There's no point in making the hash bijective, because that would ruin the performance. The downside is that collisions are possible. That's why you should minimize the chance of a collision (use a random modulus or a random base).
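
To make this concrete, here is a small sketch of a polynomial prefix hash reduced modulo a prime. The names `PrefixHasher`, `MOD` and `BASE`, and the specific modulus $$$2^{61}-1$$$, are assumptions picked for illustration. Every stored value stays below `MOD`, so appending an element and hashing any range are both $$$O(1)$$$. The hash is only ever used for equality tests: equal ranges always get equal hashes, and different ranges get different hashes with high probability.

```python
import random

MOD = (1 << 61) - 1                    # assumed prime modulus
BASE = random.randrange(256, MOD - 1)  # random base, chosen once per run


class PrefixHasher:
    def __init__(self):
        self.pref = [0]  # pref[i] = hash of the first i elements, always < MOD
        self.pw = [1]    # pw[i] = BASE**i modulo MOD

    def append(self, x):
        # O(1) update: the previous hash is shifted by BASE and reduced again.
        self.pref.append((self.pref[-1] * BASE + x) % MOD)
        self.pw.append(self.pw[-1] * BASE % MOD)

    def get(self, l, r):
        # Hash of elements l..r (1-indexed, inclusive) in O(1).
        return (self.pref[r] - self.pref[l - 1] * self.pw[r - l + 1]) % MOD


h = PrefixHasher()
for x in [1, 2, 1, 2]:
    h.append(x)
print(h.get(1, 2) == h.get(3, 4))  # True: both ranges are [1, 2]
```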

div4only

21 month(s) ago

But how can we compare windows once the hashes are taken modulo some number?

Svyat

21 month(s) ago

link1 (cf)
link2 (cp-algorithms)

div4only

21 month(s) ago

Thanks!