div4only's blog

By div4only, history, 21 months ago, In English

Given a $$$1$$$-indexed integer array $$$a=[a_1,\,a_2,\,a_3,...,\,a_n]$$$ and a fixed window size $$$k$$$, define for each $$$j \geq k$$$ the sliding window $$$I_j := [a_{j-k+1}, a_{j-k+2}, ..., a_j]$$$. For each $$$j \geq k$$$, we need to answer a query:

How many sliding windows in $$$\{I_k, I_{k+1}, ..., I_{j-1}\}$$$ are less than or equal to $$$I_j$$$ in alphabetical (lexicographic) order?

For example, $$$a=[1, 2, 1, 3]$$$ and $$$k=2$$$:

(1) For $$$j=2$$$, we should answer $$$0$$$.

(2) For $$$j=3$$$, we should answer $$$1$$$ as $$$[1,2] < [2,1]$$$ in alphabetical order.

(3) For $$$j=4$$$, we should answer $$$1$$$ as $$$[1,2] < [1,3]$$$ in alphabetical order.
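To pin down the query precisely, here is a minimal brute-force sketch in Python. The function name `brute_force` is made up for illustration, the array is 0-indexed, and it relies on Python comparing lists lexicographically. It costs roughly $$$O(n^2 k)$$$, so it is only meant as a reference to check faster solutions against.

```python
def brute_force(a, k):
    """Answer every query by direct comparison (reference only, ~O(n^2 * k))."""
    # windows[i] is the window starting at 0-based index i, i.e. I_{i+k} in the post
    windows = [a[i:i + k] for i in range(len(a) - k + 1)]
    answers = []
    for j in range(len(windows)):
        # Python lists compare lexicographically, which is the order we need
        answers.append(sum(windows[i] <= windows[j] for i in range(j)))
    return answers


print(brute_force([1, 2, 1, 3], 2))  # [0, 1, 1]
```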

Suffix array + LCP array can solve it in $$$O(n\log n)$$$ offline. But what about solving it online? For example, what if $$$a$$$ is an unbounded data stream instead of an array? In the data stream case, you have to process $$$a_j$$$ and answer the query for $$$I_j$$$ before reading $$$a_{j+1}$$$.
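One possible reading of the offline approach is: give every window a rank in sorted order (equal windows share a rank), then sweep over $$$j$$$ and count earlier ranks that are not larger with a Fenwick tree. The sketch below follows that shape under a loud assumption: the ranks come from plainly sorting the windows as tuples, which is a stand-in for the suffix array + LCP step and costs $$$O(nk\log n)$$$ rather than $$$O(n\log n)$$$. The name `offline_answers` is hypothetical.

```python
def offline_answers(a, k):
    """Offline sketch: rank all windows, then count earlier ranks <= current rank."""
    windows = [tuple(a[i:i + k]) for i in range(len(a) - k + 1)]
    # Assumption: plain sorting replaces the suffix array + LCP ranking step.
    # Dense 1-based ranks; equal windows get the same rank.
    rank_of = {w: r for r, w in enumerate(sorted(set(windows)), start=1)}
    tree = [0] * (len(rank_of) + 1)  # Fenwick tree indexed by rank
    answers = []
    for w in windows:
        r = rank_of[w]
        # Prefix sum: number of earlier windows with rank <= r
        i, count = r, 0
        while i > 0:
            count += tree[i]
            i -= i & -i
        answers.append(count)
        # Point update: record the current window
        i = r
        while i < len(tree):
            tree[i] += 1
            i += i & -i
    return answers


print(offline_answers([1, 2, 1, 3], 2))  # [0, 1, 1]
```

This needs the whole array up front to compute the ranks, which is exactly why it does not carry over to the data stream setting.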


»
21 months ago, # |

With hashing + an order statistics tree you can solve it in $$$O(n\log(n)\log(k))$$$. The order statistics tree gives you the position of each $$$I_j$$$ in the sorted sequence of all windows seen so far, for a total of $$$O(n\log(n)\cdot\texttt{cmp})$$$, where $$$\texttt{cmp}$$$ is the time needed to compare two windows. With hashing you can compare two windows in $$$O(\log(k))$$$ by binary searching for the length of their longest common prefix and then comparing the elements at the next index.
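A rough Python sketch of this idea follows, with several assumptions of mine rather than anything stated in the comment: the class name `WindowRanker`, the polynomial hash constants `MOD` and `BASE`, and, most importantly, a plain sorted Python list standing in for the order statistics tree. The list keeps the number of comparisons at $$$O(\log n)$$$ per element, but `list.insert` itself is linear, so a balanced tree would be needed to actually reach the stated bound. Hash equality is probabilistic, so collisions are possible in principle.

```python
MOD = (1 << 61) - 1   # assumed hash modulus (a Mersenne prime)
BASE = 911382323      # assumed hash base, arbitrary


class WindowRanker:
    """Online sketch: feed elements one by one, get the answer for each window."""

    def __init__(self, k):
        self.k = k
        self.a = []        # elements seen so far (0-indexed)
        self.pref = [0]    # pref[i] = polynomial hash of a[0:i]
        self.pw = [1]      # pw[i] = BASE**i mod MOD
        self.order = []    # window start indices, sorted by window content

    def _hash(self, l, r):
        # Hash of a[l:r] derived from prefix hashes
        return (self.pref[r] - self.pref[l] * self.pw[r - l]) % MOD

    def _lcp(self, s, t):
        # Binary search the longest common prefix of the windows at s and t
        lo, hi = 0, self.k
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if self._hash(s, s + mid) == self._hash(t, t + mid):
                lo = mid
            else:
                hi = mid - 1
        return lo

    def _leq(self, s, t):
        # True if window at s <= window at t (equal hashes are treated as equal)
        l = self._lcp(s, t)
        return l == self.k or self.a[s + l] < self.a[t + l]

    def push(self, x):
        """Append x; return the answer for the new window, or None if too short."""
        self.a.append(x)
        self.pref.append((self.pref[-1] * BASE + x) % MOD)
        self.pw.append(self.pw[-1] * BASE % MOD)
        if len(self.a) < self.k:
            return None
        t = len(self.a) - self.k
        # Upper bound: count stored windows that are <= the new window
        lo, hi = 0, len(self.order)
        while lo < hi:
            mid = (lo + hi) // 2
            if self._leq(self.order[mid], t):
                lo = mid + 1
            else:
                hi = mid
        self.order.insert(lo, t)  # linear here; a balanced tree would be O(log n)
        return lo


ranker = WindowRanker(2)
print([ranker.push(x) for x in [1, 2, 1, 3]])  # [None, 0, 1, 1]
```

Each answer is produced before the next element is read, which matches the online requirement. Note that this sketch stores every element and every prefix hash, so its memory grows with the length of the stream.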