Minima/maxima over all fixed-size arrays (multi-dimensional)

Правка en1, от bicsi, 2017-08-12 16:45:17

This article will be presenting a rather classical problem that can be solved using deques, along with an extension that allows you to solve the problem in its more general multi-dimensional case. I have decided to write this article after this discussion on 2D range-minimum query.

The article will be mainly based on this following problem:

You are given an array of numbers A[] of size n and a number k ≤ n. Find the minimum value for each continuous subarray of size k.

We will be now focusing on the linear-time solution to this problem.

Solution:

Consider sweeping from left to right through the array. At every moment we keep a list of "candidates" for minimum values throughout the process. That means that at each moment, you have to add one element to the list and (potentially) remove one element from the list.

The key observation is that, during the sweep line process, we find two values A[i] and A[j] which have i < j and A[i] ≥ A[j], then we can safely discard A[i]. That is because, intuitively, A[j] will continue to "live" in our sweep line more than A[i], and we will never prefer A[i] instead of A[j].

We should now consider pruning all the "useless" values ("useless" as in the statement above). It is easy to see now that doing this will lead to a strictly increasing list of candidates (why?). In this case, the minimum will always be the first element (O(1) query).

In order to insert an element to the back of the pruned candidate list, we will do a stack-like approach of removing all elements that are greater than it, and to erase on element, we just pop the front of the list (if it is not already removed).

This is a well-known approach for finding minima over fixed-size continuous subarrays. I will now present an extensions that allows you to do the same trick in matrices and even multi-dimensional arrays.

The multi-dimensional extension

Problem (2D):

You are given an matrix of numbers A[][] of size n × m and two numbers k ≤ n, l ≤ m. Find the minimum value for each continuous submatrix of size k × l.

Solution:

Consider the matrix as a list of rows. For each row vector of A, use the 1D algorithm to compute the minimum value over all l-length subarrays, and store them in ColMin[][] (obviously, ColMin[][] is now a n × (m - l + 1)-sized matrix).

Now, consider the new matrix as a list of columns. For each column vector of ColMin, use the algorithm to compute the minimum value over all k-length subarrays, and store them in Ans[][] (of size (n - k + 1) × (m - l + 1)).

The Ans[][] is the solution to our problem.

To add picture.

The pseudocode is as follows:

def solve_2d(M, k, l):
  column_minima = {} # empty list
  for each row in M.rows:
    # We suppose we have the algorithm that solves
    # the 1D problem
    min_row = solve_1d(row, l)
    column_minima.append_row(min_row)
  
  ans = {}
  for each col in column_minima.cols:
    min_col = solve_1d(col, k)
    ans.append_col(min_col)
  
  return ans

Note that the pseudocode is (deliberately) hiding some extra complexity of extracting rows / columns and adapting the 1D algorithm to the 2D problem, in order to make the understanding of the solution clearer.

The total complexity of the algorithm can be easily deduced to be O(n * m)

Multi-dimensional case analysis

The solution can be extended to an arbitrary order of dimensions. For a d-dimensional matrix of size s1, s2, ..., sd, the time-complexity of the problem is O(d * s1 * ... * sd), and the memory complexity is O(s1 * ... * sd). This is much better than other algorithms that do the same thing on non-fixed size submatrices (e.g. multi-dimensional RMQ has O(s1 * ... * sd * log(s1) * ... * log(sd)) time and memory complexity).

Finding the best k minima

To be written.

Теги rmq, deque, 2d

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
en2 Английский bicsi 2017-08-13 13:53:14 1716 Tiny change: 'blema/smax\n\nI recommen' -> 'blema/smax. I recommen'
en1 Английский bicsi 2017-08-12 16:45:17 4141 Initial revision (published)