arthur_9548's blog

By arthur_9548, 6 months ago, In English

We hope everyone enjoyed the problems of the XIII UnB Contest Mirror! This editorial contains the description of the solutions and their implementation. Feel free to discuss them in the comments!

Problem A

Solution
Code

Problem B

Solution
Code

Problem C

Solution
Code

Problem D

Solution
Code

Problem E

Solution
Code

Problem F

Solution
Code

Problem G

Solution
Code

Problem H

Solution
Code

Problem I

Solution
Code

Problem J

Solution
Code

Problem K

Solution
Code

Problem L

Solution
Code

Problem M

Solution
Code

Problem N

Solution
Code


By arthur_9548, 6 months ago, In English

Hello, Codeforces!

We are happy to invite you to the XIII UnB Contest Mirror on Nov/04/2025 20:00 (Moscow time). This contest was originally held at the University of Brasília, during our annual University Week.

This contest is a Codeforces Gym, and therefore is NOT an official Codeforces Round and is NOT rated. The contest will follow standard ICPC rules.

The problems were created and prepared by:

Also, we would like to thank:

  • Our problem reviewers and testers: Guilherme Rocha (TheRockRocha), Lucas Gabriel (lucasg05), Nikolle Licá (june16th) and Ruan Petrus (MagePetrus);
  • Víctor Lamarca (VLamarca) for contributing to the preparation of the problems;
  • DOM Judge team for the platform used in the on-site contest;
  • dario2994 for the brilliant pol2dom tool;
  • MikeMirzayanov for the amazing Codeforces and Polygon platforms.

Most of the problems are aimed at those who are starting their journey in competitive programming. However, we also included a few problems we believe will be an interesting challenge for more experienced participants, such as finalists of ICPC Regionals.

Good luck to all participants!

Note: if you participated in the on-site contest, please don't make any comments regarding the problems in this blog. As soon as the Mirror finishes, we will publish an editorial blog — you may discuss the problems there.

Announcement of XIII UnB Contest Mirror

By arthur_9548, 8 months ago, In English

Introduction

This article was started by LeoRiether and concluded by me. I found his idea incredible and offered to help him finish the text he started writing here some years ago. I hope everyone finds it interesting! I think it's the most natural way to understand the KMP algorithm.

When following the examples, you should use LeoRiether's KMP simulator to see for yourself what is happening.

The definitions presented here will not be so rigorous, but will instead follow a more intuitive approach. We will use examples with strings composed of characters from $$$a$$$ to $$$z$$$, but you could do the exact same with any other alphabet (like the natural numbers or whatever you want).

Automaton?

The text will assume you know what a nondeterministic finite automaton (NFA) and a deterministic finite automaton (DFA) are, but I believe that an intuitive comprehension of them is enough to understand the ideas presented. You can think of an automaton as a graph with letters on the edges, where we will call the vertices states. There is a starting state and some accepting states. To use an automaton, you maintain "pointers" to its states/vertices, which will be called "threads".

Feeding a string $$$S$$$ to a nondeterministic finite automaton as input is to feed all of its characters $$$S_i$$$ sequentially, considering that at the start there is a thread at the starting state. To feed a character to an automaton is basically to follow this procedure:

For all current threads: let's say the thread is at the state $$$A$$$. For all edges from $$$A$$$ to some $$$B$$$ that have the character $$$S_i$$$, we will prepare to create a new thread at $$$B$$$ and delete the thread at $$$A$$$. After considering all current threads, first delete all of them and then create all the new ones we had prepared to create.

If at the end there is any thread at any accepting state, the automaton "accepts" the string. You can assume the deterministic version is the same, but each state should have exactly one outgoing edge for each letter (so simulating it is just traversing a graph).

The Problem

We want to solve the string matching problem: given two strings $$$S$$$ and $$$P$$$, find all substrings of $$$S$$$ that match the pattern $$$P$$$. There are other uses for the concepts shown here, but we'll focus on this one first.

A Simple Solution

There's a pretty simple NFA construction that solves this exact problem. We'll feed this automaton the characters of $$$S$$$ as input, and if at any point a thread is at the accepting state, we'll know there's a substring that matches $$$P$$$. Let the starting state have an edge to itself for every character of the alphabet (remember this is the alphabet of the automaton, not necessarily the English alphabet). This loop will create a thread at every input character and leave it there, at the starting node, ready to match the pattern when the other inputs come. Next, we'll make a path containing the characters of the pattern $$$P$$$ at every edge. Here's an example of an NFA that matches the pattern "abacaba":

abacabautomaton

Notice that the $$$\Sigma$$$ edge in the first node represents "an edge for every symbol of the alphabet", and that it includes the letter "a" in this case. Thus, if the automaton receives an "a", the thread in the first node splits into one for the second node and one that keeps looping on the starting node. When this split occurs, one of the threads will keep "walking" to the right while the symbols of the string are matching the forward edges, but if at any point they don't match, the thread "dies". Take a moment to appreciate how this simple NFA solves the matching problem.

An Example

Let's look at an automaton that matches the pattern "abaa". The nodes filled in red represent alive threads.

abaa

We'll input the string "abaab" and see how the threads behave. First, let's input the letter "a".

abaa1

The thread we had before split into two, and now the one on the right will try to walk to the right as far as it can. Now we input the second character of our string, "b".

abaa2

Nothing surprising here. Let's input "a".

abaa3

Now there are 3 threads, look at them go! Now input another "a".

abaa4

Notice that the thread that was on the second node didn't have any edge to follow, so it died, but then the first node split and another thread is now at the second state. Also, we have a filled accepting state, so there's a match with the pattern "abaa"! One last input, "b".

abaa5

We already have a thread matching the "b" from "abaa", even if this first "a" overlaps with our first match. This kind of behaviour allows us to match overlapping patterns efficiently.

Running the Automaton

The naive way to run this automaton is to keep a set of threads and advance them individually. This is, however, not efficient at all. Imagine the pattern is something like "aaaaaa", and we match a string "aaaaaaaaaaaaaaaaa". At every iteration a new thread would be spawned and every other thread would advance, with no deaths. This gives us a time complexity of $$$O(|S| \times |P|)$$$. We can do better by exploiting some properties.
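For concreteness, the naive thread-set simulation can be sketched directly. This is a rough sketch of the procedure described above, not code from the article; `feed_nfa` and the convention of counting how many times a thread reaches the accepting state are my own framing:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Feeds S to the matching NFA of pattern P: state i means "this thread
// has matched P[0..i-1]"; state 0 has the loop over the whole alphabet,
// and state |P| is accepting. Returns how many times some thread reaches
// the accepting state. This is the naive O(|S| * |P|) simulation.
int feed_nfa(const string& P, const string& S) {
    int n = (int)P.size(), matches = 0;
    set<int> threads = {0};            // one thread at the starting state
    for (char c : S) {
        set<int> nxt;
        for (int a : threads)          // advance every thread that matches c
            if (a < n && P[a] == c) nxt.insert(a + 1);
        nxt.insert(0);                 // the loop at the starting state
        threads = nxt;                 // threads with no edge to follow die
        if (threads.count(n)) matches++;
    }
    return matches;
}
```

Running it on the example above, `feed_nfa("abaa", "abaab")` finds exactly one match, with the thread sets evolving just like the pictures.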

How many states are reachable?

There is potential to do better than the naive simulation by analyzing what is called the "subset construction" of our NFA (think of all possible combinations (subsets) of threads that can be alive at the same time). Indeed, only $$$|P|+1$$$ subsets would be reachable in it! The proof is as follows: imagine we've input some string to the automaton and there's some set of alive threads. Let $$$u$$$ be the most advanced (closest to the right) thread alive -- we'll call it the "leader" thread --, and let $$$n$$$ be the number of forward edges (the ones that aren't the loop) it followed. There are no threads to the right of $$$u$$$, so we only need to consider the last $$$n$$$ symbols the automaton has seen, and there's only one length-n string possible because there's only one path the thread could have followed. Because of that, the alive threads to the left of $$$u$$$ are uniquely determined! In other words, if we know the leader, there's only one possible set of alive threads to the left of it. With this, it's possible to conclude that there's exactly one reachable subset in the subset construction for every leader state, which gives us $$$|P|+1$$$ states in total (since we are including the subset with only the starting thread).

An idea one could have now is: when simulating the automaton, what if we only keep track of the leader thread, instead of the entire alive subset? This is exactly what we'll do, but there's a problem: what if the leader dies? We need to know who will occupy his place. Before that, though, we'll need one more concept.

Left-set

It turns out that "there's only one possible set of alive threads to the left of the leader" does not apply exclusively to the leader. In fact, if we know some thread is alive, all of the threads to the left of it can be uniquely determined. The argument is very similar to the one shown before, here's a sketch: if a thread is alive in node $$$n$$$ ($$$0$$$-indexed), then there's one possible string of length $$$n$$$ that must have been fed to the automaton in the last $$$n$$$ steps, thus the threads in smaller positions must have come from this string.

Now we can finally know what to do in the case of a failed match.

Dealing with failure

Suppose we've built the automaton for "abababax" and fed "abababa" to it.

abababax

What would happen if we input the letter "b"? Well, the leader can't go forward, so it dies. The thread immediately to the left of the leader, however, can go forward, and thus becomes the new leader. We'll call this "thread immediately to the left of a thread" the "neighbor". This idea outlines a possible efficient way of simulating our NFA. We'll keep track of the position of the leader at every step. How? There are two cases:

  1. The leader matches the input and goes forward. This case is easy, just increment the current leader position.
  2. The leader fails to match and dies. Then, the neighbor of the leader (let it be at position $$$n_0$$$) may or may not survive. If it does, the new leader is at $$$n_0+1$$$. If it dies, then maybe the neighbor of this other thread (at $$$n_1 \lt n_0$$$) survives. If it does, the new leader is at $$$n_1+1$$$. If it dies, then maybe its neighbor survives. You get the idea; in the worst case, we can always go all the way back to the starting node, which always has an alive thread.

Then, if we could know the neighbor of a thread in $$$O(1)$$$, the entire matching process would take time $$$O(|S|)$$$. Yes, even if we walk through neighbors one by one as they die, the algorithm runs in linear time. To see why, consider that we spawn at most one thread per step, so the total number of dead threads is bounded by $$$|S|$$$. It's like pushing threads to a queue when they are spawned and popping the queue when they die.
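The amortized argument can even be checked empirically: every fallback to a neighbor corresponds to a dying thread, and at most one thread spawns per input character, so the total number of fallbacks over the whole run never exceeds $$$|S|$$$. A sketch (the function name and the counting are mine; the neighbor array is built the same way the next section derives it):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Counts how many times the leader dies and its neighbor takes over
// while matching P against S. The returned count is at most |S|.
long long count_fallbacks(const string& P, const string& S) {
    int n = (int)P.size();
    vector<int> nb(n + 1, 0);          // neighbor array
    auto nxt = [&](int i, char c) -> int {
        for (; i; i = nb[i]) if (i < n && P[i] == c) return i + 1;
        return P[0] == c;
    };
    for (int k = 1; k < n; k++) nb[k+1] = nxt(nb[k], P[k]);

    long long fallbacks = 0;
    int state = 0;                     // position of the leader
    for (char c : S) {
        while (state && !(state < n && P[state] == c)) {
            state = nb[state];         // the leader dies, neighbor takes over
            fallbacks++;
        }
        if (state < n && P[state] == c) state++;
        else state = (P[0] == c);
    }
    return fallbacks;
}
```

The leader's position increases by at most $$$1$$$ per character and each fallback decreases it by at least $$$1$$$, which is exactly the queue argument above.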

Precalculating the Neighbor

When building our automaton, we'll precalculate an array neighbor, where neighbor[u] indicates the neighbor of $$$u$$$, in the case where $$$u$$$ is the leader. If we do that, we'll be able to simulate our automaton as described in "Dealing with failure" in $$$O(|S|)$$$!

It turns out precalculating the neighbor is very similar to what we've seen in "Dealing with failure". Let's compute the neighbor array from left to right. Suppose we've already computed the result for the first $$$k$$$ vertices. How do we compute it for the $$$k+1$$$-th? The only way the leader could move from $$$k$$$ to $$$k+1$$$ is to go through the edge between them, right? This means we got $$$P_k$$$ as input ($$$0$$$-indexed). If neighbor[k] has an edge with the letter $$$P_k$$$, then neighbor[k+1] = neighbor[k]+1. If not, let's check the edge from neighbor[neighbor[k]], and so on, until we succeed or reach the start state. You can use the same argument from the previous section to prove that this is $$$O(|P|)$$$.

In fact, this is the same as calculating who is the new leader when the leader is neighbor[k] and we receive $$$P_k$$$ as input!

Implementation

I'll take this moment to mention LeoRiether's amazing KMP simulator once more. It is very useful for simulating some examples.

Let's get to the actual code. We have concluded that the only thing we need is to calculate the new leader based on the previous leader and the input character, which depends on whether the leader's edge matches the input or not. We can implement the exact logic we've been discussing:

struct KMP {
	string P;
	int n; // n = |P|
	vector<int> neighbor;
	
	KMP(string& p){
		P = p;
		n = (int)P.size();
		neighbor.resize(n+1);
		neighbor[1] = 0; //starting node is always alive
		for(int k = 1; k < n; k++)
			neighbor[k+1] = next_leader(neighbor[k], P[k]);
	}
	
	bool match(int state, char c){
		return state < n and P[state] == c;
	}
	
	int next_leader(int leader, char input){
		if (leader == 0)return P[0] == input; 
		// either advances to 1 or remains in start
		if (match(leader, input))return leader+1;
		else return next_leader(neighbor[leader], input);
	}
};

Whenever the leader is at the position $$$|P|$$$, we have found a match:

int matches(string P, string S){
    KMP kmp(P);
    int count = 0, state = 0;
    for(char c : S){
        state = kmp.next_leader(state, c);
        if (state == (int)P.size())count++;
    }
    return count;
}

I personally prefer a more compact implementation:

struct KMP {
	string P; int n; vector<int> nb;
	
	KMP(string& p) : P(p), n((int)P.size()), nb(n+1) {
		for(int k = 1; k < n; k++) nb[k+1] = nxt(nb[k], P[k]);
	}
	
	int nxt(int i, char c){
		for(; i; i = nb[i])if (i < n and P[i]==c)return i+1;
		return P[0]==c;
	}
};

LeoRiether's implementation can be found here.

You can test your implementation here.

Prefix Function

The most popular interpretation of the KMP algorithm nowadays is as a Prefix Function: for each prefix $$$p$$$ of a string $$$S$$$, it returns the length of the second largest prefix of $$$p$$$ that is also a suffix of it (the first largest is, of course, $$$p$$$ itself).

This is equivalent to the following: when the leader thread is at state $$$|p|$$$, the prefix function of $$$p$$$ is the position of the second most advanced alive thread, which is exactly our neighbor array. You can see this clearly if you think about what happens when we feed the characters of $$$P$$$ to the automaton we have just built.
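The equivalence is easy to verify directly. Below is a sketch that computes the classic prefix function and the neighbor array independently and checks that they agree (with the indexing shift: the neighbor of state $$$k$$$ equals the prefix function of the length-$$$k$$$ prefix). The helper names are mine:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Classic prefix function: pi[i] = length of the longest proper prefix
// of P[0..i] that is also a suffix of it.
vector<int> prefix_function(const string& P) {
    int n = (int)P.size();
    vector<int> pi(n, 0);
    for (int i = 1; i < n; i++) {
        int j = pi[i-1];
        while (j && P[i] != P[j]) j = pi[j-1];
        pi[i] = j + (P[i] == P[j]);
    }
    return pi;
}

// The neighbor array, built exactly as in the article's KMP struct.
vector<int> neighbor_array(const string& P) {
    int n = (int)P.size();
    vector<int> nb(n + 1, 0);
    auto nxt = [&](int i, char c) -> int {
        for (; i; i = nb[i]) if (i < n && P[i] == c) return i + 1;
        return P[0] == c;
    };
    for (int k = 1; k < n; k++) nb[k+1] = nxt(nb[k], P[k]);
    return nb;
}

// Checks that nb[k] == pi[k-1] for every prefix length k >= 1.
bool equivalent(const string& P) {
    auto pi = prefix_function(P);
    auto nb = neighbor_array(P);
    for (int k = 1; k <= (int)P.size(); k++)
        if (nb[k] != pi[k-1]) return false;
    return true;
}
```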

The KMP Automaton

We have seen how to solve the string matching problem in $$$O(|P| + |S|)$$$. However, in some problems (like this one), we may want to store something like: for all possible leaders and input characters, what would be the new leader. This means constructing an explicit DFA that is equivalent to the NFA we were simulating earlier. When people talk about the "KMP Automaton", this is usually what they mean.

This construction can be made using dynamic programming. If the result has been calculated for the first $$$k$$$ states, for the $$$k+1$$$-th we iterate each character $$$c$$$ of the alphabet. Then, we have two possibilities:

  1. If $$$P_k = c$$$: the new leader is $$$k+1$$$.
  2. If not, the answer is the same as the one already calculated at the state's neighbor.

You can add the following to the KMP implementation to achieve this (considering the alphabet from $$$a$$$ to $$$z$$$):

    vector<vector<int>> dfa;
    void build_dfa(){
        dfa.assign(n+1, vector<int>(26, 0));
        dfa[0][P[0]-'a'] = 1; //only way to advance at 0
        for(int k = 1; k <= n; k++)
            for(int c = 0; c < 26; c++)
                if (k < n and P[k] == 'a'+c) dfa[k][c] = k+1;
                else dfa[k][c] = dfa[neighbor[k]][c];
    }
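Once the table is built, matching becomes a pure table walk: one lookup per character, with no neighbor-chasing at query time. A self-contained sketch (the function name is mine; it assumes a nonempty lowercase pattern, as in the snippet above):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Builds the neighbor array and the explicit KMP DFA for pattern P,
// then counts matches of P in S using only table lookups.
int count_matches_dfa(const string& P, const string& S) {
    int n = (int)P.size();
    vector<int> nb(n + 1, 0);
    auto nxt = [&](int i, char c) -> int {
        for (; i; i = nb[i]) if (i < n && P[i] == c) return i + 1;
        return P[0] == c;
    };
    for (int k = 1; k < n; k++) nb[k+1] = nxt(nb[k], P[k]);

    // dfa[k][c] = new leader when the leader is at k and we read 'a'+c
    vector<array<int,26>> dfa(n + 1);
    dfa[0].fill(0);
    dfa[0][P[0]-'a'] = 1;                    // only way to advance at 0
    for (int k = 1; k <= n; k++)
        for (int c = 0; c < 26; c++)
            dfa[k][c] = (k < n && P[k] == 'a'+c) ? k+1 : dfa[nb[k]][c];

    int state = 0, count = 0;
    for (char c : S) {
        state = dfa[state][c-'a'];           // O(1) per character
        if (state == n) count++;
    }
    return count;
}
```

The trade-off versus the plain neighbor array is the $$$O(|P| \cdot |\Sigma|)$$$ memory and build time, which is what some problems (like the one linked above) actually need.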

Conclusion

I'd like to thank LeoRiether again for letting me complete his work and share the final result with the Codeforces community. Also, thanks to duduFreire and felipe_massa for the excellent review on this blog.

I hope you liked the ideas presented as much as I did!


By arthur_9548, history, 11 months ago, In English

Hey everyone! I hope everyone enjoyed the problems of the VI UnBalloon Contest Mirror. This editorial contains the description of the solutions and their implementation. Feel free to discuss them in the comments!

Problem A

Solution
Code

Problem B

Solution
Code

Problem C

Solution
Code

Problem D

Solution
Code

Problem E

Solution
Code

Problem F

Solution
Code

Problem G

Solution
Code

Problem H

Solution
Code

Problem I

Solution
Code

Problem J

Solution
Code

Problem K

Solution
Code

Problem L

Solution
Code

Problem M

Solution
Code

Problem N

Solution
Code


By arthur_9548, 12 months ago, In English

Hello, Codeforces!

We are happy to invite you to the VI UnBalloon Contest Mirror on May/18/2025 20:00 (Moscow time). UnBalloon is the competitive programming community of the University of Brasília.

This contest is a Codeforces Gym, and therefore is NOT an official Codeforces Round and is NOT rated. The contest will follow standard ICPC rules.

All problems were created and prepared by me, duduFreire, lucassala, PedroGallo, MagePetrus and wallacelw.

Also, we would like to thank:

Most of the problems are aimed at those who are starting their journey in competitive programming. However, we also included a few problems we believe will be an interesting challenge for more experienced participants, such as finalists of ICPC Regionals. We hope the contest is balanced enough to have all kinds of teams thinking about challenging tasks until the end!

Good luck to all participants!

Note: if you participated in the on-site contest, please don't make any kind of comment regarding the problems in this blog. As soon as the Mirror finishes, we will publish an editorial — you may discuss the problems there.

Announcement of VI UnBalloon Contest Mirror

By arthur_9548, history, 15 months ago, In English

Hello, Codeforces!

I'm Arthur Botelho (arthur_9548), and this is my first blog. After reading this blog by mouse_wireless and solving this problem, my perspective on multidimensionality changed. I decided to implement other multidimensional data structures and solve problems using them. Now, I wish to explain my understanding of this subject and my implementations of these structures. I hope it will be useful or, at least, interesting.

Summary

In this blog, we will discuss the general idea of Range Query Data Structures in one dimension and extrapolate the concepts to higher dimensions. We will also study implementations of multidimensional versions of these four structures:

  • Prefix/Partial/Cumulative Sum (Psum)
  • Disjoint Sparse Table (DiST)
  • Binary Indexed/Fenwick Tree (BIT)
  • Segment Tree (SegTree)

The idea of the implementations is to be scalable, generic, understandable and efficient in time complexity. It is possible to make small optimizations which improve the runtime and adapt the structures to work better in specific problems, but this is not the goal of this blog.

Customized versions of these implementations are available at my Competitive Programming repository.

Prerequisites

Links about these topics and more will appear throughout the blog. It is not necessary to know everything — I'm not a specialist in any of these myself. Just a basic notion is enough.

Blog Organization

  • Introduction: 1D Range Queries
  • Multidimensional Range Queries
  • Data Structures Implementation:
    • Static Range Queries:
      • Psum
      • DiST
    • Range Queries + Point Updates:
      • BIT
      • SegTree
  • Conclusion

This blog has ended up bigger than I expected, but I believe that's mostly due to the code used in the explanations. You also don't need to read every section if you don't want to. The first two are the theoretical basis for all structures — they may be important even if you already know the subject, since I will use definitions and concepts from them later on. On the other hand, the subsections of "Data Structures Implementation" are more or less independent.

Introduction: 1D Range Queries

From now on, I may refer to Data Structures as DS and Range Query Data Structures as RQDS.

We know that RQDS are used to solve problems of the form:

Let $$$*$$$ be the symbol for an associative binary operation and let $$$A$$$ be a list of elements, where $$$A_i$$$ is the element at the $$$(i+1)$$$-th position in the list (that is, $$$0$$$-indexed). Given queries of the form $$$[l,\ r]$$$, calculate the value of:

$$$ A_l * A_{l+1} * A_{l+2} * \ldots * A_{r-2} * A_{r-1} * A_r $$$

This abstract definition is meant to reflect that we can solve this problem in the same way for many different types of elements and associative operations. Some concrete examples:

  • Given a list of numbers, find the sum of the numbers in positions $$$[l,\ r]$$$.
  • Given a list of matrices, find the product of matrices in positions $$$[l,\ r]$$$.

The main idea is to store the results of applying the operation in some specific ranges, such as prefixes of the list $$$A$$$ or some subarrays of $$$A$$$ whose size is a power of $$$2$$$.

If some elements $$$A_i$$$ are changed between queries, we are dealing with range queries and point updates (we won't discuss range queries and range updates here). If the list $$$A$$$ doesn't change, we are dealing with static range queries.

The associative binary operation is also important. Let's suppose this operation has an identity/neutral element $$$id$$$ (if it doesn't, we can always redefine the operation to include one more element such that this new element becomes the identity). Then, we are working with a Monoid. If the operation is invertible, which means for every $$$x$$$ there is an element $$$x^{-1}$$$ such that $$$x * x^{-1} = x^{-1} * x = id$$$, the Monoid is actually a Group.

Depending on the problem faced and operation type, we will use different data structures:

  • Groups and static range queries: Psum
  • Monoids and static range queries: DiST
  • Groups and range queries + point updates: BIT
  • Monoids and range queries + point updates: SegTree

Also, if the operation is idempotent, which means $$$x * x = x$$$ for any $$$x$$$, we have an Idempotent Monoid. The Sparse Table structure works with Idempotent Monoids and static range queries. We won't discuss it here, but I've made a multidimensional implementation of it as well (it's in the repository I've mentioned earlier).

In this blog, our data structures will be implemented in a way to make it easier to modify the operation type and the structure size. The definition of the operation and the element's type will be made outside the DS (passed through a template parameter) and the size will be passed as a parameter to the constructor (usually read from input), making it a generic implementation (similar to what is done in this blog).

The decision to code them as a C++ struct aims to make the usage and creation of multiple structures easier. For example: imagine that you want to create a list of $$$N$$$ Psums (where $$$N$$$ is a variable), or that you have one SegTree that multiplies integers and another SegTree that sums floats. You'll probably want to avoid having to copy/paste code and manually change variables, functions and types.

To sum up, our RQDS code will look like this:

template<class S> //information on the operation and element type
struct DataStructure{ //implementation of the DS, doesn't depend on operation
    using T = typename S::T; //to reduce verbosity. T is the element type
    int n; vector<T> v; //size and structure itself
    DataStructure(int s):n(s),v(n, S::id){} //initialization example
    //the line above is making use of C++ Member Initializer lists
    T query(int l, int r){ //range query
    /* we will suppose "merge" uses stored information of some
    specific ranges to produce the answer for any range */
        return merge(l, r);
    }
};

struct AlgebraicStructure{ //implementation of operation, doesn't depend on DS
    using T = some_type; //element type
    static constexpr T id = some_value; //operation identity
    static T op(T a, T b){return a * b;} //some associative operation
};

void usage(){
    DataStructure<AlgebraicStructure> ds(size);
    //...
    //after structure is ready, answer queries of [l, r] like this:
    cout << ds.query(l, r) << endl; //0-indexed
}

We will get to concrete examples of this implementation later, when viewing specific data structures. Depending on whether we want point updates or not, different additional methods will be implemented to insert data into the structure. Also, look here if the identity $$$id$$$ cannot be made constexpr.
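To make the skeleton concrete before moving to higher dimensions, here is one possible 1D instantiation: a prefix-sum structure for a group. The names `Psum` and `SumGroup`, the constructor taking the whole list, and making `op` static are my choices for this sketch, not fixed by the blog:

```cpp
#include <bits/stdc++.h>
using namespace std;

struct SumGroup { // an AlgebraicStructure: (long long, +) is a group
    using T = long long;
    static constexpr T id = 0;                // identity of +
    static T op(T a, T b) { return a + b; }   // associative operation
    static T inv(T a) { return -a; }          // inverse: what makes it a group
};

template<class S> // works for any group providing op, id and inv
struct Psum {
    using T = typename S::T;
    int n; vector<T> pre; // pre[i] = fold of the first i elements
    Psum(const vector<T>& a) : n((int)a.size()), pre(n + 1, S::id) {
        for (int i = 0; i < n; i++) pre[i+1] = S::op(pre[i], a[i]);
    }
    T query(int l, int r) { // 0-indexed, inclusive, as in ds.query(l, r)
        return S::op(pre[r+1], S::inv(pre[l]));
    }
};
```

For example, `Psum<SumGroup> ps({1, 2, 3, 4})` answers `ps.query(1, 2) == 5`. The inverse in `query` is exactly why Psum requires a group and not just a monoid.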

Multidimensional Range Queries

I'm supposing you already know how to efficiently solve most types of range query problems in lists, i.e., in dimension $$$1$$$. However, we may want to solve similar problems in higher dimensions. The range query problem we will be solving is as follows:

Let $$$*$$$ be the symbol for a commutative associative binary operation and let $$$A$$$ be a set of points, where each point is represented by its integer coordinates $$$(x,\ y,\ z,\ \ldots)$$$ and has a value. Given queries of the form $$$[(x_1,\ y_1,\ z_1, \ \ldots), \ (x_2,\ y_2,\ z_2,\ \ldots)]$$$ calculate the result of applying the operation to the values of all points whose coordinates satisfy:

$$$x_1 \leq x \leq x_2, \ y_1 \leq y \leq y_2, \ z_1 \leq z \leq z_2,\ \ldots$$$

As an example, we may imagine we have a grid ($$$2D$$$) filled with numbers and queries regarding the sum of numbers in subgrids of this grid, or a $$$3D$$$ space where each point is associated to a polynomial and queries regarding the product of all polynomials in a cuboid whose edges are aligned with the space axes.

The general idea will be the same as the $$$1D$$$ case: calculate the results of applying the operation in some specific ranges and answer queries by efficiently merging these results. In fact, we are going to proceed essentially the same way. The only difference is that, instead of merging single values (the results of ranges), we will be merging data structures.

Let's study an example problem: given a grid with numbers, answer queries of calculating the sum of all numbers inside a given subgrid. Suppose we want to calculate sums of subgrids of the following grid:

$$$a \ b$$$
$$$c \ d$$$

For this example, let's try to create a data structure that will calculate the sum of numbers for any subgrid $$$[(x_1,\ y_1),\ (x_2,\ y_2)]$$$.

We already know how to calculate sums in rows ($$$1D$$$). Let's calculate, for each row, the list of values we get by applying the operation (sum) in the column ranges: $$$[0,\ 0]$$$, $$$[1,\ 1]$$$ and $$$[0,\ 1]$$$. For the first row, we will have the list: $$$a,\ b,\ a+b$$$. For the second: $$$c,\ d,\ c+d$$$. With these, we can calculate the sum of elements in any range of a single row.

However, if the range of a query contains more than one row, we can't seem to get the answer immediately with the lists we have just calculated. We will overcome this by extrapolating the process: let's calculate the list of lists of values we get by applying the operation between corresponding elements of the lists associated to rows in the row ranges: $$$[0,\ 0]$$$, $$$[1,\ 1]$$$ and $$$[0,\ 1]$$$. This will yield the following:

  1. For $$$[0,\ 0]$$$: $$$a,\ b,\ a+b$$$.
  2. For $$$[1,\ 1]$$$: $$$c,\ d,\ c+d$$$.
  3. For $$$[0,\ 1]$$$: $$$a+c,\ b+d,\ a+b+c+d$$$.

Now, we are able to answer any subgrid query of the form $$$[(x_1,\ y_1), \ (x_2,\ y_2)]$$$. For all possible ranges in the first dimension ($$$[x_1,\ x_2]$$$), we have a list associated to it which can solve queries of all possible ranges in the second dimension ($$$[y_1,\ y_2]$$$). Let's solve the query $$$[(0,\ 1),\ (1,\ 1)]$$$:

  1. The list associated to the range of the first dimension ($$$[0,\ 1]$$$) is $$$a+c,\ b+d,\ a+b+c+d$$$.
  2. The element of this list associated to the range of the second dimension ($$$[1,\ 1]$$$) is $$$b+d$$$. This is the answer.

Despite being a very simple example, this is very similar to how an actual $$$2$$$x$$$2$$$ $$$2D$$$ SegTree would work. Of course, the idea is not calculating the query answer for all ranges in all dimensions, but instead calculating for the same ranges we would in the $$$1D$$$ case.
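The worked example above is precisely how a $$$2D$$$ prefix sum (the multidimensional Psum) behaves. A minimal non-generic sketch for sums, just to ground the idea (the struct name and layout are mine):

```cpp
#include <bits/stdc++.h>
using namespace std;

// 2D prefix sums over a grid g. The blog's generic MDRQ template
// generalizes this to any dimension and any commutative group.
struct Psum2D {
    int n, m;
    vector<vector<long long>> pre; // pre[i][j] = sum of g[0..i)[0..j)
    Psum2D(const vector<vector<long long>>& g)
        : n((int)g.size()), m((int)g[0].size()),
          pre(n + 1, vector<long long>(m + 1, 0)) {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++)
                pre[i+1][j+1] = g[i][j] + pre[i][j+1] + pre[i+1][j] - pre[i][j];
    }
    // sum over rows [x1, x2] and columns [y1, y2], 0-indexed inclusive
    long long query(int x1, int y1, int x2, int y2) {
        return pre[x2+1][y2+1] - pre[x1][y2+1] - pre[x2+1][y1] + pre[x1][y1];
    }
};
```

On the $$$2$$$x$$$2$$$ grid with values $$$a=1,\ b=2,\ c=3,\ d=4$$$, the query $$$[(0,\ 1),\ (1,\ 1)]$$$ returns $$$b+d = 6$$$, matching the worked example.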

You should notice the restriction of commutativity in the operation. When dealing with multidimensional queries, we will assume there is no specific order in which we should apply the operation. All the reasoning and implementations in this blog are based on this. I don't know if it is possible to answer non-commutative multidimensional range queries with multidimensional versions of the structures we're working with — at least, I wasn't able to do so without heavily worsening the time complexity or removing the generality of the structures.

Why?

Generalizing the $$$2D$$$ example, the idea will be maintaining data structures of data structures. In dimension $$$D \gt 0$$$, our multidimensional RQDS will maintain a list of internal DS of dimension $$$D-1$$$, where each of them is associated to some $$$D$$$-dimensional range (usually prefixes or ranges whose size is a power of two). The $$$0D$$$ data structures will simply contain single values, since they represent points. This way, we can see that the $$$1D$$$ RQDS we know so far actually operate on lists of $$$0$$$-dimensional data structures.

Suppose we want to answer a $$$D$$$-dimensional range query $$$[(x_1,\ y_1,\ z_1, \ \ldots), \ (x_2,\ y_2,\ z_2,\ \ldots)]$$$. The range related to dimension $$$D$$$ is $$$[x_1,\ x_2]$$$, and we will suppose (recursively) that our internal data structures will be able to solve $$$(D-1)$$$-dimensional range queries of the form $$$[(y_1,\ z_1, \ \ldots), \ (y_2,\ z_2,\ \ldots)]$$$ — the base case, of course, is a $$$0$$$-dimensional query $$$[(),\ ()]$$$: simply returning a value. What we want to do is to be able to propagate this $$$(D-1)$$$-dimensional query to as few internal data structures as possible, using the fact that they represent ranges in dimension $$$D$$$. Each structure will do this in a different way, but all of them build their list of internal DS in a similar fashion: merging data structures associated to smaller ranges.

Let's understand a little better what it means to merge data structures. Suppose we are working in a dimension $$$D \gt 1$$$ and currently have the following list of internal $$$D-1$$$ dimensional data structures:

  1. $$$DS_{0,0}$$$: associated to the range $$$[0,\ 0]$$$ in $$$D$$$.
  2. $$$DS_{1,1}$$$: associated to the range $$$[1,\ 1]$$$ in $$$D$$$.

If the query range in $$$D$$$ is $$$[0,\ 0]$$$ or $$$[1,\ 1]$$$, we already know where to propagate the query. However, we may want to merge $$$DS_{0,0}$$$ and $$$DS_{1,1}$$$ to form $$$DS_{0,1}$$$, to whom we will propagate queries if the range in $$$D$$$ is $$$[0,\ 1]$$$. For this, suppose (recursively) that we are already able to merge $$$(D-2)$$$-dimensional data structures. Then, we can define their merging as follows:

For all the ranges $$$[l,\ r]$$$ for which $$$DS_{0,0}$$$ and $$$DS_{1,1}$$$ calculate merges of their internal data structures, we will proceed in the same way — let's look to a particular range $$$[i,\ j]$$$. Suppose $$$DS_a$$$ is the merge of $$$[i,\ j]$$$ in $$$DS_{0,0}$$$ and $$$DS_b$$$ is the same in $$$DS_{1,1}$$$. $$$DS_{0,1}$$$ will maintain, for $$$[i,\ j]$$$, the merge of $$$DS_a$$$ and $$$DS_b$$$ (which are $$$D-2$$$ dimensional). Note that this is recursively defined (base case: $$$DS_a$$$ and $$$DS_b$$$ are $$$0$$$-dimensional (just values) and their merge is $$$DS_a * DS_b$$$ ).

A small 3D example

Now, let's see how we do this in practice using the RQDS we know and solve actual problems. From now on, I may refer to Multidimensional Range Query Data Structures as MDRQDS.

Data Structures Implementation

We will make use of template metaprogramming and ellipses (the "...") in C++ for the recursive definition of data structures and their merging. Our code will look like this:

#define MAs template<class... As> //we are going to type this a lot
//MAs stands for multiple arguments
template<int D, class S> //dimension and algebraic structure
struct MDRQ{ //multidimensional case
    using T = typename S::T; //to further reduce verbosity
    int n; vector<MDRQ<D-1, S>> v; //size and our list of data structures
    MAs MDRQ(int s, As... ds):n(s),v(n, MDRQ<D-1, S>(ds...)){}
    /* in this dimension, we save our size and propagate the
    sizes of other dimensions (ds) to our internal structures */
    MAs T query(int l, int r, As... ps){ //the query range in this dimension is [l, r]
        /* suppose "merge" uses stored information of some specific ranges
        to efficiently produce the merge of the structures in any range */
        return merge(l, r).query(ps...);
        /* we are propagating the query to lower dimensions,
        passing forward their query ranges (ps) */
    }
};

template<class S>
struct MDRQ<0, S>{ //base case: dimension 0
    using T = typename S::T;
    T val = S::id; //value starts with identity
    T query(){return val;} //0D query
};

This way, we will be able to use it like this:

void example(){ //3D example
    MDRQ<3, SomeAlgebraicStructure> ds(3, 4, 5);
    /* this creates a 3x4x5 3D data structure
    based on "SomeAlgebraicStructure" */
    //...
    /* after we are ready to answer queries of the form
    [(x1, y1, z1), (x2, y2, z2)], we do as follows: */
    cout << ds.query(x1, x2, y1, y2, z1, z2) << endl; //also 0-indexed
}

We can see that the code matches our reasoning: we define how the DS should work in any positive dimension (by merging lower dimension data structures) and define the base cases for recursive processes. It may seem magical, but you can think of it this way: every time you declare MDRQ<X, S> (an MDRQDS with dimension $$$X$$$, $$$X \gt 0$$$), the compiler will ensure that all versions of MDRQ<D, S> for all non-negative values of D up to $$$X$$$ exist, and it will create the missing ones. This usage of templates provides the correct number of arguments for each method version and correct typing for each variable in all dimensions — all of this is resolved at compile time.
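For concreteness, the structures we pass as S are assumed throughout to expose a value type T, an identity id, and an operation op (plus inv, for the Group-based structures). The exact member forms below (e.g. id being a static constant) are my assumption based on how the code uses them; a Group for range sums could look like:

```cpp
#include <cstdint>

// A sketch of the algebraic-structure interface assumed by MDRQ:
// value type T, identity element id, operation op and (for Groups) inverse inv.
// This one models addition of 64-bit integers.
struct SumGroup {
    using T = int64_t;
    static constexpr T id = 0;              // identity of addition
    static T op(T a, T b){ return a + b; }  // the group operation
    static T inv(T a){ return -a; }         // inverse; only Group-based DS need it
};
```

With this, `MDRQ<3, SumGroup>` would answer 3D range-sum queries.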

Notice that this implementation is not sparse/dynamically sized: we define beforehand the corresponding size in each dimension. I believe that for most structures this simplifies the implementation and usage, but I think it's also possible to implement some of them the other way. However, it may significantly affect the running time in problems where this wouldn't be necessary. Also, notice that the number of dimensions of our structures should be known beforehand in order to instantiate them, as it's a template parameter (if the dimension of the problem is variable you may have to do some ifs to solve the problem in the correct dimension).
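That last point can be sketched like this (solve<D> here is a hypothetical placeholder for a dimension-specific solution, not something from the code above): since the dimension is a template parameter, every supported value must be instantiated explicitly and selected at runtime.

```cpp
// Dispatching a runtime-chosen dimension to compile-time instantiations.
// solve<D>() stands in for "build an MDRQ<D, S> and solve the problem";
// here it just returns D so the dispatch itself can be checked.
template<int D> int solve(){ return D; } // placeholder body for illustration

int solve_for(int dim){
    if (dim == 1) return solve<1>();
    if (dim == 2) return solve<2>();
    if (dim == 3) return solve<3>();
    return -1; // dimension not supported by this build
}
```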

That will be the skeleton for our implementation of MDRQDS. Now, depending on the type of problem we want to solve and the restrictions on the operation to be applied in the range, we will insert the data into the structure and answer queries in different ways.

Static Range Queries

The general idea to solve static range query problems is to do some preprocessing and then answer queries very quickly. We should add these two methods in our implementation:

//... In the multidimensional version:
    MAs void set(T x, int p, As... ps){
        //we want to set the value of the point (p, ps...) to x
        v[p].set(x, ps...); 
        /* the structure corresponding to the coordinate p (or range [p, p])
        will propagate the recursion in the correct direction */
    }
    void init(){
        //we will call this to do the preprocessing, becoming able to answer queries
    }
//...
//... In the 0D version:
    void set(T x){val = x;} //setting the value to x
    void init(){} //no preprocessing needed
//...

The interface will end up looking like this:

void example(){
    MDRQ<3, S> ds(3, 4, 5); //3x4x5 3D structure based on S
    ds.set(val_1, 0, 2, 1); //point (0, 2, 1) has value = val_1
    ds.set(val_2, 2, 3, 0); //point (2, 3, 0) has value = val_2
    ds.set(val_3, 1, 0, 4); //point (1, 0, 4) has value = val_3
    ds.init(); //ready to answer queries
    cout << ds.query(0, 1, 0, 3, 0, 4) << endl; //only val_1 and val_3 influence this query
}

We can see how simple it will be in practice to use the structures to solve multidimensional problems. However, we have to actually develop these structures before solving anything.

Psum

As mentioned before, Psums are used to answer static range queries when we are working with a Group operation (like sums and xor), which for higher dimensions should be commutative (i.e. an Abelian Group). The idea in dimension $$$1$$$ is to calculate the answer for all prefixes of a list — we preprocess this in $$$O(n)$$$, where $$$n$$$ is the list size. Now, we can answer queries in $$$O(1)$$$: answer($$$[l,\ r]$$$) = answer($$$[0,\ r]$$$) $$$*$$$ (answer($$$[0,\ l-1]$$$))$$$^{-1}$$$, where $$$*$$$ is the Group operation and $$$a^{-1}$$$ is the inverse of $$$a$$$. We will proceed exactly the same way in any dimension $$$D$$$.

Note: in our Psum code, we will use $$$1$$$-indexing, because it makes the implementation easier. However, we will consider that set and query will always be called with $$$0$$$-indexing.

Firstly, we should process the prefixes. Suppose we have a $$$n$$$-sized list of $$$(D-1)$$$-dimensional Psums, and each of them has already concluded the preprocessing for lower dimensions. Now, we will, for each prefix of the list, simply calculate the merge of all Psums contained in it:

//... In the multidimensional version:
    void init(){
        //we will use 1-indexing to access internal Psums, located in "v", which has size n
        for(int i = 1; i < n; i++){
            v[i].init(); //lower dimensions processed first
            v[i].merge(v[i-1]); //v[i] will merge itself with v[i-1]
            //v[0] is a Psum filled with S::id and v[i] is the merge of Psums representing [0, i-1]
        }
    }
    void merge(Psum& p){ //merging psums
        for(int i = 1; i < n; i++){
            v[i].merge(p.v[i]); //recursive definition of merge
        }
    }
//...
//... In the 0D version:
    //merging in dimension 0 is simply applying the operation
    void merge(Psum& p){val = S::op(val, p.val);} 
//...

We can use induction to prove that, if all dimensions have size $$$n$$$ (we will always assume this in calculations for simplicity), we are in dimension $$$D$$$, and S::op has complexity $$$op$$$, the init method has complexity $$$O(D \cdot op \cdot n^D)$$$.

After finishing the preprocessing, we will want to answer queries. We can simply use the same logic for the $$$1D$$$ case, but propagating the query to lower dimensions:

//...
//If S is a group, it should have an inv() function: returning the inverse of the element
    MAs T query(int l, int r, As... ps){ //multidimensional query
        return S::op(v[r+1].query(ps...), S::inv(v[l].query(ps...))); //ans([0, r]) * (ans([0, l-1]))^-1
        /* note the use of 1-indexing: when l is 0, instead of accessing an invalid position,
        we will just be accessing a Psum filled with identities, which won't cause errors */
    }
//...

If $$$inv$$$ is the complexity of S::inv, we can see that this is $$$O((op + inv) \cdot 2^D)$$$. Joining all the code, we get the following (fully functional) implementation:

Code

The memory used will be something like $$$n^D \cdot sizeof(T)$$$.

You can use this structure in the following problems:

DiST

If we have to work with Monoid operations (like multiplication and minimum/maximum) instead of Groups, we can use DiST for static range queries. This DS is not as well-known as the others; you can read more about it here and here. The idea is similar to Segment Trees and Merge Sort Trees: we will consecutively divide our list in halves and each node will maintain information about a sublist of our list.

What a $$$1D$$$ DiST does is to store, for each of these sublists, all steps of calculating the operation from the middle of the sublist to its beginning and to its end (with $$$O(n \cdot log(n))$$$ total complexity). To answer a query $$$[l,\ r]$$$, we will locate the node in the DiST which can answer the query in $$$O(1)$$$ by merging the answers from its middle to $$$l$$$ and from its middle to $$$r$$$. This location is found using bitwise operations.

Note: we will use $$$0$$$-indexing for DiST, since it works well with the common implementation.

DiST requires that its size, $$$n$$$, is a power of two. We will also store the value of $$$h = log_2(n)$$$. If lg(x) returns $$$\lfloor log_2(x) \rfloor$$$, we can initialize our structure like this:

#define MAs template<class... As>
template<int D, class S>
struct DiST{
    using T = typename S::T;
    int lg(int x){return __builtin_clz(1)-__builtin_clz(x);} //log_2 floor
    int n, h; vector<vector<DiST<D-1, S>>> v; //DiST stores a matrix instead of a list
    MAs DiST(int s, As... ds):n(1<<(lg(s)+(s!=(1<<lg(s))))), //n is the smallest power of two >= s
    h(lg(n)), v(h+(n==1), vector(n, DiST<D-1, S>(ds...))){} //storing h and dealing with corner case n = 1
//...

As always, we will process lower dimensions recursively. Initially, we will have a $$$n$$$-sized list of $$$(D-1)$$$-dimensional DiSTs. Now, we should consecutively partition the list in blocks of size $$$s$$$, where $$$s$$$ is a power of two. In each block, we will merge the DiSTs from the middle to the left and from the middle to the right and store each step:

//... In the multidimensional version:
    void init(){
        for(int i = 0; i < n; i++)v[0][i].init(); //v[0] is the base list of DiSTs
        //lower dimensions have already been processed
        for(int d = 1, s = 2; d < h; d++, s *= 2) //blocks of size 2s, s = 2^d
        for(int m = s; m < n; m += 2*s){ //iterating the middle of each block
            v[d][m] = v[0][m]; v[d][m-1] = v[0][m-1]; //middle of the block is between m and m-1
            //having processed the middle, we will merge structures in both directions:
            //from the middle to the right
            for(int i = m+1; i < m+s; i++)v[d][i].merge(v[d][i-1], v[0][i]);
            //from the middle to the left
            for(int i = m-2; i >= m-s; i--)v[d][i].merge(v[0][i], v[d][i+1]);
            /* v[d][i] is some information on a block of size 2^(d+1). It may be the middle of a block or
            the result of merging data structures from the middle to the left or right of the block */
        }
    }
    void merge(DiST& a, DiST& b){ //merging two DiSTs
        for(int d = 0; d < h; d++)
            for(int i = 0; i < n; i++)
                v[d][i].merge(a.v[d][i], b.v[d][i]); //recursive definition of merge
    }
//...
//... In the 0D version:
    void merge(DiST& a, DiST& b){val = S::op(a.val, b.val);} //base case: applying the operation
    void init(){} //no preprocessing needed
//...

You may verify that the complexity of this process is $$$O(op \cdot (2 \cdot n \cdot log(2 \cdot n))^D)$$$, where $$$n$$$ is the original size of the structure. The $$$2$$$ factors are due to rounding up $$$n$$$ to the nearest power of two.

To answer queries, we should firstly deal with the trivial case $$$l = r$$$, in which we can simply propagate the query to the $$$l$$$-th DiST in our list. The basis of the DiST's efficiency is that, if $$$l \neq r$$$, we can take the value $$$k = $$$ lg( $$$l \oplus r$$$ ) and be sure that a block of size $$$2^{k+1}$$$ has $$$l$$$ located to the left and $$$r$$$ to the right of its middle. Since we already processed this block, the query becomes simple:

//...
    MAs T query(int l, int r, As... ps){ //multidimensional query
        if (l==r)return v[0][l].query(ps...); //trivial case
        int k = lg(l^r);
        return S::op(v[k][l].query(ps...), v[k][r].query(ps...));
        /* v[k][l] is the merge from l to the middle of some block of size 2^(k+1) and v[k][r] is the merge
        from this middle to r. Thus, merging their answers will be the same as merging the whole range [l, r] */
    }
//...

Similarly to Psum query, this process is $$$O(op \cdot 2^D)$$$. The implementation is done:

Code

The memory used may get to $$$(2 \cdot n \cdot log(2 \cdot n))^D \cdot sizeof(T)$$$ due to power of two rounding.
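To see why $$$k = $$$ lg( $$$l \oplus r$$$ ) locates the right block, here is a small self-check (my own, not part of the DiST code): for $$$0 \le l \lt r$$$, the aligned block of size $$$2^{k+1}$$$ containing $$$l$$$ also contains $$$r$$$, with its middle strictly after $$$l$$$ and at or before $$$r$$$.

```cpp
// Verifying the DiST locating trick: for 0 <= l < r, with k = floor(log2(l^r)),
// l and r share an aligned block of size 2^(k+1) whose middle separates them.
int lg(int x){ return 31 - __builtin_clz(x); } // floor(log2(x)), x > 0

bool block_property(int l, int r){       // assumes 0 <= l < r
    int k = lg(l ^ r);
    int block = 2 << k;                  // block size 2^(k+1)
    int start = (l / block) * block;     // start of the aligned block containing l
    int mid = start + (block >> 1);      // middle of that block
    return start <= l && r < start + block && l < mid && mid <= r;
}
```

Intuitively, $$$k$$$ is the highest bit where $$$l$$$ and $$$r$$$ differ, so they agree on all bits above $$$k$$$ (same block of size $$$2^{k+1}$$$) and disagree at bit $$$k$$$ (opposite halves of it).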

The following problems can be solved using this structure:

The last three problems can be solved using the Sparse Table structure. It may be an interesting exercise to try and create your own multidimensional version of it using the same process we've used so far. You can compare yours to mine afterwards.

Range Queries + Point Updates

When we are required to process both range queries and point updates, our strategy will be a bit different. Our structure will be initialized with the identity value, and inserting the initial values will be the same as updating values. Update logic will be the most difficult part.

We know that in each dimension $$$D$$$, our DS will store a list of $$$(D-1)$$$-dimensional data structures, each representing the merge of data structures in some ranges. Let's suppose we know how to efficiently update in dimension $$$D-1$$$ (base case: updating a value in $$$0D$$$). When we want to update the point with coordinates $$$(p_1,\ p_2,\ \ldots)$$$ ($$$p_1$$$ is the coordinate in dimension $$$D$$$), we should only update the data structures representing the merge of ranges that contain $$$p_1$$$ (usually there will be $$$O(log(n))$$$ of them). For each of them, we will propagate a $$$(D-1)$$$-dimensional update with coordinates $$$(p_2,\ \ldots)$$$ that will correctly adjust the merges in each dimension with the updated value. The way this works in practice will be explored when studying specific data structures.

In the implementation, we won't be keeping the set and init methods; instead, we will use a different method depending on our structure. If we want to support updates of the form $$$past \ value = new \ value$$$, we will name it u_set, but if what we want to do is $$$past \ value = past \ value * new \ value$$$ (remember $$$*$$$ is the operation symbol), we will name it u_op.

If our structure had an u_set method, its implementation would look like this:

//...
//... In the multidimensional version:
    MAs void u_set(T x, int p, As... ps){ //set the value of point (p, ps...) to x
        set<int> U = related_to(p); 
        //suppose U contains the indexes of all data structures whose associated range contains p
        for(int i : U){
            T cur_x = recalculate(x, p, range_of(i));
            /* in each i we want to set the value of the point (ps...) in data structure i to the result of applying
            the operation to all points (j, ps...) such that j is in the range associated to the DS i. We need to be 
            able to quickly calculate this value considering that the value of (p, ps...) was changed to x. */
            v[i].u_set(cur_x, ps...); //we will propagate the update to lower dimensions
        }
    }
//...
//... In the 0D version:
    void u_set(T x){val = x;} //setting the value
//...

The usage will also be simple:

void example(){
    MDRQ<3, S> ds(3, 4, 5); //3x4x5 3D structure based on S
    ds.u_set(val_1, 0, 2, 1); //point (0, 2, 1) has now value = val_1
    cout << ds.query(0, 1, 0, 3, 0, 4) << endl; //only val_1 influences this query
    ds.u_set(val_2, 2, 3, 0); //point (2, 3, 0) has now value = val_2
    ds.u_set(val_3, 0, 2, 1); //point (0, 2, 1) has now value = val_3
    cout << ds.query(0, 1, 0, 3, 0, 4) << endl; //only val_3 influences this query
}

BIT

I probably wouldn't have written this blog, studied MDRQDS at a deeper level, or come to understand and implement them the way I do now if I hadn't read "Nifty implementation of multi-dimensional Binary Indexed Trees using templates." by mouse_wireless. My approach will be slightly different from theirs, mostly due to different objectives and coding styles — it may be a good idea to read their blog too and make your own implementation based on both.

We know that our multidimensional BIT should answer queries related to Abelian Groups (the same restriction as Psums). Besides, single values will be changed between queries. The modification that BITs support is the u_op type. However, since we are working with Groups, it's easy to create our own u_set when we want to:

    MAs void u_set(T x, int p, As... ps){
        T past_val = get_value(p, ps...); //suppose we know the current value at (p, ps...)
        u_op(S::op(S::inv(past_val), x), p, ps...);
        //past_val = past_val * (past_val)^-1 * x = x
    }

Note: as with Psums, we will use $$$1$$$-indexing when working with BITs. It's important for the idea and makes the implementation easier. As always, we will still assume that the interface methods ( query and u_op ) use $$$0$$$-indexing.

What a $$$1D$$$ BIT does is to store, in each $$$1$$$-indexed position $$$i$$$, the result of applying the operation on the subarray of size $$$lp(i)$$$ that ends in $$$i$$$, where $$$lp(i)$$$ is the largest power of two that divides $$$i$$$. This enables us to get the answer corresponding to any prefix by simply applying the operation to the stored values at $$$O(log(n))$$$ positions (similarly to Psums, we can transform a query $$$[l,\ r]$$$ into two prefix queries).
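Here is that $$$1D$$$ behavior in isolation for the sum group (my own minimal sketch, not the generic code developed below), with the internal $$$1$$$-indexing and the $$$0$$$-indexed interface the blog assumes:

```cpp
#include <vector>
#include <cstdint>

// Minimal 1D BIT for sums. Internally 1-indexed: t[i] stores the sum of the
// subarray of size lp(i) = i & -i that ends at position i.
struct Bit1D {
    int n; std::vector<int64_t> t;
    Bit1D(int n) : n(n), t(n + 1, 0) {}
    void u_op(int p, int64_t x){             // 0-indexed: A[p] becomes A[p] + x
        for (p++; p <= n; p += p & -p) t[p] += x; // all positions covering p
    }
    int64_t pref(int p){                     // sum of the 1-indexed prefix [1, p]
        int64_t s = 0;
        for (; p; p -= p & -p) s += t[p];    // O(log n) stored blocks
        return s;
    }
    int64_t query(int l, int r){ return pref(r + 1) - pref(l); } // 0-indexed [l, r]
};
```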

Our multidimensional BIT will work basically the same way. To get the answer for a query $$$[(0,\ l_2,\ \ldots),\ (r_1,\ r_2,\ \ldots)]$$$, we will iterate some positions of the BIT such that the ranges associated to these positions are disjoint and cover completely the range $$$[0,\ r_1]$$$. In each of these positions, we will propagate the query for lower dimensions (and, of course, the query in dimension $$$0$$$ is simply returning a value). The answer for this prefix will be the merge (applying the operation) of the answers calculated by the internal data structures in each of the iterated positions.

//...
//... In the multidimensional version:
    int lp(int i){return i&-i;} //largest power of two that divides i
    MAs T query(int l, int r, As... ps){
        T lv=S::id, rv=S::id; //answers of prefixes l-1 and r
        r++; //due to 1-indexing
        for(; r; r-=lp(r))rv=S::op(rv, v[r].query(ps...)); //r = 0 means the prefix ending in r was fully processed
        // we add to our answer the result of the range [r-lp(r)+1, r] then proceed to the range ending on r-lp(r)
        for(; l; l-=lp(l))lv=S::op(lv, v[l].query(ps...));
        // as always, we are propagating the query to lower dimensions
        return S::op(rv,S::inv(lv)); //use of group property
    }
//...
//... In the 0D version:
    T query(){return val;} //query in 0D is returning a value
//...

You may see that query complexity becomes $$$O(op \cdot (2 \cdot log(n))^D + inv \cdot (2 \cdot log(n))^{D-1})$$$. However, besides answering queries, we also have to process updates ( u_op ). Let's remember how the update is done in dimension $$$1$$$:

Suppose the value at position $$$i$$$, $$$A_i$$$ ($$$1$$$-indexed), is changed to $$$A_i * x$$$. This update will change the stored value in $$$O(log(n))$$$ positions in our BIT: all positions $$$j$$$ such that $$$[j-lp(j)+1,\ j]$$$ contains $$$i$$$. Thus, to process an update, we should iterate all of these $$$j$$$ and update the value stored at $$$j$$$, $$$val_j$$$, to $$$val_j * x$$$. This works due to commutativity: since $$$val_j$$$ can be expressed as $$$A_{j-lp(j)+1} * \ldots * A_i * \ldots * A_j$$$, after this update we have (at $$$j$$$): $$$A_{j-lp(j)+1} * \ldots * (A_i * x) * \ldots * A_j = (A_{j-lp(j)+1} * \ldots * A_i * \ldots * A_j) * x = val_j * x$$$.

We can easily generalize this to any dimension $$$D$$$, with the base case being updating in dimension $$$0$$$ (setting $$$val$$$ to $$$val * x$$$). Let's say the point of the update is $$$(p_1,\ p_2,\ \ldots)$$$ and the value of the update is $$$x$$$. We will iterate all internal data structures whose range contains $$$p_1$$$ and, in each of them, update its point $$$(p_2,\ \ldots)$$$ with the value $$$x$$$. We are doing exactly what we should do: for all positions (in all dimensions) related to the point $$$(p_1,\ p_2,\ \ldots)$$$, the value $$$val$$$ of this position will become $$$val * x$$$. We can see that this is simpler than the u_set skeleton we've seen in the previous section, due to the nature of these types of updates.

//...
//... In the multidimensional version:
    MAs void u_op(T x, int p, As... ps){//updating (p, ps...) with x
        for(p++; p < n; p += lp(p)){//p++ is for 1-indexing
            /* p += lp(p) proceeds to the next endpoint of a range containing p. You can verify this by
            proving that this endpoint's range contains p and no other endpoints' ranges up to it contain p. */
            v[p].u_op(x, ps...); //propagating the update for lower dimensions
        }
    }
//...
//... In the 0D version:
    void u_op(T x){val = S::op(val, x);} //u_op in 0D
//...

It's easy to see that the update complexity is $$$O(op \cdot log(n)^D)$$$. And this concludes our implementation:

Code

We end up using around $$$n^D \cdot sizeof(T)$$$ memory. Now you can solve the following problems:

SegTree

Again: Groups are nice, but sometimes we are forced to work with Monoids. Range query + point update with Monoids is the most general version of the range query problem we will study here, solvable by one of the most famous data structures in competitive programming: Segment Tree.

Note: in our SegTree, we will index the root at $$$1$$$ and the query logic will be based on half-open intervals ($$$[l,\ r[$$$). However, we will still consider that query and update methods are called with $$$0$$$-indexing and closed intervals ($$$[l,\ r]$$$).

Each node in the SegTree stores the merge of a range whose size is a power of two. We will use the iterative implementation as a basis, as it's shorter, faster and easier to adapt for our needs. You can read more about it in this amazing blog.

Queries in SegTree are simple. In $$$1D$$$, what we do is select some nodes in the SegTree such that their associated ranges are disjoint and they cover completely the range of the query. Then, the answer of the query is simply merging (applying the operation to) the values stored at each of these nodes (similarly to BITs).
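As a standalone illustration of that node-selection logic (my own $$$1D$$$ sketch for range minimum, a Monoid with no inverse; since min is commutative I collapse the two accumulators into one, unlike the generic code below):

```cpp
#include <vector>
#include <algorithm>
#include <climits>

// Minimal 1D iterative SegTree for range minimum. Leaves live at
// positions n..2n-1 and node i/2 is the parent of node i.
struct SegTree1D {
    int n; std::vector<int> t;
    SegTree1D(int n) : n(n), t(2 * n, INT_MAX) {}
    void u_set(int p, int x){                // 0-indexed point assignment
        for (t[p += n] = x; p /= 2; ) t[p] = std::min(t[2*p], t[2*p+1]);
    }
    int query(int l, int r){                 // 0-indexed closed range [l, r]
        int res = INT_MAX;
        for (l += n, r += n + 1; l < r; l /= 2, r /= 2){ // half-open [l, r+1[
            if (l & 1) res = std::min(res, t[l++]);      // l is a right child
            if (r & 1) res = std::min(res, t[--r]);      // r-1 is a left child
        }
        return res;
    }
};
```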

Now, suppose we want to answer a query $$$[(l_1,\ l_2,\ \ldots),\ (r_1,\ r_2,\ \ldots)]$$$ in dimension $$$D$$$ and we already know how to answer $$$D-1$$$-dimensional queries in our multidimensional SegTree. Then, answering a query will be simply computing the merge of answers of the $$$D-1$$$-dimensional query $$$[(l_2,\ \ldots),\ (r_2,\ \ldots)]$$$ given by internal data structures whose associated ranges are disjoint and cover completely the range $$$[l_1,\ r_1]$$$.

//...
//... In the multidimensional version:
    MAs T query(int l, int r, As... ps){
        T lv=S::id, rv=S::id;
        /* in a usual iterative SegTree, we would calculate the merge of node values in a way to maintain the
        order of operations, using a value to calculate from left to right and other to calculate from right
        to left. This is not necessary here, due to the restriction of commutativity, but I believe it helps
        resembling a usual SegTree (and it does not change the complexity). */
        for(l += n, r += n+1; l < r; l /= 2, r /= 2){ //i/2 is the parent of i
            //i+n corresponds to the position of a leaf in the SegTree representing the range [i, i]. 
            /* summing n+1 to r instead of n is to get the answer of [l,r] through a query in the half-open
            interval [l,r+1[. */
            if (l&1)lv = S::op(lv, v[l++].query(ps...)); //node at l is in the border of query range
        if (r&1)rv = S::op(v[--r].query(ps...), rv); //node at r-1 is in the border of query range
            //we propagate the query for lower dimensions
        }
        return S::op(lv, rv); //merge values from left and right
    }
//...
//... In the 0D version:
    T query(){return val;} //query in 0D is returning a value
//...

We can see that the complexity of query is $$$O(op \cdot (2 \cdot log(n))^D)$$$. Now, we should worry about updates. A SegTree can actually handle both u_op and u_set updates, but we will implement only u_set. Whenever we want to, we can create our own u_op:

//...
    MAs void u_op(T x, int p, As... ps){//set the value val of the point (p, ps...) to val * x
        T val = get_value(p, ps...); //suppose we know the current value at (p, ps...)
        u_set(S::op(val, x), p, ps...); //simply set val to val * x
    }
//...

In a SegTree, the update is as follows: firstly, update the leaf, and then update the ancestors from the leaf's parent up to the root using their children. Let's understand better how to unite this idea with the u_set skeleton we have seen earlier:

Suppose we know how to update $$$(D-1)$$$-dimensional SegTrees (base case: $$$0D$$$, a single value). We want to support updates of the form: set the value of the point at coordinates $$$(p_1,\ p_2,\ \ldots)$$$ to $$$x$$$ and update our $$$D$$$-dimensional SegTree accordingly. Firstly, we will update our internal data structure (a node) corresponding to the range $$$[p_1,\ p_1]$$$ by propagating an update with the value $$$x$$$ at the point $$$(p_2,\ \ldots)$$$. Then, we will successively update the parents of the current node. In each of them, we will propagate an update at the point $$$(p_2,\ \ldots)$$$ with a value $$$y$$$: the merge of values at points $$$(i,\ p_2,\ \ldots)$$$ for $$$i$$$ in the parent's associated range. This $$$y$$$ can be easily computed, because the children have already been updated: $$$y$$$ is simply the merge of the values of point $$$(p_2,\ \ldots)$$$ in the left child and right child.

We can implement this using an additional helper method:

//...
//... In the multidimensional version:
    MAs T get(int p, As... ps){ //return the value of point (p, ps...)
        return v[p+n].get(ps...); //v[p+n] is the leaf associated to the range [p, p]
    }
    MAs void u_set(T x, int p, As... ps){
        v[p+=n].u_set(x, ps...); //we should update the corresponding leaf first
        while(p/=2){ //then iterate ancestors up to the root
            T y = S::op(v[2*p].get(ps...), v[2*p+1].get(ps...)); //merge values of updated children
            v[p].u_set(y, ps...);
        }
    }
//...
//... In the 0D version:
    T get(){return val;} //the value at this point
    void u_set(T x){val = x;} //setting the value of the point
//...

You may verify that the update complexity is $$$O(op \cdot log(n)^D)$$$. We now have a functional Multidimensional Segment Tree:

Code

This implementation uses around $$$O((2 \cdot n)^D \cdot sizeof(T))$$$ memory.

I wasn't able to find a good amount of multidimensional range query problems not solvable by any of the structures we have studied before (most are solvable with BITs or DiSTs). If you want, you may go back to the recommended problems in previous sections and try to solve them with SegTree, although for some of them it may be impossible.

You can use the multidimensional SegTree at:

Conclusion

I hope this blog has helped you better understand multidimensional range query data structures or, at least, showed an interesting approach to them. You should now be ready to solve range query problems in any dimension! No more suffering with $$$2D$$$ or $$$3D$$$ implementations, and if you ever want to solve something $$$4D$$$, $$$5D$$$ or any $$$D$$$, it won't be any more difficult to solve or implement than a $$$1D$$$ structure.

On the downside, these types of problems aren't very common out there. I think it's actually very rare to find range query problems where a multidimensional data structure is part of a concrete solution. However, as seen in this blog, implementing MDRQDS can be very simple and the idea is not difficult — there is no reason why this topic shouldn't appear more in competitions, and many variations of the multidimensional range query problem can be explored. I'm talking to you, fellow problemsetters!

Of course, we should note there is no magic going on here. Complexity increases quickly as dimension increases, both in time and space, and usually this is an important part of problems. As I have stated earlier, my goal with the code was to provide something scalable, generic, understandable and efficient — specific problems may require specific techniques. A lot of recurring topics in range query problems weren't discussed here (range updates, coordinate compression, persistency...).

Also, even though we've only properly addressed four RQDS, there are many more out there, solving different problems or the same ones in different complexity. I would be very interested in seeing how you would implement multidimensional versions of them! I will try to keep my MDRQDS repository updated with new structures I eventually implement.

Lastly, I would like to thank mouse_wireless (again) for helping me understand C++ and MDRQDS a lot more through their blog, duduFreire, Vilsu, dharinha and MagePetrus for giving me feedback on this blog, and you for reading it!

Feel free to ask questions, share problems and express your opinions in the comments (:
