Trace_X1729's blog

By Trace_X1729, 9 months ago, In English

Hello everyone :)

I am going to present a new way(I couldn't find it on the internet) to do range queries with point updates, back story and all other details later:

It is very intuitive and easy to understand, also I have included a lot of examples so please read till the end, it will be especially useful for those who are reluctant to study segment trees (I think).

You are given an array A of size N, and Q queries involving both point updates and range queries.

First we will have to do some pre-computation.I will call the precomputed structure as SST (full form later) from now.

Also, for the sake of simplicity, just assume that we are constructing the structure for sum right now, the language is going to be cpp throughout the blog.

Declare SST:

int N = A.size() - 1;//this is assuming that the array passed is 1-indexed.
vector<vector<int>>SST(N + 1);
//It is basically a 2-D vector of size N + 1, where N is the size of the array. It is N + 1 to have 1-based indexing.

What does it store?

// SST[i][j] = sum of the L length subarray ending at index i, where L = (2 raised to the power j).

for each index i, we will store all sums such that i is divisible by L (basically j will vary from 0 to the largest power of 2 that divides i).

since I am not good at explaining stuff, I will take an example:

A = [6, 4, 2, 5, 3, 9]

SST[1] = [6] (1 is divisible only by L = 2^0)

SST[2] = [4, 10] (2 is divisible by L = 2^0 and 2^1)

SST[3] = [2]

SST[4] = [5, 7, 17] (since 4 is divisible by 2 power [0, 1, 2]).

and so on...

I hope its clear so far.

The space complexity for building SST:

Some index j will appear in SST[i] if and only if i is divisible by 2^j.

Index 0 will appear in N elements (all indexes i are divisibly by L = 2^0 = 1).

Index 1 in N/2 elements.

Index 2 in N/4 elements.

and so on...

the overall space required is

O(N + N/2 + N/4 + .... ) .

and this is a common GP having sum = 2N.

So the total space the structure takes is O(N).

Implementation of the build:

		//Building the SST
		for(int i = 1 ; i <= N ; i++){
		   S[i].resize(__builtin_ctz(i) + 1);//this is just a cool way to find the highest power of 2 that divides i.
		   S[i][0] = A[i];
		}
		while(jump <= N){
                   jump <<= 1;
                   for(int i = jump ; i <= N ; i += jump){
                     S[i][p] = S[i][p - 1] + S[i - (jump >> 1)][p - 1];
                     //if not sum, just replace the above merge step with the relevant operator.
                   }
                   p++;
                }
	    }

For each SST[i][j], we can calculate it using the above recurrence.

Example:

A = [6, 4, 2, 5, 3, 9].

SST[i][0] is just A[i].

for SST[i][1], which is the 2 size subarray ending at index i, we will use the SST[i][0], the 1 size subarray ending at index i, and then add it with SST[i - 1][0].

similarly for SST[i][2] (the 4 size subarray ending at i), we will add these two :
 
SST[i][1] (the 2 size subarray ending at i)

AND

SST[i - 2][1] (the 2 size subarray ending at i - 2). 

Thus, we calculate each SST[i][j] in constant time and there are a total of O(N) (i, j) pairs (as explained in the space complexity part).

So the overall time complexity is O(N).

We are done with the Pre-Computation.

Updating the value at an index

lets say we want to change the value at some index x, with it we will also have to update all blocks that contain index x.

The number of these blocks might seem very large at first, but giving it a deeper thought, it shouldn't be very tough to realize that the actual number of such blocks is O(logN).

This is because for each power of 2 (let's call it j), the number of indices i where you will need to update the jth power is exactly 1.

I am not sure if the above is clear, so here's

An Example:

A = [1, 9, 1, 2, 43 ,23, 22390, 23920, 592, 201], N = 8.

lets say we are updating index 6, A[6] = 14.

now we will need to update all blocks that contain index 6.

For all indices before 6, nothing changes, obviously.

For index 6, we will update SST[6][0] to 6, and since 6 is also divisible by 2, SST[6][1] exists and we will update it as well.

Now the cool thing is that since we have updated powers 0 and 1 for an index, we will never need to update powers 0 and 1 for any other index.

Let's check this for 2 power 1 = 2.

After 6, the next index k that is again divisible by 2 and thus has SST[k][1] is 6 + 2 = 8, but since we have already moved a distance 2 from 6, the size 2 block ending at index 8 no longer includes the index 6.

Thus the Time complexity for the update function is O(Log(N)).

Implementation of update

void update(int index, ll value){
	    int p = 1;
	    S[index][0] = value;
	    if(S[index].size() == 1){
	      index += 1;
	    }
	    while(index <= N){
	      S[index][p] =S[index][p - 1]+S[index - (1 << (p - 1))][p - 1];
              //if not sum, just replace the above merge step with the relevant operator
	      ++p;
	      if(p == S[index].size()){
	        index += (1ll << (p - 1));
	      }
	   }
       }
 

Querying

For querying, we can keep reducing R by the highest power of 2 that divides it, this is O(LogN) since the highest power of 2 available for R at each step increases (I am too tired to explain it more, it shouldn't be hard to understand if u use pen and paper, or just trust me bro).

Once we reach the point that reducing the highest power of 2 that divides from R takes us to the left of L, we will stop, since we cant allow that (if we take some elements to the left of L, we will lose out on the advantage of the operator not necessarily having to be invertible).

So, After this we find remaining elements in the range [L,R] = R — L + 1, and iterate on the bits of this value.

The time complexity of this step is O(LogN) too.

So, the overall complexity for querying is O(LogN).

Implementation for querying

ll query(int L, int R){
  int answer = 0, d, p, i;//answer is the identity element, for sum it is 0 but u can change it accordingly.
//till the time we are well ahead of L, lets just keep taking the highest available power of 2.
  while(R >= L && (R - (1ll << (S[R].size() - 1)) > L - 2)){
    answer += S[R].back();//if not sum, replace with the operator in this step.
    R -= (1ll << (S[R].size() - 1));
   }
  if(R < L){
    return answer;
  }
  d = R - L + 1;
//now just iterate over the bits of d.
  p = logll(d + 1);
  for(i = p ; i >= 0 ; i--){
    if((1ll << i)&d){
      //if the bit is set, we will just take that value and reduce R.
      answer += S[R][i];//if not sum, replace with the operator
      R -= (1ll << i);
    }
    //otherwise, we ignore the 0 and just continue.
  }
  return answer;
}
Full Code for sum-SST

Comparison with Standard Well-Known Stuff:

1)Fenwick Trees:

It's somewhat similar to Fenwick Trees, but stores stuff not just for the highest power of 2 that divides index i, but for all such powers.

Also, as far as I know, A single Fenwick Tree can't be used when the operation to be performed is non-invertible, whereas this structure doesn't need the inverse at any point, so it can be used for these operations as well.

The Theoretical Time and Space complexities are the same for both.

2)Segment Trees:

Again, the asymptotic space and time complexities are the same for both.

Also, the two structures support the exact same set of operations (I am not talking about Lazy prop and stuff).

My first implementation (not the one in this blog) was roughly 60 times slower than the recursive segment tree. But after breaking my head for about a day, I managed to reach the current cleaner and much faster version.

However, this version of SST is still roughly twice as slow as the recursive Segment Tree :(

(I tried it on the CF Edu range min with point updates problem, the screenshot is attached below. Also, I know that benchmarking doesn't work this way but at this point I am too lazy to do anything more than this.)

.

The bottom-most submission uses the same structure as in the blog (without any changes),

The middle one is a slightly optimized version of the SST,

And the top-most submission uses a standard implementation of the Segment Tree.

As should be evident from my handle colour, I suck and so do my implementation skills. Thus, I have a feeling that the constant factor of the current version could be lowered significantly if some strong person were to look at it.

Also, I tried converting the current 2-D version into 1-D by maintaining an offset array for indexing, it did lead to some improvement in the constant factor, but I am omitting that implementation from this blog.

3)Sparse Tables:

I don't know how sparse tables work, but AI as well as some people I talked to told me that SST is somewhat similar to sparse tables but not exactly the same.

4)Binary Lifting:

I watched Errichto's binary lifting video just the previous day of having this SST idea, so yes, it is inspired from Binary Lifting, especially in the update and build part.

AI Thoughts:

When I was done with the build and update part, I gave my code and asked ChatGPT, Grok, Claude and Gemini if this is known and what is it called.

I got almost the exact same answer from each of them, which was that what I have implemented is an overly-complicated and bugged version of Fenwick Trees :(

they couldn't stop giving me non-existent bugs till I asked them to find a counter-case where my code would fail, after which they verified that this indeed works.

One surprising thing that happened along the way is that in the first prompt, every AI model told me that my build part is O(NLogN), which is clearly not true, this was the first time I saw such a trivial mistake from every AI at the same time.

Is this very useful?

Perhaps not so much, unless you don't know how to implement segment trees and you were able to understand everything so far, in that case you might just use the most optimized version of this until you learn segment trees.

I have a feeling that either:

1)It is already known but not very popular.

2)It is just another way to implement a standard data structure.

3)(Most likely) It's stupid.

In any case, please do let me know. Also I am sorry in advance if this blog seemed useless.

The Backstory
Full Form of SST
Story behind the name

Thanks to galen_colin for validating my ideas and helping me at multiple steps. Also, Thanks to uncle de_sousa for validating the idea and motivating me to write the blog.

I am hopeful that someone will find a workaround to make this support Lazy-Propagation. Also, although its too much to ask for, I hope that someone can bring down the query time from O(logN) to O(1) like sparse tables, that would be insane.

(This is one of the main reasons I decided to write this blog).

Just in case someone is interested (although I doubt) , I will share the link to my GitHub repository in a while which contains the current as well as the more efficient versions of the SST having 1D vector + offset array along with some other changes.

Thanks for reading.

  • Vote: I like it
  • +11
  • Vote: I do not like it

»
8 months ago, hide # |
Rev. 2  
Vote: I like it +13 Vote: I do not like it

If you look at for what ranges the sum is stored in a picture, I think you will see a familiar picture (especially with $$$n = 2^k$$$).

Spoiler