ABC339G: Smaller Sum from the last week's ABC had restrictions for online query and someone in the comment section was interested in knowing the alternate easier solution if the queries could be answered offline. So, I've written a short blog that talks about how to use sweep line to answer these 2 problems:

- You are given an array and several queries. A query consists of a subarray and $$$x$$$. Print the sum of all elements from this subarray that are $$$\leq x$$$.
- You are given a tree and some queries. In each query $$$(x, y, T)$$$ print YES if there is a path from $$$x$$$ to $$$y$$$ such that the maximum value on that path is $$$ \leq T$$$.

To help you solidify the concepts discussed, I've created 1 + 1 practice problems. You can access them https://mirror.codeforces.com/group/7Dn3ObOpau/contest/503852

https://cfstep.com/training/tutorials/general-techniques/answering-queries-offline-with-sweepline/

The problems are untested. If you see any issues, please let me know.

If you need any help or hints for the practice problem, you can ask on my Discord Server. In case you are interested, you can also checkout my youtube channel

keep posting this stuff man, these are currently the most helpful blogs on Codeforces.

Thanks for awesome problems. Any hint on how to solve Smaller Sum online?

I've not solved the online version myself, but glancing at the editorial and comment section, it can be solved either using

Merge Sort TreesorPersistent Segment TreeorSquare Root Decomposition.Yes, it is merge sort tree, where you store the prefix sum of sorted array, along with the actual array in each node of the tree.

Solved using wavelet tree.

Not sure why it's so terribly slow, should be $$$O(Q\log(\max A))$$$ + $$$O(N\log(\max A))$$$ build

You can consider using a wavelet matrix instead. There is some discussion here that can help in this respect.

I used very stupid build, with simple fix now 5x faster locally, but on CF just 2x, probably because of queues issue. Still much slower than fenwick+sweepline

Your matrix is interesting, but it's not immediately obvious to me how to modify it for range sum on same bit level. I'll bookmark it to take a deeper look at it later.

I tried to solve this problem with sqrt decomposition. But resulted in tle. Any suggestions to reduce time? Tle: code

Your block lookup is expensive, so you need to decrease number of blocks. So you can increase size twice, and when you need partial scan check

`n < blocksize/2 ? for(0..n) : block.sum - for(0..blocksize-n)`

Great, I was thinking exactly the same. Thanks

Well sorry to bother again. I just changed the block size 450 to 550 and it becomes AC. Why is that? AC

Turns out that 550 is enough. I'd try scanning lower half to check how much faster it can get with simple trick