unt311's blog

By unt311, history, 3 years ago, In English

In my opinion, using box size as (int)sqrt(n) is best for mo's algorithm.

But my two submissions differ in just that(fixing block size or taking it as (int)sqrt(n), and one gets TLE and other one gets AC.

What is the logic here ?

  • Vote: I like it
  • +11
  • Vote: I do not like it

| Write comment?
»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

It really depends on what kind of operations you do in your code (especially in each block and in the part where you "merge" them). Most times, sqrt should work. In this case, don't be fooled by the "magic number" 700. You can see that the solution passed with 4992 ms, which is almost TLE. It just happens sometimes. That means maybe you should optimize time complexity or a major part in your code that might cause slowness (maybe delete asserts? use pairs of pairs instead of tuples? are you doing Mo's algorithm in an actual correct way? is sorting correct?).

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Use C++17 with a 64-bit compiler.

»
3 years ago, # |
Rev. 5   Vote: I like it 0 Vote: I do not like it

Your submission is already pretty close to the time limit, so I wouldn't be surprised if normal run-to-run variation is what caused the TLE.

You can try resubmitting it with the 64-bit compiler, it makes stuff run a lot faster in general. Here's your TLE code getting accepted on the 64-bit compiler: #11415445

As an aside, compilers can optimize integer division by a constant value away into shifts and additions. This speeds up division by a lot, so IMHO best practice for square root decomposition problems is to pick a block size and declare it as a constant integer, unless using a fixed block size is not possible for some specific reason.

EDIT: typos and wording

»
3 years ago, # |
  Vote: I like it +22 Vote: I do not like it

One of the main reasons to fix the block size is that division by constants are well optimized by compilers. In this case I picked N = 450. https://godbolt.org/z/eEfjYPnze You can check the assembly here. In the div function, it gets optimized to a few additions, shifts and subtractions. But in the div1 function where its not constant its plain old division. In Mo's for the sorting order, you divide by block size which isn't constant. So in general make the block size a constant value.

»
3 years ago, # |
  Vote: I like it +9 Vote: I do not like it

By pre-define such constant block size, it will be optimized in multiplying, dividing, mod-ing, ,,,