polosatic's blog

By polosatic, history, 2 hours ago, translation, In English

Recenlty I was solving this problem. It was obvious strongly connected components. I wrote them and got TL14: 277474774.

I knew that push_backing to 1e6 vectors is slow due to the dynamic reallocation. I know the only way to handle it: precompute the size of each vector, and reinterpret vector<vector> as a static array. It was awful to implement, but I did it and got TL38: 277476691

Then I tried many things like removing the vector of Euler tour, adding pragmas, but have not passes the test 38. So, there are 2 questions for experienced optimizers:

  1. Is there another, simpler way to speed up vector push_backs, without precomputing each vector's size?

  2. Why does my final solution get TL, and how to speed up it? Constraints up to $$$10^6$$$ are enough to pass in this problem I think.

  • Vote: I like it
  • +1
  • Vote: I do not like it

»
2 hours ago, # |
  Vote: I like it 0 Vote: I do not like it

sooo?

»
2 hours ago, # |
  Vote: I like it 0 Vote: I do not like it

If you know beforehand approximately how much space you need, you can probably call std::vector::reserve

  • »
    »
    2 hours ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Much time ago I have the same issue and tried vector.reserve(), but it also gave me TL. But that time precomputing of all sizes helped my solution to pass.

    • »
      »
      »
      2 hours ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Well, vectors in general is slower than static array, I'm not very sure about the specifics of it but seems like it's memory allocation mechanics are more complicated

»
2 hours ago, # |
  Vote: I like it 0 Vote: I do not like it

Probably not the most useful idea, however does emplace_back work?

  • »
    »
    105 minutes ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I tried it when I met this issue in the first time. It also did not work, same as vector.reserve().

»
86 minutes ago, # |
  Vote: I like it +8 Vote: I do not like it

Mushrooms function works in sqrt(x)/2

For the vectors, the fastest way is to use something similar to bucket sort, which is to merge all vectors into one array.

»
60 minutes ago, # |
  Vote: I like it +1 Vote: I do not like it

My vector filled copy paste code for SCCs ACd in around half of the TL, so that probably isn't the cause unless there's some compiler specific wizardry going on. I'm pretty sure the comment above hit the spot: the mushrooms function is the cause.