DNR's blog

By DNR, 7 months ago, In English

The following code takes $$$O(n \cdot x)$$$ memory (NOT $$$O(x)$$$!!!):

vector<vector<int>> a(n);
for(int i = 0; i < n; i ++)
{
    a[i].assign(x, 0);
    a[i].clear();
}

You must call a[i].shrink_to_fit() after a[i].clear() to actually free the reserved memory, and cap memory usage to $$$O(x)$$$. I unfortunately didn't know this and was consequently traumatized by seemingly inexplicable MLEs in today's d2D.

Now that I think about it, all of those tens of times where I called .clear() on smaller vectors after merging them into larger vectors (small-to-large merging) have been pointless.

  • +281

»
7 months ago, -67

https://en.cppreference.com/w/cpp/container/vector/clear.html
Meme

Jokes aside, it's a good announcement for those who don't know.

»
7 months ago, +36

What about vector<int>().swap(a[i])?

  • »
    »
    7 months ago, 0

    I just read about shrink_to_fit(); my sources say it's a non-binding request, meaning it may or may not actually release the memory.

    IMHO, I feel safer using this idiom than shrink_to_fit()

  • »
    »
    7 months ago, 0

    Alternative options (careful with the last one: self-move-assignment leaves a[i] in a valid but unspecified state at best):

    a[i] = vector<int>();
    
    vector<int> kill;
    a[i] = std::move(kill);
    
    a[i] = std::move(a[i]);
    
»
7 months ago, +1

This also happens if you use a[i].resize(0).

»
7 months ago, +4

is the same true for sets and maps?

  • »
    »
    7 months ago, +9

    No, .clear() does free memory for std::set and std::map: they are node-based, so clearing deallocates every node.

»
7 months ago, +6

I prefer to let RAII do the work.

»
7 months ago, +8

Note: this is a symptom of constant-inefficient code. If your code's efficient but still running into a memory bottleneck, it's usually obvious how that bottleneck happens. Assume that shuffling data around in STL data structures can have unexpected effects on memory and you'll be fine.

Example of code using std::vector that uses known amounts of memory:

// asymptotically 4N^2 bytes
std::vector<std::vector<int>> A(N, std::vector<int>(N));
std::copy(begin(input), end(input), begin(A[0]));
for(int i = 0; i < N-1; i++) calculate(A[i], A[i+1]);

// 8N bytes
std::vector<int> result(N), tmp(N);
std::copy(begin(input), end(input), begin(tmp));
for(int i = 0; i < N-1; i++) {
    calculate(tmp, result);
    std::copy(begin(result), end(result), begin(tmp));
}

The pattern of "reserve resources you'll be working with, then call functions that work in-place or receive their input and output as arguments" is common when memory is a concern since it makes reasoning about consumed memory much easier.

  • »
    »
    7 months ago, 0

    I don't think there's any trivial way to do such "in-place" things in my particularly cursed use case (at least without adding a log factor). I was doing something along the following lines:

    vector<vector<int>> a(n);
    
    for(int i = 0; i < n; i ++) 
        a[0].push_back(i);
    
    for(int i = 0; i < n; i ++)
    {
        for(auto x : a[i])
        {
            int nxt = f(i, x);    //i < f(i, x) 
            if(nxt < n)
                a[nxt].push_back(x);    //performed at most O(n^(3/2)) times  
        }
        a[i].clear();
    }
    

    The total "algorithmic" size of all the vectors at any time here is clearly $$$\leq n$$$ but no memory is freed, so the total memory consumed ends up being $$$O(n \sqrt{n})$$$.

    • »
      »
      »
      7 months ago, 0

      Ok, you're making many push_backs to construct many relatively small vectors, which is going to result in many ($$$O(n \log n)$$$-ish) allocations. Those are slow, perhaps even slow enough to be a real bottleneck. This is what I mean by inefficiency.

      If you store $$$x$$$ at $$$a[f(x)]$$$, then $$$i \lt f(x)$$$ can't always hold when processing $$$a[i=f(x)]$$$. It'd result in TLE or MLE.

      Assuming you're instead storing some g(x) at a[f(x)], where $$$g(x) \gt f(x)$$$, you could observe that you're dealing with chains $$$x \rightarrow f(x), g(x), \ldots$$$ up to $$$n$$$, one from each $$$x = i \in [0, n)$$$. Process them directly and independently; there's no need to store anything. Since your goal isn't to populate a but to do something else, you could perhaps do it independently for each chain.

      Even if you wanted to store stuff, you can precalculate the sizes of a[i] using those chains and perhaps do something with them. If the memory limit weren't a concern, flattening a jagged 2D array with known sizes would erase the time bottleneck of allocations.

      If your real code is more complex than this example, you could end up in a doomed situation no matter what, but that's the problem with complex code where you're shuffling around a lot of mutually dependent data. It's hard to code without bugs, hard to debug, and has all kinds of nasty runtime inefficiencies. Sometimes it's best to stop and try another way... and sometimes to just grit your teeth and look for side effects.

      • »
        »
        »
        »
        7 months ago, 0

        push_back is amortized $$$O(1)$$$ though. Should we really be concerned about memory allocation speed? I can't imagine it leading to a TLE in real scenarios.

        • »
          »
          »
          »
          »
          7 months ago, +1

          It absolutely can. $$$O(10^{10})$$$ is $$$O(1)$$$ too and you have to consider that constant factor in real scenarios. I've been burned multiple times by push_backing to create many small vectors.

          • »
            »
            »
            »
            »
            »
            7 months ago, 0

            Hmm, that might be the case in high rated problems where you have to do too many operations per test case. Personally, I have never faced this issue.

            Still, I try to optimize memory usage with tricks like:

            1. swap(dp, ndp) when the current row's states depend only on the previous row's states. You should use swap ($$$O(1)$$$) instead of std::copy ($$$O(n)$$$) in your first example, as it's less error-prone.

            2. Using a length-k vector indexed with i % k when the current state depends only on the last k states.

            3. vector<array<T, K>> (single allocation) instead of vector<vector<T>> (one allocation per inner vector), if K is a constant or its upper bound is small.

            Processing each chain independently is a lot cooler though.

            • »
              »
              »
              »
              »
              »
              »
              7 months ago, 0

              I wouldn't know, I solve mid-rated problems. Of course it won't happen in problems that have straightforward clean solutions.

              You should use swap instead of std::copy in your first example as it's less error-prone.

              It's also less obvious what it's supposed to do, but it moves the underlying buffer instead of copying the data through a temporary, so that's nice.

»
7 months ago, +3

Usually you don't need to think much about the memory usage of vectors in CP: the time spent initializing memory is proportional to the memory allocated, so when you exceed the memory limit you are probably exceeding the time limit as well.

»
7 months ago, 0

For everything, or just for vectors? I didn't know about this.