New coders: Stop making this simple mistake

3 years ago, # |

+56

Embrace lambdas so you never have to use a global variable ever again!

Example of a submission to a multitest graph problem with no global scope used.
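
A minimal sketch of the idea, not the linked submission: the function name countComponents and the component-counting task are made up for illustration. Everything lives inside one per-test function, so there is nothing to reset between test cases (assumes C++17).

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical per-test solver: counts connected components of an
// undirected graph on vertices 0..n-1. All state is local to the call.
int countComponents(int n, const std::vector<std::pair<int, int>>& edges) {
    std::vector<std::vector<int>> adj(n);
    for (auto [u, v] : edges) {
        assert(0 <= u && u < n && 0 <= v && v < n);
        adj[u].push_back(v);
        adj[v].push_back(u);
    }
    std::vector<bool> visited(n, false);

    // Recursive generic lambda; adj and visited are captured by reference.
    auto dfs = [&](auto& self, int u) -> void {
        visited[u] = true;
        for (int v : adj[u])
            if (!visited[v])
                self(self, v);
    };

    int components = 0;
    for (int s = 0; s < n; s++) {
        if (!visited[s]) {
            dfs(dfs, s);
            components++;
        }
    }
    return components;
}
```

In a multitest main you would read each test case and call countComponents once per case; adj and visited are rebuilt each time, sized to that case only.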

3 years ago, # ^ |

+15

You can also get rid of that annoying auto &self parameter by spelling out the type explicitly instead of using auto:

Spoiler

        function<bool(int, int)> dfs = [&](int u, int c) {
            if (color[u] != -1)
                return color[u] == c;
            color[u] = c;
            for (auto [v, w] : adj[u])
                if (!dfs(v, c ^ w))
                    return false;
            return true;
        };

instead of:

        auto dfs = [&] (auto &self, int u, int c) -> bool {
            if (color[u] != -1)
                return color[u] == c;
            color[u] = c;
            for (auto [v, w] : adj[u])
                if (!self(self, v, c ^ w))
                    return false;
            return true;
        };

3 years ago, # ^ |

+46

What you're using here is actually something different, std::function, and there is, surprisingly, a performance difference between the two: std::function tends to have significantly more overhead. Try running this simple example in custom invocation:

Code

#include <bits/stdc++.h>
using namespace std;

int main() {
    int n;
    cin >> n;
    
    vector<vector<int>> adj(n);
    for (int i=0; i<n-1; i++)
        adj[i].push_back(i + 1);
    
    auto dfs1 = [&] (auto &self, int u, int p) -> int {
        int ret = 1;
        for (int v : adj[u])
            if (v != p)
                ret += self(self, v, u);
        return ret;
    };
    
    function<int(int, int)> dfs2 = [&] (int u, int p) -> int {
        int ret = 1;
        for (int v : adj[u])
            if (v != p)
                ret += dfs2(v, u);
        return ret;
    };
    
    cout << fixed << setprecision(10);
    
    auto start = clock();
    int ret1 = dfs1(dfs1, 0, -1);
    cout << "Passing self: " << (double) (clock() - start) / CLOCKS_PER_SEC << endl;
    
    start = clock();
    int ret2 = dfs2(0, -1);
    cout << "std::function: " << (double) (clock() - start) / CLOCKS_PER_SEC << endl;
    
    assert(ret1 == ret2);
    
    return 0;
}

Output

n = 1e5
Passing self: 0.0000000000
std::function: 0.0180000000

n = 1e6
Passing self: 0.0310000000
std::function: 0.0780000000

n = 2e6
Passing self: 0.0620000000
std::function: 0.1750000000

n = 3e6
Passing self: 0.1090000000
Runtime error: exit code is 2147483647

These results were generated using the C++20 compiler, though I got similar results with other compilers.

As an extra note, I've previously benchmarked the different styles of recursion (normal recursion, std::function, passing self, using Y combinators). Normal recursion is of course still the fastest, but passing self has surprisingly competitive performance in exchange for minimal additional code.

mon0pole

3 years ago, # ^ |

A bit off topic,
but why do both ways get a runtime error for n = 3e6 in custom invocation when I use C++17 (64 bit)?
Is there some optimization for recursion in C++20?
UPD1: Actually, the self-passing recursion gives a runtime error only with C++17 (64 bit) and works fine with the other compilers.
UPD2: I think the reason may be that the default recursion depth limit for C++17 (64 bit) is set very low, which doesn't allow recursion this deep. So my question is how to increase the recursion depth limit.
UPD3: I guess I will have to stop using C++17 (64 bit), because there is no way to increase the stack space on an OJ. Sorry for pinging you.

3 years ago, # ^ |

Yeah, sorry, I don't know either. All I know is that passing self has less overhead, but I'm not very knowledgeable about the intricacies of different compilers.

About increasing recursion depth: all C++ compilers on Codeforces, such as C++20 and C++17, compile with a 256 MB stack size, so there shouldn't be a significant difference in maximum recursion depth between the two.

mon0pole

3 years ago, # ^ |

Yeah, that's what I didn't understand:
even if the stack size is the same for all compilers, why does C++17 (64 bit) give a runtime error?

3 years ago, # ^ |

+40

The reason lambdas have less overhead is the following:

std::function is a type-erased wrapper for function objects. It may require heap allocations and adds extra pointer indirections, which makes it slower. Hence it is of little use in competitive programming unless you need something like a vector of "lambdas", which is impossible to declare directly because every lambda has its own type, so you have to store them as std::function instead.
Lambdas are implemented as anonymous structs with operator(), i.e., they are callable objects (functors). Using auto in the parameter list makes a lambda generic, which is analogous to a struct with a templated operator(). When you call something like dfs(dfs, root, color), type deduction happens at compile time, and because there are no indirections, the compiler can optimize the call much better.
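
To make the "struct with a templated operator()" analogy concrete, here is a hand-written stand-in for what a compiler might generate for a self-passing lambda. The names SubtreeSizeDfs and subtreeSize are invented for this sketch; a real closure type is unnamed and its exact layout is up to the compiler.

```cpp
#include <cassert>
#include <vector>

// Rough equivalent of: [&](auto& self, int u, int p) -> int { ... }
struct SubtreeSizeDfs {
    // Plays the role of the by-reference capture of adj.
    const std::vector<std::vector<int>>& adj;

    // "auto& self" in the lambda corresponds to this template parameter.
    template <class Self>
    int operator()(Self& self, int u, int p) const {
        int size = 1;
        for (int v : adj[u])
            if (v != p)
                size += self(self, v, u);  // direct call, target known at compile time
        return size;
    }
};

// Hypothetical helper: size of the tree rooted at root (adj must describe a tree).
int subtreeSize(const std::vector<std::vector<int>>& adj, int root) {
    assert(0 <= root && root < static_cast<int>(adj.size()));
    SubtreeSizeDfs dfs{adj};
    return dfs(dfs, root, -1);
}
```

Since Self is deduced as SubtreeSizeDfs itself, the recursive call is an ordinary member function call with no type erasure in between, which is the property the comment above attributes to lambdas.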

3 years ago, # ^ |

+22

Ok, didn't know about this, thanks! I have two questions:

1. What is behind auto then, if it's not std::function?
2. Does this higher overhead of std::function really matter in practice? (i.e. can it actually cause my solution to get TLE in a contest problem?)

Thallium54

3 years ago, # ^ |

1. To my knowledge, it's a unique unnamed class type, so you can only declare it with auto.
2. Probably not, but I don't like writing the function parameters twice, once in std::function and once in the lambda.
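
A small sketch of the "unique unnamed class type" point, assuming C++17. The function name demoClosureTypes is made up; the static_asserts check the claims at compile time.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

inline int demoClosureTypes() {
    auto a = [](int x) { return x + 1; };
    auto b = [](int x) { return x + 1; };

    // Two textually identical lambdas still have two distinct closure types.
    static_assert(!std::is_same_v<decltype(a), decltype(b)>);
    // Neither closure type is std::function, which is a separate wrapper class.
    static_assert(!std::is_same_v<decltype(a), std::function<int(int)>>);

    // Storing a lambda in std::function is a conversion into the type-erased wrapper.
    std::function<int(int)> f = a;
    assert(f(1) == a(1));

    return a(1) + b(1) + f(1);
}
```

This is also why a vector of lambdas cannot be declared directly: there is no single element type to name, so std::function (or a function pointer for captureless lambdas) has to stand in.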

3 years ago, # ^ |

+10

I guess it might matter in problems like this, where you are asked to run a DFS with a recursion depth of up to $$$10^6$$$. It shouldn't make a huge difference, though, as long as the problem authors are reasonable with time limits.

3 years ago, # ^ |

+13

It worked 143040776 XD

3 years ago, # ^ |

+21

As far as overhead is concerned, using std::function got me TLE once, so I stay away from it as much as possible. Similar things happen when you use std::function in segment tree implementations and so on.

3 years ago, # ^ |

+12

So is it safe to use it for simple stuff that only makes $$$O(n)$$$ recursive calls?

3 years ago, # ^ |

+10

Personally speaking, I don't use std::function for competitive programming purposes at all, but your mileage may vary. If $$$n$$$ is not too large, $$$O(n)$$$ recursive calls might be fine.

prashant_th18

3 years ago, # ^ |

-10

I used to use std::function, but for a few reasons (like not being able to set default arguments), I was told to use y_combinator instead, although it needs some extra code. What are your views on it? Should we use it?

3 years ago, # ^ |

y_combinator is just a wrapper for another way of getting something similar to the "recursive" lambda trick, so it should be fine. In most cases it will be as fast as the recursive lambdas (the only issue I know of is that sometimes those functions don't get inlined, but I have never heard of anyone getting TLE using that).
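
For reference, a sketch of what such a wrapper commonly looks like, assuming C++17. This is modeled on the widely shared snippet and may differ from the exact version the comment above refers to; subtreeSizeY is a hypothetical usage example.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>
#include <utility>
#include <vector>

// Wraps a callable and passes a reference to the wrapper as its first argument,
// so the callable can recurse without the caller writing dfs(dfs, ...).
template <class Fun>
class y_combinator_result {
    Fun fun_;

public:
    template <class T>
    explicit y_combinator_result(T&& fun) : fun_(std::forward<T>(fun)) {}

    template <class... Args>
    decltype(auto) operator()(Args&&... args) {
        return fun_(std::ref(*this), std::forward<Args>(args)...);
    }
};

template <class Fun>
decltype(auto) y_combinator(Fun&& fun) {
    return y_combinator_result<std::decay_t<Fun>>(std::forward<Fun>(fun));
}

// Hypothetical usage: subtree size of a tree given as adjacency lists.
int subtreeSizeY(const std::vector<std::vector<int>>& adj, int root) {
    assert(0 <= root && root < static_cast<int>(adj.size()));
    // The explicit "-> int" lets the recursive call's type be resolved.
    auto dfs = y_combinator([&](auto self, int u, int p) -> int {
        int size = 1;
        for (int v : adj[u])
            if (v != p)
                size += self(v, u);  // no need to pass self again
        return size;
    });
    return dfs(root, -1);
}
```

The call site reads like normal recursion, at the cost of the boilerplate above. Like the self-passing lambda, it involves no type erasure.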

tht2005

3 years ago, # |

I think memset is faster than assigning in a loop: memset(a, 0, n * sizeof *a);
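
A sketch of the two ways of clearing only the first n elements, with invented helper names (clearWithMemset, clearWithFill). No timing claim is made here; which one is faster depends on the compiler and optimization level.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Zeroes the first n ints of a by writing raw bytes.
void clearWithMemset(int* a, std::size_t n) {
    assert(a != nullptr || n == 0);
    if (n > 0)
        std::memset(a, 0, n * sizeof *a);
}

// Zeroes the first n ints of a element by element.
void clearWithFill(int* a, std::size_t n) {
    assert(a != nullptr || n == 0);
    std::fill(a, a + n, 0);
}
```

One caveat: memset writes bytes, so it only works as a shortcut for values whose bytes are all the same, such as 0. For any other fill value, std::fill is the safe choice.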

tfg

3 years ago, # ^ |

+55

Yes, but if that's the difference between AC and TLE and your whole solution is O(N), then you can just curse the problem setter and move on :)
