This is obvious but easy to overlook. Calling ios_base::sync_with_stdio(0); cin.tie(NULL); speeds up cin and cout, but it does not make cerr any faster: cerr is unit-buffered, so it flushes after every output operation. Hence, if you print too many debug statements to cerr, you might get TLE for seemingly no reason.
References: https://mirror.codeforces.com/contest/2128/submission/331216699, https://mirror.codeforces.com/contest/2128/submission/331215429
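To see why the fast I/O lines don't help cerr, here is a minimal sketch that checks the unitbuf flag on a stream. The helper names enable_fast_io and is_unit_buffered are made up for illustration; only the standard library calls are real.

```cpp
#include <iostream>

// Hypothetical helper: the usual fast I/O setup from the post.
inline void enable_fast_io() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(nullptr);
}

// Hypothetical helper: true if the stream flushes after every output
// operation, i.e. its unitbuf format flag is set.
inline bool is_unit_buffered(const std::ostream& os) {
    return (os.flags() & std::ios_base::unitbuf) != 0;
}
```

After calling enable_fast_io(), is_unit_buffered(std::cerr) should still be true, while is_unit_buffered(std::clog) should be false. That per-write flush is the likely reason heavy cerr logging costs so much time.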
EDIT: clog is faster and serves a similar purpose, since it also writes to stderr. However, it is buffered rather than flushed after every write, so it might not produce output if the program doesn't end normally (for example, on a crash).
https://mirror.codeforces.com/contest/2128/submission/331238365







