Блог пользователя Fyind

Автор Fyind, история, 3 года назад, По-английски

Problem: 1864F. Exotic Queries

The time limit is 4s. The input size is $$$n,q \le 10^6$$$. The standard solution uses a segment tree and the input/output size is very large. The sync of cin,cout with stdio is turned off in all implementations if cin,cout is used.

Cin,cout vs Fast IO, Scanf on C++20 (64)

  • 220904724 Array implementation with cin,cout on C++20 (64) -> TLE
  • 220952057 Array implementation with Fast IO on C++20 (64) -> 1762 ms
  • 220950571 Array implementation with Scanf on C++20 (64) -> 3181 ms
  • 220950753 Pointer implementation with Scanf on C++20 (64) -> 3868 ms
  • 220947347 Pointer implementation with cin,cout on C++20 (64) -> 1872ms
  • 220954316 Pointer implementation with Fast IO on C++20 (64) -> 2277ms

C++17 (32) vs C++20 (64)

  • 220904724 Array implementation with cin,cout on C++20 (64) -> TLE
  • 220904724 Array implementation with cin,cout on C++17 (32) -> 1621ms
  • 220947347 Pointer implementation with cin,cout on C++20 (64) -> 1872ms
  • 220953695 Pointer implementation with cin,cout on C++17 (32) -> 1934ms

Pointer implementation vs Array implementation

  • 220904724 Array implementation with cin,cout on C++20 (64) -> TLE
  • 220947347 Pointer implementation with cin,cout on C++20 (64) -> 1872ms

Some guessed conclusions

These facts makes no sense to me. Based on the facts, I guess

  • It's better to use Pointer implementation if you use cin,cout on C++20 (64)
  • If you prefer array implementation, use Fast IO or scanf on C++20 (64)
  • The 32/64 bit compiler has some influence on cin,cout performance

Do you guys have some idea why such cases happen? Or did I implement something improper so that it reduced the efficiency?

UPD: Reason of TLE

Thanks areke for pointing out the reason, and related comment and blog.

The thing is, when using cin in a 64-bit compiler, do cin in a separate loop.

Specifically, do

vector<int> vec(n+1);
for (int i = 1;i <= n; ++i) {
    cin >> vec[i];
}
for (int i = 1;i <= n; ++i) {
    pos[vec[i]].push_back(i);
}

instead of

for (int i = 1;i <= n; ++i) {
    int x; cin >> x;
    pos[x].push_back(i);
}

This modification speeds up the code from TLE to 1216ms !!! 221383761

  • Проголосовать: нравится
  • +42
  • Проголосовать: не нравится

»
3 года назад, скрыть # |
Rev. 3  
Проголосовать: нравится 0 Проголосовать: не нравится

Today, I noticed same thing.

In this problem on using cin/cout I am getting TLE. But if I use fast I/O i.e. scanf/printf my solution gets accepted.

Accepted code

Getting TLE

Accepted Solution

220982107 TLE 220982253 AC

TLE Code Accepted

»
3 года назад, скрыть # |
 
Проголосовать: нравится 0 Проголосовать: не нравится

Your rmq is recursive

»
3 года назад, скрыть # |
Rev. 2  
Проголосовать: нравится 0 Проголосовать: не нравится

Array (actually vectors) and cin worked fine for me.

  • »
    »
    3 года назад, скрыть # ^ |
     
    Проголосовать: нравится 0 Проголосовать: не нравится

    Hi, thank you for your segment tree implementation and your submission. I only replaced your segment tree implementation in my old submission. Unfortunately, it got TLE. 221057129. Maybe you got accepted because your implementation in other parts is different from mine and somehow has low constant?

»
3 года назад, скрыть # |
 
Проголосовать: нравится +5 Проголосовать: не нравится

Strange, shouldn't array segtree version be faster than pointers segtree one?

»
3 года назад, скрыть # |
 
Проголосовать: нравится +4 Проголосовать: не нравится

See this comment.

It's related to https://mirror.codeforces.com/blog/entry/99038.

If you read in the input separately from pushing back to the pos array, the implementation only takes 1216ms: https://mirror.codeforces.com/contest/1864/submission/221383761.