YF_YUSUF's blog

By YF_YUSUF, 8 months ago, translation, In English

A few hours ago, the hacking phase for the round (Div. 3) ended and system testing began. I wanted to check how many solutions failed the system tests, and I saw a huge number of TLEs on problem C. After looking at several of those solutions, I realized they all had one thing in common: unordered_map.
I think many people know that unordered_map (hereafter UM) can perform each operation in $$$O(1)$$$ time on average but $$$O(n)$$$ in the worst case. Because of this, some hackers crafted test cases that force UM into its $$$O(n)$$$ behavior, which makes the overall complexity $$$O(n^2)$$$.
It got me wondering why so many people use UM when a regular map passes comfortably within the time limits. I asked ChatGPT to solve this problem, and it gave me a solution using UM.

ChatGPT's code

On the one hand, it’s kind of funny that people who copied from ChatGPT (or another LLM) got their solutions rejected.
On the other hand, there are people who wrote the code themselves but decided to use UM instead of map (although I think that’s their own fault because they could have just used a regular map).


»
8 months ago

This is hilarious, and good entertainment too. That is, you go to the status page, pick problem C, and filter the TLE submissions, only to see that the contestants behind them follow... a certain pattern, you know.

»
8 months ago

bruh i was one of those who wrote their solution with UM, but i got MLE instead. can someone explain to me why i got MLE? this is my code. i also wrote a version in C++ and it failed with MLE again. after a while i used vectors instead, with $$$O(n)$$$ space and $$$O(n \log n)$$$ time complexity

this is my submission 333342133

  • »
    »
    8 months ago

    the map should have used $$$O(2n + \text{some constant factor})$$$ space, right?

    • »
      »
      »
      8 months ago

      I'm guessing it's because when you iterate with

      for i in range(k // 2 + 1):
              if i == 0 or (k % 2 == 0 and i == k // 2):
                  if array_mod_k_a[i] != array_mod_k_b[i]:
                      print("NO")
                      return
              else:
                  if array_mod_k_a[i] + array_mod_k_a[k - i] != array_mod_k_b[i] + array_mod_k_b[k - i]:
                      print("NO")
                      return
      

      and i doesn't exist in your defaultdict, you probably create an entry in it.

      Edit: I wanted to add that k can be up to 10^9+7. So you're creating up to (10^9+7)/2 entries in your dict, which needs far more than the 256 MB allowed.

      • »
        »
        »
        »
        8 months ago

        why am i creating these entries? shouldn't the dictionary only create $$$n$$$ entries?

        • »
          »
          »
          »
          »
          8 months ago

          Calling array_mod_k_a[i] will create an entry if it doesn't exist.

          From: https://docs.python.org/3/library/collections.html#defaultdict-examples

          When each key is encountered for the first time, it is not already in the mapping; so an entry is automatically created using the default_factory function which returns an empty list.

          (the example this is talking about is using a defaultdict of lists. In your case it would be 0 instead of an empty list)

          So e.g. when you hit if array_mod_k_a[i] + array_mod_k_a[k - i] != array_mod_k_b[i] + array_mod_k_b[k - i]: in your loop, if array_mod_k_a[i] does not exist, then it gets created.

          Since you're iterating over i = 0..k//2, you create an entry for every value in that range.

          This creates around 5·10^8 entries, which takes far more than 256 MB; in Python that's likely gigabytes of memory.
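          A minimal sketch of the effect (hypothetical function names, defaultdict of ints as in the submission): reading d[i] on a miss silently inserts a zero entry, while d.get(i, 0) returns the default without inserting anything.

```python
from collections import defaultdict

def probe_with_getitem(k):
    # d[i] on a missing key calls the default factory and INSERTS d[i] = 0,
    # so merely *reading* k//2 + 1 keys grows the dict to k//2 + 1 entries.
    d = defaultdict(int)
    for i in range(k // 2 + 1):
        _ = d[i]
    return len(d)

def probe_with_get(k):
    # d.get(i, 0) returns the default without inserting anything,
    # so the dict stays empty no matter how many keys we probe.
    d = defaultdict(int)
    for i in range(k // 2 + 1):
        _ = d.get(i, 0)
    return len(d)
```

          With k up to 10^9+7 the first version materializes hundreds of millions of entries, which is exactly what blows the memory limit; the second touches none.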

          • »
            »
            »
            »
            »
            »
            8 months ago

            thx very much. i'm actually stupid: all this time i thought i was iterating over n, not over k. i brainlagged too much during the contest

            • »
              »
              »
              »
              »
              »
              »
              8 months ago

              Yeah, I originally did that too; you can see it in my 7 wrong submissions for that problem.

»
8 months ago

Honestly, I was scared even with a normal map: looking at the time it took during the contest, I seriously considered resubmitting. Thank god I didn't.

»
8 months ago

I believe this will pass if you sort the array beforehand.

»
8 months ago

bro, I guess there are stupid people like me who still use UM. I know in the worst case it takes $$$O(n)$$$, but some of us are victims of our own carelessness. Though I do think there should have been a test for this during the contest itself :(

»
8 months ago

I am a beginner, so I don't know if I have a place to speak here lulw, but I use unordered_map all the time whenever I need something like a hashmap. Why is it preferred to use a normal map over unordered_map? Does the sorting provide some sort of a powerup?

  • »
    »
    8 months ago

    The reason map is preferred over unordered_map is that when you use umap[x], it applies the hash function (by default, std::hash) to find the bucket for x and returns the value if the key exists. But if there are hash collisions (several keys landing in the same bucket), a single lookup can take up to $$$O(n)$$$ to resolve. Carefully selected keys can push the overall time complexity as high as $$$O(n^{2})$$$.
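    The same weakness is easy to see in CPython, whose integer hash is just as predictable as GCC's default std::hash<int> (this is an analogy, not the C++ attack itself): every multiple of sys.hash_info.modulus hashes to the same value, so an attacker can precompute as many colliding keys as they like.

```python
import sys

# CPython's hash for non-negative ints is x mod sys.hash_info.modulus
# (2^61 - 1 on 64-bit builds), so it is fully predictable, much like
# GCC's std::hash<int>, which is simply the identity function.

def colliding_keys(n):
    # n distinct integers that all share hash value 0.
    m = sys.hash_info.modulus
    return [i * m for i in range(1, n + 1)]
```

    Feed enough such keys into a hash table with a predictable hash and every lookup degrades to a linear scan of one bucket.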

    • »
      »
      »
      8 months ago

      Ah ok, so the ordered map doesn't have this problem then? If so, shouldn't the normal ordered map be used in every situation?

      • »
        »
        »
        »
        8 months ago

        std::map takes $$$O(\log n)$$$ time per operation, where $$$n$$$ is the number of elements in the map.

        The reason people (and ChatGPT) use std::unordered_map is that it does $$$O(1)$$$ operations on average, so it is generally faster than std::map, except when hash collisions happen. Some people also don't know about hash collisions, so they don't use a custom, stronger hash, which leads to collision attacks against GCC's default std::hash in std::unordered_map.

        Read this blog on std::unordered_map for preventing most hash collisions. Basically, if you use

        struct custom_hash {
            static uint64_t splitmix64(uint64_t x) {
                // http://xorshift.di.unimi.it/splitmix64.c
                x += 0x9e3779b97f4a7c15;
                x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9;
                x = (x ^ (x >> 27)) * 0x94d049bb133111eb;
                return x ^ (x >> 31);
            }
        
            size_t operator()(uint64_t x) const {
                static const uint64_t FIXED_RANDOM = chrono::steady_clock::now().time_since_epoch().count();
                return splitmix64(x + FIXED_RANDOM);
            }
        };
        

        and declare your map as unordered_map<int, int, custom_hash>, hash collisions will become much less likely.
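        The FIXED_RANDOM trick above is the same idea CPython uses for strings: the hash is salted with a per-process random value (controllable via PYTHONHASHSEED), so colliding keys can't be precomputed ahead of time. A small sketch of that behavior (the helper name is mine):

```python
import os
import subprocess
import sys

def str_hash_with_seed(seed):
    # hash('pwn') as seen by a fresh CPython process with a fixed hash seed.
    # Different seeds give different hashes for the same string, which is
    # the same defense as salting with FIXED_RANDOM in the C++ snippet:
    # an attacker can't precompute colliding keys without the run-time salt.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.run(
        [sys.executable, "-c", "print(hash('pwn'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(out.stdout)
```

        Within one process (one seed) the hash is stable, but across runs with different seeds it changes, so a test case that collides today won't collide tomorrow.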

»
8 months ago, hide # |
 
Vote: I like it +10 Vote: I do not like it

No human would ever use unordered_map.reserve in a contest

  • »
    »
    8 months ago

    Why not? If I'm using unordered_map, it means I'm already concerned about performance, and reserve might speed up the code significantly with no downsides, so there's practically no reason (at least none I'm aware of) not to use it.

  • »
    »
    8 months ago

    Why not? It prevents rebuilding the hash table as the map grows, so some contestants may well think that calling reserve improves performance.

  • »
    »
    7 months ago

    Why do you think so? That method doesn't seem to be anything unusual. With an approximate understanding of how UM works and of what unordered_map.reserve does, UM can become a powerful enough tool.

»
8 months ago

wait until GPT starts adding random time-based hash function to its code XD