noobnocap's blog

By noobnocap, history, 3 months ago, In English

unordered_map lets us access any element in O(1), so it is very handy for some problems. But sometimes it gives TLE, because the worst-case time per operation is O(n). How can I spot this beforehand and know that the hash function will generate heavy collisions?


»
3 months ago, # |
  Vote: I like it +1 Vote: I do not like it

You need to know:

  1. The test cases
  2. The hash function you are using

Then check whether those test cases generate heavy collisions under that hash function.
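If you can reproduce the keys locally, one way to check is to insert them and measure the fullest bucket. This is only a minimal sketch: `longest_chain`, `constant_hash` and `multiples_of_bucket_count` are hypothetical names made up for illustration. The idea that multiples of the bucket count all collide assumes that `std::hash` for integers is the identity function, which is the case in GCC's libstdc++ but is not guaranteed by the standard.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Size of the fullest bucket after inserting `keys`: the longest chain
// that a single lookup may have to walk.
template <class Hash>
std::size_t longest_chain(const std::vector<long long>& keys) {
    std::unordered_map<long long, int, Hash> m;
    m.reserve(keys.size());  // fix the bucket count before inserting
    for (long long k : keys) m[k] = 1;
    std::size_t worst = 0;
    for (std::size_t b = 0; b < m.bucket_count(); ++b)
        worst = std::max(worst, m.bucket_size(b));
    return worst;
}

// Hypothetical worst-case hash, for illustration only: every key collides.
struct constant_hash {
    std::size_t operator()(long long) const { return 0; }
};

// Builds n keys that are all multiples of the bucket count the map picks
// for n elements. With an identity hash (an assumption about the standard
// library in use) these keys are expected to share one bucket.
std::vector<long long> multiples_of_bucket_count(std::size_t n) {
    std::unordered_map<long long, int> probe;
    probe.reserve(n);
    const long long b = static_cast<long long>(probe.bucket_count());
    std::vector<long long> keys;
    for (std::size_t i = 1; i <= n; ++i)
        keys.push_back(b * static_cast<long long>(i));
    return keys;
}
```

Calling `longest_chain<std::hash<long long>>(keys)` on the real input keys gives a rough idea of the risk: a value close to `keys.size()` means lookups degrade to O(n), while a small constant means the hash spreads those keys well.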

»
3 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Just don't use unordered_map. Simple.

»
3 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Who said that unordered_map lets us access any element in O(1)? That is a misconception.

Worst case: O(n) per operation. Best (and average) case: O(1).

So just use map instead, because map takes O(log n) per operation in every case.
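For example, a typical "count pairs with a given sum" lookup works the same way with `std::map`. This is just a sketch, and `count_pairs_with_sum` is a hypothetical helper, not something from the original post:

```cpp
#include <map>
#include <vector>

// Counts pairs (i, j) with i < j and a[i] + a[j] == target.
// Every find/insert on std::map is O(log n), whatever the keys are.
long long count_pairs_with_sum(const std::vector<long long>& a, long long target) {
    std::map<long long, long long> seen;  // value -> how many times seen so far
    long long pairs = 0;
    for (long long x : a) {
        auto it = seen.find(target - x);
        if (it != seen.end()) pairs += it->second;
        ++seen[x];
    }
    return pairs;
}
```

The total cost is O(n log n), and no input can make it worse than that.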

  • »
    »
    3 months ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    Actually, you can still use unordered_map with a custom hash. Please refer to this blog if you want a safe unordered_map: Link. If you don't feel like opening the blog, here is the source code:

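    #include <chrono>
    #include <cstdint>
    #include <unordered_map>
    using namespace std;

    struct custom_hash {
        // splitmix64 mixing step: scrambles the bits so that nearby keys get unrelated hashes.
        static uint64_t splitmix64(uint64_t x) {
            // http://xorshift.di.unimi.it/splitmix64.c
            x += 0x9e3779b97f4a7c15;
            x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9;
            x = (x ^ (x >> 27)) * 0x94d049bb133111eb;
            return x ^ (x >> 31);
        }

        size_t operator()(uint64_t x) const {
            // Offset chosen once per run, so colliding inputs cannot be precomputed.
            static const uint64_t FIXED_RANDOM = chrono::steady_clock::now().time_since_epoch().count();
            return splitmix64(x + FIXED_RANDOM);
        }
    };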
    

    With this custom hash, you can declare a safe unordered_map like this:

    unordered_map<long long, int, custom_hash> safe_map;
    
    • »
      »
      »
      3 months ago, # ^ |
        Vote: I like it +3 Vote: I do not like it

      Even if you do use a custom hash, the constant factor is still way too high to be practical. I'd suggest using gp_hash_table instead.
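      A minimal sketch of what that could look like, assuming GCC (gp_hash_table lives in the `__gnu_pbds` policy-based data structures extension and is not standard C++). The hash is the same splitmix64 idea as in the comment above, and `safe_table` and `occurrences` are hypothetical names used only for this example:

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>
#include <ext/pb_ds/assoc_container.hpp>  // GCC-only header

struct custom_hash {
    static std::uint64_t splitmix64(std::uint64_t x) {
        x += 0x9e3779b97f4a7c15;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9;
        x = (x ^ (x >> 27)) * 0x94d049bb133111eb;
        return x ^ (x >> 31);
    }
    std::size_t operator()(std::uint64_t x) const {
        // Per-run offset so that colliding inputs cannot be prepared in advance.
        static const std::uint64_t FIXED_RANDOM =
            std::chrono::steady_clock::now().time_since_epoch().count();
        return splitmix64(x + FIXED_RANDOM);
    }
};

// Open-addressing hash table from pb_ds, paired with the randomized hash.
using safe_table = __gnu_pbds::gp_hash_table<long long, int, custom_hash>;

// Hypothetical helper: how many times `target` occurs in `values`.
int occurrences(const std::vector<long long>& values, long long target) {
    safe_table freq;
    for (long long v : values) ++freq[v];
    auto it = freq.find(target);
    return it == freq.end() ? 0 : it->second;
}
```

      The custom hash still matters here: a hash table with a predictable hash can be attacked in the same way as unordered_map, so switching the container alone is not a defence.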