Is Hashing Using Vector better than using Unordered Map?

So I was stuck in this problem 2053B - Outstanding Impressionist for a long time; finally I am able to solve this problem a little while ago. I have used hashing technique to solve it. Firstly, let me explain how I have solved it. (If you do not want to hear me chattering about my solution approach, then skip the next two paragraphs.)

I have noticed that whenever l_i and r_i are equal, it is guaranteed that there is a w_i with the value of l_i (or r_i) exists in the impression. If the same equal value of l_i and r_i appear again, that means there must be multiple instances of the same value of w_i is present in the impression, which means the ith piece of the impression must not be unique; otherwise, it will be unique as all the other pieces can be set to different values other than that of w_i. An easy way to implement this logic would be to use hashing and store the frequency of those values of w_i such that l_i = r_i. If the frequency of w_i is greater than 1, then w_i is not unique; and if it is 1, then it is unique.

Now what happens when l_i and r_i are not equal? In that case, we have to check if there exists any v such that l_i ≤ v ≤ r_i and for any 1 ≤ j ≤ n if l_j = r_j, then v ≠ l_j. It is because if such v exists, then we can set all the values of the other pieces of impression not equal to v and set w_i to equal to v and hence, w_i will be unique. Now how to implement this logic? At first, I have stored every possible values of w_i in increasing order in a vector which are not guaranteed to be present in the impression. (Only when l_i = r_i, a piece of impression of the value w_i = l_i is guaranteed to be present in the impression.) I can do this by iterating from 1 to 2n and check whether the value is already present (or have frequency greater than 0) in the hash array/map which was previously created to store the frequency of w_i which was guaranteed to be present. It is possible because of the constraints 1 ≤ l_i ≤ r_i ≤ 2n and l_i ≤ w_i ≤ r_i and n ≤ 2.10⁵, which means to do this, I have to do at most 4.10⁵ operations since the largest value of r_i or w_i is 4.10⁵. After that, I have used binary search to look for the smallest value equal to or greater than l_i in the newly created vector (let this value be c) and checked if c is equal to or less than r_i or not. If c is equal to or less than r_i, then we can set the value of w_i to be c and the other pieces of impression to be not equal to c and hence, w_i will be unique. Otherwise, w_i will not be unique.

So this was all the logic and idea I came up with to solve the problem. At first, I implemented the hashing using unordered map and the verdict I received was TLE on testcase 12. Here is the submission 322353823. After receiving TLE, I thought since my code has time complexity of O(nlogn) for using binary search n times, I have to somehow get rid of this binary search and do the searching in constant time. But I did not know how to do so and ended up with the notion that my approach was not right. Then for some reason, I decided to implement the hashing using vector. I did it because I thought although unordered map takes constant time on average to insert and update its elements, a hash array created using vector also takes constant time to do the same and also not on average but always. After doing so, my solution got accepted 322354822. So today I have learned a new lesson that hash array created using vector is faster than unordered map. But my question is: Why is there a significant gap in performance between these two methods? Although unordered map is slower, shouldn't it also pass the testcases? Since the accepted solution takes 93 ms to pass the 12th testcase and the time limit is 1s, does that mean vector is more than 10 times faster compared to unordered map? And when the input is in 10⁶ range, should I always implement hashing using vector? When would using unordered map be appropriate? These are some questions that have arisen in my head. Any sort of help would be appreciated.

Comments (16)

Write comment?

__flyrex__

12 months ago, hide # |

← Rev. 3 →

yess, i also faced the same problem in this question. your priority for using should be unordered_map < ordered map < vector. unordered map gives tle due to collisions(O(n^2)), i have faced similiar problem a lot of times and now always use ordered map instead of unordered map. the extra log n tc will very rarely give tle and vector is just the best cuz its straight up O(n) even in worst case

→ Reply

Raising_Heaven

12 months ago, hide # ^ |

Yes, you are right. Solved it with map just now and the solution is accepted. I used to use unordered_map because of that average O(1) complexity but I guess there can always be a test case where it does not work in O(1) complexity. So I guess map should be prioritized over unordered_map. Now I wonder if it is the same with set and unordered_set also.

Shuvo_06

← Rev. 2 →

Vector provides constant time, O(1) access, and an unordered_map also offers O(1) access on average. So in theory, unordered_map should be just as fast as vector—and it usually is, under normal circumstances.

However, this performance relies on a key assumption: that the number of hash collisions per item remains low, on average O(1). This assumption typically holds true for random input data. But when the input is not random—especially if it's crafted maliciously, as in the hacking phase of a contest—this assumption breaks down. If someone knows the hash function used, they can deliberately create many different keys that hash to the same value. This results in a high number of collisions and degrades performance to O(n²) and boom, TLE.

You can refer to thia blog, this is very insightful : blowing up unordered_map

The link to the blog is not working.

i have updated the link, can you try again?

Working now.

m_manasc

-19

I don't get why people downvote such posts, he just wants help if someone knows something regarding the topic, they may answer the query, otherwise they can simply ignore or upvote for reach, But instead nowdays people just downvote such blogs, He has a asked a genuine question and many begginers like me will get help if such questions are answered. Plz stop such toxity in community.

This comment was kind of needed for me. I was a bit upset with all the downvotes.

avighnakc

Just don't use unordered_map. Has never worked out well in my experience

TalkySafe143

321707794 vs 321707920

I'm aware of the $$$O(n^2)$$$ worst case, just is useful sometimes.

I am a bit confused now. Unordered map is sometimes better than map?

gametophyte

Yes, In most cases Unordered map with a custom hash (SplitMix64) can prevent hash collisions, take a look at my submission https://mirror.codeforces.com/contest/2053/submission/322464603

And for most use cases gp_hash_table is faster than unordered_map

gp_hash_table with Splitmix64 : 125ms https://mirror.codeforces.com/contest/2053/submission/322468193

unordered_map with Splitmix64 : 233ms https://mirror.codeforces.com/contest/2053/submission/322468508

(Extra) cc_hash_table without Splitmix64 : 125ms https://mirror.codeforces.com/contest/2053/submission/322468148

But I still recommend gp_hash_table with Splitmix64 for most use cases

Update1 : Some cases where you are only hashing int or long long normal gp_hash_table<K,V> will be faster

In average time complexity is faster, like this comment remark.

Unordered map have a time complexity of $$$\Theta(1)$$$ but also $$$O(n)$$$ in the most important (all?) operations, and Map have a $$$O(\log n)$$$ time complexity in their operations. (Note the notation detail)

But you can use a custom hash as this comment suggest to make the number of collisions $$$O(1)$$$ and also make the operations $$$O(1)$$$.

You can read more about hashing and how unordered map works in this Stanford slides from Slide #41 if its unclear to you

Fair enough, fair enough. I guess it was inaccurate of me to lay out something that general as a ground rule. Whenever I need unordered_map-like features, personally I've always just tried to rewrite it without one.

In your example, I suppose just noticing that each number is a[i] shifted right a bunch of times, what I'd personally do is just replace it with an iterative $$$O(n \log{n})$$$ that stores all valid values, std::uniques them and goes from small values to large values. I haven't actually tested whether this would be faster in this case. Might not.

Anyways, thanks for providing me with an example where unordered_map is just outright faster than map.

Sounds good, I'll test this out, thank you!

#	User	Rating
1	Benq	3792
2	VivaciousAubergine	3647
3	Kevin114514	3611
4	jiangly	3583
5	strapple	3515
6	tourist	3470
7	Radewoosh	3415
8	Um_nik	3376
9	maroonrk	3361
10	XVIII	3345

#	User	Contrib.
1	Qingyu	162
2	adamant	148
3	Um_nik	146
4	Dominater069	143
5	errorgorn	141
6	cry	138
7	Proof_by_QED	136
8	YuukiS	135
9	chromate00	134
10	soullless	133

Raising_Heaven's blog