multiset is bad! - Codeforces

→ Обратите внимание

До соревнования
CodeTON Round 9 (Div. 1 + Div. 2, Rated, Prizes!)
42:19:19
Зарегистрироваться »

*есть доп. регистрация

→ Лидеры (рейтинг)

№	Пользователь	Рейтинг
1	tourist	4009
2	jiangly	3823
3	Benq	3738
4	Radewoosh	3633
5	jqdai0815	3620
6	orzdevinwang	3529
7	ecnerwala	3446
8	Um_nik	3396
9	ksun48	3390
10	gamegame	3386

Страны | Города | Организации

Всё →

→ Лидеры (вклад)

№	Пользователь	Вклад
1	cry	167
2	Um_nik	163
3	maomao90	162
3	atcoder_official	162
5	adamant	159
6	-is-this-fft-	158
7	awoo	157
8	TheScrasse	154
9	Dominater069	153
9	nor	153

Всё →

→ Найти пользователя

→ Прямой эфир

Детальнее →

Блог пользователя FAT_

multiset is bad!

Автор FAT_, история, 12 месяцев назад, По-английски

What do you think of the time complexity of the following code?

multiset<int> S;
for (int i = 0, x; i < n; ++i) {
    cin >> x;
    S.insert(x);
}

for (int i = 0, x; i < n; ++i) {
    cin >> x;
    cout << S.count(x) << " ";
}

For me, I thought it should be $$$O(n \log n)$$$ until yesterday, because I thought STL implemented it by adding a field count to set, but actually this code can be as bad as $$$O(n^2)$$$, because the time complexity of S.count(x) is $$$O(\log n + k)$$$ instead of $$$O(\log n)$$$, where $$$k$$$ is the number of occurrences of x in S.

The solution is to use map in this case:

map<int, int> M;
for (int i = 0, x; i < n; ++i) {
    cin >> x;
    M[x] += 1;
}

for (int i = 0, x; i < n; ++i) {
    cin >> x;
    cout << M.count(x) ? M[x] : 0 << " ";
}

I've stepped into this trap in yesterday's educational round, my solution of D was hacked because of this. I could become master if I didn't made that mistake :(

c++, multiset

+121

FAT_
12 месяцев назад
22

Комментарии (22)

Написать комментарий?

arvindr9

12 месяцев назад, # |

← Rev. 2 →

+89

I guess this shouldn't be too surprising, since you're counting occurrences of an item (and a multiset is effectively a balanced binary search tree with duplicates allowed). One way to do this with a treap, is to perform two split operations so that one of your treaps has all elements equal to $$$x$$$, and then outputing the size. I'd imagine a C++ STL BBST generally wouldn't have that kind of functionality.

Edit: I think post brings up a good point since .count() is often used to check the existence of something, so this is a trap people should watch out for (credit to bitset for bringing this up to me).

→ Ответить

lOthBroK286

12 месяцев назад, # ^ |

+10

if it was set (not multiset), I think using count to see if it's there or not is totally fine.

→ Ответить

-0.69

12 месяцев назад, # |

← Rev. 2 →

+19

multiset is overpowered don’t say anything bad about my love >~<

→ Ответить

0x81

12 месяцев назад, # |

← Rev. 2 →

-13

I thought..., but...

I thought, that std::stack.pop() returns last (removed) element.

C++ users, why std::stack.pop_top() doesn't exist? Do you really like writing 2 func_calls instead of 1?

→ Ответить

chromate00

12 месяцев назад, # ^ |

Because that is how optimization works. If we just copy from top() as a reference, that turns out as one copy. However if we were to make pop_top(), we would no longer have that element we needed as a reference. So we need to copy once first, and return that, and outside the function there is another copy. Do we like using 2 copies instead of 1? Probably not.

→ Ответить

ivan100sic

12 месяцев назад, # ^ |

+27

Excessive copies can be solved by using std::move. The function can be easily implemented in the library like this:

auto pop_top() {
  auto __r = std::move(top());
  pop();
  return __r;
}

RVO will ensure that no copies are ever made.

→ Ответить

a_ban_donon

12 месяцев назад, # |

-20

I always use .count as it's O(N).

→ Ответить

Gassa

12 месяцев назад, # |

+39

Yeah, a common trap.

A way to think about it is: why doesn't multiset just add a count for equal items? Recall that a multiset can store arbitrary objects with arbitrary comparison operation, not just integers. For example, you can have a class for contest participants, and use a multiset to order participants by their current rating. But they are still individual participants, each with their handle and other data. You can't replace the whole class by just the value you use to compare them, and count the occurrences of that value.

If you wish a count, you can make that wish explicit in your program by using map (value -> count).

→ Ответить

peltorator

12 месяцев назад, # |

+68

Notice that proving that it works in $$$O(k + \log n)$$$, and not in $$$O(k \cdot \log n)$$$ is a nice little exercise to check your understanding of balanced binary search trees.

→ Ответить

G.J

12 месяцев назад, # ^ |

← Rev. 2 →

-6

Can't we just say that we are applying binary search to find the pointers, lower_bound pointer and upper_bound pointer (in O(lg(n)), and then we are just finding the distance between the pointers, since the set is not indexed we can only do this in O(k) where k is the actual distance (number of occurrences of the element). So, it will be O(lg n + k).

Q1 : We can do this in O(lg(n)) using ordered_set (which can tell the index; the number of elements less than this element)?

Kindly share your proof / s as well.

→ Ответить

peltorator

12 месяцев назад, # ^ |

← Rev. 2 →

+16

How would you find a distance between two pointers in $$$O(k)$$$? That's exactly the question. I claim that in general for an std::set it is not possible. One needs to go to the next pointer one-by-one, and each step takes $$$O(\log n)$$$ time. So why is it $$$O(k + \log n)$$$, and not $$$O(k \cdot \log n)$$$?

→ Ответить

G.J

12 месяцев назад, # ^ |

← Rev. 2 →

-14

It takes O(k), resource : https://www.geeksforgeeks.org/stddistance-in-c/

Its written in the time complexity that O(1) for those which have random access and else O(n).

Actually, I don't think each step takes O(lg(n)).

→ Ответить

peltorator

12 месяцев назад, # ^ |

+37

I wouldn't believe everything that is written on geekforgeek. I am pretty sure it takes $$$O(k + \log n)$$$ time. Because, for example, it++ takes $$$O(\log n)$$$ time, and not constant. And well, that's exactly what my question is: WHY it works in this time :)

→ Ответить

chromate00

12 месяцев назад, # ^ |

← Rev. 3 →

Would this be a good summary of the proof?

Spoiler

→ Ответить

peltorator

12 месяцев назад, # ^ |

I think you're right, yes. But the fact that the size of this subtree is limited is exactly the place where you need to apply some knowledge of how BBSTs work.

→ Ответить

chromate00

12 месяцев назад, # ^ |

← Rev. 2 →

Honestly I would have argued that the size of that subtree is limited to $$$\mathcal{O}(dist+height)$$$ for any general BST. It's just that the painful case exists such that the height is $$$\mathcal{O}(n)$$$, where $$$a$$$ and $$$b$$$ exists on the rightmost node of the left subtree and the leftmost node of the right subtree, so $$$size=\mathcal{O}(dist)$$$ simply does not hold here.

UPD: Just came up with idea to prove this. A walk from $$$a$$$ to $$$b$$$ is very structured, in fact it is really just Subtree->Path->Subtree. The path is of length $$$\mathcal{O}(height)$$$, and the two subtrees are of size $$$\mathcal{O}(dist)$$$. So this proves it.

→ Ответить

Prajwalrayal

12 месяцев назад, # |

+12

Whenever I had to check if an element was present in a set I used .count() instead of find() and one day I used the same on a multiset and got TLE, I also learned it the hard way :)

→ Ответить

ritesh_gupta97

12 месяцев назад, # |

-9

You may use lower bound (or/and upper bound) for counting in multiset. That will be in log n (hopefully I'm correct)

→ Ответить

MatrixGroup

12 месяцев назад, # ^ |

+17

Recall that STL can't calculate the distance of two iterators of a (multi)set in constant time.

→ Ответить

ritesh_gupta97

12 месяцев назад, # ^ |

Phh sorry and thank you. I need to look it up.

→ Ответить

Prajwalrayal

12 месяцев назад, # ^ |

Counting can be done using a map but I tried to act smart by using count as find function to avoid !=s.end() though I still use count instead of find in sets

→ Ответить

S3cr3t_C0d3r

12 месяцев назад, # |

Indeed .

→ Ответить

Соревнования по программированию 2.0

Время на сервере: 21.11.2024 23:15:42 (h1).

Десктопная версия, переключиться на мобильную.

При поддержке