Блог пользователя roycf123

Автор roycf123, история, 15 месяцев назад, По-английски

Today, I was asked in an interview to build a data structure as follows:

Let there be some elements and some groups. Each element associated to 'exactly 1' group has a score. The data structure must support the following operations:

  1. insert(el_id,grp_id,x): Insert element with id el_id with a score x to group with group_id grp_id
  2. set(el_id,x): change the score of element with id el_id to x.
  3. set(grp_id,x): change the score of all elements in the group with id grp_id to x.
  4. print(grp_id): print the max score element's id in that group. (Return any if multiple exist)

Constraints:

1 <= no_of_elements <= 1e6
1 <= no_of_groups <= 5
1 <= score <= 5

I couldn't solve it during the interview and also couldn't think of any solution later. Would someone please help?

  • Проголосовать: нравится
  • +12
  • Проголосовать: не нравится

»
15 месяцев назад, # |
  Проголосовать: нравится +10 Проголосовать: не нравится

Is it possible for one element to be in multiple groups? If so, does each element just have one score, or does each element-group association have its own score?

  • »
    »
    15 месяцев назад, # ^ |
      Проголосовать: нравится +8 Проголосовать: не нравится

    No that is not possible, each element will be associated to one group only.

    • »
      »
      »
      15 месяцев назад, # ^ |
      Rev. 2   Проголосовать: нравится +4 Проголосовать: не нравится

      It is possible to support each operation in amortized constant time with linear memory, even if the number of groups is much larger. (I'm treating the maximum score as a constant, since it's only 5.)

      For each group, we'll maintain a bucket queue, which is a type of priority queue where each score has a "bucket", in this case a circular doubly linked list. Each bucket will start with a sentinel node to make implementation easier.

      • For insert, add the element's node to its group's bucket with the corresponding score.

      • For the element version of set, remove the element's node from its group's bucket with the corresponding score.

      • For the group version of set, splice all of the other buckets onto the end of the bucket with the corresponding score.

      • For print, scan the buckets in reverse order until you find one that isn't empty and return its first element.

      Operations 3 and 4 are linear in the number of buckets, but again, we're treating that as a constant.

      Implementation

      For an interesting challenge, see if you can figure out how to achieve amortized logarithmic time complexity per operation even when both the number of groups and the maximum score can be large.

»
15 месяцев назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Auto comment: topic has been updated by roycf123 (previous revision, new revision, compare).

»
15 месяцев назад, # |
  Проголосовать: нравится +1 Проголосовать: не нравится

Not a very efficient approach but i think it should work: Think of all groups to be max heaps containing the elements. Also remember each elements index in its respective heap or group after each operation. Inserting element in group would simply be heap push operation O(logn), Changing score of element would be using the elements index in its heap and doing decreaseKey() operation in the respective heap 0(logn), changing all elements in a group would be changing each value in heap to O(group size) and printing max score of a group would be heap.top() O(1).

  • »
    »
    15 месяцев назад, # ^ |
    Rev. 4   Проголосовать: нравится 0 Проголосовать: не нравится

    Sorry I read the comment in a hurry...

    This is exactly what I did (except for the use of std::set instead of heaps, I performed search operation and changed for 2), But the set(grp_id,x) would take O(group_size) which can be O(1e6) in the worst case, so may be costly (in terms to time taken)...

    Is there any way to do it faster, that is any data structure that may support this form of group update fast?

    P.S: I think there may be a better way to utilize the constraint 1 <= score <= 5, although not sure...

    • »
      »
      »
      15 месяцев назад, # ^ |
        Проголосовать: нравится +3 Проголосовать: не нравится

      Timestamps for operations set(el_id,x) and set(grp_id,x). Store changed values of operation set(grp_id,x) for groups instead of elements. Changing all elements in a group could be done in O(1). Each element has two values: one belongs to itself (determined by operations insert(el_id,grp_id,x) and set(el_id,x)), the other belongs to its group (determined by operation set(grp_id,x)). Choose the one that has bigger timestamp when do other operations in the heap.

»
15 месяцев назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

a self balancing binary tree? where each node represents a group which is a binary tree of elements, all of the operations can be done in O(log(n) + log(m)) ... set(grp_id,x) will take O(log(n) + m).

  • »
    »
    15 месяцев назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Yeah same as before, many people use std::multiset instead of std::priority_queue and it does no harm in most cases.