roycf123's blog

By roycf123, history, 16 months ago, In English

Today, I was asked in an interview to build a data structure as follows:

Let there be some elements and some groups. Each element associated to 'exactly 1' group has a score. The data structure must support the following operations:

  1. insert(el_id,grp_id,x): Insert element with id el_id with a score x to group with group_id grp_id
  2. set(el_id,x): change the score of element with id el_id to x.
  3. set(grp_id,x): change the score of all elements in the group with id grp_id to x.
  4. print(grp_id): print the max score element's id in that group. (Return any if multiple exist)

Constraints:

1 <= no_of_elements <= 1e6
1 <= no_of_groups <= 5
1 <= score <= 5

I couldn't solve it during the interview and also couldn't think of any solution later. Would someone please help?

  • Vote: I like it
  • +12
  • Vote: I do not like it

»
16 months ago, # |
  Vote: I like it +10 Vote: I do not like it

Is it possible for one element to be in multiple groups? If so, does each element just have one score, or does each element-group association have its own score?

  • »
    »
    16 months ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    No that is not possible, each element will be associated to one group only.

    • »
      »
      »
      16 months ago, # ^ |
      Rev. 2   Vote: I like it +4 Vote: I do not like it

      It is possible to support each operation in amortized constant time with linear memory, even if the number of groups is much larger. (I'm treating the maximum score as a constant, since it's only 5.)

      For each group, we'll maintain a bucket queue, which is a type of priority queue where each score has a "bucket", in this case a circular doubly linked list. Each bucket will start with a sentinel node to make implementation easier.

      • For insert, add the element's node to its group's bucket with the corresponding score.

      • For the element version of set, remove the element's node from its group's bucket with the corresponding score.

      • For the group version of set, splice all of the other buckets onto the end of the bucket with the corresponding score.

      • For print, scan the buckets in reverse order until you find one that isn't empty and return its first element.

      Operations 3 and 4 are linear in the number of buckets, but again, we're treating that as a constant.

      Implementation

      For an interesting challenge, see if you can figure out how to achieve amortized logarithmic time complexity per operation even when both the number of groups and the maximum score can be large.

»
16 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by roycf123 (previous revision, new revision, compare).

»
16 months ago, # |
  Vote: I like it +1 Vote: I do not like it

Not a very efficient approach but i think it should work: Think of all groups to be max heaps containing the elements. Also remember each elements index in its respective heap or group after each operation. Inserting element in group would simply be heap push operation O(logn), Changing score of element would be using the elements index in its heap and doing decreaseKey() operation in the respective heap 0(logn), changing all elements in a group would be changing each value in heap to O(group size) and printing max score of a group would be heap.top() O(1).

  • »
    »
    16 months ago, # ^ |
    Rev. 4   Vote: I like it 0 Vote: I do not like it

    Sorry I read the comment in a hurry...

    This is exactly what I did (except for the use of std::set instead of heaps, I performed search operation and changed for 2), But the set(grp_id,x) would take O(group_size) which can be O(1e6) in the worst case, so may be costly (in terms to time taken)...

    Is there any way to do it faster, that is any data structure that may support this form of group update fast?

    P.S: I think there may be a better way to utilize the constraint 1 <= score <= 5, although not sure...

    • »
      »
      »
      16 months ago, # ^ |
        Vote: I like it +3 Vote: I do not like it

      Timestamps for operations set(el_id,x) and set(grp_id,x). Store changed values of operation set(grp_id,x) for groups instead of elements. Changing all elements in a group could be done in O(1). Each element has two values: one belongs to itself (determined by operations insert(el_id,grp_id,x) and set(el_id,x)), the other belongs to its group (determined by operation set(grp_id,x)). Choose the one that has bigger timestamp when do other operations in the heap.

»
16 months ago, # |
  Vote: I like it 0 Vote: I do not like it

a self balancing binary tree? where each node represents a group which is a binary tree of elements, all of the operations can be done in O(log(n) + log(m)) ... set(grp_id,x) will take O(log(n) + m).

  • »
    »
    16 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Yeah same as before, many people use std::multiset instead of std::priority_queue and it does no harm in most cases.