Блог пользователя Saucemaster102

Автор Saucemaster102, история, 20 месяцев назад, По-английски

So recently I was interviewing for a big data engineer role, I had cleared the first round which was just DSA, SQL, and discussion on the big data tech that I had worked on, in the second round I came across an interesting problem so I wanted to share it with you all, it was same as 2 sum but i was expected to use map-reduce:

Problem statement:

There are a large number of numbers residing on multiple nodes (storage + commodity compute), you have one main compute server where you can perform your computing, find all distinct pairs which equal to a sum of "target" and the count of those pairs, now you can't load all the numbers on your main compute as you do not have enough storage to do so.

find all x+y=target

The solution the interviewer expected:

approach: Map-Reduce:

send a map program to all the nodes where the map function calculates the following :

(number,count(number))

next is the reduce phase where we use the following hashing algorithm in order to assign the above-calculated info in various nodes into buckets in our main compute

The bucket a given number is assigned to, is calculated as follows: min(number, target-number)

in that way, a pair, if they have a sum equal to target, will be assigned to the same bucket and we can compute our answers easily (edge case for equal numbers but yeah that can be handled somehow)

Now I know this question has a bunch of loop holes but the I hope the intent of the question was understood.

This was the first time I came across such a problem, can someone recommend me resources to learn/practice such problems?

Полный текст и комментарии »

  • Проголосовать: нравится
  • +1
  • Проголосовать: не нравится

Автор Saucemaster102, история, 2 года назад, По-английски

hey guys , a while back I had come across this problem(sorry I don’t have the link to problem statement) , it was basically, given two arrays a and b of size n and m , find the gcd of all the elements of the resultant array c where c is the Cartesian product of a and b , I could not figure out a better approach than O(n*m) Does anyone have a better approach ?

Constraints : n,m<1e+5

Полный текст и комментарии »

Теги gcd
  • Проголосовать: нравится
  • 0
  • Проголосовать: не нравится

Автор Saucemaster102, история, 4 года назад, По-английски

Hey guys im new here and i wanted to know if i participate in virtual contests will i be given rating and also currently i have no rating and i participated in my first contest where i got just one question right, will i get finally get a rating? or is there some threshold i need to cross? Thanks in advance :)

Полный текст и комментарии »

  • Проголосовать: нравится
  • +3
  • Проголосовать: не нравится

Автор Saucemaster102, история, 4 года назад, По-английски

so im new here and i entered my first contest, i started it an hour late and was able to crack only the first problem and after the contest second sum, initially it showed accepted for both these sums, first one under contest submissions and second one under practice submissions, but now suddenly all my submissions are missing for this contest, i dont know where did it go, the first sum especially, it shows i have not submitted any code whereas the second one shows my code is accepted but on the problem page it appears in red and not green and the first one appears as though i have no attempted it

Полный текст и комментарии »

  • Проголосовать: нравится
  • +7
  • Проголосовать: не нравится