Google Hashcode 2021 Practice Round Discussion

→ Обратите внимание

До соревнования
Codeforces Round (Div. 2)
6 дней

→ Лидеры (рейтинг)

№	Пользователь	Рейтинг
1	Benq	3792
2	VivaciousAubergine	3647
3	Kevin114514	3611
4	jiangly	3583
5	strapple	3515
6	tourist	3470
7	dXqwq	3436
8	Radewoosh	3415
9	Otomachi_Una	3413
10	Um_nik	3376

Страны | Города | Организации

Всё →

→ Лидеры (вклад)

№	Пользователь	Вклад
1	Qingyu	161
2	adamant	149
3	Um_nik	146
4	Dominater069	143
5	errorgorn	141
6	cry	139
7	Proof_by_QED	135
7	YuukiS	135
9	chromate00	134
9	TheScrasse	134

Всё →

→ Найти пользователя

→ Прямой эфир

Детальнее →

Блог пользователя TheOneYouWant

Google Hashcode 2021 Practice Round Discussion

Автор TheOneYouWant, история, 5 лет назад, По-английски

Hello everyone,

Since the practice round doesn't seem to provide a scoreboard, I'd like to open this thread to discuss scores and possible approaches for the problem. We got the following scores after some mostly trivial ideas:

A: 65
B: 13,328
C: 702,974,812
D: 7,602,227
E: 10,477,632
Total: 721,068,064

We mostly did some greedies, followed by randomly taking a small subset and computing best answer for that subset. I tried to use max weight bipartite matching but failed to make it work well; I don't have fast codes for max weight general matching, which could have been used to compute "good" pairs of pizzas. Did anyone manage to make this approach work or have a better idea which gave significantly better score?

+134

TheOneYouWant
5 лет назад
28

Комментарии (24)

Показать архивные | Написать комментарий?

tob123

5 лет назад, скрыть # |

← Rev. 2 →

+39

We also did some greedy construction (mostly taking good matching pizzas with many ingredients first) and then used Simulated Annealing to improve it.

A: 49
B: 13,261
C: 714,252,603
D: 7,845,044
E: 10,773,367
Total: 732,884,324

→ Ответить

Priyansh31dec

5 лет назад, скрыть # ^ |

+17

Can you throw some light on what is Simulated Annealing?

→ Ответить

tob123

5 лет назад, скрыть # ^ |

+13

Actually, the greedy part is way more important. In the end, we take the most important groups (e.g. the 50 ones having the pizzas with many ingredients) and just switch two random pizzas of that group as a step. But this Annealing is actually not that much better for us than greedily taking steps that improve our current solution.

→ Ответить

7pablof

5 лет назад, скрыть # ^ |

We also used a greedy algorithm and randomized fine tuning step (simple local search in our case). By considering wasted ingredients in the greedy, we managed to further increase the score on D and E.

A: 74.
B: 13,773.
C: 712,312,932.
D: 8,061,713.
E: 10,988,069.
Total: 731,376,561.

→ Ответить

sharath101

5 лет назад, скрыть # |

We greedily tried picking the next "best pizza" (with most new ingredients) for each team from team of 4 to 2. We later also tried some optimizations but could increase the score by only a million at max. Randomized greedy solutions didn't seem to give a better score either.

A: 49
B: 11,145
C: 706,523,791
D: 7,832,329
E: 10,758,832
Total: 725,126,146

→ Ответить

Shahraaz

5 лет назад, скрыть # ^ |

+16

Disclaimer: this is not my work; I just happened to stumble upon it and felt like it is worth sharing

I found this repository by someone named Sagi Shporer , to add some context his team has been participating in hashcode since 2018, you can see all his repositories here

This year his team's practice round Score was 731,455,475 (soure)

A – Example — 74 points (74 before optimization)
B – A little bit of everything — 13,533 points (12922 before optimization)
C – Many ingredients 712,692,751 points (706,624,572 before optimization)
D – Many pizzas — 7,911,296 points (7,863,102 before optimization)
E – Many teams — 10,837,821 points (10,789,627 before optimization)

Algorithm description:

Phase 1: Build a solution
Phase 2: Optimizations

Phase 1 — Build a solution

Sort pizzas by the number of ingredients.
Build deliveries, first teams of 4, after that 3, after that 2:
2.1 Select the pizza with the most ingredients.
2.2 Select the pizza that will give the best improvement in delivery (most new ingredients, with the least overlapping ingredients).
2.3 Repeat 2.2 until the delivery is ready.

Phase 2 — Optimization

Try to swap 2 pizzas between any 2 deliveries — if it improves the score, make the swap.
Try to swap 1 pizza between any 2 deliveries — if it improves the score, make the swap.
Try to swap a pizza from any delivery with unused pizza — if it improves the score, make the swap.
Try to move 1 pizza between 2 deliveries (# of pizza in the 2 deliveries must be -+1) — if it improves the score, make the swap.
If any improvement performed in 1-4 — go to 1

Notes

Phase 1 takes about 5 seconds to run.
Phase 2 takes about 50 minutes to run with the current restrictions (implemented for D & E which are huge). About 1% score improvement.

Full source code

I hope you found this helpful :D

→ Ответить

Modi_bhagwan

5 лет назад, скрыть # ^ |

How did Phase 1 run so fast for him? Wouldn't the time complexity be quadratic in terms of the number of pizzas (M) ?

→ Ответить

akshitm16

5 лет назад, скрыть # ^ |

← Rev. 2 →

Breaking as soon as you get a pizza with almost new ingredients runs fast in D and E. You can't afford to traverse the whole array for choosing a pizza every-time in D and E.

→ Ответить

ilovecheapthrills

5 лет назад, скрыть # ^ |

Can you explain the reason for 4 to 2. Why starting with teams of size 4 is beneficial?

→ Ответить

sharath101

5 лет назад, скрыть # ^ |

← Rev. 2 →

Since we're greedily picking the pizzas with most ingrediants first, if we start with team of 2, then all of the "good pizzas" will be used up in teams of 2. We dont want that because the score depends on the sum of union of squares of ingredients in each team. And obviously the the distinct ingredients would be more if we use the good pizzas on a larger team size.

We also tried all 6 combinations just to be sure, but 4-3-2 expectedly gave a much better result.

→ Ответить

Priyansh31dec

5 лет назад, скрыть # |

← Rev. 4 →

A — 74 points — This one can be done by hand
B – 13,750 points — Greedy with some obvious brute force in picking.
C – 706,619,049 points — Picking greedily maximum ingredients pizzas first
D – 7,345,043 points — Same with some randomization
E – 10,369,792 points
Total — 724,347,708 points

Edit: Picked up small subsets and took the best possible combination from the sorted list (decreasing order of pizzas). Tried to swap pizzas between 2 deliveries randomly (swap only if it increases the score)... More the size of subsets better the answer... More the number of iterations of swapping pizzas better the answer.

A — 74 points — This one can be done by hand
B – 13,750 points — Greedy with some obvious brute force in picking.
C – 711,922,487 points — Picking greedily maximum ingredients pizzas first
D – 7,783,549 points — Same with some randomization
E – 10,636,376 points
Total — 730,356,236 points

→ Ответить

Priyansh31dec

5 лет назад, скрыть # |

Does having some knowledge of Machine Learning help in Google Hash Code?

→ Ответить

akshitm16

5 лет назад, скрыть # |

← Rev. 3 →

+12

First test set is small. So, you can try everything. For the rest, mostly picking the pizzas with largest ingredients and matching the pizzas such that increase of ingredients is maximum and/or intersection is less. Also, tried partial randomization of the sorted (based on count of ingredients) pizzas array and ran 10-2000 times depending upon the size of test set.

$$$A: 74$$$
$$$B: 12,676$$$
$$$C: 706,624,573$$$
$$$D: 8,061,427$$$
$$$E: 10,988,056$$$
$$$Total: 725,686,806$$$

→ Ответить

codertonk

5 лет назад, скрыть # |

A – 74 points
B – 13,862 points
C – 712,218,689 points
D – 7,923,609 points
E – 10,831,175 points
Total score — 730,987,409 points

→ Ответить

Errichto

5 лет назад, скрыть # |

+61

2.5-hour live stream with final score of around 730.7 million — https://youtu.be/BD57-3Zt5r4

Final code: https://github.com/Errichto/youtube/blob/master/hashcode/2021_practice_pro.cpp

→ Ответить

fsshakkhor

5 лет назад, скрыть # |

+19

Total: 732,415,324 points

A : None

B : 13,439 points

C : 713,419,440 points

D : 8,027,794 points

E : 10,954,651 points

Our Approach:

In our approach we tried to ensure two things mainly.

We tried to make some big deliveries with as many as ingredients possible. The reason is — the score is calculated based on square of the number of unique ingredients. So one big delivery can be far better than a few average deliveries.
Sometimes the pizzas in a delivery can have so many intersecting ingredients. So we substracted some cost for each intersecting elements.

Then we greedily tried to make deliveries starting with pizzas with most ingredients. Also we tried to make the deliveries in 4-3-2 team order which produced the best result. Lastly we tried to replace some of the pizzas with unused ones to check if it produces better result.

→ Ответить

BitSane

5 лет назад, скрыть # |

← Rev. 2 →

I mostly used some greedy code (picking the best every time), along with some randomization in the bigger files.
- A: 74
- B: 12,015
- C: 705,472,464
- D: 7,748,466
- E: 10,674,444
- Total: 723,907,463

→ Ответить

kinhosz

5 лет назад, скрыть # |

← Rev. 2 →

A. 74
B. 13,692
C. 703,144,369
D. 7,199,154
E. 9,771,415
Total: 720,128,704

→ Ответить

7pablof

5 лет назад, скрыть # |

← Rev. 2 →

A: 74.
B: 13,880.
C: 714,128,859.
D: 8,062,943.
E: 10,988,318.
Total: 733,194,074.

Our final approach, as most of the solutions discussed in this blog, consists in a two step process:

Greedily building a solution (based on maximizing score and also minimizing wasted ingredients).
Performing randomized improvements (i.e., simulated annealing).

In our case, improvements achieved by the second setp were negligible on most datasets (<1k), except for dataset C, which jumped from 706.6 to 714.1 million.

We would like to share some highlights on running times for dataset C:

712 million in 10 minutes (also achievable with hill climbing).
712.5 million in 30 minutes.
713.3 million in 2 hours.
714.1 million in 18 hours.

→ Ответить

ilovecheapthrills

5 лет назад, скрыть # ^ |

Great Score!

On which machine did you run your codes for 18 hours?

→ Ответить

7pablof

5 лет назад, скрыть # ^ |

Thanks! All the tests were executed my regular laptop (i7 @ 2.5GHz).

→ Ответить

PanicStation

5 лет назад, скрыть # |

Simulated Annealing without any greedy initialization got me to 719,012,213.

→ Ответить

PanicStation

5 лет назад, скрыть # ^ |

Pure Simulated Annealing gets 704,123,568 in C. Seems like greedy is the way to go.

→ Ответить

bully....maguire

5 лет назад, скрыть # |

I don't know if i should be ashamed of myself , but i think just implementing input and output was more difficult than solving any heavy implementation problem on CF.

→ Ответить

Соревнования по программированию 2.0

Время на сервере: 11.05.2026 00:12:50 (f1).

Десктопная версия, переключиться на мобильную.

При поддержке