GP of Udmurtia 2018-2019

Блог пользователя dragoon

Автор dragoon, история, 8 лет назад, По-английски

Let's discuss solution

opencup, opencup2018-2019

dragoon
8 лет назад
37

Комментарии (37)

Написать комментарий?

_LeMur_

8 лет назад, скрыть # |

+13

How to solve D?

→ Ответить

dragoon

8 лет назад, скрыть # ^ |

← Rev. 3 →

+18

For D, we did a matrix exponentiation on number of different graphs (for n=5, there were only 34 different graphs, link). However, it was still somewhat slow to raise power for every query and precomputing "power of 2" matrices was not enough. So the idea was instead of doing (A(B(CD)))v we did (A(B(C(Dv)))) which is qlogn*n^2 instead of qlogn*n^3. And that was sufficient to get AC (My teammate Bruteforceman mentioned this blog).

→ Ответить

krijgertje

8 лет назад, скрыть # ^ |

← Rev. 3 →

+15

You can normalize each state. Then for n=5 there only remain 34 states, 13 of which are unconnected. If you know the probability distribution of states at time l, you can use matrix exponentiation to get the answer, since 14³·q·log(r - l) fits within the time limit (one extra state for 'having been in any connected state').

Unfortunately 34³·q·log(l) seems too slow to use matrix exponentiation directly for getting this probability distribution. Instead, using matrix exponentiation again, calculate the probability of exactly k edges flipping in l steps (only 11 states so it again fits within the time limit). Now iterate over the (unnormalized) state at l and using previous result it is relatively straightforward to compute the probability that this will be the state at l.

Combining these two steps got me AC in 373ms.

→ Ответить

dragoon

8 лет назад, скрыть # ^ |

Can you elaborate: <<calculate the probability of exactly k edges flipping in l steps (only 11 states so it again fits within the time limit)>>?

Don't you need to know the exact edges? Or it's like, for each bitmask of edges, you check what's the probability only these will flip?

→ Ответить

krijgertje

8 лет назад, скрыть # ^ |

If you want to go from one state to another state, you know how many edges should flip and using the matrix exponentiation you get to know the probability of exactly this many edges flipping. To get the probability of this specific set of edges flipping, divide by $\text{[math]}$ , since each set of edges of the same size has the same probability of being flipped.

→ Ответить

AryssonFigueiredo

8 лет назад, скрыть # ^ |

Could you elaborate how to maintain the "extra state for having been in any connected state"?

→ Ответить

TooDumbToWin

8 лет назад, скрыть # |

How to solve E?

→ Ответить

SirirNicheBirirDokan

8 лет назад, скрыть # ^ |

+18

Generate a million binary strings and pray.

→ Ответить

dragoon

8 лет назад, скрыть # ^ |

← Rev. 3 →

Solved by Bruteforceman. The main idea is: suppose you would like to know what is at i'th position. Then query for:

[i, j], [i+1, j]
[i, j+1], [i+1, j+1]
[i, j+2], [i+1, j+2]

and so on. Say the reply to the first query is X and to the second query is Y. Now, the probability for both being valid is 1/4. Also, you may assume that if: Y = {X, X-1} then may be (X, Y) are valid. So record, X-Y. If you see number of 0 or 1 going over say 2 or 3, then you know that by high probability the answer is that number (0 or 1).

→ Ответить

alex9801

8 лет назад, скрыть # ^ |

+18

Is there any proof for probability bound in this solution? I think the chance of getting AC in one try is quite low, considering that there were 150 test cases :(

→ Ответить

rng_58

8 лет назад, скрыть # ^ |

It's important to use long intervals: When i < 500, use [i, 1000], [i, 999], ..., and when i > 500, use [1, i], [2, i], ....

→ Ответить

alex9801

8 лет назад, скрыть # ^ |

+18

That is exactly what I submitted, but got WA..

→ Ответить

TooDumbToWin

8 лет назад, скрыть # ^ |

Same. Got WA on test 33.

→ Ответить

Kaban-5

8 лет назад, скрыть # ^ |

← Rev. 10 →

-10

Waiting for 4 evidences accumulating either to the side of 0 or to the side of 1 was enough to get AC (but 3 is still not enough, 5 is also OK and 6 exceed the query limit).

UPD This is wrong as pointed by --Pavel-- below (I also locally stress tested the solution and achieved the same results).

UPD 2 Interestingly enough, I also made a mistake when accounting for query caching. Actually it is not so wrong, at least for 4 (but 5 already exceeds query limit and 3 sometimes identifies the string incorrectly).

Code: https://pastebin.com/xHKKWFSX Said code passes 1000 random tests with less than 17000 queries used on each. Note that linked code does not ask queries till the end of the string, but rather on substrings of length ≈ 400.

UPD 4 Removed all statements about weak tests, actually it was just me counting queries incorrectly.

→ Ответить

--Pavel--

8 лет назад, скрыть # ^ |

← Rev. 2 →

+38

For waiting for 4 evidences you need ≈ 31950 queries according to my local stress test. Indeed for each 0 ≤ i < 1000 you need to wait for 4 appearances of event with probability $\text{[math]}$ . So average waiting time is $\text{[math]}$ . 2 queries for each step and we get 1000·16·2 = 32000. So even 32000 should not be enough.

→ Ответить

rng_58

8 лет назад, скрыть # ^ |

To get the i-th position you call [i, 1000], [i, 999], ... and [i + 1, 1000], [i + 1, 999], .... Then to get the i + 1-th position you call [i + 1, 1000], [i + 1, 999], ... and [i + 2, 1000], [i + 2, 999], .... This means that many queries can be shared and the number of queries is roughly halved.

→ Ответить

--Pavel--

8 лет назад, скрыть # ^ |

+30

Thanks, that's a good point. It was stupid, I forgot to cache queries in local test, did it only in server version. But I still getting ~ 18700 queries.

→ Ответить

rng_58

8 лет назад, скрыть # ^ |

+30

Wait until the difference of number of evidences becomes 3. In each attempt you get a correct one with probability 1 / 4, and a wrong one with probability 1 / 1000. This means that the failure probability for a single position is about (1 / 250)³ and the failure probability is about 1 / 15625. https://ideone.com/jOqZgM

→ Ответить

dacin21

8 лет назад, скрыть # ^ |

← Rev. 2 →

+13

Let DP[a][b] be the probability that we succed if we're waiting for a pieces of good evidence or b pieces of bad evidence. With a single query, we either get good or bad evidence or nothing. Let p_g be the probability of getting good evidence and p_b the probability of getting bad evidence. Then

$\text{[math]}$

$\text{[math]}$ . If we choose to query the longer side, $\text{[math]}$ . ( $\text{[math]}$ to get at least one random value and $\text{[math]}$ for the last random value to match.) Computing this DP gives us (code).

n	1 - DP[n][n]	1000·(1 - DP[n][n])
1	0.00596421	5.96421
2	0.000106291	0.106291
3	2.10265e-06	0.00210265
4	4.36567e-08	4.36567e-05

By a union bound, the probability to fail at all is at most 1000 times the probability to fail at any point, so 1000·(1 - DP[n][n]).

So it seems like waiting for 2 pieces of evidence is not enough, but waiting for 3 pieces should work at least half of the time if we assume 250 test cases (by another union bound).

The solution where you wait for a difference of at least 3 should be evaluatable with a Markov chain instead of DP. I might do this after dinner.

Edit: Markov chain code. Table:

n	fail[n]	1000·fail[n]
1	0.00596421	5.96421
2	3.59987e-05	0.0359987
3	2.16e-07	0.000216
4	1.296e-09	1.296e-06

So here n = 3 works at least 95% of the time if we assume 250 test cases.

→ Ответить

--Pavel--

8 лет назад, скрыть # ^ |

+10

Our solution which passed stress testing:

Let's guess the second half of the string. For each $\text{[math]}$ ask for K values: [0, r], [1, r], ..., [K - 1, r]. For each $\text{[math]}$ consider multiset $\text{[math]}$ . Sort all pairs by decreasing of frequency of the most frequent value in $\text{[math]}$ .

Now we found some correct differences of some prefix sums. So go through these pairs in sorted order and join indices in DSU (like in Kruskal's algorithm) until we got $\text{[math]}$ components. To get remaining $\text{[math]}$ edges just ask for value in some positions, like in other solutions waiting for 4 evidences. Then do dfs on generated tree and find out all $\text{[math]}$ .

We used $\text{[math]}$ .

→ Ответить

rng_58

8 лет назад, скрыть # ^ |

+45

Let me summarize the solution for E. Many people are downvoting descriptions of solutions, but they are correct and I have a complete proof for it. However, various details are important in the proof and slight difference of algorithm may result in significant difference of number of queries / failure probability.

We decide letters one by one. Basically, to decide the i-th letter, we choose some j, and compute the difference of [i, j] and [i + 1, j]. For a single choice of j, it returns the correct answer with probability at least 1 / 4. So, if we repeat this for various j and get some statistics, it's likely that we get the correct value of s_i.

Suppose that s_i = 0. As we see above, for any j, the probability that we get the evidence of s_i = 0 (i.e., Query[i, j] = Query[i + 1, j]) is 1 / 4. What is the probability that we get the evidence of the opposite thing (i.e., accidentally get Query[i, j] - Query[i + 1, j] = 1)? To keep this probability small, it's important to use long intervals. When j - i ≥ 500, this probability turns out to be about 1 / 1000 (by a simple calculation). There's no such j when i is big, in this case we should use Query[j, i] - Query[j, i - 1] for some j.

Since this is probabilistic, we need to try various js. How many js should we try in particular? Imagine a lottery that gives one of "A" or "B" with probability 1 / 4, the other letter with probability 1 / 1000, and "?" in all other cases. We want to decide which of "A" and "B" is more frequent by drawing this lottery several times. One natural way is to repeat drawing the lottery until the difference between (the number of "A"s) and (the number of "B"s) becomes 3. Again, some computation shows that with this strategy, we can almost always find the correct one: the failure probability is about (1 / 250)³. (This is equivalent to Gambler's ruin. See the section "Unfair coin flipping"). Note that, another natural way is to stop when you get the same letter three times: it has a worse error probability.

If we use the strategy above for all positions, since in each position we fail with probability (1 / 250)³, the total failire probability per testcase is about 1000 * (1 / 250)³ = 1 / 15625. Thus this algorithm will get AC with about 99% probability even with 150 testcases.

What is the number of queries? Again, the detail of the choice of j matters. One possible way is to use j = 1000, 999, 998, ... when j < 500 and use j = 0, 1, 2, ... otherwise. For a single i, we draw the lottery about 12 times on average. This is because in 12 attempts the expected difference between the number of two letters is about 3. Thus, for small i, we send queries for intervals [i, 1000], [i, 999], ..., [i, 989] and for large i, we send queries for [0, i], ..., [11, i]. Note that, when we try to get s_i we use queries [i, j] and [i + 1, j], and when we try to get s_i + 1 we use queries [i + 1, j] and [i + 2, j]. We should memoize the answer for [i + 1, j] and do not ask it multiple times. In total, roughly speaking, we need 12000 queries on average. Of course, the number of attempts for a single i is not always exactly 12 because this is a probabilistic parameter. This raises the number of queries a bit, but still we are comfortable with the limit of 18000.

My code: https://ideone.com/jOqZgM

→ Ответить

Nyatl

8 лет назад, скрыть # ^ |

+28

I'm the author of this problem and I understand that I failed and set too low query limit.

We had too much problems with contest preparation and I asked my team to solve problem E and write solution that you described. Their solution has different constant 6 and uses not more than 18000 queries. But only after contest I realized that they use some optimization. The solution only asks segments with length up to 500. So almost every segment is used four times.

→ Ответить

Petr

8 лет назад, скрыть # ^ |

+25

We didn't think about reusing queries, so we came up with an approach that requires less queries. First, for the first BUBEN numbers, do your solution, but without any memorization (I think we had BUBEN=300). This does on the order of 24 queries per item as you describe. Then, when we know all numbers from 0 to i-th, we can ask queries [j, i+1] to determine (i+1)-th number, since we know the sum in segment [j, i] already. So now we have probability of success 1/2 instead of 1/4, and for each attempt we need 1 query instead of 2. The segments are a bit shorter, though, so to be safe we waited for 5 votes here.

→ Ответить

dragoon

8 лет назад, скрыть # |

+10

How to solve F, G, H?

→ Ответить

AlexFetisov

8 лет назад, скрыть # ^ |

+15

My solution for F: Let's build the sorted list of the players one by one. To find the correct position for the next player we use binary search. That gives us 1000 * log operations which is about what we need.

Now the question is how to compare two players in the binary search? We have 5 groups: higher then 1st, 1st, 2nd, 3rd, lower then 3rd. Let's take any 2 people and ask if one is lower than the other. If we have no answer then we swap them and ask again. You can see that only for (1st, 2nd) and for (1st, 3rd) that will not give you the order. For others you will have the answer and complete the binary search iteration. If during the binary search we failed to insert the element, let's save it for later.

Now in the end we have one or two saved elements and the sorted list. I iterate from the lowest to highest asking the order and that's how we can find 1st-3rd person in the list. Then insert our saved for later there in any order and just sort those 3. That is easy — ask all 6 questions and it's enough to find the order.

I liked the problem personally, really nice. Kudos to the author.

→ Ответить

aid

8 лет назад, скрыть # ^ |

+10

Wow, this sorting algorithm is faster than merge sort. Only $\text{[math]}$ comparisons instead of $\text{[math]}$ ! I solved it with mergesort and thought that this problem is just about casework. Really nice that it has such an awesome solution.

→ Ответить

majk

8 лет назад, скрыть # ^ |

+10

My solution: Pick a guy, let's call him zero. Ask zero about everyone, and everyone about zero. This way, you know whether zero guy is first, second, third (I call these medal guys), above first or below third. Furthermore, you know the categorisation of all the remaining guys (except that sometimes you cannot distinguish a medal guy from either above or below guys). Write quicksort to sort the above and below guys. For the medal guys, you sometimes need special handling:

For instance, when the zero guy is above first, then all the medal guys behave the same with respect to him. Sorting the medal and below together cannot distinguish the medal guys properly, but they will be sorted as the top three in the "below" list. You just ask 6 queries, and one of them will identify second and third guy.
If zero guy is third, you cannot distinguish second from above first, so you sort them together. The second guy will naturally end up last.
and so on

Then it's fairly standard:

Waste a lot of time on implementation.
Get WA on test 49.
Randomise the selection of zero guy.
Get WA on test 41.
Realise contest ends in 5 minutes.
Submit the same code 10 times.
Get AC.

→ Ответить

ppavic

8 лет назад, скрыть # ^ |

I have a simmilar soluton, but we I pick the pivot I only ask the pivot about everyone and not also everyone about the pivot. It requires some extra casework. I actually had a bug in a very special case. And it didn't always work even on the sample but it was so rare I decieded to submit. Firsr submir, WA on 49. Submit the exact same code once more because the solution is randomized, AC :D.

→ Ответить

dacin21

8 лет назад, скрыть # ^ |

G: When deciding where to move, try the following in order (r is some value depending on your dice, described below.)

Move to (0, 0).
Move to a point from which you can reach (0, 0) with a roll of r.
Move directly toward (0, 0) if you're currently to far away.
Move away from (0, 0).

As for dice selection, there are a few possible issues

If we get too few or too low dice, then we spend many moves doing (3).
If we get too many dice, we spend many moves doing (2), waiting to roll r.
If r is too small and we roll some big value, we have to do (4) instead of (2), wasting the next move.

Two dice dice selection strategies that work:

Get four dice with an average of at least 8000 and set r to the sum of maximums. (Needs around $\text{[math]}$ moves to get into distance r and around 6⁴ = 1296 moves to finish which leaves some leeway for random variations.)
Get seven dice each having a duplicate value that is at least half the maximum and set r to the sum of duplicate values. (Needs around $\text{[math]}$ moves to get into distance r and around 3⁷ = 2187 moves to finish. This a bit risky if you get too low dice and can be improved by rejecting dice with an average below 5000.)

(I ignored the time taken to get the dice in my analysis, should be around 1'000 for the first and 7'000 for the second strategy in total, but the influence of this is halved as we're already moving toward (0, 0) during this time.)

→ Ответить

aid

8 лет назад, скрыть # |

+47

In the middle of the contest after 2 WAs on E our team decided to open pdf statements and found out that there is query limit in all interactive problems. :(

→ Ответить

krijgertje

8 лет назад, скрыть # |

+16

I stared for ages at why my codes for E and F gave wrong answer. I just found out that there is a query limit, which was not mentioned in the 'one-by-one' problem statement, but was only mentioned in the pdf which you can get by clicking on 'download statements'.

To be honest, I found it kind of strange that there was no query limit, so I read the statement (many times) letter for letter, but did not think to look in the pdf. Well, now I now better for next time.

→ Ответить

redjohn

8 лет назад, скрыть # |

How to solve A?

→ Ответить

dragoon

8 лет назад, скрыть # ^ |

First the slow solution:

DP(pos, set_of_taken_digits):
ret = 0
for i < pos:
  if digit[i] == digit[pos]:
    ret += DP(i - 1, set_of_taken_digits)
if digit[pos] is not in set_of_taken_digits:
  ret += DP(pos - 1, set_of_taken_digits Union digit[pos])
return ret

If you think a bit, you will be able to optimize the solution. First of all, set_of_taken_digits is a 10 digit bitmask. So, we will store dp[i] for each 0 <= i < len(S) where dp[i] is a 2^10 length array storing dp value for each bitmasks.

To compute dp[pos], we need dp[pos-1] (for the second if) and sum of dp[i] where S[i — 1] == S[pos]. So, we only need two 2^10 array (dp of i-1, and dp of i) and 10 2^10 arrays (for each digit).

→ Ответить

helThazar

8 лет назад, скрыть # |

+16

Does somebody know whether div2 upsolving is already enabled?

→ Ответить

rsFalse

8 лет назад, скрыть # |

Div.2 problem I was the easiest one. To compose lexicographically minimal string of zeroes and ones with given occurrences of '00' .. '11'. First idea was to write code with many "if ... else ...". Later I tried to partly bruteforce (--> spoiler). But what was the most simply or most elegant solution?

Spoiler: partly bruteforce

→ Ответить