Блог пользователя eatmore

Автор eatmore, история, 4 года назад, По-английски

Google Code Jam 2021 has been announced. Here is the schedule:

The location of the finals will be announced later.

The dates of other Google competitions, Hash Code and Kick Start, has also been announced.

  • Проголосовать: нравится
  • +118
  • Проголосовать: не нравится

»
4 года назад, # |
Rev. 2   Проголосовать: нравится +82 Проголосовать: не нравится

congratulations gennady for winning

»
4 года назад, # |
  Проголосовать: нравится +18 Проголосовать: не нравится

Will their be system testing after wards?

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    A few of the problems have hidden testcases that will be revealed after the contest. (In particular, the second and fourth problem.)

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      If i fail at that hidden test case will i get any score as i have passed test1 and test2.

      Extra test will be added or not?(like codeforces)

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Please provide the setting (if possible) to change tabs character width, I like the editor otherwise. Of course, I don't complain that it doesn't support snippets :p

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

I think the hidden 1 point is the most expensive point to get for the amount of extra work required. Agree?

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +29 Проголосовать: не нравится

    Somewhat agree, but there is actually a nice solution to P2 that works for all subtasks without any extra work at all :)

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +16 Проголосовать: не нравится

      Yeah I didn't even know there was a greedy idea that worked only for positive weights till I opened the comments here.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +3 Проголосовать: не нравится

    I didn't get a solution that works for the P2 but doesn't work for the hidden 1 point

    • »
      »
      »
      4 года назад, # ^ |
      Rev. 4   Проголосовать: нравится -117 Проголосовать: не нравится

      Deleted

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится +25 Проголосовать: не нравится

        OMG, contest is still running, don't give any hint. Delete this reply now.

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится +1 Проголосовать: не нравится

          I thought the discussion of the qualification round was allowed?

          Google Code Jam Rules state that

          Sharing information with other contestants is permitted in the Qualification round only. Notwithstanding Section 7.1(E) of the Terms, you will not be disqualified for using information from or sharing information with others about problems during the Qualification Round of Code Jam.

»
4 года назад, # |
Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

:)

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

How many point do we need to qualify? Is it decided after?

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

I'm getting this error in interactive problem on my laptop: "judge: Case #71 failed: Couldn't read a valid line". I checked and it's actually EOFError. I submitted my solution anyway and it got 18 pts (as of right now), so I guess it's correct, more or less. So why such error?

»
4 года назад, # |
  Проголосовать: нравится +13 Проголосовать: не нравится

Are you sure that the start and end time for qualification round are correct?

The end time in the post is after ~15 hour. While Codejam page shows that the remaining time is only ~8 hours.

Please check this and update the post if there is an error. Some people may miss the round due to that.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +8 Проголосовать: не нравится

    According to Code Jam Rules 3.1,

    Code Jam will start with the online Qualification Round on Friday, March 26, 2021 at 13:00 UTC and run for thirty (30) hours, ending on Saturday, March 27, 2021 at 19:00 UTC.

    So this post writes a wrong start and end time of qualification round.

    By the way, schedule of other rounds are consistent with the rules and schedule page of Code Jam.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +30 Проголосовать: не нравится

    Google changed the schedule of the qualification round after this post was published. I updated the post.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится -44 Проголосовать: не нравится

      Sir, please extend the deadline for Google Codejam, I registered for the event and didn't got any mail and thus missed the round, now just 7 minutes left and I saw this post on Codeforces and then came to know that the round is almost over, please extend it now. Please

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится +14 Проголосовать: не нравится

        Bruh, it doesn't work like that. It is the world's most prestigious competitive programming championship, not gully cricket.

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        How did your extended Google Codejam round go? :P

»
4 года назад, # |
Rev. 2   Проголосовать: нравится +4 Проголосовать: не нравится

I think Google might need to apply a cheating detection algorithm to the Cheating Detection problem

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    To be fair, Cheating Detection is solvable by even someone with no competitive coding background if they think about the Math behind it for some time and just try out ideas that seem like they might work.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +8 Проголосовать: не нравится

      I agree, however the pattern of submissions in the last 2-3 hours was pretty suspicious. Literally hundreds of new entrants submitting all 5 questions at the same time, in a very short time frame.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +23 Проголосовать: не нравится

    Note that discussing solutions is actually allowed for this round:

    7.1 Qualification Round. Sharing information with other contestants is permitted in the Qualification round only. Notwithstanding Section 7.1(E) of the Terms, you will not be disqualified for using information from or sharing information with others about problems during the Qualification Round of Code Jam.

    https://codingcompetitions.withgoogle.com/codejam/rulesandterms

»
4 года назад, # |
Rev. 4   Проголосовать: нравится +17 Проголосовать: не нравится

I've no proof to my solution of Cheating Detection but I did something which looked good to me and it worked :)

I took all the players having $$$>=5000$$$ score and sorted them in non increasing order of the solves. Assuming no cheating, I thought the adjacent players will have less mismatches because of similarity in value of $$$S$$$. Mismatch in a particular question is when score of player $$$i$$$ is different from score of player $$$j$$$. Calculated mismatch for all players. For players which are not on the end points, I took the average value.

I defined a function $$$f_i = a \cdot mismatch_i + b \cdot solves_i $$$.

I took the player having the maximum $$$f$$$ because there is good chance for the cheater to have good amount of solves and good amount of mismatches from its adjacents.

Tried bunch of values for $$$a$$$ and $$$b$$$ and it passed both sub-tasks with $$$a=5$$$ and $$$b=4$$$.

Solution (bit messy)

Sadly, I missed $$$3rd$$$ sub-task of Moons and Umbrellas and couldn't get perfect score. I should've tested it properly :(

»
4 года назад, # |
Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

How to "prove" the full solution to P5? I just tried an intuitive idea hoping it would pass Test Set 1 and got AC on both Test Sets.

I'm assuming there are several different approaches that works, so I'm going to roughly state my solution:

Let the easyness of a question be the number of people who solve it.

Then I just select the participant with the max variance of easyness over problems they solve.

I think its pretty intuitive why this is a fairly decent metric, but how do you even begin to prove that it will work on average at least 86% of the time with random data?

Was there some other idea that was easy to prove?

Code
»
4 года назад, # |
Rev. 2   Проголосовать: нравится +26 Проголосовать: не нравится

Problems solutions:

A
B
C
D
E
  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Similar in E, divide success rate in 10% hardest questions by overall success rate and whoever has the highest is the cheater.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    In C's solution

    "If you reversed from last to beginning, you can reverse it again from beginning to last."

    Can someone please explain it a bit more

    Thanks!

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      For example, if we have initial array:

      1,2,3,4,5

      If we reversed 4th index till last:

      1,2,3,5,4

      If we reversed 3rd index till last:

      1,2,4,5,3

      We have an array of: 1,2,4,5,3

      Now, if we reversed from 3rd index till last:

      1,2,3,5,4

      If we reversed from 4th index till last:

      1,2,3,4,5

      We returned to the original array. Meaning, that this operation, is reversible so the problem turned from finding an array with C to find how many reverses you need to reach C(so it became really much easier like the first problem).

»
4 года назад, # |
Rev. 2   Проголосовать: нравится +45 Проголосовать: не нравится

Thanks for the round! I think my E solution is considerably stronger than intended--I generated a suite of 2000 test cases locally and found, rather shockingly, that it had 100% accuracy on the sample. I also tested locally and found that it has 100% accuracy on the actual input.

For anyone curious, here's my approach. I sort the problems by solve counts and split them into 50 buckets of 200 problems each. Then, I assume the difficulty of problems in the i'th easiest bucket (using 0-indexing) is $$$-2.94 + 0.12i$$$. (In other words, I assume the buckets cover evenly distributed subintervals of $$$[-3, 3]$$$ and that the difficulty of each problem is approximated by the average difficulty in its bucket.) For each contestant, I iterate over all possible skill levels of the form $$$-3 + 0.06i$$$ (assuming that this will be a reasonable approximation for their actual skill), and I compute an error function (the sum of the squares of expected solve count minus actual solve count over the fifty buckets) for each player-skill level combination if the player is cheating and if they are not cheating. Then, I compute each player's minimum error function if they are cheating and minimum error function if they are not cheating, and I output the player who minimizes the quotient when the error function if the player cheats is divided by the error function if the player does not cheat.

Here's my code: https://pastebin.com/4Vy70PRg

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Wow, that distribution is way more accurate than me just getting mean values which still has 99.9% success rate(10 failed on 5K cases).

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +8 Проголосовать: не нравится

      I'm impressed that you're able to achieve such high accuracy with a relatively simple approach--thanks for sharing your solution!

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +34 Проголосовать: не нравится

    If your success rate on 50% is very high, you can try evaluating accuracy when the cheater only cheats 10% of the time or something. My solution gets 1000/1000 with a 50% cheater, and about 80% accuracy on a 10% cheater. I bet you can do better though.

    Here's my approach: The "theoretically correct" solution is to directly apply Bayesian analysis and evaluate a 10100-dimensional integral of conditional probabilities, because you have a complete set of priors of the inputs. This seems pretty infeasible (though maybe someone knows how?), so one approximation you can take is assuming that the Bayesian posterior of the people/problem ratings are all very sharp peaks, and you just try to estimate the location of the peak. You can do this iteratively by alternating optimizing people-ratings and optimizing problem-ratings. Once you've estimated the maximum-likelihood ratings, you can then take your conditional probabilities: Bayes' theorem says that you should rank people based on the ratio of $$$P(given-scores | cheating) / P(given-scores | not-cheating)$$$, which is a ratio of sigmoids.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      My solution essentially has the same idea as yours, so I've tried my solution on a 10% cheater as you suggested. Although I get 100% accuracy on a 50% cheater, the result is around 5%.

      Can you share more details on how do you optimize people and question ratings? How many iterations do you do? I could afford only 10, otherwise, I got TLE.

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        I actually only do 1 iteration; I initialize the people ratings by taking the inverse of the sigmoid CDF of their solve rate (assuming questions are uniformly distributed), and then I ternary searched to find optimal problem ratings.

        Did you change your constants in the Bayes' rule evaluation with the updated cheat percentages?

        Code here: https://ideone.com/nRYKXS

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится 0 Проголосовать: не нравится

          You were right, I didn't change the constant in Bayes rule... Now the accuracy on 10% cheater is around 65%, which is not that bad, but far from your result.

          My solution is substantially a log-likelihood gradient ascent, starting from same yours initial values. Probably with a better tuning of the learning rate and squeezing more the time limit to make more steps I can improve the result, but definitely ternary search is better.

          Thanks!

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +25 Проголосовать: не нравится

      I used this Bayesian approach but with a simple one-step approximation for question difficulties like Geothermal. The algorithm is what Geothermal described except you replace the sum-of-squares error function with a log-likelihood. On 5000 test cases with a 10% cheater it passes 4156.

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

How to solve problem D for hidden test case ?

I used binary search and number of iterations will be around $$$log2(3)+log2(4).....+log2(50)$$$ in worst case which is >200 but since data in input are randomly made , I think average could be less and thus it might pass the hidden test case .

Also when hidden test case verdict will be out ?

  • »
    »
    4 года назад, # ^ |
    Rev. 2   Проголосовать: нравится -31 Проголосовать: не нравится

    What you did is binary search .but if think carefully you'll find that actually you can split list into 3 parts at the same time. Here's my code in python3

    t, n, q = map(int, input().split())
    for i in range(0, t):
        done = 3
        print(1, 2, 3)
        y = int(input())
        if (y == 1):
            x = 2
            z = 3
        elif (y == 2):
            x = 1
            z = 3
        else:
            x = 1
            z = 2
        ans = [x, y, z]
        next = 4
        f = 0
        while (next < n + 1 and f == 0):
            start=0
            e = 0
            end = len(ans) - 1
            while (e == 0):
                
                a=start+(end-start+1)//3
                mid = (a + end) //2 + (a+end)%2
                # print(a,mid)
                # print(ans)
                print(ans[a], ans[mid], next)
                y = int(input())
                if (y == -1):
                    f = 1
                    break
                elif y == ans[a]:
                    end=a-1
                    if(start-end==1):
                        ans.insert(start,next)
                        e=1
                    if(start==end):
                        end+=1
                elif y == ans[mid]:
                    start=mid+1
                    if(start-end==1):
                        ans.insert(start,next)
                        e=1
                    if(start==end):
                        start=mid
                else:
                    start=a+1
                    end=mid-1
                    if(start-end==1):
                        ans.insert(start,next)
                        e=1
                    if(start==end):
                        start=start-1
                    # if (mid - a == 1):
                    #     ans.insert(mid, next)
                    #     e = 1
            if (f == 1):
                break
            next += 1
        if (f == 1):
            break
        for i in range(0, len(ans)):
            print(ans[i], end=' ')
        print()
        y = int(input())
        if (y == -1):
            break
    

    Hope it helps

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +31 Проголосовать: не нравится

    Hidden case verdicts are available now. The main idea of D is to solve the problem using insertion sort in $$$O(n \log_3 n)$$$. Essentially, if the position of an element $$$x$$$ to be inserted is bounded within the range $$$[l, r]$$$, query $$$x$$$ with the elements at positions $$$\frac{2l+r}{3}$$$ and $$$\frac{l+2r}{3}.$$$ This will result in a range roughly $$$\frac{1}{3}$$$ as large, achieving the desired $$$n \log_3 n$$$ complexity.

    • »
      »
      »
      4 года назад, # ^ |
      Rev. 3   Проголосовать: нравится 0 Проголосовать: не нравится

      Binary Search is actually sufficient as long as you ensure that a region of size $$$n$$$ is divided into at regions of size at most $$$\frac{n - 1}{2}$$$ and end the loop if you find the required point partway through.

      I'm not sure how to show this reduces it enough but it passed the hidden subtask and didn't exceed an average of $$$165$$$ queries over a few thousand runs locally with random input. I think the uniform random test generation plays a role in it.

      Code (ignore the unneeded messy endpoint conditions in check that aren't even needed)
    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +17 Проголосовать: не нравится

      Quicksort with two pivots also worked.

  • »
    »
    4 года назад, # ^ |
    Rev. 2   Проголосовать: нравится +10 Проголосовать: не нравится

    The search for each new element can be reduced to $$$\log_3(n)$$$ by splitting the array into thirds when you are querying for the place of a new element.

    As a more general tip, remember you are not doing a comparison (i.e. binary operation), your query operation has 3 different possible outcomes, so you should be able to break up the space of possibilities into 3 different partitions on each query.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      gaurav,Geothermal,Kognition log2(n) didn't worked :( . But i understood the log3(n) part .thanks i understood that i should have queried at points l + (r-l)/2 , r — (r-l)/2 for getting 1/3 of length in each iteration .

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        I applied binary search only with some some modifications to decrease queries per iteration and I was able to pass the last sub-task. I tested it locally and it was taking $$$155-175$$$ queries on an average.

        Code
        Code for testing
        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится 0 Проголосовать: не нравится

          Thanks for your code , could you please write in words how you modified the binary search ? It's little hard to read the code directly without any written info .

          • »
            »
            »
            »
            »
            »
            4 года назад, # ^ |
            Rev. 4   Проголосовать: нравится 0 Проголосовать: не нравится

            Here it goes-

            You can easily find the relative position of $$$3$$$ elements with just $$$1$$$ query. Now, I try to place elements from index $$$4$$$ to $$$N$$$ in already existing list one by one.

            Let's say during the current iteration, size of already existing list is $$$l.$$$ I define $$$start=0$$$ and $$$end=l-1.$$$ For insertion of new element, it can have potentially have $$$l+1$$$ positions ($$$l-1$$$ in between the elements and $$$1$$$ before start and $$$1$$$ after end). Define $$$mid=(start+end)/2.$$$

            To find the appropriate position-

            I ask $$$(start,mid,i)$$$ where $$$i$$$ is the index of current element we want to place. You can also ask $$$(end,mid,i)$$$ depending on your implementation.

            Now, we want to analyze how our search space will be reduced.

            1. If median is $$$start,$$$ we can stop our search and directly place $$$i$$$ before $$$start.$$$

            2. If median is $$$mid$$$, we can be sure that position of $$$i$$$ is definitely not before $$$mid.$$$ We can initialize $$$start=mid+1.$$$ This way we reduce the search space by half.

            3. If median is $$$i,$$$ we can be sure that position of $$$i$$$ is definitely not after $$$mid.$$$ We can initialize $$$end=mid-1.$$$ This way we reduce the search space by half. Also, we can increase $$$start$$$ by $$$1$$$ because we are also sure that the position is definitely after $$$start$$$, otherwise the result of the query would have be $$$start.$$$

            Also, I handle the cases when search space reduces to $$$3$$$ or less separately because I don't wanna mess up and end in some infinite loop.

            The total queries it takes is $$$1$$$ + $$$\Sigma^{N-1}_{i=3} (\log_2i-c)$$$ where $$$c \sim 1$$$. That $$$c$$$ is due to above mentioned few optimizations and the fact that input is random.

            Also, I tested it locally and it takes $$$150-175$$$ queries. My initial solution was taking around $$$\sim 250$$$ queries because I was extra cautious and assigned $$$start=mid$$$ and $$$end=mid$$$ in case $$$2$$$ and $$$3$$$ respectively and did $$$1$$$ more query in case $$$1$$$ to confirm.

            • »
              »
              »
              »
              »
              »
              »
              4 года назад, # ^ |
              Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

              Also, you can compare the both codes for better understanding-

              ~250 queries
              ~170 queries
»
4 года назад, # |
  Проголосовать: нравится -8 Проголосовать: не нравится

I have a nice recursive solution for $$$Median$$$ $$$Sort$$$. Inititially, call the function $$$sort(a[1,2,...,n])$$$. For all $$$3 \leq j \leq n$$$, query $$$a[1]$$$ $$$a[2]$$$ $$$a[j]$$$. From the output of the query, one can determine the position of $$$a[j]$$$ in the array, i.e. $$$left$$$, $$$middle$$$ or $$$right$$$ w.r.t $$$a[1]$$$ and $$$a[2]$$$. After this one can recursively solve $$$left$$$, $$$middle$$$ and $$$right$$$ parts. But the problem now is the three parts can be independently in correct or reversed order. We can determine the order by querying one of the elements of each part along with $$$a[1]$$$ and $$$a[2]$$$. I don't have a strict upper bound on the number of queries this program asks, but it asks about 165 queries per test case (tested locally on random tests).

My submission
  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    I've implemented the same solution, except that instead of using the first 2 elements in the array, I choose them randomly. This because it's easy to find test cases where your solution fails to use less than 175 queries, even on average (i.e. array already sorted). I think that if the problem was in a CF round somebody was going to hack you. Probably CJ test cases were weak enough to pass.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Nope. It doesn't matter if you take random elements. Because of this line in the problem statement: The order for each case is generated uniformly at random from all possible orders, and independently of any other information.

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        I've missed that line in the statement, I'm sorry :(

        But still, I think it's important to keep in mind that kind of solution are easily hackable in CF

»
4 года назад, # |
  Проголосовать: нравится +21 Проголосовать: не нравится

Here's a link to my screencast: https://youtu.be/RThkP1UORhk. It'll be available in HD whenever the HD version finishes processing.

»
4 года назад, # |
Rev. 2   Проголосовать: нравится +19 Проголосовать: не нравится

Seen lots of good solutions for E.

Here is how I went about the problem — note that the coin toss compresses the sigma function. If we compare a 'clever' person (let's say skill in the top 10%, i.e. > 2.4) with a cheat of average intelligence (0), we find the biggest gap between their probability of success lies in the middle questions.

So what worked for me was to take the top 5% of questions (0-499 in lowest solves list), find the 10% (i.e. 10) of people with the most solves on those questions, and then subtract their solves from the 'middle' 5% of questions (4750-5249 in the same list). Whoever has the biggest difference is the cheat.

clevercheat

»
4 года назад, # |
Rev. 3   Проголосовать: нравится +13 Проголосовать: не нравится

Question E — For every contestant, I calculated the Pearson correlation coefficient with

  • whether the contestant got the question correct for each question
  • how many contestants got the question correct for each question

I guessed that the cheater is the one with smallest Pearson correlation coefficient. After the qualifiers, I found out that my answer was accepted on the borderline (with 7 mistakes out of 7 allowed mistakes).

Spoiler
»
4 года назад, # |
  Проголосовать: нравится +33 Проголосовать: не нравится

I share also my solution for problem E, since nobody has discussed this approach yet.

I've got 100% accuracy in problem E by implementing a gradient ascent for finding the maximum likelihood estimation of $$$S_i$$$ and $$$Q_j$$$, ignoring the presence of a cheater. I've then chosen as a cheater the student that increases most the likelihood if the cheater-distribution is used.

Here is the code

»
4 года назад, # |
Rev. 4   Проголосовать: нравится +25 Проголосовать: не нравится

For E, sort the questions by number of solutions and for each player compute average squared distance (in the sorted question list) for all his answers from his median position.

Seems to work pretty well, 5 errors on 25,000 runs. (99.98% accuracy)

python code
»
4 года назад, # |
Rev. 12   Проголосовать: нравится +10 Проголосовать: не нравится

Here is a simple solution for E which passed both test cases:

Sort the problems by how many people solved them, and sort the players by how many questions they solved.

Let's say the "Difficult Problems" are the hardest 500 problems.

  1. If there is anyone ranked in the bottom 80 who solved more than 200 Difficult Problems, I say that they are the cheater.
  2. If not, then I say that whoever solved the most Difficult Problems is the cheater.

This solution tries to catch lower-skill cheaters with (1), and catch higher-skill cheaters with (2).

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

In problem D, Median Sort, I just got stuck to one idea ie using a BST, ultimately I couldn't get it right. Can anyone share insights if it's possible to solve it using BST?

»
4 года назад, # |
  Проголосовать: нравится -17 Проголосовать: не нравится
O(N) solution for C
»
4 года назад, # |
  Проголосовать: нравится +53 Проголосовать: не нравится

the discussions on problem E makes it sound like it is a hashcode problem: "yeah I don't know how it works, but here is my heuristic"

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

In country standing of codejam, Iran is missing. There are strong coders from Iran in codeforces community, then why the country name is gone?

»
4 года назад, # |
  Проголосовать: нравится +43 Проголосовать: не нравится

Here's a link to my [SPANISH] screencast + detailed explanation of problems

https://www.youtube.com/watch?v=hntk3GShfik

Fun fact: Problem D was actually almost identical to problem "Median" from IOI 2000, China!

I really liked all the Qualification Round problems this year. Congratulations to all the problem setters!

»
4 года назад, # |
Rev. 2   Проголосовать: нравится +28 Проголосовать: не нравится

Another simple approach for E which also achieves a 100% accuracy on 25000 local tests:

  1. Estimate player $$$i$$$ skill level ($$$S_i$$$) as the sigmoid inverse of its ratio of correct answers.
  2. Estimate question $$$j$$$ difficulty ($$$Q_j$$$) as the sigmoid inverse of its ratio of failed answers.
  3. Loop through the questions answered by a player: If the probability of a player answering the question correctly is high (e.g., > 0.75) but the player has failed the question, then add $$$S_i - Q_j$$$ to the suspiciousness of player $$$i$$$.
  4. The cheater is the player with the highest suspiciousness.

The code is really simple:

C++ source code

Fun fact: During the round I wrongly estimated players' skill level and questions' difficulty, using a linear function between -3 and 3 instead of the sigmoid inverse. However, this was still enough to pass test set 2 with just 3 wrong cases.

  • »
    »
    4 года назад, # ^ |
    Rev. 2   Проголосовать: нравится +30 Проголосовать: не нравится

    I see that your approach works very well, but I don't believe the inverse sigmoid accurately estimates the skill/difficulty level of contestants/problems. The inverse sigmoid will calculate the estimated skill of a contestant assuming all questions have difficulty 0, rather than from the uniform distribution [-3,3].

    Instead, i think what should be done, is to integrate across the possibility problem difficulties for each contestant

    $$$p = \frac{1}{6}\int_{-3}^{3}\frac{1}{1 + e^{q-s}}\,dq$$$

    Solving this leads to this formula for the skill level of the contestant given the percent of correct answers $$$p$$$

    $$$s = 3 + ln(\frac{e^{6p} - 1} {e^6 - e^{6p}})$$$

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +10 Проголосовать: не нравится

      Absolutely. I did the same and obtained 999 correct tests on 1000 random tests with a probability cheating of 50%.
      In order to stress the method, I used a 10% probability of cheating (as suggested by ecnerwala) and obtained 801 correct tests on 1000 random tests.
      I don't see any theoretical justification for using the inverse sigmoid method. This method actually underestimate the real skill by 30% to 40% on my tests.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +8 Проголосовать: не нравится

      Thanks, you are absolutely right!

      I changed the estimation to use you formula and now skill and difficulty estimations are considerably more accurate. However, using these improved estimations, the approach performs slightly worse overall (98% accuracy).

      I do not understand why the previous version worked so well...

      Just as a note: In the official Google analysis, they also state that skill/difficulty can be estimated by using the sigmoid inverse.

»
4 года назад, # |
  Проголосовать: нравится +3 Проголосовать: не нравится

When does google code jam usually start? I want to put a reminder so that I don't miss registering for it next year.

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Apparently they are running a poll on twitter to decide design of this year's codejam. People here would be interested in voting.
https://twitter.com/googlecodejam/status/1379150242579410945

P.S. — I didnt liked any available design. :(

»
4 года назад, # |
  Проголосовать: нравится +5 Проголосовать: не нравится

Why codejam lost so much hype? I see just one small discussion on this site, ~10k participants (2 times lesser than codeforces div2 round), just 54 points was enough to advance.

BTW why so weird point distribution? I thought that C2 was harder than A2 or B3...

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +9 Проголосовать: не нравится

    Maybe there was some discussion during the contest instead again ;)

    I am curious what was the worst case in B3, i.e. the maximal difference between sum and the answer.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +34 Проголосовать: не нравится

      Or maybe round 1A got missed by many due to late night/early morning time

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Here are the outputs, including differences between sum and answer, and the maximum difference at the bottom.

      https://paste2.org/BjBcKD2L

      The answer is 3025 (see bottom of file), so they engineered a test case to be at the limit, which makes sense.

»
4 года назад, # |
Rev. 3   Проголосовать: нравится -10 Проголосовать: не нравится

Round 1A: Reading the editorial for the "Hacked Exam" problem, I think that maybe they overcomplicated the solutions for test sets 1 and 2 (or I'm completely wrong?). I just figured that, with T/F questions, adding a second student with a lower score doesn't change the expected value. So, the answer is to just replicate the best student (or the worst one inverted, for that matter). Does it make sense or I just got lucky to have passed with this idea? Thanks for any help! And please follow the code below.

string inv(string a) {
    string r;
    trav(x, a) {
        if(x == 'T') r += 'F';
        else r += 'T';
    }
    return r;
}

void run_test() {
    int n, q; read(n, q);
    int mx = 0;
    string ans;
    rep(k, n) {
        string a; read(a);
        int s; read(s);
        if(s <= q / 2) a = inv(a), s = q-s;
        if(s > mx) {
            mx = s;
            ans = a;
        }
    }
    cout << ans << " " << mx << "/1" << endl;
}
»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Round 1A upsolving screencast with a lot of commentary https://youtu.be/RK-NFGYF-wY

»
4 года назад, # |
Rev. 2   Проголосовать: нравится +10 Проголосовать: не нравится

Not sure where to ask this (can't find a R1B thread), so I'll just put it here.

Are you allowed to unofficially participate in Round 1B if you qualified off Round 1A already? I.e. will I be able to look at the problems and make submissions during the contest, or is my account barred from participation during contest?

»
4 года назад, # |
  Проголосовать: нравится +1 Проголосовать: не нравится

I don't know where to ask , that is why I am asking here Does Google run Plag checker in the middle of the contest ? I am asking this because my rank is decreasing , even though I am not submitting any thing .

  • »
    »
    4 года назад, # ^ |
    Rev. 2   Проголосовать: нравится +13 Проголосовать: не нравится

    No, people above you are submitting their solutions again. The tie is broken by the last successful submission(source).

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +3 Проголосовать: не нравится

      Okkk , I didn't knew it , Now it looks like I did good by not taking chances for trying to solve hidden test cases .

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится +21 Проголосовать: не нравится

        I afraid that's not correct.

        Submitting solutions again, increases the penalty only when the new solution got more points than the previous one.

»
4 года назад, # |
  Проголосовать: нравится +7 Проголосовать: не нравится

Another day, another lose!

»
4 года назад, # |
  Проголосовать: нравится +57 Проголосовать: не нравится

From qualifying with a 600ish rank in 1A last year, to not qualifying at all this year, oof

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Can someone help me with this, please

If one of your tickets is strictly closer to c than all other tickets

$$$OR$$$

if both of your tickets are the same distance to c and strictly closer than all other tickets, then you win the raffle.

Isn't the portion after or says the same thing as that of the portion before or?? Or it gives some extra condition to win?Can someone please explain it, I only took the first condition that at least one of my selected ticket should be closer to c than any of the given tickets, and it gave WA, also I saw the test data but can't figure out what went wrong

Thanks

  • »
    »
    4 года назад, # ^ |
    Rev. 2   Проголосовать: нравится +12 Проголосовать: не нравится

    "all other tickets" includes the other one of your two tickets, so the second portion is needed.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    I'd point out that the word "strictly" wasn't there in the problem statement initially. It read:

    "If one of your tickets is strictly closer to c than all other tickets or if both of your tickets are the same distance to c and closer than all other tickets, then you win the raffle."

    Notice the missing strictly. I even raised a question about this and got a very general response, they could have just said that the statement was updated :/

»
3 года назад, # |
  Проголосовать: нравится +29 Проголосовать: не нравится

The emails for claiming your GCJ T-shirts are out if you've qualified to Round 3. The deadline to claim it is June 13 — when I told some friends, it seems it had gone to their spam folder, so it's best to check ASAP since the deadline is this close.

»
3 года назад, # |
  Проголосовать: нравится +9 Проголосовать: не нравится

I haven't received any mail for GCJ tshirt(checked spam folder as well) eatmore

»
3 года назад, # |
  Проголосовать: нравится -18 Проголосовать: не нравится

I haven't received any mail for GCJ tshirt(checked spam folder as well) eatmore

»
3 года назад, # |
  Проголосовать: нравится -9 Проголосовать: не нравится

I haven't received any mail for GCJ tshirt as well. eatmore

»
3 года назад, # |
  Проголосовать: нравится +5 Проголосовать: не нравится

Is there anyone who has received their Code Jam T-shirt now?

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится +16 Проголосовать: не нравится

    Nope, they didn't even send tracking information even though I confirmed the order about a month ago.