attempt's blog

By attempt, history, 6 years ago, In English

There are $$$N$$$ questions, each of them has $$$k$$$ options. For each question, only one option is correct. Giving a black box that returns the number of correct answers of a submission. You try to submit until getting $$$N$$$ correct answers (There is a strategy having at most $$$1 + (k-1) \times N$$$ submissions).

I have 3 questions:

1. What's the official name for this kind of problem? I couldn't find any paper mentioning the problem.

2. What's the best strategy to get all $$$N$$$ correct answers? How many submissions required in the worst case?

3. We can submit at most $$$L$$$ submissions, what's the highest expected number of correct answers can we achieve?

  • Vote: I like it
  • +66
  • Vote: I do not like it

»
6 years ago, # |
Rev. 4   Vote: I like it +8 Vote: I do not like it

Let's say we start with N + 1 queries to test for each location if it is the first or second answer. We'll do these queries by setting all the answers the same and changing them one at a time to see if the count changes. Now we've eliminated 2 of our k answers. Take the remainder and repeat for (k/2) * (N+1) + 1 queries total.

We may improve this slightly by first querying k-1 times to see how many answers there are for each question and doing them in order. Since we guarantee that we trim off i*N/k of the items by the time we reach the ith step that should halve our (k/2) * (N+1) at the cost of k-1 queries upfront

For (very) small n and k you could approach it another way. First store the set of all (query, result) pairs. Now make some query to the black box. Check each remaining query against the result of the black box to see if it is consistent. Remove all the inconsistent ones from our still possible set. Now for each possible (query, result) pair run through the remaining queries and determine how many it eliminates. At each step choose the query that eliminates the most results in the worst case to send to the black box.

  • »
    »
    6 years ago, # ^ |
      Vote: I like it +1 Vote: I do not like it

    Thanks for the reply. I have the same idea with you for case both $$$N$$$ and $$$k$$$ is small. It is pretty good (though I do not know if it is optimal or not). When $$$N = 2, k = 3$$$, we need only 4 submissions.

    The approach can be formulated as follow: we have $$$N \times k$$$ variables $$$x_{ij} \in [0, 1] {\ } \forall i \in [1, N]$$$ and $$$j \in [1, k]$$$.
    $$$x_{ij} = 1$$$ if it is a correct option, $$$0$$$ otherwise.
    Initially, we have $$$N$$$ equations:

    $$$ \sum_{j = 1}^{j=k} x_{ij} = 1, \forall i \in [1, N] $$$

    Define $$$S$$$ as the solution space of current equation system. Each submissions will give us a linear equation. We have to choose a submission that gives minimum size of $$$|S|$$$ in the worst case. The process ends when $$$|S| = 1$$$.

    Searching the internet shows me that finding $$$|S|$$$ is very likely a #P-hard problem. This approach maybe not doable in polynomial time but ironically I can't prove or disprove it.

»
6 years ago, # |
  Vote: I like it +8 Vote: I do not like it

We could also do this in N+k+Nlog(N).

Let's do the same strategy as my other comment for the first two rows. Next we'll go through row by row and query all the items we don't have an answer for. While the query tells us they aren't all wrong or all true, we'll split it and recurse. For each answer we might have to split log(N) times on some level to find it, but won't need to do that on any other level. Therefore we can take Nlog(N) to find all the queries and N+K to get started.

  • »
    »
    6 years ago, # ^ |
    Rev. 2   Vote: I like it +1 Vote: I do not like it

    Could you please explain a little bit further or give a example? I do not understand this idea.

    • »
      »
      »
      6 years ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      First, let's find a way to make all our answers 0. We can do this using just two possible answers and it's simplest to just change people's guesses 1 at a time to see if they are one of the first two answers.

      After that we'll take all the remaining questions that don't have correct answers yet and do a query that sets those to one answer and the other questions to some wrong answer. Then we branch: if the whole group of question-answer pairs we are testing is 0 we'll be returned a sum of 0, and can stop looking in this area. If anything on the range returns a 1, then we need to keep querying for this answer and find that 1. We'll do so by splitting our query into two equal sized chunks we can check seperately.

      Initially you might think that this strategy could take n*k work because we might have to do all the splits all the time and query each location individually. That is not the case because of the 0 range shortcut. The 0-range shortcut means that we only split ranges that have 1's in them. If we consider how many splits we might then have to do in the worst case there are up to log(N) splits above each 1, with N 1's total, so Nlog(N). Plus 1 query to start each row, so overall N+k+Nlog(N)