P_Nyagolov's blog

By P_Nyagolov, 10 years ago, In English

Hello everybody,

Here is a cool problem that I am unable to solve:

There is a table with 3 rows and N columns. The first row contains a permutation 1,...,N. The second and third row contain integers in the interval [1,N] but there can be repeated numbers in each row. Some random guy decided to remove columns from this table. After that our random guy sorts the numbers in each row and wants to have only equal numbers in each column. We are to find the minimum number of columns our guy needs to remove in order to achieve his goal.

In the cases worth 40% of the total points, it will hold N<=10.

In the cases worth 70% of the total points, it will hold N<=10000.

In the cases worth 100% of the total points, it will hold N<=100000.

I can solve the problem only for 40 out of 100 points.

Please, help me to solve it for full score! :)

Update: Solved! Thanks, Enchom!!! His code is here

  • Vote: I like it
  • +5
  • Vote: I do not like it

»
10 years ago, # |
Rev. 3   Vote: I like it +22 Vote: I do not like it

I'm not sure if I fully understood your problem, but here is my solution in the way I understand it:

So the first crucial observation is that "After sorting we want only equal numbers in each column" pretty much means that the rows must have coinciding numbers, regardless of their order. Obviously if you have {1,2,3} in all rows, regardless of how each of them is ordered, you're gonna have equal columns after sorting.

So knowing that, we can see some "forced" moves. Let's call "forced" such column that has a number that does not appear in every row. Let me give a clear example of that. Imagine you have the following rows for N=5:

4 3 5 1 2
4 3 5 2 2
4 3 3 4 5

We can see that the columns we are forced to remove are 4 and 5, since they have the number 2 and number 2 is not present in row 3. So obviously we have to remove those columns. After removing them however, we are left only with the first 3, and it is easy to see that then column 3 becomes "forced", because of the number 5. After removing it, we're finished. This goes to show that after removing some forced columns, more may appear, but it is clear that all of them will surely have to be removed.

Let's assume that we work out all forced columns and remove them, as obviously they will be removed sooner or later. It is obviously necessary to remove those columns, but is it sufficient? It turns out it is and it's easily provable.

Suppose that we have eliminated all forced columns. Now if there are no repetitions of numbers in row 2 and 3, then we can clearly see that we have a solution — once we sort them, we'll get different numbers and since all of them are present in all 3 rows (since we've removed "forced" columns) we're done if that's the case.

But what if we have some numbers repeated in rows 2 or 3? Well suppose after elimination of forced columns, we're left with K columns. If there is a repetition in, let's say, row 2 then it will have at most K-1 unique numbers. But since row 1 is a permutation, it has K unique numbers. This quickly shows that there must be some number in row 1 that is not present in row 2, and this means we would have a forced column that we had not removed, leading to contradiction.

All this goes to show that removing forced columns is enough and obviously minimum.

In summary the algorithm is as follows:

  1. Define "forced" column to be a column containing a number that's not present in all 3 rows.

  2. While there are forced columns, remove them.

  3. Once there are no more forced columns, you are done, and you've got your minimum amount of columns removed.

I believe it can be implemented to work in O(N).


If I have misunderstood the problem, or perhaps I made a mistake in my solution, or maybe simply you didn't understand my explanation, feel free to ask me questions, we're both Bulgarians after all :)

  • »
    »
    10 years ago, # ^ |
      Vote: I like it +14 Vote: I do not like it

    Well I think you understood the problem correctly and explained it in a really neat and understandable way. Really good job! I really liked it. PS: It is better than most of the editorials(maybe the best lol :D).

  • »
    »
    10 years ago, # ^ |
    Rev. 2   Vote: I like it +8 Vote: I do not like it

    Thank you very much, Encho!!!

    You understood the problem correctly. Now I will start thinking of a fast implementation of your algorithm.

    PS: Here is the problem(the first one). I didn't give a link because it is in Bulgarian.

    • »
      »
      »
      10 years ago, # ^ |
      Rev. 2   Vote: I like it +3 Vote: I do not like it

      That's funny, seems like I have solved it about 3 years ago but I had no memory of it until seeing my submission. It's indeed the solution I described above, and the linear implementation isn't very difficult, good luck :)

      P.S. I wrote "is very difficult" instead of "isn't very difficult" by mistake... I hope that didn't scare you :D

      • »
        »
        »
        »
        10 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I decided to write a solution that seems to be O(N^2) for me and it got full score! Here is my code. Do you have the code for O(N) because I can't figure it out? If yes, can you share it please?

        Thanks again for the idea and the time you spent to explain the idea to me!

        • »
          »
          »
          »
          »
          10 years ago, # ^ |
          Rev. 3   Vote: I like it +3 Vote: I do not like it

          Well, your full score is sadly a result of badly prepared tests. Consider the following example for N=100,000

          1 2 3 4 5 6 7 ... 100,000
          1 1 2 3 4 5 6 ... 99,999
          1 1 2 3 4 5 6 ... 99,999
          

          At first only the last column is forced. Once it's removed, the second-to-last is now forced and so on, until only the first is left. Since you check all columns in left-to-right manner, this test would lead to O(N^2) in your solution and time limit.

          My idea of a linear solution is to keep track of numbers that become "extinct" from some row. That is, if a number is missing from some row, then all columns containing that number are forced and you can remove them. Then upon removing a column, you can check whether by removing it, you are making some number "extinct" from its row. Since there are 3 numbers in a column, 3 checks are enough. This should lead to a solution with complexity of about O(3*N) = O(N).

          Here is my code

          • »
            »
            »
            »
            »
            »
            10 years ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            I think I got it. I'll implement it myself and add your code to the blog when I get home. Thanks!

  • »
    »
    8 years ago, # ^ |
    Rev. 6   Vote: I like it 0 Vote: I do not like it

    Wrong

    4

    2 2 1 1

    2 1 1 1

    2 1 1 1

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      "The first row contains a permutation 1...N" ???

      • »
        »
        »
        »
        8 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        but there can be repeated numbers in each row.

        • »
          »
          »
          »
          »
          8 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          "The second and third row contain integers in the interval [1,N] but there can be repeated numbers in each row."

          Not the first row.

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      farmersrice is correct. The first row is a permutation and has no repeating numbers. The phrasing of the problem in this blog may confuse you about that, but I've read the original Bulgarian statement.