chokudai's blog

By chokudai, history, 3 years ago, In English

We will hold UNIQUE VISION Programming Contest 2022(AtCoder Beginner Contest 248).

The point values will be 100-200-300-400-500-500-600-600.

We are looking forward to your participation!

  • Vote: I like it
  • +58
  • Vote: I do not like it

| Write comment?
»
3 years ago, # |
  Vote: I like it +1 Vote: I do not like it

It will be a great contest XD

»
3 years ago, # |
  Vote: I like it +71 Vote: I do not like it

C++20 support when

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

It took some WAs for D, but I've finally realized extra log N factor is TLE territory.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I got innumerable TLEs and it TLEd on only 1 test case most of the times. I used lower_bound and then implemented by own binary search function but none worked. Don't know how to improve my solution further :)

    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it +3 Vote: I do not like it

      I got an AC in D problem. So basically what I did is I created a map<int, vector> and then stored the indices of all the elements in that map. Then I just took lower_bound and upper_bound of l and r respectively of that vector in the map.

»
3 years ago, # |
  Vote: I like it +1 Vote: I do not like it

how to represent line in E ?

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +6 Vote: I do not like it

    I represented a line as the set of all points (in the input) it passes through

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +11 Vote: I do not like it

    my sol: let $$$vis_{i,j}$$$ be: if the line with point $$$i$$$ and $$$j$$$ is used, set it 1, otherwise set it 0. Then we can simply mark all pairs of points on the same line.

    sample code
  • »
    »
    3 years ago, # ^ |
      Vote: I like it +22 Vote: I do not like it

    I made a set<array<int64_t, 3>> and stored as form $$$a x + b y + c = 0$$$ reduced to lowest form using $$$gcd(a, gcd(b, c))$$$

    and stored both $$$a,b,c$$$ and $$$-a,-b,-c$$$ in set

    and printed size of set / 2 as answer

  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it +4 Vote: I do not like it

    We know that the equation of a line is $$$y = mx + c$$$, where $$$m$$$ is the line slope and $$$c$$$ is the intersection point with the y-axis. To avoid working with floating-point values, let's manipulate this equation a bit. So far, we have $$$y = \frac{\Delta{y}}{\Delta{x}} x + c \iff y\Delta{x} = x \Delta{y} + c \Delta{x}$$$ Now, we will represent each line with the triplet $$$(\Delta{x},\Delta{y}, c\Delta{x})$$$. And to avoid the problem of having lines with the same slope up to a constant factor, we can re-assign $$$\Delta{x} = \frac{\Delta{x}}{gcd(\Delta{x},\Delta{y})}$$$ Similarly, $$$\Delta{y} = \frac{\Delta{y}}{gcd(\Delta{x},\Delta{y})}$$$. With this representation, we can easily check if a point lies on a line. My submission here.

»
3 years ago, # |
  Vote: I like it +22 Vote: I do not like it

"Be careful not to count the same line more than once."

Haha, super funny. Can you at least explain how that works?

  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it +3 Vote: I do not like it

    I represented a line as the set of all points (in the input) it passes through. There are O(N^2) lines that pass through at least 2 points, and each line has size O(N), so you can explicitly enumerate all such lines in this representation.

    (optimization: instead of "set of points" you can use "sorted list of indices of points")

  • »
    »
    3 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    a line can be defined by its endpoints in the input, using a set you can store all valid lines in pairs of points

»
3 years ago, # |
  Vote: I like it +1 Vote: I do not like it

The idea for D: Range Count Query is essentially the same as CF Div3D: Distinct Character Queries

For each number in $$$[1, N]$$$, keep a vector denoting the indices where this is element is present. If you fill it from left to right, this vector would be sorted. The answer for each query would then be $$$upper\_bound(R) - lower\_bound(L)$$$ on this position vector.

You might think that the complexity might jump to $$$O(n^2)$$$ if you keep a vector for each element, because a single element can occur $$$n$$$ times. The answer to that question is same as: While creating an adjacency list of a graph, why doesn't the time complexity jump to $$$O(n^2)$$$?

Time Complexity : $$$O(n + q \log{n})$$$ Code

Code
  • »
    »
    3 years ago, # ^ |
      Vote: I like it +3 Vote: I do not like it
  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it +1 Vote: I do not like it

    I think there are significant differences both ways, for Range Count Query I constructed the frequency dictionary of A[0:i] for each i and answered each query offline. But this doesn't work if operations can change A. And I think it's a legitimately different algorithm from yours because the time complexity is O(N) + Q log Q.

    Likewise you can solve Distinct Character Queries with a segment tree for each letter of the alphabet, but it doesn't generalize to large alphabets.

    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Yes, of course. I didn't mean that the problems are same, I meant that if you've read the editorial of Distinct Character Queries, you'd instantly get the idea for solving Range Count Query (as the former contained the trick of maintaining sorted indices of positions and binary searching).

      RCQ is a subset of DCQ but not the other way round, due to update queries.

      • »
        »
        »
        »
        3 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Actually I also meant that RCQ is not a subset of DCQ, since the "use 26 segment trees" solution to DCQ cannot be generalized to larger A easily (I think?)

        You are right that the editorial solution to DCQ is easier to generalize though

  • »
    »
    3 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I just learnt that O(n + qlogn) is not the fastest solution. By treating the queries offline, we can achieve O(n + q) instead.

    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      But won't you need to sort the queries for offline processing, thereby introducing a $$$\log{q}$$$ factor?

      • »
        »
        »
        »
        3 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Here is how. This is not my submission. I just saw the author explained how it can be done: Submission

»
3 years ago, # |
  Vote: I like it +14 Vote: I do not like it

How to solve G?

  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it +9 Vote: I do not like it

    Let $$$x$$$ be the gcd of the numbers on the path. We first calculate the sum of $$$k$$$ for all possible $$$x$$$ with only the constraint that all numbers should be a multiple of $$$x$$$ (which means that the gcd might be $$$dx(d>1)$$$ instead of $$$x$$$, but we will deal with that later).

    We can now mark every node whose number is a multiple of x as passable, which means the path we take can only pass passable nodes. We can calculate the answer of each connected component (it's easy to see that they are trees) seperately because we can never move from one component to another.

    The contribution of two nodes $$$i,j$$$ will be $$$dep_i+dep_j-2dep_l+1$$$ , where $$$l$$$ is the LCA of $$$i$$$ and $$$j$$$. We choose a vertex as $$$l$$$ and can calculate the answer with simple math. Then we add the answers together for all connected compnents to get the answer of the current problem.

    Now we get the answer when the $$$gcd$$$ is a multiple of $$$x$$$ , but we need the answer when $$$gcd=x$$$. Let $$$f_x$$$ be the answer, then we should use the current answer to minus the total $$$f$$$ for all multiples of $$$x$$$ , because those whose $$$gcd=dx(d>1)$$$ is calculated exactly once in those values.

    Then we get a $$$O(n\omega(n))$$$ ($$$\omega(n)$$$ denotes the maximum number of factors for a number below $$$n$$$ , when $$$n=10^5$$$ , it's about $$$100$$$) solution if carefully implemented.

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

how to solve c without dp?

  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it +3 Vote: I do not like it

    I am not sure,but i think it's only solveable using DP.The reason for this is because it has small constraints so i think it is meant to be solved by DP.Also it is <=K so i think DP is needed.If it was =k then i'd use stars and bars.

    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Cant you apply stars and bars for all 1 <= i <= k?

      • »
        »
        »
        »
        3 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I think you can.The thing is that you should try to make it in a form like 0<=.It is a very common trick i believe.Let's say that you need to make a+b+c=n with 1<=a,b,c<=n.Then,what you need to do is that make it into a+1+b+1+c+1=n so that it'll be something like 0<=a,b,c<=n-1.Then the solvable form is a+b+c=n-3.Please correct me if i am wrong since i am not that good in math.

        • »
          »
          »
          »
          »
          3 years ago, # ^ |
          Rev. 2   Vote: I like it 0 Vote: I do not like it

          I actually meant this:

          Since K<=2500, you can apply stars and bars to get the total sum 'X' for all N <= X <= K and add their total sum, right?

          should be something like O(K*(N+K))

          • »
            »
            »
            »
            »
            »
            3 years ago, # ^ |
            Rev. 2   Vote: I like it 0 Vote: I do not like it

            Oh,I'm not sure.You see,you need modular inverse to count it right? because we need to take mod and there is a division involved in the stars and bars formula,right? (i haven't tried it myself so i'm not sure whether it's possible or not.But i think it's quite hard don't you think?)

            • »
              »
              »
              »
              »
              »
              »
              3 years ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              Yh, will try this approach. It should be ossible with complementary counting

        • »
          »
          »
          »
          »
          3 years ago, # ^ |
          Rev. 3   Vote: I like it 0 Vote: I do not like it

          I think you're trying to solve for

          a1 + a2 ... an = k, where ai>=M, but the problem is actually ai<=M(which I guess can be done with complementary counting). Is it possible to do it without complementary counting?

          UPD: For those wondering about the PIE complementary counting method:

          Inclusion/Exclusion in O(KlogN)
      • »
        »
        »
        »
        3 years ago, # ^ |
        Rev. 3   Vote: I like it 0 Vote: I do not like it

        yes we can , with stars and bars + complementary counting . (count number of combinations with at least one variable(box) being $$$>M$$$) here is submission which uses that technique .

        time complexity is $$$O(nlog(mod))$$$ but can be optimized to $$$O(n)$$$ if we precalculate modular inverses .

        it's strange that they didn't mention this approach in editorial

  • »
    »
    3 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Bonus 2 in the editorial uses formal power series in O(k).

  • »
    »
    3 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I did not want to use DP, so I wrote the answer using memoized recursion. Submission

»
3 years ago, # |
  Vote: I like it +5 Vote: I do not like it

In problem E I represent a line with (delta_y, delta_x, minimum_index_of_point_on_this_line) if line isn't vertical and (1, 0, x_coordinate) if line is vertical.submission

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +14 Vote: I do not like it

    Actually we can uniquely determine a line by checking the first two index of point that is on the line, and to implement it just allocate a 2d bool array. I find it is more easy to code :)

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Simpler version of problem G 990G - Подсчёт GCD (it uses the same idea, but instead of counting sum of all path lenghts you just need to count the number of paths)

»
3 years ago, # |
Rev. 2   Vote: I like it +19 Vote: I do not like it

Would anyone like to share some different solutions to problem F, except for the one in editorials. I find it difficult to determine the state of dp.

UPD1, I don't understand how to find that there could only be two states, state0 and state1, as mentioned in the editorials.

  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it +30 Vote: I do not like it

    For a smoother experience, read this gist which has embedded images.

    Solving this problem requires a deep understanding of how Subset Sum DP works. In fact, with a proper abstraction, you can almost convert it to a Subset-Sum DP problem.

    In the subset-sum problem, we have a zero-indexed array of $$$n$$$ elements. To simplify the mathematical notation, for each node $$$x$$$, define the node vertically below it as $$$x^*$$$. If we capture each $$$x$$$ and $$$x^*$$$ in a rectangular box, we will get an array of $$$n$$$ blocks (let's say, zero-indexed), and we need to perform some operations on these blocks. This is the first level of abstraction which makes the numbers in subset sum DP analogous to blocks in this problem.

    Visual Representation

    Next, to figure out the DP definition, recall that in subset-sum problem, even though we are supposed to find the number of subsets of the entire array with sum exactly equal to $$$k$$$, we define $$$dp[i][j]$$$ as the number of subsets with sum equal to $$$j$$$ for all $$$j \leq k$$$. Why is this so? It's because $$$j_1 + j_2$$$ can become equal to $$$k$$$. In fact, for each DP problem, all states can be broadly classified into 3 categories,

    • Directly useful
    • Indirectly useful
    • Hopeless

    So, any subset with sum equal to $$$k$$$ is directly useful, subsets with sum less than $$$k$$$ are indirectly useful and subsets with sum greater than $$$k$$$ are hopeless.

    Recall that, in Knapsack problem, we also maintain another dimension of DP capturing the total weights we've already used so far. Since we need to delete some amount of edges from these blocks, let's define $$$dp[i][j]$$$ as the number of ways to delete exactly $$$j$$$ edges from blocks $$$[0, \dots i]$$$ such that the remaining graph is connected. Our answer would then be $$$dp[n - 1][j]$$$ for all $$$j$$$. This DP definition is incomplete, but in a contest, you're most likely to come up with this definition in the first try.

    So, what's the flaw with this DP definition? It only captures states which are directly useful. Since $$$j_1 + j_2$$$ could've become equal to $$$k$$$, in this case, we could have 2 disconnected graphs which would've combined to created a connected one.

    Now we know that we are missing indirectly useful states in our DP table, but we are not sure which ones exactly. Whenever you face this Dilemma, just start the algorithm from the $$$0^{th}$$$ element, and you'll quickly realize what you're missing.

    Notice that each block can contain at most 3 edges. Label them as top, bot and mid edges.

    Visual Representation

    In subset sum DP, the elements are processed in an online fashion. Meaning, we first compute our answer assuming that we only have the prefix $$$[0, \dots i]$$$ and we investigate the transitions when we introduce the $$$i^{th}$$$ element. We then make one of 2 choices for the $$$i^{th}$$$ element: Take or Leave. So, let's start with the first block. It only has a mid edge. We can either take it or leave it, which leaves us with 2 possibilities. Notice that our current DP does not handle the leave case, because it makes the graph disconnected. But we can see that it's a valid choice since the adjacent block can make it connected. This is our first hint on the new state that we need to introduce.

    Visual Representation

    Now, consider the second block, it has 3 edges, top, bot and mid. We can take or leave each edge independently, so there are 8 possibilities. Let us list down all $$$2*8$$$ possibilities for both blocks combined. (The first column of result assumes that we kept the mid edge of the $$$0^{th}$$$ block and the second column assumes that we ignored it).

    Visual Representation

    Can you spot the directly useful, indirectly useful and hopeless states from this graph? All states with one connected component are directly useful, and all the states where at least one connected component is isolated from the the incoming block is hopeless. Since that incoming block can only interact with the terminal $$$x$$$ and $$$x^*$$$, we can conclude that any element which is a part of a connected component not containing $$$x$$$ or $$$x^*$$$ is hopeless, because no matter what you do later, you cannot make the entire graph connected. Hence, for a state to be not hopeless, it should have no connected component not containing $$$x$$$ or $$$x^*$$$.

    Visual Representation

    However, the editorial ignores all the hopeless states, which makes you wonder how did they suddenly guess that there can only be 2 states. But let's not do that, let's define 3 states, connected, two_connected, and hopeless, where two_connected represents an indirectly useful state where there are at most 2 connected components containing the terminal vertices. Although we keep saying 2-connected, we actually mean, the graph should have 2 connected components and with at least one of the terminal blocks. With these hints, we are now ready to define the actual DP definition:

    Define $$$dp[i][j][state]$$$ to be the number of ways to delete exactly $$$j$$$ vertices from blocks $$$[0, \dots i]$$$ such that the resulting graph is in the given state (connected, two_connected, hopeless).

    To perform the transitions, like subset-sum DP, we make 8 choices on the last block and take contribution from the previous states accordingly.

    A hopeless state can only lead to a hopeless state.

    If the last state was connected, and we drop the terminal top and bot edges, it'd become hopeless. If we take at least 2 terminal edges out of top, bot and mid, the graph would remain connected, else it becomes 2 connected.

    If the last state was two-connected, we are not allowed to drop the terminal top and bot edges. If we do, it leads to a hopeless state. Depending on whether we keep the terminal mid edge, it'll either lead us to connected or two_connected graph.

    Since we eventually won't use hopeless states, it does not matter whether we fill the correct values in it. And that's what the editorial does, ignore it altogether, but in my implementation, I do include some parts of it for clarity.

    My Implementation of the above Idea

    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it +1 Vote: I do not like it

      Wow, this is incredibly crazy.....

      You are definitely right that at first I come up with dp[i][j], since this is somehow straightforward, like in knapsack problem.

      You are also definitely right that I missed the indirectly useful states, and now I understand that I have never truly realized the existence of such states. In knapsack problem, I know that we should compute dp[i][j] for all j, but somehow, I take it for granted :( After reading your arguments, it is the first time that I begin to think about this. As mentioned, "directly useful, indirectly useful, hopeless states", I believe this is also the first time that I have a chance to learn dp states in such a systemic approach, and I will spend time trying to get deeper and deeper understanding about this.

      Next, you are again definitely right that "However, the editorial ignores all the hopeless states, which makes you wonder how did they suddenly guess that there can only be 2 states". This is truly what I feel after reading the tutorials, it is just like this, they read problem, and then, they come up with state0 and state1.... I really get very very shocked after reading your words (How could you even find out the reason why I feel confused, while I just simply asked "UPD1, I don't understand how to find that there could only be two states, state0 and state1, as mentioned in the editorials.").... This is too crazy....

      Finally, the figures you draw are very clear and help a lot for understanding the solution. Thanks a lot, for your paticient and awesome teaching! If someday, I become good at competitive programming enough, I will also try to provide such clear and detailed arguments for others, as what you have done.

    • »
      »
      »
      17 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Amazing, thank you for detailed explanation.

»
3 years ago, # |
Rev. 2   Vote: I like it +4 Vote: I do not like it
»
3 years ago, # |
Rev. 2   Vote: I like it +8 Vote: I do not like it