Comments
On Gassa's VRt Contest 2019 Discussion, 7 years ago
+11

Gassa, will you also make the submissions from the contest public, so that we can read others' solutions (especially those of the top contestants who won't describe their approach in this thread or anywhere else)?

Yes, exactly. That's the form I also used. I cancelled out one (a-1)^(-1) (actually I replaced it by the inverse of (y-1), where y is the argument of the query — this inverse is computed only once per query, so it's fine). But I couldn't cancel out the 2nd one. So, as I said, when k is small, I just computed the sum directly in O(k) time instead of using the formula which required the computation of (a-1)^(-1).
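To make the fallback concrete, here is a minimal sketch (not my actual contest code; the threshold and names are illustrative) of computing the geometric sum with the closed-form formula, falling back to a direct O(k) loop when k is small enough that the modular inverse isn't worth computing:

```python
def geom_sum_with_inverse(a, k, mod, small_k=20):
    """Return 1 + a + a^2 + ... + a^(k-1) modulo a prime `mod`.

    For small k, sum directly and skip computing (a-1)^(-1);
    otherwise use the closed form (a^k - 1) / (a - 1).
    The threshold `small_k` is illustrative.
    """
    if (a - 1) % mod == 0:
        # a ≡ 1 (mod p): every term is 1, so the sum is just k
        return k % mod
    if k <= small_k:
        s, term = 0, 1
        for _ in range(k):
            s = (s + term) % mod
            term = term * a % mod
        return s
    inv = pow(a - 1, mod - 2, mod)  # Fermat inverse; mod must be prime
    return (pow(a, k, mod) - 1) * inv % mod
```

The a ≡ 1 (mod p) branch is needed because the closed form divides by a - 1, which has no inverse in that case.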

I had a solution which ran in O(sqrt(N)) time per query/update (even in the case of general trees). For general trees it was O(deg) for "light" vertices (deg <= sqrt(N)) to perform some updates, and then had some special logic for "heavy" vertices (deg > sqrt(N)). The hardest case for my solution was when one "light" vertex had many "heavy" neighbors and the swaps involved such a light vertex and a heavy vertex. However, surprise-surprise, none of the large test cases had any heavy vertices! So you could just recompute the minimum and 2nd minimum neighbor of each updated vertex in O(deg) time (because deg was always <= sqrt(N) in the official large test cases).

This inverse was also the bottleneck in my solution (everything else was O(1) per "step" — by step I mean group of consecutive levels in the subtree having the same number of vertices). I used a very simple fix: if k is small (<= 20), just compute the sum in O(k) time :) Otherwise just use the usual logic (which involves computing the inverse). That was enough to pass the test cases. I'm not sure if they were just weak or there can be some proof behind it (no time to think about proofs — I barely had time to compete at all).

But computing the sum in O(log(k)) is much more elegant. I even remember using this "trick" in the past for some similar types of sums, but I didn't think about it this time.
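For reference, a sketch of that O(log(k)) trick: the geometric sum can be computed with no modular inverse at all by halving k, since S(2t) = S(t) * (1 + a^t) and S(2t+1) = S(2t) + a^(2t):

```python
def geom_sum(a, k, mod):
    """Return 1 + a + a^2 + ... + a^(k-1) modulo `mod`,
    without any modular inverse (so `mod` need not be prime).

    Uses S(2t) = S(t) * (1 + a^t) and S(2t+1) = S(2t) + a^(2t),
    giving O(log k) recursion levels.
    """
    if k == 0:
        return 0
    if k % 2 == 0:
        half = geom_sum(a, k // 2, mod)
        return half * (1 + pow(a, k // 2, mod)) % mod
    return (geom_sum(a, k - 1, mod) + pow(a, k - 1, mod)) % mod
```

As written, each level does its own modular exponentiation, so it's O(log^2(k)) multiplications; carrying the power of a through the recursion brings it down to O(log(k)).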

On Gassa's Codeforces Marathon Round 2, 8 years ago
+8

Re: "There are many marathon and game AI contests in Japan."

Could you point us to some of these contests? I am really curious what kind of problems you are solving. In Topcoder Marathon contests I noticed a big increase in the number of participants from Japan, and also a big increase in the good results obtained by Japanese contestants, so I am personally not at all surprised to see multiple Japanese contestants ranked close to the top in this CF Marathon R2.

I see only 8 problems added to the contest at the time of writing this (and no challenge/approximation problem). Are you going to add the remaining 2 problems (up to 10, like in previous contests), or are these 8 problems all of them?

Since no one is replying on the problem's page anymore, I thought about asking here: For the approximation problem (The Grid Organizer) will there be new test cases added after the contest ends? If yes, then please make this information available ASAP. Since the problem is configured to get a score of 0 if any test case fails, it's important to make our solutions more "robust" than usual in case new test cases will be added after the contest (e.g. in order to make sure we avoid any chance of TLE, etc.).

You mean the worst case of the official solution is just M=N=(2^11)-1? But it works fine for M=N=(2^22)-1? :) [I'm just asking, I haven't actually tried to run the official solution on any test case]

Anyway, rejudging all submissions seems like an extreme measure. I expect that most Accepted solutions use a very small amount of memory. For instance, my solution uses only O(BITS^2 + Q) memory, where I rounded BITS up to ~25.

+15

I see that the top 3 contestants at the moment have the maximum score at the approximate problem. My understanding would be that they all found the optimal solution on each test case (given how the score is computed). Do you mean that there's a tie at the top mostly because of the rounding? (i.e. the scores were not necessarily identical, but they were sufficiently close to the optimal score, that they were rounded to 100%). If yes, then this seems like a big problem and defeats the purpose of the approximate problem of breaking ties (instead the opposite is happening — solutions of not quite equal quality are forced into a tie).

I also have the solution which tries circles passing through 2 or 3 points (or almost-zero-area circles containing just 1 point) — and, as mentioned, it passed the test cases I got (although there are counterexamples to this approach). But what's worse is that it would have passed all the test cases even if I had tried only circles defined by pairs of points (I printed the results I got without considering triples of points and they were all the same for the official test cases I got). Of course, during local testing, I could easily find test cases where triples of points produced better solutions than pairs alone.

I'm pretty sure there was no rejudge, yet. I expect they will update this comment with status updates: www.hackerearth.com/practice/basic-programming/implementation/basics-of-implementation/practice-problems/approximate/unforgettable-mission/?scroll-id=comments-174-801&scroll-trigger=inview#c73985

Another way of checking is to open one of the submissions and click on the inputs — you can see the current test cases (e.g. input #50 for the original set is: 99 394073632). Until you see different tests when clicking on your submissions, the rejudge hasn't taken place yet.

On Um_nik's October Clash '16, 10 years ago
0

Thanks. This works, indeed. I was trying to click on the submissions from the leaderboard. The link there is different and the code is not visible: https://www.hackerearth.com/october-clash-16/approximate/different-sums/submission/5717113/

Do you mind also explaining your approach?

On Um_nik's October Clash '16, 10 years ago
0

How did you approach the approximation problem? I started with a local search kind of approach, trying to swap numbers starting from the identity permutation, but it didn't bring any significant improvements. An approach that did work better than the identity permutation was a simple greedy which adds elements to the permutation sequentially: always pick the number which adds the smallest number of new sums (and pick the smallest such number, in case of ties). But this was too slow (O(N^3)). I tried to improve things using bit operations, or to consider fewer candidates at each step, but they didn't work well (e.g. considering fewer candidates improved the speed, but impacted the score negatively). In the end, I could only run this for N <= ~2500. And, anyway, this approach is still worse than the top solution (even for the cases where it's fast enough).

And another question: I noticed that when I click on the submission of any contestant (myself included) I cannot see the source code anymore (this wasn't happening in previous contests). Do you know why? Can this be fixed? (I want to look at the top solutions for the approximation problem)

Two of my most recent submissions from the contest have been in "Rejudge" mode for many hours (the last one and the 3rd most recent one). What is going on? At this point it will probably not make too much of a difference in the rankings (I think that my best submission was already rejudged), but it's worrying to know that some of the submissions will randomly not be rejudged on the final test cases.

PS: I posted the same comment on the problem page, but maybe there's a higher chance of getting an answer here.

Ah, I didn't really have much time to look at the actual scores on the original 50 test cases. Locally I tried to minimize the sum of (diff1/1 + diff2/2 + ...) and didn't really compute actual scores according to the scoring formula. Anyway, this is another point in favour of not having some parameters range too much across tests (in this case the K parameter), because the class of tests with large K was very small, and luck (or bad luck) on these test cases can influence the final result a lot.

What I liked about the problem was that it wasn't "easy". Up to the end of the contest I could still come up with improvements and I'm sure I was very far from anything close to optimal. In fact, an interesting thing about the problem is that given more time, my solution would be able to keep producing better scores on most of the test cases. I believe this is also true for your solution — I saw the "relax" function, which is quite similar to mine. Given more time, it would have probably "relaxed" the values much more, thus improving the final score.

The approximate problem hasn't been rejudged, yet. Do you have any estimates on when this will happen?

And regarding the approximate problem: I really liked the problem. It seems pretty complex and in a way it's a pity that it was used in a 24h contest instead of a multi-day one (e.g. the Circuits series). I feel that there are many aspects of it which I didn't have time to explore/understand (including understanding the generators, which I just used for generating local tests, without having time to understand their logic).

Regarding approximate problems in general: Please continue providing the source code for the test case generators and regenerating the test cases after the contest. This is the only way to judge solutions based on their quality, instead of how well they "overfit" the test cases.

And finally some general suggestions, based on my experience (this is my opinion only): If you're going to use only 50-100 test cases for judging the problem, there's no need to include a huge variance in the test cases (e.g. large intervals for choosing the parameters, etc.). With large test case variance, the final test cases will contain only a few test cases of each "class"/"type". What you want instead is more test cases of the same "type", in order to avoid a solution being just lucky/unlucky on a few cases of that type while behaving differently "on average" across it. A simple example regarding the problem from this contest could be to have smaller ranges for choosing M, N and K (so that most cases are similar in this regard, but differ in the way the matrix values are generated).

How did you come across the Chinese training camp paper? And, more importantly, were you able to translate it somehow? :) I can't seem to download it and I don't think there's any translation option while reading it online.

I composed problem B from last year's ACM ICPC SEERC contest (http://mirror.codeforces.com/gym/100818/attachments/download/3890/20152016-acmicpc-southeastern-european-regional-programming-contest-seerc-2015-en.pdf) with the data structure from my paper as the intended solution. The problem has many more updates than queries, requiring fast updates (O(1)), while the queries can be slower (O(sqrt(N))). I'm not sure if the data structure described in this post can be used efficiently for that problem, though, since the queries were path queries.

It's true that I mostly had path problems in mind when I came up with that idea, but it could also be used for solving some subtree problems, by taking advantage that every node has O(sqrt(N)) ancestors inside its "fragment" (to use your term). For instance, if you need to query/update the subtree of a node u, you could query/update the "special" node below u (if any), plus all the ancestors of u on the path from u to its "special" parent. This works because if there's any update at a node v below u (and in the same fragment), then u will also be explicitly updated, as being one of v's O(sqrt(N)) ancestors inside the fragment.

What's the final decision? Will there be a lightning Marathon contest? I can't tell if the needed conditions were met or not.

I somehow missed this announcement about the lightning Marathon. But it seems like a nice idea (i.e. organizing shorter Marathons, of just 1-2 days). Are you going to organize such Marathons in the future, too?

On Leonid's Deadline 24 Finalists [2016], 10 years ago
+17

Congratulations for the win. Can you share your approaches for the problems?

In my team I worked on "Interstellar", but I was unbelievably unsuccessful in coming up with any competitive approaches :) I mainly tried to select a random subset of planets initially and focus on them (the hope was that the other players would be less likely to try to out-buy my shares on these planets). I selected these planets as disjoint collections of long-ish paths (with the condition that they had at least some minimal amount of local income). Then I would place my traders also on disjoint collections of paths, in a greedy manner (repeatedly selecting the path which gave the best output divided by the number of traders used in it — hopefully I implemented the formulas correctly, but I'm not 100% sure of this). I noticed that this strategy brought some strong starts in some of the games (on server 1 I could get even up to 15% of the total income in the first 50-100 turns, but I would drop to 3%-4% by the end), but was usually very poor on server 2. And I didn't have a good strategy in estimating if/when I should buy new shares (I would sometimes buy some shares which didn't bring enough benefit by the end, so buying them ended up being just a waste of money).

+1. It seems that once the contest ends, the scores for the approximation problem are not shown anymore (neither for old submissions, nor for new ones). You only see whether the solution was Accepted on each test case or not, but not the score which measures the quality of the solution on each test case (which is the most important thing).

@organizers: Can you please fix this issue with the approximation problem? It's really annoying (both for upsolving and for looking at the best submissions of the other competitors).

I agree. Please don't use approximate problems where the test case generation method is not disclosed. It makes it almost impossible to test our solutions locally. This problem in particular had a large space of inputs and we were supposed to simply guess how the test cases were generated from such a large pool of possibilities? In particular, some approaches which worked really well on my local test cases made almost no difference or scored worse on the official test cases. In some of the recent Hackerearth contests this issue did not show up anymore, so I thought it was finally solved, but now it's back again. I guess it all depends on the experience of the problem setter, but I think the tester should also push for disclosing the test case generation method.

I would like to share my solution from the official finals to 653E-Bear and Forgotten Tree 2, because I find it rather interesting.

First, I wanted to compute the connected components if we exclude vertex 1. In order to do this I repeatedly chose a non-visited vertex x and "expanded" its connected component starting from x. I did this with a BFS, in the usual way, considering all the non-forbidden neighbors and adding them to the queue if they were not visited before. The trick here was to choose a threshold T, which is slightly more than sqrt(M). Once T nodes were considered in the BFS, we stop (i.e. we don't necessarily compute the connected component fully). During the BFSs I might encounter vertices which were visited as part of some other component, but were not expanded further due to the threshold. For these cases I used Union-Find, in order to merge the newly expanded component into other components which were not fully expanded because of the chosen threshold.

In the end we get a number of connected components. Any component with fewer than T vertices was fully expanded (thus, it's a complete connected component — no more vertices can be added to it). Any components with >= T vertices can actually be merged into a single component. Why? It's easy to see. Let's assume that we have two such components, one with A vertices, the other with B vertices, with both A and B >= T. There are A * B possible edges between these two components. But, since T > sqrt(M), we have A * B > M. Thus, there exists at least one pair of vertices, one from the first component and the other from the second, such that the edge between them is not forbidden.

After this we just need to check that:

  • vertex 1 has >=K non-forbidden edges
  • the number of connected components is <= K
  • vertex 1 has at least one non-forbidden edge to at least one vertex from each connected component

The overall running time is dominated by the O(N * sqrt(M)) Union-Find operations. I implemented Union-Find with path compression and union by size. During the algorithm there can be at most N-2 successful unions, but many more finds; thus, path compression was very useful here to flatten the Union-Find trees and make the cost of each find very close to O(1).
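For completeness, a standard sketch of Union-Find with both of those optimizations (a textbook version, not my contest code):

```python
class DSU:
    """Union-Find with path compression and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # path compression: point every node on the path at the root
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        x, y = self.find(x), self.find(y)
        if x == y:
            return False  # already in the same component
        if self.size[x] < self.size[y]:  # union by size: attach smaller tree under larger
            x, y = y, x
        self.parent[y] = x
        self.size[x] += self.size[y]
        return True
```

With both optimizations the amortized cost per operation is O(alpha(N)), effectively constant; with many finds per successful union (as in the algorithm above), path compression alone already flattens the trees quickly.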

On Gassa's VeeRoute Marathon is finished, 10 years ago
+8

Thanks, I hadn't noticed it (as I said, I kind of stopped working on the problem after the first week, and this announcement was posted one day before the end of the contest?). Anyway, it only cost me one place in the final standings, due to my solution not using the full amount of time that it should have, so it's no big deal. But it would have been nice to find out such things much earlier.

On Gassa's VeeRoute Marathon is finished, 10 years ago
+28

My approach consisted of constructing a time-expanded network. Basically, each node is a (time, place) pair and there are arcs from (t1, p1) to (t2, p2) if an order or a pair of orders can be satisfied starting from time t1 and place p1, moving to do the pickup(s), then doing the dropoff(s). So (t2, p2) was always the time and place of a drop-off. In order to keep the size of this network feasible:

  • I rounded up all times to multiples of 30 seconds (so, essentially, there were far fewer distinct times)
  • I introduced "waiting" nodes and arcs in order to avoid creating too many arcs (otherwise there would have been an arc from the end of each order/pair of orders to the beginning of each potential subsequent order/pair of orders)
  • I "collapsed" all the arcs with fixed starting times (e.g. pickup from the airport) into one
  • I limited the number of 2-order arcs to a smallish number if the test was large (keeping only those with the best "cost", where cost was the difference in travelled distance compared to satisfying the orders independently)

The number of nodes in this network was on the order of tens of thousands and the number of arcs on the order of millions (up to 19-20 million arcs on the larger tests). Still, I easily exceeded the 512 MB memory limit, which was my biggest source of pain. I had to remove fields from my structs and otherwise optimize things in ways I didn't necessarily intend, just to squeeze my solution under the memory limit.

Then I ran a greedy solution using this time-expanded network. At each step I considered each starting time of a driver (at most 13 different values) and tried to compute a path through this graph containing only arcs with yet unsatisfied orders, such that the average cost per order is as low as possible (i.e. trying to minimize (base penalty + total distance) divided by number of satisfied orders). Of course, I didn't do this optimally (that would have required me to at least keep the best cost for each number of orders — not that large, since I noticed drivers couldn't satisfy more than about 30 orders), but heuristically somehow. I also had to avoid picking up the same order twice (for orders where the travel time is very low, the time-expanded network, despite being a DAG, may contain paths with arcs handling the same order at different starting times).

I also had to make this path computation somewhat independent of the destination garage, since it was too time-consuming to try it out for every (starting time, starting garage) pair. So I just guaranteed that the path can end in some garage with drivers still available at that garage at the given starting time, at most 12 hours after the starting time and with at most 9 hours of travel time.

I noticed that the initial drivers were picking up quite a lot of orders (20+), but as I added new drivers, they were less and less efficient (having even many drivers which only picked up 1 or 2 orders).

This is what I did in a few days in the first week of the contest and gave me about 8374 points (if I remember correctly). In the 2nd week I spent almost no time on this problem, because I participated in another contest (the Codechef monthly long contest), so I only added a step which, while time permits, clears the schedules of a few randomly selected drivers (at least 10% of the used drivers and at least 5% of the satisfied orders) and tries to rebuild the schedule for them by considering the arcs in some order (depending on total distance travelled, but giving preference to arcs with 2 orders) and inserting them in the schedule of a driver where it adds the least additional cost (as long as all the other constraints are still satisfied). This gave me +150 points (so about 8524 points in total).

I have a question for the other competitors, though. I just looked at my results on the 500 test cases now and I was surprised to notice that my solution very rarely ran up to the time limit I set for it (which was 29 seconds). I saw many cases where it ran even less than 20 seconds (sometimes as low as 14 seconds). So, unless I have some stupid bug in the code, I'm now wondering if I measured time incorrectly. I used "gettimeofday", which is what I use (and need to use) in Topcoder Marathon matches. But this measures the actual "real" time, while the time spent running my solution may be less. Was the 30 seconds time limit imposed on just the time spent in my code? (e.g. if multiple solutions were run on the same machine and my solution was sometimes being preempted, then in a physical interval of 30 seconds, my solution could have actually only run much less than that, e.g. 20 seconds only). Was this the case? And if so, was this mentioned anywhere? (I am wondering if I just missed this, which seems to be a pretty important thing)

What did you use for measuring time here on Codeforces?

On Gassa's VeeRoute Marathon announcement, 10 years ago
0

Never mind. The link to view other contests was just too small somewhere :)

On Gassa's VeeRoute Marathon announcement, 10 years ago
0

Is the contest over? I can only see the VK Cup Qualification Round when I click on "Contests" (I see no other contests and no link to get to the other contests). I thought this was going to last approx. 1 more day.

On Gassa's VeeRoute Marathon announcement, 10 years ago
+23

But if I understood the problem statement correctly, one intermediate vertex is allowed if the driver travels with 2 people (e.g. move to A, pick up person 1, move to B, pick up person 2, move to airport; or pick up both persons from the airport, move to A, drop off person 1, move to B, drop off person 2). Since triangle inequality is not guaranteed it might happen that in such a scenario the person being picked up first (or being dropped off last) arrives at their destination faster than using a direct route (I guess it's not very likely, but maybe it could happen).

Our idea (worth only 666.22 points) was also greedy-based. We considered the treatment stages in increasing order of the earliest time at which they are available (initially we have the first stage for each patient available at time 0, but once a stage is scheduled at some time, we have the earliest start time of the next stage for that same patient).

So the only decision to be made is which type of table to use for the current stage (of some patient) and whether to use a new table of that type or not. For a given table type T, we either reuse the table which becomes available the earliest or a new table. For this decision we used a scoring function consisting of a linear combination of 4 parameters:

  • a (fixed) penalty for using a new table (to try to reduce the number of tables used)
  • a penalty for the time at which the current stage is completed (to try to schedule it earlier, if possible)
  • a penalty for "wasted" time (if a table is available at time T1, but we can only start the treatment at time T2>T1 due to dependencies on previous stages)
  • a bonus for each table type, which depended on the set of treatments which could not yet be treated by any previously used type of table (the weight of a treatment is just the number of patients requiring it)

Each parameter (except the fixed new table penalty W0) was of the form weight * parameter_value (e.g. W1 * completion_time, W2 * wasted_time, W3 * bonus_score). We considered various combinations of these 4 weights, ran the greedy for each of them and picked the best.
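The "run the greedy once per weight combination, keep the best" step can be sketched generically as a grid search (all names here are illustrative, not our actual contest code; `run_greedy` stands for the scheduling greedy above and is assumed to return a (score, solution) pair with lower scores being better):

```python
import itertools


def best_over_weights(run_greedy, weight_options):
    """Grid-search over weight combinations for a parameterized greedy.

    weight_options is a list of candidate value lists, one per weight
    (e.g. [[W0...], [W1...], [W2...], [W3...]]).  Runs the greedy once per
    combination and returns (best_score, best_solution, best_weights).
    """
    best = None
    for weights in itertools.product(*weight_options):
        score, solution = run_greedy(weights)
        if best is None or score < best[0]:
            best = (score, solution, weights)
    return best
```

The cost is one greedy run per point of the grid, so the number of candidate values per weight has to stay small; random sampling of weight vectors is a common alternative when the grid gets too big.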

+5

For me this was very fortunate :) I worked on the problem C-Hospital in the first 3 hours of the contest and with less than 2 hours left to go the approach that my teammate used had trouble finding any solutions on many of the larger test cases. So I decided to stop improving C and also try something at D, since there were many more points to gain there.

Since there would be no chance to implement anything from scratch in the remaining time, I took my solution from TCO'15 Marathon Round 1 and modified it slightly, in order to find small area polygons. Note that I only considered the case of polygons with exactly N-K vertices (no time to try anything fancy there). The main algorithm I used was to repeatedly select a random subset of N-K vertices, then pick a random empty triangle consisting of 3 selected points. After that, I would repeatedly add the smallest area "good" triangle for which 2 vertices were already on the polygon and the 3rd one was still outside the polygon (this essentially added one more point to the polygon).

Once the min-area solution worked, I thought about what I could quickly do to get large area triangles. Since the remaining time was very short, I modified the above solution to repeatedly add the largest area "good" triangle at each step :) This worked, too.

In the end, this approach produced the best solutions for most of the test cases ("best" in the sense of compared to other approaches we tried, not compared to other teams). Since I made all the submissions for this approach in the last hour of the contest, we didn't get any relative scores, so we didn't know how well this performed. We were quite surprised to see our score for this problem be 779.68 (7th place overall, and I think 6th place for this problem), given that:

  • by mistake we didn't submit the best solution for test 10 (for which we only had a solution, from an earlier submission, which was about 10 times worse)
  • I didn't get the chance to modify too much of my solution for TCO'15 Marathon Round 1, so I only added empty triangles, i.e. triangles not including any of the N given points (when, in fact, I should have completely excluded the K points which I didn't care about to add on the polygon)

I would be very interested in knowing how other teams approached this problem.

Unrelated to the rejudge, but... why is it not possible to see the score obtained by the challenge/approximation problem submissions after the contest ended? Now I can only see the status of the submission on each test case (Accepted, TLE, etc.), but not the score it obtained on each test case, which is actually the most useful information. And there's also no global score of the submission displayed anymore (e.g. 100, 98.79, 80.76, etc.).

I wanted to look at the submissions of the other contestants and see on which test cases they did better than my solution, but:

1) I can't tell which one is their best submission (since no global score is displayed anymore)

2) I can't see the score per test case (since it's not being displayed)

Note that this is true also for my own submissions, for which I could see a global score and a score per test case during the contest (but not after it).

Hm... I actually thought about it (i.e. pairing a left side with the furthest right corner from which it is fully visible and pairing a right side with the furthest left corner from which it is fully visible), but it seems to me that the sets of visible sides are not supersets of each other. Let's take 4 buildings:

  • B0: [1,2], height = 10
  • B1: [3,4], height = 9
  • B2: [5,6], height = 1
  • B3: [large number, large number+1], height = 1

The left side of B3 is visible from both the right corners of B0 and B1 (if not, just move B3 further to the right until it is fully visible from both). So B0 is the leftmost right corner from which the left side of B3 is fully visible. But the right corner of B0 doesn't see the left side of B2, while the right corner of B1 does.

For this example the final answer might not change whether you pair the left side of B3 with the right corner of B0 or of B1, but I'm wondering if there are examples where it matters.

It was a nice problem set. After reading the (currently) posted editorials, I would say that most of my solutions are based on the same ideas (except that I actually used the bit masks approach for "Game of divisors" and it fit well within the time limit, because not all bit masks are reachable from the starting configuration — I really didn't want to think about the problem more than was needed :) ).

I read Errichto's comment about how "Light it up" was an amazing problem. For me it really wasn't. My solution is probably not the intended one, but it passes all the test cases and I'd be curious to find out cases where it might fail. So here it goes: for each of the two points on top of a building, construct a set with the lateral sides fully visible from that point (so, for the point on the left side of a building i, this set consists of the left side of building i, plus some right sides of buildings j located to the left of i, which are fully visible). Then, the question becomes: choose a minimum number of sets which cover all the 2*N lateral sides of the N buildings. Here I simply used greedy: always pick the set which covers the largest number of yet uncovered sides. My plan was, in case this greedy didn't pass all the test cases, to use techniques which I normally use for the approximation problem (e.g. randomize the order of the picked sets a bit, add some kind of score which considers more than just the number of yet uncovered sides, use multiple runs, etc.). But the simple greedy approach just worked.
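The reduction above is exactly minimum set cover, and the greedy I used is the textbook greedy for it. A minimal sketch (illustrative, not my contest code; it carries the standard ~ln(n) approximation guarantee, which evidently sufficed here):

```python
def greedy_set_cover(universe, sets):
    """Greedy set cover: repeatedly pick the set covering the most
    still-uncovered elements.  Returns the indices of the chosen sets,
    or None if the universe cannot be covered at all.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # index of the set covering the most uncovered elements
        best = max(range(len(sets)), key=lambda i: len(sets[i] & uncovered))
        if not sets[best] & uncovered:
            return None  # no set covers anything new: impossible
        chosen.append(best)
        uncovered -= sets[best]
    return chosen
```

In the problem above, the universe is the 2*N lateral sides and there is one set per rooftop point (the sides fully visible from it); the randomized improvements I mention (shuffled tie-breaking, richer scoring, multiple runs) would slot into the `max` selection step.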

For the approximation problem, the core of my solution was a greedy algorithm. I maintained for each ring the number of shifts which are still "compatible" with the current placement of rings so far. Then, when choosing the next ring to add to the solution, I would choose one from a set of candidates, which minimized the total number of new shifts it "kills". The set of candidates usually consisted of the rings with the smallest number of compatible shifts (or a random selection of them if there were too many). If there were too few, I had a mode in which I also added more candidates (in increasing order of the number of compatible shifts, up to a limit). The reason for considering rings with a small number of "compatible" shifts to add at each step is that if I don't try to add them now, I might not get the chance later (i.e. they might lose even more of their compatible shifts and become fully incompatible).

OK, yes, it makes sense. Basically, the x coordinate of the intersection between two such functions can be computed as if one of them is constant and the other one is a line with a fixed slope (because the slope difference between the two functions is always the same: abs(i-j)).

I hope I'm not saying bullshit, but it seems to me that this DP could be handled in O(N*log(N)) time. Once dp[a] is computed, it can be seen as a half-line with a slope F0 until a+1, then slope F0+1 until a+2, etc. This is essentially a convex function. And if at some point the convex function of some dp[b] (b>a) is better than the convex function of dp[a], then we can drop the convex function of dp[a] completely. To me this suggests that we can keep a deque of these functions and when we get a new function, we just remove some functions from the back of the deque. And when the function from the front of the deque "expires", we remove it from the front. If we can compute in O(1) time the intersection of 2 convex functions, then we should get an O(N) solution. I'm not sure how to do it in O(1) time, though, but it's easy to do it in O(log(N)) time (by using binary search), so at least an O(N*log(N)) solution seems doable. Am I missing something here?
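The deque idea above is essentially the monotone convex hull trick. A sketch for the plain-lines case (only illustrative: the piecewise-linear convex functions above need the extra care discussed, but the deque mechanics are the same), assuming lines are added in strictly decreasing slope order and queries arrive with non-decreasing x:

```python
from collections import deque


class MonotoneCHT:
    """Minimizing convex hull trick over lines y = m*x + b.

    Assumes lines are added in strictly decreasing order of slope m
    and query points x are non-decreasing, so a deque suffices.
    """

    def __init__(self):
        self.hull = deque()  # lines (m, b) forming the lower envelope

    @staticmethod
    def _bad(l1, l2, l3):
        # l2 is dominated if l1 and l3 intersect at or left of where l1 and l2 do
        (m1, b1), (m2, b2), (m3, b3) = l1, l2, l3
        return (b3 - b1) * (m1 - m2) <= (b2 - b1) * (m1 - m3)

    def add_line(self, m, b):
        while len(self.hull) >= 2 and self._bad(self.hull[-2], self.hull[-1], (m, b)):
            self.hull.pop()  # remove lines that are no longer on the envelope
        self.hull.append((m, b))

    def query(self, x):
        # as x grows, lines at the front "expire" and are removed
        while len(self.hull) >= 2 and \
                self.hull[0][0] * x + self.hull[0][1] >= self.hull[1][0] * x + self.hull[1][1]:
            self.hull.popleft()
        m, b = self.hull[0]
        return m * x + b
```

Each line is pushed and popped at most once, so the total work over all operations is O(N) plus the cost of the intersection test, matching the O(N) hope above; dropping either monotonicity assumption is what forces binary search (or Li Chao trees) and the O(N*log(N)) bound.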

In your equation dp[a-1] = dp[a-k-1] + sum(a-k,a-1), does it mean that the most recent resting stop before a-1 took place at a-k-1? In other words, are you suggesting that maybe the distance (i.e. number of cities) between two consecutive resting stops is not too large? If yes, then this is exactly what I tried when I implemented the DP today. Optimal solutions can have large distances between two consecutive resting stops. I think I got scores about 2%-3% worse when I limited this distance to something like 100-200, compared to not limiting it. But since the DP is faster, maybe the extra time could have been used to find better ways to improve the solution. Now that I'm writing this post, I realize there is something I didn't try: using an approximate DP (e.g. with a limit on the distance between 2 consecutive resting stops) while doing iterative improvements, and then running a full DP once, at the end, on the best cycle found using the approximate DPs.

I really liked the problems from this contest. They had various degrees of difficulty and most of them required some non-trivial cleverness in order to solve them. And, more importantly, they weren't implementation-heavy (centroid decomposition was probably the most complicated technique from an implementation point of view, and that's still quite reasonable).

I had about 3 hours left for approaching this problem (due to sleeping and taking a long time to solve "Bear and cookies" — which I couldn't let myself abandon in favour of the approximation problem :), and which in the end paid off). My approach was to start with a proper cycle (I used the MST approach for TSP for the starting cycle) and then try to improve this cycle as long as I had time (by choosing a new starting node along the cycle or by rewiring the cycle — i.e. cutting two segments of the cycle and reconnecting them differently). When computing the score for a cycle I, of course, considered the possibility of adding stops for resting. However, I only did this greedily during the contest :( (i.e. I only added a rest stop when the cost of traversing the segment without resting was larger than the rest time + the new cost of traversing the segment). The advantage was that this greedy approach was very fast (O(N)), so I could run lots of iterations to improve the initial solution. Somehow I didn't even think about the possibility of using DP instead of greedy for this part. As soon as I read your comment, though, it became obvious that I should have used DP here. So today I had some time to take my solution from the contest and replace the greedy part by a DP solution (with a worse time complexity, though — O(N^2)). This reduced the number of improvement iterations significantly, but the scores were much better. On my local tests I obtained a 22% improvement. Given that my score during the contest was about 73%, this change alone would have probably given me around 94%-95% of the best score. I couldn't test it by running on the official tests, because it seems that scores are no longer reported when submitting on Hackerearth.

So I have a question — can the DP be implemented faster? My DP essentially computed S(i,j) = the minimum score to reach the i^th node on the cycle with a fatigue equal to F0+j (if j=0 and i is not the first node on the cycle, then a resting stop took place there). I tried placing small upper bounds for j (e.g. force j <= 50 or j <= 250), but the score improvements were lower (despite running more iterations).
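The comment doesn't spell out the cost model, so here is a minimal O(N^2) sketch of the S(i,j)-style DP under an assumed model: traversing edge i at fatigue f costs t[i]*f, fatigue then grows by 1, and resting at a node costs R and resets fatigue to F0. All names (t, F0, R) and the cost function are hypothetical, not the actual contest scoring:

```python
def min_cycle_cost(t, F0, R):
    # t[i] = base time of edge i (node i -> i+1). Assumed model: traversing
    # edge i at fatigue F0+j costs t[i] * (F0+j), after which fatigue grows
    # by 1. Resting at a node costs R and resets fatigue to F0.
    n = len(t)
    INF = float('inf')
    # dp[j] = best cost to stand at the current node with fatigue F0 + j
    dp = [INF] * (n + 1)
    dp[0] = 0.0
    for i in range(n):
        ndp = [INF] * (n + 1)
        # option: rest here first (any incoming fatigue), then traverse
        best_rest = min(dp) + R
        for j in range(n):
            # option: continue without resting
            if dp[j] < INF:
                ndp[j + 1] = min(ndp[j + 1], dp[j] + t[i] * (F0 + j))
        ndp[1] = min(ndp[1], best_rest + t[i] * F0)
        dp = ndp
    return min(dp)
```

Capping j (as in the "j <= 50 or j <= 250" experiments) just means truncating the dp array, trading accuracy for speed.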

On riadwaw's Facebook HackerCup Round 1, 10 years ago
0

I did something similar and it worked without problems. Did you assume that the "head" must always be the 4th element in its set (or the 1st, whatever, it's not clear to me how you're counting them)? If yes, then this is probably the mistake.

+10

I really liked the fact that the process of generating the test data was properly described. This way we could generate local tests which had a similar structure to the official tests, making local testing meaningful. In my case, most improvements on the local tests were also improvements on the official test cases, which was really good.

As Xellos said, it would have been nice if finding a solution was easier. Initially I tried various approaches and almost none of them worked (i.e. they couldn't find a solution in <= 10^6 steps). The first approach that was actually able to find solutions for all test cases was also at the core of my best submission :) [ this first solution scored 80.57 in the end, so even if I had stopped competing then, I would have got at least 3rd place, which is not bad :) ]. I will mention this first approach now and I will provide details on the improvements and what I have in my best submission later. Basically, the first approach tried at each step to select a move which:

  • is a maximal interval (in terms of sum)

  • has maximum "score": where the score of a move is equal to the sum of absolute differences between the elements in the interval and their average (i.e. the value with which they are replaced after the move) — for instance, if the interval consists of the elements [1,7,4], their average is (1+7+4)/3=4 and the score of this move is: |1-4|+|7-4|+|4-4|=3+3+0=6.

I don't exactly know why this approach finds relatively good solutions, though :)
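The move score from the second bullet can be computed with exact rational arithmetic, since in general the average of an interval need not be an integer:

```python
from fractions import Fraction

def move_score(interval):
    # score = sum of |x - avg| over the interval, where avg is the value
    # every element is replaced with after the move
    avg = Fraction(sum(interval), len(interval))
    return sum(abs(Fraction(x) - avg) for x in interval)
```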

Yeah, that's what I also had initially, and I tried really hard to optimize it to run in time — I could only make it score 80 points (the last 2 test cases still got TLE). But if you take a closer look, you'll see that, when computing the answer, this looks a lot like multiplying some matrices together (e.g. you have one matrix for each block). So you can build a segment tree over the blocks, which is updated whenever a block is updated. This way, the answer will be found using the data stored in the root of the segment tree. Of course, at this point you don't really need blocks anymore — you could make each block equal to one character :) But that exceeded the memory limit for me, so in the end I used blocks of 16 characters and a segment tree over them.
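A generic sketch of the segment-tree-of-matrices idea (the 2x2 size and the matrix contents are illustrative only; the real matrices depend on the actual problem, with one leaf per block of characters):

```python
class MatrixSegTree:
    # Segment tree whose leaves hold matrices and whose internal nodes hold
    # left-to-right matrix products, so a point update costs O(log N)
    # multiplications and the whole-array product sits in the root.
    IDENT = ((1, 0), (0, 1))

    def __init__(self, mats):
        n = 1
        while n < len(mats):
            n *= 2  # pad to a power of two with identities to keep the
        self.n = n  # product order correct (matrix product is non-commutative)
        self.t = [self.IDENT] * (2 * n)
        for i, m in enumerate(mats):
            self.t[n + i] = m
        for i in range(n - 1, 0, -1):
            self.t[i] = self.mul(self.t[2 * i], self.t[2 * i + 1])

    @staticmethod
    def mul(a, b):
        return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                           for j in range(2)) for i in range(2))

    def update(self, pos, m):
        i = self.n + pos
        self.t[i] = m
        i //= 2
        while i:  # recompute the products on the path to the root
            self.t[i] = self.mul(self.t[2 * i], self.t[2 * i + 1])
            i //= 2

    def root(self):
        # product of all leaf matrices, in order
        return self.t[1]
```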

Hm.. I think I missed taking your best solution. I saw that your final score was 99.977, so I went through your last few submissions until I found one which rounded the same as this (since they only show the score for each submission with 2 decimal places). But I think I picked one whose rounded score was 99.97, instead of 99.98 :( Sorry for that. The individual submissions also show the total absolute score (if you hover the mouse over the right place), but using that would have been too time consuming.

So, obviously, the experiments I did were somewhat incomplete. It's possible that some other submission of each contestant (not the best scoring one on the official test cases) might score better on my local test cases. My point was simply that the official test cases missed some important cases, which show up easily when generating random test cases. As explained in another comment, this seems to be because the D values used in the official test cases were much smaller than the ones mentioned in the problem statement (more like D<=10 instead of D<=100). And this issue of having test cases generated by some secret strategy shows up again and again for challenge/approximation problems on platforms like Hackerearth or Codechef (only the Topcoder Marathon matches are professional enough from this perspective).

By the way, if anyone is interested, ceilks analyzed some parameters of the official test cases and it seems that the values of D were more like <=10 than <=100 (as mentioned in the problem statement), which would explain why the strategy I mentioned (trading off some inversions for a larger squared sum of distances) has no effect on the official test cases. You can read more about it in the Comments section of the problem.

+15

Just out of curiosity, I generated 100 random tests. I chose N randomly between 500 and 1000, and K randomly between 10000 and 100000. I chose the X coordinates randomly between 0 and 10000000 (avoiding repetitions), but I set X[1]=0 and X[N]=10000000 (in order to be similar to the official test cases). I chose the permutation P randomly and I chose each value of D randomly between 1 and 100.
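A generator following this description might look like the following (the exact output format, and whether the interior X coordinates should be sorted, are my assumptions):

```python
import random

def gen_test(seed=None):
    # Parameter ranges as described: N in [500, 1000], K in [10000, 100000],
    # distinct X in [0, 10^7] with X[1]=0 and X[N]=10^7 pinned, random
    # permutation P, and D values in [1, 100].
    rng = random.Random(seed)
    n = rng.randint(500, 1000)
    k = rng.randint(10000, 100000)
    # sample distinct interior coordinates, avoiding the pinned endpoints
    xs = rng.sample(range(1, 10**7), n - 2)
    x = [0] + sorted(xs) + [10**7]
    p = list(range(1, n + 1))
    rng.shuffle(p)
    d = [rng.randint(1, 100) for _ in range(n)]
    return n, k, x, p, d
```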

I then copy-pasted the best solutions of the top 5 competitors, except that in my case I used the solution which, for the last 10% of the moves, tries to maximize the squared sum of distances, ignoring the number of inversions (as explained in my previous comment). ceilks's solution was taking too long (there is no time control in his code, so I'm guessing that his code just worked on the official test cases in time) and I couldn't compile HellKitsune's solution in the setup I had (my compiler couldn't find <bits/stdc++.h> and I didn't spend any time trying to find out how to make it recognize this header). So, in the end, I only evaluated my solution, anta's and Kmcode's.

The results on the 100 test cases are below and my solution obtains better scores than the other two. This is mostly due to the fact that the additional strategy which focused on improving the squared sum of distances in the last 10% of the moves obtained between 1% and 5% better scores than the solution without it on 38/100 test cases. On the other test cases, as expected, my solution was generally worse than the other two, but by a very small margin (while the wins on the 38 test cases were by a large margin).

However, as mentioned in my previous comment, this strategy brought zero improvement on the 20 official test cases. How is that possible? If the official test cases had been randomly generated, then there should have been around 7-8 test cases on which I should have got 1%-5% better scores than what I did. My only explanation is that the test cases were generated according to some hidden strategy which is significantly different from a random one. And this brings me to my biggest complaint about challenge problems on Hackerearth and also on Codechef. The strategy for generating the test cases is almost never described, which is an awful decision, in my opinion. If we can't generate local tests in order to tune our solutions, then everything becomes a guessing game. In the case of this problem, when I saw that a solution which was bringing me huge score improvements locally had zero effect on the official test cases, I basically stopped trying, because it meant that I was testing my solution on the wrong kind of test cases, without having any idea how the "right" (i.e. official) test cases looked like.

Results of 100 random local tests (the scores on each line are: first-mine, second-anta's solution, third-Kmcode's solution; smaller scores are better). I will upload the files somewhere and share them later for anyone interested (though you can obtain similar results by writing your own generator and using the strategy I described at the beginning of this post)

  • 0: 118775.593255 120449.850401 121034.333374

  • 1: 175777.354474 179931.467292 182660.491005

  • 2: 531.401095 525.478701 531.565640

  • 3: 14083.361750 13871.374769 14049.528068

  • 4: 43271.604239 42615.338797 43037.820895

  • 5: 229440.948818 234355.370203 235493.339293

  • 6: 8468.543977 8348.384712 8449.553956

  • 7: 20765.191134 20355.213065 20722.876004

  • 8: 66153.103441 65946.799585 66942.328976

  • 9: 21465.960530 21062.681433 21394.392677

  • 10: 180313.676816 183058.700239 184447.359038

  • 11: 2189.342829 2156.765415 2187.870864

  • 12: 3671.068116 3627.771627 3665.717468

  • 13: 14520.477126 14283.212222 14477.683136

  • 14: 134693.351150 136524.129409 138722.517411

  • 15: 0.005492 0.004444 0.000000

  • 16: 33487.700300 33273.992054 33623.149107

  • 17: 21947.569780 21614.321303 21852.374478

  • 18: 45262.909914 45364.718182 45940.395344

  • 19: 81590.884365 81480.554588 82685.592188

  • 20: 0.168267 0.142691 0.000000

  • 21: 1656.637277 1634.607731 1656.759530

  • 22: 9598.209981 9445.979189 9586.128109

  • 23: 117017.351580 119593.477492 120509.618310

  • 24: 334.270813 330.613132 334.561708

  • 25: 3153.370299 3113.650690 3148.972083

  • 26: 51979.703701 52819.868027 53385.876226

  • 27: 29669.893828 29853.968418 30167.364562

  • 28: 31021.211354 30547.961608 30879.654225

  • 29: 127581.370180 130076.532780 132281.548622

  • 30: 2282.259273 2255.880674 2281.972294

  • 31: 10064.822119 9868.336529 10041.007310

  • 32: 27883.976786 27300.487018 27853.576393

  • 33: 86370.507894 84766.232678 86486.095749

  • 34: 11209.943362 11099.222906 11160.280797

  • 35: 18934.018932 18608.380386 18870.639383

  • 36: 2688.476182 2654.933772 2681.699868

  • 37: 9994.539039 9786.186934 9973.090106

  • 38: 125864.488913 128869.973054 129305.531915

  • 39: 0.011393 0.008895 0.000000

  • 40: 1083.051813 1070.024909 1083.352531

  • 41: 10553.200729 10432.236471 10535.842485

  • 42: 141849.048828 144639.894866 145798.831536

  • 43: 0.551054 0.520647 0.000000

  • 44: 10096.366076 9968.217139 10051.761981

  • 45: 580.658971 572.882198 580.896288

  • 46: 8776.439279 8612.498453 8752.525437

  • 47: 31523.776271 30860.513929 31484.003196

  • 48: 158054.016150 161068.239602 162778.073123

  • 49: 4308.224529 4245.484249 4299.767027

  • 50: 11084.775464 10857.521785 11055.225077

  • 51: 42758.692483 41995.694069 42495.939144

  • 52: 113756.680615 115940.220423 117499.705965

  • 53: 0.402649 0.395810 0.402809

  • 54: 8990.456485 8930.970726 8946.920708

  • 55: 22157.918747 22021.379702 22207.570395

  • 56: 50359.670605 50280.307220 50738.449711

  • 57: 73554.882744 74714.743325 76054.446864

  • 58: 170.974950 168.922806 170.998083

  • 59: 1541.745733 1520.683850 1538.981892

  • 60: 17427.172150 17122.913003 17347.208597

  • 61: 65590.437202 64008.103485 65281.658552

  • 62: 300.470178 297.507711 300.579359

  • 63: 2250.266834 2203.535288 2245.928809

  • 64: 3217.849383 3177.377405 3212.745288

  • 65: 23862.276415 23514.502469 23768.750398

  • 66: 69749.553179 71042.035768 71881.341771

  • 67: 153931.598838 159496.781745 162442.580444

  • 68: 4320.433198 4261.377842 4297.999279

  • 69: 13703.966877 13422.147557 13658.587314

  • 70: 100460.916986 102698.379084 103745.156410

  • 71: 253275.121133 258091.441184 258407.175896

  • 72: 12116.784968 11949.970400 12081.693306

  • 73: 4954.639811 4903.666153 4946.959437

  • 74: 24193.570723 24047.823174 24256.605802

  • 75: 65267.924194 66834.056255 67134.887620

  • 76: 135382.015927 140027.936887 142201.851356

  • 77: 300.368418 296.995973 300.436205

  • 78: 3868.452733 3802.703764 3861.658060

  • 79: 67184.118561 68056.156239 68503.908845

  • 80: 156314.895628 159524.611323 160497.491723

  • 81: 7744.743766 7659.351417 7720.956067

  • 82: 195067.253409 201764.859326 204134.506339

  • 83: 0.723379 0.662980 0.000000

  • 84: 8333.797364 8249.620756 8313.640940

  • 85: 30412.643462 29990.227520 30352.935345

  • 86: 111243.097039 111974.700032 114075.644949

  • 87: 195.116852 192.776825 195.299753

  • 88: 4947.566577 4890.392076 4933.071624

  • 89: 94584.077612 95578.847863 96398.144913

  • 90: 136226.694122 140816.825189 143466.898866

  • 91: 82127.785561 81450.464010 82681.843439

  • 92: 11528.936750 11445.924613 11487.978420

  • 93: 19486.206725 19306.875304 19429.696129

  • 94: 67369.912108 68563.368559 69130.315950

  • 95: 172928.020459 175725.841232 176545.667213

  • 96: 11004.470112 10905.679033 10947.414167

  • 97: 810.542780 798.601247 809.533873

  • 98: 80575.467958 81027.551761 81819.932406

  • 99: 71648.229224 70247.486462 71584.005115

  • Total: 4858824.932541 4918744.004140 4972999.650346

(Small explanation: there are some test cases where Kmcode obtains a perfect score of 0, while my solution and anta's solution don't — in my case, this is because I never allowed swapping P[N], since on the official test cases K was smaller than the initial number of inversions by a sufficiently large margin)

+8

How did you approach the challenge problem?

My solution consists of minimizing the number of inversions first (i.e. every move must decrease the number of inversions by 1) and, among multiple such moves, it picks the one which maximizes the squared sum of distances. This was the basic greedy which I later randomized (e.g. at each step, I picked a random move among the top K best ones). Towards the end of the time limit I only did this for the last moves (i.e. I kept a large prefix of the best solution found so far and only tried to improve something at the end).

What's strange is that on my local tests I had an additional strategy which was giving me between 2%-5% improvement on about 40% of the test cases (and no improvement on the others), but it didn't improve anything on any of the official test cases, which I find quite weird. The additional strategy was to focus only on increasing the squared sum of distances as much as possible (and ignoring if it would decrease/increase the number of inversions). I only used this strategy for improving the last moves (about 10%) of the best solution which tried to minimize the number of inversions first. I also tried doing a greedy which always picks the move which improves the score the most (i.e. it might be OK to increase the number of inversions if this also increases the squared sum of distances significantly). This was giving me similar improvements on my local tests (when used under the same conditions), but none on the official ones. Given that the difference between the top 8 solutions was < 0.1%, an improvement of even 1% on just 1-2 of the test cases with larger scores (out of 20) would have been enough for any of them to win.

Can you check the judge program of the approximation problem? I made some submissions which verified that the interval [X[1],X[N]] is ~10^7 in 19/20 cases — in this case, the worst absolute score on these test cases should be < 5 — yet the judge program is showing me scores > 100000, which are not possible unless the judge program uses a different scoring function from the one mentioned in the problem statement.

You mean you're giving I_love_Tanya_Romanova a chance to compete in SEERC (which also takes place tomorrow and which his team won last year) :) Or maybe he'll also compete in the Clash afterwards :) Why not?

On niki.s.16's IOI 2015 Discussion, 11 years ago
+16

I was thinking about the same thing, but I'm not sure if you can really update the segment tree efficiently (at least not while also keeping the fractional cascading). If you can do it, then I'd be interested in how this is done (I've spent a bit of time thinking about this, but I might be missing something).

However, I thought about another solution which doesn't require segment tree updates (only queries — thus, with fractional cascading, we get truly O(log(N)) per query).

First, let's assume that all b[i]'s are distinct. If they aren't, it's not difficult to extend all the intervals slightly so that all the equal b[i]s are mapped to distinct positions after the renumbering (e.g. if you have 3 b[i]s equal to 7, they can become 7.1, 7.2 and 7.3).
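The "7 -> 7.1, 7.2, 7.3" trick can equivalently be done with integer ranks via a stable sort, e.g.:

```python
def distinctify(b):
    # Map equal values to distinct integer ranks while preserving the
    # relative order of all values (ties broken by original index, which is
    # what extending the intervals "slightly" achieves).
    order = sorted(range(len(b)), key=lambda i: b[i])  # stable sort
    rank = [0] * len(b)
    for r, i in enumerate(order):
        rank[i] = r
    return rank
```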

Then we sort the k[i]s in each query and we maintain a stack with the queries which we will do next for finding the intervals which "cover" a k[i]. An element of the stack consists of two intervals [p,q] and [r,s] and represents a query in the 2D segment tree (all the intervals having p <= a[i] <= q and r <= b[i] <= s). Initially the stack is empty.

We will consider the teams in increasing order of k[i]. If the stack is empty, we add to the stack the pair of intervals ([1,k[i]], [k[i],MAX]). Otherwise, let ([p_U,q_U],[r_U,s_U]) be the intervals at the top of the stack. Because of the way the algorithm works, we will have: p_1=1, p_V=q_(V-1)+1, s_1=MAX, s_V=r_(V-1)-1. This says that the intervals [p_X,q_X] are adjacent to one another and have increasing left endpoints, while the intervals [r_X,s_X] are adjacent to one another and have decreasing right endpoints (as we move from bottom to top).

If r_U > k[i] then we add to the top of the stack the pair of intervals ([q_U + 1,k[i]], [k[i], r_U — 1]) ; otherwise we only modify q_U and set it to k[i].

After this, we go through the elements of the stack from top to bottom, until we remain with zero needed intervals (which is initially equal to k[i]) or until the stack is empty. We first make the query [p_U,q_U] x [r_U,s_U], corresponding to the top of the stack (if r_U < k[i] then we first set r_U = k[i]). If the number of intervals is less than or equal to the needed number of intervals, then we pop the element from the stack and decrease the number of needed intervals accordingly. If the stack is not empty, then we update q_(U-1) and set it to q_U (we basically extend the interval [p_(U-1),q_(U-1)], which is now at the top of the stack, to [p_(U-1),q_U]). If, however, the query for [p_U,q_U] x [r_U,s_U] has more intervals than the number of needed intervals, then we need to find X = the Y^th smallest right endpoint among all these intervals (where Y=number of still needed intervals). What this means is that among all the intervals with left endpoint in [p_U, q_U] we need to use all those which have the right endpoints in [r_U,X]. We "mark" this by setting r_U=X+1 (i.e. by reducing the old interval [r_U,s_U] to [X+1,s_U]).

If the explanation above is too technical, think of the stack as a set of disjoint queries in the segment tree, in increasing order of priority (top to bottom). Essentially, you always want to consider intervals with right endpoints close to the current k[i] (but >= k[i]) and with left endpoints <= k[i]. As we advance with the index i, we get new left endpoints which are valid for the current i (those between k[i-1]+1 and k[i]). From the intervals with these new left endpoints we want to consume first those intervals whose right endpoints are smaller than any right endpoint of a remaining interval which has a smaller left endpoint than them. And once all such intervals are consumed, we "add" the remaining intervals with larger right endpoints by extending the interval of left endpoints from the new top of the stack to also contain [k[i-1]+1,k[i]].

There are O(M) push and pop operations and, thus, there are O(M) 2D segment tree queries (which can take O(log(N)) if we use fractional cascading).

In my solution I also used an additional operation which may be called O(M) times: finding the Y^th smallest element (right endpoint) in a 2D range. Of course, we can use binary search with range count queries on the 2D segment tree to support such an operation in O(log^2(N)). But we can use another 2D segment tree with fractional cascading in order to support this operation in O(log(N)) time, too.

I would be interested to know what the contestants who scored full points during the competition implemented.

0

My approach didn't have special cases for small and large K. Reduce N to something small (e.g. around 600000-700000). Then compute all A[i]'s and B[i]'s up to N. Then compute a score for each interval of K consecutive values from the arrays A and B (I used the sum of the most significant bits, enough so that the result fits an unsigned int — but this made it slower, because I iterated over bits; probably doing a normal sum over long long would have been faster).

Then I kept the best X values of A and Y values of B (X around 1500) and tried all their combinations for finding the best (x,y) pair. The score of a combination is exactly the sum of the elements (I can compute it by iterating over bits and deciding how many bits of each value exist in the KxK rectangle). For K=100000 this could have exceeded long long, so I only used the most significant bits here (from bit 29 to bit 9, thus excluding bits 8...0).

Maybe a better approach would have been to use a higher limit of N and a lower limit of X and Y. This way, I could have found more promising intervals, but I would have tested fewer combinations. Or simply let X be large and Y be small, or something similar, in this case.

For K=1 I also thought about finding the best matching A[i] and B[j] (which have the most bits together), but I couldn't find an efficient way to do this in the late hours of the contest, so I just kept the approach I already had and tried to tune its parameters more. I would have liked to spend more time on the challenge problem (which is the main reason I participate in Hackerearth contests), but it took me many hours (from the evening until the morning) to get 100 points on the problem "Upgrade". And in the overall standings, getting the 100 points on "Upgrade" was more valuable than a few more percentage points on the challenge problem.

+16

I will explain the "sqrt trick" solution that I was referring to in my first comment, to make sure we're on the same page.

For a line (i.e. in the array case), you can always describe the current status of the array relative to the intervals (in ascending or descending order) of the initial array. For instance, the current array can have: the values from positions [3..5] of the initial array in ascending order, followed by the values from the positions [7..10] of the initial array in descending order, etc.

Initially, the status of the array is just one interval (the interval [1..N] in ascending order). Every reverse operation (u,v) adds at most two more such intervals: we split some existing intervals so that the positions u and v become interval endpoints, and then simply reverse the order of the intervals between u and v (in time proportional to their number).

If you apply this algorithm, it will be quite efficient in the beginning, but it will become less and less efficient as the number of intervals grows. To compensate for this, once there are too many intervals (e.g. O(sqrt(N))), we rebuild the whole array in linear time. What this means is that we recompute the values for each position of the array and declare this the new initial array, thus reducing the number of intervals back to 1.

Overall, this approach clearly takes O(sqrt(N)) amortized time per operation.
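For the array case, the whole scheme can be sketched as follows (illustrative code, not the solution from the contest; the rebuild threshold is a tunable constant):

```python
import math

class SqrtReverser:
    # Current array = list of (l, r, dir) slices of a base array:
    # dir=+1 means base[l..r] ascending, dir=-1 means base[r..l] (reversed).
    # Rebuild the base array once the slice count exceeds ~sqrt(N).
    def __init__(self, a):
        self.base = list(a)
        self.segs = [(0, len(self.base) - 1, +1)]

    def _split(self, pos):
        # Make `pos` a segment boundary; return the index of the segment
        # starting at `pos` (0 <= pos <= N).
        cur = 0
        for i, (l, r, d) in enumerate(self.segs):
            if cur == pos:
                return i
            length = r - l + 1
            if cur + length > pos:
                off = pos - cur
                if d == +1:
                    left, right = (l, l + off - 1, d), (l + off, r, d)
                else:
                    left, right = (r - off + 1, r, d), (l, r - off, d)
                self.segs[i:i + 1] = [left, right]
                return i + 1
            cur += length
        return len(self.segs)

    def reverse(self, u, v):
        # Reverse current positions u..v (inclusive, 0-indexed).
        i = self._split(u)
        j = self._split(v + 1)
        self.segs[i:j] = [(l, r, -d) for (l, r, d) in reversed(self.segs[i:j])]
        if len(self.segs) > 2 * math.isqrt(len(self.base)) + 2:
            self.base = self.tolist()  # linear-time rebuild
            self.segs = [(0, len(self.base) - 1, +1)]

    def tolist(self):
        out = []
        for l, r, d in self.segs:
            piece = self.base[l:r + 1]
            out.extend(piece if d == +1 else piece[::-1])
        return out
```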

So what I did was to apply this approach to the paths of a heavy light decomposition. Every path is described as a sequence of intervals (in ascending or descending order) of positions of paths from the initial tree, e.g. path 1 consists of: the positions [3..5] of path 2, in ascending order, followed by the positions [7..10] of path 3 in descending order, followed by the positions [2..4] of path 1 in descending order, etc.

Every operation introduces some splitting points, in order to be able to properly gather all the "original" intervals contained in an interval of positions of a path from the decomposition (and we know that the path between two vertices u and v consists of several such intervals of positions of at most O(log(N)) paths from the decomposition).

Then, for a reverse operation, the "original" intervals forming the path from u to v are reversed. This is not as simple as changing their order, though. When swapping and reversing two intervals from two different paths we need to first "trim" the longer interval to the length of the shorter one (i.e. split it in two) and then swap the equal length pieces and then continue. Thus, many more splitting points (i.e. intervals) are introduced this way (O(log(N)) additionally per operation).

Finally, one last piece of the algorithm is to rebuild the whole tree (i.e. reduce the number of intervals to 1 per path) when there are too many intervals overall.

This approach initially got TLE, and it took me a long time to optimize it enough. First, I introduced a "re-merging" operation, to re-merge consecutive intervals of the same path which can form a larger interval. Then, in my solution, I repeatedly needed to find the interval containing a given position P. I tried linear search, STL sets, a combination of them, but they were too slow. The final thing was to notice that in most cases I already had this information readily available, so I just used it.

I'm not sure what the final time complexity of my solution is (something between O(N*(log(N) + sqrt(N))) and O(N*log(N)*sqrt(N)), I guess).

+19

Oh, well, I wasn't there in the end, it seems :) My ideas for the challenge problem were not good enough and I spent way too much time on solving the problem "Upgrade" (because my solution uses the sqrt trick instead of splay trees, so it was timing out a lot until I eventually managed to optimize it enough).

I would be interested in finding out how the other contestants (especially the top ones) approached the challenge problem, I will also share my approach later, when I have more time to write it.

And to the author: The problem set was very nice! Congratulations. But I do have one comment about the challenge problem: Due to the way the score was computed (sum of scores of all the test cases), this made the score on test cases with smaller K (where the answer could be higher) much more important than the score on test cases with larger K (where it was more difficult to get similarly good answers). A scoring function which made every test have similar weights would have been better, in my opinion.

Also, did anyone get positive scores for tests 60 and 91 of the challenge problem?

On I_love_Tanya_Romanova's May Clash 2015, 11 years ago
+11

The contest started almost 7 hours ago, but it only has 5 standard algorithmic problems. What happened to the 6th task, the approximate/challenge one?

+8

I also liked Frames best :) (out of the standard algorithmic problems)

And I also liked the scoring for the approximation problem, which allowed each test case to have more or less equal weight — this is very good. But I have a question to the author of the approximation problem: what was the exact scoring function used ? :)

I didn't fully understand what the scoring function was, just that fewer moves are better :)

+33

For me the biggest challenge with the approximation problem was to get my solution Accepted on all the tests :) First of all, I should say that I think the constraints for the placement of the knights were not satisfied. The statement says that white knights are at positions (i,j) with j <= N/2 and black knights at positions (i,j) with j > N/2. I initially had a solution which tried to take slight advantage of this, but it kept failing. I added some asserts and noticed that the coordinates didn't match the condition from the problem statement. After that I switched to considering that the knights could be placed anywhere on the board (maybe even mixed among themselves) and I was able to get my first Accepted solution.

Now, about the idea: I repeatedly chose two knights (a white one and a black one) and swapped them.

I will discuss the swapping strategy first. I ran a BFS from the white knight and allowed it to pass only through empty cells or cells with other white knights (passing through a cell with another white knight meant that, actually, the other knight would continue the journey towards the destination). I also made sure that the path contained at least one empty cell (to be able to make some moves — this may not be very important except in some very special cases). Then I chose a neighbor of the black knight with the lowest BFS distance and made moves until a white knight reached that neighbor. Note that this means that now the original position of the white knight is empty. I then ran BFS from the black knight and moved it along the shortest path to the original position of the white knight (allowing it to pass through empty cells or cells occupied by other black knights). Finally, I completed the moves of the white knights — I moved one white knight to the initial position of the black knight (now empty), along with all the other knights whose movements were "stuck" because this white knight didn't go all the way in the first pass. Using a full BFS led to TLE on 6-7 test cases, so I used a heuristic: whenever a cell was "too far" from the target, I wouldn't add that cell to the BFS queue. I defined a cell as "too far" if the number of moves to the cell plus the Manhattan distance from the cell to the target was larger than the Manhattan distance from the source to the target plus some slack. This made things sufficiently fast to pass all tests.
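The pruned BFS can be sketched like this (knight moves; `passable` and `slack` are placeholders for the actual board state and tuning constant, which aren't specified in the comment):

```python
from collections import deque

KNIGHT = [(1, 2), (2, 1), (-1, 2), (-2, 1),
          (1, -2), (2, -1), (-1, -2), (-2, -1)]

def bfs_pruned(n, passable, src, dst, slack=4):
    # BFS over knight moves on an n x n board, pruning any cell for which
    # (moves so far + Manhattan distance to target) exceeds
    # Manhattan(src, dst) + slack, as described above. `passable(r, c)`
    # says whether a cell may be traversed. Returns None if dst is not
    # reachable within the pruning limit.
    def manh(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])
    limit = manh(src, dst) + slack
    dist = {src: 0}
    q = deque([src])
    while q:
        cur = q.popleft()
        if cur == dst:
            return dist[cur]
        for dr, dc in KNIGHT:
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < n and 0 <= nxt[1] < n
                    and nxt not in dist and passable(*nxt)
                    and dist[cur] + 1 + manh(nxt, dst) <= limit):
                dist[nxt] = dist[cur] + 1
                q.append(nxt)
    return None
```

Since the pruning bound mixes move counts with Manhattan distances, it can cut off legitimate detours; it is a heuristic trade-off, as noted above.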

Regarding the strategy for picking a pair of white and black knights to swap, I actually tried several, as time permitted. The first one simply iterated through the matrix in row-major order and picked the first white and the first black knights (with an extra condition that they shouldn't be "very close", since I was getting WA on a test case without this). If time permitted, I tried other strategies — iterating over the other combinations of row/column directions (decreasing rows with increasing columns, decreasing rows with decreasing columns, and so on), picking pairs of knights randomly, etc.

My score during the contest was 51150, but there, in the swapping strategy, when moving the white knight towards the black knight I picked some random neighbor of the black knight instead of the one with the shortest BFS distance. I was planning to change this at some point, but in the end I forgot about it :) I remembered about it some time after the contest, made the change to pick the neighbor with the smallest BFS distance and resubmitted for practice — the new score was 50298, which is about 1.6% better than my winning score during the contest. I'm glad this silly oversight didn't cost me the victory, since the runner-up was pretty close.

I'm certain much smarter approaches could have been tried on this problem — at the very least, some min-cost perfect matching algorithm could have been used for pairing white and black knights for swapping. If min-cost perfect matching were too slow, then perhaps something close to it but faster could have been used. I would have liked to try something like this, but didn't have enough time: I was visiting some relatives during the first 6 hours of the contest and then, somewhere between the 9h and 10h mark, I received a visit myself. So I didn't have much time to try very smart solutions for pairing knights and chose a greedy-based pairing instead. Nevertheless, I think what was important in this problem was to completely separate the pairing strategy from the swapping strategy. This way, after getting an initial solution accepted, I could update them independently (e.g. trying multiple pairing strategies within the time limit).

Finally, I would like to mention again that the reason I participate in these Hackerearth monthly "clashes" is that they have an approximation problem — I really like challenge/approximation problems :) This month's problem was harder even to get Accepted on than last month's, but I still enjoyed it quite a lot (even if it was a bit stressful, with my first Accepted solution coming, if I remember well, less than 1h before the end of the contest).

0

Thanks for the explanation. Of course it's true that O(N*K) is also O(N*K^2) from a theoretical point of view :) But I actually thought that the better bound of O(N*K) didn't hold. I didn't realize that simply bounding the for loops to the sizes of the corresponding subtrees can have such a big impact.

If K were larger, I would simply have stored the maximum value KMAX(x) for which a node x has a non-zero count (of course, KMAX(x) is upper bounded by K) and then only iterated up to it. Apparently doing this simple and obvious optimization is actually equivalent to iterating up to the sizes of the subtrees (since KMAX(x) cannot be larger than the size of node x's subtree). I probably did this optimization in several problems in the past without realizing that it might actually improve the theoretical time complexity, not only the running time in practice.

Another optimization which is obvious and very simple (and which I used in my solution) is: if the element in the outer for loop has a value of zero, do not enter the inner for loop. This basically ensures that the outer for loop iterates up to min(K, size of the subtree) (plus a few extra increments of the outer loop counter which do not matter much overall). Although the inner loop would still iterate up to K even when the size of the corresponding subtree is less than K, this actually improves the running time a lot. I submitted my solution both with and without this optimization for practice and the maximum running times are 0.25 sec (with) -vs- 1.25 sec (without). I wonder if the theoretical time complexity is better than O(N*K^2) in this case, too.
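To illustrate why bounding the loops by subtree sizes helps, here is a generic tree-knapsack sketch (counting connected subgraphs of each size containing a node — a stand-in for the actual contest DP, whose statement isn't quoted here). The merge loops run only up to the sizes accumulated so far, which is the detail that turns the naive O(N*K^2) into O(N*K).

```python
def tree_knapsack(children, root, K):
    """dp[v][k]: number of connected subgraphs of size k in v's subtree
    that contain v. Merge loops are bounded by the subtree sizes seen so
    far (encoded in the list lengths), giving O(N*K) total merge work."""
    n = len(children)
    dp = [None] * n
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])
    for v in reversed(order):        # children are processed before v
        cur = [0, 1]                 # just v itself: one subgraph of size 1
        for c in children[v]:
            child = dp[c]            # length is min(size of c's subtree, K) + 1
            hi = min(len(cur) - 1 + len(child) - 1, K)
            new = [0] * (hi + 1)
            for i in range(1, len(cur)):
                if cur[i] == 0:      # the skip-zero optimization from above
                    continue
                for j in range(min(len(child) - 1, K - i) + 1):
                    # j == 0: take nothing from c's subtree (one way)
                    new[i + j] += cur[i] * (child[j] if j else 1)
            cur = new
        dp[v] = cur
    return dp
```

The standard argument: each merge pairs nodes of the already-processed part with nodes of the new child's subtree, and every pair of nodes is "paired" at most once over the whole tree (at their LCA), capped at K per side.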

+12

Sorry, but your solution from the editorial is O(N*K^2): for each (node, child) pair (there are N-1 such pairs) you have two nested loops: for i, for j, each loop running over O(K) values. Why do you say that it is O(N*K)?

I also looked at all the accepted solutions from the contest and all of them use essentially the same idea, which is O(N*K^2).

+8

Thanks for pointing out how to view submissions [ it turned out to be just a matter of scrolling down in some part of the page :) ]

You're right about the tests and not underestimating the importance of smaller tests. Your solution did score better than mine on most of the large test cases (and also on many of the smaller ones), but overall my solution managed to get a slightly higher total score.

0

Before reading your comment I didn't even think that a complexity better than O(N*K^2) was possible. The limits seemed to suggest that a better time complexity isn't needed. I now looked at the author's solution and it's really cool how he computes the total number of required sets containing a fixed node (the current centroid) in O(csize*K) time (where csize is the size of the subtree of the current centroid). I wonder then why he didn't choose larger limits for K.

As for other approaches for solving the problem better than O(N*K^2), it is at least theoretically possible to solve it in O(N*K*log(K)) using FFT. If a nicer number were used instead of 10^9+7, then FFT could have been a feasible solution. But under these conditions I think one would need to do FFT on real numbers, which could even exceed 64 bits, so it would be kind of painful (plus possibly too slow in practice).

0

The scoring function I suggested is pretty simple. Basically instead of considering the score S, it would be S / sum of all positive numbers in the matrix. But if the platform requires that the score is printed by the submission itself and is not computed by a separate judge program, then it could simply ask the submissions to also output this ratio.

I could see your submission in the list, but I couldn't find out how to see the code/its results. If viewing it is really possible, then the interface is not friendly enough to make it easy/obvious.

+8

I suspected that the problems weren't very difficult by design (I guess that if I had read some problems from the previous contests with the same format I could have come to this conclusion myself). I think that's fine, particularly given that there is also an approximation problem in the problem set.

I am also surprised that so few people submitted anything for the challenge/approximation problem. Actually, do we know how many people actually participated in the contest (i.e. made at least 1 submission)? The standings seem to also include people who simply registered but made no submission.

+22

I_love_Tanya_Romanova already described some approaches for the challenge problem, which led to a very high score of 29224718. My score was slightly higher: 29315801.

My solution consists of multiple greedy algorithms plus some randomization, all being run while time permits.

The following greedy algorithms I ran only once:

1) repeatedly choose the matrix which covers the largest sum (as long as it's positive) [my first submission consisted of just this and has a score of 96.17 of the final top score]

2) repeatedly choose the matrix which has the highest sum/area ratio (as long as it's positive) [my 2nd submission contained just 1+2 and has a score of 96.44 of the final top score].

3) skyline greedy: sort the matrices in decreasing order of area and try to place them from top to bottom so that as few empty cells are left (essentially trying to cover as much area as possible) — the good thing is that I mostly reused the code from a recent Codechef challenge problem for this part :) (CHPUZZLE from FEB'15 long contest)

Then came the most important part of the algorithm, in my opinion. I sorted the matrices in descending order of area and then, while time permits:

1) run a greedy which tries to place the next matrix in the first position it can, considering increasing rows — increasing columns order [my 3rd submission was just my 2nd submission + this greedy run only once (for the descending order of areas) : its score is 99.32 of the final top score and it would have been enough for winning the contest; if I had known, then maybe I would have participated in the Codechef cook-off instead of trying to optimize my solution further :) ]

2) 3 more times same greedy, in order to cover all combinations increasing/decreasing row x increasing/decreasing column

3) 10 times the same greedy, but for each piece the combination increasing/decreasing row x increasing/decreasing column was chosen randomly

4) Change the order of the sorted matrices slightly: I randomly chose 10 pairs of matrices and swapped them in the sorted order, but I'm pretty sure something smarter could have been done here (like swapping only "large" matrices between themselves, or generating swaps only among the first x matrices in sorted order, where x would steadily increase after every iteration).

5) If there is more time left, run steps 1-4 again.
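The first-fit placement greedy from the steps above can be sketched as follows; `greedy_pack` and `first_fit` are hypothetical names, and the sketch tracks only covered area, whereas the real solution of course scores the covered sums.

```python
import random

def first_fit(occ, h, w, row_order, col_order):
    """First free position (scanning in the given directions) where an
    h x w rectangle fits entirely on unoccupied cells, or None."""
    H, W = len(occ), len(occ[0])
    for r in row_order:
        if r + h > H:
            continue
        for c in col_order:
            if c + w > W:
                continue
            if all(not occ[r + i][c + j]
                   for i in range(h) for j in range(w)):
                return r, c
    return None

def greedy_pack(H, W, pieces, rng=None):
    """Place pieces (h, w), largest area first, each at the first free
    spot; scan directions are randomized per piece when rng is given."""
    occ = [[False] * W for _ in range(H)]
    covered = 0
    for h, w in sorted(pieces, key=lambda p: -p[0] * p[1]):
        rows, cols = list(range(H)), list(range(W))
        if rng is not None:
            if rng.random() < 0.5:
                rows.reverse()
            if rng.random() < 0.5:
                cols.reverse()
        pos = first_fit(occ, h, w, rows, cols)
        if pos is None:
            continue
        r, c = pos
        for i in range(h):
            for j in range(w):
                occ[r + i][c + j] = True
        covered += h * w
    return covered
```

Calling it as `greedy_pack(H, W, pieces, random.Random(seed))` in a loop, keeping the best result, mirrors the "while time permits" structure described above.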

So, as you can see, there was nothing fancy in my solution. In general I guess that there isn't time to try very complicated approaches in these 12 hour contests (compared to Codechef 10-day long contests or to Topcoder 1-2 weeks marathon matches, where there is time for testing more fancy approaches).

As for the groups of tests, I am pretty sure that the highest scores were obtained for those tests which had all a(i,j)>=100 and the sum of areas of all the K matrices was larger than M*N. For these tests it made sense to try to cover as much of the large matrix as possible with the given K matrices, essentially ending up with a rectangle packing problem. I think a better scoring function would have been a relative scoring function, like: the sum covered by your placement of matrices / the sum of all the positive elements in the matrix. This way, the absolute score for each test would have been between 0 and 1 and it would have ensured that each test had an almost equal weight in the total absolute score. With the current scoring function, for instance, there were tests where the absolute score was around 1000 and tests where the absolute score was around 7,000,000. It's obvious that the score for the test with 1000 is almost irrelevant to the overall final score.

+12

I liked the problems, but they didn't seem very difficult. This was good because it didn't require contestants to work all the 12 hours on the problems (anyone could just start whenever they wanted — even starting solving the problems in the second half of the contest still gave one the chance to get a top spot). For me this was even better, because I participated in the contest mainly because I like challenge/approximation problems :) So I didn't have to spend too much time on the other 4 problems. That being said, I am curious how to solve "Bits transformation". My solution is a min-cost perfect matching algorithm (which I tuned in order to pass the time limit for the larger 3 tests). But all the solutions I looked at after the contest seem much simpler than that :) (a DP of some sort)

Regarding the HackerEarth platform: overall it seemed good. But there was this annoying bug with displaying the scores for the approximation problems in the contest standings. Basically, the score shown in the standings was always the score from the time when you made the last submission. What this means is that if you got the score of 100 during the contest and then stopped making submissions, a score of 100 would still be shown in the standings even if someone else made a better submission in the meantime (the correct score was shown at the end of the contest, though). This is fine as long as you're aware of the bug, but since this was only my second contest on HackerEarth, I was wondering what was going on. Thanks to I_love_Tanya_Romanova for clarifying this during the contest.

ANUTHM

As a holiday gift, Tojo received a probability problem. The problem reads as follows. Consider an N by M grid. Rows are numbered 1 to N, from top to bottom. Columns are numbered 1 to M, from left to right. You are initially at cell (1, 1) and want to go to cell (N, M). From any cell you can move to the cell below it or to the cell to its right. You should never go out of the grid. At any point you should consider all the possibilities of movement with equal probability. Let P[i][j] be the probability of visiting cell (i, j). You need to calculate the sum of P[i][j] for 1 ≤ i ≤ N, 1 ≤ j ≤ M. As we all know, Tojo really hates probability related problems. He wants you to solve this task. Input: the first line of the input contains an integer T denoting the number of test cases; the description of T test cases follows. The only line of each test case has two integers N and M. Output: for each test case, output a single line containing the required answer. Answers within an absolute or relative error of 10^-6 will be accepted. Constraints: 1 ≤ T ≤ 1000, 1 ≤ N ≤ 1000, 1 ≤ M ≤ 1000. Example Input: 2 | 2 2 | 1 6

Output: 3.000000 | 6.000000. Explanation: Example case 1. The probability matrix P for N=2, M=2 is [[1.0, 0.5], [0.5, 1.0]]. You are at (1, 1) initially, so the probability of visiting (1, 1) is 1. At (1, 1) you have 2 options: move down to (2, 1) or right to (1, 2). The probability of going to (1, 2) is 0.5 and the probability of going to (2, 1) is 0.5. You always end up at (2, 2), so P[2][2] is 1. Required sum = 1.0 + 0.5 + 0.5 + 1.0 = 3.0. Example case 2. The probability matrix P for N=1, M=6 is [1.0, 1.0, 1.0, 1.0, 1.0, 1.0], because at any position there is only one possible next position.
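For what it's worth, the probability matrix from the explanation can be reproduced with a direct DP. By linearity of expectation the answer is simply N + M - 1, since every path from (1,1) to (N,M) visits exactly N + M - 1 cells; the DP below is a sketch that confirms this on the samples.

```python
def prob_sum(N, M):
    """Sum of visiting probabilities on an N x M grid walked from (1,1)
    to (N,M), moving down/right with equal probability when both fit.
    (By linearity of expectation this always equals N + M - 1.)"""
    P = [[0.0] * (M + 1) for _ in range(N + 1)]
    P[1][1] = 1.0
    for i in range(1, N + 1):
        for j in range(1, M + 1):
            if i == N and j == M:
                continue                    # the end cell has no moves
            if i < N and j < M:
                down = right = 0.5          # both moves available
            elif i < N:
                down, right = 1.0, 0.0      # last column: forced down
            else:
                down, right = 0.0, 1.0      # last row: forced right
            if i < N:
                P[i + 1][j] += P[i][j] * down
            if j < M:
                P[i][j + 1] += P[i][j] * right
    return sum(map(sum, P))
```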

On SebiSebiMinimum path cover in DAG, 11 years ago
0

This is correct for the case when you want to cover all the edges of the DAG with edge-disjoint paths. I thought that you were referring to the original requirement, i.e. cover all the nodes of the DAG, but with edge-disjoint paths. Anyway, this version of the problem is also solvable — my idea uses max-flow with lower capacities in the nodes of the DAG + binary search.

On adamantDynamic connectivity problem, 11 years ago
+5

Nice problem. I thought about a somewhat different solution for the offline problem, which seems a bit easier to me, from a conceptual point of view. Let's construct a segment tree over the k time moments. Then, each operation of type "edge i is available from time L to time R" is inserted in the O(log(k)) segment tree nodes which cover the interval [L,R]. This step takes O(k*log(k)) time and memory.

Then we traverse the segment tree using DFS starting from the root. During the traversal we maintain the connected components of the graph using disjoint sets. When we enter a segment tree node, we perform a Union operation for each edge stored in that node. We also push the successful Union operations onto a stack of operations so that, when we exit a segment tree node, we can undo all the Union operations performed when entering it. Before exiting a leaf of the segment tree which corresponds to a "?" operation, we have the answer as the number of sets in the disjoint-sets data structure (which we maintain as a separate counter, updated whenever a Union succeeds and whenever a Union is undone).

If we use only union by rank or union by size for the disjoint sets the overall time complexity is O(k*log^2(k)). If we also use path compression then the time complexity should be lower, but I don't know how to compute how low it would go (i.e. I'm not sure if it would become O(k*log(k)*alpha(k)) or something else, where alpha is the inverse Ackermann function).
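A compact sketch of the approach above — a rollback DSU (union by size, no path compression, so unions are undoable) plus a segment tree over the k time moments; the names are illustrative and the sketch reports the component count at every time moment.

```python
class RollbackDSU:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.components = n
        self.history = []

    def find(self, x):
        while self.parent[x] != x:   # no path compression: keeps unions undoable
            x = self.parent[x]
        return x

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            self.history.append(None)
            return
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]
        self.components -= 1
        self.history.append((a, b))

    def undo(self):
        op = self.history.pop()
        if op is not None:
            a, b = op
            self.parent[b] = b
            self.size[a] -= self.size[b]
            self.components += 1

def offline_connectivity(n, k, edges):
    """edges: list of (u, v, l, r) meaning the edge exists at times l..r.
    Returns the number of connected components at each time 0..k-1."""
    tree = [[] for _ in range(4 * k)]

    def insert(node, lo, hi, l, r, e):
        if r < lo or hi < l:
            return
        if l <= lo and hi <= r:
            tree[node].append(e)      # edge covers this whole time segment
            return
        mid = (lo + hi) // 2
        insert(2 * node, lo, mid, l, r, e)
        insert(2 * node + 1, mid + 1, hi, l, r, e)

    for u, v, l, r in edges:
        insert(1, 0, k - 1, l, r, (u, v))

    dsu = RollbackDSU(n)
    answer = [0] * k

    def dfs(node, lo, hi):
        for u, v in tree[node]:
            dsu.union(u, v)
        if lo == hi:
            answer[lo] = dsu.components
        else:
            mid = (lo + hi) // 2
            dfs(2 * node, lo, mid)
            dfs(2 * node + 1, mid + 1, hi)
        for _ in tree[node]:          # roll back before leaving the node
            dsu.undo()

    dfs(1, 0, k - 1)
    return answer
```

Each edge appears in O(log k) tree nodes and each appearance costs one union/undo at O(log n), matching the O(k*log^2(k)) bound discussed above.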

On RubanenkoDecember Cook-Off, 11 years ago
+13

The problems were indeed easier than usual — you can see that by the larger than usual number of contestants who solved all of them (and the speed with which they solved all the problems). But it was a good contest. I don't think that the fact that the problems were easier is a good reason to stop setting public contests in the future.

On RubanenkoDecember Cook-Off, 11 years ago
+10

My solution with persistent segment trees (and the same time complexity as yours) got accepted, so I don't think it was intended to get TLE.

On hogloidSomewhat suspicious coder, 12 years ago
+14

I don't know if they do it because they really think others are participating in teams, too, rather than individually, or if they just want to have an advantage over all the other competitors. Whatever the reasons, this is against Codeforces rules. I'm wondering if anything can be done about it (other than detecting suspicious activity). As xorfire mentioned above, I brought this issue up also on Codechef a few months ago — but it appears that on Codechef there is no explicit ban on team participation under a single handle, so it seems that they didn't necessarily break the rules there.

Indeed, there was only one previous edition, in 2011. As far as I know, the problems are available online only on a Romanian website, and in Romanian: http://infoarena.ro

A slightly messy way of getting access to the English statements is by downloading my personal archive, which I made available here some time ago: https://docs.google.com/file/d/0B0x9aMx8DPtVN2V0bHhrYjJ3LUU. I call it "messy" because the archive is pretty large — 1.5 GB, while the problem statements (and official solutions) take only a minor fraction of it. If you decide to go this route, you can find the data in a directory called "Romanian_Master_of_Mathematical_Sciences_2011".

I used the approach presented in this paper: http://www14.informatik.tu-muenchen.de/konferenzen/Jass08/courses/1/pieper/Pieper_Paper.pdf

But I didn't realize I could simply compute determinants modulo something, so I implemented my own class of fractions where the numerator and the denominator are big numbers (the worst parts were the need to do division and compute GCD for big numbers). I got TLEs with this approach and after several rounds of optimization, on the last day of the contest, I was able to pass all the tests except 4 of them (where I still got TLE, though probably not by much). The only benefit is that I now have these fractions implemented and ready for future contests :)
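For contrast, the modular-determinant route avoids big fractions entirely: Gaussian elimination over Z_p with Fermat inverses, assuming p is prime. A minimal sketch:

```python
def det_mod(matrix, p):
    """Determinant modulo a prime p via Gaussian elimination,
    using Fermat's little theorem for modular inverses."""
    a = [[x % p for x in row] for row in matrix]
    n = len(a)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col]), None)
        if pivot is None:
            return 0                        # singular modulo p
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # row swap flips the sign
        det = det * a[col][col] % p
        inv = pow(a[col][col], p - 2, p)    # inverse exists since p is prime
        for r in range(col + 1, n):
            f = a[r][col] * inv % p
            for c in range(col, n):
                a[r][c] = (a[r][c] - f * a[col][c]) % p
    return det % p
```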

+3

All the techniques you mentioned are applicable to optimization problems. But my personal feeling is that most Topcoder Marathon matches are of a different kind — e.g. machine learning (classifying something properly) or image processing (like the matches going on right now). Sure, all the matches have a scoring function and you can argue that you need to optimize your score (which makes every marathon match problem a "score optimization problem"), but that's not the same thing.

The TCO Marathon qualifying rounds were indeed optimization problems (according to my own definition of it) and there you could have tried some of the techniques you mentioned (simulated annealing, beam search, various greedy approaches, etc.). But it seems to me that apart from the 3 TCO Marathon qualifying rounds, there are very few optimization problems in Marathon matches (so the chance to practice the techniques you mentioned is lower).

In case you want to test your skills on NP-hard optimization problems more often you could try participating in Codechef long contests — there is 1 contest per month, which lasts for 10 days. To me this time frame is better than the 2 weeks long Topcoder Marathon matches, because I find it hard to focus for 2 weeks on a single problem (also due to lack of time) — and not doing that would only bring me low scores, which makes the participation pointless to me.

Yes, mine, too. And I used a limit twice as large as the one from the problem statement (but the test cases are even larger than that). I guess it's just impossible to have a smooth contest at Hackerrank :)

Savita and Friends: I used binary search on the max-distance. I first implemented the double version (passed all preliminary tests) and later, just in case, also implemented the integer version (since the answers were either integers or integers + 0.5), which also passed all preliminary tests.

The crazy helix: I maintained an array of intervals (each interval represents a sequence of positions from the "base" array). Each type 1 operation splits at most 2 intervals and then reverses the intervals in the array. When there are too many intervals in the array (about O(sqrt(N))), simply rebuild the "base" array (and you get again 1 single interval). The same array of intervals is used for answering type 2 and type 3 queries. The overall time complexity is O(sqrt(N)) per operation.
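The interval-array idea can be sketched as follows (only the reversal part; the type 2/3 queries would read positions through the same piece list). The rebuild threshold of about sqrt(N) pieces is the one mentioned above; class and method names are illustrative.

```python
import math

class IntervalArray:
    """Array with amortized-O(sqrt(N)) range reversal: the logical array
    is a list of pieces (start, length, direction) over a base array; a
    reversal splits at most two pieces and flips the middle ones, and the
    base is rebuilt once the piece count reaches about sqrt(N)."""
    def __init__(self, a):
        self.base = list(a)
        self.pieces = [(0, len(self.base), 1)]

    def _split(self, pos):
        """Ensure a piece boundary falls at logical index pos."""
        new, seen = [], 0
        for s, ln, d in self.pieces:
            if seen < pos < seen + ln:
                k = pos - seen
                if d == 1:
                    new += [(s, k, 1), (s + k, ln - k, 1)]
                else:   # a reversed piece starts at its highest base index
                    new += [(s + ln - k, k, -1), (s, ln - k, -1)]
            else:
                new.append((s, ln, d))
            seen += ln
        self.pieces = new

    def reverse(self, l, r):
        """Reverse logical positions [l, r)."""
        self._split(l)
        self._split(r)
        pre, mid, post, seen = [], [], [], 0
        for s, ln, d in self.pieces:
            if seen + ln <= l:
                pre.append((s, ln, d))
            elif seen >= r:
                post.append((s, ln, d))
            else:
                mid.append((s, ln, -d))       # flipped direction
            seen += ln
        self.pieces = pre + mid[::-1] + post
        if len(self.pieces) > math.isqrt(len(self.base)) + 1:
            self.base = self.tolist()         # rebuild: back to one piece
            self.pieces = [(0, len(self.base), 1)]

    def tolist(self):
        out = []
        for s, ln, d in self.pieces:
            seg = self.base[s:s + ln]
            out += seg if d == 1 else seg[::-1]
        return out
```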

On scott_wuAddepar Hackathon, 12 years ago
+14

I spent quite some time trying to come up with some data structures-based solutions for this problem, without success. Eventually, the obvious and very simple solution that you described occurred to me and I was pleasantly surprised that the problem had such a simple solution :) [ given that, on the surface, it seemed to require some kind of non-trivial data structures ]

On scott_wuAddepar Hackathon, 12 years ago
+11

No, I didn't use any properties of the covariance matrix. I started with a graph-based approach which only assigned -1 or +1 to the variables — I sorted the entries of the matrix in decreasing order of their absolute value. Then, for each value A(i,j), if A(i,j)>0 I tried to add i and j in the same component of the graph ; otherwise, if A(i,j) < 0 I tried to add them into different components. Since I only assigned -1 and +1, the graph was always bipartite and whenever I tried to add two vertices in the same component or in different components I only did it if the graph could be maintained bipartite after that. This got me good points on the last 3 test cases, but scored 0 on many of the other test cases.

For the second part I used an incremental improvement approach. As long as I still had time left I chose a variable randomly — say i. Then I selected the best new value to assign to variable i, under the condition that the values of all the other variables remain the same. This is actually a 2nd-degree function of that variable. To keep things simple, however, I didn't optimize it analytically. Instead I iterated through possible values between -1 and +1 with a step of 0.005 and chose the value which brought the largest improvement (if any).
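The coordinate-wise improvement step can be sketched like this; the names are hypothetical, and the plain quadratic form below stands in for the actual covariance-based objective of the contest problem.

```python
import random

def coordinate_ascent(A, x, iters, rng, step=0.005):
    """Repeatedly pick a random variable and reset it to the grid value
    in [-1, 1] maximizing sum_{i,j} A[i][j]*x[i]*x[j], all other
    variables held fixed (grid scan instead of solving the quadratic)."""
    n = len(x)
    steps = round(2 / step)
    grid = [-1 + 2 * t / steps for t in range(steps + 1)]
    for _ in range(iters):
        i = rng.randrange(n)
        # as a function of v = x[i], the objective is quad*v^2 + lin*v + const
        quad = A[i][i]
        lin = sum((A[i][j] + A[j][i]) * x[j] for j in range(n) if j != i)
        x[i] = max(grid, key=lambda v: quad * v * v + lin * v)
    return x
```

Scanning a fixed grid sidesteps the case analysis of the closed-form optimum (interior vertex vs. the endpoints -1 and +1) at a modest cost per step.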

On scott_wuAddepar Hackathon, 12 years ago
+16

The problems were very interesting. I also liked the combination of standard algorithmic problems (with a known optimal/correct answer) with "partial score" problems (Risk Management and Engulf). I have only encountered this combination in Codechef long contests before and I was happy to see it in the Addepar Hackathon, too.

I am wondering if the submissions made during the contest will become visible (this occurs for most other Hackerrank contests, but it seems that for the Addepar Hackathon this isn't the case, yet). I am particularly interested in reading the code of the top scoring solutions for the above mentioned "partial score" problems.

I will take this opportunity to also provide some feedback on the "Risk Management" scoring function. The fact that any value lower than 90% of the best value found by the problem setters got a score of 0 made it rather difficult to assess improvements — I made many submissions which got 0 points on some test cases and there was no way to know if I was close to the 90% boundary on those test cases or very far away. Maybe this was intended, but in case it wasn't, a scoring function which also provides some non-zero score for submissions obtaining less than the 90% limit would have been better for the contestants. Anyway, eventually I managed to obtain non-zero scores on all the test cases and from then on improvements were easier to assess.

By some lucky chance, the statement of one of the problems (FLAGS) loaded for me (after many retrials), so maybe the same happened to other people (possibly with other problems). So in case you're postponing the contest for some other day you might have to change the problemset, too.

On MDantasTopcoder SRM 624, 12 years ago
0

I guess your solution is something like:

- when a new node is painted blue, add it to a list (if we now have sqrt(Q) nodes in the list, recompute the values for the whole tree in O(N) time and clear the list);
- for a query: take the answer from the last tree "rebuilding", then traverse the list of nodes which were recently painted blue, compute the LCA to each of them and add the distance to the answer of the current query.

I implemented this approach but I was getting TLE on the example test cases 4 and 5 (and I only finalized it too close to the end of the coding phase to be able to optimize it in any way). After the coding phase, looking at tourist's solution (which is just what I described above), I realized that my TLE was coming from the recursive traversals of the tree: because of the way the parents were given, the nodes could be traversed simply in ascending/descending order (except for an initial DFS). After replacing the recursive traversals with simple "for" loops I got AC in the Practice room. Maybe the same thing happened to you? Or maybe you just used a value which was too large or too small for "sqrt(Q)"? I used 200 (and I saw in tourist's solution that he used 100).

I waited until now to write this because I was unable to submit my solution in the Practice room yesterday and also earlier today (whenever I tried to compile my solution, it said that the request timed out).

On vfleakingCodeforces Round #250, 12 years ago
+16

Same as your solution, but I used a division into sqrt(N) blocks instead of a segment tree (and for each block I maintained a multiset with the sorted elements, in order to easily update only the needed numbers at each modulo operation). I was very reluctant to implement this and I thought quite a long time about how I could "compose" two modulo operations, but after not finding any way of efficiently composing them, I implemented this just in case :) Still, it could have TLE'ed, because of the sqrt(N) factor (compared to only a log(N) factor in the segment tree solution), but it didn't.
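A simplified sketch of the block decomposition: instead of the multisets I kept, it stores only a per-block max and sum, but the skip-if-max-below-m test carries the same amortized argument, since a successful `a[i] %= m` (with `a[i] >= m`) at least halves the element, so each element is actually rewritten only O(log A) times.

```python
import math

class BlockArray:
    """sqrt-decomposed array supporting range-mod and range-sum.
    Blocks whose max is below m are skipped entirely on a modulo update;
    every non-skipped element at least halves, bounding the total work."""
    def __init__(self, a):
        self.a = list(a)
        self.b = max(1, math.isqrt(len(a)))          # block length
        nb = (len(a) + self.b - 1) // self.b
        self.sum = [0] * nb
        self.max = [0] * nb
        for i, v in enumerate(a):
            self.sum[i // self.b] += v
            self.max[i // self.b] = max(self.max[i // self.b], v)

    def _rebuild(self, blk):
        lo, hi = blk * self.b, min((blk + 1) * self.b, len(self.a))
        self.sum[blk] = sum(self.a[lo:hi])
        self.max[blk] = max(self.a[lo:hi])

    def range_mod(self, l, r, m):     # apply a[i] %= m for l <= i <= r
        for blk in range(l // self.b, r // self.b + 1):
            if self.max[blk] < m:
                continue              # nothing in this block changes
            lo = max(l, blk * self.b)
            hi = min(r, (blk + 1) * self.b - 1)
            for i in range(lo, hi + 1):
                if self.a[i] >= m:
                    self.a[i] %= m
            self._rebuild(blk)

    def range_sum(self, l, r):        # sum of a[l..r]
        total = 0
        for blk in range(l // self.b, r // self.b + 1):
            lo, hi = blk * self.b, (blk + 1) * self.b - 1
            if l <= lo and hi <= r:
                total += self.sum[blk]
            else:
                total += sum(self.a[i]
                             for i in range(max(l, lo), min(r, hi) + 1))
        return total
```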

0

The link to your submission is for problem Div 1 D, not E...

On huzecongCodeforces Round #248, 12 years ago
+13

Thank you for your answer. I thought about that myself, too. I think I will do just that in one of the future Div2-only contests. However, if I were a Div2 contestant, that would leave me with no way to practice hacking outside of competition.

On huzecongCodeforces Round #248, 12 years ago
+16

Hm.. yes, I actually did :) At first I didn't notice the 2 tabs and I selected the generator code where the upload of a test file was. Then I noticed the tab was called "manual tests" (or something similar) and I selected the other tab (for generated input), where I did what I said in my post. I wouldn't have imagined that whatever actions I did in the first tab would still be considered when I switched to the other tab. This is non-intuitive to me and only reinforces the idea that one first needs to be sufficiently familiar with the hacking interface in order to use it properly — which is bad, given that we seem to be able to access that interface only during contests.

Thank you for your answer.

On huzecongCodeforces Round #248, 12 years ago
+28

Towards the end of the contest I tried to hack sas4eka's solution for problem A (Div1) by trying to make it get TLE (it seems that for each pair of consecutive positions it iterated through all the occurrences of the first number in the pair, so a test case like N=M=100000 and 1 2 1 2 ...... 1 2 should make his solution get TLE). Since the test case was large, I wrote a generator and wanted to upload it (after selecting generated input in the hacking interface). I selected my generator's source file and then pressed "Hack". Instead, I got some error that either a test case or the generator must be specified (or something similar). I tried setting some generator parameters and pressed "Hack" again — nothing happened this time either. Then the contest ended and my hack was not submitted.

Perhaps I did something wrong, because I usually do not try to hack others' solutions during contests and I am not familiar with the hacking interface. My question then is: can I try to hack other people's solutions outside of a contest, just for practice? If not, then how can I become familiar with the hacking interface so that next time I know what to do? It seems that the interface is not as straightforward as I was expecting it to be, and I would rather not waste time from a real contest on this. Note that on TopCoder it is possible to challenge solutions even in the Practice room, so maybe something similar should also be possible on Codeforces? (In case this is already possible, please just ignore my question.)

Later edit: The solution I was trying to hack did get TLE in the end, so I guess my hack would have been successful... had I known how to properly submit it.

I used the same approach during the contest (I didn't get AC then because of a stupid bug, which I fixed immediately after the contest). When considering the l-th node (with value c(l)), I considered every partition obtainable for the first l-1 nodes. Let's assume this partition is (s(1), ..., s(k)). I generated all the possible ways of combining 2 or more elements of the partition such that their sum is c(l)-1, and from them I generated new partitions for the first l nodes (I used a hash to avoid generating the same partition twice). In order to generate these combinations quickly I used knapsack, with the knapsack DP matrix guiding the generation.

This solution was very fast (it had a running time of slightly more than 100 ms).
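The DP-guided generation trick can be sketched generically: build the subset-sum reachability table, then enumerate only the branches the table proves viable, so no time is wasted on dead ends. Names are illustrative, and the real solution would additionally restrict to combinations of at least 2 elements.

```python
def subsets_with_sum(values, target):
    """Enumerate index subsets of `values` (positive ints) summing to
    `target`, using the subset-sum DP table to expand only viable branches."""
    n = len(values)
    # reach[i][t]: can indices i..n-1 produce sum t?
    reach = [[False] * (target + 1) for _ in range(n + 1)]
    reach[n][0] = True
    for i in range(n - 1, -1, -1):
        for t in range(target + 1):
            reach[i][t] = reach[i + 1][t] or (
                t >= values[i] and reach[i + 1][t - values[i]])
    out = []

    def go(i, t, chosen):
        if t == 0:
            out.append(list(chosen))
            return
        if i == n:
            return
        if reach[i + 1][t]:                      # branch: skip index i
            go(i + 1, t, chosen)
        if t >= values[i] and reach[i + 1][t - values[i]]:
            chosen.append(i)                     # branch: take index i
            go(i + 1, t - values[i], chosen)
            chosen.pop()

    go(0, target, [])
    return out
```

Every recursive call sits on a path that provably extends to a full solution, so the running time is proportional to the table size plus the output size.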

On fchiricaCodeforces Round #245, 12 years ago
0

I only checked up to 100 numbers back + 100 closest positions (larger and smaller) in terms of prefix sums. I didn't know how fast the CF servers were, so I wanted to avoid TLE. Sadly, this got WA on test 39, with < 500 msec. When I increased the limit to 500 I got AC (with a running time of about 1.8 sec). Anyway, I don't think it's easy to create test cases to fail something like this.

I think you're right. I didn't fully analyze the time complexity. I mean, there are BSTs where, for one node, it is possible to spend O(log^2(N)) time with this approach. But summed up over all the nodes, I cannot find any example which takes more than O(N*log(N)).

On mukelAbout HackerRank rating system, 12 years ago
+5

I also don't understand how the rating system (global leaderboard) works on Hackerrank. Anyway, the idea of mixing together ratings for different types of contests (which require different types of skills) doesn't seem like a great approach to me (e.g. "long" contests, "short" contests, functional programming contests, "real data" contests, etc.). I participated mainly in the 20-20 hacks (i.e. "long" contests) and I usually got one of the top places, yet my rating increased by small amounts after the first few such contests. My rating increased more when I got one of the top places in Codesprint (another type of contest), which made me believe that if I want my rating to increase more then I should participate in more types of contests. For instance, I've never participated in 101 hacks (Hackerrank "short" contests), but my guess would be that, unless I do something really bad in my first such contest, my rating will increase more than by getting a top spot in a "long" contest (because for the 101 hacks I am currently "unrated"). I don't know if my guess is correct, but I hope that I will have time to test it soon :) My hypothesis could also be tested sooner in case they decide to rate the first weekly challenge contest (which I'm not sure will be the case).

My solution is almost identical to yours, except that I computed the sums of distances in reverse order (from the last added node to the first one). This way I got an O(N * log^2(N)) solution.

On cgy4everCodeforces Round #228, 12 years ago
+37

Hm.. there's also a TopCoder SRM earlier on Monday. I wonder if I will be able to participate in both TC and CF :)

Yep, this is exactly what I said (in way too many words :) )

I think that the easiest way to implement everything is copy-on-write: whenever you need to modify data in a node of the segment tree, first create a copy of it (copying both the node's data and the "pointers" to its children) and modify the copy instead. Moreover, if you created a copy of a node, you should also have created copies of all of its ancestors. This way each update operation creates O(log(N)) new nodes and produces a new version. Put more simply: during an update, every time you are about to visit a node, first create a copy of it and visit the copy instead.

I usually use this approach when I need to implement persistent segment trees. If you know in advance an upper bound on the total number of nodes you may have, then you can allocate all of them from the very beginning (e.g. in a large array of nodes).
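The path-copying idea described above can be sketched as follows. This is a minimal illustrative implementation, not any particular contestant's code: the node layout (a flat array of `[sum, left, right]` triples), the point-update/range-sum choice, and all names are my own assumptions. The array-of-nodes storage mirrors the preallocation remark, since the array could be sized up front if an upper bound on the node count is known.

```python
# Sketch of a persistent segment tree via path copying (copy-on-write).
# Each update clones only the O(log N) nodes on the root-to-leaf path;
# all other nodes are shared between versions.

class PersistentSegTree:
    def __init__(self, values):
        self.n = len(values)
        # Nodes live in one growing list; each node is [sum, left, right].
        # With a known upper bound on node count, this could be preallocated.
        self.nodes = []
        self.roots = [self._build(0, self.n - 1, values)]  # version 0

    def _new(self, s, l, r):
        self.nodes.append([s, l, r])
        return len(self.nodes) - 1

    def _build(self, lo, hi, values):
        if lo == hi:
            return self._new(values[lo], -1, -1)
        mid = (lo + hi) // 2
        l = self._build(lo, mid, values)
        r = self._build(mid + 1, hi, values)
        return self._new(self.nodes[l][0] + self.nodes[r][0], l, r)

    def update(self, version, pos, value):
        """Set position `pos` to `value`; returns the new version id."""
        self.roots.append(self._upd(self.roots[version], 0, self.n - 1, pos, value))
        return len(self.roots) - 1

    def _upd(self, node, lo, hi, pos, value):
        # Copy-on-write: never mutate `node`; build a fresh copy instead.
        if lo == hi:
            return self._new(value, -1, -1)
        _, l, r = self.nodes[node]
        mid = (lo + hi) // 2
        if pos <= mid:
            l = self._upd(l, lo, mid, pos, value)
        else:
            r = self._upd(r, mid + 1, hi, pos, value)
        return self._new(self.nodes[l][0] + self.nodes[r][0], l, r)

    def query(self, version, ql, qr):
        """Sum over [ql, qr] in the given version."""
        return self._qry(self.roots[version], 0, self.n - 1, ql, qr)

    def _qry(self, node, lo, hi, ql, qr):
        if qr < lo or hi < ql:
            return 0
        if ql <= lo and hi <= qr:
            return self.nodes[node][0]
        mid = (lo + hi) // 2
        _, l, r = self.nodes[node]
        return self._qry(l, lo, mid, ql, qr) + self._qry(r, mid + 1, hi, ql, qr)
```

Querying an old version after an update still returns the old answer, since the old root and all its untouched descendants are never modified.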

On BekzhanKassenovTopcoder SRM 606, 12 years ago
+19

Looks like "insanity" got the best of both of you and you participated despite having exams the next day :)

On OmelianenkoTopcoder SRM 605, 12 years ago
+9

I have been trying to start the Topcoder Arena from 5 minutes before the start of SRM 605 until now (7 minutes after the start of the SRM), without success. The Topcoder website is also very, very slow and I cannot use its link for entering the arena (because sometimes that part of the website doesn't show up). However, all the other websites work just fine for me. Is anyone else having such problems?

A very good editorial, indeed! (but please don't forget about Div1E) And the problems were really nice. But I think problem B was tougher than it should have been for its amount of points. I've read some comments saying that problem B had too weak pretests (because so many solutions failed the final system tests). I would have to say that my experience with problem B was the exact opposite. I coded two wrong solutions (which failed some pretest or another) before eventually finding and implementing the right solution. So I'd say that problem B had good pretests :)

Ideally, I should have a class or set of functions and data types/structures so that I wouldn't have to implement maxflow (or min-cost maxflow) algorithms from scratch every time. However, I have been too lazy so far, so I always end up implementing them from scratch in every contest (when I realize that they are needed, of course).

Except that I'm not entirely sure it is correct :) I will implement it when I have time and write another comment then.

I thought about the greedy algorithm after reading the editorial (for the case when we assume that all the orders have penalty 1). Can't we simply solve this in O(N*log(N)) time? While traversing the ranges, we add to the current range items from the "active" orders with the smallest deadlines, until either the range is full or there are no more "active" orders. If this works, then a simple heap of "active" orders keyed by deadline would be enough.
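The heap-based greedy above might look like the sketch below. The input format (each order as a `(release, deadline, qty)` triple of unit items, with `cap` items fitting per time slot) is my own illustrative assumption, and for clarity the loop walks slot by slot; compressing runs of identical slots into ranges is what would give the claimed O(N*log(N)) bound.

```python
import heapq

def max_scheduled(orders, cap, horizon):
    """Greedy for the all-penalties-equal case: at each time slot, serve
    the active orders with the smallest deadline first (min-heap).
    orders: list of (release, deadline, qty); an item placed in slot t
    must satisfy t < deadline. Returns the number of items scheduled."""
    orders = sorted(orders)            # by release time
    heap = []                          # (deadline, remaining qty)
    done = 0
    i = 0
    for t in range(horizon):           # traverse time slots left to right
        while i < len(orders) and orders[i][0] <= t:
            _, d, q = orders[i]
            heapq.heappush(heap, (d, q))
            i += 1
        room = cap
        # Fill the current slot from active orders, smallest deadline first.
        while room > 0 and heap:
            d, q = heapq.heappop(heap)
            if d <= t:                 # deadline already passed; drop order
                continue
            take = min(room, q)
            room -= take
            done += take
            if q > take:
                heapq.heappush(heap, (d, q - take))
    return done
```

Each order is pushed and popped a bounded number of times, so with range compression the whole thing stays within the heap's O(log N) per operation.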

Yes, the time limit seemed very tight to me, too. The idea I used during the contest (which I still had to optimize further to get AC) was to consider the orders in decreasing order of penalty (i.e. largest penalty first). For each order I ran a max-flow algorithm to schedule as many items of that order as possible within its time interval, without dropping items of orders with higher penalties. Although I started with a max-cost flow algorithm (maximizing the cost of the non-penalized items), I ended up with just a simple sequence of max-flow computations. However, even this was too slow by itself, and I had to optimize the implementation to avoid TLE. By the way, in case it's not obvious, the bipartite graph on which the max-flows are computed has the N orders on one side and the elementary intervals of time moments on the other side (there can be up to 2*N-1 such intervals).
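The sequence-of-max-flows construction can be sketched with a plain BFS-augmenting-path (Edmonds-Karp-style) flow, far from contest-speed but showing the structure. The input format (each order as `(start, end, qty, penalty)`, `cap` items per unit of time), the adjacency-matrix representation, and all names are illustrative assumptions on my part; augmenting in the residual graph is what lets later orders reroute earlier flow without reducing the totals already committed to higher-penalty orders.

```python
from collections import deque

def schedule(orders, cap):
    """orders: list of (start, end, qty, penalty). Returns, per order,
    how many of its unit items get scheduled, processing orders in
    decreasing order of penalty and augmenting flow for one order at
    a time in a shared residual graph."""
    # Elementary intervals cut out by all order endpoints (up to 2N-1).
    points = sorted({t for s, e, _, _ in orders for t in (s, e)})
    ivals = list(zip(points, points[1:]))
    n, m = len(orders), len(ivals)
    SRC, SNK = n + m, n + m + 1
    V = n + m + 2
    capm = [[0] * V for _ in range(V)]
    for j, (a, b) in enumerate(ivals):
        capm[n + j][SNK] = cap * (b - a)          # interval capacity
    for i, (s, e, q, _) in enumerate(orders):
        for j, (a, b) in enumerate(ivals):
            if s <= a and b <= e:
                capm[i][n + j] = q                # order may use interval

    scheduled = [0] * n
    for i in sorted(range(n), key=lambda k: -orders[k][3]):  # big penalty first
        capm[SRC][i] = orders[i][2]
        while True:
            # BFS for an augmenting path SRC -> order i -> ... -> SNK.
            prev = [-1] * V
            prev[SRC] = SRC
            dq = deque([SRC])
            while dq and prev[SNK] == -1:
                u = dq.popleft()
                for v in range(V):
                    if prev[v] == -1 and capm[u][v] > 0:
                        prev[v] = u
                        dq.append(v)
            if prev[SNK] == -1:
                break
            # Bottleneck along the path, then push it.
            f, v = float('inf'), SNK
            while v != SRC:
                f = min(f, capm[prev[v]][v])
                v = prev[v]
            v = SNK
            while v != SRC:
                u = prev[v]
                capm[u][v] -= f
                capm[v][u] += f
                v = u
        scheduled[i] = orders[i][2] - capm[SRC][i]
        capm[SRC][i] = 0   # freeze this order's total for later rounds
    return scheduled
```

Freezing the source edge after each order fixes that order's scheduled total, while residual edges from intervals back to already-processed orders still allow their items to be rerouted to other intervals when that frees up capacity for the current order.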