### chromate00's blog

By chromate00, 4 months ago,

Two days ago, Codeforces Round 938 (Div. 3) suffered from severe issues with hacks, because problem G (1955G - GCD on a grid) lacked proper validation of the sum of $nm$ in hacks, even though the statement specified it as at most $2 \cdot 10^5$. I won't ask why that happened; that would not be a constructive discussion. Instead, I will discuss something very interesting about the task.

During that incident, I kept wondering: there has to be a way to solve this without the constraint on $n \cdot m$, right? Of course, $7\times 10^8$ bytes of input is impossible anyway, but ignoring that, $10^8$ is not a very "dreaded" number in this age of optimizations. There has to be a way to do this.

Then the idea came to me.

Before we cover the solution, let us review a few basic facts from number theory — knowing them might not be strictly necessary to understand the solution, but it will be helpful. Every integer can be seen as a point on a grid of infinitely many dimensions, where each dimension represents a prime factor. If we restrict the domain to divisors of some integer $x$, the grid has only $O(\log x)$ dimensions, because an integer has $O(\log x)$ distinct prime factors. $\gcd$ and $\text{lcm}$ become much more tractable on this grid, because they are simply the coordinate-wise $\min$ and $\max$. The same goes for divisibility: $a$ is divisible by $b$ iff each exponent of $a$ is no less than the corresponding exponent of $b$.

Now to the solution. The grid of divisors of $a_{1,1}$ has $O(\log a)$ dimensions and $d(a)$ points, so using the same idea by which one flattens a multidimensional array into one dimension, we can map each divisor (point) to a corresponding index in one array. So, let us use a bitset over divisors, so that each cell in the DP table can comfortably store the status of every divisor.

Let us build a bitmask $mask_d$ for each divisor $d$, whose set bits are exactly the (indices of the) divisors of $d$. Let $mult_p$ be the multiplier of prime $p$ used while flattening the multidimensional grid (from the facts above, one can see this is essentially the product of $\text{exponent}+1$ over all smaller primes). Then $mask_1=\texttt{0000...0001}$, and $mask_d=mask_{(d/p)}\mathbin{|}(mask_{(d/p)} \ll mult_p)$ for any prime $p$ dividing $d$. By preprocessing a sieve we have all such values of $p$, so this can be computed nicely as well.

Now we assume WLOG that all values in $a$ are divisors of $a_{1,1}$ (if one isn't, we can replace it with its GCD with $a_{1,1}$). Let $b_{i,j}$ be the $mask$ corresponding to the value of $a_{i,j}$. Then the DP transition becomes as follows —

$dp_{i,j}=(dp_{i-1,j}\mathbin{|}dp_{i,j-1})\mathbin{\&}b_{i,j}$

And of course, the base condition is $dp_{1,1}=b_{1,1}$.

After finding $dp_{n,m}$, we can see that if $mask_d$ for some $d$ is completely contained in $dp_{n,m}$, then there exists a path whose GCD is divisible by $d$. So we try that for each $d$, and take the maximum $d$ where the divisibility condition holds.

The time complexity analysis is simple. Enumerating the divisors of $a_{1,1}$ takes $\mathcal{O}(\sqrt{a})$ time, and processing the $mask$-s takes $\mathcal{O}(\frac{d(a)^2}{w})$ time, so we spend $\mathcal{O}(\sqrt{a}+\frac{d(a)^2}{w})$ time per test case. Then there are $\mathcal{O}(nm)$ transitions in the DP, each taking $\mathcal{O}(\frac{d(a)}{w})$ time, so the DP takes $\mathcal{O}(nm\frac{d(a)}{w})$ time. Also, since we take a $\gcd$ for each cell, that adds $\mathcal{O}(nm\log(a))$. Finally, testing divisibility for each $d$ takes $\mathcal{O}(\frac{d(a)^2}{w})$ again, but that is already counted in the per-test-case cost, so we are fine. The final time complexity is $\mathcal{O}(t(\sqrt{a}+\frac{d(a)^2}{w})+\sum{nm}({\log(a)+\frac{d(a)}{w}}))$. Because $\frac{d(a)}{w}$ is such a small constant (precisely $4$), it should scale well for much larger values of $nm$, and could possibly run even with no constraint on the sum of $nm$, i.e. $\sum{nm}=10^8$ in the worst case.

255932465 is the accepted submission, and the benchmark is as follows. In all cases $a_{i,j}=720\,720$ was used for every cell, because that is the worst case for $d(a)$ (though the $\gcd$ cost might be worse for other values). Only $n$ and $m$ were read for each test case, to minimize IO effects. All benchmark results are from custom invocations. The result was as follows.

| Case | Runtime |
| --- | --- |
| $t=100,n=100,m=100$ | $46\text{ms}$ |
| $t=1000,n=100,m=100$ | $217\text{ms}$ |
| $t=10000,n=100,m=100$ | $1358\text{ms}$ |
| $t=100,n=300,m=300$ | $171\text{ms}$ |
| $t=1,n=1000,m=1000$ | $92\text{ms}$ |
| (parameters lost: markup failed to render) | $187\text{ms}$ |
| $t=1,n=3000,m=3000$ | $359\text{ms}$ $^\dagger$ |
| $t=1,n=4000,m=4000$ | MLE $^\dagger$ |

$^\dagger$ The stack overflowed, so I had to move the array dp to global scope (making it static).

• +15

By chromate00, 4 months ago,

During testing of Codeforces Round 934, I found a very cursed (albeit unproven) solution to 1944B - Equal XOR and I thought it would be worth a separate blog, so here it is.

Before I explain the solution, I must give you a quick disclaimer: it is much harder than the intended solution and very likely useless. If you would still appreciate understanding it despite that, please do read further.

First, let us state an assumption that forms the very basis of the solution. I will not prove it, but you will see that it is likely true.

• Let $X$ be a uniformly random subsequence of $a$ of size $k$. Then $X_1 \oplus X_2 \oplus \cdots \oplus X_k$ is almost uniformly distributed across all possible values.

If this assumption is true, then we can get to a solution with $\mathcal{O}(n \sqrt{n})$ expected time complexity and $\mathcal{O}(n \sqrt{n}/w)$ expected space complexity.

Let us sample one random subsequence of length $2k$ from $a_1,a_2,\cdots,a_n$, and one from $a_{n+1},a_{n+2},\cdots,a_{2n}$. I claim that the probability that the XORs of these two subsequences coincide is $\Omega(1/n)$, under our assumption above. Then, if we sample $x$ subsequences from each side, what is the probability that at least one pair coincides?

Now, there are $x^2$ pairs among the $2x$ subsequences picked. Intuitively, this situation is very close to the birthday problem, where an expected $O(\sqrt{n})$ people suffice to find a collision. Though the pigeonhole principle does not apply here, the probability behaves very similarly: the probability of seeing no collision among $n$ pairs converges to the constant $e^{-1} \approx 0.367879$, and the probability of seeing none among $x^2$ pairs is essentially $e^{-x^2/n}$ when $x^2>n$. When $x=3\sqrt{n}$, this is already less than $0.02$ percent.

So we will get at least one collision w.h.p. with $\mathcal{O}(\sqrt{n})$ samples. Each sample takes $\mathcal{O}(n)$ time with a trivial process, so the expected time complexity is $\mathcal{O}(n\sqrt{n})$. The final issue is space: $\mathcal{O}(n\sqrt{n})$ can be tight in $256$ megabytes, but a bitset fixes this, reducing the space complexity to $\mathcal{O}(n\sqrt{n}/w)$. The solution is complete. The AC submission is here.

Now here is the catch: I did not prove the assumption underlying this argument. So I am asking: can anyone prove the assumption (and thus the solution), or disprove the solution (thus finding a hack)? Please let me know in the comments.

• +26

By chromate00, 7 months ago,

Thank you everyone for participating in The 3rd Chromate Cup Algorithm Division! The full problemset can be accessed at (link) for upsolving. Also, the profile badges/backgrounds are a bit delayed, as I am too busy ;-;

A. Strange Shuffle

Hint
Solution

B. Super Primes

Hint
Solution

C. Y

Hint
Solution

D. King of Data Structures

Hint
Solution

E. World Tour

Hint
Solution

F. Connected Dominating Set

Hint
Solution
Bonus

G. Hard Number Guessing Game

Hint 1
Hint 2
Solution

H. Sequence and Not Difficult Queries

Hint 1
Hint 2
Solution

I. Cactus Folding

Hint
Solution

Hint 1
Hint 2
Solution

K. Cactus Folding Plus

Hint
Solution
• +24

By chromate00, 7 months ago,

Hello, this is chromate00 (a.k.a. hjroh0315 on BOJ). The 3rd Chromate Cup Algorithm Division will be held soon! All tasks were developed by me, and all tasks will have statements in both Korean and English.

• Date: January 7th (Sun), 20:00 ~ 22:30 KST (2 hours, 30 mins)
• Tasks: 11 tasks. Each task has a given score, and the tasks are sorted in increasing order of score. The score is proportional to the expected difficulty as evaluated by the problemsetter and testers. Still, we suggest you read each task at least once, as the perceived difficulty may vary.
• Difficulty: Bronze ~ Ruby in solved.ac tier (*800 ~ *3000 expected in Codeforces rating)
• Penalty: Uses the same rule as AtCoder. Formally, the penalty is (time of the last AC) + (total number of incorrect attempts before AC) × 5 mins.
• Language bonus: Language bonus for TL/ML exists for specific languages. See (link; Korean text) for details.
• Standings Freeze: None.
• solved.ac Profile Badge & Background: These are given to participants who scored at least 250/1500 respectively. Please understand that the production/distribution may take two weeks or more.
• Specs: There is at least one interactive task. We suggest reading the guide (link) before participating in the contest.
• Do note that this contest is not held as solved.ac Arena.
• Score Distribution: $250-500-1000-1000-1250-1500-2000-2000-3000-4000-4000$.

This contest could be held thanks to the testers biximo dhyang24 eggx50000 naeby Stygian_Nymph utilforever, and also to Startlink for great services Baekjoon Online Judge and BOJ Stack.

The contest's Overview tab also contains the same information as above. If the Overview tab and the announcement differ, the Overview tab should be considered the more up-to-date.

A total of 13 people will receive a Mom's Touch Thigh Burger voucher. The winning probability is proportional to the square of the score. Please understand that only people who currently reside in Korea or can otherwise use the voucher are eligible for the raffle. If you cannot use the voucher, please inform us in the raffle announcement after the contest ends.

For people new to Baekjoon Online Judge: You may refer to https://help.solved.ac/en/arena/getting-started/link-account (read until the second to last section) for creating a new account and linking the account to solved.ac.

• +61

By chromate00, 10 months ago,

Ever since Ex was removed from AtCoder Beginner Contest, it feels like every contest has changed drastically compared to when Ex was in ABC. The difficulty gap between E, F, and G has become much, much wider. Previously, we usually had a choice of whether to solve F or G if we got stuck on one of them. Now we do not have that choice, as the tasks that used to be at Ex are now at G. I think this is a huge loss for people who could solve up to F, because they often lose an entire solvable task. There are many other issues, though I will not enumerate every single one of them.

I understand the current situation: the lack of hard tasks and everything. I know how hard it is to hold a contest with a problemset of 8 tasks every week. However, couldn't we have found a better solution to this hard situation? Instead of removing Ex from the problemset, a Call for Tasks could be held, similarly to how it is done for AGC. If I am not mistaken, the AGC Call for Tasks requires that a whole problemset be prepared (similarly to contest proposals on Codeforces). For ABC, since easier tasks are abundant and harder tasks are lacking, a Call for Tasks for individual tasks aimed at G/Ex could be held. This could effectively alleviate the hard situation we are currently in, and it would also help people who would be willing to set problems for ABCs but had no opportunity.

I hope you could consider this suggestion quite seriously.

Sincerely, chromate00.

• -33

By chromate00, 12 months ago,

#### Disclaimer: If it was not clear to you already, this article is not about an algorithm that is practical on computer hardware. It is about a "natural" algorithm, working based on physical/real-life principles. It may give you some insights on algorithmic thinking, though. Either way, it's interesting to know.

In this article, I will discuss an interesting data structure that supports all of the following operations in $O(1)$ time and $O(V+EW)$ space. The operations are:

• $link(u,v,w)$: Connect two vertices $u$ and $v$ with an edge of weight $w$.
• $cut(e)$: Cut the edge $e$.
• $mpath(u,v)$: Find a shortest path between two vertices $u$ and $v$, if one exists.
• $connected(u,v)$: Check if two vertices $u$ and $v$ are connected.

Here is how each operation works and is implemented.

#### "Base structure"

Each vertex is an object you can tie a string onto; a plastic ring works fine. Each edge is a flexible (but not elastic!) string of a certain length. A (practically infinitely) long thread of string and a pair of scissors are kept for future use. This is the "base structure" of the data structure.

#### $link(u,v,w)$

Grab your pair of scissors, cut out a string of length $w$, and physically connect vertices $u$ and $v$ with it. This is practically $O(1)$ in time, and contributes $O(W)$ to the space complexity.

#### $cut(e)$

Grab your pair of scissors, and cut the string corresponding to the edge $e$. This is $O(1)$ in time complexity, and garbage collection can work nicely as well. (Just throw away the string into the bin.)

#### $connected(u,v)$, $mpath(u,v)$

These two operations are basically one operation in reality. This operation can be interpreted as the following linear program.

$\begin{gather*} \max X_v \\ s.t.\\ X_u = 0\\ |X_{u_e} - X_{v_e}| \le w_e \, \forall \, e \in E \end{gather*}$

Now, if you're accustomed to linear programming, you might recognize this as the LP formulation of shortest paths on an undirected graph. What we did for each $link$ and $cut$ operation translates directly to the constraints of this linear program: because we connected $u_e$ and $v_e$ physically with a string of length $w_e$, the two vertices cannot be farther apart than $w_e$. The only step left is to fix $X_u=0$ and maximize $X_v$.

Maximizing $X_v$ is not hard: just hold $u$ in place and pull $v$ in some direction until you cannot pull any further. Now that the linear program has been solved physically, we have found the result in $O(1)$ time. If you can pull indefinitely, the linear program is unbounded, which means the two vertices are disconnected. If you cannot pull farther than some distance, that distance is the shortest-path length, and the taut strings between $u$ and $v$ form a shortest path.

#### Conclusion

With physical principles, we just found out that time complexities that seem impossible can in fact be possible in natural algorithms. How fascinating it is to know what physics can provide us in algorithms! I hope this article could give you some insights on algorithmic thinking, or at least keep you interested throughout the article.

• +144

By chromate00, 19 months ago,

This is a follow-up to this blog, a self-editorial for the three practice tasks from Medium-Hard to Hard. Since solving each task after proving convexity is quite easy (you can directly use the techniques from the blog), I will only prove why each task is a convex optimization task.

Hint 1
Hint 2
Solution

Hint 1
Hint 2
Solution

#### Asia Regional Contest 2007 in Tokyo — "Most Distant Point from the Sea"

Hint 1
Hint 2
Solution
• +7

By chromate00, 19 months ago,

Before we begin: This blog is my attempt at the Codeforces Month of Blog Posts. Simple rule: write an interesting Codeforces blog post until February 15th and win $300.

In this blog, I will be describing the concept of Convex Optimization, and techniques that can be used to solve Convex Optimization tasks. The blog is divided into two parts — "What is Convex Optimization" and "How do we solve Convex Optimization tasks". This part of the blog covers "How do we solve Convex Optimization tasks". (Part 1 is here)

### There are just too many ways to solve them

One thing is for sure — there are too many ways to solve convex optimization tasks. By "too many", I mean at least a dozen. We can't cover all of them here! So, in this blog I will cover the methods that can be used in a majority of convex optimization tasks: Gradient Descent (including the Subgradient method), Coordinate Descent, and Ternary Search.

### Gradient Descent

Gradient Descent is a technique where we repeatedly move closer to the optimal point. To do this, we find the gradient of the objective function at the current point and move based on it: the gradient gives a direction in which a more optimal point exists. There are two major flaws in the usual Gradient Descent, though. One is that if we start at a point where the gradient is too small, convergence will be very slow. There are ways to fix this (Backtracking Line Search is an example), but we will not cover them in technical detail here; just remember that they work by tweaking the movements. The greater issue is that our objective function may not have a gradient at all. Objective functions are not always differentiable, and if they aren't, we cannot use traditional Gradient Descent.
Luckily, we can still use a similar method, called the Subgradient method. Every convex function has a "subgradient" at every point. If a gradient exists, the gradient is a subgradient; if it doesn't, any gradient-like vector whose plane/line underestimates the function at every point is a subgradient. We can use this subgradient just like the gradient in Gradient Descent. The only thing to remember is that we may not converge at all with a fixed step size, so we must reduce the step size at every iteration. This method is very convenient whenever we have an easy way to determine a subgradient.

### Coordinate Descent

Coordinate Descent is probably the simplest of the methods used for Convex Optimization. Not many tasks allow the usage of Coordinate Descent, but when they do, it makes solving the task much easier. Basically, we move along the axes of the Euclidean plane. We define a variable "delta" initially, and in each step we look in the four directions (+x, -x, +y, -y) and check whether moving by delta in any of them reduces the value; if so, we move in that direction. If delta is constant, we will clearly get stuck at some point in time (likely on the second step), so we decrease delta at every step. Coordinate Descent's flaw is that it can get stuck in situations where more than one direction gives the greatest reduction. Still, this flaw does not exist in one dimension, so there you can simply treat it as a "lazy man's Ternary Search".

### Ternary Search

At this point you may say, "Wait, isn't ternary search limited to one dimension?", and you would be right. But think about it: $\mathbb{R}^2$ is just $\mathbb{R}$ in two axes. Instead of searching just one dimension, we can use the result of an inner ternary search (the minimized/maximized value) as the function to minimize/maximize in an outer ternary search.
This is one very simple way to solve convex optimization tasks, and is guaranteed to work very often (if time complexity isn't an issue, that is!). Again, coordinate descent always works in one dimension, so you can replace one of the ternary searches with coordinate descent if you want to.

### Practice Tasks

Easy Difficulty

• Almost any ternary search task. Try to prove whether the objective function is convex before you solve it!

Medium Difficulty

• NCNA 2019 — "Weird Flecks, but OK": Usage of the Smallest Enclosing Circle task. (Kattis) (Baekjoon)
• Waterloo's local Programming Contests — "A Star not a Tree?": The Geometric Median task. (Baekjoon)

Medium-Hard Difficulty

• JAG Summer Camp 2019 — "All your base are belong to us": If $K=1$, this is the Smallest Enclosing Circle; if $K=N$, this is the Geometric Median. Is our objective function convex for an arbitrary $N$? Time to find out. (Baekjoon)
• 2013 Japan Domestic Contest — "Anchored Balloon": Designing the objective function may be tricky for this task. Bruteforcing is one solution, but using convex optimization techniques is very good practice. (Baekjoon)

Hard Difficulty

• Asia Regional Contest 2007 in Tokyo — "Most Distant Point from the Sea": Some readers may know this as a half-plane intersection task, and you're not wrong. Still, this task can be solved as a Convex Optimization task well enough within the TL! (Baekjoon)
• SWERC 2021-2022 — "Pandemic Restrictions": The intended solution was based on convex optimization. For a proof, please check the official editorial of the mirror contest. (CF Gym)

UPD: Solutions here!

• +26

By chromate00, 19 months ago,

Before we begin: This blog is my attempt at the Codeforces Month of Blog Posts. Simple rule: write an interesting Codeforces blog post until February 15th and win $300.

In this blog, I will be describing the concept of Convex Optimization, and techniques that can be used to solve Convex Optimization tasks. The blog will be divided into two parts — "What is Convex Optimization", and "How do we solve Convex Optimization tasks". This part of the blog is for "What is Convex Optimization". (Part 2 is here)

### What is Convex Optimization?

"A convex optimization problem is an optimization problem in which the objective function is a convex function and the feasible set is a convex set." as Wikipedia says. This may be hard to understand for many people, so let's start with the simpler case of a univariate function. In CP, a "convex function" is often called a "unimodal function", and tasks that use Ternary Search are very often Convex Optimization tasks. Let's understand why ternary search works on a univariate convex optimization task.

In the definition above, it is said that "the feasible set is a convex set" in convex optimization. The feasible set is the set of points where a solution can exist, and a convex set in one dimension is basically an interval. For Ternary Search we define $L$ and $R$, the limits between which the solution can exist. Here the feasible set is $[L,R]$, which is an interval and therefore a convex set.

Also, citing the definition of "unimodality" from Wikipedia: "a function f(x) is a unimodal function if for some value m, it is monotonically increasing for x ≤ m and monotonically decreasing for x ≥ m". The mirrored statement (monotonically decreasing, then increasing) holds exactly for univariate convex functions, so convex functions are unimodal in this sense, and we can solve univariate convex optimization tasks with ternary search.

Similarly to the one-dimensional case, convex functions and convex sets are well defined in multiple dimensions, so we can extend this idea to multiple dimensions as well. In this blog I will not explain things in standard form; instead I will explain everything verbally and with simple equations, because the standard form can be difficult for many people (I don't fully understand it either).

### Important aspects of Convex Functions

There are some important aspects of convex functions that can help in proving the convexity in other tasks. I will use some of these aspects to prove the convexity of some convex optimization tasks later on.

• The pointwise maximum of two or more convex functions is a convex function.

This can be understood intuitively: the region above a convex function (its epigraph) is a convex set, the intersection of convex sets is a convex set (just as the intersection of two convex polygons is a convex polygon, and the intersection of two intervals is an interval), and taking the pointwise maximum of two functions intersects their epigraphs. Do note, though, that the pointwise minimum of two convex functions is not necessarily convex.

• The sum of two or more convex functions is a convex function.

This can be proven mathematically, but I will not prove this here. There are basically tons of proof out there if you google it, so google it if you want proof on it.

• If a differentiable function has $f'(x)=0$ at some point, that point is a local maximum, a local minimum, or a saddle point; if $f$ is additionally convex, it is in fact a global minimum.

This is just basic math, but it serves as the basis of Newton's method in optimization. Not many convex optimization tasks have twice-differentiable objective functions (as Newton's method requires), but it is worth knowing that this exists.

• The distance to a fixed point is a convex function, and so is its square.

This is easy enough to understand, and this is the most basic convex function in geometry tasks requiring convex optimization.

• In a convex function, any local minimum is also a global minimum.

This is the exact essence of why convex optimization tasks can be solved more easily than other optimization tasks. I won't prove it formally, but thinking about why it works is good practice.

### Some famous convex optimization tasks, and their proof of convexity

Here are some famous convex optimization tasks as an example to understand convex optimization.

• Smallest Enclosing Circle

This is a convex optimization task asking for the smallest circle enclosing a set of points. (Sometimes the task is stated for a set of circles, but here we only discuss the case of a set of points.) One might think we need to determine both the center and the radius, but we can just determine the center, and then the radius is the distance to the farthest point. So, the objective function for a set of points $S$ is as follows.

$\displaystyle f(x,y)=\max _{i \in S} \sqrt{(x-x_i)^2+(y-y_i)^2}$

The distance to each point is a convex function, and their pointwise maximum is also a convex function. The feasible set is the set's convex hull, which is a convex set. Therefore, the Smallest Enclosing Circle is a convex optimization task. An $O(n)$ algorithm is known for this task (Megiddo's, or Welzl's expected-linear algorithm), but it can still be solved fast enough as a convex optimization task.

• Geometric Median

This is a convex optimization task asking for a point where the sum of distance to all points in a set is minimized. The objective function for this task is simple (as simple as this task's definition).

$\displaystyle f(x,y)=\sum _{i \in S} \sqrt{(x-x_i)^2+(y-y_i)^2}$

The distance to each point is a convex function, and their sum is also a convex function. The feasible set is, again, the set's convex hull. Therefore, the Geometric Median problem is a convex optimization task.

The article will continue with methods to solve convex optimization tasks in the next part. Stay tuned!

• +42

By chromate00, 19 months ago,

Let me state my opinions before explaining the actual algorithm — I honestly prefer Monotone Chain over Graham Scan. Its simplicity of implementation is the most important reason, though I have others, such as the ability to find the upper hull and the lower hull separately. If you are already accustomed to Rotating Calipers, you can keep doing it the way you are used to and you will get the same results; this algorithm is for people who find the Rotating Calipers concept hard to understand.

Just yesterday, I came up with a way to find the diameter of a polygon (the distance between the farthest pair of points) using two pointers and Monotone Chain. I knew I could already do it using Rotating Calipers, but I found that concept quite hard to understand and implement, so I came up with a method myself. This method may be equivalent to Rotating Calipers in its result (I would be happy if I could extend it to other tasks), so let me know if it is.

First, we use the Monotone Chain algorithm to find the upper hull and the lower hull separately. Note that we are not looking for the entire hull in one array; we want the upper hull and the lower hull separately. This can be done using the usual Monotone Chain method, but instead of sweeping left->right->left, we sweep twice from left to right, once popping on $CCW \le 0$ and once popping on $CCW \ge 0$.

Now we prove the following theorem.

Theorem: The two points of the farthest pair cannot both lie on the upper hull (nor, symmetrically, both on the lower hull), excluding the leftmost/rightmost points.

We prove this by the following idea: for every pair of points on the same side (upper/lower) of the hull, we can find a pair consisting of the leftmost or rightmost point and one point from the original pair whose distance is longer than the original. There are three cases based on the slope $a$ of the segment (assuming the upper hull): $a>0$, $a=0$, and $a<0$. For $a>0$, take the right point of the pair together with the leftmost point: the distances along both the x-axis and the y-axis are then larger than the original, resulting in a longer distance. For $a<0$ use the opposite, and for $a=0$ either side works. The same process proves the lower-hull case. The following is a visualization of the proof process.

Green: The cases. Purple: The counterexamples.

Now we reverse the upper hull and initialize $p_u$ and $p_l$ (standing for "upper pointer" and "lower pointer") both to $0$, so that initially $p_l$ points to the leftmost point and $p_u$ to the rightmost point. At each step we update the maximum distance found so far and decide which pointer to advance: if $p_u$ is at the leftmost point, advance $p_l$; if $p_l$ is at the rightmost point, advance $p_u$; otherwise compare $\text{dist}(p_u+1,p_l)$ and $\text{dist}(p_u,p_l+1)$ and advance the pointer giving the greater value. Here $\text{dist}(i,j)$ denotes the (preferably squared) distance between the $i$-th point of the upper hull (counting from the right) and the $j$-th point of the lower hull (counting from the left).

This algorithm runs in $O(H)$ time, $H$ being the number of vertices on the hull. It has been tested on tasks that ask for this answer, including the famous "Robert Hood". I plan to release the code soon, after I finish some refactoring (the code is currently a bit ugly). Again, this algorithm may simply be equivalent to Rotating Calipers, so if you are already accustomed to that, please keep using what you are comfortable with. And if you can extend this idea to other Rotating Calipers tasks, let me know! I would be very interested.

• +8

By chromate00, 19 months ago,

For the past few weeks, I have been practicing Ruby on multiple platforms (AtCoder and BOJ; I solved about ~150 tasks in total). Now I feel fluent enough with the language, so I will be using Ruby, along with Python 3 and C++, in today's round (the Div. 3). Afterwards, I would like to share my honest thoughts about using Ruby in CP. I am eager to see how well Ruby can perform on Codeforces today!

P.S. Yes, I am also writing this blog for the ignorant people who might again accuse me of cheating. Why can't they understand that one person can know more than 2~3 languages? (Yes, I could have used Ruby, Java, C++, Python, and JS all in the same contest. I simply don't, because there's no merit in that.)

UPD: The round has concluded. I solved 4 tasks in total, 3 with Ruby and 1 with Python, and then tried 2 more with C++. (I could not come up with the idea for E, nor with the edge cases for G. This is not to argue about the contest's difficulty; the tasks were good, I was just not good enough.)

My impression? It's quite amazing. I could do almost everything I can do with Python in Ruby, and the code was shorter most of the time. This gives me a great advantage: I can type less and think more. Task A and Ruby produced a very funny result, namely the following solution.

The funny solution

Ruby and Python do share many characteristics: short code, arbitrary-precision integers, etc. However, Ruby supports more out of the box, like fractions and complex numbers without any "imports". (Arbitrary-precision decimal types are supported too, but they live in a separate module of the standard library.) And Ruby supports even more through its standard library, exceeding what you may have imagined a language would ship as a built-in feature. Who would have imagined a language having Tarjan's SCC algorithm built in? Or multidimensional Newton's method? The language has performed much better than I initially imagined, and it keeps exceeding my expectations. If someone would like to learn a new language for CP (anything other than C++), I would strongly recommend Ruby.

Ruby does have its flaws, though. It has a very small call-stack size limit by default, which I am asking Codeforces to improve. The language is also slow, probably on par with Python. But even considering these flaws, Ruby seems very capable in the CP scene. (At least, it is better than Python IMO.)

TL;DR: It's a great language for CP, and I am very surprised.

• +12

By chromate00, 20 months ago,

Well, it's my first time posting this type of blog on Codeforces. Most of the time I can either find help elsewhere or find the answer myself, but this time it was different. I could not find any resource related to this problem, given that the discussion page for this Opencup contest is an empty Russian blog from 2015 that I do not want to necropost on, and I do not personally know anyone who has solved this problem. The problem is here on Baekjoon OJ, and I would like to share my approach so far before asking the question, so that we can improve on the current approach.

Current Approach
Code based on approach

Now here's the issue: I am still getting a WA verdict on the problem. This might mean that a tighter bound needs to be proven, but I am unable to improve on the current state of the approach. Can anyone help me with this problem? Is there anyone here who has solved it, personally and/or during the contest?

• -2

By chromate00, 20 months ago,

Before we get to the point, I kindly ask you to read the previous part of the blog if you haven't already. It contains a lot of the context we will be speaking of.

So, on the previous half of the blog, I explained the basics of named requirements, and promised to demonstrate implementing a working class based on the RandomNumberEngine requirement. It took some time (due to life & stuff), but here it is now. Here I explain the process of implementing Lehmer64, following the RandomNumberEngine requirement.

First, I carefully read the reference for the RandomNumberEngine requirement. Reading these requirements carefully before implementing can prevent many mistakes, so you may as well read them before reading the rest of the blog.

The concise description at the top of the Requirements section provides a very important piece of information that is not present in the table below it. It is as follows.

A type E satisfying UniformRandomBitGenerator will additionally satisfy RandomNumberEngine if...

This means that, for a type to meet the requirements for RandomNumberEngine, it must meet UniformRandomBitGenerator first. Therefore, I implemented the requirements for UniformRandomBitGenerator first. This adds 3 lines of code. (One requirement coincides with RandomNumberEngine.)

using result_type=uint64_t;
constexpr static uint64_t min(){return 0ull;}
constexpr static uint64_t max(){return -1ull;}

Now that we have these three members, we can implement the functions needed for RandomNumberEngine. First, I started off with the two constructors and the seed functions, seed() and seed(s). The former initializes the RNG with a default seed; the latter initializes it with an arbitrary (user-given) seed. I defined the default seed as the maximum value of an unsigned 64-bit integer. However, one issue was that Lehmer64 uses a 128-bit state. Therefore, I had to expand the seed to a 128-bit integer with splitmix64 and some bitmasks. Here are the members I added.

uint64_t sm64(uint64_t x,uint64_t n)
{
x+=n*0x9e3779b97f4a7c15ull;
x=(x^x>>30)*0xbf58476d1ce4e5b9ull;
x=(x^x>>27)*0x94d049bb133111ebull;
return x^x>>31;
}
const static uint64_t def=-1;
Lehmer64():state(state_t(sm64(def,1))<<64|sm64(def,2)){}
Lehmer64(uint64_t seed):state(state_t(sm64(seed,1))<<64|sm64(seed,2)){}
Lehmer64(const Lehmer64& a):state(a.state){}
void seed(){state=state_t(sm64(def,1))<<64|sm64(def,2);}
void seed(uint64_t seed){state=state_t(sm64(seed,1))<<64|sm64(seed,2);}

After this, we need to implement the seed(q) function and its corresponding constructor. The q in seed(q) is defined above as "q, an lvalue of some type satisfying SeedSequence". SeedSequence here is another named requirement. The only member function of SeedSequence we need to know about here, though, is generate(rb,re). In the reference for SeedSequence, there is a description of this member function.

Fills [rb,re) with 32-bit quantities depending on the initial supplied values and potential previous calls to generate. If rb == re, it does nothing.

So, this is simply a function filling a range with pseudo-random 32-bit unsigned integers. Knowing this, I made a union type of four 32-bit unsigned integers and one 128-bit unsigned integer. This is a lazy way to convert the four generated 32-bit integers into one 128-bit integer (in terms of raw bits). After that, I used that union type and wrote the functions.

union lz{uint32_t st[4];state_t stt;};
template<class Sseq>
Lehmer64(Sseq& q){lz k;q.generate(k.st,k.st+4);state=k.stt;}
template<class Sseq>
void seed(Sseq& q){lz k;q.generate(k.st,k.st+4);state=k.stt;}

Now we have already finished 7 of the 13 functions. For the rest, we can follow the detailed implementations of Lehmer64, or the descriptions of the functions. Finally, here are the other functions I added.

uint64_t operator()(){state*=mult;return state>>64;}
template<class T>
T pow(T a,uint64_t b){T z=1;do{if(b&1)z*=a;a*=a;}while(b/=2);return z;}
bool operator==(const Lehmer64& o){return state==o.state;}
bool operator!=(const Lehmer64& o){return state!=o.state;}

template<class os>
os& operator<<(os& ost,const Lehmer64& L){ost<<uint64_t(L.state>>64)<<" "<<uint64_t(L.state);return ost;}
template<class is>
is& operator>>(is& ist,Lehmer64& L){uint64_t a,b;ist>>a>>b;L.state=a;L.state<<=64;L.state|=b;return ist;}

(pow here exists just for binary exponentiation, which is needed for the discard function, as I did not want discard to be $O(n)$. Also note that the operator overloads for >> and << live outside the class.)

Here is the final code after merging everything into a working class.

Code

Of course, if you are not experienced in structured coding, it may take some time to write code based on complex requirements. Still, understanding named requirements is very important; you will need them at some point as you advance further in CP. I hope this helps in the process of understanding these named requirements. Please post your questions below if anything in the explanation (or in any named requirement!) needs clarifying.

• -1

By chromate00, 21 month(s) ago,

Today I encountered some very strange behaviour. A comment of mine had its vote status changed, but the post it was under had already been removed (or hidden, left as a draft) before the change. In this state, no one should be able to see the original post, or only the OP should, if it was just changed back to a draft. Therefore, the vote status cannot possibly change multiple times, as the only one who can possibly access the post is the OP. So I came to wonder: is the system account manipulating the vote status? Of course, there are unknown parameters that affect the contribution score, and I don't really mind that. I don't really mind my contribution dropping either. (Actually, I would even tolerate it dropping to somewhere near Sparky_Master_WCH1226's score.) Still, I think the contribution system, or at least the vote system, should be as transparent as possible, reflecting actual votes made by actual human beings.

Hence, the question. Is the vote system rigged?

• -50

By chromate00, 22 months ago,

Just today, I came up with this trick while solving a problem that (originally) required a merge-sort tree. While I knew that it could be solved by a merge-sort tree, I thought implementing one would be relatively tedious. That said, I do not mean that I dislike the data structure; I think its concepts are very clever. However, I do like algorithms whose implementation is elegant, and for this reason I prefer the fenwick tree over the segment tree for many problems. This was another case of me considering how a fenwick tree could replace the usual implementation. And just today the idea came to me, and it turned into a method which can almost replace merge-sort trees. (Note that some people may have found this earlier than I did, and I think most of them found it independently, like I did.)

So, as a disclaimer before explaining the method, I should tell you that this method is not really better (time-complexity-wise) than the merge-sort tree. However, in practice, many queries on a merge-sort tree require $O(\log^2 N)$ time per query anyway. So, unless $N \gg Q$ (meaning that $N$ is much larger than $Q$, a situation which does not happen very often in the CP scene), this method seems feasible enough.

This methodology consists of three steps: "add", "build", and "query". The "add" and "query" steps are modifications of the fenwick tree, and the only added step is "build", which is not complex at all. (You could even one-line it!) I will explain them in order.

### First Step: Add

This step consists of adding the element to the indices that need to store it. For example, if we need to add the element $2$ to index $1$, we add it to indices $[1,2,4,8,\cdots,2^{\lfloor \log N \rfloor -1}]$. This can be done by replacing the operation in the "add" step of the fenwick tree with a push_back operation (assuming bit is an array of vectors). Code is as follows.

void add(int i,ll x)
{
    while(i<MAXN)
    {
        bit[i].push_back(x);
        i+=i&-i;
    }
}

The time complexity of this step is $O(\log N)$ per addition, as there are at most $\log N$ indices to which we need to add the element; therefore it is $O(N \log N)$ in total.

### Second Step: Build

After the "add" step, each index of the fenwick tree contains the elements of the indices it needs to manage. However, the vectors in the fenwick tree are not sorted yet, and in this state we cannot get accurate answers to the queries. Therefore, we need to sort every vector in the fenwick tree. The code for this step is straightforward. Code is as follows.

void build()
{
    for(int i=0;i<MAXN;i++)sort(begin(bit[i]),end(bit[i]));
}

Now for the time complexity. The time complexity of this step is not so trivial, but we can prove an upper bound for it. First we need to prove this.

$a+b=N \Rightarrow O(a \log a) + O(b \log b) = O(N \log N)$

This can be proven through the following.

$a+b=N \Rightarrow \log a, \log b < \log N$
$O(a \log a) + O(b \log b) = O((a+b) \max (\log a, \log b)) = O(N \log N)$

Now given that we have at most $N \log N$ values in the entire fenwick tree, the time complexity will be at most:

$O(N \log N \log (N \log N)) = O(N \log N (\log N + \log \log N)) = O(N \log^2 N)$

Therefore this step's time complexity has an upper bound of $O(N \log^2 N)$. This bound is not very tight, and while I am convinced that a tighter upper bound can be proven, I was unable to do it myself. (Maybe I could prove $O(N \log N \log \log N)$?) Please help me find a tighter upper bound if you can.

UPD: If you sort the queries before you actually insert the elements, you can omit this step and get a more optimal time complexity, $O(N \log N)$, on building the fenwick tree. Credits go to darkkcyan for finding out this technique, you can read this blog post to learn more.

### Third Step: Query

Now that all indices of the fenwick tree are sorted, we can answer the queries. This step is based on the usual implementation of fenwick trees, and therefore finds the answer for a prefix $[0,r]$. We can then find the answer for many types of merge-sort-tree queries with $[l,r]=[0,r] - [0,l-1]$. Note that there are types of queries that cannot be decomposed like this, so this method does not completely replace the merge-sort tree.

Now for the example. Say, the query we want to answer is as follows.

$l \, r \, x$: Find the number of elements greater than $x$ in the interval $[l,r]$.

We can answer this query on a prefix $[0,r]$ by adding up the answers of the intervals making up the prefix. The time complexity for answering about a prefix is $O(\log^2 N)$ per query, as there are $O(\log N)$ intervals and we need to binary search in each of them. We can then answer any interval by subtracting the answer for $[0,l-1]$ from that of $[0,r]$. Code is as follows.

ll query(int i,ll x)
{
    ll ans=0;
    while(i)
    {
        ans+=end(bit[i])-upper_bound(begin(bit[i]),end(bit[i]),x);
        i-=i&-i;
    }
    return ans;
}

With this discovery, I have found that fenwick trees are much more powerful than I used to think. I think this usage of fenwick trees can replace merge-sort trees in many of their applications, and so I decided to give it a name: the Sort-Fenwick. The name mostly comes from the merge-sort tree, but since there is no "merge" going on at all, I omitted "merge" from the name. I really like this methodology, as it is much, much simpler than implementing a merge-sort tree. Suggestions and questions are always appreciated, and thanks for reading!

• +69

By chromate00, 22 months ago,

On today's contest, at 1737B - Ela's Fitness and the Luxury Number, a lot of contestants were hacked or FSTed. This is due to the inaccurate nature of floating-point types. I do think this happening on a B is not a good thing, but it happened anyway. So, as we experienced a failure this time (I myself did, multiple times, a long time ago), we need to prepare for the next time it happens.

My solution was using Python's isqrt function. It receives a non-negative integer $x$ and returns $\lfloor \sqrt{x} \rfloor$. It is guaranteed to return the exact value, so it is the perfect tool for the job. I read this blog as well, and his points are valid; I still thought telling people about the isqrt function would be a good addition for the community. Shoutout to -is-this-fft- for writing that blog.

As with this case, there will be many situations where there is a better tool for the job. It is a good idea to actively look for these tools and use them to your advantage. This is the exact reason I am writing this blog, and other blogs such as the STL in CP series. I hope you try to do the same when learning and practicing CP.

p.s. I think there should be other languages with builtins serving the same functionality, or ways to do it in the language you use. Please suggest them in the comments section below! It would be a great addition to the topic.

UPD: I just found that this wikipedia page exists, please take a look if you're interested in other methods to do this!

• +36

By chromate00, 22 months ago,

https://mirror.codeforces.com/blog/entry/10355 — I hope you read that blog before reading this one. Basically, the SGI STL implements the "rope" data structure, a tree-based structure that supports string operations in $O(\log N)$ time. You can read ahead if you are already familiar with the data structure.

Before we start, let me explain the context of how I came up with this "trick". The rope implementation in the SGI STL (from which GNU G++ borrows many extensions) provides many operations on strings. Arguably the most important of them are substr (splitting a substring off into another rope) and + (merging two ropes into one, effectively "concatenating" the strings). There are more functions too, but there is no function to reverse a substring.

Now back to my side of the context. I was thinking about a way to solve this problem.

Given a string and $Q$ queries of the following kind, output the resulting string after all $Q$ queries have finished.

l r: reverse the substring in the interval $[l,r]$.

(This problem appeared on the Croatian Programming Contest 2010, you can try to solve it here.)

Now some of you might know a clear solution already — to use a splay tree. While other BBSTs can work with this type of queries, the splay tree would be one of the most well known ones. However, I do not know how to implement splay trees, but I do know that the rope exists. After a few hours of thinking, I came up with this solution.

Let us manage a rope $R$ with the given string $S$ and the reversed string $S'$ concatenated. If we denote the original length of the string as $N$, the new length of the string in the rope would be $2N$.

For every closed interval $[l,r]$ representing a substring $s$, given that $1 \leq l \leq r \leq N$, we can also find an interval $[l',r']$ representing the reversed substring $s'$ in the same rope. As you may clearly see, this interval $[l',r']$ corresponds to $[2N+1-r,2N+1-l]$. Now we can split the rope into five intervals based on the two intervals we have found.

These five intervals I am speaking of are the two intervals we have (one from the query, one mirrored from that) and the three other intervals making up the rope. So the whole rope, represented by the five intervals, would be $[1,l)+[l,r]+(r,2N+1-r)+[2N+1-r,2N+1-l]+(2N+1-l,2N]$. Now we can swap the interval from the query with the mirrored interval. This new rope would be represented as $[1,l)+[2N+1-r,2N+1-l]+(r,2N+1-r)+[l,r]+(2N+1-l,2N]$, and would contain the new string and its mirrored one, concatenated. The time complexity for doing this operation would be $O(\log N)$, the same with the rope's time complexity.

Now for the output, we can save the result in a string and discard the latter half, as we do not need the reversed string now. The problem is solved.

The implementation of this solution is very simple: we can just use the functions provided by the rope implementation (the ones stated as "the most important" above). In my opinion, it is much simpler and easier to understand than implementing a splay tree. Last but not least, it also supports the other operations possible on a rope (you just mirror them on the reversed half as well). Thank you for reading, and as always, suggestions and questions are welcome.

• +25

By chromate00, history, 22 months ago,

(This was previously a blog reflecting my changes, but I just decided to use it only as a QnA due to some suggestions in the comment. You can see what was written in the revision history)

Hi this is chromate00, and ask me anything. Literally, anything. Criticism included.

• -71

By chromate00, 22 months ago,

Named requirements are a summary of what the STL expects of certain classes/types passed to functions as arguments. Understanding them is important for writing code that works well with the functions in the STL. There are many places you can read them, but I personally prefer cppreference. Let's take the following statement as an example.

Compare functions should return false when the two values are equal

This is explicitly stated on the named requirement named Compare, the parts that state this are as follows:

Given comp, an object of type T, For all a, comp(a,a)==false

From this we can see that an object comp of type T, when called as comp(a,a), should return false. Obviously the two as are equal, and we can expect that the STL functions may (or in this case, will) spit out unexpected errors if the requirement given in the statement is not satisfied.

The above was an example of named requirements, from a statement relatively well-known in CP. And in this example you can see that following the named requirements is very, very important.

Now we need to understand exactly how we should read the named requirements. There are many different named requirements, and not all of their descriptions look the same. Noticing the differences before digging in is helpful, so let me explain how these requirements differ.

• Some requirements have a very "short and concise" description.

The Predicate requirement is a good example of this. Let's see the description of the requirement, and try to understand it.

The Predicate requirements describe a callable that returns a value testable as a bool.

A good way to understand such descriptions is cutting them phrase by phrase. We can apply this method to this description like this.

"The Predicate requirements describe..."

• a callable means that this requirement includes the Callable requirement
• that returns a value means that the return type of the call is not void
• testable as a bool means that the returned type, either is bool, or can be contextually converted into a bool-type value. the types that fit this condition include int, long, and most importantly bool.

• Some requirements have an "organized table" in its description.

The UniformRandomBitGenerator requirement is a good example. In its description you can clearly see (with the table) what functions/expressions you need to implement for this requirement. The table provides information on what methods it requires, what return types they need to have, and extra requirements needed to fit the requirement.

(Red = the members you need to implement, Blue = their return types, Green = implementation details about the members. Most other descriptions have a table with a similar format as well.)

• Some requirements have a "dependency" in its description.

The named requirements for iterators show this "dependency" well. Basically when we say

The type T satisfies A if ... the type T satisfies B, ...

Then the named requirement "A" has the named requirement "B" as a prerequisite. Therefore to implement a type satisfying A, it would be convenient to implement methods for B first.

These three are the ways in which (at least as I read them) named requirements are described. It would be good practice to try these methods on other named requirements, or to come up with your own way of reading them. This was part 1 of "Understanding Named Requirements"; in part 2, I will demonstrate building an actual working class based on the RandomNumberEngine requirement as practice for understanding the descriptions. Stay tuned!

• +31

By chromate00, history, 22 months ago,

Let's begin this story with what happened two days ago.

Well, this happened. Too bad. Maybe I'll just go back to life, I presumed. Until I noticed the line -

You should not publish comic content, especially if it is not interesting to a wide part of the audience, repeats the existing one, or has no connection with competitive programming.

And oh, that was the line I had not known of. I thought some comedy was fine; after all, a big part of this community is built on comedy. But it wasn't. The thing is, while I sometimes feel like shitposting, I agree that we need a part of the community which concentrates solely on learning, teaching, and experimenting. I consider problem solving quite a major part of my life, and for that very reason I tend to spend a lot of time on Codeforces. After all, my vision is to become an algorithm researcher, like the people you all know, such as Dijkstra, Floyd, etc.

This was a lesson learned, and next time you see me in comments/blogs, you'll see me trying to be as helpful as possible. I expect the next blog to be a continuation of the STL series (I've actually been working on it in a draft), and after that I will explain some more useful things. See you next time in another (helpful) comment/blog.

• -65

By chromate00, history, 23 months ago,

This (quite sudden) post is about a data structure I came up with one day in bed, and it won't be a full series (I am still a student, which means I don't have time to invent data structures all day and all night). Still, thinking it over, I decided it would be a good idea to cover it in its own post. So here we are.

### Concept

The traditional bit trie uses one leaf node per number, and every leaf node has the same depth. But I thought about this: why? "Do we really have to use $32$ zeroes just to represent a single zero? Heck no. There has to be a better way." And this is where I came up with the idea.

Instead of the same old left-right children, let's use up to $l \leq depth$ children for one node, where $l$ is the length of the undetermined suffix when each node represents a prefix. For every prefix node, we connect it to the prefix nodes with exactly one more $1$ bit determined after the current prefix. For example, under $1\text{xxxx}_2$, there are $11\text{xxx}_2$, $101\text{xx}_2$, etc. Then, while the undetermined suffix (e.g. the $\text{xxx}$ in $1\text{xxx}_2$) is, of course, undetermined, we can assume it is all $0$ for the exact node the prefix sits on. This way, we can also use the prefix node $1\text{xxxx}_2$ for the number $10000_2$.

### The Important Detail

At this moment, you should be wondering: how do we distinguish the node for the number $10000_2$ from the prefix $1\text{xxxx}_2$? They use the same node, after all. My conclusion? You don't need to. You can just save the size (the number of elements) of each subtree. Now, let us denote the size of the subtree of prefix $S$ as $n(S)$; then $n(1\text{xxxx}_2) = n(11\text{xxx}_2) + n(101\text{xx}_2) + \ldots + n(10000_2)$ holds. So you can just traverse the child nodes one by one, and the remainder is the count of the number itself.

### Traversing the Bit-indexed Trie

Using the "important detail" stated above, traversing the Bit-indexed Trie boils down to interpreting it like a binary trie. We start at the root, which is $0$, and we can interpret this node as $0\text{xxxxx}_2$. This root node may (or may not) have $01\text{xxxx}_2$ as a child node. The important point here is to keep a variable for the size of the "virtual" subtree of the current node (we will denote it $c$). If the subtree size of the current node ($0\text{xxxxx}_2$) is $s$ and that of the right child node ($01\text{xxxx}_2$) is $s_1$, then the subtree size of the left child node, when interpreted as a binary trie, should be $s-s_1$. So if we want to traverse to the right child node, do so, and update $c$ to $s_1$. On the other hand, if we want to traverse to the left child node, stay on the current node, assume that we did move ($0\text{xxxxx}_2$ still shares the node with $00\text{xxxx}_2$), and update $c$ to $c-s_1$. After understanding this, almost everything works the same as in the usual trie.

### Interesting stuff about the Bit-indexed Trie

With the fact that a single number is represented by a single node, we can use a hash table to represent the whole trie. And what's so great about this method? Its simplicity. Let's assume the trie is represented by an unordered_map<int,int> (the key is the node, the value is the subtree size). Now inserting a value into the trie is as simple as this:

Insertion to the Hash-table BI-Trie

And that is it! Simple, isn't it? And here comes the reason I named the data structure the "Bit-indexed Trie". Many of you should have noticed the similarity of the name to the Bit-indexed Tree, also known as the Fenwick Tree. (I could not call it the Fenwick Trie; Peter Fenwick simply did not make it.) The Bit-indexed Trie even has a lot in common with the Bit-indexed Tree! Here are some examples:

• It can replace the traditional bit trie, similar to how the BIT replaces the Segment Tree in many situations.
• It reduces the memory usage from $2M$ ~ $4M$ to $M$, as it saves values in every node, not only the leaf nodes.
• They have very similar implementation in many parts, see the snippet above.

Also, as we saved the subtree sizes, accessing the subtree size information turns out to be almost $O(1)$ (assuming the nodes are saved in a data structure with $O(1)$ random access). Even if you don't make it $O(1)$, I believe the improvements will be quite significant, as it would be possible to optimize some processes with __builtin_clz and similar bitmask functions.

EDIT: errorgorn and Kaey told me that finding the amount of numbers with a certain prefix is not $O(1)$, and they are correct. It turns out to be $O(\text{number of trailing zeroes in the prefix})$.

### Summary

In this post, I covered the concepts and details of a data structure I have come up with, which makes it possible to reduce the memory usage of a bit trie to half of the usual trie's, or even further. I look forward to releasing a full C++ implementation of it soon, and I hope many people will notice the beauty of this data structure. Thank you for reading.

• +33

By chromate00, 23 months ago,

Alas, it's time for another day of STL in CP. Last time we covered the non-modifying sequence operations, and as you may have expected, we shall cover the modifying ones this time. The concept is simple: they modify a range or a variable (or several of them) that you hand to the functions. But the usage? There is something deeper once we think about the details of these functions. In this article, we will cover the trivial functions quickly, and then take some time to look at the ones with more interesting usages.

### copy, copy_if, copy_n, copy_backward

These are quite trivial; they do what their names suggest. copy copies, copy_n copies $n$ elements from the beginning, and copy_backward does the same thing as copy but copies the last element first. They take $O(n)$ assignments, so the cost is $O(n)$ assuming the type you're copying takes $O(1)$ to assign; otherwise, it's multiplied by that cost, obviously. copy_if is a bit more useful, though, as functions with filtering features are always useful somewhere.

### move, move_backward

These two do the same thing as the copy ones, except they move the elements, meaning they reuse the resources (e.g. internal buffers) of the original elements. Moving may take less time than copying, but it is only useful when you do not need the original range afterwards. Beware!

### fill, fill_n

If you can read, you should know what this function does: it simply "fills" a range. Time complexity is $O(cn)$, where $c$ is quite obviously the cost of one assignment, which is most likely a small constant unless you're filling a multidimensional container, say, a vector<string>.

### transform

This is when things get interesting. Basically, this function applies a function to every element of a range (or two ranges) and copies the results to another range (or writes them in place if you want). Why is this interesting? Because it works in any situation where you want to apply a function to a range or two, basically any function of the form $y = f(x)$ or $z = f(x,y)$. For this reason, it can even be used to implement Sparse Tables. The code is as follows:

Snippet
Code getting AC on 'Static RMQ' @ Library Checker

### generate, generate_n

This function is used when you want to fill a range with values generated by a function, functor, lambda, or basically any callable. This includes an RNG, meaning this function can be used to fill an array with random values. More interesting things can be done as well, such as filling the range with a certain integer sequence, for example the Fibonacci sequence. Just use something like [a=0,b=1]() mutable {int c=(a+b)%1000000007;a=b;b=c;return a;} as the callable, and this function will fill the range with the Fibonacci sequence. I think there should be more interesting and creative uses of this function; let me know in the comments if you know any.

### remove, remove_if, remove_copy, remove_copy_if

These functions are supposed to "remove" elements from a range. However, they cannot just "resize" the range, because they have no access to the underlying container. Therefore, remove is almost always paired with erase (a member function of the container). This is called the Erase-Remove Idiom. On C++20 and above, however, the Erase-Remove Idiom is unneeded in most cases: instead of it, you can use erase(container, value) and erase_if(container, function). That covers resizable containers, but the range passed to these functions does not have to come from one. For example, to use remove with a plain array, you can maintain a size variable and update it yourself when you run the function, like this: sz = remove(arr, arr + sz, x) - arr. The latter two functions do the same operation while copying the result to another range. They are usually unused in CP (we do not want to add another factor to our space complexity), but if we want to preserve the original range, they may be used instead.

### replace, replace_if, replace_copy, replace_copy_if

These do the same thing as the remove family, except that instead of removing elements from the range they replace them with another value. They do not resize the range, so they do not need to be paired with erase. Same opinion as above about the latter two: you may use them when you need to preserve the original range. I have still not seen anyone actually use the latter two in CP, though.

### swap, swap_ranges, iter_swap

Straightforward. swap does what its name suggests, swap_ranges swaps all elements of two ranges (like this short snippet: for(int i=0;i<n;i++)swap(a[i],b[i]);), and iter_swap is literally swap(*a,*b). Do I need to explain further? I don't think so.

### reverse, reverse_copy, rotate, rotate_copy, shift_left, shift_right

The former two reverse a range, the middle two rotate a range, and the last two (added in C++20) shift the elements of a range. Why did I list these three pairs of functions in the same paragraph? Because they can serve a common purpose in CP (especially on Codeforces): reducing the hassle of implementation. Usually D2A and D2B can be solved relatively quickly, but a fast mind may not be very helpful when the implementation is quite complex. This is where these functions come into action. Here is a practice problem in which one of these functions will help you; I suggest you try it if you haven't already. (Problem: 1711A - Perfect Permutation) These functions are useful in many problems with constructive algorithms in general, so it would be a good idea to use them when you can!

### shuffle, sample

Oh, shuffling and sampling, two important operations in randomized algorithms. The former shuffles the elements of a range uniformly with a given RNG. "Uniformly" here is important; it's the exact reason random_shuffle was deprecated and then removed in C++20! So make sure never to use random_shuffle. You can learn more about it in this blog. Now, the latter function should be used with care: it samples $\min(N,\text{size})$ distinct elements from a range. Sometimes, however, you might just want to pick $N$ elements allowing duplicates, or $\frac{N}{\text{size}}$ might be so small that getting a duplicate is quite unlikely, in which case it would be reasonable to just generate $N$ random indices into the range, right? The catch is that sample has a time complexity of $O(\text{size})$, not $O(N)$. Therefore, you need to be careful when using this function; it might cause unwanted TLEs.

### unique, unique_copy

These two functions (the latter copying the result to another range) remove adjacent duplicates from a range. As unique cannot resize the container by itself, it is usually paired with erase, similar to the Erase-Remove Idiom explained above. Removing only adjacent duplicates means that this function, alone, can't remove all duplicates from an arbitrary range; for that, the range needs to be sorted first. However, this does not mean the function is only useful when the range is sorted. There are situations where we need to check adjacent groups; one example would be GCJ 2022 Round 1C — Letter Blocks. In a part of this problem's solution, we need to check whether a string is "grouped" (i.e. each letter appearing in the string makes up one contiguous group). Of course, you can do this by counting in $O(n)$, but I felt that method was very ugly and complex in terms of implementation. I have come up with a slower but elegant way to do this, with an $O(n \log n)$ time complexity. Here is how.

How to do it — What I call the 'Double-unique method'

In this section we reviewed the modifying sequence operations and found situations where they can be used. In the next section of Chapter 1, Algorithms, we will review a wide variety of functions, such as sorting, merging, binary search and more. See you in the next section!

Back to chapter 0


By chromate00, 2 years ago,

Last time, we reviewed very briefly what the STL is. In this chapter of the series, we will be talking about the algorithms it provides, and their value in CP (speaking of "value", some functions may be almost absolutely useless in CP because another feature supersedes them). We are not going to talk about using ranges for now; they are bound for another article. In Section 1 of Chapter 1, we will be talking about non-modifying sequence operations. (I suggest that you read this article side by side with cppreference; I get a lot of ideas and knowledge there, and the sections are divided for convenience in reading side by side.)

### all_of, none_of, any_of

These three functions are quite straightforward, and have $O(n)$ time complexity (if the passed function has $O(1)$ time complexity). They may be useful in many situations, but I have not found a real usage example yet.

### for_each, for_each_n (C++17)

These two functions apply a given (unary) function to each element of a range. Their time complexity is, quite clearly, $O(n)$, assuming the applied function takes $O(1)$ time. However, the former has been useless in my case, as range-based for statements were sufficient for me. The latter, though, may be very useful, as it takes a count rather than an end iterator and is not replaced by the range-based for statement.

### count, count_if

These two functions count the elements of a range that, for the former, equal a value, and for the latter, satisfy a condition. Their time complexity is $O(n)$, assuming the predicate takes $O(1)$ time. Both are useful in my opinion, but the latter is much more important in CP. This is because CP asks for the number of values meeting a certain condition more frequently than for occurrences of one specific value.

### mismatch

This function iterates through two ranges and returns the first position where the values differ, or where the passed predicate (== by default) returns a falsy value. The time complexity is $O(n)$, assuming the predicate takes $O(1)$ time. Now, this function is very important, in the sense that we can compare two ranges and check whether every element in range A meets a condition relative to range B. We can even check whether a range is a palindrome, by passing iterators and reverse iterators to this function and checking that the scan ran all the way through without a mismatch. Do note that this function returns a pair of iterators, which in turn can give it an advantage over the equal function (which only returns a boolean value).

### find, find_if, find_if_not

I am not going to talk too much about these functions in this article, their uses/time complexity/etc are too trivial and well-known. However, do note that, for the same reason stated about count_if, the latter two have relative importance over the former one.

### find_end, search

In my opinion, find_end would have been better named search_end, as search does the same thing, except that find_end searches for the last occurrence while search searches for the first. By default these functions search naively, so they have an $O(nm)$ time complexity. Do use KMP when needed; naive searching (and Boyer-Moore in its worst case) is too slow for CP.

### find_first_of

This function finds, in a range, the first occurrence of any value from a certain set (represented by another range). For example, you can find the first occurrence of any value in {2, 4, 6, 8} in a vector. This function has an $O(Sn)$ time complexity, and therefore when looking for exact matches it can be replaced with tree-based or hash-based containers for $O(n \log S)$ or expected $O(n)$ time respectively. However, these container-based methods might not do very well when you need a custom function to compare the elements (trust me, traversing these containers is $O(n)$ but it is a pain due to cache issues), so in that case this function may be helpful.

### adjacent_find

This function compares adjacent elements and returns the first position where two neighbours are identical, or satisfy a condition when a predicate is passed. It has $O(n)$ time complexity, assuming the comparison function takes $O(1)$ time. I have found a use for this function in 1713B - Optimal Reduction; consider the following explanation as a solution for the problem.

Explanation for 1713B

As shown in the explanation above, adjacent_find has great strength in comparing adjacent elements in place, without building a difference array. For this reason, I consider this function very important in CP.

### search_n

This function finds the first point where $n$ consecutive elements in a row equal a given value (or satisfy a condition compared against it). This may be useful in problems where, say, you need to find $k$ consecutive positive numbers. Do remember, though, that the given predicate receives the range element as its left argument and the given value as its right argument.

In this article, we reviewed the Non-modifying sequence operation functions in the Algorithm category of STL. In the next section, we will be reviewing the Modifying sequence operation functions.

Back to chapter 0


By chromate00, 2 years ago,

I, you, and probably everyone using C++ for CP use the STL. It wouldn't even be a lie if I told you that std::sort has probably saved many people's lives. However, std::sort is not the entire STL. Even combined with the relatively wide spectrum of features we use daily, we can't say we're using the entirety of the STL. Therefore, I am writing this series of blogs to review many important STL features, and to give you some examples of using them in actual CP problems.

So, before we begin, what really is the STL? Is it a part of C++? Quite clearly, yes. Is it C++ itself? Almost, but no. The STL is nearly synonymous with the C++ Standard Library in everyday usage, but the Standard Library and the STL are two different and separate things. (There are even cases where you can use C++ but not the STL.) In this series I will try my best not to mislead you into thinking the Standard Library and the STL are the same thing.

What does the STL provide? It provides containers, iterators and algorithms. In this series, we will cover algorithms, containers and iterators in their corresponding posts, and after that, we will look at other notable features (which arguably may not be part of the STL, but are still quite useful and important). I shall update this article with a link to the next chapter as soon as it is out.