Blog entries - Codeforces

steveonalex's blog

Convex Hull Trick, but slightly cooler?

By steveonalex, 4 weeks ago, In English

I haven't seen any documentation or blog on this topic, so I guess it's Codeforces blog time.

Blog in HackMD if you prefer it

Prerequisite:

Convex Hull Trick.

1) The OG Convex Hull Trick:

This blog is not a tutorial on the Convex Hull Trick, so I will assume that you already know it before reading this. However, I'll briefly recap it for the sake of completeness.

Problem Statement: You are given $$$n$$$ lines of the form $$$y = a_i * x + b_i$$$ and $$$q$$$ arbitrary queries $$$x_j$$$. For each query, calculate the value $$$\max(a_1*x_j + b_1, a_2*x_j + b_2, ..., a_n*x_j + b_n)$$$.

Constraint: $$$n, q \leq 3*10^5$$$, $$$a_i, b_i, x_j \leq 10^9$$$.
Time limit: 1s

This is a well-known problem, and you can find tutorials for it basically everywhere. In short, we sort all the lines in ascending order of slope and remove all of the redundant lines, as shown in my beautiful painting below (a line is redundant if it never rises above the maximum of its two adjacent lines). After sorting, this is solvable in $$$O(n)$$$ using a stack and some geometry.

Figure 1: How a redundant line looks.

From this, we observe that the resulting function is convex (since the slopes are sorted). Each line is optimal on one contiguous segment, which starts and ends at its intersections with its two adjacent lines in the hull.

Figure 2: How the convex hull should look, alongside its intersections.

Once the convex hull is constructed, the problem is basically just binary searching over the intersections of the convex hull to find the optimal line for queried point $$$x_j$$$.

Sample Code

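#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
#define ALL(v) (v).begin(), (v).end()

// Lines are (slope, intercept) pairs and must be added in ascending sorted order.
struct CHT{ // casual CHT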
    #define Node pair<ll, ll>
    vector<Node> hull, suboptimal;
    vector<double> inter;
 
    double getInter(Node a, Node b){return (double) (b.second - a.second) / (a.first - b.first);}
    ll f(Node s, ll x){return s.first * x + s.second;}
 
    void add(Node x){
        if (hull.empty()) {hull.push_back(x); return;}
        if (hull.back().first == x.first){
            suboptimal.push_back(hull.back());
            hull.pop_back();
            if (inter.size()) inter.pop_back();
        }
        while(hull.size() >= 2){
            double x1 = getInter(hull.back(), x), x2 = inter.back();
            if (x1 >= x2) break;
            suboptimal.push_back(hull.back());
            hull.pop_back(); inter.pop_back();
        }
        if (hull.size()) inter.push_back(getInter(hull.back(), x));
        hull.push_back(x);
    }
 
    ll get_val(ll x){
        if (size() == 0) return -1e18;
        int idx = lower_bound(ALL(inter), x) - inter.begin();
        return f(hull[idx], x);
    }
 
    int size(){return hull.size();}
};

Time complexity: $$$O(n * log_2(n))$$$ for the preprocessing, and $$$O(log_2(n))$$$ for each query.

2) Extended CHT problem:

Problem Statement: You are given an integer $$$k$$$, $$$n$$$ lines of the form $$$y = a_i * x + b_i$$$, and $$$q$$$ arbitrary queries $$$x_j$$$. For each query $$$x_j$$$, let $$$c_i = a_i * x_j + b_i$$$ and find the $$$k$$$ largest values of the array $$$c$$$.

Constraint: $$$n, q \leq 3*10^5$$$, $$$k \leq 10$$$, $$$a_i, b_i, x_j \leq 10^9$$$.
Time limit: 5s.
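Before going further, it helps to have a brute-force reference to test any faster solution against. The following is only a sketch: the names `Line` and `topKBrute` are made up for illustration and are not part of the solution below. It evaluates every line at $$$x_j$$$ and keeps the $$$k$$$ largest values, which costs $$$O(n \log k)$$$ per query and is far too slow for the real constraints.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using ll = long long;
using Line = std::pair<ll, ll>; // (slope a_i, intercept b_i); hypothetical alias

// Brute force: evaluate every line at x and return the k largest values,
// sorted in descending order. Returns fewer than k values if there are
// fewer than k lines.
std::vector<ll> topKBrute(const std::vector<Line>& lines, ll x, int k) {
    assert(k >= 0);
    std::vector<ll> vals;
    vals.reserve(lines.size());
    for (const Line& l : lines) vals.push_back(l.first * x + l.second);
    std::size_t keep = std::min<std::size_t>(static_cast<std::size_t>(k), vals.size());
    // partial_sort places the `keep` largest values (under greater<>) in front.
    std::partial_sort(vals.begin(), vals.begin() + keep, vals.end(), std::greater<ll>());
    vals.resize(keep);
    return vals;
}
```

Comparing the output of this function with the fast solution on small random inputs is a cheap way to catch off-by-one mistakes in the hull logic.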

Since the lines in the convex hull are sorted by slope, we observe that the further a line is from the queried point, the less relevant it is.

But why is that? Let's consider two adjacent hull lines $$$(b)$$$ and $$$(c)$$$ to the left of $$$x_j$$$ ($$$(c)$$$ is further from $$$x_j$$$). These two lines intersect at some point to the left of $$$x_j$$$, and because the slope of $$$(b)$$$ is greater than the slope of $$$(c)$$$, the value of $$$(b)$$$ at $$$x_j$$$ ends up being greater.

Figure 3: Illustration of the lines $$$(b)$$$ and $$$(c)$$$ to the left of $$$x_j$$$, and how $$$(b)$$$ is more relevant than $$$(c)$$$.

Thus, we only need to focus on the $$$k$$$ nearest lines from $$$x_j$$$, both to the left and to the right.

Figure 4: How the algorithm may work.

However, there is a flaw in this approach. For example, for $$$k = 2$$$, what if a "redundant line" is actually the second largest line?

Figure 5: How the second largest line might not be on the convex hull.
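To make this concrete, here is a tiny example, assuming the usual integer redundancy test for three lines with strictly increasing slopes (the helper names `isRedundant` and `eval` are made up for this sketch, and the comparison is one common form of the test, not necessarily the one used in the sample code). Take the lines $$$y = -10x$$$, $$$y = -1$$$ and $$$y = 10x$$$. The middle line never rises above the other two, so it is thrown away, yet at $$$x = 1$$$ it is the second largest.

```cpp
#include <cassert>
#include <utility>

using ll = long long;
using Line = std::pair<ll, ll>; // (slope, intercept); hypothetical alias

// For slopes l1 < l2 < l3: returns true if l2 is never strictly above
// max(l1, l3). It compares the intersection of (l1, l3) with the intersection
// of (l1, l2) by cross-multiplication, so no floating point is needed.
// Assumes the products fit in 64 bits (true for |a|, |b| <= 1e9).
bool isRedundant(const Line& l1, const Line& l2, const Line& l3) {
    assert(l1.first < l2.first && l2.first < l3.first);
    return (l1.second - l3.second) * (l2.first - l1.first)
        <= (l1.second - l2.second) * (l3.first - l1.first);
}

// Value of line l at x.
ll eval(const Line& l, ll x) { return l.first * x + l.second; }
```

With `l1 = {-10, 0}`, `l2 = {0, -1}`, `l3 = {10, 0}`, the test reports `l2` as redundant, while the values at $$$x = 1$$$ are $$$-10$$$, $$$-1$$$ and $$$10$$$, so the discarded line is exactly the second best.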

There is an easy fix! We will keep track of all of the "redundant lines" from our first run of constructing the CHT data structure, and we will use these lines to make a second CHT. So for the previous example, it would look like this.

Figure 6: What the 2-layer CHT would look like.

Then we will do the same thing for the second CHT i.e. brute forcing through the $$$k$$$ nearest lines from the queried point, both to the left and to the right.

Extending to the general case is pretty simple. We can just make $$$k$$$ CHTs, with each one built from the redundant lines of the previous CHT. We know that this is sufficient, because on each layer we only need to go to the left and to the right at most $$$k$$$ times, and we only need to dive down at most $$$k$$$ layers (anything on the $$$(k+1)^{th}$$$ layer is not needed, since each of the previous $$$k$$$ layers already contains a line that is at least as good).

Sample Code


// The implementation is not optimized. I left it as it is for readability.
    
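#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
#define ALL(v) (v).begin(), (v).end()

// Lines are (slope, intercept) pairs and must be added in ascending sorted order.
struct CHT{ // casual CHT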
    #define Node pair<ll, ll>
    vector<Node> hull, suboptimal;
    vector<double> inter;
 
    double getInter(Node a, Node b){return (double) (b.second - a.second) / (a.first - b.first);}
    ll f(Node s, ll x){return s.first * x + s.second;}
 
    void add(Node x){
        if (hull.empty()) {hull.push_back(x); return;}
        if (hull.back().first == x.first){
            suboptimal.push_back(hull.back());
            hull.pop_back();
            if (inter.size()) inter.pop_back();
        }
        while(hull.size() >= 2){
            double x1 = getInter(hull.back(), x), x2 = inter.back();
            if (x1 >= x2) break;
            suboptimal.push_back(hull.back());
            hull.pop_back(); inter.pop_back();
        }
        if (hull.size()) inter.push_back(getInter(hull.back(), x));
        hull.push_back(x);
    }

    void get_val(ll x, int r, vector<ll> &ans){
        int idx = lower_bound(ALL(inter), x) - inter.begin();
        for(int i = idx - r; i <= idx + r; ++i) if (i >= 0 && i < size()){
            ans.push_back(f(hull[i], x));
        }  
    }

    int size(){return hull.size();}
};


struct CHT_extended{ // mlg super pro vip CHT
    #define Node pair<ll, ll>
    int k;
    vector<CHT> hull_max;

    CHT_extended(int k): k(k){
        for(int i = 0; i < k; ++i) hull_max.push_back(CHT());
    }

    void add(vector<Node> a){
        for(int i = 0; i < k; ++i){ // use the suboptimal lines from the previous CHT run
            sort(ALL(a));
            for(Node j: a) hull_max[i].add(j);
            a = hull_max[i].suboptimal; hull_max[i].suboptimal.clear();
        }
    }

    vector<ll> get(ll x){
        vector<ll> ans; ans.reserve(k * k);
        for(int i = 0; i < k; ++i){ // get O(k^2) lines
            hull_max[i].get_val(x, k - 1, ans);
        }
        if (ans.size() > k) 
            nth_element(ans.begin(), ans.begin() + k, ans.end(), greater<ll>());
        while(ans.size() > k) ans.pop_back();
        return ans; // spit back k best lines in arbitrary order. You can call sort if you want.
    }
};

Time complexity: $$$O(n * k * log_2(n))$$$ for the preprocessing, and $$$O(k^2 + k*log_2(n))$$$ for each query.

The complexity in both the preprocessing and querying could be further optimized, but I'll leave it as an exercise for readers.

Are there any problems that feature this algorithm? Well uhh... I don't know, this is like mythical stuff that you will probably never encounter in your life. But now you have :))


geometry, convex hull trick, data structure


My solution to 2003F — Turtle and Three Sequences

By steveonalex, 7 weeks ago, In English

I know this round is from like 2-3 months ago but screw that, let's dig it up because I just solved it recently. Also, I haven't seen anyone else doing the same thing as me.

1) The problem statement:

Problem link: 2003F — Turtle and Three Sequences

Given two positive integers $$$n$$$, $$$m$$$ ($$$n \geq m$$$), and three sequences $$$a$$$, $$$b$$$, $$$c$$$ of size $$$n$$$. Find a sequence of $$$m$$$ indices $$$p_1, p_2, ..., p_m$$$ that maximizes $$$c_{p_1} + c_{p_2} + ... + c_{p_m}$$$, such that $$$p_1 < p_2 < ... < p_m$$$, $$$a_{p_1} \leq a_{p_2} \leq ... \leq a_{p_m}$$$, and $$$b_{p_i} \neq b_{p_j}$$$, $$$\forall i \neq j$$$.

Constraint:

$$$n \leq 3000$$$, $$$m \leq 5$$$.
$$$1 \leq a_i, b_i \leq n$$$, $$$1 \leq c_i \leq 10^4$$$.
Time Limit: 3 seconds.

You can read the intended solution in this link: Editorial.

Spoiler:

I would recommend you to try the problem out before scrolling further (or just don't, because the problem is rated 2800, so if you are a fellow blue hardstuck-er then you don't really stand much of a chance anyway).

2) My solution:

Prerequisite:

Fenwick Tree.
Constant optimization skill.

It would be best to start from something manageable first. $$$m \leq 2$$$ is pretty simple, you literally just have to brute force it, it literally cannot get any easier than this, so let's move on.

For $$$m = 3$$$, it is more challenging, but you can solve it by iterating through all pairs $$$(i, j)$$$ $$$(i < j)$$$, and for each pair, finding the best index $$$k$$$ to the right of $$$j$$$ in $$$O(n)$$$, so the total complexity is $$$O(n^3)$$$. To optimize this, observe that we don't really need to keep track of that many $$$k$$$. We only need to maintain an array $$$suff_j$$$, containing up to $$$m$$$ candidate tuples $$$(a_k, b_k, c_k)$$$ with the largest $$$c_k$$$, such that $$$k > j$$$, $$$a_k \geq a_j$$$, and all $$$b_k$$$ are distinct (because at worst, we only have to skip $$$m-1$$$ candidates of the $$$m$$$ tuples). We can precalculate $$$suff_j$$$ in $$$O(n^2)$$$, so the final complexity for $$$m = 3$$$ is $$$O(n^2*m)$$$.
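The candidate lists ($$$suff_j$$$ and friends) all rely on one small operation: keep at most $$$m$$$ tuples with the largest $$$c$$$ and pairwise distinct $$$b$$$. Here is a rough sketch of that operation; the names `Cand` and `insertCandidate` are made up for illustration, and my actual submission is more aggressively optimized than this.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Cand { int a, b, c; }; // hypothetical tuple (a_k, b_k, c_k)

// Insert x into `best`, which holds at most m candidates sorted by c in
// descending order, with pairwise distinct b. If a candidate with the same b
// already exists, only the one with the larger c is kept.
void insertCandidate(std::vector<Cand>& best, const Cand& x, int m) {
    assert(m >= 1);
    bool placed = false;
    for (Cand& y : best) {
        if (y.b == x.b) {
            if (y.c >= x.c) return; // the existing candidate is at least as good
            y = x;                  // replace the weaker candidate with the same b
            placed = true;
            break;
        }
    }
    if (!placed) best.push_back(x);
    // The list has at most m + 1 entries (m <= 5 here), so sorting is cheap.
    std::sort(best.begin(), best.end(),
              [](const Cand& p, const Cand& q) { return p.c > q.c; });
    if (static_cast<int>(best.size()) > m) best.pop_back();
}
```

Since $$$m \leq 5$$$, every call touches at most six elements, so it behaves like a constant-time operation in the complexity analysis.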

$$$m = 4$$$ is pretty much the same thing, except you also have to maintain the array $$$pref_i$$$, which contains up to $$$m$$$ tuples $$$(a_k, b_k, c_k)$$$ with the largest $$$c_k$$$, such that $$$k < i$$$, $$$a_k \leq a_i$$$, and all $$$b_k$$$ are distinct. Then iterate through all pairs $$$(i, j)$$$ just like in the previous algorithm, and iterate through all the candidate tuples in $$$suff_j$$$ and $$$pref_i$$$. The complexity of this is $$$O(n^2*m^2)$$$.

$$$m = 5$$$ is pretty tough, however. Previously, calculating the $$$m$$$ best candidates to the left of $$$i$$$ and to the right of $$$j$$$ was pretty easy, but how do we take it a step further? Iterating through all the possible pairs of candidates (not just candidates, pairs of candidates) to the left of $$$i$$$ or to the right of $$$j$$$ is pretty infeasible, so we only have one choice: maintaining the array $$$between_{i, j}$$$, containing up to $$$m$$$ tuples with the largest $$$c_k$$$, such that $$$i < k < j$$$, $$$a_i \leq a_k \leq a_j$$$, and all $$$b_k$$$ are distinct.

We can use data structures for range maximum queries, like a Fenwick Tree, to calculate this array. Here's how: for each $$$i$$$, iterate $$$j$$$ from $$$i+1$$$ to $$$n$$$. For each $$$j$$$, get $$$m$$$ candidate tuples such that $$$a_k \leq a_j$$$, then update the Fenwick Tree with the tuple $$$(a_j, b_j, c_j)$$$. Once you have obtained the three arrays, you just do the same thing as in the previous subtask, except you also iterate through $$$between_{i, j}$$$. The complexity of this is $$$O(n^2*m*(m^2+\log(n)))$$$.
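For reference, this is roughly what a prefix-maximum Fenwick Tree looks like. It is a simplified sketch: each node stores a single value instead of a list of up to $$$m$$$ candidates with distinct $$$b$$$, and the names `FenwickMax` and `NEG_INF` are made up for illustration. In the real solution the `std::max` calls would be replaced by merging candidate lists.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

constexpr long long NEG_INF = -(1LL << 60); // sentinel for "no value yet"

// Fenwick tree over positions 1..n supporting "raise position pos to at least
// val" and "maximum over the prefix [1, pos]". Values can only grow, which is
// all that a prefix-maximum Fenwick tree can support.
struct FenwickMax {
    int n;
    std::vector<long long> t;

    explicit FenwickMax(int n) : n(n), t(n + 1, NEG_INF) {}

    void update(int pos, long long val) {
        assert(1 <= pos && pos <= n);
        for (int i = pos; i <= n; i += i & -i) t[i] = std::max(t[i], val);
    }

    // Maximum over positions 1..pos, or NEG_INF if that prefix is empty.
    long long query(int pos) const {
        long long res = NEG_INF;
        for (int i = std::min(pos, n); i > 0; i -= i & -i) res = std::max(res, t[i]);
        return res;
    }
};
```

Indexing the tree by $$$a$$$ values means `query(a_j)` sees exactly the inserted elements with $$$a_k \leq a_j$$$.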

As the complexity might suggest, you indeed have to go pretty crazy on the constant optimization here (probably anything other than Fenwick Tree won't even pass, who knows, my program runs in 2.93s, which is like one Minecraft tick away from getting TLE).

Can we extend this to $$$m = 6$$$? Nope, I think I've pushed this idea to its limit already. Which is scary to think about, because the author set the constraint on $$$m$$$ precisely equal to $$$5$$$, while the intended solution can go beyond that. Did the author know about this solution, and is that why they set $$$m = 5$$$? And if yes, why didn't they just write this solution into the editorial? Guess we'll never know.


tutorial, awesome, informative, inspirational


[Tutorial] Divide and Conquer Offline Query — A Niche Way to solve Static Range Query

By steveonalex, 17 months ago, In English

Hi guys, this is my first blog on Codeforces. So if there are any mistakes or you have any suggestions, feel free to correct me down in the comment section. Anyway, I discovered a nice way to solve static range query problems using "Divide and conquer", and I'm eager to share it with you guys.

Pre-requisites:
• Prefix Sum.

Problem 1:

Given an array $$$A$$$ of $$$N (N \leq 10^{5})$$$ integers, your task is to answer $$$q (q \leq 10^{5})$$$ queries in the form: what is the minimum value in the range $$$[l, r]$$$?

For now, let's forget about Segment Tree, Square Decomposition, Sparse Table and such. There's a simple way to solve this problem without using any of these fancy data structures.

First, let's start with $$$L_{0} = 1$$$, $$$R_{0} = N$$$, and $$$M_{0} = \left\lfloor { \frac{L_{0} + R_{0}}{2} } \right\rfloor$$$. Let's just assume that every query satisfies $$$L_{0} \leq l \leq M_{0} < r \leq R_{0}$$$. We maintain two prefix-sum-like arrays (a suffix minimum ending at $$$M_{0}$$$ and a prefix minimum starting at $$$M_{0}+1$$$):
• $$$X[i] = min(A[i], A[i+1], ..., A[M_{0}-1], A[M_{0}])$$$
• $$$Y[i] = min(A[M_{0}+1], A[M_{0}+2], ..., A[i-1], A[i])$$$

The answer to the query $$$[l, r]$$$ is simply $$$min(X[l], Y[r])$$$. But what about the queries that don't satisfy the aforementioned condition? Well, we can recursively do the same thing on $$$L_{1} = L_{0}, R_{1} = M_{0}$$$, and $$$L_{2} = M_{0} + 1, R_{2} = R_{0}$$$, hence the name "Divide and conquer". The recursion tree is $$$log N$$$ layers deep, each query exists in no more than $$$log N$$$ layers, and in each layer you do $$$O(N)$$$ operations. Therefore this algorithm runs in $$$O((N + q) * log N)$$$ time and $$$O(N + q)$$$ memory.
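Here is a minimal sketch of the whole procedure for Problem 1, assuming 0-indexed inclusive queries. The names `Query`, `solve` and `rangeMinOffline` are made up for this sketch. Queries that cross the midpoint are answered on the spot, and the rest are pushed down to the left or right half.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

struct Query { int l, r, id; }; // 0-indexed, inclusive range

// Answers every query in qs; all of them must lie inside [L, R].
void solve(int L, int R, const std::vector<int>& A, std::vector<Query> qs,
           std::vector<int>& X, std::vector<int>& Y, std::vector<int>& ans) {
    if (qs.empty()) return;
    if (L == R) { // every remaining query is exactly [L, L]
        for (const Query& q : qs) ans[q.id] = A[L];
        return;
    }
    int M = (L + R) / 2;
    X[M] = A[M]; // X[i] = min(A[i..M])
    for (int i = M - 1; i >= L; --i) X[i] = std::min(A[i], X[i + 1]);
    Y[M + 1] = A[M + 1]; // Y[i] = min(A[M+1..i])
    for (int i = M + 2; i <= R; ++i) Y[i] = std::min(A[i], Y[i - 1]);

    std::vector<Query> left, right;
    for (const Query& q : qs) {
        if (q.r <= M) left.push_back(q);
        else if (q.l > M) right.push_back(q);
        else ans[q.id] = std::min(X[q.l], Y[q.r]); // the single "combine"
    }
    std::vector<Query>().swap(qs); // free this level's list before recursing
    solve(L, M, A, std::move(left), X, Y, ans);
    solve(M + 1, R, A, std::move(right), X, Y, ans);
}

// Returns the minimum of A[l..r] for every (l, r) pair, in input order.
std::vector<int> rangeMinOffline(const std::vector<int>& A,
                                 const std::vector<std::pair<int, int>>& queries) {
    int n = static_cast<int>(A.size());
    std::vector<Query> qs;
    for (int i = 0; i < static_cast<int>(queries.size()); ++i) {
        assert(0 <= queries[i].first && queries[i].first <= queries[i].second
               && queries[i].second < n);
        qs.push_back({queries[i].first, queries[i].second, i});
    }
    std::vector<int> X(n), Y(n), ans(queries.size());
    if (n > 0) solve(0, n - 1, A, std::move(qs), X, Y, ans);
    return ans;
}
```

The arrays `X` and `Y` are shared across the whole recursion, since each call only touches the indices inside its own range $$$[L, R]$$$.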

So... Why on earth should I use it?

While this technique has practically the same complexity as Segment Tree, there is an interesting property: You only perform the "combine" operation once per query. Here's a basic example to show how this property can be exploited.

Problem 2:

Define the cost of a set as the product of its elements. You are given an array $$$A$$$ of $$$N (N \leq 2*10^{5})$$$ integers and a positive integer $$$k$$$ ($$$k \leq 20$$$). Define $$$g(l, r, k)$$$ as the sum of the costs of all subsets of size $$$k$$$ of the set {$$$ A[l], A[l+1], ..., A[r-1], A[r] $$$}. Your task is to answer $$$q (q \leq 5 * 10^{5})$$$ queries of the form: what is $$$g(l, r, k)$$$ modulo $$$10^{9} + 69$$$?

Naive Idea

Now I'll assume that you've read the naive idea above (you should read it). Notice how combining the values of two ranges runs in $$$O(k^2)$$$. However, if one of the two ranges has length $$$1$$$, then they can be combined in $$$O(k)$$$. This means a prefix-sum-like array can be constructed in $$$O(N * k)$$$.
Why is this important? Let's not forget that in the naive Segment Tree idea, the bottleneck is the convolution calculation, and we wish to trade it for less expensive operations. That is exactly what our aforementioned divide & conquer technique helps with, since you only do the expensive "combine" operation once per query. And besides, unlike Segment Tree, you only need the coefficient for size $$$k$$$, so you can calculate the answer right away using the formula $$$\sum_{t=0}^k X[l][t] * Y[r][k - t]$$$, where $$$X[l][t]$$$ is the sum of the costs of all subsets of size $$$t$$$ of {$$$A[l], ..., A[M]$$$}, $$$Y[r][t]$$$ is the same for {$$$A[M+1], ..., A[r]$$$}, and $$$X[l][0] = Y[r][0] = 1$$$ (the empty subset).

This runs in $$$O(N * log N * k + q * (k + log N))$$$ time and $$$O(N + q + k)$$$ memory, which is much better than the Segment Tree idea.
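The two operations used above can be sketched as follows. This is only an illustration with made-up names (`extend`, `combine`, `MOD`), assuming each table row stores the subset-cost sums for sizes $$$0..k$$$ with index $$$0$$$ holding the empty product $$$1$$$, so the query sum in this sketch runs over $$$t$$$ from $$$0$$$ to $$$k$$$.

```cpp
#include <cassert>
#include <vector>

using ll = long long;
const ll MOD = 1000000069LL; // the modulus 10^9 + 69 from the problem

// e[t] = sum of products over all size-t subsets of some range (e[0] = 1).
// Returns the same table after adding one more element a to the range, in O(k):
// a size-t subset either skips a (e[t]) or contains it (a * e[t - 1]).
std::vector<ll> extend(const std::vector<ll>& e, ll a) {
    assert(!e.empty() && e[0] == 1);
    a %= MOD;
    if (a < 0) a += MOD;
    std::vector<ll> res(e);
    for (int t = 1; t < static_cast<int>(e.size()); ++t)
        res[t] = (e[t] + a * e[t - 1]) % MOD;
    return res;
}

// The single O(k) "combine" per query: pick t elements from the left part and
// k - t from the right part, for every t from 0 to k.
ll combine(const std::vector<ll>& X, const std::vector<ll>& Y, int k) {
    assert(k >= 0 && k < static_cast<int>(X.size()) && k < static_cast<int>(Y.size()));
    ll sum = 0;
    for (int t = 0; t <= k; ++t) sum = (sum + X[t] * Y[k - t]) % MOD;
    return sum;
}
```

For example, with $$$k = 2$$$, the left part $$$\{2, 3\}$$$ gives the table $$$(1, 5, 6)$$$ and the right part $$$\{4\}$$$ gives $$$(1, 4, 0)$$$. Combining them yields $$$0 + 5 * 4 + 6 = 26$$$, which matches $$$2*3 + 2*4 + 3*4$$$.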