MINHTPC's blog

By MINHTPC, 9 months ago, In English

Parallel Binary Search — Introduction

Put simply, parallel binary search is performing binary search for multiple queries in parallel while reusing already-built data instead of rebuilding it many times from scratch. The idea is somewhat similar to dynamic programming in that we take advantage of previously computed states.

In problems where we could apply binary search to find the answer for each query independently, we would normally need to perform Q separate searches. This often results in TLE because the data structure or computation must be rebuilt for every query. Parallel binary search helps reduce the complexity by processing queries together.

This is not a new technique, but there are still relatively few detailed articles about it. This post can serve as a reference for those who have not encountered it before.

To make things clearer, let’s start with some example problems.

Example Problem

Suppose we have an array of n elements and a list of Q queries. Each query asks: "What is the smallest index t such that after applying all updates from 1 to t, a certain condition for this query becomes true?" If no such t exists, we should output -1.

A naive approach would be to run binary search for each query separately, rebuilding the data structure for every check. This can easily lead to O(Q × M × log M) complexity, which is too slow.

With parallel binary search, we maintain a search range [L[q], R[q]] for each query and group queries by their current midpoint. We process all queries with the same midpoint together, applying updates incrementally. This allows us to reuse computations and avoid unnecessary rebuilding.

Key idea:

  • Keep track of queries' binary search intervals.
  • Group queries by the middle of their search interval.
  • Apply updates in increasing order of t, answering all queries whose midpoint matches the current t.
  • Shrink the search interval based on whether the predicate holds.

By repeating this grouping process until all queries are resolved, we reduce complexity to about O((N + Q) × log M) in many cases, making the approach feasible for large inputs.

Deep Dive Analysis — “Ha Noi Road Trip” Problem

For reference, this problem is part of the “Parallel Binary Search” contest in our Codeforces group.

Restating the problem: We have a tree of n cities connected by roads. Each road has:

  • v – current speed limit
  • c – upgrade cost
  • s – upgraded speed limit (strictly greater than v)
We receive Q queries. For each query (a, b, m):
  • We may choose at most one road on the path from a to b to upgrade, provided its cost ≤ m.
  • We must output the maximum possible travel speed between a and b after that upgrade (or without upgrading if better).

Naive approach and why it fails

If we process each query independently:

  • Find the path from a to b (e.g., using LCA in O(log n)).
  • Scan edges on that path to check upgrade opportunities within budget.
In the worst case, the path length is ≈ n, leading to a total time complexity of O(Q × n), which is too slow for n = 2×10⁵, Q = 10⁵.

Key observation

Each query essentially asks: “What is the minimum budget threshold at which the answer improves?” This invites a global offline solution with Parallel Binary Search:

  1. Sort edges by their upgrade cost c.
  2. Binary search in parallel across all queries over budget values.
  3. At each mid-budget, “activate” edges with cost ≤ mid and update our data structure.
  4. Answer each query with the best path speed, then narrow the search range.

Data structure choice

Since the graph is a tree, we can use:

  • Heavy-Light Decomposition (HLD) to map tree edges to segment tree indices.
  • A segment tree (or Fenwick tree) to maintain the maximum speed per edge.
  • When activating an upgrade, we perform: update(edge) = max(current_speed, upgraded_speed).
  • To answer a query, query the maximum edge speed along the path from a to b.

Parallel Binary Search workflow

// Initialization: for each query, set L[q] = 1, R[q] = max budget + 1
// Each round: group unresolved queries by mid = (L[q] + R[q]) / 2, sweep budgets
//   in increasing order while activating edges with cost <= current budget,
//   and evaluate each query at its midpoint to shrink [L[q], R[q]]
// Final answer for each query: if L[q] exceeds max budget, only original speeds apply

Complexity analysis

  • Each edge is activated O(log M) times across the binary searches → O(n log M) updates.
  • Each query does O(log M) path queries → O(Q log M) query operations.
  • Each operation (update or query) on HLD + segment tree is O(log n).
Total: O((n + Q) × log M × log n) — efficient for given constraints.

Final remarks

This problem elegantly combines Parallel Binary Search with path query data structures (HLD + segment tree). It exemplifies how offline query processing and reusing shared state can dramatically reduce redundant work in competitive programming.

Conclusion

The “Ha Noi Road Trip” problem is a textbook example of how Parallel Binary Search can turn an otherwise intractable query set into something efficient and elegant. By processing all queries offline, grouping them by midpoints, and reusing the same data structure state for multiple queries, we avoid rebuilding from scratch each time.

When combined with Heavy-Light Decomposition and a segment tree, the approach scales to the problem’s high constraints while keeping the implementation structured and modular.

The key takeaway is this: whenever you see multiple independent binary searches over a shared state that is expensive to rebuild, consider running them in parallel. This often transforms the complexity from O(Q × cost_of_state_build) to something much more manageable.

In competitive programming, techniques like this are not just tricks — they are powerful patterns that, once mastered, can be adapted to a wide variety of problems.


By MINHTPC, 11 months ago, In English

Bitwise Operations

Note: Bitwise operations shown here are only considered on natural numbers. In fact, bitwise operations can also be performed on negative numbers, but we won’t discuss their nature here.

AND / OR / XOR Operations

Bitwise AND, OR, and XOR are binary operations (two operands). These operations are performed independently on each bit and then combined to form the final result.

Bitwise AND, symbol: a & b

For each bit x and y, AND is performed as follows:

0 & 0 = 0
0 & 1 = 0
1 & 0 = 0
1 & 1 = 1

Explanation: returns 1 only when both bits are 1; returns 0 otherwise.

368 = 101110000
147 =  10010011
→ align bits to same width:
368 = 101110000
147 = 010010011
368 & 147 = 000010000 = 16

Bitwise OR, symbol: a | b

For each bit x and y, OR is performed as follows:

0 | 0 = 0
0 | 1 = 1
1 | 0 = 1
1 | 1 = 1

Explanation: returns 1 if at least one of the bits is 1; returns 0 only if both are 0.

368 = 101110000
147 = 010010011
368 | 147 = 111110011 = 499

Bitwise XOR, symbol: a ^ b

For each bit x and y, XOR is performed as follows:

0 ^ 0 = 0
0 ^ 1 = 1
1 ^ 0 = 1
1 ^ 1 = 0

Explanation: returns 1 if bits are different; returns 0 if they are the same.

368 = 101110000
147 = 010010011
368 ^ 147 = 111100011 = 483

Special Properties of XOR:

  • a ^ 0 = a for all a
  • a ^ a = 0 for all a
  • XOR is its own inverse

XOR is the “inverse operation” of itself

a + b = S → a = S - b or b = S - a
a * b = P → a = P / b or b = P / a
a ^ b = X → a = X ^ b or b = X ^ a

When XOR-ing over a whole set of numbers, remember:
+ Any number XOR-ed with itself = 0
+ A value that appears an even number of times is considered as not appearing
+ A value that appears an odd number of times is considered as appearing once

Example:

1 ^ 1 ^ 2 ^ 2 ^ 3 ^ 3 ^ 2 ^ 2 ^ 3 ^ 3 ^ 4 ^ 4 ^ 4 ^ 4 ^ 5

→ 1 appears twice → gone
→ 2 appears 4 times → gone
→ 3 appears 4 times → gone
→ 4 appears 4 times → gone
→ 5 appears once → remains

Result: 5

Bit Shift Operations

Left Shift, symbol: a << b

Meaning:

  • In binary: add b zeros to the right of a
  • Value: multiply a by 2^b
368 = 101110000
368 << 4 = 1011100000000
→ 368 * 2^4 = 368 * 16 = 5888
147 = 10010011
147 << 2 = 1001001100
→ 147 * 2^2 = 147 * 4 = 588

Right Shift, symbol: a >> b

Meaning:

  • In binary: remove b bits from the right side of a
  • Value: divide a by 2^b (rounded down)
368 = 101110000
368 >> 4 = 000010111 = 23
→ 368 / 2^4 = 368 / 16 = 23
147 = 10010011
147 >> 2 = 00100100 = 36
→ 147 / 4 = 36.75 → truncated to 36

Bitwise NOT (~)

Symbol: ~

This is a unary operator (only one operand). For each bit i:

  • If bit i of a is 1, then bit i of ~a is 0
  • If bit i of a is 0, then bit i of ~a is 1

Note: In C/C++ (two's complement), bitwise NOT gives the result: ~a = -a - 1

~368 = -369
~147 = -148

Common Mistakes When Using Bitwise Operations

- You must distinguish between bitwise operations and logical expression operations.

Bitwise AND & is a bitwise operator,
Logical AND && is a logical expression operator
& returns any integer result
&& only returns 0 or 1 (TRUE or FALSE)

Bitwise OR | is a bitwise operator,
Logical OR || is a logical expression operator
| returns any integer result
|| only returns 0 or 1 (TRUE or FALSE)

Bitwise XOR ^ is a bitwise operator,
Logical NOT ! is a logical expression operator
^ returns any integer result
! only returns 0 or 1 (TRUE or FALSE)

- When using bitwise operations (& | ~ ^ << >>), always use parentheses to group expressions. Otherwise, operator precedence may cause unexpected behavior.


By MINHTPC, 11 months ago, In English
Binary System

How to convert a number to binary manually: While the number is greater than 0 — if it is odd, write bit 1; if it is even, write bit 0. Then divide the number by 2 (floor the result if it’s odd). Bits are written from right to left.

Example:
Convert n = 368 to binary


368 is even -> write bit 0 -> result is '0'. 368/2 = 184  
184 is even -> write bit 0 -> result is '00'. 184/2 = 92  
92 is even -> write bit 0 -> result is '000'. 92/2 = 46  
46 is even -> write bit 0 -> result is '0000'. 46/2 = 23  
23 is odd -> write bit 1 -> result is '10000'. 23/2 = 11  
11 is odd -> write bit 1 -> result is '110000'. 11/2 = 5  
5 is odd -> write bit 1 -> result is '1110000'. 5/2 = 2  
2 is even -> write bit 0 -> result is '01110000'. 2/2 = 1  
1 is odd -> write bit 1 -> result is '101110000'. 1/2 = 0 (done)


Conclusion: 368 in binary is 101110000

Convert 147 to binary


147 is odd -> write bit 1 -> result is '1'. 147/2 = 73  
73 is odd -> write bit 1 -> result is '11'. 73/2 = 36  
36 is even -> write bit 0 -> result is '011'. 36/2 = 18  
18 is even -> write bit 0 -> result is '0011'. 18/2 = 9  
9 is odd -> write bit 1 -> result is '10011'. 9/2 = 4  
4 is even -> write bit 0 -> result is '010011'. 4/2 = 2  
2 is even -> write bit 0 -> result is '0010011'. 2/2 = 1  
1 is odd -> write bit 1 -> result is '10010011'. 1/2 = 0 (done)


Conclusion: 147 in binary is 10010011

Based on binary conversion, we see that: every natural number x has a unique way to be expressed as the sum of distinct powers of 2.


368 in binary is 101110000  
368 = 2^8 + 2^6 + 2^5 + 2^4



147 in binary is 10010011  
147 = 2^7 + 2^4 + 2^1 + 2^0


Important: Bit positions in binary numbers are indexed from 0 from right to left.

Bit 0 is the least significant bit (rightmost).  
Bit next to it is bit 1, etc.  
The leftmost bit (most significant) is the one with the highest index.


368 in binary: 101110000  
368 = 2^8 + 2^6 + 2^5 + 2^4  
Bit 0 of 368 is 0  
Bit 1 of 368 is 0  
Bit 2 of 368 is 0  
Bit 3 of 368 is 0  
Bit 4 of 368 is 1  
Bit 5 of 368 is 1  
Bit 6 of 368 is 1  
Bit 7 of 368 is 0  
Bit 8 of 368 is 1



147 in binary: 10010011  
147 = 2^7 + 2^4 + 2^1 + 2^0  
Bit 0 of 147 is 1  
Bit 1 of 147 is 1  
Bit 2 of 147 is 0  
Bit 3 of 147 is 0  
Bit 4 of 147 is 1  
Bit 5 of 147 is 0  
Bit 6 of 147 is 0  
Bit 7 of 147 is 1


Every natural number has a unique representation as the sum of powers of 2.  
You can determine which bits are 1 by checking which 2^i values are included.

New method to convert x to binary:

Step 1: Find the largest k such that 2^k ≤ x → Binary representation has (k+1) bits (bit positions 0 to k)

Step 2: For i = k, k-1, ..., 0  
→ if x ≥ 2^i → bit i = 1 and x = x - 2^i  
→ otherwise → bit i = 0 (x stays the same)

Convert 368 to binary

k = 8 (2^8 = 256 ≤ 368, 2^9 = 512 > 368) → binary has 9 bits


Check i = 8, x = 368 ≥ 2^8 → bit 8 = 1 (1????????) → x = 368 - 256 = 112  
Check i = 7, x = 112 < 2^7 → bit 7 = 0 (10???????)  
Check i = 6, x = 112 ≥ 2^6 → bit 6 = 1 (101??????) → x = 112 - 64 = 48  
Check i = 5, x = 48 ≥ 2^5 → bit 5 = 1 (1011?????) → x = 48 - 32 = 16  
Check i = 4, x = 16 ≥ 2^4 → bit 4 = 1 (10111????) → x = 16 - 16 = 0  
Check i = 3, x = 0 < 2^3 → bit 3 = 0 (101110???)  
Check i = 2, x = 0 < 2^2 → bit 2 = 0 (1011100??)  
Check i = 1, x = 0 < 2^1 → bit 1 = 0 (10111000?)  
Check i = 0, x = 0 < 2^0 → bit 0 = 0 (101110000)


Convert 147 to binary  
k = 7 (2^7 = 128 ≤ 147, 2^8 = 256 > 147) → binary has 8 bits


Check i = 7, x = 147 ≥ 2^7 → bit 7 = 1 (1???????) → x = 147 - 128 = 19  
Check i = 6, x = 19 < 2^6 → bit 6 = 0 (10??????)  
Check i = 5, x = 19 < 2^5 → bit 5 = 0 (100?????)  
Check i = 4, x = 19 ≥ 2^4 → bit 4 = 1 (1001????) → x = 19 — 16 = 3  
Check i = 3, x = 3 < 2^3 → bit 3 = 0 (10010???)  
Check i = 2, x = 3 < 2^2 → bit 2 = 0 (100100??)  
Check i = 1, x = 3 ≥ 2^1 → bit 1 = 1 (1001001?) → x = 3 - 2 = 1  
Check i = 0, x = 1 ≥ 2^0 → bit 0 = 1 (10010011) → x = 1 - 1 = 0 (done)


By MINHTPC, 11 months ago, In English

Gauss-Jordan Elimination — Introduction

Given a system of $$$n$$$ linear algebraic equations (SLAE) with $$$m$$$ variables, we are asked to solve it (i.e., determine whether it has no solution, a unique solution, or infinitely many solutions). In the case of at least one solution, return any one of them.

In short, we are given the following system of equations:

$$$ \begin{cases} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m = b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m = b_2 \\ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m = b_n \end{cases} $$$

where $$$a_{ij}$$$ ($$$1 \le i \le n,\ 1 \le j \le m$$$) and $$$b_i$$$ ($$$1 \le i \le n$$$) are known constants, and $$$x_i$$$ ($$$1 \le i \le m$$$) are the unknowns.

Alternatively, the system can be represented in matrix form as:

$$$Ax = b$$$

where $$$A$$$ is an $$$n \times m$$$ matrix of coefficients $$$a_{ij}$$$, and $$$b$$$ is an $$$n$$$-dimensional vector of constants.

This method can also be applied when the system has modular constraints:

$$$ \begin{cases} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m \equiv b_1 \ (\text{mod } p) \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m \equiv b_2 \ (\text{mod } p) \\ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m \equiv b_n \ (\text{mod } p) \end{cases} $$$

The Algorithm

The Gauss-Jordan elimination method sequentially eliminates variables to reduce the system to a simpler form and then solves it. Specifically, we use the first equation to eliminate $$$x_1$$$ from the remaining $$$n-1$$$ equations, transforming them into a new system. This continues by eliminating $$$x_2$$$, and so on.

Two elementary row operations are used:

  • Multiply an equation by a non-zero constant $$$r$$$.
  • Add one equation to another (after possibly scaling it).

These operations preserve the solution set.

To perform Gauss elimination, for each variable $$$x_i$$$ ($$$1 \le i \le \min(n, m)$$$):

  1. Find an unused equation with $$$a_{ei} \neq 0$$$:
  • If found: divide all coefficients of that equation by $$$a_{ei}$$$, mark the equation as used, and eliminate $$$x_i$$$ from other equations.
  • If not found: $$$x_i$$$ is a free variable.
  2. For each remaining equation $$$r$$$, subtract $$$a_{ri}$$$ times the pivot row (equation $$$e$$$). This sets all other $$$a_{ri} = 0$$$ while maintaining $$$a_{ei} = 1$$$.

After elimination, solve from bottom to top:

  • If $$$n \lt m$$$, then $$$x_i$$$ with $$$i \gt n$$$ are free variables.
  • Assume all free variables are 0 (or arbitrary if finding general solution).
  • For each dependent variable, solve using the row where it was eliminated.

Finally, check for consistency:

  • If any row simplifies to $$$0 = c$$$ with $$$c \neq 0$$$, the system has no solution.
  • Otherwise, if at least one free variable exists, there are infinitely many solutions; else, there is a unique solution.

Example

We use an $$$n \times (m+1)$$$ augmented matrix to represent the system. Each row is an equation; the first $$$m$$$ numbers are coefficients, the last is the constant term.

Given:

$$$ \begin{cases} x + 5y + 3z - 4t = 19 \\ 3x + 16y + 7z - 9t = 42 \\ -2x - 8y - 9z + 3t = -61 \end{cases} $$$

Matrix:

$$$ \begin{pmatrix} 1 & 5 & 3 & -4 & 19 \\ 3 & 16 & 7 & -9 & 42 \\ -2 & -8 & -9 & 3 & -61 \end{pmatrix} $$$

Step 1: Eliminate $$$x$$$

  • Row2 -= 3 * Row1
  • Row3 += 2 * Row1
$$$ \begin{pmatrix} 1 & 5 & 3 & -4 & 19 \\ 0 & 1 & -2 & -3 & -15 \\ 0 & -2 & -3 & -5 & -23 \end{pmatrix} $$$

Step 2: Eliminate $$$y$$$

  • Row1 -= 5 * Row2
  • Row3 += 2 * Row2
$$$ \begin{pmatrix} 1 & 0 & 13 & -19 & 94 \\ 0 & 1 & -2 & -3 & -15 \\ 0 & 0 & 1 & -11 & 7 \end{pmatrix} $$$

Step 3: Eliminate $$$z$$$

  • Row1 -= 13 * Row3
  • Row2 += 2 * Row3
$$$ \begin{pmatrix} 1 & 0 & 0 & 124 & 3 \\ 0 & 1 & 0 & -19 & -1 \\ 0 & 0 & 1 & -11 & 7 \end{pmatrix} $$$

Let $$$t = 0$$$:

$$$ \begin{cases} x = 3 \\ y = -1 \\ z = 7 \\ t = 0 \end{cases} $$$

Since $$$t$$$ is a free variable, there are infinitely many solutions. Another valid one is:

$$$ \begin{cases} x = -121 \\ y = 18 \\ z = 18 \\ t = 1 \end{cases} $$$

Conclusion

In this part, we’ve introduced the Gauss-Jordan elimination method and applied it to a sample problem.

In the next section, we’ll explore competitive programming problems that use this technique, share implementation tips, and provide clean code samples.


By MINHTPC, 11 months ago, In English

Each edge on the tree is classified into 2 types: heavy and light.

Let treeSize[u] = the number of nodes in the subtree rooted at node u.

Consider any node u with children v1, v2, ..., vk → we have edges (u, v1), (u, v2), ..., (u, vk).

Among these children, let vi be the child with the largest treeSize (if multiple children have the same largest treeSize, we can pick any one of them).

Then, the edge (u, vi) is called a heavy edge.

The remaining edges are light edges.

Basic properties:

For every node u that is not a leaf, there is exactly one heavy edge going from u to one of its children.

If the edge (u, vi) is a light edge, then we have: treeSize[u] > 2 * treeSize[vi]

Proof of property (2):

Assume (u, vi) is a light edge → there exists another child vj of u (vj ≠ vi) such that edge (u, vj) is the heavy edge.

By the rule of selecting the heavy edge, vj is the child with the largest treeSize.

This means: treeSize[vi] ≤ treeSize[vj].

On the other hand, the subtrees rooted at vi and vj are disjoint (since they belong to different branches of the subtree rooted at u), so:

treeSize[vi] + treeSize[vj] < treeSize[u].

From this, we derive: treeSize[u] > 2 * treeSize[vi].

Advanced Consequences: (A) Since no two heavy edges share the same parent, heavy edges form chains. (B) On any path from an arbitrary node x to the root, the number of light edges is at most log(n).

For each light edge, we have treeSize[parent] > 2 * treeSize[child].

This means that every time we move upward through a light edge, the subtree size at least doubles. Since treeSize[root] = n, the number of times we can double before reaching the root is at most log(n).

→ Therefore, on the path between any two nodes (u, v), the number of light edges is at most 2 * log(n).

To list all heavy-edge chains simply, we perform a DFS on the tree.

During DFS, we always explore the heavy child first before visiting other branches.

This way, all nodes in the same heavy-edge chain will be visited consecutively.

The following image shows the original tree structure before applying heavy-light decomposition.

At this stage, all edges are considered equal, and no heavy paths have been identified yet.

Tree before decomposition

Now, after applying the heavy-light decomposition, the tree is broken into several heavy-edge chains.

Heavy edges (typically shown in bold or color) form continuous paths that optimize traversal and queries.

Tree after decomposition

The sequence below shows the order in which nodes are visited during a depth-first search (DFS) traversal.

1 → 3 → 11 → 13 → 15 → 12 → 8 → 14 → 17 → 6 → 2 → 4 → 10 → 16 → 9 → 5 → 7

Heavy-Light Decomposition is a powerful technique to break down a tree into heavy and light edges, helping optimize many tree-related problems. This introduction covered the basic concepts and how to identify heavy edges using DFS.

In the next post, we will dive deeper into how to implement HLD efficiently and apply it to solve practical problems.

Stay tuned!


By MINHTPC, 11 months ago, In English

http://cses.fi/problemset/task/1664/

First, we can see that this problem calls for binary lifting, with O(N log N) preprocessing and O(log N) per query.

At first, many of you will be tempted to greedily follow the movies in their input order, but that is the wrong way to solve it: the input order of array A is a trap.

The correct solution is to be greedy on the movies' ending times: from the current moment, always jump to the movie with the earliest ending time among those that start no earlier, and combine these jumps with binary lifting.

In the worst case, the lifting table has about 20 × 10⁶ entries, so we can totally do it this way.

Here is my sample solution

#include <bits/stdc++.h>
using namespace std;
const int N=1e6;
int p[20][N+5],nxt[N+5],en[N+5];
const int oo=N+1;
int main()
{
    ios_base::sync_with_stdio(false);
    cin.tie(0);cout.tie(0);
    int n,q;
    cin >> n >> q;
    // en[a] = earliest ending time among movies that start exactly at a
    fill(en,en+N+2,oo);
    for(int i=1;i<=n;i++) {
        int a,b;
        cin >> a >> b;
        en[a]=min(en[a],b);
    }
    // nxt[i] = earliest ending time among movies that start at or after i
    nxt[N+1]=oo;
    for(int i=N;i>=0;i--) {
        nxt[i]=min(nxt[i+1],en[i]);
    }
    // p[k][i] = the moment reached after watching 2^k movies starting from time i
    for(int i=0;i<=N;i++) p[0][i]=nxt[i];
    for(int k=1;k<20;k++) {
        for(int i=0;i<=N;i++) {
            int mid=p[k-1][i];
            p[k][i]=(mid<=N?p[k-1][mid]:oo);
        }
    }
    while(q--) {
        int l,r;
        cin >> l >> r;
        int ans=0,pos=l;
        // greedily take the largest jump of 2^k movies that still ends by r
        for(int k=19;k>=0;k--) {
            int nx=p[k][pos];
            if(nx<=r) {
                ans+=(1<<k);
                pos=nx;
            }
        }
        cout << ans << '\n';
    }
}
