Furioso_Slient's blog

By Furioso_Slient, history, 4 months ago, In English

Introduction

Chinese version: Luogu link (still in the review queue; international site only).

Translated by Gemini 3-flash.

There isn't much to introduce. I’ve spent so much time learning Treaps with rotations, only to find that there are almost no resources online regarding the persistent version of it. This annoyed me enough to write this article.

Note: Variable names used in this article are defined as follows:

  • $$$l_i, r_i$$$: left and right child indices of node $$$i$$$ ($$$0$$$ if the child does not exist)

  • $$$val_i$$$: the value (key) of node $$$i$$$

  • $$$rnd_i$$$: the random priority of node $$$i$$$

  • $$$cnt_i$$$: the number of occurrences of $$$val_i$$$

  • $$$siz_i$$$: the total number of nodes in the subtree rooted at $$$i$$$

  • $$$sz$$$: the total number of nodes created so far (node counter)

Persistence

First, let's briefly discuss what "persistence" is.

As we all know, when working on a project, you often encounter situations like:

  • "This version has a 'few' bugs. Please modify it. I've prepared some suggestions—they're in this 256MB zip file."

  • "Hmm, there are still some issues. In my humble opinion, these (points to 114,514 places) need fixing. Get to it."

  • "Sigh, why is every version worse? Forget it, the first version was the best. Revert everything back to Version 1."

If you didn't save the code for Version 1, you'd be ready to blow a fuse.

Persistence is essentially a "Save" mechanism:

  1. Save historical versions.

  2. Modify or query based on a specific historical version.

However, saving a full copy of the entire structure every time would obviously lead to an "MLE" (Memory Limit Exceeded). Therefore, persistent data structures use Path Copying: we only save the parts that have been modified.

In a Segment Tree this is straightforward: you create new nodes along the modified path and point them back at the unchanged parts of the existing structure.
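As a reference point, here is a minimal sketch of path copying in a persistent segment tree (point add, range sum); the array names `lc`, `rc`, `sum` are mine, not from the post. Only the $$$O(\log n)$$$ nodes on the update path get fresh indices; everything else is shared with the previous version.

```cpp
#include <cassert>

// Minimal persistent segment tree over [0, n): point add, range sum.
// Node 0 acts as the shared "null" node (all-zero), so an empty tree is root 0.
const int N = 1 << 20;
int lc[N], rc[N], sum[N], tot = 0;

int update(int old, int tl, int tr, int pos, int delta) {
    int k = ++tot;                        // fresh node for the new version
    lc[k] = lc[old]; rc[k] = rc[old];     // share both children by default
    sum[k] = sum[old] + delta;
    if (tr - tl > 1) {                    // not a leaf: replace one child only
        int mid = (tl + tr) / 2;
        if (pos < mid) lc[k] = update(lc[old], tl, mid, pos, delta);
        else           rc[k] = update(rc[old], mid, tr, pos, delta);
    }
    return k;                             // new root of this subtree
}

int query(int k, int tl, int tr, int l, int r) { // sum on [l, r)
    if (!k || r <= tl || tr <= l) return 0;
    if (l <= tl && tr <= r) return sum[k];
    int mid = (tl + tr) / 2;
    return query(lc[k], tl, mid, l, r) + query(rc[k], mid, tr, l, r);
}
```

Each `update` returns a new root; old roots keep answering queries against their own version, which is exactly the behavior we will want from the treap.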

But, as is also commonly known, the structure of a balanced tree changes constantly compared to a Segment Tree. Implementing persistence on a balanced tree is notoriously difficult. Consequently, most persistent balanced trees are implemented using the structurally stable "Non-Rotating Treap" (also known as FHQ-Treap).

But is it truly impossible to implement a persistent Treap with rotations? (Absolutely not.)


Feasibility

Consider the logic of a Rotating Treap after a modification.

Let’s examine a standard insert function for a Rotating Treap:

void insert(int &k, int x) {
    if (!k) {
        sz++; k = sz;
        siz[k] = 1; cnt[k] = 1;
        val[k] = x; rnd[k] = rand();
        return;
    }
    siz[k]++;
    if (val[k] == x) {
        cnt[k]++;
    } else if (val[k] < x) {
        insert(r[k], x);
        if (rnd[r[k]] < rnd[k]) lrotate(k); // Left Rotate
    } else {
        insert(l[k], x);
        if (rnd[l[k]] < rnd[k]) rrotate(k); // Right Rotate
    }
}

You will notice that after a node is inserted, all nodes on the path from its parent up to the root may undergo rotations.

A rotation turns a parent into a child of its own child and re-hooks the subtrees between them. Since our traversal only ever follows child pointers (a node never needs to know its parent), the only nodes whose state changes are those involved in a rotation and the ancestors on the access path.

Therefore, we can simply copy these nodes before changing their state. Using a method similar to a Persistent Segment Tree (Chairman Tree), we point the new nodes back into the original tree and store the new root generated by this operation.

The same logic applies to the delete operation, which also modifies the structure. For queries, since the tree structure doesn't change, no copying is needed.

For every operation, we expect to copy $$$O(\log n)$$$ nodes. The complexities are:

  • Time Complexity per operation: $$$O(\log n)$$$

  • Total Space Complexity: $$$O(q \log n)$$$


Implementation

First, we implement a helper function to clone nodes. We create a new index and copy all data from the source node.

int copy(int k) {
    if (!k) return 0;
    ++sz;
    l[sz] = l[k]; r[sz] = r[k];
    val[sz] = val[k]; cnt[sz] = cnt[k];
    siz[sz] = siz[k]; rnd[sz] = rnd[k];
    return sz;
}
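The post never shows the rotation helpers themselves, so here is a sketch consistent with the code above: both take the node index by reference, and they may freely edit the nodes they touch because the callers (`insert` / `del`) have already cloned those nodes.

```cpp
#include <cassert>

const int N = 1 << 20;
int l[N], r[N], siz[N], cnt[N], val[N], rnd[N];

void upd(int k) { siz[k] = siz[l[k]] + siz[r[k]] + cnt[k]; }

void lrotate(int &k) {        // left rotation: right child rises, k sinks left
    int t = r[k];
    r[k] = l[t]; l[t] = k;
    siz[t] = siz[k];          // t now roots the same subtree
    upd(k);                   // recompute the sunk node's size
    k = t;
}

void rrotate(int &k) {        // mirror image: left child rises
    int t = l[k];
    l[k] = r[t]; r[t] = k;
    siz[t] = siz[k];
    upd(k);
    k = t;
}
```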

Next, let's modify the insert and delete functions. For any operation that changes a node's state, we must first copy the node and then perform the modification on the clone.

Insertion Logic:

  1. If the node to be inserted never existed, create it.

  2. If the value already exists, copy the node and increment its count.

  3. If we are traversing down, copy the current node, recurse, and perform rotations on the copied node during the backtracking phase.

int insert(int k, int x) {
    if (!k) {
        ++sz;
        l[sz] = r[sz] = 0;
        val[sz] = x;
        cnt[sz] = siz[sz] = 1;
        rnd[sz] = rand();
        return sz;
    }
        
    int nk = copy(k); // Copy first
    siz[nk]++;
    
    if (val[nk] == x) {
        cnt[nk]++; // Exists, just increment count
    } else if (val[nk] < x) {
        r[nk] = insert(r[nk], x); // Recurse down
        if (rnd[r[nk]] < rnd[nk]) lrotate(nk); // Maintain via rotation
    } else {
        l[nk] = insert(l[nk], x);
        if (rnd[l[nk]] < rnd[nk]) rrotate(nk);
    }
    return nk;
}

Deletion Logic:

The logic is identical—copy before any state or structural change.

int del(int k, int x) {
    if (!k) return 0;
        
    int nk = copy(k); // Copy first
        
    if (val[nk] == x) {
        if (cnt[nk] > 1) {
            cnt[nk]--;
            siz[nk]--;
            return nk;
        }
        if (!l[nk] || !r[nk]) {
            return l[nk] | r[nk]; // Return the non-empty child
        }
        if (rnd[l[nk]] < rnd[r[nk]]) {
            l[nk] = copy(l[nk]); // Copy child before rotating
            rrotate(nk);
            r[nk] = del(r[nk], x);
        } else {
            r[nk] = copy(r[nk]);
            lrotate(nk);
            l[nk] = del(l[nk], x);
        }
        upd(nk); // Update size after deletion
        return nk;
    }
        
    if (val[nk] < x) {
        r[nk] = del(r[nk], x);
    } else {
        l[nk] = del(l[nk], x);
    }
    upd(nk);
    return nk;
}
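The post stops at insertion and deletion. For completeness, here is what the read-only query side can look like: we walk whichever version's root we are given and copy nothing. The helper name `kth` (rank-th smallest) is mine; it only assumes the node layout defined at the top of the article.

```cpp
#include <cassert>

const int N = 1 << 20;
int l[N], r[N], siz[N], cnt[N], val[N];

// rank-th smallest value in the version rooted at k; no nodes are copied.
int kth(int k, int rank) {
    while (k) {
        if (rank <= siz[l[k]]) k = l[k];                    // answer is in the left subtree
        else if (rank <= siz[l[k]] + cnt[k]) return val[k]; // answer is this node's value
        else { rank -= siz[l[k]] + cnt[k]; k = r[k]; }      // skip left subtree and this node
    }
    return -1; // rank out of range (sentinel)
}
```

A driver for a versioned problem then just stores one root per operation: a modification stores the root returned by `insert`/`del`, while a query's version simply aliases the old root.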

Validation

This implementation passes the Luogu P3835 (Template: Persistent Balanced Tree).

Performance Results:

  • Slowest Time: 406ms

  • Peak Memory: 234.81MB

(Full source code is provided in the original post for reference.)


Afterword

Long live the Rotating Treap!!!

The claim that "only FHQ-Treaps can be persistent" is simply incorrect.

Of course, I encourage everyone to continue exploring other "exotic" persistent structures (Persistent Splay, Persistent Scapegoat Tree, Persistent AVL, etc.).


By Furioso_Slient, history, 4 months ago, In English

Introduction

Chinese version: Link

UPD#1 18.12.25: fixed typos and rigor issues; added the randomization idea behind the Treap.

Mention "randomization" and most people's first reaction is probably flipping a coin to bet on YES or NO.

Randomization itself, often seen as an "unorthodox method" (or "dark art"), is generally discarded by teachers in standard educational institutions and is rarely explained in depth.

However, in reality, the power of randomization is beyond your imagination.

Magical Randomization

Randomization essentially means shuffling a set of data to make it lose some of its original properties, or to grant it new properties.

So, what’s the use of these so-called new properties? Let’s taste a few problems:

Example 1: Countering Extreme Data

Suppose you need to implement a sorting algorithm and you pick Quick Sort. Suddenly, a mysterious hacker from Codeforces reads your code, constructs a deranged input that degrades your Quick Sort to $$$O(n^2)$$$, and successfully turns your failed submission into their 100 points. (What a sad story, my child.)

What do you do? Just accept your fate?

We can randomly shuffle the array before sorting. This destroys the adversarial structure: no fixed input can reliably trigger the worst case anymore.

After randomizing once, the expected time complexity of Quick Sort reaches $$$O(n\log n)$$$.
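A minimal sketch of the idea: shuffle first, then run a plain last-element-pivot quicksort (the very variant an adversary would target). All names here are mine; the fixed seed is just for reproducibility.

```cpp
#include <algorithm>
#include <random>
#include <vector>

std::mt19937 rng(12345);

// Plain Lomuto-partition quicksort on a[lo, hi), pivot = last element.
// On a crafted input this is O(n^2); after a shuffle, expected O(n log n).
void quicksort(std::vector<int>& a, int lo, int hi) {
    if (hi - lo < 2) return;
    int pivot = a[hi - 1], i = lo;
    for (int j = lo; j + 1 < hi; ++j)
        if (a[j] < pivot) std::swap(a[i++], a[j]);
    std::swap(a[i], a[hi - 1]);
    quicksort(a, lo, i);
    quicksort(a, i + 1, hi);
}

void randomized_sort(std::vector<int>& a) {
    std::shuffle(a.begin(), a.end(), rng); // destroy any crafted order
    quicksort(a, 0, (int)a.size());
}
```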

Of course, the above is just an example. In a contest, I suggest using the sort function directly. (Would you really write Quick Sort from scratch in a contest?)

Thus, we derive the first property of randomization: Breaking properties related to the original sequence and position, granting it randomness.

Example Problem (Applied in part): CF2164G Pointless Machine

Solution: Link

TL;DR

It is easy to see that if we design queries around the classic binary-encoding ("poisoned wine") idea, we get stuck on a pile of bizarre shapes (full binary trees, chains, star graphs, etc.).

Therefore, we shuffle (randomly permute) the original sequence before querying. Randomizing which points we query effectively disrupts the special graph shapes that would hinder our queries.

Example Problem 2: ULR#3 T1 Storage

Solution

Assume you already know how to get all partial points except for the final subtask.

We find that the bottleneck is determining a ring's characteristic. Consider a greedy definition: the smallest number on the ring is the ring's characteristic, and during a query we jump ahead whenever we meet a number larger than the current one. But against adversarial data, finding the characteristic can still take $$$O(n)$$$ per query, for $$$O(n^2)$$$ total, which will definitely TLE.

So, just randomize it.

And that's it.

The expected complexity becomes $$$O(n\log n)$$$: after randomization the values on a ring are shuffled, constructed data can no longer block you, and the expected number of steps to find the ring's characteristic drops to $$$O(\log n)$$$.

For the specific proof, refer to the solution by Cfz (the official solution).

Example 2: Randomized Property Retrieval

Consider the following problem:

Construct a sequence of length $$$n$$$ such that for any $$$1\le l \lt r \le n$$$, the condition $$$a_l \oplus a_{l+1} \oplus \dots \oplus a_{r-1} \oplus a_r \neq 0$$$ is satisfied.

$$$n \le 10^6$$$.

This problem seems simple. Considering that the XOR sum of a sequence is $$$0$$$ if and only if its prefix XOR sums satisfy $$$P_{l-1} = P_r$$$, I just need to construct $$$n$$$ distinct $$$P_i$$$ values to satisfy the condition, right?

But what if you didn't think of that? Suppose a brutal problem earlier in the set has left you stunned, you assume this one must be difficulty *3500, and you never think in that direction. Or perhaps you simply couldn't figure out how to construct $$$n$$$ distinct $$$P_i$$$ values.

Give up? Retire? No way.

We store the prefix values $$$P$$$ constructed so far. Then we randomly pick a number $$$x$$$ from the allowed value range; if $$$P_{i-1} \oplus x$$$ has not appeared among the stored prefix values, we can take $$$a_i = x$$$. With a hash set each check is $$$O(1)$$$, and since almost every candidate succeeds, the whole construction easily runs in time.
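The draw-and-check loop above can be sketched as follows. The value range $$$[1, 2^{30})$$$ and the function name are my assumptions for illustration.

```cpp
#include <cstdint>
#include <random>
#include <unordered_set>
#include <vector>

// Randomized construction: keep the set of prefix XORs seen so far,
// draw random candidates for a_i, accept the first whose new prefix
// XOR is fresh. Then no subarray can XOR to 0.
std::vector<uint32_t> build_xor_sequence(int n) {
    std::mt19937 rng(987654321);
    std::unordered_set<uint32_t> seen = {0};  // P_0 = 0
    std::vector<uint32_t> a;
    uint32_t pref = 0;
    while ((int)a.size() < n) {
        uint32_t x = rng() % ((1u << 30) - 1) + 1; // candidate in [1, 2^30)
        if (!seen.count(pref ^ x)) {               // new prefix value => accept
            pref ^= x;
            seen.insert(pref);
            a.push_back(x);
        }
    }
    return a;
}
```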

So, we derive the second property of randomization: In cases where the property is known but the construction method is unknown, you can choose to use a random number generator to pick a number and check if it fits the property. (This works best when the majority of numbers satisfy the condition).

Example Problem: [JSOI2016] Anti-prime Sequence

Randomized Solution:

When a sequence is not an anti-prime sequence, there exists a pair of numbers (one odd, one even) whose sum is a prime number.

Consider randomly picking a starting number, then greedily processing the rest (essentially: shuffle the array and insert every number that keeps the sequence valid).

But the first pick might not be optimal, so we can retry several times and keep the best result. It is recommended to use the mysterious clock() trick to squeeze as many attempts as possible into the time limit.

Complexity $$$O(kn^2)$$$, where $$$k$$$ is the number of shuffles.

If you didn't think of the mysterious Bipartite Graph optimizations during the contest, this can truly save your life.

Example Problem 2: [CSP-S 2022] Galaxy (Monte Carlo Algorithm)

Solution

It is easy to find that the perfect timing is when the total out-degree is $$$n$$$, and every point has exactly one outgoing edge; otherwise, it is not.

Maintaining this dynamically directly will definitely TLE.

Consider directly calculating the sum of out-degrees. If the total out-degree is $$$n$$$, it satisfies a necessary condition for the answer.

Consider how to turn this into a (probabilistically) sufficient condition: assign a random weight to every point and maintain the weighted sum

$$$ S = \sum_{i \in V} w_i \times out_i $$$

where $$$w_i$$$ is the random weight of point $$$i$$$ and $$$out_i$$$ is its out-degree.

Every modification updates $$$S$$$ through the affected out-degrees. We then compare $$$S$$$ against the target $$$\sum_{i \in V} w_i$$$, the value $$$S$$$ would take if every node had out-degree exactly $$$1$$$. If they match, output YES; otherwise NO.
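A small sketch of this fingerprint check (struct and method names are mine). Arithmetic is modulo $$$2^{64}$$$; a false positive requires a weighted coincidence among random 64-bit weights, which is vanishingly unlikely.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Maintain S = sum(w[i] * out[i]) under out-degree changes and declare
// "every out-degree is exactly 1" iff S equals sum(w[i]).
struct DegreeFingerprint {
    std::vector<uint64_t> w;
    std::vector<int> out;
    uint64_t S = 0, target = 0;

    DegreeFingerprint(int n) : w(n), out(n, 0) {
        std::mt19937_64 rng(20240101);
        for (int i = 0; i < n; ++i) { w[i] = rng(); target += w[i]; }
    }
    void add_out(int v, int delta) {           // out-degree of v changes by delta
        out[v] += delta;
        S += w[v] * (uint64_t)delta;           // wraps correctly mod 2^64
    }
    bool all_exactly_one() const { return S == target; }
};
```

Note that the unweighted check (total out-degree equals $$$n$$$) would wrongly accept a node with two outgoing edges compensated by a node with none; the random weights rule that out.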

Done.

This problem is quite educational. The essence of this idea is the Monte Carlo algorithm: the running time is deterministic, but there is a small probability that the answer is wrong.

Example 3: Uniform Discretization

Consider the following problem (QOJ14421 Far Away):

Given a graph with $$$n$$$ points and $$$m$$$ edges, perform $$$q$$$ queries on two points to determine if their distance exceeds $$$20000$$$. If two points cannot reach each other, the distance is considered infinite.

$$$n \le 10^5, m \le 3\times10^5$$$.

Want All-Pairs Shortest Path? Sorry, $$$n \le 10^5$$$.

So what now?

Let's ignore connected components for a moment. If the size of a connected component is less than $$$20000$$$, then the distance between any points inside it is definitely less than $$$20000$$$.

Therefore, we can randomly scatter some key points within the connected component, run Single-Source Shortest Path (SSSP) from these points to get the distance to every reachable point. Then, for every query between two points in the component, we judge if the distance exceeds $$$20000$$$ by routing through a key point.

Why scatter the key points randomly? Because uniform random scattering guarantees that, with high probability, every path we care about passes close to some key point, so routing queries through the key points gives the correct verdict in the vast majority of cases.

Also, you will find that hacking is disabled for this problem on QOJ, which hints that the intended solution leans on randomization: the kind of algorithm that targeted data could easily block.

Magical Random Extraction

Suppose you need to construct a bunch of mysterious objects and have no idea how.

If you know, however, that a randomly drawn candidate satisfies the condition with high probability, you can simply draw and check.

Example: Prime Expectation (Las Vegas Algorithm)

First, a conclusion: The probability of drawing a prime number when randomly picking an integer from $$$1 \sim n$$$ with equal probability is approximately $$$\frac{1}{\ln n}$$$.

Consider the following problem:

Find a composite number $$$n$$$ such that the sum of its distinct prime factors equals $$$m-1$$$ ($$$m\le 10^9$$$, $$$m$$$ odd).

Constructing $$$n$$$ directly is obviously unrealistic, but we can simplify: let $$$n$$$ be the product of two distinct primes $$$p$$$ and $$$q$$$ with $$$p+q = m-1$$$.

By Goldbach's conjecture (every even number greater than 2 is a sum of two primes, verified far beyond this range), such a pair of primes must exist, since $$$m-1$$$ is even. But how to find it? $$$10^9$$$ neatly blocks every deterministic search.

So, we use mysterious random numbers: randomly pick $$$p$$$ below $$$m$$$, set $$$q = m-1-p$$$, and check whether both are prime.
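A sketch of this draw-and-check loop with trial-division primality, matching the $$$O(\sqrt{m})$$$ per-check cost quoted below; the function names are mine.

```cpp
#include <cstdint>
#include <random>

// Trial-division primality test, O(sqrt(x)) per call.
bool is_prime(uint64_t x) {
    if (x < 2) return false;
    for (uint64_t d = 2; d * d <= x; ++d)
        if (x % d == 0) return false;
    return true;
}

// Las Vegas search: returns p such that p and (m-1-p) are both prime.
// Assumes m is odd and large enough that such a pair exists (Goldbach).
uint64_t find_prime_pair(uint64_t m) {
    std::mt19937_64 rng(2025);
    while (true) {
        uint64_t p = rng() % (m - 3) + 2;  // p in [2, m-2]
        if (is_prime(p) && is_prime(m - 1 - p)) return p;
    }
}
```

The answer, once returned, is always correct; only the number of iterations is random.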

The essence of this random number thought process is the Las Vegas algorithm: the answer is always correct, but the running time is not determined.

Time complexity: $$$O(k\sqrt{m})$$$, where $$$k$$$ is the number of draws (or, empirically, $$$O(\text{it passes})$$$).

Heuristically, each of $$$p$$$ and $$$m-1-p$$$ is prime with probability about $$$\frac{1}{\ln m}$$$, so we expect to find a valid pair within roughly $$$O(\ln^2 m)$$$ draws. Tested to pass within a one-second time limit. (Proving the exact expectation of $$$k$$$ is no easier than proving Goldbach's conjecture; the author does not aspire to a Fields Medal just yet.)

Of course, you can also use more excellent primality test code (Miller-Rabin) to optimize away the square root.

Magical Hashing

This is not a Hash crash course; I will not explain Hash basics here and assume you know them.

Your learning path for Hashing inevitably goes: learn Hash -> use Hash -> get hacked.

There are always those magical problem setters. To keep their beloved string problems from being bypassed by hash tricks [CENSORED], a setter will ban every modulus and base they have ever known in their life. Even if the setter doesn't troll you, there will always be god-like users who hack your code, turning your score into their 100 points.

My coach once wrote a fixed five-layer Hash to avoid being hacked, thinking he could sleep easy, only to receive a notification that he was Hacked on CF after the contest ended.

So, how to prevent being hacked? This is a good question. Before discussing this, let's look at an example:

Example 1: Hacking Hash: Birthday Paradox / Hash Killer II

Problem Statement:

Original Problem: Given a string of length $$$n$$$, find how many distinct substrings of length $$$l$$$ exist.

You are now given a piece of single-Hash code solving this problem, where the modulus is fixed at $$$10^9+7$$$, but the $$$Base$$$ value is unknown. You need to construct a string that causes a Hash collision in the code to give a wrong answer.

Doesn't it look invincible? You don't even know the $$$Base$$$, how can you hack it?

Let's introduce a paradox:

If there are more than 23 people in a room, the probability that at least two people share the same birthday is not less than $$$50\%$$$.

Sounds outrageous, right? For any fixed pair of people, the probability of sharing a birthday is only $$$\frac{1}{365}$$$, which is clearly far less than $$$\frac{1}{2}$$$, isn't it?

But actually, let's think in reverse: What is the probability that 23 people have distinct birthdays?

For the second person, the probability of avoiding the first person's birthday is $$$\frac{364}{365}$$$; the third must avoid two birthdays, and so on. Multiplying the 22 factors gives:

$$$ P(23) = \prod_{i=1}^{22}\frac{365-i}{365} = \frac{364!}{342!\times 365^{22}} \approx 0.4927 $$$
$$$ 1-P(23) \approx 0.5073 $$$
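The number is easy to verify numerically; a tiny check (function name is mine):

```cpp
#include <cmath>

// Probability that `people` people all have distinct birthdays:
// product of (365 - i) / 365 for i = 1 .. people-1.
double birthday_distinct(int people) {
    double p = 1.0;
    for (int i = 1; i < people; ++i)
        p *= (365.0 - i) / 365.0;
    return p;
}
```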

Surprising, isn't it?

Thus, using the same method, we can summarize a conclusion:

For a single hash with modulus $$$Mod$$$, a collision is expected to occur within about $$$\sqrt{Mod}$$$ random attempts.

So, simply generating a random string of length $$$n$$$ finishes the job.
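The core of the attack can be simulated directly: draw uniform values modulo $$$10^9+7$$$ until two collide. By the birthday bound this takes on the order of $$$\sqrt{Mod} \approx 3\times 10^4$$$ draws, which is why a long random string, with its many substring hashes, breaks a single fixed-modulus hash. The function name is mine.

```cpp
#include <cstdint>
#include <random>
#include <unordered_set>

// Count how many uniform draws mod 1e9+7 it takes until a value repeats.
int draws_until_collision(uint64_t seed) {
    const uint64_t MOD = 1000000007;
    std::mt19937_64 rng(seed);
    std::unordered_set<uint64_t> seen;
    int draws = 0;
    while (true) {
        ++draws;
        uint64_t v = rng() % MOD;
        if (!seen.insert(v).second) return draws; // repeated value => collision
    }
}
```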

Example 2: Not Being Hacked: Hash Killer III

First, you need to know that Hash Killer III is currently an unsolved Open Problem. So, time permitting, using the Hash Killer III method on CF will absolutely not be hacked (otherwise, I suggest nominating that hacker for a Turing Award).

Let's analyze Hash Killer III:

First, Hash Killer III still uses a random $$$Base$$$, but it uses random prime moduli $$$Mod1$$$ and $$$Mod2$$$, becoming a Random Double Hash.

At this point, the conditions needed to trigger the Birthday Paradox become harsh: an error must occur simultaneously under two random moduli, and the probability drops to a negligible level. (Paying respects to the great hzhwcmhf.)

Mentioning Hash Killer III here is not to dwell further on the Birthday Paradox, but to revisit Property 1 of randomization discussed above: scrambling properties.

Randomization makes targeted data ineffective, yet it can itself be countered by randomization (the birthday attack). The repeated verification of double hashing defeats most random constructions, yet with fixed parameters it is easily countered by targeted data (rest in peace, my coach's fixed five-layer hash). Combined, the two cover each other's weaknesses, and most hacking methods stop working.

Let's use the Birthday Paradox to roughly see why a random double hash is hard to attack with random data: the product of the two large prime moduli is about $$$Mod1\times Mod2 \approx 10^{18}$$$, so a collision is expected only after around $$$\sqrt{10^{18}} = 10^9$$$ hash values, while in most problems $$$n$$$ is at most $$$10^6$$$.

That said, no one has proved this scheme unbreakable; it is just that no one has found a killer sequence yet. For OI / ACM purposes, it is more than sufficient.

Magical Data Structures

Note: Since I am a DS (Data Structure) noob, I will only discuss the randomization idea here.

Unexpectedly, data structures can also use randomization.

What? You are a DS master? Excuse me.

First, let's introduce an old cliché problem: For a BST (Binary Search Tree), how do we prevent it from degrading into a chain?

This time we obviously cannot shuffle the operation sequence itself; that would change the answers.

So, we still consider randomization.

We realize that a BST degrades into a chain because the input data is too monotonic.

Since the data is monotonic in value and we can't easily change the values, let's add a random weight.

Consider assigning a random priority to every node upon insertion, and maintain the BST property on values while keeping these priorities organized as a max-heap / min-heap. This effectively prevents degradation.

Essentially, this still utilizes the uniform characteristic of randomization to obtain relatively discrete values, and then controls the shape of the tree based on these values. The expected complexity per query is $$$O(\log n)$$$.

Postscript / Acknowledgments

Writing this can be considered my personal study notes. I will continue to add and enrich the content in the future.

If there are errors, corrections are welcome.

Acknowledgment List:

  • Thanks to @ETO_leader for pointing out issues with rigorousness in expression.
