Easy and (Semi)Efficient Dynamic Segment Trees (with Policy Hash Tables)

#	User	Rating
1	tourist	3985
2	jiangly	3814
3	jqdai0815	3682
4	Benq	3529
5	orzdevinwang	3526
6	ksun48	3517
7	Radewoosh	3410
8	hos.lyric	3399
9	ecnerwala	3392
9	Um_nik	3392

#	User	Contrib.
1	cry	169
2	maomao90	162
2	Um_nik	162
4	atcoder_official	161
5	djm03178	158
6	-is-this-fft-	157
7	adamant	155
8	awoo	154
8	Dominater069	154
10	luogu_official	150

One of my favorite implementations of segment trees has always been "Easy and Efficient Segment Trees, by Al.Cash. I used to dread segtree problems, but after reading that blog post and adapting a super simple implementation I've gotten a lot better with them. However, there are some types of segtree that you can't implement in that fashion, namely dynamic segtrees and persistent segtrees. See here for criticism. With the advent of policy hash tables, however, one can now implement dynamic segtrees in Al.Cash's style with somewhat comparable performance to a custom dynamic segtree.

Standard segtree

This is how a standard segtree looks like. You can set a single element, and query for ranges. It's nice and simple, and I think it's a great implementation.

int N;
int seg[2 * MAXN];

void modify(int p, int val) {
    for (seg[p += N] = val; p > 0; p >>= 1)
        seg[p >> 1] = seg[p] + seg[p ^ 1];
}

int query(int l, int r) {
    int res = 0;
    for (l += N, r += N; l < r; l >>= 1, r >>= 1) {
        if (l & 1)
            res += seg[l++];
        if (r & 1)
            res += seg[--r];
    }
    return res;
}

Dynamic Segtree

However, say your underlying array had 1e9 possible locations, but it only contained 1e5 elements. For example, take a look at this post. Obviously, you can't store all 2e9 elements in your segtree, so what should you do? Here's one solution, replace the array with a hash table. However, as adamant mentions, unordered_map has too much overhead. We'll be benchmarking against the dynamic segtree provided here. I'll also be using a custom hash function. So to be clear, the implementation now looks like:

Code

int N;
unordered_map<int, int, chash> seg;

void modify(int p, int val) {
    for (seg[p += N] = val; p > 0; p >>= 1)
        seg[p >> 1] = seg[p] + seg[p ^ 1];
}

int query(int l, int r) {
    int res = 0;
    for (l += N, r += N; l < r; l >>= 1, r >>= 1) {
        if (l & 1)
            res += seg[l++];
        if (r & 1)
            res += seg[--r];
    }
    return res;
}

And benchmarking it with 1e5 random insertions and 1e5 random queries.

pointer: 0.171485
unordered_map: 2.0646

Wow. The unordered_map is nearly 12x slower. That's not really feasible for a lot of contests. What if we replace it with a policy hash table, though?

Code

int N;
unordered_map<int, int, chash> seg;

void modify(int p, int val) {
    for (seg[p += N] = val; p > 0; p >>= 1)
        seg[p >> 1] = seg[p] + seg[p ^ 1];
}

int query(int l, int r) {
    int res = 0;
    for (l += N, r += N; l < r; l >>= 1, r >>= 1) {
        if (l & 1)
            res += seg[l++];
        if (r & 1)
            res += seg[--r];
    }
    return res;
}

pointer: 0.202186
policy hash table: 0.384312

Only a 2x decrease in speed. That's already very feasible. However, one might notice that since maps in C++ create elements if you try to access a key that doesn't exist, we're creating a lot of useless elements. Thus, we can simply wrap a check to make sure the element is in the array before we try to access it

EDIT: Updated with dalex's optimization.

gp_hash_table<ll, ll, chash> seg;

ll get(ll x) { return (seg.find(x) == seg.end()) ? 0 : seg[x]; }
void modify(ll p, ll val) {
    for (seg[p += N] = val; p > 0; p >>= 1) {
        seg[p >> 1] = get(p) + get(p ^ 1);
    }
}

ll query(ll l, ll r) {
    ll res = 0;
    for (l += N, r += N; l < r; l >>= 1, r >>= 1) {
        if (l & 1)
            res += get(l++);
        if (r & 1)
            res += get(--r);
    }
    return res;
}

Results (averaged over twenty runs):

2e5 insertions and 2e5 queries

pointer: 0.44085
policy hash table: 0.57878

1e5 insertions and 1e5 queries

pointer: 0.19855
policy hash table: 0.29467

1e4 insertions and 1e4 queries

pointer: 0.014
policy hash table: 0.027

So while we're nearly twice as slow with 1e4 elements and 1e4 queries, we're actually only 30% slower with 2e5 insertions and 2e5 queries.

One more thing. While I'm giving numbers like "30% slower", that's a little bit misleading. If we break down the numbers between insertion/querying, we see this:

2e5 insertions and 2e5 queries Queries:

pointer: 0.41625
policy hash table: 0.15627

Insertions:

pointer: 0.1367
policy hash table: 0.42619

1e4 insertions and 1e4 queries Queries:

pointer : 0.094
policy hash table: 0.007

Insertions:

pointer : 0.0045
policy hash table: 0.0191

So as we see from this more granular breakdown, the Policy Hash Table implementation is actually ~3x faster at querying than the custom implementation, while the custom implementation is roughly ~3x faster at inserting elements.

TL;DR: Using policy hash tables is an extremely easy and fairly efficient method of implementing dynamic segtrees.

Comments (11)

Write comment?

Chilli

6 years ago, # |

Auto comment: topic has been updated by Chilli (previous revision, new revision, compare).

→ Reply

+61

I wrote this blog post about 4 months ago, but I forgot to publish it.

dalex

+28

In get(), find an iterator, compare it with end(), return the number under the iterator. It should be faster because you make only one search in the map.

6 years ago, # ^ |

This makes a small but real improvement. It goes from 0.19 to 0.16 seconds for 2e5 queries on 2e5 elements.

adamant

Wow, that's unexpected and nice. Can you provide full code for benchmarks?

I'll update the blog post with new benchmarks with dalex's improvement, as well as post the benchmark tonight.

I've updated the blog with the new benchmarks. You can find the benchmark code here: https://ideone.com/bdSh9H

MrDindows

← Rev. 4 →

Why do you call extend in getSum?

Let's compare with more efficient version of Dynamic Segtree: https://ideone.com/6wyGFR (or with nums up to 1e18 https://ideone.com/AGnTj2)

← Rev. 3 →

I took my implementation from the link provided; it does seem like your implementation is substantially faster.

There is one thing that's unfair about your second benchmark; the hash function I previously used is pretty bad (especially for 64 bit ints).

Here's an corrected benchmark for your second case: https://ideone.com/sze6wg

Pointer still seems to be better by a factor of 2. Most of that comes from the modification portion still, where policy hash is slower by a factor of 3. On queries, they're roughly equivalent.

I'll update my post to reflect these new numbers later tonight.

So as the main bottleneck for my segtree is currently in the "inserting" elements portion, I wondered whether performance could be significantly improved by preallocating the hash table to the correct size. As it turns out, it can! Although I'm not sure if it's worth it...

The default size allocator starts at 8 and increases by a factor of 2 every time you go over the specified load amount. If you start with say, 1<<24 elements, you no longer need to do the costly size changes.

So with all your changes taken into account (nums up to 1e18, your implementation of a dynamic segtree)

https://ideone.com/sfmQtY

So this is back down to about a 60% constant factor increase. Even better, it no longer performs significantly worse the smaller scale we're looking at. Take 1e5:

https://ideone.com/LpAQFl

If you think this is a fair benchmark, I'll update the original blog post with these implementations.

Nice results, good job!

Chilli's blog

Standard segtree

Dynamic Segtree