Algorithm Gym :: Data structures

#	User	Rating
1	Benq	3792
2	VivaciousAubergine	3647
3	Kevin114514	3611
4	jiangly	3583
5	strapple	3515
6	tourist	3470
7	Radewoosh	3415
8	Um_nik	3376
9	maroonrk	3361
10	XVIII	3345

#	User	Contrib.
1	Qingyu	162
2	adamant	148
3	Um_nik	146
4	Dominater069	143
5	errorgorn	141
6	cry	138
7	Proof_by_QED	136
8	YuukiS	135
9	chromate00	134
10	soullless	133

Today I want to introduce you some very very useful data structures.

In this lecture, we are trying to improve your data structures skills, stay with us and click on read more. Important data structures :

Trees

Trees are one of the most useful data structures.A tree is a connected-acyclic graph.There are too many types of trees, like : rooted trees, weighted trees, directed trees, tries, etc.

Partial sum

There are two types of problems solvable by partial sum.

1.Problems which you are asked to answer some queries about the sum of a part of elements (without modify queries).

Solution of all of this problems are the same. You just need to know how to solve one of them.

Example : You are asked some queries on an array a₁, a₂, ...a, _n. Each query give you numbers l and r and you should print a_l + a_l + 1 + ... + a_r .

Solution : You need to build another array s₁, s₂, ..., s_n which s_i = a₁ + a₂ + ... + a_i and answer is s_r - s_l - 1 .

2.Problems which you are asked to perform some queries asking you to modify a part of elements (without printing queries.)

Solution of all of this problems are the same. You just need to know how to solve one of them.

Example : You need to perform some queries on an array a₁, a₂, ...a, _n. Each query give you numbers l, r and v and for each i such that l ≤ i ≤ r you should increase a_i by v, and then after performing all queries, you should print the whole array.

Solution : You should have another array p₁, p₂, ..., p_n which, all of its members are initially 0, for each query, you should increase p_l by v and decrease p_r + 1 by v .

An then, for each i, starting from 1 you should increase p_i by p_i - 1. So, final array would be a₁ + p₁, a₂ + p₂, ..., a_n + p_n .

Hard problem of partial sum : Troynacci Query

Disjoint sets

Disjoint sets are also useful data structures. Using them is fast and easy. We use theme in many algorithms, like Kruskal's and Prim's.

Disjoint sets, or DSU (Disjoint Sets Union) as their name, are sum sets. Imagine we have some boxes and some tools and initially each tool is in one box. Mostly, we are given some queries and ask to merge two boxes or print the members of a box or find which box is some tool in.

For rest of these, let's consider that initially there is exactly one tool in a box.That is, we have n tools and n boxes and initially, tool number i is in box number i.

For this propose, we can use so many containers.Like :

Trees

Trees are the most useful containers for DSU. For each vertex, we keep it's parent (and parrent of the root is -1). So, initially are parents are set to -1, and we have queries to find the root of each box(having the root, we can easily find the box's index) and queries for merging two trees. For better time complexity, every time we want to find the root of each vertex, we set it's parent to the root for the next queries.And while merging, we always want to minimize the height of the tree, so when we want to merge the boxes, it's like we put all the tools of the box with fewer tools in the other box.

The best way I've seen to code this kind of DSU, is style of bmerry : (C++)

int root(int v){return par[v] < 0 ? v : (par[v] = root(par[v]));}
void merge(int x,int y){	//	x and y are some tools (vertices)
        if((x = root(x)) == (y = root(y))     return ;
	if(par[y] < par[x])	// balancing the height of the tree
		swap(x, y);
	par[x] += par[y];
	par[y] = x;
}

In the code above, for each root v, par[v] equals the negative of number of tools in that box.

Arrays, vectors

We keep tools in a vector (or an array) and when we have a query to merge two boxes, we put all the tools of the box with fewer tools in the other box.

The time complexity is good because for each tool, we take and put it in an other box at most log(n) times (each time the size of the vector will be at least doubled).

So time complexity would be O(n.log(n)) .

Sets (red-black trees)

Other way is to keep them in a red-black tree (in C++ it's set). We do exactly like vectors, so time complexity would be O(n.log²(n)) . (One log is for inserting).

Problems : Hamro and tools, TROY Query (Join the group ACM-OI first)

Tries

Tries are some kind of rooted trees in which each edge has a character on it. Actually, trie is some kind of DFA (Determining Finite Automata). For a bunch of strings, their trie is the smallest rooted tree with a character on each edge and each of these strings can be build by writing down the characters in the path from the root to some node.

It's advantage is, LCP (Longest Common Prefix) of two of these strings is the LCA (Lowest Common Ancestor) of their nodes in the trie(a node that we can build the string by writing down the characters in the path from the root to that node).

Generating the trie :

Root is vertex number 0 (C++)

int x[MAX_NUMBER_OF_NODES][MAX_ASCII_CODE], next = 1; //initially all numbers in x are -1
void build(string s){
	int i = 0, v = 0;
	while(i < s.size()){
		if(x[v][s[i]] == -1)
			v = x[v][s[i++]] = next ++;
		else
			v = x[v][s[i++]];
	}
}

Problem : A lot of games

Suffix array

Suffix array is a data structure that helps you sort all the suffixes in lexicography order.

This array consists of integers, the beginning of suffixes.

There are two ways to achieve this goal :

One) Non-deterministic algorithm : Use Robin-Carp and for check if a suffix is lexicographically less than another one, find their LCP using binary search + hash and then check the next character after their LCP.

Code :

namespace HashSuffixArray
{
	const int
		MAXN = 1 << 21;

	typedef unsigned long long hash;

	const hash BASE = 137;

	int N;
	char * S;
	int sa[MAXN];
	hash h[MAXN], hPow[MAXN];

	#define getHash(lo, size) (h[lo] - h[(lo) + (size)] * hPow[size])

	inline bool sufCmp(int i, int j)
	{
		int lo = 1, hi = min(N - i, N - j);
		while (lo <= hi)
		{
			int mid = (lo + hi) >> 1;
			if (getHash(i, mid) == getHash(j, mid))
				lo = mid + 1;
			else
				hi = mid - 1;
		}
		return S[i + hi] < S[j + hi];
	}

	void buildSA()
	{
		N = strlen(S);
		hPow[0] = 1;
		for (int i = 1; i <= N; ++i)
			hPow[i] = hPow[i - 1] * BASE;
		h[N] = 0;
		for (int i = N - 1; i >= 0; --i)
			h[i] = h[i + 1] * BASE + S[i], sa[i] = i;

		stable_sort(sa, sa + N, sufCmp);
	}

} // end namespace HashSuffixArray

Two) Deterministic algorithm : We sort them log(MaxLength) steps, in the i - th step (counting from 0), we sort them according to their first 2ⁱ characters and put the suffixes whit the same prefix with 2ⁱ characters in the same buckets.

Code :

/*
Suffix array O(n lg^2 n)
LCP table O(n)
*/
#include <cstdio>
#include <algorithm>
#include <cstring>

using namespace std;

#define REP(i, n) for (int i = 0; i < (int)(n); ++i)

namespace SuffixArray
{
	const int MAXN = 1 << 21;
	char * S;
	int N, gap;
	int sa[MAXN], pos[MAXN], tmp[MAXN], lcp[MAXN];

	bool sufCmp(int i, int j)
	{
		if (pos[i] != pos[j])
			return pos[i] < pos[j];
		i += gap;
		j += gap;
		return (i < N && j < N) ? pos[i] < pos[j] : i > j;
	}

	void buildSA()
	{
		N = strlen(S);
		REP(i, N) sa[i] = i, pos[i] = S[i];
		for (gap = 1;; gap *= 2)
		{
			sort(sa, sa + N, sufCmp);
			REP(i, N - 1) tmp[i + 1] = tmp[i] + sufCmp(sa[i], sa[i + 1]);
			REP(i, N) pos[sa[i]] = tmp[i];
			if (tmp[N - 1] == N - 1) break;
		}
	}

	void buildLCP()
	{
		for (int i = 0, k = 0; i < N; ++i) if (pos[i] != N - 1)
		{
			for (int j = sa[pos[i] + 1]; S[i + k] == S[j + k];)
			++k;
			lcp[pos[i]] = k;
			if (k)--k;
		}
	}
} // end namespace SuffixArray

(Codes by mukel)

Heaps

A heap is a binary rooted tree (a rooted tree that each node has at most 2 children) and each vertex has a value.

Heap property : Heap usually has a property, like the value of each vertex is equal to or greater than the value of its child(ren) (we call this a max heap). We can use heaps in heap sort.

Fibonacci heaps

A fibonacci heap is a kind of heap with better complexities. We don't need to know what a fibonacci heap is.C++ already has one, priority_queue.

Binary Search Tree (BST)

A binary search tree (BST) is a binary rooted tree that every node has a value, and for each node, the value of every node in its left child's subtree is less than its value and the value of every node in its right child's subtree is greater than that. Usually we perform some queries on BSTs, like inserting, deleting, asking and ... .

Binary search trees are too useful.

Red-black trees

A red-black tree is a kind of BST that after each query, BST will be balanced in such a way that it's height remains O(log(n)).

C++ already has a red-black tree inside, set .

You can read about them in C++ references.

Unfortunately, set has not any function to find the k - th smallest minimum or find the index of an element, bust there is a data structure in C++ with does it in O(log(n))(also contains all set functions), tree :

#include<bits/stdc++.h>
#include<ext/pb_ds/assoc_container.hpp>
#include<ext/pb_ds/tree_policy.hpp>
using namespace __gnu_pbds;
using namespace std;
template <typename T>
using ordered_set = tree<T, null_type, less<T>, rb_tree_tag, tree_order_statistics_node_update>;

int main(){
	ordered_set<int>  s;
	s.insert(1); 
	s.insert(3);
	cout << s.order_of_key(2) << endl; // the number of elements in the s less than 2
	cout << *s.find_by_order(0) << endl; // print the 0-th smallest number in s(0-based)
}

(Thanks to HosseinYousefi for syntax using!)

This works even in C++ 98 !

You can read more about it, just google sgi STL.

SQRT Decomposition

Suppose we have an array a₁, a₂, ..., a_n and $\text{[math]}$ . We partition this array into k pieces each containing k elements of a.

Doing this, we can do a lot of things in $\text{[math]}$ . Usually we use them in the problems with modify and ask queries.

Problems : Holes, DZY Loves Colors, RMQ (range minimum query) problem

Sparse Table

The main problem that we can solve is RMQ problem, we have an array a₁, a₂, ..., a_n and some queries. Each query gives you numbers l and r (l ≤ r) and you should print the value of min(a_l, a_l + 1, ..., a_r) .

Solving using Sparse Table : For each i that 1 ≤ i ≤ n and for each j that 0 ≤ j and i + 2^j - 1 ≤ n, we keep the value of min(a_i, a_i + 1, ..., a_{i + 2^j - 1}) in st[i][j] (preprocess) : (code is 0-based)

for(int j = 0;j < MAX_LOG;j++)
	for(int i = 0; i < n;i ++)if(i + (1 << j) - 1 < n)
		st[i][j] = (j ? min(st[i][j-1], st[i + (1 << (j-1)) - 1][j-1]): a[i]);

And then for each query, first of all, find the maximum x such that 2^x ≤ r - l + 1 and answer is min(st[l][x], st[r - 2^x + 1][x]) .

So, the main idea of Sparse Table, is to keep the value for each interval of length 2^k (for each k).

You can use the same idea for LCA problem and so many other problems.

So preprocess will be in O(n.log(n)) and query will be in O(1)

Problems : Strip, GCDSSQ, LCM Query .

Heavy light decomposition

Heavy light decomposition is a way to partition a tree's vertices (or edges) in a good way.

In this kind of decomposition, we have some chains, and each vertex belongs to only one chain.

If vertex v is the parent of u size_of_subtree_of(v)/2 < size_of_subtree_of(u), u and v are in a chain and we call the edge uv, heavy, otherwise light.

There is at most one such child for each vertex v. If we consider the path from any vertex v to the root, there will be at most log(n) light edges there (go from v to the root, every time we see a light edge, size of subtree will be at least doubled). So, the number of chains on the way = O(log(n)) .

In each of these chains, we can contain a container or another data structure like segment tree or etc.

Problem : GRASS PLANTING

Fenwick

Suppose that we have n elements numbered from 1 to n.

Fenwick or BIT(Binary Indexed Tree) is a data structure with n nodes that node number i has some information about elements in the interval (i - i& - i, i] .

Actually, you don't need to know what each node contains. The only thing you should know, it this (then you can change and convert it) :

We have an array a₁, a₂, ..., a_n and all of them are initially 0. We are gives some queries, 1.increase a_p by val and print a₁ + a₂ + ... + a_p .

Only thing you should know is that how to solve this problem using Fenwick (and then you can change it and solve so many problems).

We perform each query in O(log(n)). Code : (1-based)

int fen[MAX_N];
void update(int p,int val){
	for(int i = p;i <= n;i += i & -i)
		fen[i] += val;
}
int sum(int p){
	int ans = 0;
	for(int i = p;i;i -= i & -i)
		ans += fen[i];
	return ans;
}

Please note that it should be 1-based. It can't be done 0-based.

Problems : Inversions, Pashmak and Parmida's problem, BST .

Segment tree

We have an array of elements and some queries on intervals. So, we will be glad if we can split this interval to O(log(n)) intervals that we have actually some information about them.

Segment tree does that for you. Segment tree is a tree that each of it's nodes belongs to an interval.

Root of the tree belongs to the interval [0, n) (0-based).

Each node, has 0 or two children. Left and right. If a node's interval is [l, r) and l + 1 ≠ r, the interval of its children will be [l, mid) and [mid, r) in order where $\text{[math]}$ , so the height of this tree will be O(log(n)) .

Each node has an index, we consider that root has index 1 and the children of a vertex x will have indices 2x and 2x + 1 in order.

Segment tree is the most useful data structure and every problem solvable by Fenwick is also solvable by Segment tree.

If the size of the root's interval is n, segment tree could have up to 4n nodes.

To split an interval to some nodes of this tree, we will act like this :

Suppose that S is the set of nodes which their union is [x, y) and no two different nodes in S have nonempty intersection.

A node i with interval [l, r) is in S if and only if x ≤ l ≤ r ≤ y and if it has a parent with interval [b, e), x > l or r > y .

C++ code :

vector<int> s;
void split(int x,int y, int id = 1,int l = 0, int r = n){//	id is the index of the node
	if(x >= r or l >= y)	return ;	// in this case, intersect of [l,r) and [x,y) is empty
	if(x <= l && r <= y){
		s.push_back(id); 
		return ;
	}
	int mid = (l+r)/2;
	split(x,y,id * 2,l,mid);
	split(x,y,id * 2 + 1,mid,r);
}

Example :

We have an array a₁, a₂, ..., a_n and q queries. There are 2 types of queries.

S l r, Print a_l, a_l + 1, ..., a_r
M p x, Modify a_p to x, it means a_p = x .

First of all we need to build the segment tree, for each node we keep the sum of its interval, for node i we call it s[i], so we should build the initial segment tree.

void build(int id = 1,int l = 0,int r = n){
	if(r - l < 2){	//	l + 1 == r
		s[id] = a[l];
		return ;
	}
	int mid = (l+r)/2;
	build(id * 2, l, mid);
	build(id * 2 + 1, mid, r);
	s[id] = s[id * 2] + s[id * 2 + 1];
}

So, before reading the queries, we should call build() .

Modify function :

void modify(int p,int x,int id = 1,int l = 0,int r = n){
	s[id] += x - a[p];
	if(r - l < 2){	//	l = r - 1 = p
		a[p] = x;
		return ;
	}
	int mid = (l + r)/2;
	if(p < mid)
		modify(p, x, id * 2, l, mid);
	else
		modify(p, x, id * 2 + 1, mid, r);
}

(We should call modify(p, x))

Ask for sum function :

int sum(int x,int y,int id = 1,int l = 0,int r = n){
	if(x >= r or l >= y)	return 0;
	if(x <= l && r <= y)	return s[id];
	int mid = (l+r)/2;
	return sum(x, y, id * 2, l, mid) +
	       sum(x, y, id * 2 + 1, mid, r);
}

(We should call sum(l, r))

Lazy propagation

Imagine we have updates on intervals, what should we do ?

Example :

We have an array a₁, a₂, ..., a_n and q queries. There are 2 types of queries.

S l r, Print a_l, a_l + 1, ..., a_r
I l r x, for each i such that l ≤ i < r, increase a_i by x.

We shouldn't update all the nodes in this interval, just the maximal ones, then pass it to children when we need. This trick is called Lazy Propagation, so we should have another array lazy (for nodes) which are initially 0 and every time we want to perform increase query, increase lazy[id] with x.

As above, we also should have an array s for nodes.

So, build function will be same as above. But we need some more functions :

A function to update a node :

void upd(int id,int l,int r,int x){//	increase all members in this interval by x
	lazy[id] += x;
	s[id] += (r - l) * x;
}

A function to pass the update information to its children :

void shift(int id,int l,int r){//pass update information to the children
	int mid = (l+r)/2;
	upd(id * 2, l, mid, lazy[id]);
	upd(id * 2 + 1, mid, r, lazy[id]);
	lazy[id] = 0;// passing is done
}

A function to perform increase queries :

void increase(int x,int y,int v,int id = 1,int l = 0,int r = n){
	if(x >= r or l >= y)	return ;
	if(x <= l && r <= y){
		upd(id, l, r, v);
		return ;
	}
	shift(id, l, r);
	int mid = (l+r)/2;
	increase(x, y, v, id * 2, l, mid);
	increase(x, y, v, id*2+1, mid, r);
	s[id] = s[id * 2] + s[id * 2 + 1];
}

(We should call increase(l r x))

A function to answer to queries asking about the sum :

int sum(int x,int y,int id = 1,int l = 0,int r = n){
	if(x >= r or l >= y)	return 0;
	if(x <= l && r <= y)	return s[id];
	shift(id, l, r);
	int mid = (l+r)/2;
	return sum(x, y, id * 2, l, mid) +
	       sum(x, y, id * 2 + 1, mid, r);
}

(We should call sum(l, r))

Problems : GSS1, GSS3, MULTQ3, DQUERY, KQUERY, POSTERS, PATULJCI, New Year Domino, Copying Data, DZY Loves Fibonacci Numbers, FRBSUM

Persistent data structures

Consider we have some elements, you perform some updates on it and then,and after performing all of them, you want to have the information about the elements, after each update.

For this propose, you got a data structure and somehow, you save the version of that data structure.

The most useful data structure for this propose is segment tree, I will explain persistent segment tree and all other data structures (like Fenwick) are like that.

Persistent segment tree

Example problem :

We have an array a₁, a₂, ..., a_n and at first q update queries and then u ask queries which you have to answer online.

Each update query gives you numbers p and v and asks you to increase a_p by v .

Each ask query, gives you three numbers i and x and y and asks you to print the value of a_x + a_x + 1 + ... + a_y after performing i - th query.

Each update query, changes the value of O(log(n)) nodes in the segment tree, so you should keep rest of nodes (not containing p) and create log(n) new nodes. Totally, you need to have q.log(n) nodes. So, you can not use normal segment's indexing, you should keep the index of children in the arrays L and R.

If you update a node, you should assign a new index to its interval (for i - th query).

You should keep an array root[q] which gives you the index of the interval of the root ( [0, n) ) after performing each query and a number ir = 0 which is its index in the initial segment tree (ans of course, an array s[MAX_NODES] which is the sum of elements in that node). Also you should have a NEXT_FREE_INDEX = 1 which is always the next free index for a node.

First of all, you need to build the initial segment tree :

(In these codes, all arrays and queries are 0-based)

void build(int id = ir,int l = 0,int r = n){
	if(r - l < 2){
		s[id] = a[l];
		return ;
	}
	int mid = (l+r)/2;
	L[id] = NEXT_FREE_INDEX ++;
	R[id] = NEXT_FREE_INDEX ++;
	build(L[id], l, mid);
	build(R[id], mid, r);
	s[id] = s[L[id]] + s[R[id]];
}

(So, we should call build() )

Update function : (its return value, is the index of the interval in the new version of segment tree and id is the index of old one)

int upd(int p, int v,int id,int l = 0,int r = n){
	int ID =  NEXT_FREE_INDEX ++; // index of the node in new version of segment tree
	if(r - l < 2){
		s[ID] = (a[p] += v);
		return ID;
	}
	int mid = (l+r)/2;
	L[ID] = L[id], R[ID] = R[id]; // in case of not updating the interval of left child or right child
	if(p < mid)
		L[ID] = upd(p, v, L[ID], l, mid);
	else
		R[ID] = upd(p, v, R[ID], mid, r);
	return ID;
}

(For the first query (with index 0) we should run root[0] = upd(p, v, ir) and for the rest of them, for j - th query se should run root[j] = upd(p, v, root[j - 1]) )

Function for ask queries :

int sum(int x,int y,int id,int l = 0,int r = n){
	if(x >= r or l >= y)	return 0;
	if(x <= l && r <= y)	return s[id];
	int mid = (l+r)/2;
	return sum(x, y, L[id], l, mid) +
	       sum(x, y, R[id], mid, r);
}

(So, we should print the value of sum(x, y, root[i]) )

Problems : Sign on Fence, MKTHNUM, COT, The Classic Problem

inline void add(int st,int en,int v,int l=0,int r=n,int node=1){ s[node]+=v*(en-st); if(l+1==r) return; else if(st>=mid(l,r)) add(st,en,v,mid(l,r),r,(node<<1)+1); else if(en<=mid(l,r)) add(st,en,v,l,mid(l,r),node<<1); else add(st,mid(l,r),v,l,mid(l,r),node<<1),add(mid(l,r),en,v,mid(l,r),r,(node<<1)+1); }

#define mid(x,y) (x+y)/2 inline int sum(int st,int en,int l=0,int r=n,int node=1){ if(l==st && r==en) return s[node]; if(st>=mid(l,r)) return sum(st,en,mid(l,r),r,(node<<1)+1); if(en<=mid(l,r)) return sum(st,en,l,mid(l,r),node<<1); return sum(st,mid(l,r),l,mid(l,r),node<<1)+sum(mid(l,r),en,mid(l,r),r,(node<<1)+1); }

public boolean search(String word) { int v = 0; for(int i = 0; i < word.length(); i++) { v = x[v][word.charAt(i)]; if(v == -1 ) return false; } // THIS WON"T WORK return true; }

Comments (102)

Show archived | Write comment?

Kei.One

11 years ago, hide # |

Thank you for this Useful toturial

→ Reply

Majid

-11

thanks :)

nic11

← Rev. 3 →

+27

Thanks for the article, especially for problems!

I didn't know that priority_queue is a Fibonacci heap. BTW, are you sure? cplusplus.com and cppreference say that push() works in logarithmic time.

P.S.1 This link (Hamro and tools problem in DSU section) gives access error: link
P.S.2 Isn't "sparse" a correct spelling?

nitishch

11 years ago, hide # ^ |

As he said, you have to register youself in the ACM-OI group.

edogrigqv2

Great intro. Still have 1 question. Does order_set works only in g++? How it is going to work in Visual C++ ?

adamant

Does order_set works only in g++

Yes. It's part of SGI STL extensions of G++.

+16

Btw, tree isn't part of C++ standard, it is extension of GNU C++. You can read more about this data structures here and here :)

Also about persistent data structures. One can be interested in this: #TphcLk
(k-th order statistics in the array segment using persistent bit trie. $\text{[math]}$ . Fully online, works with negative numbers also!)

Also you can use this structure to answer all queries from this problem.

speedy03

How to apply your code to the XOR query please?

My full solution of Xor Queries: #QI8sxL

Xor query can be done greedy — we iterate through implicit bit trie of l..r subsegment and each time we try to take k-th bit in answer which is not equal to k-th bit in query.

Got it, thanks! The official solution is also based on trie. Any specific advantages over segment tree?

Actually segment tree is a kind of trie... Trie just a bit more generic

The is probably mistake in DSU implementation. par[x] += par[y] ??? What is this ?

PrinceOfPersia

Read it !

So, par[x] = -sizeofbox(x), par[y] = -sizeofbox(y). so, par[x] + par[y] = -sizeofbox(x unuion y).

Have I understood it correct now ? par[v] shows the parent of v, if v is not the root, otherwise it shows negative number of nodes in group. 2 in 1 array!!!

Yep !

natsukagami

Actually there's an official name for the technique. It's called union by rank. More info can be found here.

sheri.mori

Why are some comments in blue rectangles in codeforces (like the comment above) ?

VKundas

It is new comments that you hadn't read before. If you update wepbage it'll be like any other comment

OK tnx :-)

ikbal

+25

I thought c++ priority queues are binary heap. As far as I remember Fibonacci heaps has huge constant factor. Could you share source where you get this information?

yeputons

+24

cplusplus.com thinks that priority_queue is just a wrapper around push_heap/pop_heap/make_heap, which works with standard binary heap in a random access container and have logarithmic complexity.

+21

In my Visual C++ 2013 header of I have found make_heap and push_heap that use the same vector.

	void push(const value_type& _Val)
		{	// insert value in priority order
		c.push_back(_Val);
		push_heap(c.begin(), c.end(), comp);
		}

Maybe g++ uses Fibonacci heap. But it is no more part of STL. cplusplus.com usually hints if in different compilers something maybe different. Here it only states that priority_queue uses make_heap, push_heap, and pop_heap.

I also have found this.

oversolver

Fenwick can be 0-based! Change i+=i&-i to i|=i+1 and change i-=i&-i to i=(i&(i+1))-1

Swistakk

← Rev. 2 →

Why would I want to shoot in my foot :P?

Because e-maxx says so :)

+40

Fenwick can be 12-based. i+=i&-i -> i += ((i-11)&-(i-11)) :P

OmaeWaMouShenDeiru

just in time :D thanx

meijun

← Rev. 4 →

At "Arrays, vectors", you said, "So time complexity would be O(n.log(n)) ."

However, I think the time complexity is O(n). The count of copied item is (1 + 2 + 4 + ... + N) where N < n and N = 2^k k is non-negative integer. So the count of copied item is (N * 2 — 1), if original array-modify involved, the count is (N * 2 — 1 + n). Time complexity is O(n).

tom

Yes, and we can simply do vector.reserve(N) and problem disappears at all.

n = O(n.log(n)), I didn't say Θ(n.log(n))

RNR

8 years ago, hide # ^ |

And I think, the complexity will be O(nlogn) in cases where the merge operation is done like merge sort? And if we consider the merging two sets of size s1 and s2 takes O(s1+s2), is this the case?

In arrays, vectors what exactly we are doing can you please elaborate? Are we trying to implement Union-find DS using arrays or vectors? Are we using a vector at each index? So that at max each vertex has to placed O(logn) times? Thank you.

In case 2 of partial sum, can you apply your formula to binary index tree if sum within the range [l,r] is queried?

Yes. Just increase the value of a_l by v and increase the value of a_r + 1 by - v (as I said in Fenwick).

erikgrabljevec5

-9

Amazing! Thank you so much.

AtomRush

How to solve inversion count using BIT

For every number check how many numbers are before it? To do this, add numbers to BIT sorted from the largest to smallest.

Shayan-To

Can you explain more?

cristi.dospra

-8

Thank you very much for the great tutorial! :D

_jairsaidds_

Dou you have any example in code of TRIES?

Thanx.

xpertcoder

Refer this and this.

Sonechko

sachithg

Nice blog/tutorial.. it would be useful for beginners as they will get an idea of what all to study.... Would have been nice if something like this was there when I started...

vaishious

Good stuff. Cheers

buGMaster

Hi!
I don't get how suffix array (deterministic version) works.
Could you please give me some more clear description about it?
What does "tmp" store? It seems it contains something like [0,1,...,N-1], doesn't it?!
What about "pos"?
What's the initialization of "tmp"?
Thanks...

http://mirror.codeforces.com/blog/entry/16541

CrazzyBeer

← Rev. 11 →

Speaking of DSU, here's the same function, but in a more understandable way.

int root(int v){ 
  while (par[v]!=-1) {
     par[v]=par[par[v]];
     v=par[v];
  }
  return v;
}

UPD: This is a way to compress paths, but not as efficient as the presented algorithm.

Koderok

If you observe carefully, the code actually does update all the nodes in the path which is what you want to do.

Reyna

the actual code compresses better All of the nodes parent will be the root but in crazzybeer's code it doesnt work also there is a closed bracket which leads to infinite loop

Mistake fixed. Now it should work. (Actually, this is the algorithm from Coursera so it works pretty well)

It works, but, I guess, it does a bit less work than the presented algorithm.

I see now. Thanks.

aaa2333333

Do you actually want to express this idea?

int root(int v)	{
	int r=v,t;
	while(par[r]!=-1)	r=par[r];
	while(v!=r)	t=par[v],par[v]=r,v=t;
	return r;
}

ypizarroza

Thank you for this Useful toturial.

suraj021

What does the modify( p, x ) function do in the Segment tree section ?

anthonyhkf

"Modify a_p to x, it means a_p = x."

"The only thing you should know, it this (then you can change and convert it) :" I think "it" has to be "is",right?

dev_il

Can please someone explain Partial sum part 2? What is the problem it solve? We need to find sum of elements of array a with indexes [l, r) ?

HitmanBBa

Thanks PrinceOfPersia for this very very useful article. :)

Note: please edit line 1 after R/B Tree image bust --> but. :)

Inactivated

how can multiple assignment modifications be done this way?please suggest an idea ,source code bingo !!

singh_iitian

Is there a way to solve a problem which is same as partial sum instead of adding value, we need to change values in the interval [l,r] to v. Is there any good solution for this.

radoslav11

Segment Tree + Lazy Propagation

slycelote

+13

Astrologers proclaimed the Week of CodeChef Challenge. Amount of similar questions doubled.

Adkay

In The persistent data structure function we should have a

s[id] = s[L[id]] + s[R[id]]; before the return statement

eshaankuls25

If the size of the root's interval is n, segment tree could have up to 4n nodes.How can we justify it? When I constructed an example, I got 2n-1 nodes.In what cases it will have 4n nodes.

allllekssssa

Sorry, I didn't explain well in the previous post.

First, interval tree is balanced binary tree. So first level has 1 node, second 2, third 4... i-th level has 2^(i-1) nodes.

If your interval has lenght n, you must add some extra elements till n is not a power of two ( condition for balanced tree). Now the last level has m nodes (m=2^k) and higer levels have 2^(k-1) nodes, 2^(k-2) nodes... Sum of all nodes is 2^(k+1)-1. That is equal with 2m -1 nodes. The worst case is when n=2^x+1 and in that case you should have about 4n nodes, in best case if n=2^x you will have 2n-1 nodes.

I hope that now everything is clear.

Arpa

thank you for this useful algorithms.

is there any other posts like this?

in Lazy propagation you had a function shift and function update.

what if we compare them?

like this:

and this is sum function:

jaswanthi

10 years ago, hide # |

In the given Trie code ( as posted by PrinceOfPersia) , It is only possible to search for prefix, How can we search if the word exist or not ?

Diego1149

10 years ago, hide # ^ |

You can add a boolean array, call it something like ends and initialize it to false, for each word you insert you set ends to true only in the last character of the word, so in your search method change the "return true" line to "return ends[v]". If you insert the word partition, if you look for part, it will return false, instead of true (which is what your code is doing).

aka.Sohieb

You put this problem PATULJCI after explaining the segment tree with lazy propagation and before (Persistent).. but I can not solve it and my friend told me he solved it using (Persistent) segment tree!!!

could you please explain how to solve it without using Persistent ?!

I have no idea how to solve it with persistent!!!

This code is my friend's solution ( OmarHashim ) using persistent,

but i wanna know how to solve it without using persistent, could you please explain or give a hint on how to solve it?!!

UPD I got a O(N log N) solution with no segment tree at all, thanks :)

anh1l1ator

Hello, can you explain what are you doing , I have solved the problem using persistent segment tree(It was trivially a modification to MKTHNUM problem).

cntBit[i][j] => means how many elements from 1 to i having 1 in the jth bit.

you can see it's easy to construct that table in O(nlogn) time.

then when I get query [L,R] I iterate throw bits and see if there is more than (R-L+1)/2 elements having this bit 1, if so I set it, when I done I have a number every bit in it exists in more than siz/2 elements. so it is easy to notice that either this is the answer or there is no answer at all, to check if it is the answer or not, I count how many occurrences of this number in the interval [L,R] using binary search.

closedacc

why adding -1 to i + (1 << (j-1)) in the code of sparse table?

Kole

I'll try to explain on small example.

Let's say that you have array of 10 elements, and you are on 3rd. If you want to increase 5 consecutive starting from position 3 you will have to increase elements on following positions: 3, 4, 5, 6, 7. You'd probably say 'ok, from 3 to 3+5=8', but that is not correct since 3 is considered to be one of these 5 consecutive elements.

Same is with sparse table. If you want to take 2^k consecutive elements starting from position i, last element is i+2^k-1 which is i+(1<<k)-1.

Thomas_94

I believe it is an error. Let say i=4, j=1.

st[i][j] = min(st[i][j-1], st[i + (1 << (j-1)) - 1][j-1]);
st[4][1] = min(st[4][0], st[4][0]);

cipher0

wonderful toturial:) can u please write one about DP?

bhargav104

Can somebody please explain how 444C - DZY Loves Colors is solved with sqrt decomposition? PrinceOfPersia

Frez

9 years ago, hide # |

Hi! I have read that complexity of union find is O(nlog*(n)) but you have written that it's complexity is nlog(n).am i missing something?! (Can anybody help?!)

eightnoteight

9 years ago, hide # ^ |

it depends on the implementation, you will get O(nlog*(n)) if you do path compression, but he just mentions height balancing so the complexity will be O(nlog(n))

mishi_gun

Can any body tell me the condition when we can use lazy propagation on segment tree and when not ??

miro

In persistent segment tree update function, i think you forgot this line of code at the end :

s[ID] = s[L[ID]] + s[R[ID]];

nbz_11

Why is it written as s[id] += (r — l) * x;

I am wondering why is the r-l factor there? Can someone help please

nima10khodaveisi

(r — l) is the length of segment [l , r) ! when we add x to each element in this segment the sum of this segment increase by (r — l) * x ! okay ?

Ghost0fSparta

8 years ago, hide # |

In DSU implementation using arrays/vectors why will the size of set double every time? Example: if I have an array a[] = {a,b,c,d,e,f} and I put queries like merge(a,b), merge(b,c), merge(c,d), merge(d,e) merge(e,f) clearly it takes n-1 steps and so the size of array will increase by 1 in every step.

Also we are adding more tools to less tools to keep the height of tree less, correct me if I am wrong...

Got the answer to first question, the the author is taking about each tool.

Enigma27

There should be this line in the end of the update function in persistence segment tree. s[ID] = s[L[ID]] + s[R[ID]];

brdy

http://www.usaco.org/index.php?page=viewproblem2&cpid=576

Prefix sums style range modification on trees

Is there somebody who knows if it's better to merge the DSUs by height instead of by size? I think the height is more important because I think the complexity of the operating "finding the root" has more to do with the height of the tree.

void merge(int x,int y)
{
	int fx=find(x),fy=find(y);
	if(depth[fx]>depth[fy])
		father[fy]=fx;
	else
	{
		father[fx]=fy;
		if(depth[fx]==depth[fy])
			depth[fy]++;
	}
}

harshhx17

7 years ago, hide # ^ |

In a lot of questions, we need the size of the sets.

Hence for the template, i would personally prefer 'union by size' over 'union by rank'.

MonsterInMe

after 3 years,now its 2018,and this blog is still read by many...thank you PrinceOfPersia

Rahul

Thank you so much for this useful tutorial. Especially that DSU implementation — In the code above, for each root v, par[v] equals the negative of the number of tools in that box., love the way the how par doubles as the size/rank array, requiring only one array. Saved it as one of my snippets :)