In C++, Currently, I have seen some code that implements segment trees/balanced trees using Pointers.
So, what are the respective advantages of using Pointers and not using them (constants, debugging difficulty, etc.)? I don't quite understand.
Or is there no difference? It's just a matter of personal style.
Some of the responses I have received so far are conflicting, so I came to the CodeForces community for help.
I would like to express our sincere gratitude to all in advance!








What scenario are you talking about exactly and what are the implementations that don't use pointers? There are two questions I think you could be asking:
Node*vs.intoffset into an array)?The most common scenario is that you want to code a standard segment tree that fits in memory normally and it's not doing anything too special. One way to implement this is to use a static layout, where you can infer the positions of the parent/children of a node purely based on the node's index. For example, you can layout a segtree so that the root is at node
1and nodeihas parenti/2and childreni*2andi*2+1. Another approach is that for each node, you could store two pointers, one to each of its children (and maybe one to its parent if you need it). In this case, you should always pick the static layout option, since it's just better. It uses less memory, since you don't need to store pointers at each node. It's faster, since you can implement iterative segtree, it uses less memory and thus has better caching, and you can compute parent/children with a single arithmetic operation. It gives you greater flexibility, e.g. you can implement iterative segment tree, since you can exploit more properties of the structure, like leaves being atTSZ+iand nodes adjacent by index representing adjacent nodes in the tree. It's also probably shorter to implement. I really don't see why you would want to use pointers in this case.The other scenario is when you have a data structure where you fundamentally need to use pointers (or pointer-like objects). This is the case in a persistent segment tree, an implicit (sparse) segment tree, or a BBST. In this case, the layout of the tree is more complicated, and wouldn't fit in memory if you tried to encode the structure statically like for a normal segment tree. Thus, you need to keep a reference to the two children (and maybe the parent) of the node at each node. However, you can still choose to use a pointer directly or store offsets into an array instead. Using pointers directly would be something like
struct Node {Node *lc, *rc; int val;}, where the left child is*lc, the right child is*rc, andvalis some data you store at each node that's not relevant to the structure. The other option is to store all of your nodes in a contiguous arrayNode mem[MEM_SZ];(whereMEM_SZis the max number of nodes you will even want to allocate) and storelcandrcas integers that index into this array:struct Node {int lc, rc; int val;}, left child ismem[lc], right child ismem[rc]. To allocate a node, you take the next free element inmemand increment a counter by one (int mi = 0; void alloc(){return mem + ++mi;}). Deciding between these two options is actually a lot more interesting than deciding between static layout vs. pointers for the dense segtree case.If you are on a system where
sizeof(int) == sizeof(Node*)(e.g. a system where pointers are32bit), then you should probably use pointers. Directly using pointers is more convenient in my opinion (node->lcis shorter thanmem + node->lcormem[node].lc) and it's also slightly faster, since you are directly dereferencing the pointer instead of having to add it to the base array then dereference. Note that this difference in performance might not be too significant, since x86 architectures support complex addressing modes. This makes it so that in some cases, dereferencing a pointer and indexing an array could take the same amount of time (but other times, dereferencing directly might be a bit faster). For example, a line of C codeint x = arr[i];might effectively get turned into an assembly instructionmov x, DWORD [arr+i*4], which computes the pointer pointing to thei-th element ofarrand loads the value stored at that memory location into a register in a single instruction!However, if pointers are larger, then it is less clear whether it is a better idea to use them as opposed to integer offsets into an array. For example, on most systems, pointers are 64 bits, but using 32-bit integers for indexing into your segment tree/BBST array is sufficient. In this case, using an index has the advantage that is uses less memory, which also makes it faster, since your segment tree might fit in a smaller cache (L1 vs. L2 vs. L3 vs. DRAM). Pointers still have the advantage that dereferencing them is slightly simpler and thus faster than array indexing, and it is generally not obvious whether this difference in speed will be larger than the improvement you get from using less memory. Since I find pointers to be more convenient, I usually implement persistent/implicit segtrees and BBSTs with pointers first, and only switch to indices if I get TLE/MLE.
Also, even if you decide to use pointers, you should still pre-allocate all of your nodes in a large global array and you should basically NEVER use
new/deletein competitive programming. Usingnewto allocate a single node is just slow. You usually know (an upper-bound on) the number of nodes you will use, so even if you really want to use dynamic memory allocation (instead of allocating a global array), allocate all of your nodes at once (e.g. using a vector so you don't have to worry about manually deallocating) so that you only do a single allocation, instead of doing one for every node.TLDR: use static layout whenever possible; if you need pointers, use normal pointers first, then switch to offset into an array if you need to use less memory; and always allocate all nodes of your DS at once instead of calling
newfor each node.wt
You can overload the
new/deleteoperators to use a bump allocator. Maybe this is just as fast as allocating all the nodes you need?Sure, that also works. This is effectively the same as what I suggested (
Node mem[MEM_SZ]; int mi; Node* alloc(){return mem + ++mi;}) just a different (more general) interface.Something really good to know when using pointers is that
std::dequeis very similar to a vector, except that it doesn't invalidate pointers upon push backs (since deque internally stores the data in blocks of a fixed size, which are not moved by push back). I would highly advice to usestd::deque.push_backinstead of manually callingnewand creating memory leaks.Using deques can easily lead to MLE — this actually happened once in (Chinese)NOI, where many participants lost points because they used $$$10^6$$$ deques.
That is a drawback of deque internally storing its data in blocks of fixed size. A deque with a single element takes up a significant amount of memory since it has to allocate one entire block. So the fixed sized block can be both an advantage (pointers are not invalidated on push back, fewer allocations needed) and a disadvantage (single element deque is memory hungry).
Thanks.That's all my need.