Блог пользователя Andycode3759

Автор Andycode3759, история, 2 года назад, По-английски

So here is the thing: recently I was doing some sustainable data structure problems while I found that the behavior of vector seems to vary from C++ standard versions. I did a little experiment:

#include <vector>
#include <cstdio>
using namespace std;
const int MAXN = 1000006;

int n;
int arr[MAXN];

struct SegTreeNode
{
    SegTreeNode(int _l = 0, int _r = 0, int _v = 0)
    {
        val = _v;
        ls = rs = -1;
        sl = _l, sr = _r;
    }
    int val;
    int ls, rs;
    int sl, sr;
};

vector<SegTreeNode> sgt;
int root;

inline int newNode(int sl, int sr, int val)
{
    sgt.push_back(SegTreeNode(sl, sr, val));
    return sgt.size() - 1;
}
int build(int l, int r)
{
    int t = newNode(l, r, 0);
    if (l == r)
    {
        sgt[t].val = arr[l];
        return t;
    }
    int mid = (l + r) >> 1;
    sgt[t].ls = build(l, mid);
    sgt[t].rs = build(mid + 1, r);
    sgt[t].val = sgt[sgt[t].ls].val + sgt[sgt[t].rs].val;
    return t;
}
int query(int idx, int l, int r)
{
    if (l <= sgt[idx].sl && sgt[idx].sr <= r)
        return sgt[idx].val;
    int mid = (sgt[idx].sl + sgt[idx].sr) >> 1;
    int res = 0;
    if (l <= mid)
        res += query(sgt[idx].ls, l, r);
    if (r > mid)
        res += query(sgt[idx].rs, l, r);
    return res;
}

int main()
{
    scanf("%d", &n);
    for (int i = 1; i <= n; i++)
        scanf("%d", arr + i);
    root = build(1, n);
    int l, r;
    scanf("%d %d", &l, &r);
    printf("%d\n", query(root, l, r));
    for (int i = 1; i <= n; i++)
        printf("%d ", query(root, i, i));
    printf("\n");
    for (SegTreeNode nd : sgt)
        printf("val=%d [%d,%d] ls=%d rs=%d\n", nd.val, nd.sl, nd.sr, nd.ls, nd.rs);
    return 0;
}

It's just a basic segment tree with dynamically allocated nodes. I compiled the same code in both C++14 and C++17, and I got different output from two programs:

Input:
5
1 2 3 4 5
1 4

C++14:
8
4 4 4 4 4
val=8 [1,5] ls=-1 rs=-1
val=4 [1,3] ls=-1 rs=5
val=2 [1,2] ls=3 rs=-1
val=1 [1,1] ls=-1 rs=-1
val=2 [2,2] ls=-1 rs=-1
val=3 [3,3] ls=-1 rs=-1
val=8 [4,5] ls=7 rs=-1
val=4 [4,4] ls=-1 rs=-1
val=5 [5,5] ls=-1 rs=-1

C++17:
10
1 2 3 4 5
val=15 [1,5] ls=1 rs=6
val=6 [1,3] ls=2 rs=5
val=3 [1,2] ls=3 rs=4
val=1 [1,1] ls=-1 rs=-1
val=2 [2,2] ls=-1 rs=-1
val=3 [3,3] ls=-1 rs=-1
val=9 [4,5] ls=7 rs=8
val=4 [4,4] ls=-1 rs=-1
val=5 [5,5] ls=-1 rs=-1

Only the result of C++17 is correct. I also modified the program to use static arrays and the correct output appears in both C++14 and C++17. So I'm confident that the vector in C++14 is to blame.

Can anyone explain why it happens? I tried to compile the source into assembly code and I did found that there are differences, but I can't understand them: dynamic-segtree-cpp14.asm, dynamic-segtree-cpp17.asm. Do they provide clearer evidence?

Теги c++
  • Проголосовать: нравится
  • 0
  • Проголосовать: не нравится

»
2 года назад, скрыть # |
 
Проголосовать: нравится 0 Проголосовать: не нравится

Although I don't fully understand the internal code of STL, I have encountered problems similar to yours. Generally speaking, vector may apply for new memory when changing its size, which will cause the previous address to become invalid. I think code like "sgt[t].ls = build(l, mid);" may be the culprit. It first obtains an address in the vector, and then enters the build function, which causes the vector to expand and the address Invalid. An error occurs, which explains why static arrays are correct. As for why C++17 is correct, I don't know much about it. It may be that the standard stipulates the execution order of assignment statements, or something else. But I suggest it's better not to write such code in CP. Hope this helps! sr for my poor english, used google translate :P

»
2 года назад, скрыть # |
 
Проголосовать: нравится 0 Проголосовать: не нравится

c++17 added a lot of extra sequence points in evaluation order (https://en.cppreference.com/w/cpp/language/eval_order — in this case specifically rule 19).

In C++14 your program has undefined behaviour because there isn't any ordering requirement between evaluating where to store the result (sgt[t].ls) and evaluating the result to store there (build(l, mid)).

  • »
    »
    2 года назад, скрыть # ^ |
    Rev. 3  
    Проголосовать: нравится 0 Проголосовать: не нравится

    I'd say that even in C++17 this is undefined behavior, because build is called after evaluation of reference to node and reference can be lost because of reallocation during buildNode

    Edit: sure static arrays would work, because there is no reallocation.

    Edit 2: stupid me, of course in C++17 left side of assignment is strictly after right side.