demoralizer's blog

By demoralizer, history, 5 years ago, In English

I was implementing a Trie for Google Kickstart earlier today and the code is malfunctioning in a very strange way. I have no idea why this is happening. I narrowed down the problem in a few hours.

#include <bits/stdc++.h>
using namespace std;
#define rep(i,a,b)        for(int i=a;i<b;i++)

struct node{
	int ends=0;
	int child[26]={0};
};

vector<node> trie;
int makenode(){
	trie.push_back(node());
	int x=(int)trie.size() - 1;
	cout<<x<<" ";
	return x;
}

void insert(string &s,int root=0,int pos=0){
	if(pos==(int)(s.size())){
		trie[root].ends++;
		return;
	}
	int next=s[pos]-'A';
	if(!trie[root].child[next]){
		// trie.push_back(node()); trie[root].child[next]=(int)trie.size()-1;    //THIS LINE WORKS FINE
		trie[root].child[next]=makenode();
		cout<<trie[root].child[next]<<"\n";
	}
	insert(s,trie[root].child[next],pos+1);
}

int main(){
	trie.clear();
	trie.push_back(node());
	string s;
	cin>>s;
	insert(s);
	return 0;
}

The function makenode() returns x. When I print x, the values are fine but after returning the values are getting changed.

One would expect the above code to print the same integers per line but the values aren't same.

These are the values of x before and after returning from the function makenode().

Does anybody have any idea why this is happening?

(I have noticed that only the values 1,2,4,8,16,32,etc are wrong)

  • Vote: I like it
  • +29
  • Vote: I do not like it

»
5 years ago, # |
  Vote: I like it +65 Vote: I do not like it

It's undefined behavior: you take a reference to trie[root] before executing makenode(). The function pushes back to a vector, which can then reallocate. Then the reference you've just taken becomes invalid, so you're writing to some freed memory.

  • »
    »
    5 years ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    Oh right! That must be it! Should I reserve extra memory for the vector to avoid this undefined behaviour?

    • »
      »
      »
      5 years ago, # ^ |
        Vote: I like it +21 Vote: I do not like it

      That is precisely the problem. You can see it in detail by compiling your program with g++ -fsanitize=address -fsanitize=undefined -D_GLIBCXX_DEBUG and then noting the following lines on running the executable:

      ==6317==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b000000048 at pc 0x5555555624c1 bp 0x7fffffffd770 sp 0x7fffffffd760
      WRITE of size 4 at 0x60b000000048 thread T0
          #0 0x5555555624c0 in insert(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, int, int) /home/gt/test.cpp:25
          #1 0x5555555628dc in main /home/gt/test.cpp:36
      
      0x60b000000048 is located 8 bytes inside of 108-byte region [0x60b000000040,0x60b0000000ac)
      freed by thread T0 here:
          ...
          #8 0x555555562105 in makenode() /home/gt/test.cpp:12
      
      previously allocated by thread T0 here:
          ...
          #8 0x55555556285c in main /home/gt/test.cpp:33
      

      It explicitly states that there was an attempt to reference a pointer after it was freed.

      With this compilation flags you can mostly avoid such undefined behavior in C++ :)

    • »
      »
      »
      5 years ago, # ^ |
        Vote: I like it +6 Vote: I do not like it

      IMO you should rather avoid this pattern at all costs, it's going to hurt you when you least expect it. The line you commented out is fine, you could also do something like int node = makenode; trie[root].child[next] = node;.

      (But yes, reserving the memory probably would help you in this case.)

»
5 years ago, # |
  Vote: I like it +40 Vote: I do not like it

As far as I remember in c++17, order of evaluation for assignment operator is forced to be rtl, So just upgrading your compiler may help

  • »
    »
    5 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    You're right. It's working normally on C++ 17. Thanks!

  • »
    »
    5 years ago, # ^ |
      Vote: I like it +1 Vote: I do not like it

    On a side note, Kickstart didn't support c++17, was giving CE when trying to use c++17 only features.

    • »
      »
      »
      5 years ago, # ^ |
        Vote: I like it +2 Vote: I do not like it

      I forgot that GCJ switched to format where they compile code themselves.

      PS: It's funny how Google promotes using pre-release clang inside but can't provide 3years old standard

      • »
        »
        »
        »
        5 years ago, # ^ |
          Vote: I like it -7 Vote: I do not like it

        Tbh neither does Atcoder (gcc5 only), while CF just moved on from gcc7 32-bit. Not to mention other languages with highly outdated compilers. The absolute state of compilers on OJ systems.

  • »
    »
    5 years ago, # ^ |
    Rev. 2   Vote: I like it -10 Vote: I do not like it

    One usable feature is fold expression even in C++14.