### NAINAAAA's blog

By NAINAAAA, history, 4 weeks ago,

HELLO CF community i'm wondering can we solve below problem w/o using map in O(n)? constraint n < 1e5 & |a[i]| < 1e9

Given an array of integers, find the first repeating element in it. We need to find the element that occurs more than once and whose index of first occurrence is smallest.

• +7

 » 4 weeks ago, # |   +9 jo sanjh khwab dekhte the
•  » » 4 weeks ago, # ^ |   +13 naina
•  » » » 4 weeks ago, # ^ |   +8 bichad ke aaj ro diye hain yun
•  » » » » 4 weeks ago, # ^ |   0
 » 4 weeks ago, # |   -18 Yes, of course it can be done. Here's an idea for writing: You can save each element of the first array along with its index in an array of pairs. Then we have the Radix Sort algorithm, which sorts in O(n), with which we will sort the array of pairs. Note that so far our asymptotics are O(n). Now we will maintain the response index variable, go through the array of pairs, and while the value is equal to the next one, we will compare the indexes: the answer and the index of the current one (which we just saved in the second value of the pair). Thus, we will go over O(n), the final asymptotics is O(n).
•  » » 4 weeks ago, # ^ | ← Rev. 4 →   0 So, i wrote it. That's implementation. But if you need more speed than my implementation, you can use bitwise radix.
•  » » 4 weeks ago, # ^ |   -9 Thanks a ton IceHydra I got some Idea now
•  » » » 4 weeks ago, # ^ | ← Rev. 2 →   0 You may think that my code does not work on negative numbers, and however you are right. But in order to fix this, you only need to do answ = std::max(answ, std::abs(val)) in the FindMaxElement function instead of answ = std::max(answ, val);
•  » » 4 weeks ago, # ^ |   +3 radix sort is not o(n), it is o(n log n). you can say it is n * number of digits, but number of digits is just log base 10 n so it's a constant difference from log n
•  » » » 4 weeks ago, # ^ | ← Rev. 4 →   0 Yes, it really looks like the truth, but if you use a byte radix sort, you can achieve a complexity of O(4 * n), so O(4 * n) ~ O(n). And if you try it on practice, byte radix will be faster in 2 times than std::sort.
•  » » » » 4 weeks ago, # ^ | ← Rev. 2 →   +24 byte radix sort is still technically o(n log n) though, if a is large enough it is much slower than o(n). By that logic you could argue that sieve is linear because n log log n for even 10^7 is less than 4.
•  » » » 4 weeks ago, # ^ |   0 I don't buy this argument, because if so you can never achieve O(n) as all the numbers have a logarithmic number of digits so reading them is O(n log n). Radix sort is O(n) in terms of the number of inputs, since we take the size of the individual inputs to be O(1) in terms of n.
•  » » 4 weeks ago, # ^ |   0 Isn’t radix sort $O(n \log {\max{a_i}})$? Like it has a constant factor that grows logarithmically with the size of the maximum element.
 » 4 weeks ago, # |   0 can someone tell me if below code works for this problem or not : #include #include #include using namespace std; void solve() { int n; cin >> n; vector v(n); for (int i = 0; i < n; ++i) { cin >> v[i]; } vector vis(1e5 + 5, 0); int ind = INT_MAX, ele = -1; for (int i = 0; i < n; ++i) { if (!vis[v[i]]) { vis[v[i]] = i + 1; } else { int i2 = vis[v[i]]; if (i2 < ind) { ind = i2; ele = v[i]; } } } cout << ele << endl; } int main() { solve(); return 0; } 
•  » » 4 weeks ago, # ^ |   0 how will this work for negative numbers and if number is >1e5?
•  » » » 4 weeks ago, # ^ |   0 yeah , actually it's too wrong I got it
 » 4 weeks ago, # |   0 I think hashing might work but I'm not sure.
 » 4 weeks ago, # |   0 To do this, you can create an empty unordered set and iterate through each element. For each element, try to find it in the set. If it’s not found, insert it in the set. If it’s found inside the set, then you’ve found the first repeating element.
 » 4 weeks ago, # |   +15 You can use a bitset of size 1E9 (https://stackoverflow.com/questions/5780112/define-a-large-bitset-in-c), which uses 125 MB.
 » 4 weeks ago, # |   0
•  » » 4 weeks ago, # ^ |   0 But here we don't need to iterate min. bucket size, we know it's = 1 (d=0)
 » 4 weeks ago, # |   0 can we use sets??
 » 4 weeks ago, # |   0 Just use gp_hash_table. It's a O(1) hashtable.
 » 4 weeks ago, # |   0 The best algorithm that I made by this time works in O($n\cdot log_n(max A)$). But with this restrictions it will work in linear time. First of all, we can make all numbers positive. Just find minimum of the array and add to all elements. We do not care about exact element, but we care about it's index. Then, if maximum of array is less or equal to $n$, we can do it in O($n$) just iterating through array and memorizing in the array index of first occurents. Otherwise, we can try to compress elements. Make array of $n$ elements and name it (let it be b). Then $b[x]$ is array of elements from the array $a$ with remainder $x$ after dividing by $n$. In other words, $b[x]$ has all elements $a[i]$, such that a[i]%n=x. We can do it in linear time. Now we can go in each such $x$ where $b[x]$ is non-empty, and if $b[x][i]$ equal to $x+n*k$ for some $k$, we can replace it by $k$. In such way, all elements that are equal will remain equal and not equal not equal. Also, it make $max A$ decrease in $n$ times at least. Then we can apply recursevily this algorithm for each $x$, such that $b[x]$ is non-empty.We cannot use master theorem to analyze this recursion, but we can write out recursive tree and get that branch factor is at most $n$ but number of elements in each recursion is also at most $n$, so the time spent on each level is $O(n)$ and there are at most $log_n(maxA)$ levels, because we decrease $maxA$ in $n$ times at least on each level, so total time is $O(n\cdot log_n(maxA))$.
 » 4 weeks ago, # |   0 ok
 » 4 weeks ago, # |   0 .