Time Complexity of Union-Find

#	User	Rating
1	tourist	3985
2	jiangly	3814
3	jqdai0815	3682
4	Benq	3529
5	orzdevinwang	3526
6	ksun48	3517
7	Radewoosh	3410
8	hos.lyric	3399
9	ecnerwala	3392
9	Um_nik	3392

#	User	Contrib.
1	cry	169
2	maomao90	162
2	Um_nik	162
4	atcoder_official	161
5	djm03178	158
6	-is-this-fft-	157
7	adamant	155
8	awoo	154
8	Dominater069	154
10	luogu_official	150

Hey everyone,

First off, I'm finally an expert! Thank you to everyone who has responded to my previous blog posts, your input helped me grow as a competitive programmer greatly!

Now, onto the actual question. What is the time complexity of using the "union" and "find" operations with an array-based union-find algorithm? First, let me put some pseudo-code of the union-find algorithm so that my ramblings will make more sense.

// We initialize the array such that each node is its own parent

 array = [0, 1, 2, 3, 4, 5... n]


 // Find the "root" of a node

 function find(index):

      while (array[index] != index)

           index = array[index]

      return index


 // Join the "roots" of two nodes

 function union(a, b):

      array[find(a)] = array[find(b)]

Please someone correct me if my implementation is wrong or if there's any way I can improve my pseudo-code. Now, onto the actual question.

Let's imagine you have a straight, directed graph with randomized node order. For example: 3 --> 5 --> 1 --> 2 --> 4. I speculate that the time complexity of generating the union-find array is at worst O(n^2) and that once the array is generated, the time complexity of finding the root from any given node is at worst O(n). Let's go through the example I provided, assuming that we join nodes starting from the lowest node (1), then the second lowest (2), etc.

Init: [0, 1, 2, 3, 4, 5]

Step 1: [0, 2, 2, 3, 4, 5] // Join 1 and 2

Step 2: [0, 2, 4, 3, 4, 5] // Join 2 and 4

Step 3: [0, 2, 4, 5, 4, 5] // Join 3 and 5

Step 4: [0, 2, 4, 5, 4, 5] // 4 has no outward connections

Step 5: [0, 2, 4, 5, 4, 1] // Join 5 and 1

At this point, if we want to query the third node for its connected component, the complexity will be O(n), because we must traverse the array 3 --> 5 --> 1 -- > 2 --> 4. Furthermore, if every node was connected to every other node, then preprocessing would take O(n^2) complexity!

This is really bad, but union-find is such a popular algorithm, I feel like I must be missing something! What is going on here?

Rev.	By	When	Δ	Comment
en7	SAT2020	2022-07-24 03:17:47	3	Tiny change: ' is going wrong here?' -> ' is going on here?' (published)
en6	SAT2020	2022-07-24 03:16:12	17
en5	SAT2020	2022-07-24 03:14:42	963
en4	SAT2020	2022-07-23 23:48:38	19
en3	SAT2020	2022-07-23 23:48:01	18	Reverted to en1
en2	SAT2020	2022-07-23 23:47:38	18
en1	SAT2020	2022-07-23 23:46:19	1395	Initial revision (saved to drafts)

Rev.

Lang.

When

Comment

en7