Hey everyone,
First off, I'm finally an expert! Thank you to everyone who has responded to my previous blog posts, your input helped me grow as a competitive programmer greatly!
Now, onto the actual question. What is the time complexity of using the "union" and "find" operations with an array-based union-find algorithm? First, let me put some pseudo-code of the union-find algorithm so that my ramblings will make more sense.
// We initialize the array such that each node is its own parent array = [0, 1, 2, 3, 4, 5... n] // Find the "root" of a node function find(index): while (array[index] != index) index = array[index] return index // Join the "roots" of two nodes function union(a, b): array[find(a)] = array[find(b)]
Please someone correct me if my implementation is wrong or if there's any way I can improve my pseudo-code. Now, onto the actual question.
Let's imagine you have a straight, directed graph with randomized node order. For example: 3 --> 5 --> 1 --> 2 --> 4. I speculate that the time complexity of generating the union-find array is at worst O(n^2) and that once the array is generated, the time complexity of finding the root from any given node is at worst O(n). Let's go through the example I provided, assuming that we join nodes starting from the lowest node (1), then the second lowest (2), etc.
Init: [0, 1, 2, 3, 4, 5]
Step 1: [0, 2, 2, 3, 4, 5] // Join 1 and 2
Step 2: [0, 2, 4, 3, 4, 5] // Join 2 and 4
Step 3: [0, 2, 4, 5, 4, 5] // Join 3 and 5
Step 4: [0, 2, 4, 5, 4, 5] // 4 has no outward connections
Step 5: [0, 2, 4, 5, 4, 1] // Join 5 and 1
At this point, if we want to query the third node for its connected component, the complexity will be O(n), because we must traverse the array 3 --> 5 --> 1 -- > 2 --> 4. Furthermore, if every node was connected to every other node, then preprocessing would take O(n^2) complexity!
This is really bad, but union-find is such a popular algorithm, I feel like I must be missing something! What is going on here?