Hello codeforces!
In one of the recent contests there appeared a problem which involved Euler's phi function, and I figured I could write a bit about this function here.
First, I will briefly introduce Euler's phi function, talk about two of its properties, and give you proofs of both of them. Ultimately, it will lead us to a way to compute the value of this function for some given integer $$$n$$$ with reasonable time complexity.
I tried to keep this post quite basic, so that it's accessible to everyone. Therefore, if you are already quite good at some topic mentioned here, feel free to skip a section or two.
Also, I would like to say thank you to my friend, sysia, who helped me a lot with writing this post and suppressed my tendency to write too much.
What is Euler's phi function?
Euler's phi function (which may be also called Euler's totient function) is a function that gives us the number of positive integers less or equal to a given integer $$$n$$$ that are coprime to $$$n$$$. It is usually denoted by the greek letter $$$\phi$$$. For instance, if we consider the number 6, there are exactly 2 integers that are not greater than 6 and coprime to it. These integers are 1 and 5. Therefore $$$\phi(6) = 2$$$. Similarly $$$\phi(9) = 6$$$. The numbers that are not greater than 9 and coprime to it are 1, 2, 4, 5, 7 and 8. Note that according to the definition $$$\phi(1) = 1$$$.
The properties of Euler's phi function
Euler's phi function has numerous interesting properties, and if you are looking for the whole list, I suggest that you consult wikipedia. For competitive programming, however, two of them are particularly important.
1. Value of phi for an integer power of a prime number
Let $$$p$$$ be a prime number and $$$k$$$ a positive integer. Then \begin{equation*} \phi(p^k) = p^k — p^{k — 1}. \end{equation*}
The proof of this property is quite straightforward. Consider all positive integers that are not greater than $$$p^k$$$. Let us count those which are not coprime to $$$p^k$$$. An integer is not coprime to $$$p^k$$$ if and only if it is a multiple of $$$p$$$. We see that there are $$$p^{k-1}$$$ multiples of p in the interval $$$[1, p^k]$$$. Therefore, there are $$$p^k - p^{k-1}$$$ integers which are not multiples of $$$p$$$ in this interval. Thus, $$$\phi(p^k) = p^k - p^{k - 1}$$$.
2. Multiplicativity
Let $$$a$$$ and $$$b$$$ be two coprime positive integers. Then \begin{equation*} \phi(ab) = \phi(a)\phi(b). \end{equation*}
This property is a bit trickier to prove and the next two sections are devoted to its proof.
Chinese Remainder Theorem
Chinese Remainder Theorem is a well-known topic in competitive programming. Therefore, I would like to keep this section relatively short and not go too much into details. If you want the proof or simply would like to learn more about it, you can visit this page.
Now for the theorem itself. Let's say that we have $$$k$$$ congruences: \begin{equation*} x \equiv a_1 \;\;\; (mod\;\;n_1), \end{equation*} \begin{equation*} x \equiv a_2 \;\;\; (mod\;\;n_2), \end{equation*} \begin{equation*} ... \end{equation*} \begin{equation*} x \equiv a_k \;\;\; (mod\;\;n_k) \end{equation*} such that $$$n_1, n_2, ..., n_k$$$ are pairwise coprime (that is, for every pair of integers (i, j) such that $$$1 \leq i < j \leq k$$$, we have $$$gcd(n_i, n_j) = 1$$$). The Chinese Remainder Theorem says that there exists exactly one integer $$$x$$$ in the interval $$$[0, n_1 \, \cdot \, n_2 \, \cdot \, ... \, \cdot \, n_k - 1]$$$ that satisfies all the congruences.
Let's have a look at an example. Consider the following congruences \begin{equation*} x \equiv 1 \;\;\; (mod \;\; 5), \end{equation*} \begin{equation*} x \equiv 3 \;\;\; (mod \;\; 4), \end{equation*} \begin{equation*} x \equiv 6 \;\;\; (mod \;\; 9). \end{equation*}
One can check that the only integer in the interval $$$[0, 179]$$$ that satisfies all of them is 51.
In fact, there is an algorithm that allows us to find $$$x$$$ given $$$a_1, a_2, ..., a_k$$$ and $$$n_1, n_2, ..., n_k$$$.
For the purpose of this blog, however, we will consider only the case where $$$k=2$$$. Let's say we have two coprime positive integers $$$n_1$$$ and $$$n_2$$$. What the theorem says is that for every pair of integers $$$(a_1, a_2)$$$ ($$$0 \leq a_1 < n_1$$$, $$$0 \leq a_2 < n_2$$$) there exists exactly one integer $$$x$$$ in the interval $$$[0, n_1 \cdot n_2 - 1]$$$ such that $$$x \equiv a_1 \; (mod \; n_1)$$$ and $$$x \equiv a_2 \; (mod \; n_2)$$$. In other words, every pair of integers $$$(a_1, a_2)$$$ uniquely determines some integer $$$x$$$ in the interval $$$[0, n_1 \cdot n_2 - 1]$$$.
Now let's notice that every integer $$$x$$$ in the interval $$$[0, n_1 \cdot n_2 - 1]$$$ also uniquely determines some pair $$$(a_1, a_2)$$$, namely the pair $$$(x \; mod \; n_1, \; x \; mod \; n_2)$$$.
Let's have a look at an example below. We have $$$n_1 = 3$$$ and $$$n_2 = 4$$$. In the picture we have pairs $$$(a_1, a_2)$$$ on the left and integers from 0 to 11 on the right. Arrows are drawn between each pair and its corresponding integer. Note that each pair has exactly one corresponding integer and each integer has exactly one corresponding pair.
With Euler's phi function we deal with intervals of the form $$$[1, n]$$$, whereas in CRT they are usually of the form $$$[0, n - 1]$$$. It turns out that it doesn't really matter. We could change the interval in CRT to the form of $$$[1, n]$$$ and the theorem will still be true. If you don't see it immediately, take your time to convince yourself. From now on, we will be using the $$$[1, n]$$$ convention.
For the rest of this blog I am going to call the pair $$$(a_1, a_2) = (x \; mod \; n_1, x \; mod \; n_2)$$$ the pair corresponding to $$$x$$$. Similarly, I am going to call the integer $$$x$$$ in the interval $$$[1, n_1 \cdot n_2]$$$ such that $$$a_1 = x \; mod \; n_1$$$ and $$$a_2 = x \; mod \; n_2$$$ the integer corresponding to the pair $$$(a_1, a_2)$$$.
The proof of multiplicativity
Now that we have learnt about CRT, we can finally prove the second property of our function. To do so, let's prove the following lemma.
Lemma: Let $$$n_1$$$ and $$$n_2$$$ be two coprime integers. A positive integer $$$x \leq n_1 \cdot n_2$$$ is coprime to $$$n_1 \cdot n_2$$$ if and only if $$$gcd(x \; mod \; n_1, n_1) = 1$$$ and $$$gcd(x \; mod \; n_2, n_2) = 1$$$.
We can rephrase this lemma with two statements:
Firstly, for every pair of integers $$$(a_1, a_2)$$$ such that $$$0 \leq a_1 < n_1$$$, $$$0 \leq a_2 < n_2$$$, $$$gcd(a_1, n_1) = 1$$$, and $$$gcd(a_2, n_2) = 1$$$, the integer $$$x$$$ corresponding to this pair is coprime to $$$n_1 \cdot n_2$$$.
Secondly, for every integer $$$x$$$ in the interval $$$[1, n_1 \cdot n_2]$$$ that is coprime to $$$n_1 \cdot n_2$$$, its corresponding pair $$$(a_1, a_2)$$$ has the following property: $$$gcd(x \; mod \; n_1, n_1) = 1$$$ and $$$gcd(x \; mod \; n_2, n_2) = 1$$$.
Now, consider two coprime integers $$$n_1$$$ and $$$n_2$$$. We know that every number $$$x$$$ in the interval $$$[1, n_1 \cdot n_2]$$$ that is coprime to $$$n_1 \cdot n_2$$$ has its corresponding pair of integers $$$(a_1, a_2)$$$ such that $$$a_1$$$ is coprime to $$$n_1$$$ and $$$a_2$$$ is coprime to $$$n_2$$$. It also works the other way, every pair of integers $$$(a_1, a_2)$$$ such that $$$a_1$$$ is coprime to $$$n_1$$$ and $$$a_2$$$ is coprime to $$$n_2$$$ has its corresponding integer $$$x$$$ coprime to $$$n_1 \cdot n_2$$$.
Therefore, the number of positive integers that are not greater than $$$n_1 \cdot n_2$$$ and coprime to it is equal to the number of pairs $$$(a_1, a_2)$$$ where $$$0 \leq a_1 < n_1$$$ and $$$0 \leq a_2 < n_2$$$ such that $$$a_1$$$ is coprime to $$$n_1$$$ and $$$a_2$$$ is coprime to $$$n_2$$$. The number of such pairs is obviously $$$\phi(n_1) \phi(n_2)$$$ since we can choose $$$a_1$$$ in $$$\phi(n_1)$$$ ways and $$$a_2$$$ in $$$\phi(n_2)$$$ ways. Thus, \begin{equation*} \phi(n_1 \cdot n_2) = \phi(n_1) \cdot \phi(n_2). \end{equation*}
Let's have a look at the example below. It's the picture from the previous section, but this time I marked certain pairs and numbers with a nice greenish colour. The numbers marked are those which are coprime to 12, and the pairs marked are those pairs $$$(a_1, a_2)$$$ that satisfy the property that $$$a_1$$$ is coprime to 3 and $$$a_2$$$ is coprime to 4. What our lemma says is that the marked numbers correspond to the marked pairs and vice versa.
Alternatively, we can use a bit more math notation and say the following.
Let A be the set of positive integers not greater than $$$n_1$$$ that are coprime to $$$n_1$$$, namely
We define B and C in a similar manner
Note that by definition $$$\phi(n_1) = |A|$$$, $$$\phi(n_2) = |B|$$$, and $$$\phi(n_1 \cdot n_2) = |C|$$$. Our lemma tells us that there exists a bijection between $$$A \times B$$$ and $$$C$$$. Therefore, $$$|A \times B| = |C|$$$. We know that $$$|A \times B| = |A| \cdot |B| = \phi(n_1) \phi(n_2)$$$, so $$$|C| = \phi(n_1) \phi(n_2)$$$. Using the fact that $$$|C| = \phi(n_1 \cdot n_2)$$$, we get \begin{equation*} \phi(n_1 \cdot n_2) = \phi(n_1) \phi(n_2). \end{equation*}
Note that here $$$A \times B$$$ is defined as follows
Evaluating Euler's phi function
In order to compute $$$\phi(n)$$$ we will exploit the two properties that appeared earlier in this post. Consider the prime factorisation of $$$n$$$ \begin{equation*} n = p_1 ^ {\alpha_1} \cdot p_2 ^ {\alpha_2} \dots p_k ^ {\alpha_k}. \end{equation*} Note that each of the numbers $$$p_1 ^ {\alpha_1}, p_2 ^ {\alpha_2}, ..., p_k ^ {\alpha_k}$$$ is a power of some prime number, so we can easily evaluate $$$\phi(p_1 ^ {\alpha_1}), \phi(p_2 ^ {\alpha_2}), ..., \phi(p_k ^ {\alpha_k})$$$ using the first property of our function. Namely, \begin{equation*} \phi(p_1 ^ {\alpha_1}) = p_1 ^ {\alpha_1} — p_1 ^ {\alpha_1 — 1}, \end{equation*} \begin{equation*} \phi(p_2 ^ {\alpha_2}) = p_2 ^ {\alpha_2} — p_2 ^ {\alpha_2 — 1}, \end{equation*} \begin{equation*} ... \end{equation*} \begin{equation*} \phi(p_k ^ {\alpha_k}) = p_k ^ {\alpha_k} — p_1 ^ {\alpha_k — 1}. \end{equation*} Moreover, these number are pairwise coprime, so in order to compute $$$\phi(p_1 ^ {\alpha_1} \cdot p_2 ^ {\alpha_2} ... p_k ^ {\alpha_k})$$$, we can just take the product of the value of $$$\phi$$$ for each of them. We obtain \begin{equation*} \phi(n) = \end{equation*} \begin{equation*} = \phi(p_1 ^ {\alpha_1} \cdot p_2 ^ {\alpha_2} ... p_k ^ {\alpha_k}) = \end{equation*} \begin{equation*} = \phi(p_1 ^ {\alpha_1}) \cdot \phi(p_2 ^ {\alpha_2}) ... \phi(p_k ^ {\alpha_k}) = \end{equation*} \begin{equation*} = (p_1 ^ {\alpha_1} — p_1 ^ {\alpha_1 — 1})( p_2 ^ {\alpha_2} — p_2 ^ {\alpha_2 — 1} ) \; ... \; (p_k ^ {\alpha_k} — p_1 ^ {\alpha_k — 1}). \end{equation*}
Note that this formula is usually given in a slightly different form: \begin{equation*} \phi(n) = n \cdot \left( 1 — \frac{1}{p_1} \right) \cdot \left( 1 — \frac{1}{p_2} \right) \cdot ... \cdot \left( 1 — \frac{1}{p_k} \right). \end{equation*}
If you don't immediately see that the two formulas above are the same, you can take time to convince yourself that it is indeed true. It's quite understandable why the latter is usually given. It's quite a beautiful formula in fact. Mathematics at its finest. I personally prefer to look at it this way (and sysia does so too).
Now, having our formula \begin{equation*} \phi(n) = (p_1 ^ {\alpha_1} — p_1 ^ {\alpha_1 — 1})( p_2 ^ {\alpha_2} — p_2 ^ {\alpha_2 — 1} ) \; ... \; (p_k ^ {\alpha_k} — p_1 ^ {\alpha_k — 1}), \end{equation*}
we can easily compute $$$\phi(n)$$$ in $$$O(log \; n)$$$ time if we know the prime factorisation of $$$n$$$. We don't even need to use fast exponentiation, since the sum $$$\alpha_1 + \alpha_2 + ... + \alpha_k$$$ is bounded by $$$O(log \; n)$$$. Here is a C++ function that returns $$$\phi(n)$$$:
int phi(int n) {
vector<pair<int, int>>divisors = factorize(n);
//pairs {prime number, exponent}
int ans = 1;
for(auto [prime, exp] : divisors) {
int power = 1;
for (int i = 1; i<exp; i++){
power *= prime;
}
ans *= (power * prime - power); // (p^exp - p^{exp-1})
}
return ans;
}
The bottleneck here is the time complexity of factorisation. We could just use a simple $$$O(\sqrt{n})$$$ algorithm or, if you want to deal with really large numbers, you can get more fancy and use Pollard's rho algorithm, which should work in $$$O(n ^ {\frac{1}{4}})$$$ (this time complexity analysis is heuristic, but the algorithm is quite fast in practice).
If you want to compute $$$\phi(k)$$$ for every $$$k$$$ from 1 up to some integer $$$n$$$, then it's possible to make it even faster. Using an idea similar to the sieve of Eratosthenes, we can precompute an array $$$sieve$$$ (0-indexed) of size $$$n + 1$$$ so that $$$sieve_x$$$ is equal to some prime divisor of $$$x$$$ (it doesn't matter which one).
Now, we will compute $$$phi_i$$$ for each $$$i$$$ from $$$1$$$ to $$$n$$$. When we compute $$$phi_i$$$ for some $$$i$$$, we will use the fact the we have already computed $$$phi_j$$$ for $$$1 \leq j < i$$$. We take some prime divisor $$$p$$$ of $$$i$$$ from $$$sieve_i$$$ and find the greatest power $$$k$$$ such that $$$p ^ k \; | \; i$$$. We know that $$$i = p^k \cdot \frac{i}{p^k}$$$. Moreover, $$$gcd \left( p^k, \frac{i}{p ^ k} \right) = 1$$$. Therefore, we can set $$$phi_i = (p^k - p^{k - 1}) \cdot phi_{\frac{i}{p^k}}$$$. Here is a C++ function that uses this idea
vector<int> phi_from_1_to_n(int n) {
vector<int>phi(n + 1, 1), sieve(n + 1, -1);
for(int i = 2; i <= n; i++) {
if(sieve[i] == -1) {
sieve[i] = i;
for(long long j = 1ll * i * i; j <= n; j += i)
sieve[j] = i;
}
}
for(int i = 2; i <= n; i++) {
int p = sieve[i], j = i;
while(j % p == 0) {
phi[i] *= p;
j /= p;
}
//j is now equal to i / p^k
phi[i] = (phi[i] / p) * (p - 1) * phi[j];
}
return phi;
}
An obvious bound for the time complexity of this function is $$$O(n \, log \, n)$$$ (with relatively small constant, though). However, I can't prove that there doesn't exist a better bound. If some of you can find a better bound, or can prove that $$$O(n \, log \, n)$$$ is the best we can get, I'd love to read about it in the comments.
Obviously, there are also different ways to compute $$$\phi(k)$$$ for every $$$k$$$ from 1 up to some integer $$$n$$$, which utilise different properties of the phi function. You can refer to this or this comment and choose the method that you like the most.
Actually, it is even possible to compute $$$\phi(k)$$$ for each $$$k$$$ from 1 up to some integer $$$n$$$ in $$$O(n)$$$ time. You can just use the idea of linear sieve. The notion is explained here. Big thanks to brunomont for mentioning it!
To my astonishment (and probably yours too) we can compute prefix sums of the $$$\phi$$$ function in $$$O(n^{\frac{2}{3}})$$$ time(!). It is usually referred to as the Xudyh sieve or Mertens Trick. More details are given here. Many thanks to chromate00 for mentioning it in the comments.
Practice problems
Here is a couple of practice problems. If you know any problems related to this topic that are not mentioned here, you can mention them in the comments.
- 1717E - Мадока и лучший университет
- VJUDGE — GCD — Extreme (II) (suggested by 18o3)
- SPOJ — ETF — Euler Totient Function
- olinfo.it — armadio (suggested by TheScrasse)
- Kattis — Farey Sequence Length (suggested by cho-pingu)
Conclusion
That's the end of this post. I hope you benefited from it and, most importantly, liked it. Also, it's my first post here, so any suggestions from you will be appreciated. Lastly, if you know some more interesting properties of Euler's totient function that may come in handy in competitive programming, you can consider sharing them with us in the comments.
Have a good day there!