After finishing this part, I found out some of them are not that related to D-Finite functions (especially for the last section), but in these problems, we still need some insight to deal with differential operators. So I'll keep them here.
Sums involving powers
Let's consider the following problem of summation
where $$$M$$$ can be finite or infinite. We want to separate a large class of $$$f$$$ such that such summation can be solved in $$$O(k)$$$.
You may already know some examples:
- For the constant function $$$f(n) = 1$$$, known as 662F, can be solved in linear time through Lagrange interpolation on consecutive sampling points.
- For the geometric series $$$f(n) = r^n$$$, you can find the finite case and infinite case (assume $$$|r| < 1$$$) in Library Checker. One well-known way to solve this is through the partial fraction.
- For the binomial case $$$f(n) = \binom N n$$$, known as 932E (though we only have $$$k\leq 5000$$$), or some similar problem 1528F, such kind of summation are usually dealed with Stirling number $$$n^k = \sum_j {k\brace j} j! \binom n j$$$, but such solution involving a whole row of Stirling number costs $$$O(k\log k)$$$ time. I think it was Min25 who first published a deduction which leads to a $$$O(k + \log M)$$$ solution (Web Archive).
- The following examples might not be related to the summation obviously, but we will explain this later. We focus on some famous sequences, such as the Bell number $$$k![x^k] e^{e^x-1}$$$, Bernoulli number $$$k![x^k] \frac{x}{e^x-1}$$$, and the number of alternating permutations $$$k![x^k] \tan x + \sec x$$$. Clearly, these things can all be done in $$$O(k\log k)$$$ by elementary operations on formal power series. But actually, the single term can be evaluated in $$$O(k)$$$.
So how to solve all the above problems in a unified way? The answer is D-Finite functions.
We consider the problem of extracting the coefficient
for some D-Finite function $$$F(t)$$$.
The favorable case is that $$$F(t)$$$ itself only has a degree up to $$$k$$$, then the problem will become trivial. So the general method is to transfer any $$$F(t)$$$ into such case. Consider the expansion
note that $$$[ x^k ] ( e^x - 1 )^j$$$ vanishes when $$$j > k$$$, so we can truncate the sum to $$$j\leq k$$$. Let $$$F_0(t) = F(t+1) \bmod t^{k+1}$$$ and $$$F_1(t) = F_0(t-1)$$$, then we have
and $$$F_1(t)$$$ has degree at most $$$k$$$.
We first show that $$$F_0(t)$$$ is D-Finite. Since $$$F(t)$$$ is D-Finite, clearly $$$F(t+1)$$$ is D-Finite with relations
let $$$a_n$$$ be the sequence associated with $$$F(t+1)$$$, then the D-Finite relation of $$$F(t+1)$$$ induces a P-recursive relation for $$$a_n$$$:
for some polynomials $$$P_j$$$. Let $$$b_n$$$ be the sequence associated with $$$F_0(t)$$$, then we have
Note that (1) still holds when $$$a_n$$$ is substituded by $$$b_n$$$, when $$$n \leq k$$$ and $$$n > k + \ell$$$, this corresponds to an equation in generating function:
Then plug in $$$t-1$$$, we get the D-Finite relation for $$$F_1(t)$$$:
This could be used to compute the coefficient of $$$F_1(t)$$$ in $$$O(k)$$$ time. Therefore, we can compute the coefficient $$$[x^k]F(e^x)$$$ in $$$O(k)$$$ time.
Now we can see how to solve the above problems in a unified way:
- For the constant function $$$f(n) = 1$$$, we have $$$F(t) = \frac{1-t^{M+1}}{1-t}$$$, which is D-Finite.
- For the geometric series $$$f(n) = r^n$$$, we have $$$F(t) = \frac{1}{1-rt}$$$, which is D-Finite.
- For the binomial case $$$f(n) = \binom N n$$$, we have $$$F(t) = (1+t)^N$$$, which is D-Finite.
- For the Bell number, we have $$$F_0(t) = e^t \bmod t^{k+1}$$$.
- For the Bernoulli number, we have $$$F_0(t) = \frac{\ln(1+t)}{t} \bmod t^{k+1}$$$.
- ...
Multipoint Evaluation for some 2-dimensional Arrays
Perhaps the following example is quite famous:
Problem. Given $$$N$$$ points $$$(n, m)$$$ ($$$n,m\leq N$$$), compute the sum
A well-known solution is to use the fact that $$$a_{n,m}$$$ can be efficiently maintained when the coordinates are increased/decreased by one. Then Mo's algorithm can solve this problem in $$$O(N\sqrt N)$$$ time.
In the summer of 2021, djq_cpp told me this problem can be solved in $$$O(N \log^2 N)$$$ time. Since I didn't publish this solution, it was later independently discovered by orzdevinwang and Nyaan.
Consider a sequence of vectors of polynomials
our goal becomes to compute the value of $$$v_m(n)$$$ for several $$$m$$$ and $$$n$$$. We can see that the sequence of vectors $$$v_m(x)$$$ has a transition matrix of constant degree
Then the problem becomes carefully modifying the subproduct tree in the $$$O(n\log^2n)$$$ algorithm of the Vandermonde matrix determinant.
Let $$$[l, r]$$$ be an interval appearing in the segment tree of $$$[1, n]$$$. We maintain all the matrix products
which are polynomial matrices of degree $$$r-l+1$$$.
For each $$$[l, r]$$$ we consider the residue
where
When recursing down the segment tree, we can maintain
and
Then for each leaf node, we can do multipoint evaluation of $$$v_m(x) \bmod R_{m,m}(x)$$$.
It's not hard to see that the above algorithm can be generalized to the case that every $$$T_m$$$ is a polynomial matrix where the sum of degrees is $$$O(N)$$$. This encapsulates a lot of 2-dimensional arrays that can be computed in $$$O(N\log^2 N)$$$ time, for example, the coefficients of 2 variables D-Finite functions. For instance, solving Yukicoder No.2166 Paint and Fill in $$$O(N\log^2 N)$$$ time.
Remark: As we know, there are ways to avoid polynomial modulo operations in multipoint evaluation of polynomials. The situation is similar for the above problem, see this remark (in Chinese) by aleph1022.
The $$$q$$$-analogue of D-Finite Functions
Recently there have been more and more problems related to $$$q$$$-binomials. One of the ways to derive basic identities such as the coefficient
is to use the $$$q$$$-analogue of differential operators. Compare $$$f(qz)$$$ and $$$f(z)$$$, we have
then compare the coefficients of two sides, we have
thus
These tricks could be further developed to compute sequences that are $$$q$$$-holonomic, i.e., the recurrence relation has coefficient rational in $$$q^i$$$.
Uoj Long Round #2, Hash Killer. There is a polynomial $$$f(x)$$$ of degree $$$n-1$$$, we know that its value at $$$f(q^i)$$$ for $$$0\leq i < n$$$, in particular, $$$f(q^i) \neq 0$$$ for only $$$k$$$ positions. Given these data, compute the $$$m$$$th coefficient of $$$f(x)$$$. $$$n$$$ can be very large.
By Lagrange interpolation we have
since $$$f(q^i)$$$ is nonzero for only $$$k$$$ positions, we only need to compute
for this many $$$i$$$.
After some calculation, one can derive that the sequence
satisfies a $$$q$$$-holonomic relation. How to compute $$$k$$$ values of the sequence $$$u_i$$$ efficiently? One could set a block size $$$B\leq \sqrt n$$$, and compute the sequence $$$u_{iB}$$$ for all $$$i\leq n/B$$$ by:
- Compute the consecutive matrix products of the recurrence relation.
- Use the Chirp-Z transform to compute the actual value of the transition matrix.
Then we can compute the sequence $$$u_i$$$ in $$$O((n/B)\log n + k B)$$$ time.
Actually, this problem can be seen as a $$$q$$$-analogue of the multipoint evaluation of P-recursive 2-d arrays. So the theoretical complexity of this problem can be further reduced to $$$\tilde O(\sqrt n + k)$$$.
It's also useful when $$$q$$$ is a formal variable, for example, see AGC069 E.
Graph Enumeration Problems
PA 2019 final, Grafy Count the $$$n$$$-vertex simple digraphs such that every vertex has in-degree and out-degree exactly $$$2$$$. $$$n\leq 500$$$.
The intended solution might be some inclusion-exclusion principle to solve in $$$O(n^3)$$$ time. However, this sequence is P-recursive! So we could compute in $$$O(n)$$$ or $$$O(\sqrt n\log n)$$$ time.
Here are several ways to prove this, the first one is by first writing down the formula obtained by the inclusion-exclusion principle:
The precise formula is not very important, but the key observation is that the above formula is a sum of products of binomials and factorials, then as we argued before, the closure properties of D-Finite functions imply that $$$a_n$$$ is P-recursive.
Another way is to derive the D-Finite via the structure of 2-regular graphs, we first demonstrate a toy example.
Problem. Count the number of $$$n$$$-vertex simple 2-regular graphs.
Note that any 2-regular graph is a disjoint union of cycles. We have its exponential generating function is
which is D-Finite.
The analysis of grafy is much more technical. See a treatment (in Chinese) by Isonan. Roughly speaking, his strategy is the following:
Let $$$G$$$ be a simple digraph, we construct the graph $$$G'$$$ by the following way: - For each vertex $$$u$$$ of $$$G$$$, it has two out-edges $$$(u, x)$$$ and $$$(u, y)$$$. We add a colored edge $$$(x, y)$$$ with color $$$u$$$ in $$$G'$$$.
Then $$$2$$$-biregular simple digraphs $$$G$$$ are in bijection with $$$n$$$-vertex colored graphs $$$G'$$$ in the following way:
- $$$G'$$$ can have multiple edges, but no self-loop.
- $$$G'$$$ is 2-regular.
- The $$$n$$$ edges are colored by $$$[n]$$$, and each colored appears exactly once.
- For each vertex, its adjacent edges are not colored by its own index.
Then we could apply the inclusion-exclusion principle in the following way. Let $$$b_{n,k}$$$ be the number of $$$n$$$-vertex graphs that satisfy 1-2, and have $$$n-k$$$ specified vertices matching an adjacent edge. Then we have
The gf of $$$b$$$ should be the $$$\exp$$$ of "single cycles", which are easy to write down. Then we could derive the gf of $$$b$$$, and finally the gf of $$$a$$$.
I also want to remark that the above argument explicitly relies on the structure of 2-regular graphs. But this seems not necessary. In 1986, Goulden and Jackson proved that the number of $$$3$$$-regular graphs and $$$4$$$-regular graphs are P-recursive. Later, Ira M. Gessel proved that for $$$k$$$-regular graphs for any $$$k$$$ are P-recursive. See Symmetric functions and P-recursiveness. His argument doesn't rely on the structure of regular graphs, but on the theory of symmetric functions.
Change of Basis
Here are several attempts to generalize the ideas in Little Q's sequence (see also the blog).
China Team Selection 2023, Yet Another Eulerian Number Problem. Given $$$n, m, n_0, m_0$$$, consider an 2-d array with initial value $$$F_{n_0, k} = [k = m_0]$$$ and the recurrence relation $$$F_{n, m} = (\alpha\cdot (n-1) + 1 - m) F_{n-1,m-1} + (m+1)F_{n-1,m}$$$, compute the value of $$$F_{n, m}$$$.
Remark: The case $$$\alpha = 1$$$ is the Eulerian number problem.
Let $$$F_n(x)$$$ be the gf of the $$$n$$$-th row, we have the relation
it's not very clear how this equation helps us solve the problem.
Consider the change of basis
the iteration of $$$G$$$ is simplified to
Remark: This step makes the iteration formula independent of $$$n$$$.
When $$$\alpha = 1$$$, this transform is simply taking the coefficient $$${a_k}$$$ to $$${k a_k}$$$, which is easy to perform multiple times. For the general case, we need to perform another change of basis.
Suppose we have some $$$u(x)$$$, take $$$H_n(u) = G_n(x)$$$ and $$$H_n$$$ satisfies
Then we can easily perform the iteration of $$$H$$$ multiple times.
Plug in the iteration rule of $$$G$$$ and $$$H$$$, we obtain the differential equation
Now we can solve the problem in the following way:
- From $$$F_{n_0}(x) = x^{m_0}$$$, we can derive the initial value of $$$G_{n_0}(x) = x^{m_0+1}/(1-x)^{\alpha n_0+1}$$$.
- Then $$$H_{n_0}(x) = G_{n_0}(u^{\langle -1\rangle})$$$ can be computed in $$$O(n\log n)$$$ time. We can first solve $$$u^{\langle -1\rangle}$$$ by solving the differential equation, then use the explicit formula of $$$G_{n_0}$$$.
- Then we can compute $$$H_{n}(x)$$$ from $$$H_{n_0}(x)$$$ by taking the coefficient $$${a_k}$$$ to $$${k ^{n-n_0}a_k}$$$.
- Finally we convert back to $$$F_{n}(x)$$$ by
since we only want to extract the coefficient of $$$x^m$$$, this can be done via Lagrange Inversion, in $$$O(n\log n)$$$ time.
Remark: Since now we have a general way to compose power series in $$$O(n\log^2 n)$$$ time, the above idea actually leads to an algorithm that takes arbitrary initial value $$$F_{n_0}(x)$$$ as input, and compute the value of $$$F_{n}(x)$$$ in $$$O(n\log^2 n)$$$ time.
Luogu P8555 Liar. Given $$$n$$$, for all $$$m$$$, count the number of permutations $$$\sigma$$$ of $$$[n]$$$ such that there exists distinct positions $$$p_m,\dots,p_{n-1}$$$ such that $$$p_t\leq t$$$ and $$$\sigma(p_t)$$$ is increasing.
The original solution is more combinatorial, here I present a solution using the idea of basis change.
We first consider a greedy procedure, first try to maximize the value of $$$\sigma(p_{n-1})$$$, then $$$\sigma(p_{n-2})$$$, and so on.
Consider dynamic programming starting with initial condition $$$f(n,0) = 1$$$. For $$$f(i,j)$$$ we are counting something that $$$p_{\geq i}$$$ are determined, and there are $$$j$$$ positions $$$p_t$$$ such that $$$t\geq i$$$ and $$$p_t \leq i$$$. We'll see what does this mean later.
For $$$f(i, j)$$$ ($$$i > m$$$), we need to determine $$$p_{i-1}$$$. There are two cases: - Case 1: The position $$$i$$$ is already chosen by some $$$p_t$$$, there are $$$j$$$ possibilities. This gives
- For the case that $$$i$$$ is not chosen by any $$$p_t$$$, the rest of unused values are $$$1,\dots,i-j$$$, thus there are $$$i-j$$$ choices for $$$\sigma(i)$$$, and then we must find a position for $$$p_{i-1}\leq i-1$$$, this increases $$$j$$$ by $$$1$$$. This gives
Finally, we have the answer is
because we first need to assign those $$$j$$$ positions, and then the rest of the values are determined.
We let $$$f_i(x)$$$ be the gf of $$$f(i, \cdot)$$$, then the above iteration can be written as
Consider the change of basis
then we have
Then we do another change of basis
where $$$u = te^t, t=x/(1-x)$$$. One can verify that
This again reduces the iteration to a simple form.
In order to extract the answer, we first define the inner product
Our goal is to compute
observed that
we have
Recall that $$$t = x/(1-x)$$$, we have
Hence
We can first compute the coefficient of $$$h_n(u)$$$, then use convolution to compute the answer for all $$$n$$$.
Since $$$u = t e^t$$$, by Lagrange inversion, we have
Therefore, we can compute the answer in $$$O(n\log n)$$$ time.