pigpigger's blog

By pigpigger, history, 13 months ago, In English

Computing Jordan canonical form

Definition 1 The root subspace $$$W_\lambda$$$ for an eigenvalue $$$\lambda$$$ is the subspace of all vectors $$$\alpha$$$ satisfying $$$\exists k,\ (\mathcal{A}-\lambda\mathcal{I})^k\alpha=0$$$

Lemma 1 The sum of different root subspaces is a direct sum

proof

Let $$$f(x)=(x-\lambda_1)^{m_1}$$$ be the factor of the characteristic polynomial corresponding to $$$\lambda_1$$$, and let $$$g(x)$$$ be the remaining part.

Now $$$f(x)$$$ is an annihilator for $$$W_{\lambda_1}$$$,
because the whole characteristic polynomial is an annihilator; but $$$g(x)$$$ cannot annihilate $$$W_{\lambda_1}$$$ (consider the diagonal of the matrices)

we have $$$(f(x),g(x))=1$$$. Now we claim that $$$g(\mathcal{A})$$$ is invertible on $$$W_{\lambda_1}$$$: there exists $$$u(x)$$$ with $$$u(x)g(x)\equiv 1 \pmod{f(x)}$$$,

so $$$u(\mathcal{A})g(\mathcal{A})=\mathcal{I}$$$ on $$$W_{\lambda_1}$$$.

to prove $$$W_{\lambda_1}\oplus W_{\lambda_2}\oplus\dots\oplus W_{\lambda_t}$$$ we only need to prove that the vector $$$0$$$ has only one representation.

say $$$0=u_1+u_2+\dots+u_t$$$ with $$$u_i\in W_{\lambda_i}$$$. Applying $$$g(\mathcal{A})$$$ to both sides yields $$$g(\mathcal{A})u_1=0$$$, since $$$g$$$ annihilates the other root subspaces. Then applying $$$u(\mathcal{A})$$$ gives $$$u_1=0$$$, and by symmetry every $$$u_i=0$$$. $$$\square$$$

Definition 2 The cyclic subspace $$$C(\alpha)$$$ is the subspace generated by $$$\alpha,\ \mathcal{A}\alpha,\ \mathcal{A}^2\alpha,\dots$$$; denote the last nonzero term of this sequence by $$$last(\alpha)$$$

(it's easy to prove that the nonzero terms of $$$\alpha,\ \mathcal{A}\alpha,\ \mathcal{A}^2\alpha,\dots$$$ are linearly independent for nilpotent $$$\mathcal{A}$$$.)

Lemma 2 For nilpotent $$$\mathcal{A}$$$, $$$V$$$ is the direct sum of some cyclic subspaces

proof

we give a constructive and algorithmic proof.

Let the degree of the minimal polynomial be $$$d$$$.

First start with solving $$$AX=0$$$, then $$$A^2X=0$$$, ..., up to $$$A^dX=0$$$; the solution spaces form an expanding chain. Denote by $$$X_i$$$ the set of basis vectors that are newly inserted after solving the $$$i$$$-th equation.

Denote the vectors in $$$X_d$$$ by $$$\alpha_1,\alpha_2,\dots,\alpha_k$$$, those in $$$X_{d-1}$$$ by $$$\beta_1,\beta_2,\dots,\beta_l$$$, and those in $$$X_{d-2}$$$ by $$$\gamma_1,\gamma_2,\dots,\gamma_m$$$.

First we claim that the sum $$$C(\alpha_1)\oplus C(\alpha_2)\oplus\dots\oplus C(\alpha_k)$$$ is direct. Suppose

$$$ \sum_{i,j} \mu_{ij}\mathcal{A}^j\alpha_i=0 $$$

applying $$$\mathcal{A}^{d-1}$$$ to both sides yields

$$$ \sum_{i,j} \mu_{ij}\mathcal{A}^{j+d-1}\alpha_i=0 $$$
$$$ \sum_{i} \mu_{i0}\mathcal{A}^{d-1}\alpha_i=0 $$$

now

$$$ \sum_{i} \mu_{i0}\alpha_i $$$

is in $$$Ker(\mathcal{A}^{d-1})$$$. But the $$$\alpha_i$$$ are newly inserted: if this sum were not $$$0$$$, some $$$\alpha$$$ could be represented by the basis of $$$Ker(\mathcal{A}^{d-1})$$$, a contradiction.

so $$$\forall i,\ \mu_{i0}=0$$$, because the $$$\alpha_i$$$ are linearly independent. We are left with

$$$ \sum_{i,j\ge1} \mu_{ij}\mathcal{A}^j\alpha_i=0 $$$

now apply $$$\mathcal{A}^{d-2}$$$, and similarly $$$\forall i,\ \mu_{i1}=0$$$. The rest of the proof proceeds in the same way.

Next we want to insert the $$$C(\beta_i)$$$, but not all of them should be inserted.

At first I thought that if $$$\beta$$$ is linearly independent of all the $$$\mathcal{A}\alpha$$$, then $$$C(\beta)$$$ and the $$$C(\alpha)$$$ form a direct sum. But that was wrong.

consider the following example: $$$A=$$$

0 0 1 2 3
0 0 0 1 -1
0 0 0 1 2
0 0 0 0 0
0 0 0 0 0

$$$X_1$$$ is

1 0
0 1
0 0 
0 0
0 0

$$$X_2$$$ is

0 0
0 0
1 0
0 -2
0 1

$$$X_3$$$ is

0
0
0
0
1

now $$$\alpha$$$, $$$\mathcal{A}\alpha$$$, $$$\mathcal{A}^2\alpha$$$ are

0 3  2
0 -1 0
0 2  0
0 0  0
1 0  0

now choose in $$$X_2$$$ the vector $$$\beta =$$$

0
0
1
0
0

which is linearly independent of $$$\mathcal{A}\alpha$$$, but $$$\mathcal{A}\beta=$$$

1
0
0
0
0

is linearly dependent on $$$\mathcal{A}^2\alpha$$$ (it equals $$$\frac{1}{2}\mathcal{A}^2\alpha$$$), so $$$C(\alpha)+C(\beta)$$$ is not a direct sum.
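
For the record, finishing the example (my own addition, worked out from the data above): the other basis vector of $$$X_2$$$, $$$\beta'=(0,0,0,-2,1)^T$$$, has $$$\mathcal{A}\beta'=(-1,-3,0,0,0)^T$$$, which is linearly independent of $$$\mathcal{A}^2\alpha=(2,0,0,0,0)^T$$$. So the correct decomposition consists of the chain $$$\alpha,\ \mathcal{A}\alpha,\ \mathcal{A}^2\alpha$$$ and the chain $$$\beta',\ \mathcal{A}\beta'$$$: Jordan blocks of sizes $$$3$$$ and $$$2$$$.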

so it occurred to me that I should maintain a linearly independent set inside the kernel.

$$$ \sum_{i,j} \mu_{ij}\mathcal{A}^j\alpha_i+\sum_{i,j}\lambda_{ij}\mathcal{A}^j\beta_i=0 $$$

Applying $$$\mathcal{A}^{d-1}$$$ kills everything except $$$\sum_i\mu_{i0}\mathcal{A}^{d-1}\alpha_i$$$ (each $$$\beta_i$$$ already lies in $$$Ker(\mathcal{A}^{d-1})$$$), so as before $$$\forall i,\ \mu_{i0}=0$$$ and we are left with

$$$ \sum_{i,j\ge1} \mu_{ij}\mathcal{A}^j\alpha_i+\sum_{i,j}\lambda_{ij}\mathcal{A}^j\beta_i=0 $$$

now apply $$$\mathcal{A}^{d-2}$$$

$$$ \sum_{i} \mu_{i,1}\mathcal{A}^{d-1}\alpha_i+\sum_{i}\lambda_{i,0}\mathcal{A}^{d-2}\beta_i=0 $$$

Because the $$$\beta$$$ we choose to add here (not all of the $$$\beta$$$) satisfy that $$$last(\beta)$$$ is linearly independent of the $$$last(\alpha)$$$,

we get $$$\forall i,\ \mu_{i,1}=\lambda_{i,0}=0$$$. Continuing the process, we can prove that we obtain a direct sum of cyclic subspaces.

but is this sum the whole space $$$V$$$?

Note that originally $$$X_1, X_2, \dots$$$ together form a basis.

When we come to the $$$k$$$-th layer (corresponding to inserting $$$X_k$$$), we can view this algorithm as substituting some vectors in $$$X_k$$$ by the previously inserted vectors; for example, using $$$\mathcal{A}\alpha$$$ to swap out some $$$\beta$$$. Because originally $$$last(X_{d-1})$$$ is a linearly independent set of size $$$|X_{d-1}|=|last(X_{d-1})|$$$, the vectors kept at this layer also number $$$|last(X_{d-1})|$$$ (the size of a basis of a space is fixed).

So finally we get a set of size $$$|X_{d-1}|$$$ in place of $$$X_{d-1}$$$; for the other $$$X_k$$$ it is similar, so the dimensions add up to $$$\dim V$$$ and the direct sum is the whole space. $$$\square$$$

Theorem There is a basis under which the matrix is in Jordan form.

proof

For every eigenvalue, restrict to its root subspace and run the algorithm of Lemma 2 on the nilpotent operator $$$\mathcal{A}-\lambda\mathcal{I}$$$; the union of the resulting vectors is a Jordan basis.

Furthermore, it is easy to see that the number of Jordan blocks of each size is determined by the solution spaces of $$$(A-\lambda I)^kX=0$$$, hence by $$$rank(A-\lambda I)^k$$$, which is invariant under similarity.

So the Jordan form is a canonical form for the similarity relation. $$$\square$$$
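
To make the last remark concrete (a standard counting formula, stated here for reference; it is not needed in the proof): the number of Jordan blocks for $$$\lambda$$$ of size exactly $$$k$$$ equals

$$$ rank(A-\lambda I)^{k-1}-2\,rank(A-\lambda I)^{k}+rank(A-\lambda I)^{k+1}, $$$

since $$$rank(A-\lambda I)^{k-1}-rank(A-\lambda I)^{k}$$$ counts the blocks of size at least $$$k$$$.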

Corollary Jordan form and its transfer matrix can be computed in $$$O(n^4)$$$.

proof

Computing the solutions of $$$A^kX=0$$$ for all $$$k$$$ needs $$$Cn^4$$$ operations if implemented with the ordinary method.

Computing the chain of a vector $$$\alpha$$$ (the space $$$C(\alpha)$$$) needs $$$C\sum d_i\times n^2=Cn^3$$$ operations. Maintaining the kernel needs $$$C\sum d_i^3=O(n^3)$$$.

So the complexity is $$$O(n^4)$$$ $$$\square$$$

This is a clear and straightforward geometric method, and it is easy to implement compared with the algebraic method of $$$\lambda$$$-matrices. The proof of Lemma 2 differs from the classical textbook one because I wanted to prove the correctness of the algorithm.

There are better algorithms for computing the Jordan form; for example, "An $$$O(n^3)$$$ Algorithm for the Frobenius Normal Form" (Storjohann, 1998) gives an easy, faster way.

I did not find online a fast algorithm that computes the transfer matrix of the Jordan form, but with some reduction ideas it may be doable with better complexity.

The code below needs the eigenvalues as input.
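
Reading off main() below: the input is $$$m$$$, the $$$m\times m$$$ integer matrix, then $$$s$$$ (the number of distinct eigenvalues) followed by $$$s$$$ pairs (eigenvalue, algebraic multiplicity); the output is the transfer matrix $$$Q$$$ followed by $$$Q^{-1}AQ$$$. A made-up example input (not from the original):

3
2 1 0
0 2 0
0 0 3
2
2 2
3 1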

#include<bits/stdc++.h>
#define de fprintf(stderr,"on-%d\n",__LINE__)
#define pb push_back
#define fr(i,a,b) for(int i(a),end##i(b);i<=end##i;i++)
#define fd(i,a,b) for(int i(a),end##i(b);i>=end##i;i--)
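// rename the user's functions to avoid clashing with std::reduce / std::apply
// pulled in by bits/stdc++.h together with "using namespace std"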
#define reduce Reduce
#define apply Apply
using namespace std;
const int N=30;
int m;
struct frac{
	int q,p;
	frac(){
		q=0;p=1;
	}
	void out(){
		if(p==1)printf("%7d ",q);
		else printf("%3d/%3d ",q,p);
	}
};
frac zero;
frac make(int x,int y){
	frac A;
	A.q=x;A.p=y;
	return  A;	
}
frac r[N];
int d[N];
frac V(int x){
	return make(x,1); 
}
frac read(){
	int x;scanf("%d",&x);
	return V(x);	
}
int gcd(int x,int y){
	if(!x)return y;
	return gcd(y%x,x);
}
frac reduce(frac &A){
	int g=gcd(A.p,A.q);	
	A.p/=g;A.q/=g;
	if(A.p<0)A.p=-A.p,A.q=-A.q;
	return A;
}
frac reduce(frac A){
	int g=gcd(A.p,A.q);
	A.p/=g;A.q/=g;
	if(A.p<0)A.p=-A.p,A.q=-A.q;
	return A;
}
frac operator +(const frac &A,const frac &B){
	return reduce(make(A.q*B.p+B.q*A.p,A.p*B.p));	
}
frac operator -(const frac &A,const frac &B){
	return reduce(make(A.q*B.p-B.q*A.p,A.p*B.p));	
}
frac operator *(const frac &A,const frac &B){
	assert(A.p);
	return reduce(make(A.q*B.q,A.p*B.p));
}
frac operator /(const frac &A,const frac &B){
	return reduce(make(A.q*B.p,A.p*B.q));
}
struct matrix{
	frac A[N][N];
	frac* operator [](int x){return A[x];}
	void out(){
		fr(i,1,m){
			fr(j,1,m)A[i][j].out();
			printf("\n");
		}
	}
};
matrix I; 
void init(){
	fr(i,1,m)fr(j,1,m)I[i][j]=V(i==j);
}
matrix operator *(matrix A,matrix B){
	matrix C;
	fr(i,1,m)fr(j,1,m){
		C[i][j]=V(0);
		fr(k,1,m)C[i][j]=C[i][j]+A[i][k]*B[k][j];
	}
	return C;
}
matrix operator *(frac v,matrix A){
	fr(i,1,m)fr(j,1,m)A[i][j]=v*A[i][j];
	return A;
}
matrix operator +(matrix A,matrix B){
	fr(i,1,m)fr(j,1,m)A[i][j]=A[i][j]+B[i][j];
	return A;
}
matrix operator -(matrix A,matrix B){
	fr(i,1,m)fr(j,1,m)A[i][j]=A[i][j]-B[i][j];
	return A;
}
vector<frac> apply(matrix &A,const vector<frac>& B){
	vector<frac> C;C.resize(m+1);
	fr(i,1,m)fr(k,1,m){
		C[i]=C[i]+A[i][k]*B[k];	
	}
	return C;
}
bool p[N];
int to[N];
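// Gaussian elimination on rows 1..n, columns 1..m (the parameter m shadows the
// global m). p[c] marks pivot columns and to[i] is the pivot column of row i.
// With op=1 a row only eliminates the rows below it, so rows that were already
// reduced stay intact; this is used to reduce new rows against an existing set.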
void rowreduce(matrix &A,int n,int m,int op=0){
	fr(i,1,m)to[i]=p[i]=0;
	fr(i,1,n){
		int s=0;
		fr(j,1,m)if(!p[j]&&A[i][j].q){
			s=j;break;	
		}
		if(!s)continue;
		to[i]=s;
		p[s]=1;
		fr(j,op*i+1,n)if(j!=i&&A[j][s].q){
			frac t=A[j][s]/A[i][s];	
			fr(k,1,m)A[j][k]=A[j][k]-t*A[i][k];		
		}
	}
}
void swap(frac* A,frac *B){
	fr(i,1,m)swap(A[i],B[i]);	
}
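// Gauss-Jordan inversion; the assert fires if the matrix is singular.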
matrix inverse(matrix A){
	matrix B=I;
	fr(i,1,m){
		int s=0;
		fr(j,i,m)if(A[j][i].q){
			s=j;break;	
		}
		assert(s);
		swap(A[i],A[s]);swap(B[i],B[s]);
		const frac t=A[i][i];
		fr(j,1,m){
			A[i][j]=A[i][j]/t;
			B[i][j]=B[i][j]/t;
		}
		fr(j,1,m)if(j!=i&&A[j][i].q){
			const frac t=A[j][i];	
			fr(k,1,m){
				A[j][k]=A[j][k]-t*A[i][k];		
				B[j][k]=B[j][k]-t*B[i][k];		
			}
		}
	}
	return B;
}
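// Returns a basis of the null space of A; each vector is indexed 1..m.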
vector<vector<frac> > solve(matrix A){
	vector<vector<frac> >W;
	rowreduce(A,m,m,0);
	fr(i,1,m)if(!p[i]){
		vector<frac> T;T.resize(m+1);
		T[i]=V(1);
		fr(j,1,m)if(to[j]&&A[j][i].q){
			T[to[j]]=zero-A[j][i]/A[j][to[j]];	
		}
		W.push_back(T);
	}
	return W;
}
bool noempty(frac* A){
	fr(i,1,m)if(A[i].q)return 1;
	return 0;
}
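// Input: m, the m*m matrix A, then s followed by s pairs (eigenvalue, algebraic
// multiplicity). Output: the transfer matrix Q, then Q^{-1}AQ (the Jordan form).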
int main(){
#ifdef pig
	freopen("pig.in","r",stdin);
	freopen("pig.out","w",stdout);
#endif
	cin>>m;
	init();
	matrix A;
	fr(i,1,m)fr(j,1,m)A[i][j]=read();
	int s;cin>>s;
	fr(i,1,s){
		r[i]=read();
		scanf("%d",&d[i]);
	}
	matrix Q;int top=0;
	fr(i,1,s){
		matrix B=A-r[i]*I,C=B;
		vector<vector<vector<frac> > > V;
		vector<vector<frac> > P;
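		// Phase 1: for j = 1..d[i], V[j-1] collects the basis vectors of
		// Ker(B^j) that are new relative to Ker(B^{j-1}) -- the set X_j of Lemma 2.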
		fr(j,1,d[i]){
			vector<vector<frac> > W=solve(C);	
			matrix A;
			fr(k,0,P.size()-1){
				fr(l,1,m)A[k+1][l]=P[k][l];	
			}
			fr(k,0,W.size()-1){
				fr(l,1,m){
					A[k+P.size()+1][l]=W[k][l];
				}
			}
			rowreduce(A,P.size()+W.size(),m,1);
			vector<vector<frac> >T;
			fr(k,P.size()+1,W.size()+P.size())if(noempty(A[k])){
				vector<frac> TT;TT.resize(m+1);
				fr(l,1,m)TT[l]=A[k][l];
				T.pb(TT);	
			}
			V.pb(T);
			P=W;	
			C=C*B;
		}
		vector<vector<frac> >ans;
		matrix now;int tot=0;
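		// Phase 2: go from the top layer down. A chain head T from V[j] is kept
		// only if its last vector B^j T stays independent of the lasts already in
		// now; then the whole chain T, BT, ..., B^j T is appended to ans.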
		fd(j,V.size()-1,0){
			fr(k,0,V[j].size()-1){
				vector<frac> T; T.resize(m+1);
				fr(l,1,m){
					T[l]=V[j][k][l];
				}
				fr(l,0,j-1)T=apply(B,T);
				tot++;
				fr(l,1,m)now[tot][l]=T[l];	
			}

			rowreduce(now,tot,m,1);
			fr(k,tot-V[j].size()+1,tot)if(noempty(now[k])){
				vector<frac> T;T.resize(m+1);
				fr(l,1,m)T[l]=V[j][k-(tot-V[j].size()+1)][l];	
				fr(t,0,j){
					ans.pb(T);
					T=apply(B,T);
				}
			}
		}
		fr(j,0,ans.size()-1){
			top++;
			fr(k,1,m)Q[k][top]=ans[j][k];
		}
	}
	assert(top==m);
	Q.out();
	printf("\n");
	(inverse(Q)*A*Q).out();
	return 0;
}


By pigpigger, history, 19 months ago, In English

Problems from analysis

1 Mathematical Analysis (wsj), Exercise 5.41

$$$ p_n(x)=\frac{1}{2^nn!}\frac{d^n}{dx^n}(x^2-1)^n $$$

Prove: $$$p_n(1)=1$$$.

proof 1:

$$$ \frac{d^n}{dx^n}(x^2-1)^n=n!\sum_{k=1}^n\binom{n}{k}\binom{2k}{n}x^{2k-n}(-1)^{n-k} $$$

At $$$x=1$$$ it remains to show that $$$\sum_{k}\binom{n}{k}\binom{2k}{n}(-1)^{n-k}=2^n$$$:

$$$ \sum_{k}\binom{n}{k}\binom{2k}{n}(-1)^{n-k}\\ =\sum_{k,j}\binom{n}{k}\binom{k}{j}\binom{k}{n-j}(-1)^{n-k}\\ =\sum_{k,j}\binom{n}{j}\binom{n-j}{k-j}\binom{k}{n-j}(-1)^{n-k}\\ =\sum_{k,j,l}\binom{n}{j}\binom{n-j}{k-j}\binom{k-j}{l}\binom{j}{n-j-l}(-1)^{n-k}\\ =\sum_{k,j,l}\binom{n}{j}\binom{n-j}{l}\binom{n-j-l}{k-j-l}\binom{j}{n-j-l}(-1)^{n-k}\\ =\sum_{k,j,l}\binom{n}{j}\binom{n-j}{l}\binom{j}{n-j-l}(-1)^{n+j+l}(-1)^{k-j-l}\binom{n-j-l}{k-j-l}\\ =\sum_{j,l}\binom{n}{j}\binom{n-j}{l}\binom{j}{n-j-l}(-1)^{n+j+l}[j+l=n]\\ =\sum_{j}\binom{n}{j}\\ =2^n, $$$

so $$$p_n(1)=\frac{1}{2^nn!}\cdot n!\cdot 2^n=1$$$.
proof 2:

$$$ \sum_{k}\binom{n}{k}\binom{2k}{n}(-1)^{n-k}\\ =\sum_{k}\binom{k}{n-k}\binom{2k}{k}(-1)^{n-k}\\ =[x^n]\sum_{k}\binom{2k}{k}(1+x)^kx^k(-1)^{n-k}\qquad\left(\text{using }\sum_k\binom{2k}{k}x^k=\sqrt{\frac{1}{1-4x}}\right)\\ =[x^n]\sqrt{\frac{1}{1+4x(1+x)}}(-1)^n\\ =[x^n]\frac{1}{1+2x}(-1)^n\\ =2^n $$$

but finally it occurred to me that ......

$$$ \frac{1}{2^nn!}\frac{d^n}{dx^n}(x^2-1)^n=\frac{1}{2^nn!}\frac{d^n}{dx^n}\left((x+1)^n(x-1)^n\right) $$$

and apply the Leibniz formula: at $$$x=1$$$ every term that differentiates $$$(x-1)^n$$$ fewer than $$$n$$$ times vanishes, and the only surviving term is $$$\frac{1}{2^nn!}\binom{n}{0}(x+1)^n\big((x-1)^n\big)^{(n)}\Big|_{x=1}=\frac{1}{2^nn!}\cdot2^n\cdot n!=1$$$.

2

Prove:

$$$ \sum_k{\binom{n}{k}\frac{(-1)^k}{m+k+1}}=\frac{m!\,n!}{(m+n+1)!} $$$

I found there was no suitable tool for it

so I had to look into Concrete Mathematics -- Gosper!

(learned it on the spot)

proof 1:

$$$S(n)=\sum_k t(n,k)$$$ where $$$t(n,k)=\binom{n}{k}\frac{(-1)^k}{m+k+1}$$$. We look for

$$$ \hat{t}(n,k)=\beta_0(n)t(n,k)+\beta_1(n)t(n+1,k)=T(k+1)-T(k), $$$

where $$$T(k)$$$ has the form $$$T(k)=\frac{r(n,k)s(n,k)\hat{t}(n,k)}{p(n,k)}$$$.

Since $$$\frac{t(n+1,k)}{t(n,k)}=\frac{n+1}{n+1-k}$$$, let $$$p(n,k)=\beta_0(n)(n+1-k)+(n+1)\beta_1(n)$$$, so that $$$\hat{t}(n,k)=p(n,k)\frac{t(n,k)}{n+1-k}$$$. Write $$$\bar{t}(n,k)=\frac{\hat{t}(n,k)}{p(n,k)}$$$; then

$$$ \frac{\bar{t}(n,k+1)}{\bar{t}(n,k)}=\frac{\bar{p}(n,k+1)q(n,k)}{\bar{p}(n,k)r(n,k+1)}=\frac{-(n+1-k)(m+k+1)}{(k+1)(m+k+2)}, $$$

and we can take $$$\bar{p}(n,k)=1$$$, $$$q(n,k)=(n+1-k)(m+k+1)$$$, $$$r(n,k)=-k(m+k+1)$$$. Gosper's equation $$$p(n,k)=q(n,k)s(n,k+1)-r(n,k)s(n,k)$$$ becomes

$$$ (n+1-k)\beta_0(n)+(n+1)\beta_1(n)=(m+k+1)\big((n+1-k)s(n,k+1)+ks(n,k)\big). $$$

Considering the coefficient of $$$k$$$, the degree of $$$s$$$ must be $$$0$$$; hence $$$s(n,k)=1$$$, $$$\beta_0(n)=-n-1$$$, $$$\beta_1(n)=m+n+2$$$. Summing $$$\hat{t}(n,k)$$$ over all $$$k$$$ telescopes to $$$0$$$, so $$$-(n+1)S(n)+(m+n+2)S(n+1)=0$$$, i.e.

$$$ S(n+1)/S(n)=(n+1)/(m+n+2), $$$

which together with $$$S(0)=\frac{1}{m+1}$$$ gives the claimed value by induction.

proof 2

$$$ \Delta^n\left(\frac{1}{x}\right)=\Delta^n\left((x-1)^{\underline{-1}}\right)=(-1)^n\,n!\,(x-1)^{\underline{-n-1}}=(-1)^n\frac{n!}{x(x+1)\cdots(x+n)} $$$

on the other hand we can expand

$$$ \Delta^n=(E-1)^n=\sum_k\binom{n}{k}(-1)^{n-k}E^k, $$$

so $$$\Delta^n\left(\frac{1}{x}\right)=\sum_k\binom{n}{k}\frac{(-1)^{n-k}}{x+k}$$$; setting $$$x=m+1$$$ and comparing the two expressions gives the identity.

proof 3

Then I observed that the form is similar to an identity that had haunted me for more than a year. It is about the connection between the recurrence and the inclusion-exclusion expression for Stirling numbers of the second kind:

$$$ \sum_{i=0}^m \frac{\binom{m}{i}(-1)^{m-i}}{1-ix}=\frac{x^m \, m!}{\prod_{i=1}^m(1-ix)} $$$

The left-hand side comes from inclusion-exclusion; the right-hand side is obtained from the recurrence.
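
Concretely (standard facts about Stirling numbers of the second kind, spelled out here for reference): the recurrence $$$S(n,m)=S(n-1,m-1)+mS(n-1,m)$$$ gives the generating function on the right, while inclusion-exclusion on surjections gives the sum on the left:

$$$ \sum_{n\ge m}S(n,m)x^n=\frac{x^m}{\prod_{i=1}^m(1-ix)},\qquad S(n,m)=\frac{1}{m!}\sum_{i=0}^m\binom{m}{i}(-1)^{m-i}i^n, $$$

and extracting $$$[x^n]$$$ from both sides of the displayed identity (using $$$\frac{1}{1-ix}=\sum_n i^nx^n$$$) recovers exactly these two expressions for $$$m!\,S(n,m)$$$.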

it is easy(?) to see that this is equivalent to the problem,

thus we get a combinatorial demonstration

ps: whether $$$x$$$ is an integer is not important, since a polynomial that vanishes at infinitely many points vanishes identically.

proof 4

but in fact we can consider starting from the right-hand side: partial fraction expansion.

$$$ F(x)=\frac{x!}{(x+n+1)!}=\frac{1}{(x+n+1)(x+n)\cdots(x+1)} $$$

convert it into partial fractions:

$$$ F(x)=\sum_{k=1}^{n+1}\frac{b_k}{x+k},\qquad b_k=\lim_{x\to-k}(x+k)F(x)=\frac{(-1)^{k-1}}{(n+1-k)!\,(k-1)!},\qquad F(x)=\frac{1}{n!}\sum_{k=1}^{n+1}\binom{n}{k-1}\frac{(-1)^{k-1}}{x+k} $$$

then set $$$x=m$$$ and shift the index $$$k\to k+1$$$; multiplying by $$$n!$$$ gives the identity.

proof 5

the real solution:

$$$ \sum_k\binom{n}{k}\frac{(-1)^k}{m+k+1}\\ =\sum_k\binom{n}{k}(-1)^k\int_0^1{x^{m+k}}\,dx\\ =\int_0^1x^m(1-x)^n\,dx $$$

Then integration by parts plus a recurrence finishes it: $$$\int_0^1x^m(1-x)^n\,dx=\frac{n}{m+1}\int_0^1x^{m+1}(1-x)^{n-1}\,dx=\dots=\frac{m!\,n!}{(m+n+1)!}$$$.
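
For what it's worth, a quick brute-force check of the identity for small parameters (my own sketch, not part of the original solutions; long double is exact enough at this scale):

#include <cstdio>
#include <cmath>

// Check sum_k C(n,k)(-1)^k/(m+k+1) == m!n!/(m+n+1)! for 0 <= n, m <= 8.
int main() {
	for (int n = 0; n <= 8; n++)
		for (int m = 0; m <= 8; m++) {
			long double lhs = 0, C = 1; // C = binom(n,k), updated as k grows
			for (int k = 0; k <= n; k++) {
				lhs += (k % 2 ? -C : C) / (m + k + 1);
				C = C * (n - k) / (k + 1);
			}
			long double rhs = 1; // m!n!/(m+n+1)! built incrementally
			for (int j = 1; j <= n; j++) rhs *= (long double)j / (m + j);
			rhs /= m + n + 1;
			if (fabsl(lhs - rhs) > 1e-12L) { printf("mismatch n=%d m=%d\n", n, m); return 1; }
		}
	puts("identity verified for 0 <= n, m <= 8");
	return 0;
}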
