Computing Jordan canonical form
Definition 1 The root subspace for \lambda is W_\lambda=\{\alpha\mid \exists k,\ (\mathcal{A}-\lambda\mathcal{I})^k\alpha=0\}.
Lemma 1 The sum of different root subspaces is a direct sum.
proof
Let f(x) be the factor of the characteristic polynomial corresponding to \lambda_1, and let g(x) be the remaining part.
Now f(x) is an annihilator for W_{\lambda_1} because the whole characteristic polynomial is an annihilator, but g(x) cannot annihilate W_{\lambda_1} (consider the diagonal of the matrices).
We have (f(x),g(x))=1. Now we claim that g(\mathcal{A}) is invertible on W_{\lambda_1}: there exists u(x) with u(x)g(x)\equiv 1 \pmod{f(x)},
so u(\mathcal{A})g(\mathcal{A})=\mathcal{I} on W_{\lambda_1}.
To prove the sum W_{\lambda_1}\oplus W_{\lambda_2}\oplus\dots\oplus W_{\lambda_t} is direct, we only need to prove that the vector 0 has only one representation.
Say 0=u_1+u_2+\dots+u_t with u_i\in W_{\lambda_i}. Applying g(\mathcal{A}) to both sides yields g(\mathcal{A})u_1=0, because g annihilates each W_{\lambda_i} with i\ge 2. Then applying u(\mathcal{A}) gives u_1=0, and the same argument works for every u_i. \square
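(A small worked instance, added for concreteness: if the characteristic polynomial is (x-1)^2(x-2) and \lambda_1=1, then f(x)=(x-1)^2, g(x)=x-2, and u(x)=-x works, since -x(x-2)=1-(x-1)^2\equiv 1\pmod{f(x)}.)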
Definition 2 The cyclic subspace C(\alpha) is the subspace generated by \alpha,\ \mathcal{A}\alpha,\ \mathcal{A}^2\alpha,\dots, and we denote the last nonzero term of this sequence by last(\alpha).
(For nilpotent \mathcal{A}, it is easy to prove that \alpha,\ \mathcal{A}\alpha,\dots,last(\alpha) are linearly independent.)
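As a quick standalone illustration of Definition 2 (a minimal sketch with a made-up 3\times 3 nilpotent block, separate from the full program at the end), computing the chain \alpha,\ \mathcal{A}\alpha,\dots,last(\alpha) is just repeated matrix-vector multiplication until the product vanishes:

#include <bits/stdc++.h>
using namespace std;
typedef vector<long long> Vec;
typedef vector<Vec> Mat;

// y = A*x
Vec mul(const Mat &A, const Vec &x) {
    int n = A.size();
    Vec y(n, 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) y[i] += A[i][j] * x[j];
    return y;
}
bool isZero(const Vec &x) {
    for (long long v : x) if (v) return false;
    return true;
}
int main() {
    Mat A = {{0,1,0},{0,0,1},{0,0,0}}; // one nilpotent Jordan block, illustration only
    Vec alpha = {0,0,1};
    vector<Vec> chain; // alpha, A*alpha, A^2*alpha, ...; chain.back() is last(alpha)
    for (Vec v = alpha; !isZero(v); v = mul(A, v)) chain.push_back(v);
    for (const Vec &v : chain) { for (long long t : v) printf("%lld ", t); puts(""); }
    // prints 0 0 1 / 0 1 0 / 1 0 0: a chain of length 3, so C(alpha) is the whole space
}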
Lemma 2 If \mathcal{A} is nilpotent, the whole space is the direct sum of some cyclic subspaces.
proof
We give a constructive and algorithmic proof.
Let the degree of the minimal polynomial be d.
First start with solving AX=0, then A^2X=0, \dots, A^dX=0. The solution spaces are expanding. Denote by X_i the set of basis vectors newly inserted after solving the i-th equation.
Denote the vectors in X_d by \alpha_1,\alpha_2,\dots,\alpha_k, those in X_{d-1} by \beta_1,\beta_2,\dots,\beta_l, and those in X_{d-2} by \gamma_1,\gamma_2,\dots,\gamma_m.
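A minimal sketch of this first step (illustrative only, using doubles rather than the exact fractions of the full program below): the jumps in \dim Ker\,A^k=n-rank(A^k) give the sizes |X_k|. The matrix here is the 5\times 5 example used later in this note.

#include <bits/stdc++.h>
using namespace std;
typedef vector<vector<double> > Mat;

// rank by plain Gaussian elimination (doubles are fine for this toy example)
int rankOf(Mat A) {
    int n = A.size(), r = 0;
    for (int c = 0; c < n && r < n; c++) {
        int piv = -1;
        for (int i = r; i < n; i++) if (fabs(A[i][c]) > 1e-9) { piv = i; break; }
        if (piv < 0) continue;
        swap(A[piv], A[r]);
        for (int i = 0; i < n; i++) if (i != r && fabs(A[i][c]) > 1e-9) {
            double t = A[i][c] / A[r][c];
            for (int j = 0; j < n; j++) A[i][j] -= t * A[r][j];
        }
        r++;
    }
    return r;
}
Mat mul(const Mat &A, const Mat &B) {
    int n = A.size();
    Mat C(n, vector<double>(n, 0));
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++) C[i][j] += A[i][k] * B[k][j];
    return C;
}
int main() {
    Mat A = {{0,0,1,2,3},{0,0,0,1,-1},{0,0,0,1,2},{0,0,0,0,0},{0,0,0,0,0}};
    Mat P = A;
    int n = A.size(), prev = 0;
    for (int k = 1; k <= 3; k++) {
        int dim = n - rankOf(P);
        printf("dim Ker A^%d = %d, |X_%d| = %d\n", k, dim, k, dim - prev);
        prev = dim;
        P = mul(P, A);
    }
    // prints dims 2, 4, 5, so |X_1|=2, |X_2|=2, |X_3|=1, matching the example below
}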
First we claim that the sum C(\alpha_1)\oplus C(\alpha_2)\oplus\dots\oplus C(\alpha_k) is direct. Suppose \sum_i\sum_j \mu_{ij}\mathcal{A}^j\alpha_i=0.
Applying \mathcal{A}^{d-1} to both sides yields \sum_i \mu_{i0}\mathcal{A}^{d-1}\alpha_i=0.
Now \sum_i \mu_{i0}\alpha_i
is in Ker(\mathcal{A}^{d-1}), but the \alpha_i are newly inserted: if this vector were not 0, some \alpha_i could be represented by the basis of Ker(\mathcal{A}^{d-1}) together with the other \alpha's, a contradiction.
So \forall i,\ \mu_{i0}=0, because the \alpha_i are linearly independent.
Now apply \mathcal{A}^{d-2}; similarly \forall i,\ \mu_{i1}=0. The remaining steps are the same.
Next we want to insert the C(\beta_i), but not all of them should be inserted.
At first I thought that if \beta is linearly independent from all the \mathcal{A}\alpha, then the sum of C(\beta) and the C(\alpha) is direct, but that was wrong.
Consider the following example:
A=\begin{pmatrix}0&0&1&2&3\\0&0&0&1&-1\\0&0&0&1&2\\0&0&0&0&0\\0&0&0&0&0\end{pmatrix}
X_1 is \{(1,0,0,0,0)^T,\ (0,1,0,0,0)^T\},
X_2 is \{(0,0,1,0,0)^T,\ (0,0,0,-2,1)^T\},
X_3 is \{(0,0,0,0,1)^T\}.
Now \alpha=(0,0,0,0,1)^T,\ \mathcal{A}\alpha=(3,-1,2,0,0)^T,\ \mathcal{A}^2\alpha=(2,0,0,0,0)^T.
Now choose in X_2 the vector \beta=(0,0,1,0,0)^T, which is linearly independent from \mathcal{A}\alpha, but \mathcal{A}\beta=(1,0,0,0,0)^T is linearly dependent with \mathcal{A}^2\alpha=(2,0,0,0,0)^T, so the sum of C(\alpha) and C(\beta) is not direct.
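A quick numeric check of this counterexample (illustration only):

#include <bits/stdc++.h>
using namespace std;
int main() {
    int A[5][5] = {{0,0,1,2,3},{0,0,0,1,-1},{0,0,0,1,2},{0,0,0,0,0},{0,0,0,0,0}};
    int beta[5] = {0,0,1,0,0}, Abeta[5], A2alpha[5] = {2,0,0,0,0};
    for (int i = 0; i < 5; i++) {
        Abeta[i] = 0;
        for (int j = 0; j < 5; j++) Abeta[i] += A[i][j] * beta[j];
    }
    // A*beta = (1,0,0,0,0)^T, exactly (1/2)*A^2*alpha, so the two are dependent
    for (int i = 0; i < 5; i++) assert(2 * Abeta[i] == A2alpha[i]);
    puts("A*beta is parallel to A^2*alpha: C(alpha)+C(beta) is not direct");
}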
So it occurred to me that I should maintain a linearly independent set of last() vectors inside the kernel.
Suppose \sum_{i,j}\mu_{ij}\mathcal{A}^j\alpha_i+\sum_{i,j}\lambda_{ij}\mathcal{A}^j\beta_i=0. Applying \mathcal{A}^{d-1} we have \sum_i\mu_{i0}\mathcal{A}^{d-1}\alpha_i=0, so \forall i,\ \mu_{i0}=0 as before.
Now applying \mathcal{A}^{d-2} yields \sum_i\mu_{i1}\mathcal{A}^{d-1}\alpha_i+\sum_i\lambda_{i0}\mathcal{A}^{d-2}\beta_i=0, a linear combination of the vectors last(\alpha_i) and last(\beta_i).
Because the \beta we choose to add here (not all of the \beta) satisfy that last(\beta) is linearly independent from the last(\alpha),
we get \forall i,\ \mu_{i,1}=\lambda_{i,0}=0. Continuing the process, we can prove that we get a direct sum of cyclic subspaces.
But is the sum the complete space V?
Note that originally X_1, X_2, \dots together form a basis.
When we come to the k-th layer (corresponding to inserting X_k), we can view the algorithm as substituting some vectors of X_k with the previously inserted vectors: for example, using \mathcal{A}\alpha to replace some \beta. Because last(X_{d-1}) is originally a linearly independent set of size |X_{d-1}|=|last(X_{d-1})|, the vectors kept in the answer set at this layer also number |last(X_{d-1})| (the size of a basis of a given space is fixed).
So finally we get a set of size |X_{d-1}| in place of X_{d-1}, and for X_k it is similar. \square
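In counting terms (a standard identity, stated here only to summarize the argument): the number of chains of length at least k equals \dim Ker\,\mathcal{A}^k-\dim Ker\,\mathcal{A}^{k-1}=|X_k|, so the chains together contain \sum_k |X_k|=\dim V vectors, which is why the answer set is a basis of V.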
Theorem There is a basis under which the matrix is in Jordan form.
proof
For every eigenvalue, run the algorithm of Lemma 2 on its root subspace. The union of the resulting vectors is a Jordan basis.
Furthermore, it is easy to see that the number of Jordan blocks of each size is determined by the solution spaces of (A-\lambda I)^kX=0, so it is determined by rank\,(A-\lambda I)^k, which is invariant under similarity.
So the Jordan form is a canonical form for similarity. \square
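The rank characterization can be made explicit (a standard identity, added for reference): for eigenvalue \lambda, the number of Jordan blocks of size at least k is rank(A-\lambda I)^{k-1}-rank(A-\lambda I)^k, so the number of blocks of size exactly k is rank(A-\lambda I)^{k-1}-2\,rank(A-\lambda I)^k+rank(A-\lambda I)^{k+1}.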
Corollary Jordan form and its transfer matrix can be computed in O(n^4).
proof
Computing the solutions of A^kX=0 for k=1,\dots,d needs Cn^4 operations if implemented with the ordinary elimination method.
Computing the chain of a vector \alpha (the space C(\alpha)) needs C\sum d_i\times n^2=Cn^3 operations, and maintaining the kernel needs C\sum d_i^3=O(n^3).
So the complexity is O(n^4) \square
This method is a clear and straightforward geometric method, and it is easy to implement compared to the algebraic method of \lambda-matrices. The proof of Lemma 2 differs from the classical textbook proofs because I want to prove the correctness of the algorithm.
There are better algorithms for computing the Jordan form. For example, "An O(n^3) Algorithm for the Frobenius Normal Form" (Storjohann, 1998) gives an easy, faster way.
I did not find a fast algorithm online that computes the transfer matrix of the Jordan form, but with some reduction ideas it may be doable with better complexity.
The code below needs the eigenvalues as input: it reads m, the m\times m matrix, then the number of distinct eigenvalues s, and each eigenvalue together with its multiplicity.
#include<bits/stdc++.h>
#define de fprintf(stderr,"on-%d\n",__LINE__)
#define pb push_back
#define fr(i,a,b) for(int i(a),end##i(b);i<=end##i;i++)
#define fd(i,a,b) for(int i(a),end##i(b);i>=end##i;i--)
#define reduce Reduce
#define apply Apply
using namespace std;
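// Input: m, an m*m integer matrix A, then s and the s pairs (eigenvalue r[i], multiplicity d[i]).
// Output: the transfer matrix Q, then Q^{-1}AQ, which is in Jordan form
// (each chain is stored as alpha, B*alpha, ..., last(alpha), so the 1s appear on the subdiagonal).
// reduce/apply are renamed by the macros above to avoid clashing with std::reduce/std::apply.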
const int N=30;
int m;
struct frac{
int q,p;
frac(){
q=0;p=1;
}
void out(){
if(p==1)printf("%7d ",q);
else printf("%3d/%3d ",q,p);
}
};
frac zero;
frac make(int x,int y){
frac A;
A.q=x;A.p=y;
return A;
}
frac r[N];
int d[N];
frac V(int x){
return make(x,1);
}
frac read(){
int x;scanf("%d",&x);
return V(x);
}
int gcd(int x,int y){
if(!x)return y;
return gcd(y%x,x);
}
// reduce a fraction to lowest terms, keeping the denominator positive
frac reduce(frac A){
int g=gcd(A.p,A.q);
A.p/=g;A.q/=g;
if(A.p<0)A.p=-A.p,A.q=-A.q;
return A;
}
frac operator +(const frac &A,const frac &B){
return reduce(make(A.q*B.p+B.q*A.p,A.p*B.p));
}
frac operator -(const frac &A,const frac &B){
return reduce(make(A.q*B.p-B.q*A.p,A.p*B.p));
}
frac operator *(const frac &A,const frac &B){
assert(A.p);
return reduce(make(A.q*B.q,A.p*B.p));
}
frac operator /(const frac &A,const frac &B){
assert(B.q);// dividing by a zero fraction would silently corrupt the denominator
return reduce(make(A.q*B.p,A.p*B.q));
}
struct matrix{
frac A[N][N];
frac* operator [](int x){return A[x];}
void out(){
fr(i,1,m){
fr(j,1,m)A[i][j].out();
printf("\n");
}
}
};
matrix I;
void init(){
fr(i,1,m)fr(j,1,m)I[i][j]=V(i==j);
}
matrix operator *(matrix A,matrix B){
matrix C;
fr(i,1,m)fr(j,1,m){
C[i][j]=V(0);
fr(k,1,m)C[i][j]=C[i][j]+A[i][k]*B[k][j];
}
return C;
}
matrix operator *(frac v,matrix A){
fr(i,1,m)fr(j,1,m)A[i][j]=v*A[i][j];
return A;
}
matrix operator +(matrix A,matrix B){
fr(i,1,m)fr(j,1,m)A[i][j]=A[i][j]+B[i][j];
return A;
}
matrix operator -(matrix A,matrix B){
fr(i,1,m)fr(j,1,m)A[i][j]=A[i][j]-B[i][j];
return A;
}
vector<frac> apply(matrix &A,const vector<frac>& B){
vector<frac> C;C.resize(m+1);
fr(i,1,m)fr(k,1,m){
C[i]=C[i]+A[i][k]*B[k];
}
return C;
}
bool p[N];
int to[N];
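// Gaussian elimination on rows 1..n, columns 1..m; p[] marks pivot columns and
// to[i] is the pivot column of row i. With op=1 only rows below i are eliminated,
// so rows that are already independent are left untouched.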
void rowreduce(matrix &A,int n,int m,int op=0){
fr(i,1,m)to[i]=p[i]=0;
fr(i,1,n){
int s=0;
fr(j,1,m)if(!p[j]&&A[i][j].q){
s=j;break;
}
if(!s)continue;
to[i]=s;
p[s]=1;
fr(j,op*i+1,n)if(j!=i&&A[j][s].q){
frac t=A[j][s]/A[i][s];
fr(k,1,m)A[j][k]=A[j][k]-t*A[i][k];
}
}
}
void swap(frac* A,frac *B){
fr(i,1,m)swap(A[i],B[i]);
}
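// Gauss-Jordan inversion via an augmented identity; asserts that A is invertible.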
matrix inverse(matrix A){
matrix B=I;
fr(i,1,m){
int s=0;
fr(j,i,m)if(A[j][i].q){
s=j;break;
}
assert(s);
swap(A[i],A[s]);swap(B[i],B[s]);
const frac t=A[i][i];
fr(j,1,m){
A[i][j]=A[i][j]/t;
B[i][j]=B[i][j]/t;
}
fr(j,1,m)if(j!=i&&A[j][i].q){
const frac t=A[j][i];
fr(k,1,m){
A[j][k]=A[j][k]-t*A[i][k];
B[j][k]=B[j][k]-t*B[i][k];
}
}
}
return B;
}
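// returns a basis of the solution space of AX=0 (the kernel of A), one vector per free column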
vector<vector<frac> > solve(matrix A){
vector<vector<frac> >W;
rowreduce(A,m,m,0);
fr(i,1,m)if(!p[i]){
vector<frac> T;T.resize(m+1);
T[i]=V(1);
fr(j,1,m)if(to[j]&&A[j][i].q){
T[to[j]]=zero-A[j][i]/A[j][to[j]];
}
W.push_back(T);
}
return W;
}
bool noempty(frac* A){
fr(i,1,m)if(A[i].q)return 1;
return 0;
}
int main(){
#ifdef pig
freopen("pig.in","r",stdin);
freopen("pig.out","w",stdout);
#endif
cin>>m;
init();
matrix A;
fr(i,1,m)fr(j,1,m)A[i][j]=read();
int s;cin>>s;
fr(i,1,s){
r[i]=read();
scanf("%d",&d[i]);
}
matrix Q;int top=0;
fr(i,1,s){
matrix B=A-r[i]*I,C=B;
vector<vector<vector<frac> > > V;
vector<vector<frac> > P;
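// Phase 1: for j=1..d[i], W is a kernel basis of B^j; the rows of W that are
// independent of the previous kernel P (checked by stacking P over W in a scratch
// matrix and row-reducing) form the layer V[j-1], i.e. the newly inserted X_j of Lemma 2.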
fr(j,1,d[i]){
vector<vector<frac> > W=solve(C);
matrix A;
fr(k,0,(int)P.size()-1){// (int) cast: P is empty on the first pass
fr(l,1,m)A[k+1][l]=P[k][l];
}
fr(k,0,W.size()-1){
fr(l,1,m){
A[k+P.size()+1][l]=W[k][l];
}
}
rowreduce(A,P.size()+W.size(),m,1);
vector<vector<frac> >T;
fr(k,P.size()+1,W.size()+P.size())if(noempty(A[k])){
vector<frac> TT;TT.resize(m+1);
fr(l,1,m)TT[l]=A[k][l];
T.pb(TT);
}
V.pb(T);
P=W;
C=C*B;
}
vector<vector<frac> >ans;
matrix now;int tot=0;
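// Phase 2: from the deepest layer down, keep a candidate v only if its last vector
// B^j v is independent of the last vectors already collected in `now`;
// then emit the whole chain v, Bv, ..., B^j v into the Jordan basis.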
fd(j,V.size()-1,0){
fr(k,0,(int)V[j].size()-1){// (int) cast: a layer may be empty
vector<frac> T; T.resize(m+1);
fr(l,1,m){
T[l]=V[j][k][l];
}
fr(l,0,j-1)T=apply(B,T);
tot++;
fr(l,1,m)now[tot][l]=T[l];
}
rowreduce(now,tot,m,1);
fr(k,tot-V[j].size()+1,tot)if(noempty(now[k])){
vector<frac> T;T.resize(m+1);
fr(l,1,m)T[l]=V[j][k-(tot-V[j].size()+1)][l];
fr(t,0,j){
ans.pb(T);
T=apply(B,T);
}
}
}
fr(j,0,ans.size()-1){
top++;
fr(k,1,m)Q[k][top]=ans[j][k];
}
}
assert(top==m);
Q.out();
printf("\n");
(inverse(Q)*A*Q).out();
return 0;
}