I'm going to talking about an approach which can replace matrix multiplication in some problems. It is not only easier to implement, but also easier to adjust your code.

It is hard to write both long and detailed blog. Therefore, you can comment anything which you didn't understand well. I will reply (or update this blog if it is necessary).

## I. Background

Matrix multiplication is useful. In many counting problems, when *n* is small, we can use DP to solve. However, when *n* is large (*n* ≈ 10^{9}), we must use matrix multiplication to increase solution's speed. In solutions using matrix multiplication, generating base matrix is not easy at all. I found an good approach to solve some of those problems without matrix multiplication.

There are several advantages of my approach:

- We don't need to implement multiply operator between two matrix (
*A*·*B*) and power operator (*A*^{k}). - We don't need to spend time to generate base matrix

Besides, there are following disadvantages:

- This approach is only applicable in counting problems, it means this approach can't replace matrix multiplication in all of problems.
- I'm not sure if this approach are usable in all of counting problems (which can be solved using matrix multiplication)

## II. The simplest example

For example, I will use this problem:

How many right bracket sequence which has length *n* and depth is not larger than *L*? (*N* ≤ 10^{9}, *L* ≤ 10)

If *n* = 4 and *L* = 1, ()() is only right bracket sequence, but (()), (((), and ))(( is not.

Let's think about some other common ways before talking about main approach.

Firstly, because of large *n*, we can use matrix multiplication to use this problem. *F[i]* is an array of *L* elements, where *F[i][k]=x* means there are *x* sequence length *i* and current height is *k*. As I said above, the hardest step is to generating base matrix. Amazingly, my approach can solve this problem, and solution size is less than 30 lines.

Let's think about DP approach. It is easy to find formula *f*(*n*, *h*) = *f*(*n* - 1, *h* - 1) + *f*(*n* - 1, *h* + 1) that *f*(*n*, *h*) means number of valid sequences with *n* is the length of remaining sequence, *h* is current height. The goal is calculate *f*(*n*, 0). Of course, time complexity in this case is too large. My approach combine two approaches above.

Now, I will talk about my approach.

Let *f*(*n*, *h*, *h*0) is number of sequence length *n*, start at height *h* (current height), and end at height *h*0.

There are two case: *n* = 2 * *k* and *n* = 2 * *k* + 1.

- If
*n*= 0 then return 1 if*h*=*h*0, 0 otherwise - If
*n*is even, we have:*f*(2 **k*,*h*,*h*0) = sum of all*f*(*k*,*h*,*i*) **f*(*k*,*i*,*h*0) where i in range 0..*L* - If
*n*is odd:*f*(2 **k*+ 1,*h*,*h*0) =*f*(2 **k*,*h*- 1,*h*0) +*f*(2 **k*,*h*+ 1,*h*0) - Additionally, we need to pay attention at following case: if
*h*< 0 or*h*>*L*then*f*= 0 - The goal is output
*f*(*n*, 0, 0).

The complexity of this approach is , equally to solution using matrix multiplication. Note that there are only states, not *O*(*L*^{2} * *n*). For example, *n* = 100, values *n* in all of states will be in the set {100, 50, 25, 24, 12, 6, 3, 2, 1, 0}. Therefore, we can use index of *n* in above set, in other words, the depth of current state, to represent *n* in states instead of *n*.

Pseudo-code

```
def f(n, h, h0, Depth):
if h<0 or h>L: return 0
if n==0: return (h==h0 ? 1 : 0)
if Saved[h][h0][Depth]==true: return Value[h][h0][Depth]
if n is even:
Result =0
for i in 0..L: Result += f(n/2, h, i, Depth+1) * f(n/2, i, h0, Depth+1)
else:
Result = f(n-1, h-1, h0, Depth+1) + f(n-1, h+1, h0, Depth+1)
Saved[h][h0][Depth] = true
Value[h][h0][Depth] = Result
read n, L (from input)
output f(n, 0, 0, 0)
```

To explain how can Depth represent n, let n=100 for example, for each values of n, we have a value of Depth:

n 100 25 24 12 6 3 2 1 0 Depth 0 1 2 3 4 5 6 7 8

## III. General case: when f(n, [a,b,c,...]) can be calculated from f(n-1, [a',b',c',...])

In this case, if you can still use matrix multiplication, you will have a big difficulty in generating base matrix. Consider following problem (I don't know where it is from):

There are *t* kind of flowers. In these *t* kinds, there are 4 kinds of flowers: *gerbera*, *orchid*, *azalea* and *hydrangea*. We use them to create a sequence with *n* pots of flowers. There are several conditions:

- A
*hydrangea*must be put between an*azalea*and an*orchid**(aho*or*oha)* - There are at least
*p*flower different from*gerbera*between any pairs of*gerbera*. Our goal is to find number of valid sequences.

Suppose that there are t=5 kinds of flowers: *azaleas* (a), *hydrangeas* (h), *orchids* (o) *gerbera* (g) and *begonias* (b) and between two gerbera, there must be at least *p=3* flowers different from gerbera. With *n=6*, there are 2906 valid sequences, 5 of them are: *aoaaoo*, *ahohag*, *gbbbgo*, *gbbbog*, *bbbbbb*. Following sequences are invalid: *ohoaha* (substring *“aha”* is invalid because it should be *“oha”* or *“aho”*), *gogbao* (because there are not 3 flowers between *g* and *g* ), *ahohah* (because the last *h* is not adjacent with *a* and *o* ).

Constrains: *n* ≤ 10^{9}, *p* ≤ 20, *t* ≤ 20

It is not to hard to find its DP formula: *f*(*n*, *x*, *Just*) returns number of valid sequence. Its parameters, *n*, *x*, *Just*, are described as follow:

*n*is length of remaining sequence which we need to build.*x*is number of pots different from*gerbera*that I have just put them, in other words, we have just put*x*pots of flowers different from*gerbera*(you can imagine that all pots in range*n+1*..*n+x*are not gerbera).*Just*represents the last pot of flower (the last pot which we have just put, in other words,*Just*represents pot number*n*+ 1),*Just*can be one of three following values,*Just*= 1 means*azalea*or*orchid*,*Just*= 2 means*hydrangea*,*Just*= 0 in other cases (included*gerbera*and*t*- 4 remaining kind of flowers)

The following code is DP function, it will work with *n* ≤ 10000:

```
long f(int n, int x, int Just) {
if (x>=p) x=p;
if (Just==2) {
if (n==0) return 0;
return f(n-1, x+1, 1);
} else {
if (n==0) return 1;
if (F[x][Just].count(n)) return F[x][Just][n];
long Sum = f(n-1, x+1, 1) * 2;
if (Just==1) Sum += f(n-1, x+1, 2);
if (x>=p) Sum += f(n-1, 0, 0);
Sum += f(n-1, x+1, 0) * (t-4);
return F[x][Just][n] = Sum % M;
}
}
cout << f(n, ::p, 0) << endl;
```

Now, I will talk about correct solution. To be able to solve with *n* up to 10^{9}, I use my above approach. Now we have *f*(*n*, *p*, *Just*, *p*0, *Just*0) means we are start from state (*n*, *p*, *Just*), how many way to go to state (0, *p*0, *Just*0).

To prevent misunderstanding, I will explain more about *Stop* parameter. Whenever *Stop* = *true*, *f*(*n*, *p*, *Just*, *p*0, *Just*, *Stop*) = *f*(*n*, *p*, *Just*). Whenever *Stop* = *false*, f(n, p, Just, p0, Just, Stop) = f(n, p, Just, p0, Just)$

We have two case: *n* = 2 * *k* and *n* = 2 * *k* + 1. If *n* = 2 * *k* + 1 or *n* = 0, we implement it as well as old DP-function. Else if *n* = 2 * *k*, *f*(2 * *k*, *p*, *Just*, *p*0, *Just*0) = sum of all *f*(*k*, *p*, *Just*, *i*, *j*) * *f*(*k*, *i*, *j*, *p*0, *Just*0) for all valid pairs of *i*, *j* (*i* in range 0..: : *p*, *j* in range 0..2)

Pay attention in case *n* = 0, *n* = 0 doesn't represent the end of the sequence. Because our sequence are broke into many segments, *n* = 0 only represents an end of a part. Therefore, I use additional parameter *Stop* typed *boolean*.

```
map<int, int> G[21][3][21][3][2];
#define C p][Just][p0][Just0][Stop
long g(int n, int p, int Just, int p0, int Just0, bool Stop) {
if (p>=::p) p=::p;
if (n%2==1 || n==0) {
if (Just==2) {
if (n==0) return Stop ? 0 : p==p0 && Just==Just0;
return g(n-1, p+1, 1, p0, Just0, Stop);
} else {
if (n==0) return Stop ? 1 : p==p0 && Just==Just0;
if (G[C].count(n)) return G[C][n];
long Sum = g(n-1, p+1, 1, p0, Just0, Stop) * 2;
if (Just==1) Sum += g(n-1, p+1, 2, p0, Just0, Stop);
if (p>=::p) Sum += g(n-1, 0, 0, p0, Just0, Stop);
Sum += g(n-1, p+1, 0, p0, Just0, Stop) * (t-4);
return G[C][n] = Sum % M;
}
} else {
if (G[C].count(n)) return G[C][n];
long Sum = 0;
for (int i=0; i<=::p; i++)
for (int k=0; k<=2; k++) {
long G1 = g(n/2, p, Just, i, k, false);
long G2 = g(n/2, i, k, p0, Just0, Stop);
Sum += G1*G2;
}
return G[C][n] = Sum % M;
}
}
cout << g(n, ::p, 0, rand()%21, rand()%3, true) << endl;
```

Note that in my above code, `::p`

and `p`

are different. `::p`

is the `p`

in the input (how many flowers different from *gerbera* between any two pairs of *gerbera*), `p`

is parameter of function `g`

.

*rand()%21*, and *rand%3* mean that those values are not important (whenever *Stop==true*, *p0* and *Just0* are not important)

The complexity of above solution is . In fact, we can avoid using map to reduce the complexity. If we do, it becomes , equal to the complexity of using matrix multiplication. (I used map for easier readability)

## IV. f(n) = f(n-1) + f(n-2)

Now, let's calculate 10^9-th number in fibonacci sequence (in some modulo). Use matrix multiplication in this problem is easy, however, we have another way to solve this problem without using matrix multiplication. Consider following example:

You are standing at position n in Ox axis. In a step, you can move to the left 1 or 2. How many ways to reach position 0?

It is not hard to realize *f*(*n*) = *f*(*n* - 1) + *f*(*n* - 2) with *f*(0) = 1 and *f*(1) = 1. Therefore *f*(*n*) is *(n+1)-th* element in fibonacci sequence.

There are two cases:

- n=2*k: we have two choices: first choice is to jump from 2*k to k and jump to 0, another choice is to jump from n to k+1, move 2 step left, and jump from k-1 to 0. Therefore, f(2*k) = f(k)*f(k) + f(k-1) * f(k-1)
- n=2*k+1: consider two segments 0..k and k..n. There are two choices: first choice is to jump from n to k and jump from k to 0, another choice is to jump from n to k+1, move 2 steps to the left, and jump from k-1 to 0. Therefore, f(2*k+1) = f(k)*f(k+1) + f(k-1)*f(k).

If we stop at this point, time complexity is too large, consider case n=100:

- Depth 0: call f(100)
- Depth 1: call f(49), f(50)
- Depth 2: f(23), f(24), f(25)
- Depth 3: 10, 11, 12, 13
- …

Time complexity is now still ~~O(n)~~ O(log n)

We will reduce complexity as following. Let MinDepth[i] be the smallest n in Depth i, for example, array MinDepth[] in above example is {100, 49, 23, 10, …}. Now we will calculate f(25) using f(23) and f(24). Because MinDepth[2] is 23, and 25-23>=2, therefore, f(25) = f(23) + f(24).

Pseudo-code

```
MinDepth[] = {oo,oo,...}
def f(n, Depth):
if n==0 or n==1: return 1
if n-MinDepth[Depth]>=2:
return f(n-1, Depth) + f(n-2, Depth)
if Saved[n]: return Value[n]
MinDepth[Depth] = min(MinDepth[Depth], n)
if n is even:
Count1 = f(n/2-1, Depth+1) * f(n/2-1, Depth+1)
Count2 = f(n/2, Depth+1) * f(n/2, Depth+1)
Result = Count1 + Count2
else:
Count1 = f(n/2-1, Depth+1) * f(n/2, Depth+1)
Count2 = f(n/2, Depth+1) * f(n/2+1, Depth+1)
Result = Count1 + Count2
Saved[n] = true
Value[n] = Result
input n
output f(n,0)
```

- Depth 0: call f(100)
- Depth 1: call f(49), f(50)
- Depth 2: f(23), f(24), f(25)=f(23)+f(24)
- Depth 3: 10, 11, f(12)=f(10)+f(11)
- …

Time complexity now is *O*(*logn*)

## V. Conclusion

It is hard to write both long and detailed blog. Therefore, you can comment anything which you didn't understand well. I will reply (or update this blog if it is necessary).

Thanks your blog :D

Nice stuff. I think in example IV the line

`MinDepth[Depth] = min(MinDepth[Depth], Depth)`

should be changed to`MinDepth[Depth] = min(MinDepth[Depth], n)`

thanks, you are right

EDIT: updated

In III, in the definition the DP formula: f(n, x, Just), I don't understand what did you mean with

there was x pots of flowers different from gerberaDoes It means that there are x pots of flowers different of gerbera at the end of the n pots of flowers, or does it has another meaning?Thank you. I have just updated my blog.

x the function means that pots in range n+1 .. n+x is different from gerbera.

Nice article, although I believe it is just a more intuitive reduction of the matrix multiplication approach. I will try to explain myself by using the last example, with an alternative approach:

Let's say we call a method

find_fib(i) a method that returnsfib(i) andfib(i- 1). Now, we have one base case (i= 1) and two recursion cases:i= 2j, for somej: we have thatfib(i) =fib(j) *fib(j) +fib(j- 1) *fib(j- 1)i= 2j- 1, for somej: we have thatfib(i) =fib(j) *fib(j- 1) +fib(j- 1) *fib(j- 1). So, we can compute the pair foriby just a recursive call ofj=i/ 2 (in caseiis even, foriodd we can call the method fori- 1 and work from there on). This isverysimilar to the recursive logarithmic exponentiation.However, an alternative approach will be the "binary" iterative version of the exponentiation. For that, we will have to be able to compute <

fib(a+b),fib(a+b- 1) > from <fib(a),fib(a- 1) > and <fib(b),fib(b- 1) > . And you can, in fact, by using the same reasoning from above.But, let's think about this: what are we doing here? Well, we are actually simulating the multiplication matrix! Not the matrix itself though, but the different powers of it! It's like instead of holding the matrix, we just find the rules that let us compute a "hashed" version of it (e.g., instead of holding , we hold

w_{n}=A^{n}*vand we "invent" the addition on it:w_{a + b}as a function ofw_{a}andw_{b}.I think this approach is worth noting, as I think you can easily get away with solving recurrences like this without "exposing" the whole matrix and get a (better than cubing in the depth of the recurrence) solution for certain problems with sparse matrices, although I'm not sure about it.

Actually, generating the base matrix is not hard at all if you have done some linear algebra before or at the very least worked with matrix operations a bit.