Table of content
Teleporter | Description |
---|---|
I. Lyndon Definitions | Definitions of Lyndon word, Lyndon factorization, ... |
II. Duval Algorithm | Talk about the duval algorithm and how it works |
III. Real Life Applications | Motivation to learn the algorithm |
IV. Programming Applications | The code is short and simple to implement |
V. My Questions | Are there any other applications ? |
................................................................... | .......................................................................................................................... |
I. Lyndon Factorization
1) String Concatenation
Definition: The operation of joining character strings end-to-end
Property I: Non-empty string S is a concatenation of all substrings of itself
Property II: Non-empty string S is a concatenation of any empty string at any position in itself with itself
Property III: Let A,B,C the non-empty string then A+B+C=A+(B+C)=(A+B)+C
For convention, let define the operator + is string concatenation
2) Lyndon Word
Definition: A nonempty string that is strictly smaller in lexicographic order than all of its rotations.
Property I: Lyndon word is nonempty and lexicographically strictly smaller than any of its proper suffixes.
Property II: Let S,T,Z is nonempty word. S is Lyndon word if S<T ∀ S=Z+T
Property III: Lyndon word is MLCS — Minimal Lexicographical Circular Substring or ( LMSR — Lexicographically Minimal String Rotation ) of itself.
3) Lyndon Factorization
Definition: Split the string into many lyndon words in such a way that the words in the sequence are nonincreasing lexicographically
Property I: For s=s1+s2+⋯+sk where si is nonempty shortest-able string and in decreasing order s1≥s2≥⋯≥sk
Property II: Lyndon factorization is unique.
Property III: The last Lyndon Factor is Lexicographically Smallest Suffix of the string
4) Least Starting Position (LSP)
Definition: The minimal position of some substrings that make it LMSR.
Property I: Let X the substring of S that its starting position p is LSP. Then some Lyndon Factors in X has the LSP p
Property II: K-repeated String, where each substring has size L then there are K LSP: 0,L,2L,…,(K−1)L
Property III: The Circular String start from LSP of given string is Lexicographically Minimal String Rotation
II. Duval Algorithm
Exist Time: 1983
Duval algorithm is an effecient algorithm for listing the Lyndon words of length at most n with a given alphabet size s in lexicographic order in O(n)
The position of the last Lyndon factorized word from Duval algorithm provides minimum cyclic string
III. Real Life Applications
1) Finger print identification:
- We can encode the finger print into many detailed circular strings. How to search such finger print again from those in very huge data base ? Circular comparision using lyndon factorization is requried.
2) Biological genetics:
- In some cases, we need to know whether these two group's genetics are belonged to the same bacteria, virus.
3) Games:
- Well, ofcourse we can apply the algorithm in some language/words-related games
IV. Programming Applications
1) Least rotation to make smallest lexicographical ordered string.
The problem:
Given a string S of size N
A right-rotation is that move the leftmost character to rightmost of S
Find the least right-rotation to make S become the smallest lexicographical ordered string
Important Property: After those right-rotations, the string will be Minimum Acyclic String
The solution:
One thing we can come in mind is to use hash or string algorithms in O(n log n), but they are kinda complex
I will make a new blog about all other approachs
- Bruteforces Solution: Let t=s+s. Then for each substring of size |s|, we compare and find the smallest
- Duval Solution: We can apply lyndon factorization with a little bit observation
Practices Problem:
V. My Question
1) Are there any other programming applications for Lyndon Factorization ?
- The algorithm is simple to code while running pretty fast as its low complexities. It would be kinda sad if there is only one main application, isnt it :(
2) Are there any other problems for Lyndon Factorization ?
- To remember something better and understand the algorithm deeper, we need to practice right :D It would be nice if there are some more problems