Questions tagged [lyndon-words]

For questions related to Lyndon words, strings that are strictly smaller in order than all of their roations. Consider using with the [combinatorics] tag.

A Lyndon word is a nonempty string that is strictly smaller in lexicographic order than all of its rotations.

Several equivalent definitions exist.

A $k$-ary Lyndon word of length $n > 0$ is an $n$-character string over an alphabet of size $k$, and which is the unique minimum element in the lexicographical ordering in the multiset of all its rotations. Being the singularly smallest rotation implies that a Lyndon word differs from any of its non-trivial rotations, and is therefore aperiodic.

Alternately, a word $w$ is a Lyndon word if and only if it is nonempty and lexicographically strictly smaller than any of its proper suffixes, that is $w < v$ for all nonempty words $v$ such that $w = uv$ and $u$ is nonempty.

Another characterisation is the following: A Lyndon word has the property that it is nonempty and, whenever it is split into two nonempty substrings, the left substring is always lexicographically less than the right substring. That is, if $w$ is a Lyndon word, and $w = uv$ is any factorization into two substrings, with $u$ and $v$ understood to be non-empty, then $u < v$. This definition implies that a string $w$ of length $ ≥ 2$ is a Lyndon word if and only if there exist Lyndon words $u$ and $v$ such that $u < v$ and $w = uv$. Although there may be more than one choice of $u$ and $v$ with this property, there is a particular choice, called the standard factorization, in which $v$ is as long as possible.

The Lyndon words over the two-symbol binary alphabet $\{0,1\}$, sorted by length and then lexicographically within each length class, form an infinite sequence that begins

$$0, 1, 01, 001, 011, 0001, 0011, 0111, 00001, 00011, 00101, 00111, 01011, 01111, ...$$

Source: Wikipedia

13 questions