Lyndon words, Duval factorization, necklace and bracelet enumeration, and De Bruijn sequences in pure Python.
Project description
lyndon-words
Combinatorics on words in pure Python with zero dependencies: Lyndon-word testing, Duval's Chen-Fox-Lyndon factorization, necklace and bracelet enumeration, and De Bruijn sequences.
What is this?
A Lyndon word is a non-empty word that is strictly smaller than all of its rotations. Lyndon words are the building blocks of combinatorics on words: every word factors uniquely into a non-increasing concatenation of Lyndon words (the Chen-Fox-Lyndon theorem), and Lyndon words index the necklaces (rotation classes) and the De Bruijn sequences over an alphabet.
This library implements the classic algorithms directly, with no dependencies and strict typing.
Install
pip install lyndon-words
PyPI release pending. Install from source:
git clone https://github.com/amaar-mc/lyndon-words cd lyndon-words pip install -e .
Quick start
from lyndon_words import is_lyndon, factorize, enumerate_necklaces, enumerate_bracelets, de_bruijn, lyndon_array
# Lyndon-word test (accepts str, list, or tuple)
is_lyndon("aab") # True
is_lyndon("aba") # False
is_lyndon((0, 0, 1)) # True
# Duval's Chen-Fox-Lyndon factorization (non-increasing Lyndon factors)
factorize("banana") # ['b', 'an', 'an', 'a']
factorize("aababab") # ['aabab', 'ab']
# Necklaces: rotation-class representatives over {0, 1} of length 3
list(enumerate_necklaces(alphabet_size=2, length=3))
# [(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)]
# Bracelets: rotation + reflection classes over {0, 1} of length 4
list(enumerate_bracelets(alphabet_size=2, length=4))
# [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 1), (1, 1, 1, 1)]
# De Bruijn sequence B(2, 3): every length-3 binary word appears once cyclically
de_bruijn(alphabet_size=2, length=3)
# (0, 0, 0, 1, 0, 1, 1, 1)
# Lyndon array: longest Lyndon prefix length at each position
lyndon_array("banana") # [1, 2, 1, 2, 1, 1]
lyndon_array("abcd") # [4, 3, 2, 1] (entire increasing suffix is Lyndon)
lyndon_array("aaaa") # [1, 1, 1, 1] (all equal, no run > 1 is Lyndon)
lyndon_array("0010011") # [7, 2, 1, 4, 3, 1, 1]
Word representation
Words over an integer alphabet are represented as tuple[int, ...], for example
(0, 0, 1) for a 3-symbol word over {0, 1}. This is the canonical type, and every
enumeration and generation function (enumerate_necklaces, enumerate_bracelets,
de_bruijn) yields or returns tuples of ints.
The word predicates is_lyndon and factorize are generic over any comparable
Sequence, so str, list, and tuple all work. The factors returned by factorize
have the same type as the input: pass a str and you get a list of str factors, pass a
tuple and you get a list of tuple factors. String examples like "aab" behave exactly
like integer tuples because string comparison is lexicographic.
API
Enumeration and generation functions take keyword-only alphabet_size and length.
| Function | Description |
|---|---|
is_lyndon(word) |
True iff word is non-empty and strictly smaller than all proper rotations |
factorize(word) |
Duval's O(n) Chen-Fox-Lyndon factorization into non-increasing Lyndon factors |
lyndon_array(word) |
For each index i, the length of the longest Lyndon prefix of word[i:] |
enumerate_necklaces(*, alphabet_size, length) |
FKM enumeration of rotation classes, lexicographic order |
enumerate_bracelets(*, alphabet_size, length) |
Enumeration of rotation + reflection classes, lexicographic order |
de_bruijn(*, alphabet_size, length) |
De Bruijn sequence B(alphabet_size, length) via FKM |
alphabet_size >= 1 and length >= 1 are required; otherwise a ValueError is raised.
Definitions
Lyndon word. A non-empty word strictly smaller, in lexicographic order, than every one of its proper rotations (equivalently, every proper suffix). Lyndon words are aperiodic.
Chen-Fox-Lyndon factorization. Every non-empty word factors uniquely into Lyndon words
l_1 >= l_2 >= ... >= l_m. Duval's algorithm computes this in O(n) time and O(1) extra
space.
Necklace. A rotation-equivalence class of length-n words. The representative is the
lexicographically least rotation. The count is
(1/n) * sum over d dividing n of phi(d) * k^(n/d).
Bracelet. An equivalence class under rotation and reflection (the dihedral group). The representative is the lexicographically least element of the orbit.
Lyndon array. For a word s of length n, the Lyndon array is the integer array
L where L[i] is the length of the longest Lyndon word that is a prefix of s[i:].
Every entry satisfies 1 <= L[i] <= n - i. The first entry L[0] equals the length of
the first (leftmost) factor in the Chen-Fox-Lyndon factorization of s: that factor is
the unique longest Lyndon prefix of the entire word.
De Bruijn sequence. A cyclic sequence B(k, n) of length k^n in which every
length-n word appears exactly once as a contiguous cyclic subword. The FKM construction
concatenates, in lexicographic order, every Lyndon word whose length divides n.
References
- Duval, J.-P. (1983). Factorizing words over an ordered alphabet. Journal of Algorithms.
- Chen, K.-T., Fox, R. H., Lyndon, R. C. (1958). Free differential calculus IV. Annals of Mathematics.
- Fredricksen, H., Maiorana, J. (1978). Necklaces of beads in k colors and k-ary de Bruijn sequences. Discrete Mathematics.
- Fredricksen, H., Kessler, I. J. (1986). An algorithm for generating necklaces of beads in two colors. Discrete Mathematics.
- Ruskey, F. (2003). Combinatorial Generation. University of Victoria.
License
MIT. Copyright (c) 2026 Amaar Chughtai.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file lyndon_words-0.2.0.tar.gz.
File metadata
- Download URL: lyndon_words-0.2.0.tar.gz
- Upload date:
- Size: 880.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f11e967a3cb250542fa69ead6aed64247c1ef7c2d6870b7f6abf5063b97742ec
|
|
| MD5 |
b223e8209d65635342405f0af5fadd45
|
|
| BLAKE2b-256 |
5f91b58a2f43b891dcb61b4accda31efd7c104c1972346121aacdf2c3911a104
|
File details
Details for the file lyndon_words-0.2.0-py3-none-any.whl.
File metadata
- Download URL: lyndon_words-0.2.0-py3-none-any.whl
- Upload date:
- Size: 11.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6ece1746a620eb34fb85df09766d89691c44b19ddf99524c64a49af259578049
|
|
| MD5 |
5902a52d423e93ab5906f3a515c18a43
|
|
| BLAKE2b-256 |
be3e8c9181dce23b9638c5aa400da7e1855b5a2c81093c36cddd6392f7680a60
|