Unsupervised syllable segmentation and evaluation toolkit for speech audio
findsylls
Unsupervised syllable(-like) segmentation & evaluation toolkit for speech audio. Extract amplitude / modulation envelopes, segment into candidate syllables, and (optionally) evaluate versus Praat TextGrid annotations at nuclei, syllable boundary/span, and word boundary/span levels.
Features
- Pluggable amplitude envelope front-ends: RMS, Hilbert, low-pass, spectral band subtraction (SBS), gammatone, theta oscillator.
- Peak & valley segmentation (extensible: hooks for future algorithms like Mermelstein, oscillator-based).
- Robust TextGrid parsing for phones, syllables, words with vowel / syllabic consonant filtering.
- Multi-level evaluation metrics (TP / Ins / Del / Sub; precision/recall/F1/TER aggregation helpers).
- Batch pipeline utilities + fuzzy filename matching (`.wav` ↔ `.TextGrid`).
- Optional plotting layer for qualitative inspection.
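To make the peak and valley idea from the feature list concrete, here is a minimal NumPy sketch. It is not the toolkit's own implementation: the function name `peaks_and_valleys_sketch`, the `min_height` parameter, and the returned `(start, peak, end)` triples are all assumptions made for illustration. Each envelope peak is treated as a candidate nucleus, and the lowest point between neighbouring peaks is treated as the boundary.

```python
import numpy as np

def peaks_and_valleys_sketch(envelope, times, min_height=0.1):
    """Return (start, peak, end) time triples, one per envelope peak.

    Hypothetical helper for illustration; not findsylls' own algorithm.
    """
    env = np.asarray(envelope, dtype=float)
    t = np.asarray(times, dtype=float)
    # Interior samples that rise above the left neighbour and do not fall below the right one.
    interior = np.arange(1, len(env) - 1)
    is_peak = (env[interior] > env[interior - 1]) & (env[interior] >= env[interior + 1])
    # Discard peaks lower than a fraction of the global maximum.
    peaks = interior[is_peak & (env[interior] >= min_height * env.max())]
    if len(peaks) == 0:
        return []
    # Valleys: the two signal edges plus the minimum between consecutive peaks.
    valleys = [0]
    for left, right in zip(peaks[:-1], peaks[1:]):
        valleys.append(int(left) + int(np.argmin(env[left:right + 1])))
    valleys.append(len(env) - 1)
    return [(float(t[valleys[i]]), float(t[p]), float(t[valleys[i + 1]]))
            for i, p in enumerate(peaks)]
```

A real front-end would usually smooth the envelope first and enforce a minimum distance between peaks; both are left out here to keep the sketch short.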
Install
# Core (from PyPI)
pip install findsylls
# Local clone (editable development)
pip install -e .[dev]
# With plotting / visualization extras
pip install 'findsylls[viz]'
# Install from Git (exact tag)
pip install 'git+https://github.com/hjvm/findsylls.git@v0.1.1'
Quick Start
from findsylls import segment_audio
sylls, env, t = segment_audio("example.wav", envelope_fn="sbs", segment_fn="peaks_and_valleys")
print(sylls[:5])
Batch evaluation:
from findsylls import run_evaluation
results = run_evaluation(
textgrid_paths="data/**/*.TextGrid",
wav_paths="data/**/*.wav",
phone_tier=1,
syllable_tier=2,
word_tier=3,
envelope_fn="hilbert",
)
print(results.head())
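The batch call above relies on pairing each WAV file with its TextGrid. How findsylls does this internally is not documented here, so the following is only a sketch of one plausible approach using the standard library's `difflib`; the helper name `pair_by_stem` and the `cutoff` value are assumptions.

```python
import difflib
from pathlib import Path

def pair_by_stem(wav_paths, textgrid_paths, cutoff=0.8):
    """Pair each WAV with the TextGrid whose file stem is most similar.

    Hypothetical helper; findsylls' own matching rules may differ.
    """
    tg_by_stem = {Path(p).stem.lower(): p for p in textgrid_paths}
    pairs = []
    for wav in wav_paths:
        stem = Path(wav).stem.lower()
        # An identical stem scores 1.0 and wins; otherwise the closest stem
        # above the cutoff is used, and WAVs with no close stem are skipped.
        match = difflib.get_close_matches(stem, list(tg_by_stem), n=1, cutoff=cutoff)
        if match:
            pairs.append((wav, tg_by_stem[match[0]]))
    return pairs
```

Unmatched files are dropped silently in this sketch; a production pipeline would probably want to log them.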
Aggregate:
from findsylls import aggregate_results
summary = aggregate_results(results, dataset_name="MyCorpus")
print(summary)
CLI
After install:
findsylls segment input.wav --envelope sbs --method peaks_and_valleys --out sylls.json
findsylls evaluate "data/**/*.wav" "data/**/*.TextGrid" --phone-tier 1 --syllable-tier 2 --word-tier 3 --envelope hilbert --out results.csv
Show help:
findsylls --help
findsylls segment --help
findsylls evaluate --help
API Surface
| Function | Purpose |
|---|---|
| `segment_audio` | One-file end-to-end (load → envelope → segment). |
| `run_evaluation` | Batch match WAV/TextGrid and compute metrics. |
| `get_amplitude_envelope` | Compute envelope via a registered method. |
| `segment_envelope` | Dispatch segmentation algorithm. |
| `flatten_results` / `aggregate_results` | Reshape & aggregate evaluation outputs. |
| `plot_segmentation_result` | Multi-panel qualitative plot (optional). |
Adding Methods
- Envelope: implement `compute_*` returning `(env, times)` in `envelope/` and register in `envelope/dispatch.py`.
- Segmentation: implement `segment_<name>(envelope, times, **kwargs)` in `segmentation/` and add a branch in `segmentation/dispatch.py`.
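As an illustration of the envelope contract, the sketch below returns an `(env, times)` pair from a frame-wise RMS computation. The function name `compute_rms_sketch` and its arguments (`audio`, `sr`, `frame_length`, `hop_length`) are assumptions; the signature that `envelope/dispatch.py` actually expects should be checked against the source before registering anything.

```python
import numpy as np

def compute_rms_sketch(audio, sr, frame_length=0.025, hop_length=0.010):
    """Frame-wise RMS envelope; returns (env, times) with times in seconds.

    Argument names are assumptions, not findsylls' actual signature.
    """
    audio = np.asarray(audio, dtype=float)
    frame = max(1, int(round(frame_length * sr)))  # samples per frame
    hop = max(1, int(round(hop_length * sr)))      # samples between frame starts
    n_frames = 1 + max(0, (len(audio) - frame) // hop)
    starts = np.arange(n_frames) * hop
    env = np.array([np.sqrt(np.mean(audio[s:s + frame] ** 2)) for s in starts])
    times = (starts + frame / 2) / sr              # frame centres
    return env, times
```

Returning frame centres keeps `times` aligned with what each envelope value summarizes, which matters when peaks are later compared against annotated nuclei.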
TextGrid Tier Indexing
Indices are 0-based (as provided by the textgrid library). Pass None to skip a tier or -1 for placeholder syllable generation (currently returns empty list).
Evaluation Conventions
- Default tolerance = 0.05 s.
- `EVAL_METHODS` ordering drives flatten/aggregate loops; include new metric keys there if extending.
- Substitutions matter for span metrics; remain zero for nuclei/boundary F1 semantics.
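The role of the tolerance can be shown with a small point-event matcher. This is a sketch under stated assumptions, not the toolkit's evaluation code: the function name `match_events`, the greedy one-to-one matching strategy, and the returned dictionary keys are illustrative. Substitutions are not computed, consistent with their being zero for nuclei and boundary scoring.

```python
def match_events(pred, ref, tolerance=0.05):
    """Greedy one-to-one matching of predicted to reference times (seconds).

    Hypothetical helper; findsylls' own matching may differ.
    """
    pred, ref = sorted(pred), sorted(ref)
    used = set()
    tp = 0
    for r in ref:
        # Closest still-unused prediction inside the tolerance window, if any.
        candidates = [(abs(p - r), i) for i, p in enumerate(pred)
                      if i not in used and abs(p - r) <= tolerance]
        if candidates:
            used.add(min(candidates)[1])
            tp += 1
    ins = len(pred) - tp   # predictions with no reference partner
    dels = len(ref) - tp   # references with no matching prediction
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"TP": tp, "Ins": ins, "Del": dels,
            "precision": precision, "recall": recall, "F1": f1}
```

For example, predictions at 0.10, 0.52 and 0.90 s scored against references at 0.12, 0.50 and 0.70 s give two true positives, one insertion and one deletion.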
Performance Notes
- Audio loading prefers `torchaudio` when present (install separately), else falls back to `soundfile`/`librosa`.
- Envelope computation is vectorized (NumPy); the `theta` method includes a gammatone filterbank, which can be slower. Use SBS or Hilbert for faster prototyping.
- For large corpora, pre-sample or cap duration (see example notebook) to iterate quickly.
- Parallelization: the current API processes files sequentially; external parallel mapping (e.g., `joblib` or `multiprocessing`) around `segment_audio` is safe if you don't mutate globals.
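External parallel mapping can be as small as the wrapper below. Note the substitution: it uses the standard library's `ThreadPoolExecutor` as a stand-in for the `joblib` or `multiprocessing` options mentioned above, and the helper name `map_files` is hypothetical. For CPU-bound envelopes a process pool may scale better.

```python
from concurrent.futures import ThreadPoolExecutor

def map_files(fn, paths, max_workers=4):
    """Apply fn to every path concurrently; results come back in input order.

    Hypothetical helper; fn must not mutate shared global state.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, paths))
```

With findsylls installed, a call such as `map_files(lambda p: segment_audio(p, envelope_fn="sbs", segment_fn="peaks_and_valleys"), wav_paths)` would segment each file with the same settings as the Quick Start example.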
FAQ
Why are boundary/spans columns missing? If a tier index is None or produces no intervals, those metrics are skipped intentionally.
How do I add my own envelope? Implement a function returning (envelope, times) and register it in envelope/dispatch.py.
Can I stream long recordings? Not yet; current design assumes full in‑memory arrays. A streaming envelope interface is on the roadmap.
Why do I get 0 TP for nuclei? Likely vowel set mismatch; confirm phone tier labels are standard (ARPABET or simple vowels) and consider adjusting SYLLABIC.
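A quick way to diagnose a vowel-set mismatch is to list which phone labels would count as syllabic. The sketch below is illustrative only: `SYLLABIC_EXAMPLE` is a hypothetical stand-in containing the standard ARPABET vowels, not the contents of the toolkit's `SYLLABIC`, and `syllabic_coverage` is an assumed helper name.

```python
import re

# Hypothetical stand-in for the toolkit's SYLLABIC set; the real contents may differ.
SYLLABIC_EXAMPLE = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER", "EY",
                    "IH", "IY", "OW", "OY", "UH", "UW"}

def syllabic_coverage(labels, syllabic=SYLLABIC_EXAMPLE):
    """Return (matched, unmatched) label sets after stripping ARPABET stress digits."""
    matched, unmatched = set(), set()
    for label in labels:
        # Normalize case and drop trailing stress markers such as the 1 in IY1.
        base = re.sub(r"\d+$", "", label.strip().upper())
        if not base:
            continue  # skip empty or silence marks
        (matched if base in syllabic else unmatched).add(base)
    return matched, unmatched
```

If `matched` comes back empty for a phone tier, the labels probably use a different inventory (for example IPA), and the syllabic set needs adjusting before nuclei can score any true positives.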
Roadmap / TODO
- Implement `generate_syllable_intervals` (placeholder now).
- Additional segmentation algorithms (Mermelstein, oscillator-based).
- More robust CLI progress + JSON schema for outputs.
- Optional streaming / large-file handling.
Legacy Code
The previous exploratory/monolithic implementations are retained under a legacy/ folder (formerly old/ and findsylls_old/) for reference only. They are excluded from distribution and not supported; prefer the public API described above.
License
MIT. See LICENSE.
Citation
If you use this software in academic work, please cite the repository until a formal paper/preprint is available. A CITATION.cff file will appear in a future release.
Plain text:
Vázquez Martínez, Héctor Javier. (2025). findsylls: Unsupervised syllable segmentation & evaluation toolkit (Version 0.1.1) [Computer software]. https://github.com/hjvm/findsylls
BibTeX:
@software{findsylls,
author = {Vázquez Martínez, Héctor Javier},
title = {findsylls: Unsupervised syllable segmentation & evaluation toolkit},
year = {2025},
version = {0.1.1},
url = {https://github.com/hjvm/findsylls},
license = {MIT}
}
For development guidelines see .github/copilot-instructions.md.