A modular, dialect-aware Zomi syllabification library with rule-based and CRF backends.
📦 zomi‑syl
zomi-syl provides a production‑ready syllabifier for Zomi with support for multiple dialects and multiple backends, built on a clean, extensible architecture. It includes:
- A fast rule‑based syllabifier
- A statistical CRF syllabifier
- A unified API
- A full CLI
- A backend registry
- Benchmarking tools
- Dialect profiles
- A clean, documented developer workflow
🚀 Features
- Multiple backends: rule‑based, CRF, transformer‑ready
- Dialect‑aware syllabification (csy [Siyin], ctd [Tedim], gnb [Gangte], kmm [Kom Rem], pck [Paite], vap [Vaiphei], smt [Simte], tcz [Thado/Thadou], zom [Zo/Zou], Mate, Thangkhal, Zolai Standard, Myanmar Zomi, India Zomi)
- Unified API (zs.syllabify(), zs.analyze())
- Full CLI (zomi-syl syllabify, zomi-syl models benchmark, zomi-syl models compare)
- Benchmarking & evaluation tools
- Extensible backend architecture
- Clean developer documentation
📦 Installation
pip install zomi-syl
🧠 Quick Start
Syllabify a word
zomi-syl syllabify itna
Analyze a word
zomi-syl analyze itna --json
Batch syllabify
zomi-syl batch words.txt --output out.txt
🧰 Python API
import zomi_syl as zs
zs.syllabify("itna")
zs.analyze("itna")
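For scripted batch work, the same call can be applied line by line. The sketch below is only an illustration: `syllabify_lines` is a hypothetical helper, not part of zomi-syl, and it takes the syllabifier as a parameter so that it makes no assumption about what `zs.syllabify()` returns. The stand-in `fake_syllabify` exists only so the example runs without the package installed.

```python
from typing import Callable, Iterable, Iterator, Tuple


def syllabify_lines(
    lines: Iterable[str],
    syllabify: Callable[[str], object],
) -> Iterator[Tuple[str, object]]:
    """Yield (word, result) pairs, skipping blank lines.

    Hypothetical helper: `syllabify` can be any callable, for example
    `zs.syllabify` after `import zomi_syl as zs`.
    """
    for line in lines:
        word = line.strip()
        if word:
            yield word, syllabify(word)


def fake_syllabify(word: str) -> list:
    # Stand-in only: returns the word unsplit. This is NOT Zomi syllabification.
    return [word]


pairs = list(syllabify_lines(["itna\n", "\n", "  itna  "], fake_syllabify))
print(pairs)
```

With the package installed, passing `zs.syllabify` in place of `fake_syllabify` should be the only change needed, assuming it accepts a single word as shown above.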
🧩 Backends
zomi-syl supports multiple backends through a unified registry:
- rule — deterministic rule‑based syllabifier
- crf — statistical CRF syllabifier
- transformer — placeholder for future transformer models
List available backends:
zomi-syl models list
Show backend metadata:
zomi-syl models info crf
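To make the idea of a backend registry concrete, here is a minimal sketch of the general pattern: a dictionary that maps a name to a factory, filled in by a decorator. All names in it (`register_backend`, `get_backend`, `list_backends`, the `echo` backend) are hypothetical and chosen for illustration; the actual zomi-syl registry may be organized differently.

```python
from typing import Callable, Dict, List

Backend = Callable[[str], List[str]]

# Hypothetical registry: maps a backend name to a factory that builds it.
_BACKENDS: Dict[str, Callable[[], Backend]] = {}


def register_backend(name: str):
    """Decorator that records a backend factory under `name`."""
    def decorator(factory: Callable[[], Backend]):
        if name in _BACKENDS:
            raise ValueError(f"backend {name!r} is already registered")
        _BACKENDS[name] = factory
        return factory
    return decorator


def get_backend(name: str) -> Backend:
    """Build the backend registered under `name`."""
    try:
        factory = _BACKENDS[name]
    except KeyError:
        known = ", ".join(sorted(_BACKENDS)) or "none"
        raise KeyError(f"unknown backend {name!r} (known: {known})") from None
    return factory()


def list_backends() -> List[str]:
    """Return the registered backend names in sorted order."""
    return sorted(_BACKENDS)


@register_backend("echo")
def make_echo_backend() -> Backend:
    # Placeholder backend: returns the word as a single unit.
    return lambda word: [word]


print(list_backends())
print(get_backend("echo")("itna"))
```

Registering factories rather than instances keeps heavier backends, such as a trained model, from being loaded until they are requested.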
📊 Benchmarking
Single backend
zomi-syl models benchmark crf
Compare multiple backends
zomi-syl models compare rule crf
Compare all backends
zomi-syl models compare --all
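A comparison of this kind can be pictured as scoring each backend against the same gold data. The sketch below uses exact-match word accuracy, which is an assumption: the metrics actually reported by `zomi-syl models benchmark` are not described here. The two toy backends and the gold pairs are placeholder strings, not Zomi data.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Syllabifier = Callable[[str], List[str]]
GoldSet = Sequence[Tuple[str, List[str]]]


def word_accuracy(backend: Syllabifier, gold: GoldSet) -> float:
    """Fraction of words whose predicted split matches the gold split exactly."""
    if not gold:
        raise ValueError("gold set is empty")
    correct = sum(1 for word, expected in gold if backend(word) == expected)
    return correct / len(gold)


def compare(backends: Dict[str, Syllabifier], gold: GoldSet) -> Dict[str, float]:
    """Score every backend on the same gold set."""
    return {name: word_accuracy(fn, gold) for name, fn in backends.items()}


def split_in_half(word: str) -> List[str]:
    # Toy backend: cuts the word at its midpoint.
    mid = len(word) // 2
    return [word[:mid], word[mid:]]


def no_split(word: str) -> List[str]:
    # Toy backend: never splits.
    return [word]


# Placeholder gold data (not real Zomi words).
gold = [
    ("abab", ["ab", "ab"]),
    ("cdcd", ["cd", "cd"]),
    ("efef", ["ef", "ef"]),
    ("gh", ["gh"]),
]

scores = compare({"halves": split_in_half, "whole": no_split}, gold)
print(scores)  # {'halves': 0.75, 'whole': 0.25}
```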
🩺 Diagnostics
Run a full backend self‑test:
zomi-syl models doctor
This checks:
- registry integrity
- model metadata
- backend loadability
- single prediction
- batch prediction
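The last two checks in the list can be sketched as a small self-test loop. This is a hypothetical illustration of the idea, not the code behind `zomi-syl models doctor`: it assumes a backend is a callable that takes a word and returns a non-empty list, and it treats any exception as a failed check.

```python
from typing import Callable, Dict, List

Backend = Callable[[str], List[str]]


def run_self_test(backends: Dict[str, Backend], probe: str = "itna") -> Dict[str, Dict[str, bool]]:
    """Run a single and a batch prediction on each backend and record pass/fail."""
    report: Dict[str, Dict[str, bool]] = {}
    for name, backend in backends.items():
        checks = {"single prediction": False, "batch prediction": False}
        try:
            single = backend(probe)
            checks["single prediction"] = isinstance(single, list) and len(single) > 0
            # Batch check: the same input twice should give the same answer twice.
            batch = [backend(word) for word in (probe, probe)]
            checks["batch prediction"] = len(batch) == 2 and batch[0] == batch[1]
        except Exception:
            # Any failure leaves the remaining checks marked as failed.
            pass
        report[name] = checks
    return report


def healthy(word: str) -> List[str]:
    # Placeholder backend: returns the word as a single unit.
    return [word]


def broken(word: str) -> List[str]:
    raise RuntimeError("model file missing")


report = run_self_test({"healthy": healthy, "broken": broken})
for name, checks in report.items():
    print(name, checks)
```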
🌏 Dialect Profiles
Profiles live under:
src/zomi_syl/profiles/
Supported dialects:
- Gangte | Not Yet
- Kom | Not Yet
- Mate | Not Yet
- Paite | Yes
- Simte | Not Yet
- Siyin | Not Yet
- Tedim | Yes
- Thangkhal | Not Yet
- Thado/Thadou | Not Yet
- Vaiphei | Not Yet
- Zo/Zou | Not Yet
- India Zomi | Not Yet
- Myanmar Zomi | Not Yet
- Zolai Standard | Not Yet
Even though some dialects are not yet supported, zomi-syl gives higher than 90% accuracy for all of the dialects.
List profiles:
zomi-syl profiles list
Show profile info:
zomi-syl profiles info tedim
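The support table can also be expressed as data, which is handy when a script has to pick a profile from a language code. The sketch below takes the codes from the feature list and the Yes / Not Yet status from the table above; the class, the function names, and the fallback to `tedim` are hypothetical choices for illustration and do not describe how zomi-syl resolves profiles.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DialectStatus:
    name: str
    code: str          # language code as given in the feature list
    has_profile: bool  # "Yes" / "Not Yet" in the support table


_STATUS = [
    DialectStatus("Siyin", "csy", False),
    DialectStatus("Tedim", "ctd", True),
    DialectStatus("Gangte", "gnb", False),
    DialectStatus("Kom", "kmm", False),
    DialectStatus("Paite", "pck", True),
    DialectStatus("Vaiphei", "vap", False),
    DialectStatus("Simte", "smt", False),
    DialectStatus("Thado/Thadou", "tcz", False),
    DialectStatus("Zo/Zou", "zom", False),
]

BY_CODE: Dict[str, DialectStatus] = {d.code: d for d in _STATUS}


def available_profiles() -> List[str]:
    """Lower-cased names of dialects that already have a profile."""
    return sorted(d.name.lower() for d in _STATUS if d.has_profile)


def resolve_profile(code: str, fallback: str = "tedim") -> str:
    """Return the profile name for `code`, or `fallback` if none exists yet.

    The fallback policy is an assumption made for this example.
    """
    status = BY_CODE.get(code)
    if status is not None and status.has_profile:
        return status.name.lower()
    return fallback


print(available_profiles())    # ['paite', 'tedim']
print(resolve_profile("pck"))  # paite
print(resolve_profile("csy"))  # tedim
```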
🧪 Testing
Run all tests:
pytest
Golden CRF regression data:
tests/golden/crf_golden.tsv
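A golden-file regression test usually loads word and expected-output pairs and reports every mismatch. The sketch below shows that pattern under a loud assumption: the column layout of `crf_golden.tsv` is not documented here, so the example assumes two tab-separated columns (word, then expected syllables joined by `-`) and reads from an in-memory string of placeholder data rather than the real file. `load_golden` and `find_regressions` are hypothetical names.

```python
import csv
import io
from typing import Callable, List, Tuple

# Assumed layout (hypothetical): word<TAB>syllables joined by "-".
# Placeholder rows, not real Zomi data.
GOLDEN_TSV = "abab\tab-ab\ncdcd\tcd-cd\n"


def load_golden(handle) -> List[Tuple[str, List[str]]]:
    """Parse golden rows into (word, expected_syllables) pairs."""
    reader = csv.reader(handle, delimiter="\t")
    return [(row[0], row[1].split("-")) for row in reader if row]


def find_regressions(
    syllabify: Callable[[str], List[str]],
    golden: List[Tuple[str, List[str]]],
) -> List[Tuple[str, List[str], List[str]]]:
    """Return (word, expected, actual) for every mismatch."""
    failures = []
    for word, expected in golden:
        actual = syllabify(word)
        if actual != expected:
            failures.append((word, expected, actual))
    return failures


def split_in_half(word: str) -> List[str]:
    # Toy stand-in for a real backend.
    mid = len(word) // 2
    return [word[:mid], word[mid:]]


golden = load_golden(io.StringIO(GOLDEN_TSV))
print(find_regressions(split_in_half, golden))    # []
print(find_regressions(lambda w: [w], golden))
```

In a pytest test, asserting that `find_regressions(...)` returns an empty list makes the failure message show exactly which words changed.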
🗂 Project Structure
src/zomi_syl/
    api.py
    cli.py
    backends/
    profiles/
    models/
    evaluation/
    rule_based/
    utils/
    ...
scripts/
docs/
tests/
training/
🛠 Development
Developer documentation lives in:
docs/Developer/
Key guides:
- Adding new backends
- Unified Metadata Schema (UMS)
- CRF training
- Backend loader
- Test templates
📄 Changelog
The changelog is generated automatically:
make changelog
Template:
docs/Developer/CHANGELOG_template.md
📦 Release Checklist
See:
docs/RELEASE_CHECKLIST_v0.1.0.md
📜 License
MIT License — see LICENSE.
🙌 Contributing
See:
CONTRIBUTING.md
🔗 Command Reference
Full CLI command tree:
zomi-syl
│
├── syllabify
├── analyze
├── batch
├── benchmark
│
├── profiles list|info|validate
├── datasets list|download|validate
│
├── config show|path|validate|set
├── cache info|clear|remove
│
├── validate
├── download
├── version
│
└── models
    ├── list
    ├── info
    ├── benchmark
    ├── compare
    └── doctor