Complete French STT pipeline: audio to text (CTC + P2G)
Lectura STT: complete French STT pipeline
Automatic speech transcription pipeline for French, from audio to text. It chains the medium CTC decoder (10.6M parameters, PER ~4.34%) with the P2G pipeline (phones → orthography).
Benchmark WER: ~23.5% (all) / ~19.7% (everyday speech).
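To make the two figures above easier to interpret, here is a minimal sketch of how an error rate of this kind is commonly computed: the edit distance between reference and hypothesis, divided by the reference length. Applied to words it gives a WER, applied to phones a PER. The function names are illustrative and not part of lectura-stt, and the text normalisation used for the published benchmark is not described here, so this sketch does not claim to reproduce those numbers.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalised by the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Word level: one word out of three is missing from the hypothesis.
wer = error_rate("bonjour le monde".split(), "bonjour monde".split())

# Phone level: one phone out of five is substituted (nasal vowel lost).
per = error_rate("b ɔ̃ ʒ u ʁ".split(), "b ɔ ʒ u ʁ".split())

print(f"WER = {wer:.1%}, PER = {per:.1%}")
```

Because the denominator is the reference length, an error rate can exceed 100% when the hypothesis contains many insertions.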
Installation
# Minimal mode (CTC only, phonetic transcription)
pip install lectura-stt
# With the full P2G pipeline (formulas + proper nouns)
pip install lectura-stt[p2g]
# With the ONNX backend (fast local inference)
pip install lectura-stt[onnx]
# With microphone support
pip install lectura-stt[micro]
Example
import wave

import numpy as np
from lectura_stt import creer_engine

engine = creer_engine()

# Load a WAV file (this snippet assumes 16-bit PCM, mono)
with wave.open("bonjour.wav", "rb") as wf:
    sr = wf.getframerate()
    audio = np.frombuffer(
        wf.readframes(wf.getnframes()), dtype=np.int16
    ).astype(np.float32) / 32768.0

result = engine.transcrire(audio, sr=sr)
print(result.ipa)    # "b ɔ̃ ʒ u ʁ | l ə | m ɔ̃ d ."
print(result.texte)  # "Bonjour le monde."
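The example reads the raw frames as a single stream of 16-bit samples, which is only correct for a mono file. The sketch below, using the standard library only, shows one way to reject other sample widths and downmix multi-channel audio to mono before transcription. `load_wav_mono` is a hypothetical helper, not part of lectura-stt, and whether the engine resamples audio that is not at 16 kHz is not stated here, so the sample rate is simply returned to the caller.

```python
import array
import sys
import wave


def load_wav_mono(path):
    """Read a 16-bit PCM WAV file and return (mono samples in [-1, 1), sample rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        sr = wf.getframerate()
        channels = wf.getnchannels()
        pcm = array.array("h")
        pcm.frombytes(wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV samples are stored little-endian
    # Samples are interleaved per frame: average the channels of each frame.
    mono = [
        sum(pcm[i:i + channels]) / (channels * 32768.0)
        for i in range(0, len(pcm), channels)
    ]
    return mono, sr
```

The returned list can be converted with `np.asarray(samples, dtype=np.float32)` and passed to `engine.transcrire(audio, sr=sr)` as in the example.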
Architecture
Optimal pipeline (with PhoneLexicon)
When a PhoneLexicon is available (via the graphemizer), the optimal pipeline
is enabled automatically:
16 kHz mono audio
|
v
[lectura-ctc] --> IPA phones "b ɔ̃ ʒ u ʁ | l ə | m ɔ̃ d ."
|
v
[parse_ctc_v2] --> enriched segments (words, liaisons, compounds, punctuation)
|
v
[strip_liaisons] --> removes spurious liaisons (using the phonetic lexicon)
|
v
[split_elisions] --> separates elided clitics (l'ami → l + ami)
|
v
[split_merged_words] --> splits words that were wrongly merged together
|
v
[P2G analyser_v2] --> IPA → orthography conversion with lex_select
|
v
[merge_and_rescore] --> merges over-segmented words (lexical rescoring)
|
v
[try_elision_merges] --> merges adjacent elided clitics
|
v
[rejoin_elisions] --> rebuilds the final text with apostrophes and hyphens
|
v
"Bonjour le monde."
Simplified pipeline (without PhoneLexicon)
16 kHz mono audio
|
v
[lectura-ctc] --> IPA phones "b ɔ̃ ʒ u ʁ | l ə | m ɔ̃ d ."
|
v
[_parse_ctc] --> IPA words ["bɔ̃ʒuʁ", "lə", "mɔ̃d"] + punctuation ["."]
|
v
[lectura-p2g] --> orthography ["bonjour", "le", "monde"]
|
v
[_assembler] --> "Bonjour le monde."
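The data flow of the simplified pipeline can be illustrated with a small self-contained sketch. It is not the library's implementation: `parse_ctc`, `assemble`, the punctuation set and the `TOY_P2G` lookup table are all hypothetical stand-ins, and the real P2G stage is a model rather than a dictionary. The sketch only assumes the format shown in the diagram, namely phones separated by spaces and words separated by `|`.

```python
PUNCTUATION = {".", ",", "?", "!"}  # assumed set, for illustration only

# Toy stand-in for the P2G stage: a fixed IPA -> spelling table.
TOY_P2G = {"bɔ̃ʒuʁ": "bonjour", "lə": "le", "mɔ̃d": "monde"}


def parse_ctc(ipa):
    """Split a CTC phone string into IPA words and punctuation marks."""
    words, punct = [], []
    for chunk in ipa.split("|"):
        phones = chunk.split()
        word = "".join(p for p in phones if p not in PUNCTUATION)
        if word:
            words.append(word)
        punct.extend(p for p in phones if p in PUNCTUATION)
    return words, punct


def assemble(ortho, punct):
    """Join spelled words, append the final mark, capitalise the first letter."""
    text = " ".join(ortho)
    if punct:
        text += punct[-1]  # simplification: only sentence-final punctuation
    return text[:1].upper() + text[1:]


words, punct = parse_ctc("b ɔ̃ ʒ u ʁ | l ə | m ɔ̃ d .")
ortho = [TOY_P2G[w] for w in words]
print(words, punct)            # ['bɔ̃ʒuʁ', 'lə', 'mɔ̃d'] ['.']
print(assemble(ortho, punct))  # Bonjour le monde.
```

Keeping punctuation separate from the IPA words means the P2G stage only ever sees pronounceable tokens, which matches the split shown in the `[_parse_ctc]` step.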
License
AGPL-3.0-or-later; see LICENCE.txt. A commercial license is also available; see LICENCE-COMMERCIALE.md.
File details
Details for the file lectura_stt-3.0.3.tar.gz.
File metadata
- Download URL: lectura_stt-3.0.3.tar.gz
- Upload date:
- Size: 38.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4d49bde67de37841f51f61681e7f27ccd6d72124a1c79fd4e0cc130e391df87c |
| MD5 | 6d22d8f2e3245eb5f9074ebe047a3d17 |
| BLAKE2b-256 | 29ee09eaac1c26eae7561bc06cdf971d916f0901a5a5889d710f825f25f9ec29 |
File details
Details for the file lectura_stt-3.0.3-py3-none-any.whl.
File metadata
- Download URL: lectura_stt-3.0.3-py3-none-any.whl
- Upload date:
- Size: 37.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e6c01ba92e0f55dcb70406543b2ab33ddeb0381e449d929038bcf65a3251f80e |
| MD5 | 2389d81e79f3282edb996907ab824c42 |
| BLAKE2b-256 | 31ca3bf83abac525e535375e93d267f4050ea4d5c63bdbea102831a5e4ff415b |