WORLD-domain diphone concatenation speech synthesis (French)
lectura-tts-diphone
French speech synthesis by diphone concatenation in the WORLD domain.
Installation
# No dependencies (import only)
pip install lectura-tts-diphone
# Local inference (pyworld + numpy + scipy)
pip install "lectura-tts-diphone[local]"
# With built-in G2P (text → audio)
pip install "lectura-tts-diphone[all]"
Usage
From text (requires lectura-g2p)
from lectura_tts_diphone import synthetiser
audio = synthetiser("Bonjour le monde")
# audio: float32 NumPy array, 44100 Hz
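To listen to the result, the samples have to be written to a file. The sketch below is not part of lectura-tts-diphone: `write_wav` and `float_to_pcm16` are hypothetical helpers built on the standard library only, and they assume the array is mono with amplitudes in [-1.0, 1.0], which the README does not state.

```python
import struct
import wave

SAMPLE_RATE = 44100  # sample rate given in the README


def float_to_pcm16(samples):
    """Convert float samples to little-endian 16-bit PCM bytes.

    Assumption: samples are nominally in [-1.0, 1.0]; anything outside
    that range is clipped rather than wrapped.
    """
    clipped = (max(-1.0, min(1.0, float(s))) for s in samples)
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write a mono 16-bit WAV file from any sequence of float samples."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)   # assumed mono
        wav.setsampwidth(2)   # 2 bytes = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(float_to_pcm16(samples))
```

With the `audio` array from the snippet above, `write_wav("bonjour.wav", audio)` would produce a playable file. If the `[local]` extra is installed, `scipy.io.wavfile.write` is an alternative.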
From IPA phonemes
from lectura_tts_diphone import creer_engine
engine = creer_engine()
audio = engine.synthesize_groups([
    {"phones": ["b", "ɔ̃", "ʒ", "u", "ʁ"], "boundary": "none"},
    {"phones": ["l", "ə", "m", "ɔ̃", "d"], "boundary": "period"},
])
Prosodic controls
| Parameter | Default | Description |
|---|---|---|
| duration_scale | 1.0 | Overall speed (>1 = slower) |
| pause_scale | 1.0 | Duration of pauses between groups |
| macro_expressivity | 2.0 | Prosodic gestures (0 = neutral, 4 = exaggerated) |
| micro_expressivity | 5.0 | Micro-variations (0 = robotic, 10 = very expressive) |
| spectral_contrast | 1.5 | Spectral contrast (1.0 = off, 2.0 = strong) |
| prosody_style | "auto" | "declaratif", "question", "exclamation", "suspensif", "neutre" |
| seed | None | Seed for reproducible micro-prosody |
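One way to keep these settings together and catch typos early is a small settings object. The `ProsodyControls` class below is hypothetical and not part of the package. Its defaults are copied from the table, but the range checks treat the endpoints quoted there (0 to 4, 0 to 10) as hard limits, which the README does not confirm. The README also does not say which function accepts these parameters.

```python
from dataclasses import dataclass
from typing import Optional

# Style names as listed in the table, plus the "auto" default.
STYLES = ("auto", "declaratif", "question", "exclamation", "suspensif", "neutre")


@dataclass(frozen=True)
class ProsodyControls:
    """Hypothetical settings holder; defaults mirror the table above."""

    duration_scale: float = 1.0
    pause_scale: float = 1.0
    macro_expressivity: float = 2.0
    micro_expressivity: float = 5.0
    spectral_contrast: float = 1.5
    prosody_style: str = "auto"
    seed: Optional[int] = None

    def __post_init__(self):
        # Assumed limits: the table only gives reference points, not bounds.
        if self.duration_scale <= 0:
            raise ValueError("duration_scale must be positive")
        if self.pause_scale < 0:
            raise ValueError("pause_scale must not be negative")
        if not 0.0 <= self.macro_expressivity <= 4.0:
            raise ValueError("macro_expressivity expected in [0, 4]")
        if not 0.0 <= self.micro_expressivity <= 10.0:
            raise ValueError("micro_expressivity expected in [0, 10]")
        if self.prosody_style not in STYLES:
            raise ValueError(f"unknown prosody_style: {self.prosody_style!r}")
```

For example, `ProsodyControls(duration_scale=1.2, seed=42)` describes slightly slower speech with reproducible micro-prosody, and `dataclasses.asdict(...)` turns it into a plain dict.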
Synthesis modes
- FLUIDE: natural reading, continuous flow
- MOT_A_MOT: word-by-word reading with pauses
- SYLLABES: syllable-by-syllable reading
Architecture
Text → [G2P] → IPA phonemes → Diphone chain
↓
WORLD params (F0 + SP + AP)
↓
Stretch + Concat (overlap)
↓
Prosody (F0 contour + durations)
↓
GV compensation (spectral contrast)
↓
pw.synthesize → Audio 44100 Hz
The diphones are WORLD parameters (F0 + spectral envelope + aperiodicity) extracted from the SIWIS corpus and averaged per type of phonetic transition.
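Two steps of this pipeline can be illustrated in a few lines: turning a phone sequence into a diphone chain, and joining consecutive parameter tracks with an overlap. This is a minimal sketch, not the package's code. The `"_"` silence symbol, the function names, and the linear crossfade are assumptions; the README only says that segments are concatenated with an overlap.

```python
SILENCE = "_"  # assumed boundary symbol; the package may use another one


def diphone_chain(phones, silence=SILENCE):
    """Return the (left, right) transitions covering a phone sequence,
    padded with silence at both ends."""
    padded = [silence, *phones, silence]
    return list(zip(padded, padded[1:]))


def crossfade_concat(left, right, overlap):
    """Join two per-frame tracks (e.g. F0 values), linearly blending
    `overlap` frames where the end of `left` meets the start of `right`."""
    overlap = min(overlap, len(left), len(right))
    if overlap <= 0:
        return list(left) + list(right)
    start = len(left) - overlap
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of the right-hand track
        blended.append((1 - w) * left[start + i] + w * right[i])
    return list(left[:start]) + blended + list(right[overlap:])


chain = diphone_chain(["b", "ɔ̃", "ʒ", "u", "ʁ"])
# chain == [("_", "b"), ("b", "ɔ̃"), ("ɔ̃", "ʒ"), ("ʒ", "u"), ("u", "ʁ"), ("ʁ", "_")]
```

A sequence of n phones yields n + 1 diphones, each of which would be looked up in the diphone inventory. The same blending idea applies row by row to the spectral envelope and aperiodicity matrices.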
Model locations
Searched in this order:
- Explicit models_dir parameter
- $LECTURA_MODELS_DIR/tts_diphone/
- ~/.lectura/models/tts_diphone/
- Models bundled in the package
Required file: diphones.dpk.gz (or the encrypted .dpk.gz.enc)
Optional file: diphone_statistics.pkl
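The search order above can be sketched as follows, which may help when debugging a "models not found" situation. `candidate_dirs` and `find_models_dir` are hypothetical helpers, not the package's API. The `packaged_dir` argument stands in for the bundled-models location, which the README does not spell out, and the encrypted file name `diphones.dpk.gz.enc` is inferred from the `.dpk.gz.enc` suffix.

```python
import os
from pathlib import Path

# Either file satisfies the "required file" rule; the second name is inferred.
REQUIRED_FILES = ("diphones.dpk.gz", "diphones.dpk.gz.enc")


def candidate_dirs(models_dir=None, packaged_dir=None, env=None):
    """List the directories to try, in the documented priority order."""
    env = os.environ if env is None else env
    candidates = []
    if models_dir is not None:                      # 1. explicit parameter
        candidates.append(Path(models_dir))
    if env.get("LECTURA_MODELS_DIR"):               # 2. environment variable
        candidates.append(Path(env["LECTURA_MODELS_DIR"]) / "tts_diphone")
    candidates.append(                              # 3. per-user directory
        Path.home() / ".lectura" / "models" / "tts_diphone"
    )
    if packaged_dir is not None:                    # 4. bundled models (assumed path)
        candidates.append(Path(packaged_dir))
    return candidates


def find_models_dir(models_dir=None, packaged_dir=None, env=None):
    """Return the first candidate holding a required file, else None."""
    for directory in candidate_dirs(models_dir, packaged_dir, env):
        if any((directory / name).is_file() for name in REQUIRED_FILES):
            return directory
    return None
```

Calling `find_models_dir()` with no arguments checks only the environment variable and the per-user directory. A `None` result means neither contains a diphone file.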
License
Dual license: AGPL-3.0 (code) + Commercial License (models).