A tool for building a high-quality text-to-speech (TTS) corpus from audiobooks and their matching texts.
Narizaka
How it works
First, it splits the audio files into small segments of 5-15 seconds. It then iterates over the segments, transcribes each one with Whisper ASR, and aligns the transcription with the original text. If the distance between the transcription and the aligned text is very small, the pair is considered a match and added to the dataset.
Download files
Source Distribution: narizaka-1.0.0.tar.gz (8.3 kB)