ML for Speech presents SpeechToolkit, a unified toolkit for text-to-speech (TTS), automatic speech recognition (ASR), voice conversion (VC), and other models.
SpeechToolkit
NOTE: This project is still in an early alpha stage. It is not ready for production use, and not all features have been implemented.
A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, and more!
If you would rather work with the models individually instead of through SpeechToolkit, see the ML for Speech page.
Implemented Features
- Text-to-speech
  - StyleTTS 2
- Voice conversion
  - LVC-VC
  - NaturalSpeech3 Voice Conversion
  - StyleTTS2-VC
- Automatic speech recognition
  - Whisper
  - Distil-Whisper
  - Canary
- Audio classification
  - Language detection
  - Accent detection
  - Large accent detection model
Installation & Usage
Documentation is available online.
Disclaimer
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Provided models may make mistakes.
THE MODEL IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS MODEL INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS MODEL.
Download files
File details
Details for the file speechtoolkit-0.0.4.tar.gz.
File metadata
- Download URL: speechtoolkit-0.0.4.tar.gz
- Upload date:
- Size: 6.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.19
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2fd8a9c325be484ea58ec6c5b1dadaa71268abad8cb0988b390cffd672b01d56
MD5 | 8c29f07b138fb816b8c87039614a3383
BLAKE2b-256 | f4dc7205953553ded87057ccf2fdbd215b1465865499cc785d8ba9ff8ca50b0a
File details
Details for the file speechtoolkit-0.0.4-py3-none-any.whl.
File metadata
- Download URL: speechtoolkit-0.0.4-py3-none-any.whl
- Upload date:
- Size: 9.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.19
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6b7a83ce6d5f95db0596e2812609c78cb713d3f3ca08de6611a49693c5cac3b4
MD5 | 81d02db09c6674d58db435d84149595f
BLAKE2b-256 | 729204641060448d41dc628133c4612a8f1a6efe80f4243524282fca5e35e3b0
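
The SHA256 digests listed under "File hashes" can be used to check that a downloaded file is intact. The following is a minimal sketch using only Python's standard `hashlib` module. It is not part of SpeechToolkit; the helper names `sha256_of` and `matches_published` are hypothetical, and the script assumes the files were saved under their original names in the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the "File hashes" tables above.
PUBLISHED_SHA256 = {
    "speechtoolkit-0.0.4.tar.gz":
        "2fd8a9c325be484ea58ec6c5b1dadaa71268abad8cb0988b390cffd672b01d56",
    "speechtoolkit-0.0.4-py3-none-any.whl":
        "6b7a83ce6d5f95db0596e2812609c78cb713d3f3ca08de6611a49693c5cac3b4",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """Return True if the file's SHA256 equals the digest listed for its name."""
    expected = PUBLISHED_SHA256.get(Path(path).name)
    if expected is None:
        # Unknown file name: there is nothing to compare against.
        return False
    return sha256_of(path) == expected


if __name__ == "__main__":
    # Assumption: the downloads sit in the current working directory.
    for name in PUBLISHED_SHA256:
        if Path(name).exists():
            print(name, "OK" if matches_published(name) else "MISMATCH")
        else:
            print(name, "not found, skipped")
```

A mismatch usually means the download was truncated or altered, and the file should be fetched again before installing.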