SpeechToolkit

NOTE: This project is still in an early alpha stage and is not ready for production yet.

A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, and more!

Because the toolkit is in early alpha, not all planned features have been implemented yet.

If you would rather work with the individual models directly than use SpeechToolkit, see the ML for Speech page.

Implemented Features

Text-to-speech
- StyleTTS 2
Voice conversion
- LVC-VC
- NaturalSpeech3 Voice Conversion
- StyleTTS2-VC
Automatic speech recognition
- Whisper
- Distil-Whisper
- Canary
Audio classification
- Language detection
- Accent detection
- Large accent detection model

Installation & Usage

Documentation is available online.

Disclaimer

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

Provided models may make mistakes.

THE MODEL IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS MODEL INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS MODEL.

Project details

Release history

- 0.0.5: May 17, 2024
- 0.0.4: Apr 20, 2024 (this version)
- 0.0.3: Apr 20, 2024
- 0.0.2: Apr 20, 2024
- 0.0.1: Apr 20, 2024
- 0.0.0: Apr 20, 2024

Download files

Source Distribution

speechtoolkit-0.0.4.tar.gz (6.9 kB)

Uploaded Apr 20, 2024 Source

Built Distribution

speechtoolkit-0.0.4-py3-none-any.whl (9.8 kB)

Uploaded Apr 20, 2024 Python 3

File details

Details for the file speechtoolkit-0.0.4.tar.gz.

File metadata

Download URL: speechtoolkit-0.0.4.tar.gz
Upload date: Apr 20, 2024
Size: 6.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.9.19

File hashes

Hashes for speechtoolkit-0.0.4.tar.gz
Algorithm	Hash digest
SHA256	`2fd8a9c325be484ea58ec6c5b1dadaa71268abad8cb0988b390cffd672b01d56`
MD5	`8c29f07b138fb816b8c87039614a3383`
BLAKE2b-256	`f4dc7205953553ded87057ccf2fdbd215b1465865499cc785d8ba9ff8ca50b0a`

File details

Details for the file speechtoolkit-0.0.4-py3-none-any.whl.

File metadata

Download URL: speechtoolkit-0.0.4-py3-none-any.whl
Upload date: Apr 20, 2024
Size: 9.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.9.19

File hashes

Hashes for speechtoolkit-0.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6b7a83ce6d5f95db0596e2812609c78cb713d3f3ca08de6611a49693c5cac3b4`
MD5	`81d02db09c6674d58db435d84149595f`
BLAKE2b-256	`729204641060448d41dc628133c4612a8f1a6efe80f4243524282fca5e35e3b0`
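
The SHA256 digests listed above can be used to check that a downloaded file was not corrupted or altered. The following is a minimal sketch using only Python's standard `hashlib` module. The names `PUBLISHED_SHA256`, `sha256_of`, and `matches_published` are illustrative and are not part of SpeechToolkit; the sketch also assumes the downloaded file keeps its original filename.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file listings above.
PUBLISHED_SHA256 = {
    "speechtoolkit-0.0.4.tar.gz":
        "2fd8a9c325be484ea58ec6c5b1dadaa71268abad8cb0988b390cffd672b01d56",
    "speechtoolkit-0.0.4-py3-none-any.whl":
        "6b7a83ce6d5f95db0596e2812609c78cb713d3f3ca08de6611a49693c5cac3b4",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """Compare a local file's digest with the one published for its filename."""
    name = Path(path).name
    expected = PUBLISHED_SHA256.get(name)
    if expected is None:
        # Hypothetical policy: refuse to guess for files not listed above.
        raise KeyError(f"no published digest for {name}")
    return sha256_of(path) == expected
```

Calling `matches_published("speechtoolkit-0.0.4.tar.gz")` on a downloaded copy returns `True` only when the file's digest equals the published value. A `False` result suggests the download should be discarded and retried.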