ML for Speech presents SpeechToolkit, a unified, all-in-one toolkit for TTS, ASR, VC, & other models.

Project description

SpeechToolkit

NOTE: This project is in an early alpha stage. It is not ready for production, and not all features have been implemented.

A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, and more!

If you would rather work with the models individually instead of through SpeechToolkit, please check out the ML for Speech page.

Implemented Features

  • Text-to-speech
    • StyleTTS 2
    • MetaVoice
    • Parler TTS
    • XTTS
  • Voice conversion
    • LVC-VC
    • NaturalSpeech3 Voice Conversion
    • StyleTTS2-VC
  • Automatic speech recognition
    • Whisper
    • Distil-Whisper
    • Canary
  • Audio classification
    • Language detection

Installation & Usage

Documentation is available online.

Disclaimer

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

Provided models may make mistakes.

THE MODEL IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS MODEL INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS MODEL.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

speechtoolkit-0.0.5.tar.gz (6.7 kB view details)

Uploaded Source

Built Distribution

speechtoolkit-0.0.5-py3-none-any.whl (8.9 kB view details)

Uploaded Python 3

File details

Details for the file speechtoolkit-0.0.5.tar.gz.

File metadata

  • Download URL: speechtoolkit-0.0.5.tar.gz
  • Upload date:
  • Size: 6.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.19

File hashes

Hashes for speechtoolkit-0.0.5.tar.gz
Algorithm Hash digest
SHA256 55cfd4e62ae25868cd5cea206740ee0614ef2b3c70bc1d89a141a0c2be010526
MD5 5eb08937a2ff0a862cd1662b6dc12df5
BLAKE2b-256 944d768a846f72611a11dfca4cc988b5d0e4df30d023db00d3c54f225a7a74e2

See more details on using hashes here.

File details

Details for the file speechtoolkit-0.0.5-py3-none-any.whl.

File hashes

Hashes for speechtoolkit-0.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 0f71326134dd5f1e1e47ac310a09c4b8381f55ec902fd6b7bf05f446d3284c9d
MD5 242e110b49ab34c0dc14ba30279558d3
BLAKE2b-256 79435d22bbd9ee336acbfc7125595aca645fd8f69c949baffd056769c3658de8
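
The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch, not part of SpeechToolkit: the helper names `sha256_of` and `verify_download` are made up for illustration, and it assumes the file is on disk under its published filename.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digests copied from the "File hashes" tables on this page.
EXPECTED_SHA256 = {
    "speechtoolkit-0.0.5.tar.gz":
        "55cfd4e62ae25868cd5cea206740ee0614ef2b3c70bc1d89a141a0c2be010526",
    "speechtoolkit-0.0.5-py3-none-any.whl":
        "0f71326134dd5f1e1e47ac310a09c4b8381f55ec902fd6b7bf05f446d3284c9d",
}


def sha256_of(path, chunk_size=1 << 16):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's SHA256 matches the digest published for its name.

    Hypothetical helper: raises KeyError if the filename is not listed above.
    """
    path = Path(path)
    expected = EXPECTED_SHA256.get(path.name)
    if expected is None:
        raise KeyError(f"no published digest for {path.name}")
    return hmac.compare_digest(sha256_of(path), expected)
```

Calling `verify_download("speechtoolkit-0.0.5.tar.gz")` returns `True` only when the local file's digest matches the value listed for that filename. A `False` result suggests a corrupted or altered download.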
