ML for Speech presents SpeechToolkit, a unified, all-in-one toolkit for TTS, ASR, VC, & other models.

Project description

SpeechToolkit

NOTE: This project is still in an early alpha stage and is not ready for production yet.

A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, and more!

Not all features have been implemented yet.

If you prefer not to use SpeechToolkit and would rather work with the models individually, please check out the ML for Speech page.

Implemented Features

  • Text-to-speech
    • StyleTTS 2
  • Voice conversion
    • LVC-VC
    • NaturalSpeech3 Voice Conversion
    • StyleTTS2-VC
  • Automatic speech recognition
    • Whisper
    • Distil-Whisper
    • Canary
  • Audio classification
    • Language detection
    • Accent detection
    • Large accent detection model

Installation & Usage

Documentation is available online.
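
The distribution files are named speechtoolkit-0.0.4, so the package should be installable with `pip install speechtoolkit`. As a minimal sketch for confirming which version ended up in the current environment, the standard-library `importlib.metadata` module can be queried. The helper name `installed_version` is illustrative only and is not part of SpeechToolkit.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "speechtoolkit") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found else "speechtoolkit is not installed")
```

Because the project is in early alpha, pinning the exact version reported here in a requirements file is a reasonable precaution against breaking changes.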

Disclaimer

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

Provided models may make mistakes.

THE MODEL IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS MODEL INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS MODEL.

Download files

Source Distribution

speechtoolkit-0.0.4.tar.gz (6.9 kB)

Uploaded Source

Built Distribution

speechtoolkit-0.0.4-py3-none-any.whl (9.8 kB)

Uploaded Python 3

File details

Details for the file speechtoolkit-0.0.4.tar.gz.

File metadata

  • Download URL: speechtoolkit-0.0.4.tar.gz
  • Upload date:
  • Size: 6.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.9.19

File hashes

Hashes for speechtoolkit-0.0.4.tar.gz
Algorithm Hash digest
SHA256 2fd8a9c325be484ea58ec6c5b1dadaa71268abad8cb0988b390cffd672b01d56
MD5 8c29f07b138fb816b8c87039614a3383
BLAKE2b-256 f4dc7205953553ded87057ccf2fdbd215b1465865499cc785d8ba9ff8ca50b0a
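
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches_published_hash` are illustrative, and the local file path is an assumption about where the archive was saved.

```python
import hashlib

# SHA256 digest listed above for speechtoolkit-0.0.4.tar.gz
EXPECTED_SHA256 = "2fd8a9c325be484ea58ec6c5b1dadaa71268abad8cb0988b390cffd672b01d56"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected.strip().lower()


if __name__ == "__main__":
    # Assumed location: the archive sits in the current working directory.
    archive = "speechtoolkit-0.0.4.tar.gz"
    try:
        print("hash OK" if matches_published_hash(archive) else "hash MISMATCH")
    except FileNotFoundError:
        print(f"{archive} not found in the current directory")
```

The same check applies to the wheel by substituting its file name and its own SHA256 digest.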

File details

Details for the file speechtoolkit-0.0.4-py3-none-any.whl.

File hashes

Hashes for speechtoolkit-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 6b7a83ce6d5f95db0596e2812609c78cb713d3f3ca08de6611a49693c5cac3b4
MD5 81d02db09c6674d58db435d84149595f
BLAKE2b-256 729204641060448d41dc628133c4612a8f1a6efe80f4243524282fca5e35e3b0
