ML for Speech presents SpeechToolkit, a unified, all-in-one toolkit for TTS, ASR, VC, & other models.

Project description

SpeechToolkit

NOTE: This project is still in an early alpha stage and is not ready for production yet.

A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, and more!

Not all planned features have been implemented yet.

If you would rather work with the models individually instead of through SpeechToolkit, please check out the ML for Speech page.

Implemented Features

  • Text-to-speech
    • StyleTTS 2
  • Voice conversion
    • LVC-VC
    • NaturalSpeech3 Voice Conversion
    • StyleTTS2-VC
  • Automatic speech recognition
    • Whisper
    • Distil-Whisper
    • Canary
  • Audio classification
    • Language detection
    • Accent detection
    • Large accent detection model

Installation & Usage

Documentation is available online.

Disclaimer

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

Provided models may make mistakes.

THE MODEL IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS MODEL INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS MODEL.

Download files

Source Distribution

speechtoolkit-0.0.3.tar.gz (7.1 kB)

File details

Details for the file speechtoolkit-0.0.3.tar.gz.

File metadata

  • Download URL: speechtoolkit-0.0.3.tar.gz
  • Upload date:
  • Size: 7.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.1

File hashes

Hashes for speechtoolkit-0.0.3.tar.gz
  • SHA256: 284ba6050c2a7a273da872ad089d738492f544d1d748777784beb7a9c449d79a
  • MD5: a0aa51f823c1c47fcb8fc24e85af727d
  • BLAKE2b-256: faa1d5b81a80d88a15298cd8b742de3165bd935e58c9c973e9ad8b8a6ae6873f
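
The SHA256 digest listed above can be used to check that a downloaded copy of the archive is intact. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_hash` and the local file path are illustrative assumptions, not part of SpeechToolkit.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for speechtoolkit-0.0.3.tar.gz (copied from the listing above).
EXPECTED_SHA256 = "284ba6050c2a7a273da872ad089d738492f544d1d748777784beb7a9c449d79a"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected one, ignoring letter case."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical location: assumes the archive was saved to the current directory.
    archive = Path("speechtoolkit-0.0.3.tar.gz")
    if archive.exists():
        print("hash OK" if matches_published_hash(archive) else "hash MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually means the download was truncated or altered, in which case the file should be fetched again before installing.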
