
Local TTS and audio transcription web app


VocalFlow

Local TTS, voice cloning, and transcription for Windows.



Self-hosted web app that runs entirely on your machine. No cloud APIs, no data leaves your PC.

Features

  • Voice Cloning — clone any voice from a short audio clip (3-10s), save & reuse voice prompts
  • Custom Voice — 9 preset speakers with emotion/tone control ("say it angrily", "whisper softly")
  • Voice Design — create new voices from text descriptions, no reference audio needed
  • Transcription — Whisper-powered transcription with word-level timestamps, 6 model sizes
  • Smart GPU — automatic model load/unload between switches, only one model in VRAM at a time
  • Flash Attention 2 for faster inference
  • 10+ languages with auto-detection
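The "only one model in VRAM at a time" behaviour can be pictured with a small sketch. This is not VocalFlow's actual code: the class name `SingleModelSlot`, its `loaders` mapping, and the `release` hook are all hypothetical names chosen for illustration. In a PyTorch setup, `release` could plausibly be `torch.cuda.empty_cache`, but that is an assumption about how the app frees memory.

```python
import gc


class SingleModelSlot:
    """Keep at most one loaded model alive at a time (illustrative sketch)."""

    def __init__(self, loaders, release=None):
        # loaders: maps a model name to a zero-argument callable that builds it.
        # release: optional hook called after a model is dropped, e.g. to
        # return cached GPU memory to the driver (hypothetical parameter).
        self._loaders = loaders
        self._release = release
        self._name = None
        self._model = None

    @property
    def current(self):
        """Name of the model currently held, or None."""
        return self._name

    def get(self, name):
        """Return the requested model, unloading any other model first."""
        if name == self._name:
            return self._model  # already loaded, reuse it
        if name not in self._loaders:
            raise KeyError(f"unknown model: {name}")
        self.unload()  # free the previous model before loading the next
        self._model = self._loaders[name]()
        self._name = name
        return self._model

    def unload(self):
        """Drop the held model, if any, and run the release hook."""
        if self._model is None:
            return
        self._model = None
        self._name = None
        gc.collect()  # drop lingering references before releasing memory
        if self._release is not None:
            self._release()
```

Calling `slot.get("tts")` and then `slot.get("asr")` unloads the first model before the second loader runs, so the two never coexist in memory.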

Requirements

  • Windows 10/11 with a CUDA GPU (6+ GB VRAM)
  • Python 3.11
  • FFmpeg: winget install ffmpeg
  • SoX: winget install sox
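Before installing, it can help to confirm the prerequisites above are in place. The following is a minimal sketch using only the standard library; the function name `check_prerequisites` is hypothetical, and it assumes the Python requirement means exactly 3.11 (whether newer versions work is not stated here). It does not check for a CUDA GPU.

```python
import shutil
import sys


def check_prerequisites(tools=("ffmpeg", "sox"), python=(3, 11)):
    """Return a dict mapping each requirement to True (met) or False."""
    major, minor = python
    report = {
        # Assumption: the listed Python version must match exactly.
        f"python {major}.{minor}": tuple(sys.version_info[:2]) == (major, minor)
    }
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'ok' if ok else 'MISSING':8}{name}")
```

If a tool is reported missing right after a winget install, opening a new terminal usually refreshes PATH.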

Quick Start

pip install vocalflow
vocalflow

Open http://localhost:5001. Models download automatically on first use.
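If you start the app from a script, you may want to wait until the web UI responds before opening a browser. Here is a minimal sketch with the standard library; the helper name `wait_until_up` and its default timeout values are assumptions, and only the URL comes from the instructions above.

```python
import time
import urllib.error
import urllib.request


def wait_until_up(url="http://localhost:5001", timeout=60.0, interval=1.0):
    """Poll url until it answers successfully or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status < 400:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not accepting connections yet, or returned an error
        time.sleep(interval)
    return False


if __name__ == "__main__":
    import webbrowser

    if wait_until_up():
        webbrowser.open("http://localhost:5001")
    else:
        print("VocalFlow did not respond in time")
```

The first launch may take longer than later ones because models download on first use, so a generous timeout is reasonable.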

From source

git clone https://github.com/0xBinayak/VocalFlow.git
cd VocalFlow
uv sync
uv run app.py

For auto-reload during development: uv run gradio app.py

Models

Model                         Params     Use
Qwen3-TTS-1.7B-Base           1.7B       Voice cloning (best quality)
Qwen3-TTS-0.6B-Base           0.6B       Voice cloning (faster)
Qwen3-TTS-1.7B-CustomVoice    1.7B       Preset speakers + instructions
Qwen3-TTS-0.6B-CustomVoice    0.6B       Preset speakers only
Qwen3-TTS-1.7B-VoiceDesign    1.7B       Voice from text description
Whisper (tiny-turbo)          39M-1.5B   Transcription

All TTS models run in bfloat16 with SDPA/Flash Attention. Whisper falls back to CPU if no GPU is available.
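As a rough guide to why one model fits in VRAM at a time, bfloat16 stores each parameter in 2 bytes, so weight memory is about twice the parameter count in bytes. The sketch below works that out from the nominal sizes in the table and also illustrates a CPU fallback check. The function names are hypothetical, the figures cover weights only (activations, caches, and framework overhead come on top), and the fallback logic is an assumption about how such a check might look, not VocalFlow's code.

```python
BYTES_PER_PARAM_BF16 = 2  # bfloat16 is a 16-bit format

# Nominal parameter counts taken from the model names in the table.
MODEL_PARAMS = {
    "Qwen3-TTS-1.7B": 1_700_000_000,
    "Qwen3-TTS-0.6B": 600_000_000,
}


def weight_gb(n_params, bytes_per_param=BYTES_PER_PARAM_BF16):
    """Approximate weight memory in gigabytes (10**9 bytes), weights only."""
    return n_params * bytes_per_param / 1e9


def cuda_available():
    """True if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def pick_device(has_cuda=None):
    """Choose "cuda" when a GPU is usable, otherwise fall back to "cpu"."""
    if has_cuda is None:
        has_cuda = cuda_available()
    return "cuda" if has_cuda else "cpu"


if __name__ == "__main__":
    for name, n_params in MODEL_PARAMS.items():
        print(f"{name}: about {weight_gb(n_params):.1f} GB of weights in bfloat16")
    print("device:", pick_device())
```

By this estimate the 1.7B models need roughly 3.4 GB for weights and the 0.6B models roughly 1.2 GB.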

Contributing

  1. Fork & clone, uv sync, create a branch
  2. Ensure uvx ruff check app.py transcribe.py main.py passes
  3. Open a PR against main

Open an issue if you find a bug.

License

MIT

Acknowledgments

Qwen3-TTS | OpenAI Whisper | Gradio
