Local TTS and audio transcription web app

Project description

VocalFlow

Local TTS, voice cloning, and transcription for Windows.


Self-hosted web app powered by Qwen3-TTS and Whisper. Everything runs locally — no cloud APIs, no data leaves your machine.

Features

  • Voice Cloning — clone any voice from a short audio clip, save and reuse voice prompts
  • Custom Voice — 9 preset speakers with emotion and tone control
  • Voice Design — create new voices from natural language descriptions
  • Transcription — word-level timestamps, 6 model sizes, auto language detection
  • Smart GPU — automatic model switching with VRAM cleanup, Flash Attention 2
  • 10+ languages including English, Chinese, Japanese, Korean, and more
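As a rough illustration of what word-level timestamps look like, the sketch below flattens a Whisper-style result into (word, start, end) tuples. It is not VocalFlow's own code: it assumes the openai-whisper package and its result layout, and the helper names `word_timestamps` and `transcribe_words` are hypothetical.

```python
def word_timestamps(result):
    """Flatten a Whisper-style result dict into (word, start, end) tuples."""
    words = []
    for segment in result.get("segments", []):
        # Each segment carries a "words" list only when word timestamps were requested.
        for w in segment.get("words", []):
            words.append((w["word"].strip(), float(w["start"]), float(w["end"])))
    return words


def transcribe_words(audio_path, model_size="base"):
    """Transcribe a file and return (detected_language, word tuples)."""
    # Imported lazily so word_timestamps() works without the package installed.
    import whisper  # assumption: the openai-whisper package, not VocalFlow's API

    model = whisper.load_model(model_size)
    # No language is passed, so Whisper detects it from the audio.
    result = model.transcribe(audio_path, word_timestamps=True)
    return result["language"], word_timestamps(result)
```

Calling `transcribe_words` with a path to a local audio file downloads the chosen model on first use and needs FFmpeg on PATH.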

Requirements

  • Windows 10/11 with a CUDA GPU (6 GB+ VRAM)
  • Python 3.11
  • FFmpeg (install with: winget install ffmpeg)
  • SoX (install with: winget install sox)
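Before installing, the requirements above can be checked with a short preflight script. This is a minimal sketch rather than anything VocalFlow ships: the function name `check_prerequisites` is hypothetical, treating "Python 3.11" as an exact major.minor match is an assumption, and the GPU and VRAM requirement is not checked.

```python
import shutil
import sys


def check_prerequisites(required_python=(3, 11), tools=("ffmpeg", "sox")):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Compare only major.minor against the version the requirements name.
    if sys.version_info[:2] != tuple(required_python):
        problems.append(
            f"Python {required_python[0]}.{required_python[1]} expected, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    print("\n".join(issues) if issues else "All prerequisite checks passed.")
```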

Quick Start

pip install vocalflow
vocalflow

Open http://localhost:5001 — models download automatically on first use.
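To confirm from a script that the app is listening, a small probe like the one below can help. It is a sketch using only the standard library; the function name `is_up` and the two-second timeout are illustrative choices, and the default URL is the address given above.

```python
import urllib.error
import urllib.request


def is_up(url="http://localhost:5001", timeout=2.0):
    """Return True if an HTTP GET to url succeeds, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    print("VocalFlow is reachable" if is_up() else "Nothing is answering on port 5001 yet")
```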

From source

git clone https://github.com/0xBinayak/VocalFlow.git && cd VocalFlow
uv sync && uv run app.py

Dev mode with auto-reload: uv run gradio app.py

Contributing

Fork the repository, run uv sync, create a branch, make sure uvx ruff check app.py transcribe.py main.py passes, then open a pull request.

Report a bug

License

MIT. Built on Qwen3-TTS, OpenAI Whisper, and Gradio.

Download files

Source Distribution

vocalflow-0.7.1.tar.gz (67.4 kB view details)

Uploaded Source

Built Distribution

vocalflow-0.7.1-py3-none-any.whl (14.5 kB view details)

Uploaded Python 3

File details

Details for the file vocalflow-0.7.1.tar.gz.

File metadata

  • Download URL: vocalflow-0.7.1.tar.gz
  • Upload date:
  • Size: 67.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vocalflow-0.7.1.tar.gz
Algorithm    Hash digest
SHA256       4002eb5063b3b5fc7a45731b9541926d2ee87cb6b6c764305cb8ae5191a65729
MD5          0224b4f9b9e58a2ffc9e1fd6303803aa
BLAKE2b-256  32b18875ab6ad092e9a8e9c6df710793ecc74109fd66a587768e1288dc28cae4
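A downloaded file can be checked against the SHA256 digest listed above with a few lines of standard-library Python. This is a minimal sketch; the helper names `sha256_of` and `matches` are hypothetical, and the path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True if the file's SHA256 equals the published hex digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Usage (assumed local path):
# matches("vocalflow-0.7.1.tar.gz",
#         "4002eb5063b3b5fc7a45731b9541926d2ee87cb6b6c764305cb8ae5191a65729")
```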

Provenance

The following attestation bundles were made for vocalflow-0.7.1.tar.gz:

Publisher: publish.yml on 0xBinayak/VocalFlow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vocalflow-0.7.1-py3-none-any.whl.

File metadata

  • Download URL: vocalflow-0.7.1-py3-none-any.whl
  • Upload date:
  • Size: 14.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vocalflow-0.7.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       e543a53dcaadc20faea34b3b30028509331e593a5f9d49e2036ed5fc23e56aa5
MD5          9d5a6c60ae83b1737bc00b34ba64dffb
BLAKE2b-256  7984dc6091a55b2c158cf237559d9daf8ed5de253fd3be2d98a4127130c254e3

Provenance

The following attestation bundles were made for vocalflow-0.7.1-py3-none-any.whl:

Publisher: publish.yml on 0xBinayak/VocalFlow