Local TTS and audio transcription web app

Project description

VocalFlow

Local TTS, voice cloning, and transcription for Windows.

Self-hosted web app powered by Qwen3-TTS and Whisper. Everything runs locally — no cloud APIs, no data leaves your machine.

Features

Voice Cloning — clone any voice from a short audio clip, save and reuse voice prompts
Custom Voice — 9 preset speakers with emotion and tone control
Voice Design — create new voices from natural language descriptions
Transcription — word-level timestamps, 6 model sizes, auto language detection
Smart GPU — automatic model switching with VRAM cleanup, Flash Attention 2
10+ languages, including English, Chinese, Japanese, and Korean

Requirements

Windows 10/11 with a CUDA GPU (6 GB+ VRAM)
Python 3.11
FFmpeg: winget install ffmpeg
SoX: winget install sox
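
The software requirements above can be checked before installing. The following is a minimal sketch using only the standard library; the function name check_prerequisites is hypothetical and not part of VocalFlow. It assumes the requirement means exactly Python 3.11, and that the FFmpeg and SoX executables are named ffmpeg and sox on PATH. It does not check the GPU or VRAM.

```python
import shutil
import sys


def check_prerequisites(required_python=(3, 11), tools=("ffmpeg", "sox")):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Compare only major.minor. Assumption: the requirement means exactly 3.11.
    if sys.version_info[:2] != required_python:
        found = ".".join(map(str, sys.version_info[:2]))
        wanted = ".".join(map(str, required_python))
        problems.append(f"Python {wanted} expected, found {found}")
    # shutil.which returns None when the executable is not on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH (try: winget install {tool})")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    print("\n".join(issues) if issues else "All prerequisites found.")
```

If later Python versions turn out to work, the equality check can be relaxed to a minimum-version comparison.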

Quick Start

pip install vocalflow
vocalflow

Open http://localhost:5001 — models download automatically on first use.
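
When scripting around the app, it can help to wait until the local server accepts connections before opening the browser. The sketch below is one way to do that, assuming the default address localhost:5001 given above; wait_for_port is a hypothetical helper, not part of VocalFlow, and it only confirms that the port is open, not that models have finished downloading.

```python
import socket
import time


def wait_for_port(host="localhost", port=5001, timeout=60.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait briefly, then retry.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    ready = wait_for_port(timeout=5.0)
    print("Server is up." if ready else "Nothing is listening on port 5001 yet.")
```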

From source

git clone https://github.com/0xBinayak/VocalFlow.git && cd VocalFlow
uv sync && uv run app.py

Dev mode with auto-reload: uv run gradio app.py

Contributing

Fork the repository, run uv sync, create a branch, make sure uvx ruff check app.py transcribe.py main.py passes, then open a pull request. To report a bug, use the GitHub repository.

License

MIT. Built with Qwen3-TTS, OpenAI Whisper, and Gradio.

Project details

Release history

0.7.1 (this version): Mar 23, 2026
0.7.0: Mar 23, 2026
0.6.2: Mar 23, 2026
0.6.1: Mar 23, 2026
0.6.0: Mar 23, 2026
0.3.2: Mar 18, 2026
0.3.1: Mar 17, 2026
0.3.0: Mar 17, 2026
0.2.1: Mar 17, 2026
0.1.0: Mar 17, 2026

Download files

Source Distribution

vocalflow-0.7.1.tar.gz (67.4 kB)
Uploaded Mar 23, 2026 (Source)

Built Distribution

vocalflow-0.7.1-py3-none-any.whl (14.5 kB)
Uploaded Mar 23, 2026 (Python 3)

File details

Details for the file vocalflow-0.7.1.tar.gz.

File metadata

Download URL: vocalflow-0.7.1.tar.gz
Upload date: Mar 23, 2026
Size: 67.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vocalflow-0.7.1.tar.gz
Algorithm	Hash digest
SHA256	`4002eb5063b3b5fc7a45731b9541926d2ee87cb6b6c764305cb8ae5191a65729`
MD5	`0224b4f9b9e58a2ffc9e1fd6303803aa`
BLAKE2b-256	`32b18875ab6ad092e9a8e9c6df710793ecc74109fd66a587768e1288dc28cae4`
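
A downloaded file can be checked against the SHA256 digest listed above. This is a minimal sketch using Python's standard hashlib module; the helper names sha256_of and verify are hypothetical, and the script assumes the archive was saved as vocalflow-0.7.1.tar.gz in the current directory.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest for vocalflow-0.7.1.tar.gz, copied from the table above.
EXPECTED_SHA256 = "4002eb5063b3b5fc7a45731b9541926d2ee87cb6b6c764305cb8ae5191a65729"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads are not read into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 hex digest matches the expected value."""
    return hmac.compare_digest(sha256_of(path), expected.lower())


if __name__ == "__main__":
    target = Path("vocalflow-0.7.1.tar.gz")  # assumed download location
    if target.exists():
        print("OK" if verify(target) else "MISMATCH")
    else:
        print(f"{target} not found")
```

The same helper works for the wheel if the expected digest is swapped for the one listed for that file.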

Provenance

The following attestation bundles were made for vocalflow-0.7.1.tar.gz:

Publisher: publish.yml on 0xBinayak/VocalFlow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vocalflow-0.7.1.tar.gz
- Subject digest: 4002eb5063b3b5fc7a45731b9541926d2ee87cb6b6c764305cb8ae5191a65729
- Sigstore transparency entry: 1156204818
- Sigstore integration time: Mar 23, 2026
Source repository:
- Permalink: 0xBinayak/VocalFlow@776786f567797e06998a45730df93810a1619b63
- Branch / Tag: refs/tags/v0.7.1
- Owner: https://github.com/0xBinayak
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@776786f567797e06998a45730df93810a1619b63
- Trigger Event: push

File details

Details for the file vocalflow-0.7.1-py3-none-any.whl.

File metadata

Download URL: vocalflow-0.7.1-py3-none-any.whl
Upload date: Mar 23, 2026
Size: 14.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vocalflow-0.7.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e543a53dcaadc20faea34b3b30028509331e593a5f9d49e2036ed5fc23e56aa5`
MD5	`9d5a6c60ae83b1737bc00b34ba64dffb`
BLAKE2b-256	`7984dc6091a55b2c158cf237559d9daf8ed5de253fd3be2d98a4127130c254e3`

Provenance

The following attestation bundles were made for vocalflow-0.7.1-py3-none-any.whl:

Publisher: publish.yml on 0xBinayak/VocalFlow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vocalflow-0.7.1-py3-none-any.whl
- Subject digest: e543a53dcaadc20faea34b3b30028509331e593a5f9d49e2036ed5fc23e56aa5
- Sigstore transparency entry: 1156204821
- Sigstore integration time: Mar 23, 2026
Source repository:
- Permalink: 0xBinayak/VocalFlow@776786f567797e06998a45730df93810a1619b63
- Branch / Tag: refs/tags/v0.7.1
- Owner: https://github.com/0xBinayak
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@776786f567797e06998a45730df93810a1619b63
- Trigger Event: push
