F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
- F5-TTS: Diffusion Transformer with ConvNeXt V2, with faster training and inference.
- E2 TTS: flat-UNet Transformer, the closest reproduction of the paper.
- Sway Sampling: an inference-time flow-step sampling strategy that greatly improves performance.
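Per the paper, sway sampling warps the uniform flow step t ∈ [0, 1] with f(t; s) = t + s·(cos(πt/2) − 1 + t), spending more ODE steps early in the flow when s < 0. A minimal sketch (the function and step count are illustrative, not taken from the codebase):

```python
import math

def sway_sample(t: float, s: float = -1.0) -> float:
    """Warp a uniform flow step t in [0, 1]; for s < 0 more steps land
    near t = 0. The endpoints are preserved: f(0) = 0 and f(1) = 1."""
    return t + s * (math.cos(math.pi / 2 * t) - 1 + t)

# Warp 16 uniform steps, matching the 16-NFE setting used in the benchmarks.
steps = [i / 15 for i in range(16)]
warped = [sway_sample(t, s=-1.0) for t in steps]
```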
Thanks to all the contributors!
News
- 2025/03/12: 🔥 F5-TTS v1 base model with better training and inference performance. A few demos are available.
- 2024/10/08: F5-TTS & E2 TTS base models on 🤗 Hugging Face, 🤖 Model Scope, 🟣 Wisemodel.
Installation
Create a separate environment if needed
# Create a conda env with python_version>=3.10 (you could also use virtualenv)
conda create -n f5-tts python=3.11
conda activate f5-tts
# Install FFmpeg if you haven't yet
conda install ffmpeg
Install PyTorch with matched device
NVIDIA GPU
# Install pytorch with your CUDA version, e.g.
pip install torch==2.8.0+cu128 torchaudio==2.8.0+cu128 --extra-index-url https://download.pytorch.org/whl/cu128
# Earlier versions also work, e.g.
pip install torch==2.4.0+cu124 torchaudio==2.4.0+cu124 --extra-index-url https://download.pytorch.org/whl/cu124
AMD GPU
# Install pytorch with your ROCm version (Linux only), e.g.
pip install torch==2.5.1+rocm6.2 torchaudio==2.5.1+rocm6.2 --extra-index-url https://download.pytorch.org/whl/rocm6.2
Intel GPU
# Install pytorch with your XPU version, e.g.
# Intel® Deep Learning Essentials or Intel® oneAPI Base Toolkit must be installed
pip install torch torchaudio --index-url https://download.pytorch.org/whl/test/xpu
# Intel GPU support is also available through IPEX (Intel® Extension for PyTorch)
# IPEX does not require the Intel® Deep Learning Essentials or Intel® oneAPI Base Toolkit
# See: https://pytorch-extension.intel.com/installation?request=platform
Apple Silicon
# Install the stable pytorch, e.g.
pip install torch torchaudio
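Whichever backend is installed above, inference code selects it through the same torch device string. A small pure-Python sketch of one possible priority order (the helper and its ordering are my assumption; the availability flags correspond to `torch.cuda.is_available()`, `torch.xpu.is_available()`, and `torch.backends.mps.is_available()`):

```python
def pick_device(cuda: bool = False, xpu: bool = False, mps: bool = False) -> str:
    """Return a torch device string given backend availability flags,
    preferring CUDA, then Intel XPU, then Apple MPS, else CPU."""
    if cuda:
        return "cuda"
    if xpu:
        return "xpu"
    if mps:
        return "mps"
    return "cpu"

# e.g. pick_device(cuda=torch.cuda.is_available(), ...) -> "cuda" on an NVIDIA box
```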
Then choose one of the options below:
1. As a pip package (if just for inference)
pip install f5-tts
2. Local editable (if also doing training or finetuning)
git clone https://github.com/SWivid/F5-TTS.git
cd F5-TTS
# git submodule update --init --recursive  # (optional, if using bigvgan as vocoder)
pip install -e .
Docker usage is also available
# Build from Dockerfile
docker build -t f5tts:v1 .
# Run from GitHub Container Registry
docker container run --rm -it --gpus=all --mount 'type=volume,source=f5-tts,target=/root/.cache/huggingface/hub/' -p 7860:7860 ghcr.io/swivid/f5-tts:main
# Quickstart if you want to just run the web interface (not CLI)
docker container run --rm -it --gpus=all --mount 'type=volume,source=f5-tts,target=/root/.cache/huggingface/hub/' -p 7860:7860 ghcr.io/swivid/f5-tts:main f5-tts_infer-gradio --host 0.0.0.0
Runtime
Deployment solution with Triton and TensorRT-LLM.
Benchmark Results
Decoding on a single NVIDIA L20 GPU, using 26 different prompt_audio & target_text pairs, with 16 NFE steps.
| Model | Concurrency | Avg Latency | RTF | Mode |
|---|---|---|---|---|
| F5-TTS Base (Vocos) | 2 | 253 ms | 0.0394 | Client-Server |
| F5-TTS Base (Vocos) | 1 (batch size) | - | 0.0402 | Offline TRT-LLM |
| F5-TTS Base (Vocos) | 1 (batch size) | - | 0.1467 | Offline PyTorch |
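RTF (real-time factor) in the table is wall-clock decoding time divided by the duration of the generated audio, so values below 1 mean faster-than-real-time synthesis. A tiny illustrative helper (not part of the codebase) to make the reading concrete:

```python
def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / generated-audio duration.
    RTF < 1 means the model synthesizes faster than real time."""
    return decode_seconds / audio_seconds

# At RTF 0.0394 (Client-Server row), 10 s of audio takes about 0.394 s to decode.
rtf = real_time_factor(0.394, 10.0)
```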
See detailed instructions for more information.
Inference
- To achieve the desired performance, take a moment to read the detailed guidance.
- Searching existing issues with keywords from the problem you encounter is very helpful.
1. Gradio App
Currently supported features:
- Basic TTS with Chunk Inference
- Multi-Style / Multi-Speaker Generation
- Voice Chat powered by Qwen2.5-3B-Instruct
- Custom inference with more language support
# Launch a Gradio app (web interface)
f5-tts_infer-gradio
# Specify the port/host
f5-tts_infer-gradio --port 7860 --host 0.0.0.0
# Launch a share link
f5-tts_infer-gradio --share
Example docker compose file for NVIDIA devices:
services:
f5-tts:
image: ghcr.io/swivid/f5-tts:main
ports:
- "7860:7860"
environment:
GRADIO_SERVER_PORT: 7860
entrypoint: ["f5-tts_infer-gradio", "--port", "7860", "--host", "0.0.0.0"]
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
volumes:
f5-tts:
driver: local
2. CLI Inference
# Run with flags
# Leaving --ref_text "" will have an ASR model transcribe the reference audio (extra GPU memory usage)
f5-tts_infer-cli --model F5TTS_v1_Base \
--ref_audio "provide_prompt_wav_path_here.wav" \
--ref_text "The content, subtitle or transcription of reference audio." \
--gen_text "Some text you want the TTS model to generate for you."
# Run with the default setting in src/f5_tts/infer/examples/basic/basic.toml
f5-tts_infer-cli
# Or with your own .toml file
f5-tts_infer-cli -c custom.toml
# Multi-voice generation. See src/f5_tts/infer/README.md
f5-tts_infer-cli -c src/f5_tts/infer/examples/multi/story.toml
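For scripted batch runs, the CLI invocation above can be assembled programmatically. A hypothetical wrapper (the `build_cmd` helper is mine; only the `f5-tts_infer-cli` flags come from the example above):

```python
import subprocess

def build_cmd(ref_audio: str, ref_text: str, gen_text: str,
              model: str = "F5TTS_v1_Base") -> list[str]:
    """Assemble the f5-tts_infer-cli invocation shown above.
    An empty ref_text triggers ASR transcription of the reference audio."""
    return [
        "f5-tts_infer-cli",
        "--model", model,
        "--ref_audio", ref_audio,
        "--ref_text", ref_text,
        "--gen_text", gen_text,
    ]

# subprocess.run(build_cmd("prompt.wav", "", "Hello there."), check=True)
```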
Training
1. With Hugging Face Accelerate
Refer to training & finetuning guidance for best practice.
2. With Gradio App
# Quick start with Gradio web interface
f5-tts_finetune-gradio
Read training & finetuning guidance for more instructions.
Evaluation
Development
Use pre-commit to ensure code quality (will run linters and formatters automatically):
pip install pre-commit
pre-commit install
When making a pull request, before each commit, run:
pre-commit run --all-files
Note: Some model components have linting exceptions for E722 to accommodate tensor notation.
Acknowledgements
- E2-TTS brilliant work, simple and effective
- Emilia, WenetSpeech4TTS, LibriTTS, LJSpeech valuable datasets
- lucidrains initial CFM structure with also bfs18 for discussion
- SD3 & Hugging Face diffusers DiT and MMDiT code structure
- torchdiffeq as ODE solver, Vocos and BigVGAN as vocoder
- FunASR, faster-whisper, UniSpeech, SpeechMOS for evaluation tools
- ctc-forced-aligner for speech edit test
- mrfakename huggingface space demo ~
- f5-tts-mlx Implementation with MLX framework by Lucas Newman
- F5-TTS-ONNX ONNX Runtime version by DakeQQ
- Yuekai Zhang Triton and TensorRT-LLM support ~
Citation
If our work and codebase are useful to you, please cite:
@article{chen-etal-2024-f5tts,
title={F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching},
author={Yushen Chen and Zhikang Niu and Ziyang Ma and Keqi Deng and Chunhui Wang and Jian Zhao and Kai Yu and Xie Chen},
journal={arXiv preprint arXiv:2410.06885},
year={2024},
}
License
Our code is released under the MIT License. The pre-trained models are licensed under CC-BY-NC because of the training data Emilia, which is an in-the-wild dataset. Sorry for any inconvenience this may cause.