Kyutai's pocket-sized TTS!

Project description

Pocket TTS

A lightweight text-to-speech (TTS) application designed to run efficiently on CPUs. Forget about the hassle of using GPUs and web APIs serving TTS models. With Kyutai's Pocket TTS, generating audio is just a pip install and a function call away.

Supports Python 3.10, 3.11, 3.12, 3.13 and 3.14. Requires PyTorch 2.5 or later. The GPU version of PyTorch is not required.

🔊 Demo | 🐱‍💻 GitHub Repository | 🤗 Hugging Face Model Card | 📄 Paper | 📚 Documentation

Main takeaways

Runs on CPU
Small model size, 100M parameters
Audio streaming
Low latency, ~200 ms to get the first audio chunk
Faster than real-time, ~6x real-time on the CPU of a MacBook Air M4
Uses only 2 CPU cores
Python API and CLI
Voice cloning
English only at the moment
Can handle arbitrarily long text inputs
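
To make the speed figures above concrete, here is a back-of-the-envelope sketch. It assumes the ~6x real-time figure holds as a constant throughput, which is a simplification, and the function names are illustrative rather than part of pocket-tts.

```python
def generation_seconds(audio_seconds: float, speedup: float = 6.0) -> float:
    """Estimated compute time for a clip, assuming a constant real-time speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


def realtime_speedup(audio_seconds: float, elapsed_seconds: float) -> float:
    """Speedup measured from a run: seconds of audio produced per second of compute."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return audio_seconds / elapsed_seconds


if __name__ == "__main__":
    # At ~6x real-time, one minute of speech takes roughly ten seconds to generate.
    print(generation_seconds(60.0))
    # A run that produced 30 s of audio in 5 s of compute ran at 6x real-time.
    print(realtime_speedup(30.0, 5.0))
```

The second function is the one to use with your own timings, since the actual speedup depends on the machine.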

Trying it from the website, without installing anything

Navigate to the Kyutai website to try it out directly in your browser. You can input text, select different voices, and generate speech without any installation.

Trying it with the CLI

The `generate` command

You can use pocket-tts directly from the command line. We recommend uv, which installs the dependencies on the fly in an isolated environment (see the uv installation instructions). Alternatively, install it manually with pip install pocket-tts.

The command below generates a wav file, ./tts_output.wav, that speaks the default text in the default voice, and prints some speed statistics.

uvx pocket-tts generate
# or if you installed it manually with pip:
pocket-tts generate

Modify the voice with --voice and the text with --text. We provide a small catalog of voices.

You can take a look at this page which details the licenses for each voice.

The --voice argument can also take a plain wav file as input for voice cloning. Feel free to check out the generate documentation for more details and examples. To try multiple voices and prompts quickly, use the serve command instead.
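
If you want to drive the generate command from a script, a minimal sketch could look like the following. It only uses the --voice and --text flags described above. The helper names and the my_voice.wav file are hypothetical, and the example prints the command rather than running it.

```python
import shlex
import subprocess
from typing import List, Optional


def build_generate_command(
    text: Optional[str] = None,
    voice: Optional[str] = None,
    use_uvx: bool = True,
) -> List[str]:
    """Build the argv list for `pocket-tts generate`, adding only the flags given."""
    command = ["uvx", "pocket-tts", "generate"] if use_uvx else ["pocket-tts", "generate"]
    if voice is not None:
        command += ["--voice", voice]
    if text is not None:
        command += ["--text", text]
    return command


def run_generate(**kwargs) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_generate_command(**kwargs), check=True)


if __name__ == "__main__":
    # "my_voice.wav" is a hypothetical local recording used for voice cloning.
    command = build_generate_command(
        text="Hello from the command line.", voice="my_voice.wav"
    )
    print(shlex.join(command))  # shell-quoted form, ready to copy and paste
```

Passing the arguments as a list avoids shell-quoting problems when the text contains spaces or quotes.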

The `serve` command

You can also run a local server to generate audio via HTTP requests.

uvx pocket-tts serve
# or if you installed it manually with pip:
pocket-tts serve

Navigate to http://localhost:8000 to try the web interface. It is faster than the command line because the model is kept in memory between requests.

You can check out the serve documentation for more details and examples.
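
When scripting against the local server, it can help to check that it has finished starting before sending work to it. The sketch below only tests whether something answers HTTP at the address mentioned above. It does not call any pocket-tts endpoint, since the request format is covered by the serve documentation rather than here, and the function name is an assumption of this example.

```python
import urllib.error
import urllib.request


def server_is_up(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at base_url, False otherwise."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied, just not with a 2xx status, so it is running.
        return True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, or malformed URL.
        return False


if __name__ == "__main__":
    print("server reachable:", server_is_up())
```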

Using it as a Python library

Install the package with

pip install pocket-tts
# or
uv add pocket-tts

You can use this package as a simple Python library to generate audio from text.

from pocket_tts import TTSModel
import scipy.io.wavfile

tts_model = TTSModel.load_model()
voice_state = tts_model.get_state_for_audio_prompt(
    "hf://kyutai/tts-voices/alba-mackenna/casual.wav"
)
audio = tts_model.generate_audio(voice_state, "Hello world, this is a test.")
# Audio is a 1D torch tensor containing PCM data.
scipy.io.wavfile.write("output.wav", tts_model.sample_rate, audio.numpy())

You can keep several voice states around if you want to use multiple voices. load_model() and get_state_for_audio_prompt() are relatively slow operations, so we recommend keeping the model and voice states in memory if you can.
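
One way to follow this recommendation is a small cache that loads each voice state once and reuses it afterwards. This is a sketch, not part of the package: VoiceStateCache and main are hypothetical names, and the pocket_tts calls inside main simply mirror the library example in this section, on the assumption that a voice state can be reused across several generate_audio calls.

```python
from typing import Any, Callable, Dict


class VoiceStateCache:
    """Load each voice state once and reuse it for later generations."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader
        self._states: Dict[str, Any] = {}

    def get(self, prompt: str) -> Any:
        # Only call the (slow) loader the first time a prompt is requested.
        if prompt not in self._states:
            self._states[prompt] = self._loader(prompt)
        return self._states[prompt]


def main() -> None:
    # Requires `pip install pocket-tts`; imported here so the cache itself
    # can be used and tested without the package installed.
    import scipy.io.wavfile
    from pocket_tts import TTSModel

    tts_model = TTSModel.load_model()  # slow, so do it once
    cache = VoiceStateCache(tts_model.get_state_for_audio_prompt)
    prompt = "hf://kyutai/tts-voices/alba-mackenna/casual.wav"
    for index, text in enumerate(["First sentence.", "Second sentence."]):
        # The voice state is computed on the first iteration and reused after.
        audio = tts_model.generate_audio(cache.get(prompt), text)
        scipy.io.wavfile.write(
            f"output_{index}.wav", tts_model.sample_rate, audio.numpy()
        )
```

Call main() to run the example. The cache takes the loader as a parameter, so it works with any function that maps a voice prompt to a state.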

You can check out the Python API documentation for more details and examples.

Unsupported features

At the moment, we do not support (but would love pull requests adding):

We tried running this TTS model on the GPU but did not observe a speedup compared to CPU execution, notably because we use a batch size of 1 and a very small model.

Development and local setup

We accept contributions! Feel free to open issues or pull requests on GitHub.

You can find development instructions in the CONTRIBUTING.md file. It also explains how to set up an editable install of the package for local development.

Prohibited use

Use of our model must comply with all applicable laws and regulations and must not result in, involve, or facilitate any illegal, harmful, deceptive, fraudulent, or unauthorized activity. Prohibited uses include, without limitation, voice impersonation or cloning without explicit and lawful consent; misinformation, disinformation, or deception (including fake news, fraudulent calls, or presenting generated content as genuine recordings of real people or events); and the generation of unlawful, harmful, libelous, abusive, harassing, discriminatory, hateful, or privacy-invasive content. We disclaim all liability for any non-compliant use.

Project details

Release history

2.0.0: Apr 21, 2026
1.1.1: Feb 17, 2026
1.1.0: Feb 16, 2026
1.0.3: Jan 22, 2026
1.0.2: Jan 20, 2026
1.0.1: Jan 13, 2026
1.0.0: Jan 13, 2026 (this version)

Download files

Source Distribution

pocket_tts-1.0.0.tar.gz (520.1 kB)

Uploaded Jan 13, 2026 Source

Built Distribution

pocket_tts-1.0.0-py3-none-any.whl (48.2 kB)

Uploaded Jan 13, 2026 Python 3

File details

Details for the file pocket_tts-1.0.0.tar.gz.

File metadata

Download URL: pocket_tts-1.0.0.tar.gz
Upload date: Jan 13, 2026
Size: 520.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pocket_tts-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`be4eef2157b24ffeac01deb27a579d561c12f9f05860cb184c42a87c69b57f8e`
MD5	`d5088c8118bd0c3f6fc96576e3fa6a2d`
BLAKE2b-256	`a31e472045dedeeeaea215b9f0313dc0c43c916c7681a520e607c37a68281644`
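
To check a downloaded file against the SHA256 digest listed above, a short sketch with Python's standard hashlib module is enough. The function names are illustrative, and the example assumes the archive sits in the current directory under its published name.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest listed above for pocket_tts-1.0.0.tar.gz.
EXPECTED_SHA256 = "be4eef2157b24ffeac01deb27a579d561c12f9f05860cb184c42a87c69b57f8e"


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA256 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex: str) -> bool:
    """True if the file's SHA256 equals expected_hex (case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


if __name__ == "__main__":
    archive = Path("pocket_tts-1.0.0.tar.gz")
    if archive.exists():
        print("digest matches:", matches_digest(archive, EXPECTED_SHA256))
    else:
        print(f"{archive} not found; download it first")
```

The same functions work for the wheel by passing its file name and its own SHA256 digest.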

See more details on using hashes here.

Provenance

The following attestation bundles were made for pocket_tts-1.0.0.tar.gz:

Publisher: publish-package.yml on kyutai-labs/pocket-tts

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pocket_tts-1.0.0.tar.gz
- Subject digest: be4eef2157b24ffeac01deb27a579d561c12f9f05860cb184c42a87c69b57f8e
- Sigstore transparency entry: 815906270
- Sigstore integration time: Jan 13, 2026
Source repository:
- Permalink: kyutai-labs/pocket-tts@b05a9d061bdcde2704a3902a76abe1d6e3ca3226
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/kyutai-labs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-package.yml@b05a9d061bdcde2704a3902a76abe1d6e3ca3226
- Trigger Event: release

File details

Details for the file pocket_tts-1.0.0-py3-none-any.whl.

File metadata

Download URL: pocket_tts-1.0.0-py3-none-any.whl
Upload date: Jan 13, 2026
Size: 48.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pocket_tts-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`86a85a34361148fc63669a91f5466064b5bd6295ff79830179e7f42c65542a71`
MD5	`2e5933b99d9e587d52ec00266a48ce6c`
BLAKE2b-256	`241bf40ba1bd73587cd305a75f68991e4986f31ed9ad27cd2c5c4677d3050a62`

See more details on using hashes here.

Provenance

The following attestation bundles were made for pocket_tts-1.0.0-py3-none-any.whl:

Publisher: publish-package.yml on kyutai-labs/pocket-tts

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pocket_tts-1.0.0-py3-none-any.whl
- Subject digest: 86a85a34361148fc63669a91f5466064b5bd6295ff79830179e7f42c65542a71
- Sigstore transparency entry: 815906294
- Sigstore integration time: Jan 13, 2026
Source repository:
- Permalink: kyutai-labs/pocket-tts@b05a9d061bdcde2704a3902a76abe1d6e3ca3226
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/kyutai-labs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-package.yml@b05a9d061bdcde2704a3902a76abe1d6e3ca3226
- Trigger Event: release
