High-performance speech synthesis

Project description

fish_speech_python

Python bindings for the Fish Speech Candle implementation, using PyO3.

Supported Platforms

WARNING: Read this list very carefully. Hardware support is very limited. If you try to use this library on unsupported hardware, it will probably not work.

You have been warned.

Python: 3.9+.

OS + hardware:

Linux:
- CPU: x86_64, glibc 2.34+
  - Example: Ubuntu 22.04 IS supported, Ubuntu 20.04 IS NOT supported
- GPU: Nvidia CUDA 11.8+ with compute capability >= 8.0 (RTX 30 series+, A100 series+)
  - Example: 2080 Ti is NOT supported (Turing)
  - Example: RX5700 XT is NOT supported (AMD)
  - NOTE: We don't currently have a CUDA build matrix, so it's compiled with CUDA 11.8; sorry. It should be compatible with newer CUDA versions, but please use the Rust runtime if full optimization is required.
macOS: Apple Silicon (M1+), macOS 14.0+ (Sonoma)

Windows and AMD hardware will never be supported, so don't ask. Feel free to raise an issue if you need ARM or Alpine Linux.
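
The CPU and OS requirements above can be checked before installing. The following is a minimal sketch, not part of this library: the function names `unsupported_reasons` and `detect_glibc` are made up for illustration, and it only covers the Python version, OS, CPU architecture, and glibc version. It does not check the macOS version or the GPU compute capability.

```python
import platform
import sys
from typing import List, Tuple


def unsupported_reasons(
    system: str,
    machine: str,
    py_version: Tuple[int, int],
    glibc_version: Tuple[int, int] = (0, 0),
) -> List[str]:
    """Return the reasons a platform falls outside the list above (empty if none)."""
    reasons = []
    if py_version < (3, 9):
        reasons.append("Python 3.9+ is required")
    if system == "Linux":
        if machine != "x86_64":
            reasons.append("Linux wheels are x86_64 only")
        # Ubuntu 22.04 ships glibc 2.35; Ubuntu 20.04 ships glibc 2.31.
        if glibc_version < (2, 34):
            reasons.append("glibc 2.34+ is required")
    elif system == "Darwin":
        if machine != "arm64":
            reasons.append("macOS requires Apple Silicon (M1+)")
    else:
        reasons.append(f"{system} is not supported")
    return reasons


def detect_glibc() -> Tuple[int, int]:
    """Best-effort glibc version; (0, 0) if it cannot be determined (e.g. musl)."""
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        return (0, 0)
    try:
        major, minor = (int(part) for part in version.split(".")[:2])
    except ValueError:
        return (0, 0)
    return (major, minor)


if __name__ == "__main__":
    problems = unsupported_reasons(
        platform.system(), platform.machine(), sys.version_info[:2], detect_glibc()
    )
    print("Looks supported" if not problems else "\n".join(problems))
```

An empty list only means the basic requirements are met; CUDA users still need a GPU with compute capability 8.0 or higher.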

Installation

# From PyPI
pip install fish_speech_rs

and done.

Usage

Codec

This is the low-level API. You feed it PCM audio, it compresses it into codes, and then decompresses it back into PCM.

from fish_speech import FireflyCodec
import numpy as np
# optional but highly recommended
from huggingface_hub import snapshot_download

# This just returns a directory path.
# Substitute with your own directory path if you don't want to download from Hugging Face.
dir = snapshot_download("jkeisling/fish-speech-1.5")

# Load the codec model (set device to "cuda" for speed)
codec = FireflyCodec(
    dir,
    version="1.5",  # Supports 1.2 to 1.5; 1.5 is default
    device="cuda"    # Or "cpu" (much slower), "metal" on Apple Silicon
)

# 1s of random audio. Substitute with your own audio.
# You will need to resample to codec.sample_rate yourself. Soundfile is recommended.
pcm = np.random.randn(1, 1, codec.sample_rate).astype(np.float32)  # (batch, channels, samples)
# Encode raw PCM into compressed codes
codes = codec.encode(pcm)

# Decode the compressed codes back into PCM
decoded_pcm = codec.decode(codes)

Input: Raw PCM audio (please handle resampling to 44.1 kHz yourself)
Output: Encoded NumPy uint32 “codes” (compressed speech)
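
Since resampling is left to the caller, here is one way to get audio into the (batch, channels, samples) float32 layout used in the example above. This is a rough sketch under stated assumptions: `to_codec_input` is a hypothetical helper, not part of this library, and it resamples with plain linear interpolation via `np.interp`, which is simple but lower quality than a dedicated resampler.

```python
import numpy as np


def to_codec_input(samples: np.ndarray, src_rate: int, target_rate: int = 44100) -> np.ndarray:
    """Convert (frames,) or (frames, channels) audio to float32 of shape (1, 1, samples)."""
    audio = np.asarray(samples, dtype=np.float64)
    if audio.ndim == 2:
        # Mix down to mono by averaging the channels.
        audio = audio.mean(axis=1)
    if audio.ndim != 1:
        raise ValueError("expected audio shaped (frames,) or (frames, channels)")
    if src_rate != target_rate:
        # Linear interpolation onto the target time grid (crude, illustration only).
        n_out = int(round(audio.shape[0] * target_rate / src_rate))
        old_t = np.arange(audio.shape[0]) / src_rate
        new_t = np.arange(n_out) / target_rate
        audio = np.interp(new_t, old_t, audio)
    return audio.astype(np.float32).reshape(1, 1, -1)


# Possible usage with soundfile ("reference.wav" is a placeholder path):
#   samples, rate = soundfile.read("reference.wav")
#   pcm = to_codec_input(samples, rate, codec.sample_rate)
```

`soundfile.read` returns multi-channel audio as (frames, channels), which is why the helper averages over axis 1.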

LM

The language model (LM) takes text and turns it into speech codes, which you then decode back to audio.

from fish_speech import LM

# Load the TTS model
lm = LM(
    dir,
    version="1.5",
    device="cuda",
    # bf16 only recommended for CUDA, otherwise leave it default (f32)
    dtype="bf16"
)

# Extract the speaker prompt from reference audio
speaker_prompt = lm.get_speaker_prompt([{
    'text': 'foobar',
    'codes': codes  # From previous encoding step
}], sysprompt="Speak out the provided text.")

# Generate speech codes
# Text chunking and normalization are your responsibility (sorry!);
# official text preprocessing helper function coming soon
generated_codes = lm.generate(["This is a test", "This is another test"], speaker_prompt=speaker_prompt)

# Decode to PCM audio using codec from earlier
pcm = codec.decode(generated_codes)
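
Until an official preprocessing helper exists, text chunking has to be done by hand. The sketch below is one simple approach and is not this library's method: `chunk_text` is a hypothetical helper that collapses whitespace, splits on sentence-ending punctuation, and greedily packs sentences into chunks of at most `max_chars` characters (an arbitrary limit chosen for illustration). It does no number or abbreviation normalization.

```python
import re
from typing import List


def chunk_text(text: str, max_chars: int = 200) -> List[str]:
    """Split text into sentence-aligned chunks of roughly max_chars characters."""
    # Collapse runs of whitespace (including newlines) into single spaces.
    normalized = re.sub(r"\s+", " ", text).strip()
    if not normalized:
        return []
    # Split at the space that follows '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?]) ", normalized)
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # The current chunk is full: emit it and start a new one.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Possible usage with the LM from above:
#   generated_codes = lm.generate(chunk_text(long_text), speaker_prompt=speaker_prompt)
```

A single sentence longer than `max_chars` is kept whole rather than split mid-sentence.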

If you're in a Jupyter notebook, you can use the following code to play the audio in a widget:

# assumes you ran the above code
from IPython.display import Audio

Audio(pcm.flatten(), rate=codec.sample_rate)
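
Outside a notebook, the decoded audio can be written to disk instead. This is a minimal sketch using only NumPy and the standard-library `wave` module; `save_wav` is a hypothetical helper, and it assumes the decoded PCM is a single mono utterance of floats in the range [-1, 1], which the source does not state explicitly.

```python
import wave

import numpy as np


def save_wav(path: str, pcm: np.ndarray, sample_rate: int) -> None:
    """Write float PCM (assumed mono, range [-1, 1]) as a 16-bit WAV file."""
    flat = np.asarray(pcm, dtype=np.float32).flatten()
    # Clip to the valid range, then scale to little-endian signed 16-bit integers.
    int16 = (np.clip(flat, -1.0, 1.0) * 32767.0).astype("<i2")
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(int16.tobytes())


# Possible usage with the objects from above ("output.wav" is a placeholder path):
#   save_wav("output.wav", pcm, codec.sample_rate)
```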

Developing

Requires Python and Rust toolchains. Clone this repo, then create a virtual environment, install the Python dependencies, and build the extension with maturin:

python -m venv .venv
source .venv/bin/activate  # activate the venv so pip and maturin install into it
pip install -r requirements.txt
maturin develop

Release history

- 0.3.0 (this version): Feb 24, 2025
- 0.2.3: Feb 23, 2025
- 0.2.2: Feb 23, 2025

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

- fish_speech_rs-0.3.0-cp313-cp313-manylinux_2_34_x86_64.whl (5.1 MB): CPython 3.13, manylinux (glibc 2.34+), x86-64; uploaded Feb 24, 2025
- fish_speech_rs-0.3.0-cp313-cp313-macosx_11_0_arm64.whl (3.2 MB): CPython 3.13, macOS 11.0+, ARM64; uploaded Feb 24, 2025
- fish_speech_rs-0.3.0-cp312-cp312-manylinux_2_34_x86_64.whl (5.1 MB): CPython 3.12, manylinux (glibc 2.34+), x86-64; uploaded Feb 24, 2025
- fish_speech_rs-0.3.0-cp312-cp312-macosx_11_0_arm64.whl (3.2 MB): CPython 3.12, macOS 11.0+, ARM64; uploaded Feb 24, 2025
- fish_speech_rs-0.3.0-cp311-cp311-manylinux_2_34_x86_64.whl (5.1 MB): CPython 3.11, manylinux (glibc 2.34+), x86-64; uploaded Feb 24, 2025
- fish_speech_rs-0.3.0-cp311-cp311-macosx_11_0_arm64.whl (3.2 MB): CPython 3.11, macOS 11.0+, ARM64; uploaded Feb 24, 2025
- fish_speech_rs-0.3.0-cp310-cp310-manylinux_2_34_x86_64.whl (5.1 MB): CPython 3.10, manylinux (glibc 2.34+), x86-64; uploaded Feb 24, 2025
- fish_speech_rs-0.3.0-cp310-cp310-macosx_11_0_arm64.whl (3.2 MB): CPython 3.10, macOS 11.0+, ARM64; uploaded Feb 24, 2025
- fish_speech_rs-0.3.0-cp39-cp39-manylinux_2_34_x86_64.whl (5.1 MB): CPython 3.9, manylinux (glibc 2.34+), x86-64; uploaded Feb 24, 2025
- fish_speech_rs-0.3.0-cp39-cp39-macosx_11_0_arm64.whl (3.2 MB): CPython 3.9, macOS 11.0+, ARM64; uploaded Feb 24, 2025

File details

Details for the file fish_speech_rs-0.3.0-cp313-cp313-manylinux_2_34_x86_64.whl.

File metadata

Download URL: fish_speech_rs-0.3.0-cp313-cp313-manylinux_2_34_x86_64.whl
Upload date: Feb 24, 2025
Size: 5.1 MB
Tags: CPython 3.13, manylinux: glibc 2.34+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.8.2

File hashes

Hashes for fish_speech_rs-0.3.0-cp313-cp313-manylinux_2_34_x86_64.whl
Algorithm	Hash digest
SHA256	`fa387dbed8a12b4bda6ef99d9a0b5a573b7b07aeee2f4b0b939a783952d4326a`
MD5	`701dfa33c5b304bd3fad9d97add8447d`
BLAKE2b-256	`3289a1a4da204193e19df5833abddfe06a98b946aa67351f251668ce15800397`

File details

Details for the file fish_speech_rs-0.3.0-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

Download URL: fish_speech_rs-0.3.0-cp313-cp313-macosx_11_0_arm64.whl
Upload date: Feb 24, 2025
Size: 3.2 MB
Tags: CPython 3.13, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.8.2

File hashes

Hashes for fish_speech_rs-0.3.0-cp313-cp313-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`d9515f218e984ef03b005bce6df17ad2755ec3199b897a0122fd8bc88f3be8a2`
MD5	`66c70d4080664141199bc1dc7f0021da`
BLAKE2b-256	`8edafe7d59e47d78f085fe249f07f19d7e1a858d48621d7347a1247197a2f8c7`

File details

Details for the file fish_speech_rs-0.3.0-cp312-cp312-manylinux_2_34_x86_64.whl.

File metadata

Download URL: fish_speech_rs-0.3.0-cp312-cp312-manylinux_2_34_x86_64.whl
Upload date: Feb 24, 2025
Size: 5.1 MB
Tags: CPython 3.12, manylinux: glibc 2.34+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.8.2

File hashes

Hashes for fish_speech_rs-0.3.0-cp312-cp312-manylinux_2_34_x86_64.whl
Algorithm	Hash digest
SHA256	`5b6eb7e2c2f92516df80142b13e561bda68c557bf63f26bd9eebc6921db02f2d`
MD5	`ebea20694586b53b264bdcf72acd8ac1`
BLAKE2b-256	`aeaca182c249be70190301944720d911e6fa1ac369ae4a3c2f409be38d818905`

File details

Details for the file fish_speech_rs-0.3.0-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

Download URL: fish_speech_rs-0.3.0-cp312-cp312-macosx_11_0_arm64.whl
Upload date: Feb 24, 2025
Size: 3.2 MB
Tags: CPython 3.12, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.8.2

File hashes

Hashes for fish_speech_rs-0.3.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`f6824d2d94c003671fd7d17917c7e9e20b7d8d1678cd056eaa88b59441aaaf16`
MD5	`06d222b2cc4739a3d689931ecaaac9cb`
BLAKE2b-256	`1be14b343617723d2f143314e9af484df679d5e19af54ded2dfeca7605b3b299`

File details

Details for the file fish_speech_rs-0.3.0-cp311-cp311-manylinux_2_34_x86_64.whl.

File metadata

Download URL: fish_speech_rs-0.3.0-cp311-cp311-manylinux_2_34_x86_64.whl
Upload date: Feb 24, 2025
Size: 5.1 MB
Tags: CPython 3.11, manylinux: glibc 2.34+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.8.2

File hashes

Hashes for fish_speech_rs-0.3.0-cp311-cp311-manylinux_2_34_x86_64.whl
Algorithm	Hash digest
SHA256	`0314e9fc1a0c87a510a8b3f1063dc4badac3ff6ed9c9b8f0991b71ade724aca1`
MD5	`401c7dca7f1f51d05a3067b1680965aa`
BLAKE2b-256	`bc8e8ee54c5bf677e38be21bbcac31b910bdd8a51c32da341bb6bd4d56cb1234`

File details

Details for the file fish_speech_rs-0.3.0-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

Download URL: fish_speech_rs-0.3.0-cp311-cp311-macosx_11_0_arm64.whl
Upload date: Feb 24, 2025
Size: 3.2 MB
Tags: CPython 3.11, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.8.2

File hashes

Hashes for fish_speech_rs-0.3.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`41b7b467bf618db39319ec48000fc25e39575e2783f38f289394dc375d81a3c0`
MD5	`2a466704de31bfd7dafb3a5c76a53852`
BLAKE2b-256	`a2bb428a4506a1b37d2830f0d9798ad5cd21c2b5a4f9dfbb927c8121a415f214`

File details

Details for the file fish_speech_rs-0.3.0-cp310-cp310-manylinux_2_34_x86_64.whl.

File metadata

Download URL: fish_speech_rs-0.3.0-cp310-cp310-manylinux_2_34_x86_64.whl
Upload date: Feb 24, 2025
Size: 5.1 MB
Tags: CPython 3.10, manylinux: glibc 2.34+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.8.2

File hashes

Hashes for fish_speech_rs-0.3.0-cp310-cp310-manylinux_2_34_x86_64.whl
Algorithm	Hash digest
SHA256	`0e2071f89d3d799f36ba9ec55067d39703cef503cc61d46608dcac280c102ed0`
MD5	`aaf3b096b77d3a15ee7701c8db34e00b`
BLAKE2b-256	`a181621cfaee05aecd0b286e81e31d6abee0012382f260f9523a63940a82b3c9`

File details

Details for the file fish_speech_rs-0.3.0-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

Download URL: fish_speech_rs-0.3.0-cp310-cp310-macosx_11_0_arm64.whl
Upload date: Feb 24, 2025
Size: 3.2 MB
Tags: CPython 3.10, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.8.2

File hashes

Hashes for fish_speech_rs-0.3.0-cp310-cp310-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`e511d2fe8f14cffbfdb13cab0fb90ce61eae7a0e75fd5c06d3aeb90c827b9341`
MD5	`609e95af9994fb2e4eab1b49e5ac91cf`
BLAKE2b-256	`8fc2ac0ddc33a059e500b7e4c88ce64df20bd93ef0a177ce8d037db369504e12`

File details

Details for the file fish_speech_rs-0.3.0-cp39-cp39-manylinux_2_34_x86_64.whl.

File metadata

Download URL: fish_speech_rs-0.3.0-cp39-cp39-manylinux_2_34_x86_64.whl
Upload date: Feb 24, 2025
Size: 5.1 MB
Tags: CPython 3.9, manylinux: glibc 2.34+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.8.2

File hashes

Hashes for fish_speech_rs-0.3.0-cp39-cp39-manylinux_2_34_x86_64.whl
Algorithm	Hash digest
SHA256	`812c155cfd59eb3be09c6c7bf17ddab73b07e9e02725b046dfdee07797052ee9`
MD5	`50b70538a0233a2c9785c1b050118b8e`
BLAKE2b-256	`75c528c748bff15f7c50d56aad49b9ded9bd01c2095f6bfd6146075f015b6b5f`

File details

Details for the file fish_speech_rs-0.3.0-cp39-cp39-macosx_11_0_arm64.whl.

File metadata

Download URL: fish_speech_rs-0.3.0-cp39-cp39-macosx_11_0_arm64.whl
Upload date: Feb 24, 2025
Size: 3.2 MB
Tags: CPython 3.9, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.8.2

File hashes

Hashes for fish_speech_rs-0.3.0-cp39-cp39-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`88736af77f252d16a6eba3232a7f359a4071e46e037b1588e8af4c566f3ff823`
MD5	`30127f0590411fb63c22eb4bcf8ebe27`
BLAKE2b-256	`27b27bd9717cc1005ff2de3ad5536d4a3c28adcb5fef7e33fa6d00f2f927bb39`
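
The SHA256 digests listed for each wheel can be compared against a downloaded file. The following is a small sketch using only the standard library; the helper names `sha256_of_file` and `matches_published_hash` are made up for illustration.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Possible usage: pass the path of the downloaded wheel and the SHA256 value
# listed for that exact file name.
#   matches_published_hash(wheel_path, published_sha256)
```

pip can also enforce hashes at install time through its hash-checking mode with a requirements file.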
