High-performance speech synthesis

fish_speech_python

Python bindings for the Fish Speech Candle implementation, using PyO3.

Installation

Supports Python 3.9+.

Supported Platforms

  • Linux:
    • x86_64, glibc 2.34+
    • CUDA 12+ with compute capability >= 8.0 (RTX 30 series+, A100 series+)
  • macOS (M1+, 14.0+ (Sonoma))

Windows and AMD hardware will never be supported, so don't ask. Feel free to raise an issue if you need ARM or Alpine Linux.

# From PyPI
pip install fish_speech_rs

Usage

Codec

This is the low-level API. You feed it PCM audio, it compresses it into codes, and then decompresses it back into PCM.

from fish_speech import FireflyCodec
from huggingface_hub import snapshot_download
import numpy as np

# Download weights from Hugging Face (returns the local snapshot directory)
dir = snapshot_download("jkeisling/fish-speech-1.5")

# Load the codec model (set device to "cuda" for speed)
codec = FireflyCodec(
    dir,
    version="1.5",  # Supports 1.4 and 1.5
    device="cuda"    # Or "cpu" if you hate yourself, "metal" on Apple Silicon
)

# Encode raw PCM into compressed codes
pcm = np.random.randn(1, 1, 44_100).astype(np.float32)  # (batch, channels, samples)
codes = codec.encode(pcm)

# Decode the compressed codes back into PCM
decoded_pcm = codec.decode(codes)
  • Input: raw PCM audio as float32, shaped (batch, channels, samples). Resample to 44.1 kHz yourself before encoding.
  • Output: NumPy uint32 “codes” (compressed speech)
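
Since the codec expects 44.1 kHz float32 input shaped (batch, channels, samples), audio from other sources has to be converted first. Below is a minimal sketch, assuming mono int16 samples recorded at some other sample rate. `to_codec_input` is a hypothetical helper, not part of fish_speech, and the linear interpolation is a crude stand-in for a proper resampler.

```python
import numpy as np

TARGET_SR = 44_100  # sample rate the codec expects


def to_codec_input(samples: np.ndarray, sample_rate: int) -> np.ndarray:
    """Convert mono int16 PCM to float32 at 44.1 kHz, shaped (1, 1, samples)."""
    # Scale int16 to roughly [-1.0, 1.0)
    audio = samples.astype(np.float32) / 32768.0
    if sample_rate != TARGET_SR:
        n_out = int(round(len(audio) * TARGET_SR / sample_rate))
        # Positions of the output samples, measured in input-sample units
        positions = np.arange(n_out) * (sample_rate / TARGET_SR)
        audio = np.interp(positions, np.arange(len(audio)), audio)
    # Add the batch and channel axes
    return audio.astype(np.float32)[np.newaxis, np.newaxis, :]


# One second of a 440 Hz tone at 16 kHz as stand-in input
t = np.arange(16_000) / 16_000
tone = (np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)
pcm = to_codec_input(tone, 16_000)
print(pcm.shape, pcm.dtype)
```

Linear interpolation applies no anti-aliasing filter, so for real recordings (especially when downsampling) a dedicated resampler is likely the better choice.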

LM

The language model (LM) takes text and turns it into speech codes, which you then decode back to audio.

from fish_speech import LM, preprocess_text
from typing import List

# Load the TTS model
lm = LM(
    dir,
    version="1.5",
    device="cuda"
)

# Extract the speaker prompt from reference audio
speaker_prompt = lm.get_speaker_prompt([{
    'text': 'foobar',
    'audio': codes  # From previous encoding step
}], sysprompt="Speak out the provided text.")

# Preprocess text (splits into chunks)
chunks: List[str] = preprocess_text("Hello world. This is fast as hell.")

# Generate speech codes (you can stream this too)
generated_codes = lm.generate(chunks, speaker_prompt=speaker_prompt)

# Decode to PCM audio using codec from earlier
pcm = codec.decode(generated_codes)
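
To listen to the result, the decoded PCM can be written to a WAV file. This is a sketch using only NumPy and the standard library. It assumes `codec.decode` returns float32 samples in [-1, 1], shaped (batch, channels, samples) at 44.1 kHz, mirroring the encoder's input; the text above does not state the output format, so check it before relying on this. `save_wav` is a hypothetical helper, not part of fish_speech.

```python
import wave

import numpy as np


def save_wav(path: str, pcm: np.ndarray, sample_rate: int = 44_100) -> None:
    """Write the first batch item of a (batch, channels, samples) array as 16-bit WAV."""
    audio = np.asarray(pcm, dtype=np.float32)[0]  # (channels, samples)
    # Clip to the valid range, then scale to little-endian int16
    ints = (np.clip(audio, -1.0, 1.0) * 32767.0).astype("<i2")
    with wave.open(path, "wb") as f:
        f.setnchannels(ints.shape[0])
        f.setsampwidth(2)  # bytes per sample
        f.setframerate(sample_rate)
        # WAV interleaves channels, so transpose to (samples, channels)
        f.writeframes(np.ascontiguousarray(ints.T).tobytes())


# Stand-in for decoded audio: half a second of silence
save_wav("output.wav", np.zeros((1, 1, 22_050), dtype=np.float32))
```

With the real pipeline, the call would be `save_wav("output.wav", pcm)` on the array returned by the decode step.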

Local installation

Requires Python and Rust toolchains.

  1. python -m venv .venv && source .venv/bin/activate
  2. pip install -r requirements.txt
  3. maturin develop
