The official Python library for the Fish Audio API

Fish Audio Python SDK

Documentation: Python SDK Guide | API Reference

Important: Changes to PyPI Versioning

If you already use the Fish Audio Python SDK, note that versioning now starts at 1.0.0. The last release under the old scheme was 2025.6.3, so you may need to adjust your version constraints accordingly.

The original API in the fish_audio_sdk package has NOT been removed, but you will not receive any updates if you continue using the old versioning scheme.

The simplest fix is to update your dependency to fish-audio-sdk>=1.0.0 so that you continue receiving updates, or to pin a specific version such as fish-audio-sdk==1.0.0 when installing via your package manager. The transition does not change the API itself.
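To see why an existing constraint may need adjusting, the sketch below compares release numbers segment by segment, which is how dotted numeric versions are normally ordered. The helper names `version_key` and `satisfies_lower_bound` are made up for this illustration and are not part of the SDK or of pip. It only handles plain numeric releases such as the ones mentioned above.

```python
def version_key(version: str) -> tuple[int, ...]:
    """Turn a dotted release string like '1.2.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def satisfies_lower_bound(version: str, minimum: str) -> bool:
    """Return True if `version` is at least `minimum` (numeric comparison)."""
    return version_key(version) >= version_key(minimum)


# A lower bound written against the old scheme, e.g. fish-audio-sdk>=2025.6.3
old_minimum = "2025.6.3"

for candidate in ("2025.6.3", "1.0.0", "1.2.0"):
    print(candidate, satisfies_lower_bound(candidate, old_minimum))
```

Under this ordering the 1.x releases compare lower than 2025.6.3, so a lower bound written against the old scheme would exclude them.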

If you're using the legacy fish_audio_sdk and would like to switch to the newer, more robust fishaudio package, see the migration guide to upgrade.

Installation

pip install fish-audio-sdk

# With audio playback utilities
pip install "fish-audio-sdk[utils]"

Authentication

Get your API key from fish.audio/app/api-keys and export it as an environment variable:

export FISH_API_KEY=your_api_key_here

Or pass it directly to the client:

from fishaudio import FishAudio

client = FishAudio(api_key="your_api_key")
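If you prefer to fail early with a clear message when the key is missing, a small helper along these lines may be useful. This is a sketch only: `load_api_key` is a hypothetical helper, not part of the SDK, and the only name taken from above is the `FISH_API_KEY` environment variable.

```python
import os


def load_api_key(env=None):
    """Return the key from FISH_API_KEY, or raise with a hint if it is unset."""
    env = os.environ if env is None else env
    key = env.get("FISH_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FISH_API_KEY is not set; export it or pass api_key= explicitly"
        )
    return key


# Demo with a plain dict standing in for the process environment.
print(load_api_key({"FISH_API_KEY": "your_api_key_here"}))
```

The result could then be handed to the client as `FishAudio(api_key=load_api_key())`.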

Quick Start

Synchronous:

from fishaudio import FishAudio
from fishaudio.utils import play, save

client = FishAudio()

# Generate audio
audio = client.tts.convert(text="Hello, world!")

# Play or save
play(audio)
save(audio, "output.mp3")

Asynchronous:

import asyncio
from fishaudio import AsyncFishAudio
from fishaudio.utils import play, save

async def main():
    client = AsyncFishAudio()
    audio = await client.tts.convert(text="Hello, world!")
    play(audio)
    save(audio, "output.mp3")

asyncio.run(main())
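One reason to reach for the asynchronous client is to issue several requests at once. The sketch below shows that pattern with `asyncio.gather`. It assumes only what the Quick Start shows, namely that `client.tts.convert(text=...)` can be awaited. `synthesize_all` and the fake client are hypothetical stand-ins so the example runs without network access or an API key, and whether your account's rate limits permit concurrent requests is not something this example can tell you.

```python
import asyncio


async def synthesize_all(client, texts):
    """Run one tts.convert call per text concurrently; results keep input order."""
    tasks = [client.tts.convert(text=text) for text in texts]
    return await asyncio.gather(*tasks)


# Stand-in client (NOT the real SDK) so the sketch runs offline.
class _FakeTTS:
    async def convert(self, text):
        await asyncio.sleep(0)  # simulate waiting on a network response
        return text.encode("utf-8")


class FakeAsyncClient:
    def __init__(self):
        self.tts = _FakeTTS()


clips = asyncio.run(synthesize_all(FakeAsyncClient(), ["First", "Second"]))
print([len(clip) for clip in clips])
```

With the real SDK you would pass an `AsyncFishAudio()` instance instead of `FakeAsyncClient()`.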

Core Features

Text-to-Speech

With custom voice:

# Use a specific voice by ID
audio = client.tts.convert(
    text="Custom voice",
    reference_id="802e3bc2b27e49c2995d23ef70e6ac89"
)

With speed control:

audio = client.tts.convert(
    text="Speaking faster!",
    speed=1.5  # 1.5x speed
)

Reusable configuration:

from fishaudio.types import TTSConfig, Prosody

config = TTSConfig(
    prosody=Prosody(speed=1.2, volume=-5),
    reference_id="933563129e564b19a115bedd57b7406a",
    format="wav",
    latency="balanced"
)

# Reuse across generations
audio1 = client.tts.convert(text="First message", config=config)
audio2 = client.tts.convert(text="Second message", config=config)

Chunk-by-chunk processing:

# Stream and process chunks as they arrive
# (send_to_websocket is a placeholder for your own handler)
for chunk in client.tts.stream(text="Long content..."):
    send_to_websocket(chunk)

# Or collect all chunks
audio = client.tts.stream(text="Hello!").collect()
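As an illustration of chunk-by-chunk processing, the sketch below writes chunks to a file as they arrive instead of collecting them in memory first. It assumes each chunk is a `bytes` object. `write_chunks` is a hypothetical helper, and `fake_stream` stands in for `client.tts.stream(...)` so the example runs offline.

```python
import os
import tempfile
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write each chunk to `path` as it arrives; return total bytes written."""
    total = 0
    with open(path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
            total += len(chunk)
    return total


# Stand-in generator (NOT the real SDK) that yields two small byte chunks.
def fake_stream():
    yield b"abc"
    yield b"defg"


demo_path = os.path.join(tempfile.mkdtemp(), "output.mp3")
written = write_chunks(fake_stream(), demo_path)
print(written)
```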

Speech-to-Text

# Transcribe audio
with open("audio.wav", "rb") as f:
    result = client.asr.transcribe(audio=f.read(), language="en")

print(result.text)

# Access timestamped segments
for segment in result.segments:
    print(f"[{segment.start:.2f}s - {segment.end:.2f}s] {segment.text}")
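The timestamped segments lend themselves to subtitle output. The sketch below turns objects with `start`, `end`, and `text` attributes (the three fields read in the loop above) into SubRip (SRT) text. The `Segment` dataclass is a stand-in defined here for the demo, not the SDK's own type, and `srt_timestamp` and `to_srt` are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Stand-in with the three attributes read from result.segments above."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (the SubRip timestamp layout)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Number the segments from 1 and join them into one SubRip document."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        span = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{span}\n{seg.text.strip()}")
    return "\n\n".join(blocks) + "\n"


print(to_srt([Segment(0.0, 1.5, "Hello there."), Segment(1.5, 3.25, "How are you?")]))
```

With a real transcription you would call `to_srt(result.segments)`.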

Real-time Streaming

Stream dynamically generated text for conversational AI and live applications:

Synchronous:

def text_chunks():
    yield "Hello, "
    yield "this is "
    yield "streaming!"

audio_stream = client.tts.stream_websocket(text_chunks(), latency="balanced")
play(audio_stream)

Asynchronous:

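import asyncio
from fishaudio import AsyncFishAudio
from fishaudio.utils import play

async def text_chunks():
    yield "Hello, "
    yield "this is "
    yield "streaming!"

async def main():
    # await is only valid inside a coroutine, so wrap the call in main()
    client = AsyncFishAudio()
    audio_stream = await client.tts.stream_websocket(text_chunks(), latency="balanced")
    play(audio_stream)

asyncio.run(main())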
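In a conversational application the text usually comes from another generator, such as a language model emitting small tokens. The sketch below shows one possible way to regroup such tokens into sentence-sized pieces before streaming them. `sentence_chunks` is a hypothetical helper. Whether sentence-sized chunks suit your use case better than raw tokens is an assumption to test, not something the SDK requires.

```python
from typing import Iterable, Iterator


def sentence_chunks(tokens: Iterable[str]) -> Iterator[str]:
    """Buffer tokens and yield the buffer whenever it ends a sentence."""
    buffer = ""
    for token in tokens:
        buffer += token
        # Flush once the buffered text ends with sentence punctuation.
        if buffer.rstrip().endswith((".", "!", "?")):
            yield buffer
            buffer = ""
    if buffer:
        yield buffer  # flush whatever is left when the token stream ends


tokens = ["Hel", "lo. ", "How ", "are you", "? ", "Bye"]
print(list(sentence_chunks(tokens)))
```

The resulting generator could be passed in place of `text_chunks()` in the synchronous example above.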

Voice Cloning

Instant cloning:

from fishaudio.types import ReferenceAudio

# Clone voice on-the-fly
with open("reference.wav", "rb") as f:
    audio = client.tts.convert(
        text="Cloned voice speaking",
        references=[ReferenceAudio(
            audio=f.read(),
            text="Text spoken in reference"
        )]
    )

Persistent voice models:

# Create voice model for reuse
with open("voice_sample.wav", "rb") as f:
    voice = client.voices.create(
        title="My Voice",
        voices=[f.read()],
        description="Custom voice clone"
    )

# Use the created model
audio = client.tts.convert(
    text="Using my saved voice",
    reference_id=voice.id
)

Resource Clients

| Resource | Description | Key Methods |
|---|---|---|
| client.tts | Text-to-speech | convert(), stream(), stream_websocket() |
| client.asr | Speech recognition | transcribe() |
| client.voices | Voice management | list(), get(), create(), update(), delete() |
| client.account | Account info | get_credits(), get_package() |

Error Handling

from fishaudio.exceptions import (
    AuthenticationError,
    RateLimitError,
    ValidationError,
    FishAudioError
)

try:
    audio = client.tts.convert(text="Hello!")
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded")
except ValidationError as e:
    print(f"Invalid request: {e}")
except FishAudioError as e:
    print(f"API error: {e}")
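For rate-limit errors, a common response is to wait and retry with exponential backoff rather than just print a message. The sketch below is a generic pattern and not SDK functionality. `call_with_retry` is a hypothetical helper that takes the exception class as a parameter, and `FakeRateLimitError` is a stand-in used only so the demo runs offline.

```python
import time


def call_with_retry(fn, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on `retry_on`, wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller handle it
            sleep(base_delay * 2 ** attempt)


# Stand-in exception (NOT the SDK's) and a function that fails twice.
class FakeRateLimitError(Exception):
    pass


calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeRateLimitError("slow down")
    return b"audio-bytes"


delays = []  # record requested delays instead of actually sleeping
result = call_with_retry(flaky, FakeRateLimitError, sleep=delays.append)
print(result, delays)
```

With the real client this might look like `call_with_retry(lambda: client.tts.convert(text="Hello!"), RateLimitError)`.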

License

This project is licensed under the Apache-2.0 License - see the LICENSE file for details.
