Fish Audio Python SDK
The official Python library for the Fish Audio API
Documentation: Python SDK Guide | API Reference
[!IMPORTANT]
Changes to PyPI Versioning
For existing users of the Fish Audio Python SDK: the starting version is now 1.0.0. The last version before this was 2025.6.3, so you may need to adjust your version constraints accordingly.
The original API in the fish_audio_sdk package has NOT been removed, but you will not receive any updates if you continue using the old versioning scheme.
The simplest fix is to update your dependency to fish-audio-sdk>=1.0.0 to continue receiving updates, or to pin a specific version such as fish-audio-sdk==1.0.0 when installing via your package manager. There are no changes to the API itself in this transition.
If you're using the legacy fish_audio_sdk and would like to switch to the newer, more robust fishaudio package, see the migration guide to upgrade.
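To see why an existing constraint may need adjusting, here is a minimal sketch that compares the two version numbers as integer tuples. This is a simplified stand-in for real version ordering (it ignores pre-releases and other PEP 440 details), and the helper `parse_release` is hypothetical, not part of the SDK.

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Turn a dotted release string like '2025.6.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


old_last = parse_release("2025.6.3")   # last release on the old scheme
new_first = parse_release("1.0.0")     # first release on the new scheme

# The calendar-style version sorts above 1.0.0 ...
new_sorts_lower = new_first < old_last
# ... so a lower bound written against the old scheme excludes 1.0.0.
satisfies_old_floor = new_first >= old_last

print(new_sorts_lower, satisfies_old_floor)
```

A constraint such as `fish-audio-sdk>=2025.6.3` would therefore never select a 1.x release, which is why the note above suggests switching to `fish-audio-sdk>=1.0.0`.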
Installation
pip install fish-audio-sdk
# With audio playback utilities
pip install fish-audio-sdk[utils]
Authentication
Get your API key from fish.audio/app/api-keys:
export FISH_API_KEY=your_api_key_here
Or provide directly:
from fishaudio import FishAudio
client = FishAudio(api_key="your_api_key")
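If you prefer to fail early with a clear message when no key is configured, a small helper along these lines may be useful. It is a sketch only: `resolve_api_key` is a hypothetical function, not part of the SDK, and it assumes the key lives in the `FISH_API_KEY` environment variable shown above.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(explicit: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> str:
    """Return the explicit key if given, else FISH_API_KEY from the environment."""
    key = (explicit or env.get("FISH_API_KEY", "")).strip()
    if not key:
        # Raise before any network call is attempted.
        raise RuntimeError("No API key: set FISH_API_KEY or pass one explicitly")
    return key
```

The result can then be passed on, for example as `FishAudio(api_key=resolve_api_key())`.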
Quick Start
Synchronous:
from fishaudio import FishAudio
from fishaudio.utils import play, save
client = FishAudio()
# Generate audio
audio = client.tts.convert(text="Hello, world!")
# Play or save
play(audio)
save(audio, "output.mp3")
Asynchronous:
import asyncio
from fishaudio import AsyncFishAudio
from fishaudio.utils import play, save
async def main():
    client = AsyncFishAudio()
    audio = await client.tts.convert(text="Hello, world!")
    play(audio)
    save(audio, "output.mp3")
asyncio.run(main())
Core Features
Text-to-Speech
With custom voice:
# Use a specific voice by ID
audio = client.tts.convert(
    text="Custom voice",
    reference_id="802e3bc2b27e49c2995d23ef70e6ac89"
)
With speed control:
audio = client.tts.convert(
    text="Speaking faster!",
    speed=1.5  # 1.5x speed
)
Reusable configuration:
from fishaudio.types import TTSConfig, Prosody
config = TTSConfig(
    prosody=Prosody(speed=1.2, volume=-5),
    reference_id="933563129e564b19a115bedd57b7406a",
    format="wav",
    latency="balanced"
)
# Reuse across generations
audio1 = client.tts.convert(text="First message", config=config)
audio2 = client.tts.convert(text="Second message", config=config)
Chunk-by-chunk processing:
# Stream and process chunks as they arrive
for chunk in client.tts.stream(text="Long content..."):
    send_to_websocket(chunk)
# Or collect all chunks
audio = client.tts.stream(text="Hello!").collect()
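Another common use of chunked output is writing audio to disk as it arrives instead of holding it all in memory. The sketch below assumes each chunk is a `bytes` object, which the source does not state explicitly. `write_chunks` and `fake_stream` are hypothetical names, and the fake generator stands in for `client.tts.stream(...)` so the example runs without the SDK.

```python
import io
from typing import BinaryIO, Iterable


def write_chunks(chunks: Iterable[bytes], sink: BinaryIO) -> int:
    """Write each chunk to the sink as it arrives; return total bytes written."""
    total = 0
    for chunk in chunks:
        sink.write(chunk)
        total += len(chunk)
    return total


def fake_stream():
    # Stand-in for a real audio stream: two small byte chunks.
    yield b"abc"
    yield b"de"


buffer = io.BytesIO()
written = write_chunks(fake_stream(), buffer)
print(written)
```

With a real stream, `sink` would typically be a file opened with `open("output.mp3", "wb")`.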
Speech-to-Text
# Transcribe audio
with open("audio.wav", "rb") as f:
    result = client.asr.transcribe(audio=f.read(), language="en")
print(result.text)
# Access timestamped segments
for segment in result.segments:
    print(f"[{segment.start:.2f}s - {segment.end:.2f}s] {segment.text}")
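Timestamped segments map naturally onto subtitle formats. The following sketch converts segments into SRT text. It assumes `start` and `end` are given in seconds, as the formatting above suggests. The `Segment` dataclass is a local stand-in that mirrors only the three attributes used above, and `srt_timestamp` and `to_srt` are hypothetical helpers, not SDK functions.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """Stand-in with the same attributes as the segments shown above."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp style used by SRT."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Build numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        times = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{times}\n{seg.text.strip()}")
    return "\n\n".join(blocks) + "\n"
```

Because `to_srt` only reads `start`, `end`, and `text`, it should also accept `result.segments` directly if those attributes behave as shown.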
Real-time Streaming
Stream dynamically generated text for conversational AI and live applications:
Synchronous:
def text_chunks():
    yield "Hello, "
    yield "this is "
    yield "streaming!"
audio_stream = client.tts.stream_websocket(text_chunks(), latency="balanced")
play(audio_stream)
Asynchronous:
async def text_chunks():
    yield "Hello, "
    yield "this is "
    yield "streaming!"
audio_stream = await client.tts.stream_websocket(text_chunks(), latency="balanced")
play(audio_stream)
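In practice the text often arrives as one larger string rather than hand-written pieces. As a rough sketch, a generator like the one below splits text at sentence boundaries so it can be fed in incrementally. `sentence_chunks` is a hypothetical helper, and passing it to `stream_websocket()` assumes that method accepts any iterator of strings, as the generator examples above suggest.

```python
import re
from typing import Iterator


def sentence_chunks(text: str) -> Iterator[str]:
    """Yield one sentence at a time, keeping its trailing whitespace.

    Runs of punctuation with no preceding text (e.g. a leading "...")
    are skipped by this simple pattern.
    """
    for match in re.finditer(r"[^.!?]+[.!?]*\s*", text):
        yield match.group(0)


chunks = list(sentence_chunks("Hello there. How are you? Fine"))
print(chunks)
```

Splitting on sentence boundaries is only one possible policy. Smaller chunks may reduce time to first audio, at the cost of giving the model less context per piece.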
Voice Cloning
Instant cloning:
from fishaudio.types import ReferenceAudio
# Clone voice on-the-fly
with open("reference.wav", "rb") as f:
    audio = client.tts.convert(
        text="Cloned voice speaking",
        references=[ReferenceAudio(
            audio=f.read(),
            text="Text spoken in reference"
        )]
    )
Persistent voice models:
# Create voice model for reuse
with open("voice_sample.wav", "rb") as f:
    voice = client.voices.create(
        title="My Voice",
        voices=[f.read()],
        description="Custom voice clone"
    )
# Use the created model
audio = client.tts.convert(
    text="Using my saved voice",
    reference_id=voice.id
)
Resource Clients
| Resource | Description | Key Methods |
|---|---|---|
| client.tts | Text-to-speech | convert(), stream(), stream_websocket() |
| client.asr | Speech recognition | transcribe() |
| client.voices | Voice management | list(), get(), create(), update(), delete() |
| client.account | Account info | get_credits(), get_package() |
Error Handling
from fishaudio.exceptions import (
    AuthenticationError,
    RateLimitError,
    ValidationError,
    FishAudioError
)
try:
    audio = client.tts.convert(text="Hello!")
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded")
except ValidationError as e:
    print(f"Invalid request: {e}")
except FishAudioError as e:
    print(f"API error: {e}")
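Rate-limit errors are often worth retrying rather than just reporting. Below is a minimal sketch of retrying with exponential backoff. Whether the SDK already retries internally is not stated here, so treat this as an optional pattern. `with_retries` is a hypothetical helper, and the local `RateLimitError` class is a stand-in for `fishaudio.exceptions.RateLimitError` so the example runs without the SDK installed.

```python
import time


class RateLimitError(Exception):
    """Stand-in for fishaudio.exceptions.RateLimitError (assumption)."""


def with_retries(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); on RateLimitError wait base_delay * 2**attempt and retry."""
    for attempt in range(attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)


# Demo: a callable that is rate limited twice, then succeeds.
state = {"calls": 0}


def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RateLimitError()
    return "ok"


delays = []
result = with_retries(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

With the real SDK the call would look something like `with_retries(lambda: client.tts.convert(text="Hello!"))`, importing the exception from `fishaudio.exceptions` instead of defining it locally.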
Resources
- Documentation: SDK Guide | API Reference
- Package: PyPI | GitHub
- Legacy SDK: Documentation | Migration Guide
License
This project is licensed under the Apache-2.0 License - see the LICENSE file for details.
Download files
File details
Details for the file fish_audio_sdk-1.2.0.tar.gz.
File metadata
- Download URL: fish_audio_sdk-1.2.0.tar.gz
- Upload date:
- Size: 736.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 27cdae2ff62f3ef989df8fe76a51606c0e9d3ad35791243e65e7ed18ca82c165 |
| MD5 | 696f53c2be664873846cb2e49ed55853 |
| BLAKE2b-256 | 202cc36eb069e5a12bc9d606fc91ca8b8a96ed54d41477ece2e9de7a2ea820c2 |
Provenance
The following attestation bundles were made for fish_audio_sdk-1.2.0.tar.gz:
Publisher: python.yml on fishaudio/fish-audio-python
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fish_audio_sdk-1.2.0.tar.gz
- Subject digest: 27cdae2ff62f3ef989df8fe76a51606c0e9d3ad35791243e65e7ed18ca82c165
- Sigstore transparency entry: 811966396
- Permalink: fishaudio/fish-audio-python@9f70c82635e815a19d6d668582d553f3be0e9ca8
- Branch / Tag: refs/tags/v1.2.0
- Owner: https://github.com/fishaudio
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python.yml@9f70c82635e815a19d6d668582d553f3be0e9ca8
- Trigger Event: push
File details
Details for the file fish_audio_sdk-1.2.0-py3-none-any.whl.
File metadata
- Download URL: fish_audio_sdk-1.2.0-py3-none-any.whl
- Upload date:
- Size: 41.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7b85c5a4b9285e79674d1c64a5b1cca2c0818402395ad62cb46201c20e2048c2 |
| MD5 | 37d8406e29e556e5511dc00455a89f61 |
| BLAKE2b-256 | 022145deb1ef8214803e3921393aea2a51bd4c78f5e905c8e0b6f749d2b15302 |
Provenance
The following attestation bundles were made for fish_audio_sdk-1.2.0-py3-none-any.whl:
Publisher: python.yml on fishaudio/fish-audio-python
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fish_audio_sdk-1.2.0-py3-none-any.whl
- Subject digest: 7b85c5a4b9285e79674d1c64a5b1cca2c0818402395ad62cb46201c20e2048c2
- Sigstore transparency entry: 811966411
- Permalink: fishaudio/fish-audio-python@9f70c82635e815a19d6d668582d553f3be0e9ca8
- Branch / Tag: refs/tags/v1.2.0
- Owner: https://github.com/fishaudio
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python.yml@9f70c82635e815a19d6d668582d553f3be0e9ca8
- Trigger Event: push