Python client and CLI for Volcengine/ByteDance Doubao seed-tts-2.0 bidirectional streaming TTS.

doubao-tts

A small, production-minded Python client and CLI for Volcengine Doubao seed-tts-2.0 bidirectional streaming TTS — native-quality Chinese voices with emotion control, ready for agents, scripts, and serving pipelines.

Why

doubao-tts is the first PyPI package targeting Volcengine's seed-tts-2.0 bidirectional-streaming endpoint. Existing Python TTS wrappers either:

hit the older SAMI HTTP endpoint (no streaming, older voice quality), or
aren't published to PyPI at all.

This package fills that gap with:

A clean synthesize(text, out_path) interface
A CLI that drops straight into agent frameworks (Hermes, Dify, LangChain, n8n, …)
Strict mypy on every public module
95% unit test coverage, atomic output writes, proper credential redaction
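
"Atomic output writes" usually means writing the audio to a temporary file in the destination directory and renaming it into place, so an interrupted run never leaves a half-written MP3 behind. The sketch below shows that general pattern with the standard library only. `atomic_write_bytes` is a hypothetical helper for illustration, not the package's actual code.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(out_path: str, data: bytes) -> None:
    """Write data to out_path so readers never observe a partial file.

    Hypothetical helper for illustration; not part of doubao_tts.
    """
    target = Path(out_path)
    # The temp file lives in the destination directory (same filesystem),
    # so the final os.replace is a rename rather than a copy.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```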

Install

pip install doubao-tts

# or with uv:
uv add doubao-tts

# CLI-only, installed as a standalone tool:
uv tool install doubao-tts

Quick start

Python

from doubao_tts import synthesize

synthesize("你好，世界", "hello.mp3")

Async

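import asyncio

from doubao_tts import synthesize_async


async def main() -> None:
    # `await` is only valid inside a coroutine, so wrap the call.
    await synthesize_async(
        "Hello from Doubao seed-tts-2.0!",
        "hello.mp3",
        voice="en-female-assistant",
        speed=1.1,
    )


asyncio.run(main())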
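
Because synthesize_async is a coroutine, several utterances can be synthesized concurrently. The sketch below bounds concurrency with a semaphore. `synthesize_many` and its `limit` default are assumptions made for illustration, and the synthesis function is passed in as a parameter so the pattern relies only on the `(text, out_path)` call shape shown above. Service-side rate limits are not documented here, so a small limit is the safer choice.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, Tuple

# Any coroutine function taking (text, out_path), e.g. doubao_tts.synthesize_async.
SynthFn = Callable[[str, str], Awaitable[object]]


async def synthesize_many(
    synth: SynthFn,
    jobs: Iterable[Tuple[str, str]],
    limit: int = 4,
) -> None:
    """Run (text, out_path) jobs concurrently, at most `limit` at a time."""
    gate = asyncio.Semaphore(limit)

    async def run_one(text: str, out_path: str) -> None:
        async with gate:
            await synth(text, out_path)

    await asyncio.gather(*(run_one(text, out) for text, out in jobs))
```

A call would look like `asyncio.run(synthesize_many(synthesize_async, [("你好", "a.mp3"), ("Hello", "b.mp3")]))`.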

CLI

# simplest
doubao-tts say "你好" --out hello.mp3

# pick a voice + adjust speed
doubao-tts say "好激动！" --voice zh-female-warm --speed 1.2 --out excited.mp3

# read from a file
doubao-tts say --text-file script.txt --out narration.mp3

# browse available voices
doubao-tts list-voices --lang zh

# inspect resolved config (tokens are redacted)
doubao-tts config show

Credentials

Credentials and other settings resolve in this order; the first match wins:

1. Keyword arguments to synthesize(...)
2. Environment variables: VOLCENGINE_APP_ID, VOLCENGINE_ACCESS_TOKEN (also accepted as DOUBAO_APP_ID, DOUBAO_ACCESS_TOKEN)
3. ~/.doubao-tts/config.yaml
4. Built-in defaults (speaker, audio format, sample rate; there are no default credentials)
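
The precedence above can be pictured as a lookup through ordered layers. This is a minimal sketch, not the package's resolver: `first_match` and `env_layer` are hypothetical names, the sample values are placeholders, and preferring the VOLCENGINE_* spelling when both variants are set is an assumption this README does not confirm.

```python
from typing import Mapping, Optional, Sequence


def first_match(key: str, layers: Sequence[Mapping[str, Optional[str]]]) -> Optional[str]:
    """Return the value for key from the first layer that defines it."""
    for layer in layers:
        value = layer.get(key)
        if value:  # skip missing keys, None, and empty strings
            return value
    return None


def env_layer(environ: Mapping[str, str]) -> dict:
    """Map the documented environment variables onto config keys.

    Assumption: VOLCENGINE_* wins over DOUBAO_* when both are set.
    """
    return {
        "app_id": environ.get("VOLCENGINE_APP_ID") or environ.get("DOUBAO_APP_ID"),
        "access_token": environ.get("VOLCENGINE_ACCESS_TOKEN")
        or environ.get("DOUBAO_ACCESS_TOKEN"),
    }


# Placeholder data standing in for the four sources, highest priority first.
kwargs = {"app_id": None}
environ = {"DOUBAO_APP_ID": "from-env"}
file_config = {"app_id": "from-file", "speaker": "zh_female_vv_uranus_bigtts"}
defaults = {"audio_format": "mp3"}

layers = [kwargs, env_layer(environ), file_config, defaults]
print(first_match("app_id", layers))   # from-env
print(first_match("speaker", layers))  # zh_female_vv_uranus_bigtts
```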

Example ~/.doubao-tts/config.yaml:

app_id: "1234567890"
access_token: "volc_...."
speaker: zh_female_vv_uranus_bigtts
audio_format: mp3
sample_rate: 24000

Get your app ID and access token from the Volcengine Speech console. You need the seed-tts-2.0 product activated on your account.

Integration: Hermes Agent

Hermes Agent v0.x+ supports declarative TTS command providers via its tts.providers.<name> config block. Plug doubao-tts in:

# ~/.hermes/config.yaml
tts:
  provider: doubao
  providers:
    doubao:
      type: command
      command: 'doubao-tts say --text-file {input_path} --out {output_path}'

That's it. Any Hermes voice-out path now routes through Doubao seed-tts-2.0.
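
Frameworks without a declarative command provider can shell out to the same CLI. The sketch below assembles the argument list from the flags shown in this README (`--text-file`, `--out`, `--voice`, `--speed`). `build_say_command` and `run_say` are hypothetical helper names, and the sketch assumes the CLI exits with a non-zero status on failure.

```python
import subprocess
from typing import List, Optional


def build_say_command(
    text_file: str,
    out_path: str,
    voice: Optional[str] = None,
    speed: Optional[float] = None,
) -> List[str]:
    """Build an argv list for `doubao-tts say`."""
    cmd = ["doubao-tts", "say", "--text-file", text_file, "--out", out_path]
    if voice is not None:
        cmd += ["--voice", voice]
    if speed is not None:
        cmd += ["--speed", str(speed)]
    return cmd


def run_say(text_file: str, out_path: str, **options) -> None:
    """Run the CLI; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_say_command(text_file, out_path, **options), check=True)
```

Passing a list rather than a shell string avoids quoting problems when paths contain spaces.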

Voices

The CLI ships with a curated alias catalogue:

Alias	Language	Gender	Style
`zh-female-warm` (default)	zh-CN	female	warm, conversational
`zh-female-reporter`	zh-CN	female	crisp, news-reporter
`zh-male-warm`	zh-CN	male	warm, narrator
`zh-male-energetic`	zh-CN	male	energetic host
`en-female-assistant`	en-US	female	assistant, neutral
`en-male-assistant`	en-US	male	assistant, neutral

Volcengine publishes hundreds more speaker IDs. You can pass any raw speaker ID to voice= directly — aliases are a convenience, not a gate.
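
The "convenience, not a gate" behaviour amounts to a dictionary lookup with pass-through. This is a sketch under stated assumptions: the real alias table ships with the package and is not reproduced in this README, so the speaker IDs below are placeholders and `resolve_voice` is a hypothetical name.

```python
# Placeholder mapping; the values are NOT real speaker IDs.
ALIASES = {
    "zh-female-warm": "<speaker-id-for-zh-female-warm>",
    "en-female-assistant": "<speaker-id-for-en-female-assistant>",
}


def resolve_voice(voice: str) -> str:
    """Return the speaker ID for a known alias, else pass the value through."""
    return ALIASES.get(voice, voice)
```

With this shape, a raw ID such as `zh_female_vv_uranus_bigtts` is returned unchanged because it is not a key in the alias table.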

Emotion control

seed-tts-2.0 supports per-utterance emotion tags:

synthesize(
    "好激动，我终于做到了！",
    "out.mp3",
    emotion="excited",
    emotion_scale=4.0,  # 0-5; higher = more intense
)

Model-supported emotions vary by voice; consult the Volcengine console for the up-to-date list per speaker.
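
Since emotion_scale is documented as 0-5, callers that take the value from user input may want to validate it before synthesis. A minimal sketch follows; `emotion_kwargs` is a hypothetical helper, and treating both endpoints as inclusive is an assumption.

```python
def emotion_kwargs(emotion: str, scale: float) -> dict:
    """Build keyword arguments for synthesize(), rejecting out-of-range scales."""
    if not 0.0 <= scale <= 5.0:
        raise ValueError(f"emotion_scale must be within 0-5, got {scale}")
    return {"emotion": emotion, "emotion_scale": scale}
```

Usage would be `synthesize("好激动，我终于做到了！", "out.mp3", **emotion_kwargs("excited", 4.0))`.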

Performance notes

Each synthesize() call opens a fresh WebSocket and tears it down at the end. Producing a 24 kHz MP3 of about 2 seconds of speech takes about 750 ms end to end on a healthy connection; network time dominates.
The seed-tts-2.0 bidi-stream session currently accepts one synthesis per session (empirically verified), so reusing the underlying connection would save only TCP+TLS setup (~180 ms / call). A daemon mode with connection pooling is planned for a future release, but most users don't need it.
import doubao_tts is cheap — ~3 ms — because websockets and yaml are only imported on first synthesis call.
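
The cheap-import behaviour in the last note comes from deferring heavy imports to the function that needs them. The sketch below shows the general pattern only. `json` stands in for a heavier dependency such as websockets or yaml, and `dump_config` is a hypothetical function, not part of doubao_tts.

```python
def dump_config(config: dict) -> str:
    """Serialize a config mapping, importing the serializer on first use."""
    # Function-level import: the module is loaded when this function first
    # runs, not when the enclosing package is imported. Later calls reuse
    # the cached entry in sys.modules.
    import json

    return json.dumps(config, sort_keys=True)
```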

Error handling

All user-facing errors inherit from DoubaoTTSError:

from doubao_tts import (
    DoubaoTTSError, DoubaoConfigError,
    DoubaoAuthError, DoubaoAPIError, DoubaoTimeoutError,
    synthesize,
)

try:
    synthesize("你好", "out.mp3")
except DoubaoAuthError:
    ...  # rotate your token
except DoubaoTimeoutError:
    ...  # retry or check network
except DoubaoTTSError as exc:
    ...  # catch-all
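
For the "retry or check network" branch, a small backoff wrapper is one option. This is a generic sketch, not something the package provides: `with_retries` and its defaults are assumptions, and the exception types to retry are passed in so the helper stays independent of doubao_tts.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

A call such as `with_retries(lambda: synthesize("你好", "out.mp3"), retry_on=(DoubaoTimeoutError,))` retries timeouts only; authentication and configuration errors propagate immediately, since retrying them would not help.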

Security

Access tokens are redacted in all logs and CLI output — see SECURITY.md for the exact policy.
User text is not logged by default. To troubleshoot protocol issues, opt in with DOUBAO_TTS_TRACE_PAYLOADS=1.
~/.doubao-tts/config.yaml is user-scoped; the shipped .gitignore excludes .env files at the project level.
Vulnerability reports: hypnus.yuan@gmail.com or a private GitHub security advisory.

Development

git clone https://github.com/Hypnus-Yuan/doubao-tts.git
cd doubao-tts

uv sync --all-extras --group dev
uv run pre-commit install
uv run pytest

See CONTRIBUTING.md for the full workflow.

Roadmap

v0.2 — connection-reuse daemon (saves ~180 ms / call on chained requests), streaming callback API, richer voice metadata.
v0.3 — integration recipes for LangChain, LlamaIndex, Dify.
v1.0 — API frozen, semver guarantees.

License

MIT — see LICENSE.

Credits

Protocol framing extracted and hardened from Hermes Agent community work. Thanks to the Volcengine Speech team for the seed-tts-2.0 bidirectional-streaming API.

This version

0.1.0

Apr 30, 2026

Download files

Source Distribution

doubao_tts-0.1.0.tar.gz (29.6 kB)

Uploaded Apr 30, 2026 Source

Built Distribution

doubao_tts-0.1.0-py3-none-any.whl (23.9 kB)

Uploaded Apr 30, 2026 Python 3

File details

Details for the file doubao_tts-0.1.0.tar.gz.

File metadata

Download URL: doubao_tts-0.1.0.tar.gz
Upload date: Apr 30, 2026
Size: 29.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.18 (uv publish, macOS)

File hashes

Hashes for doubao_tts-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`ebe56efffc69036624d464cfda7da76f85c6a6c2e5c9807a5de3b07054ef94c2`
MD5	`7e006218f25be8bf7eefe134fde9888e`
BLAKE2b-256	`ff0b453f2ae7f04c29800a64b4cbdf6ef623f6d41095d86e3a5dab3498e478cc`

File details

Details for the file doubao_tts-0.1.0-py3-none-any.whl.

File metadata

Download URL: doubao_tts-0.1.0-py3-none-any.whl
Upload date: Apr 30, 2026
Size: 23.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.18 (uv publish, macOS)

File hashes

Hashes for doubao_tts-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bf4c90d0cd34a477b2dd037b67cebc236c84e91cb4f0d379748d0ddd5be3c82c`
MD5	`dfea286d1fdc8300bd9cebd6212e33a7`
BLAKE2b-256	`3604e8ba4dfb6349168dc00790333e5d193baa06daafd9987831edf7497554f3`
