LiveKit Agents plugin for ByteDance and Volcengine AI services.
ByteDance plugin for LiveKit Agents
Community-maintained LiveKit Agents plugin for ByteDance and Volcengine AI services.
This package is unofficial and is not currently maintained by ByteDance, Volcengine, or LiveKit.
Current Scope
livekit-plugins-bytedance is intentionally narrow today. The package name is
reserved for the broader ByteDance/Volcengine ecosystem, but version 0.1.x
only implements:
| Service | API | LiveKit class | Status |
|---|---|---|---|
| Volcengine TTS V3 bidirectional streaming | wss://openspeech.bytedance.com/api/v3/tts/bidirection | livekit.plugins.bytedance.TTS | Supported |
| Volcengine BigModel streaming ASR, optimized bidirectional mode | wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async | livekit.plugins.bytedance.STT | Supported |
The implemented clients follow these Volcengine WebSocket APIs:
- Volcengine TTS V3 bidirectional API: https://www.volcengine.com/docs/6561/1329505
- Volcengine BigModel ASR WebSocket API, optimized bidirectional streaming endpoint: https://www.volcengine.com/docs/6561/1354869
The supported WebSocket request paths are:
wss://openspeech.bytedance.com/api/v3/tts/bidirection
wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async
The TTS binary protocol constants are also checked against ByteDance's
reference helper package named TTS Websocket Bidirection protocols, including
the downstream event codes for UsageResponse (154), AudioMuted (250),
TTSResponse (352), TTSEnded (359), and TTSSubtitle (364).
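As a quick reference, the downstream event codes listed above can be written as an enum. This is only a sketch of the five codes quoted in this README, not the full protocol; the class name `DownstreamEvent` and the helper `describe_event` are hypothetical and are not part of the plugin's public API.

```python
from enum import IntEnum


class DownstreamEvent(IntEnum):
    """Downstream TTS event codes quoted in this README (not the full protocol)."""

    USAGE_RESPONSE = 154
    AUDIO_MUTED = 250
    TTS_RESPONSE = 352
    TTS_ENDED = 359
    TTS_SUBTITLE = 364


def describe_event(code: int) -> str:
    """Return the enum member name, or a placeholder for codes not listed here."""
    try:
        return DownstreamEvent(code).name
    except ValueError:
        # Codes outside the five listed above are reported, not rejected.
        return f"UNKNOWN_{code}"
```

For example, `describe_event(352)` resolves to the `TTS_RESPONSE` member name, while a code outside the list falls through to the `UNKNOWN_` placeholder.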
Explicitly Not Supported Yet
This package does not currently implement:
- Volcengine legacy TTS v1 (/api/v1/tts/ws_binary)
- Volcengine ASR batch/offline APIs
- Volcengine ASR bigmodel_nostream streaming-input mode
- Volcengine ASR legacy non-optimized bidirectional path (/api/v3/sauc/bigmodel)
- Doubao/Ark LLM APIs
- Volcengine realtime dialogue APIs
- ByteDance video, image, embedding, or moderation APIs
- Non-streaming LiveKit TTS.synthesize()
For those services, use a provider-specific package if one exists. The existing
third-party livekit-plugins-volcengine package is separate from this package
and uses the livekit.plugins.volcengine import namespace.
Supported TTS Features
- Streaming synthesis through TTS.stream()
- Volcengine TTS V3 connection/session/task binary protocol
- X-Api-Key authentication for the current Volcengine console
- Legacy console authentication through X-Api-App-Key and X-Api-Access-Key
- resource_id values documented for this API:
  - seed-tts-2.0
  - seed-icl-2.0
- Optional cloned-voice model selection:
  - seed-tts-2.0-standard
  - seed-tts-2.0-expressive
- speaker
- ssml
- audio_format: pcm, mp3, ogg_opus, or wav
- sample_rate
- bit_rate
- speech_rate
- loudness_rate
- enable_subtitle request flag
- disable_markdown_filter
- disable_emoji_filter
- enable_latex_tn
- latex_parser
- explicit_language
- explicit_dialect
- aigc_watermark
- aigc_metadata
- cache_config
- post_process
- TTS 2.0 context_texts
- use_tag_parser
- X-Control-Require-Usage-Tokens-Return
- Server-side sentence splitting
- LiveKit retry behavior for transient websocket failures before audio is emitted
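The last item above, retrying only before audio is emitted, can be illustrated with a small sketch. This is not the plugin's implementation: `TransientError`, `synthesize_with_retry`, and the `open_stream` callable are hypothetical names, and the sketch collects chunks in a list where a real LiveKit stream would forward them to the caller.

```python
import asyncio
from typing import AsyncIterator, Callable, List


class TransientError(Exception):
    """Hypothetical stand-in for a retryable websocket failure."""


async def synthesize_with_retry(
    open_stream: Callable[[], AsyncIterator[bytes]],
    max_attempts: int = 3,
) -> List[bytes]:
    """Collect audio chunks, retrying only while no audio has been emitted."""
    chunks: List[bytes] = []
    for attempt in range(1, max_attempts + 1):
        try:
            async for chunk in open_stream():
                chunks.append(chunk)
            return chunks
        except TransientError:
            # Replaying after audio was emitted would duplicate speech,
            # so give up once any chunk exists or attempts are exhausted.
            if chunks or attempt == max_attempts:
                raise
    return chunks
```

The design point is the `if chunks` guard: a failure during the handshake is safe to retry, while a failure after the first audio chunk is surfaced to the caller instead.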
Subtitle and usage payloads can be requested from Volcengine, but this LiveKit TTS plugin currently exposes only synthesized audio frames through the LiveKit TTS stream. Non-audio protocol events such as usage responses, muted-audio signals, sentence boundaries, subtitles, and TTS-ended markers are parsed and ignored for now rather than surfaced as LiveKit TTS events.
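The "parsed and ignored" behavior described above might look roughly like the following filter. It is a sketch under two assumptions: that parsed events arrive as `(event_code, payload)` pairs, and that TTSResponse (352) is the event carrying audio bytes. The function name `audio_only` is hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# Event code quoted in this README; assumed here to carry the audio payload.
TTS_RESPONSE = 352


def audio_only(events: Iterable[Tuple[int, bytes]]) -> Iterator[bytes]:
    """Yield audio payloads and silently drop every other parsed event."""
    for code, payload in events:
        if code == TTS_RESPONSE and payload:
            yield payload
        # Usage responses, muted-audio signals, subtitles, and end markers
        # fall through here and are not surfaced to the caller.
```

Applications that need subtitles or usage data would have to handle those events separately, since a filter of this kind discards them.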
Supported STT Features
- Streaming recognition through STT.stream()
- Volcengine BigModel ASR WebSocket binary protocol v3
- Optimized bidirectional streaming endpoint: wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async
- X-Api-Key authentication for the current Volcengine console
- Legacy console authentication through X-Api-App-Key and X-Api-Access-Key
- X-Api-Resource-Id, X-Api-Request-Id, X-Api-Sequence, and X-Api-Connect-Id headers
- Default ASR 2.0 resource ID: volc.seedasr.sauc.duration
- PCM input at 16 kHz, 16-bit, mono
- Server-side VAD/final segmentation through enable_nonstream=True
- Interim and final LiveKit transcript events
- Word timestamps when Volcengine returns utterances[*].words
- Selected BigModel request options, including enable_itn, enable_punc, enable_ddc, show_utterances, enable_speaker_info, ssd_version, result_type, VAD timing options, sensitive-word filtering, and corpus
- Escape hatches through audio_options and request_options for provider fields that are not first-class constructor arguments yet
The plugin does not send the ASR audio.language field by default because the
provider document scopes that field to the bigmodel_nostream endpoint, which
this package does not currently support.
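The PCM input format listed above (16 kHz, 16-bit, mono) fixes the byte rate at 32,000 bytes per second, which is useful when sizing audio chunks. The sketch below shows that arithmetic; the 100 ms default chunk duration and the helper names are assumptions for illustration, not values taken from the plugin.

```python
from typing import List

SAMPLE_RATE_HZ = 16_000  # 16 kHz
BYTES_PER_SAMPLE = 2     # 16-bit samples
CHANNELS = 1             # mono


def chunk_size_bytes(duration_ms: int) -> int:
    """Number of PCM bytes covering duration_ms of audio in this format."""
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * duration_ms // 1000


def split_pcm(pcm: bytes, duration_ms: int = 100) -> List[bytes]:
    """Split raw PCM into fixed-duration chunks; the last chunk may be shorter."""
    size = chunk_size_bytes(duration_ms)
    return [pcm[i : i + size] for i in range(0, len(pcm), size)]
```

With these constants, a 100 ms chunk is 3,200 bytes and one second of audio splits into ten such chunks.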
The plugin sends credentials with Volcengine's V3 websocket headers:
- X-Api-Key
- X-Api-Resource-Id
- X-Api-Connect-Id
For legacy console credentials, it sends:
- X-Api-App-Key
- X-Api-Access-Key
- X-Api-Resource-Id
- X-Api-Connect-Id
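The two header sets above can be sketched as one helper that picks between current-console and legacy credentials. The header names come from this README; the function `build_auth_headers` is hypothetical, and generating the connect ID as a random UUID is an assumption rather than documented plugin behavior.

```python
import uuid
from typing import Dict, Optional


def build_auth_headers(
    resource_id: str,
    *,
    api_key: Optional[str] = None,
    app_key: Optional[str] = None,
    access_key: Optional[str] = None,
    connect_id: Optional[str] = None,
) -> Dict[str, str]:
    """Build V3 websocket handshake headers for either credential style."""
    headers = {
        "X-Api-Resource-Id": resource_id,
        # Assumption: a random UUID is an acceptable connection identifier.
        "X-Api-Connect-Id": connect_id or str(uuid.uuid4()),
    }
    if api_key:
        headers["X-Api-Key"] = api_key
    elif app_key and access_key:
        headers["X-Api-App-Key"] = app_key
        headers["X-Api-Access-Key"] = access_key
    else:
        raise ValueError("pass api_key, or both app_key and access_key")
    return headers
```

Requiring both legacy values together mirrors the rule stated under Credentials: pass both app_key and access_key instead of api_key.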
Installation
pip install livekit-plugins-bytedance
Credentials
Create or locate your Volcengine TTS V3 credentials in the Volcengine console, then pass them to the plugin explicitly:
from livekit.plugins import bytedance
tts = bytedance.TTS(
    api_key="your-api-key",
    resource_id="seed-tts-2.0",
)
If your application prefers environment variables, load them in your own config
layer and pass them to TTS. The plugin does not read environment variables by
itself.
Suggested variable names:
export VOLCENGINE_TTS_V3_API_KEY=...
export VOLCENGINE_TTS_V3_RESOURCE_ID=seed-tts-2.0
export VOLCENGINE_ASR_API_KEY=...
export VOLCENGINE_ASR_RESOURCE_ID=volc.seedasr.sauc.duration
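Because the plugin does not read environment variables itself, a small config layer has to do it. The sketch below shows one possible shape using the suggested TTS variable names; the function `tts_kwargs_from_env` is hypothetical, and only the `api_key` and `resource_id` keyword arguments shown elsewhere in this README are assumed.

```python
import os
from typing import Dict, Mapping, Optional


def tts_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Translate the suggested environment variables into TTS keyword arguments."""
    env = os.environ if env is None else env
    api_key = env.get("VOLCENGINE_TTS_V3_API_KEY")
    if not api_key:
        raise RuntimeError("VOLCENGINE_TTS_V3_API_KEY is not set")
    kwargs = {"api_key": api_key}
    resource_id = env.get("VOLCENGINE_TTS_V3_RESOURCE_ID")
    if resource_id:
        # Only override the plugin's default resource when one is configured.
        kwargs["resource_id"] = resource_id
    return kwargs
```

The result can then be passed on as `bytedance.TTS(**tts_kwargs_from_env())`; the same pattern would apply to the ASR variables and `bytedance.STT`.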
The API also supports old-console authentication. If you still use those
credentials, pass both app_key and access_key instead of api_key.
Usage
Use the default TTS V3 model and speaker:
from livekit.plugins import bytedance
tts = bytedance.TTS(
    api_key="your-api-key",
)
Use a specific Seed TTS resource and speaker:
tts = bytedance.TTS(
    api_key="your-api-key",
    resource_id="seed-tts-2.0",
    speaker="zh_female_vv_uranus_bigtts",
)
Use TTS 2.0 style controls:
tts = bytedance.TTS(
    api_key="your-api-key",
    resource_id="seed-tts-2.0",
    speaker="zh_female_vv_uranus_bigtts",
    # Style prompt, in English: "Speak naturally, professionally, and kindly, like an interviewer."
    context_texts=["自然、专业、和善,像面试官一样说话"],
    speech_rate=0,
    loudness_rate=0,
)
Use the descriptive class name if you prefer:
from livekit.plugins.bytedance import VolcengineV3TTS
tts = VolcengineV3TTS(
    api_key="your-api-key",
)
Use streaming ASR:
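from livekit.plugins import bytedance
stt = bytedance.STT(
    api_key="your-api-key",
    resource_id="volc.seedasr.sauc.duration",
)
# Run the rest inside an async function; audio_frame must hold 16 kHz, 16-bit, mono PCM.
stream = stt.stream()
stream.push_frame(audio_frame)
stream.end_input()
async for event in stream:
    # Interim results also arrive here; keep only finalized segments.
    if event.type == "final_transcript":
        print(event.alternatives[0].text)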
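The event loop above can be wrapped in a reusable coroutine that gathers the final transcripts. This is a sketch: `collect_finals` is a hypothetical helper, and `FakeAlternative` and `FakeEvent` are stand-ins that mimic only the `type` and `alternatives[0].text` fields used in the example above, so the helper can be exercised without Volcengine credentials.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass
class FakeAlternative:
    """Stand-in for one transcript alternative (text only)."""
    text: str


@dataclass
class FakeEvent:
    """Stand-in for a speech event with the two fields used here."""
    type: str
    alternatives: List[FakeAlternative]


async def collect_finals(events: AsyncIterator[FakeEvent]) -> List[str]:
    """Return the top alternative of every final transcript, in arrival order."""
    finals: List[str] = []
    async for event in events:
        # Interim transcripts are skipped; only finalized segments are kept.
        if event.type == "final_transcript" and event.alternatives:
            finals.append(event.alternatives[0].text)
    return finals
```

In an application, the same coroutine could be awaited on the object returned by `stt.stream()` after `end_input()` has been called.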
Testing
Run the default suite:
uv run pytest livekit-plugins/livekit-plugins-bytedance
The default tests are hermetic and do not require Volcengine credentials. They cover the TTS V3 binary protocol, ASR WebSocket v3 binary protocol, websocket handshake headers, retry behavior, zombie websocket handling, server error classification, partial audio drain behavior, and ASR transcript event mapping.
Real end-to-end tests should use a separate marker and require:
export VOLCENGINE_TTS_V3_API_KEY=...
export VOLCENGINE_TTS_V3_RESOURCE_ID=seed-tts-2.0
export VOLCENGINE_ASR_API_KEY=...
export VOLCENGINE_ASR_RESOURCE_ID=volc.seedasr.sauc.duration
License
Apache-2.0.