Async streaming speech-to-text helper built on Google Cloud Speech

Smooth Transcriber

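smooth-transcriber is a Python library for asynchronous streaming speech-to-text, built as an abstraction over the official Google Cloud Speech-to-Text samples. It focuses on the following goals:

Extract text from an audio stream in real time, returning recognition results while audio is still being uploaded;
Provide a consistent data model and configuration objects that are easy to integrate into applications;
Use PyAV to convert a variety of audio formats into the PCM stream required by the Google Speech API, exposed through an async iterator interface;
Emit key events through the standard logging interface, so the state of streaming recognition can be monitored.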
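
To make the logging goal concrete, here is a minimal sketch that attaches a console handler to a named logger using only the standard logging module. The logger name "smooth_transcriber" is an assumption (it is what the common `logging.getLogger(__name__)` convention would produce), and `enable_transcriber_logging` is a hypothetical helper, not part of the library.

```python
import logging


def enable_transcriber_logging(
    logger_name: str = "smooth_transcriber",  # assumed name, not confirmed by the README
    level: int = logging.INFO,
) -> logging.Logger:
    """Set the level on a named logger and attach one console handler."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    # Add a handler only once, so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Calling `enable_transcriber_logging(level=logging.DEBUG)` before starting a stream should surface more detail, provided the library really logs under that name.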

Quick start

pip install smooth-transcriber

To read audio from a WebSocket or any other source, just supply an async iterator:

import asyncio
from smooth_transcriber import StreamingTranscriber, TranscriptionConfig

async def websocket_audio_stream():
    """Receive audio data from a WebSocket or another source (any format: webm, wav, etc.)."""
    # receive_audio_chunks() is a placeholder for your own async source of bytes.
    async for chunk_bytes in receive_audio_chunks():
        yield chunk_bytes

async def main() -> None:
    # Option 1: specify the language
    config = TranscriptionConfig(language_code="zh-CN", chunk_delay=0.1)

    # Option 2: automatic language detection (omit language_code or set it to None)
    config_auto = TranscriptionConfig(language_code=None, chunk_delay=0.1)
    # Or rely on the default (language_code defaults to None, which enables auto-detection)
    config_auto = TranscriptionConfig(chunk_delay=0.1)

    transcriber = StreamingTranscriber(credentials_path="google.json")

    # Pass the audio stream in directly; format conversion is handled automatically
    async for event in transcriber.stream_from_audio(websocket_audio_stream(), config):
        if event.type == "final":
            print("[FINAL]", event.transcript)
        else:
            print("[INTERIM]", event.transcript)

asyncio.run(main())
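
The audio source only has to be an async iterator of bytes. As a sketch of one such source, the generator below reads a local file in fixed-size chunks and could stand in for a network source during local experiments. `file_audio_stream` and its `chunk_size` parameter are illustrative names, not part of the library.

```python
import asyncio
from pathlib import Path
from typing import AsyncIterator


async def file_audio_stream(path: str, chunk_size: int = 4096) -> AsyncIterator[bytes]:
    """Yield the file's bytes in fixed-size chunks, as a network source might."""
    with Path(path).open("rb") as fh:
        while True:
            # Run the blocking read in a worker thread so the event loop stays responsive.
            chunk = await asyncio.to_thread(fh.read, chunk_size)
            if not chunk:
                break
            yield chunk
```

Under these assumptions, `file_audio_stream("sample.webm")` could be passed wherever the quick start passes `websocket_audio_stream()`.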

Language configuration

Specifying a language

To use a specific language, set language_code when creating the TranscriptionConfig:

config = TranscriptionConfig(language_code="zh-CN")  # Chinese (Simplified)
config = TranscriptionConfig(language_code="en-US")  # English (United States)
config = TranscriptionConfig(language_code="ja-JP")  # Japanese

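Automatic language detection

If language_code is not specified (it defaults to None), the library detects the audio language automatically. Auto-detection supports the following common languages:

Chinese (Simplified, Traditional)
English (United States, United Kingdom)
Japanese, Korean
Spanish, French, German, Russian

# Enable automatic language detection
config = TranscriptionConfig()  # language_code defaults to None
# Or set it to None explicitly
config = TranscriptionConfig(language_code=None)

Note: automatic language detection may slightly increase recognition latency. If the audio language is known, specify it explicitly for better performance.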

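Event model

The TranscriptionEvent objects returned by the stream_from_audio and stream methods have the following attributes:

Attributes

type (Literal["interim", "final"]): the event type
- "interim": a provisional recognition result that later results may update
- "final": a confirmed recognition result that will not change
transcript (str): the transcribed text
is_final (bool): whether this is a final result (equivalent to type == "final")
confidence (Optional[float]): confidence score, usually between 0.0 and 1.0
- provided only on final events
- None on interim events
stability (Optional[float]): stability score, indicating how stable the recognition result is
- provided only on interim events
- None on final events
result_index (Optional[int]): result index, the position of this result in the result sequence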
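
The attribute list above can be pictured as a small dataclass. The sketch below is an illustrative stand-in named `EventSketch`, not the library's actual TranscriptionEvent; the field defaults and the whitespace-based `is_empty()` behaviour are assumptions.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass(frozen=True)
class EventSketch:
    """Illustrative stand-in for the documented event attributes."""

    type: Literal["interim", "final"]
    transcript: str
    confidence: Optional[float] = None   # documented as set only on final events
    stability: Optional[float] = None    # documented as set only on interim events
    result_index: Optional[int] = None

    @property
    def is_final(self) -> bool:
        # Documented as equivalent to type == "final".
        return self.type == "final"

    def is_empty(self) -> bool:
        # Assumption: an event is "empty" when its transcript has no visible text.
        return not self.transcript.strip()
```

A stand-in like this can be handy in tests, because application code that only reads these attributes does not need a live recognition session.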

Usage example

async for event in transcriber.stream_from_audio(audio_stream, config):
    # Skip empty events before printing anything
    if event.is_empty():
        continue

    if event.type == "final":
        print(f"[FINAL] {event.transcript}")
        # Compare with None so that a score of 0.0 is still printed
        if event.confidence is not None:
            print(f"Confidence: {event.confidence:.2%}")
    else:
        print(f"[INTERIM] {event.transcript}")
        if event.stability is not None:
            print(f"Stability: {event.stability:.2%}")
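
Because interim results may be replaced by later ones while final results are fixed, a caption display typically keeps the finals and overwrites the current interim. The sketch below shows that bookkeeping with duck-typed events built from `SimpleNamespace`; `render_captions` is a hypothetical helper and relies only on the `type` and `transcript` attributes described above.

```python
from types import SimpleNamespace
from typing import Iterable, List


def render_captions(events: Iterable) -> List[str]:
    """Return the caption line that would be displayed after each event."""
    committed: List[str] = []  # final transcripts, never revised
    lines: List[str] = []
    for event in events:
        if event.type == "final":
            committed.append(event.transcript)
            lines.append(" ".join(committed))
        else:
            # The interim text is shown after the committed text but not stored.
            lines.append(" ".join(committed + [event.transcript]))
    return lines


demo_events = [
    SimpleNamespace(type="interim", transcript="hello"),
    SimpleNamespace(type="interim", transcript="hello wor"),
    SimpleNamespace(type="final", transcript="hello world"),
    SimpleNamespace(type="interim", transcript="how"),
]
for line in render_captions(demo_events):
    print(line)
```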

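Design highlights

Fully asynchronous: built on asyncio and the Google Speech async client, so it integrates seamlessly into an event loop.
Modular: configuration, event model, audio decoding, and the API client are decoupled and can be extended or replaced as needed.
Multi-format support: audio formats are detected and transcoded automatically, so no manual conversion is needed.
Testability: clear interfaces and a modular design make unit tests easy to write.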
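
One way to exploit the testability point is to swap the real event stream for a stub async generator, so application code can be tested without credentials or network access. The sketch below assumes nothing about the library beyond the `type` and `transcript` attributes; `fake_event_stream` and `collect_finals` are hypothetical names.

```python
import asyncio
from types import SimpleNamespace
from typing import AsyncIterator, List


async def fake_event_stream() -> AsyncIterator[SimpleNamespace]:
    """Stub that yields canned events instead of calling the Speech API."""
    canned = [("interim", "good"), ("final", "good morning"), ("final", "everyone")]
    for kind, text in canned:
        await asyncio.sleep(0)  # hand control back to the loop, as a real stream would
        yield SimpleNamespace(type=kind, transcript=text)


async def collect_finals(events: AsyncIterator) -> List[str]:
    """Application code under test: keep only the confirmed transcripts."""
    return [event.transcript async for event in events if event.type == "final"]


def test_collect_finals() -> None:
    result = asyncio.run(collect_finals(fake_event_stream()))
    assert result == ["good morning", "everyone"]
```

The same `collect_finals` coroutine could then be given the stream returned by `stream_from_audio` in production code.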

Dependencies

Python 3.9+
google-cloud-speech
google-auth
av (PyAV): used for audio encoding and decoding

System dependencies

All formats: ffmpeg must be installed (PyAV depends on ffmpeg)
- macOS: brew install ffmpeg
- Ubuntu/Debian: sudo apt-get install ffmpeg
- Windows: download from ffmpeg.org
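
A quick way to confirm these prerequisites before starting a stream is to look for the ffmpeg executable and the av module. The check below is a sketch using only the standard library; `check_audio_dependencies` is a hypothetical helper, and finding ffmpeg on PATH does not by itself prove that PyAV can decode a given format.

```python
import importlib.util
import shutil
from typing import Dict


def check_audio_dependencies() -> Dict[str, bool]:
    """Report whether ffmpeg is on PATH and whether the av module can be found."""
    return {
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        "pyav_importable": importlib.util.find_spec("av") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_audio_dependencies().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```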

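Publishing to PyPI

Install the build tools

pip install build twine

Build the distributions

# Build the source distribution and the wheel
python -m build

When the build finishes, the dist/ directory contains:

smooth_transcriber-X.X.X.tar.gz (source distribution)
smooth_transcriber-X.X.X-py3-none-any.whl (wheel)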

Upload to PyPI

Test environment (TestPyPI)

# Upload to TestPyPI for testing
python -m twine upload --repository testpypi dist/*

Production (PyPI)

# Upload to PyPI
python -m twine upload dist/*

Pre-release checks

Before publishing, run the following checks:

# Check the distributions
python -m twine check dist/*

# Run the tests
pytest

# Verify installation
pip install --upgrade --force-reinstall dist/smooth_transcriber-*.whl

Version management

Update the version number in pyproject.toml:

[project]
version = "0.1.0"  # update to the new version number

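Notes

Make sure the version number in pyproject.toml has been updated
Make sure all tests pass
Make sure README.md and the documentation are up to date
Before the first release, test on TestPyPI first
A PyPI account and an API token are required (configure them via ~/.pypirc or environment variables)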

License

MIT License

Release history

0.2.1 (Nov 29, 2025)
0.2.0 (Nov 18, 2025), this version
0.1.0 (Nov 18, 2025)

Download files

Source Distribution

smooth_transcriber-0.2.0.tar.gz (20.4 kB), uploaded Nov 18, 2025, Source

Built Distribution

smooth_transcriber-0.2.0-py3-none-any.whl (13.9 kB), uploaded Nov 18, 2025, Python 3

File details

Details for the file smooth_transcriber-0.2.0.tar.gz.

File metadata

Download URL: smooth_transcriber-0.2.0.tar.gz
Upload date: Nov 18, 2025
Size: 20.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for smooth_transcriber-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`fbed533550b1bade8f16692a25d32fb2536053a1841a9163b4fc8c9c5b93f0cb`
MD5	`1680ed74ba22a8a566413f238b795c85`
BLAKE2b-256	`f01c1f37b189dd6e78e7c260b4056fd3c61be3c25f83b2bfc0a98ab6a454a246`

File details

Details for the file smooth_transcriber-0.2.0-py3-none-any.whl.

File metadata

Download URL: smooth_transcriber-0.2.0-py3-none-any.whl
Upload date: Nov 18, 2025
Size: 13.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for smooth_transcriber-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d5ad0fc723f84b43b528e2d9496d799d932ac737830e2ac877d6f31fd15053c4`
MD5	`7168511f29fa641befd577a70ca6f93d`
BLAKE2b-256	`af49ca716f976e80673959eee8fcc2a04ea81118f6585b719bfb1ff10eaa1e0c`
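
The published digests can be used to check a downloaded file before installing it. The sketch below computes a SHA256 digest with the standard hashlib module and compares it with an expected hex string; `sha256_matches` is a hypothetical helper, and the file path passed to it is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path


def sha256_matches(path: str, expected_hex: str, chunk_size: int = 65536) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex digest."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # Read in blocks so large archives are not loaded into memory at once.
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest() == expected_hex.strip().lower()
```

For example, `sha256_matches("smooth_transcriber-0.2.0-py3-none-any.whl", "d5ad0fc723f84b43b528e2d9496d799d932ac737830e2ac877d6f31fd15053c4")` should return True for an intact copy of the wheel listed above.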

smooth-transcriber 0.2.0
