音频下载 + 语音转文字 CLI 工具，支持 B站/YouTube 等视频平台及直链音频，中文简繁体可选

These details have not been verified by PyPI

Project links

Homepage

Project description

audio-to-text-cli

音频下载 + 语音转文字命令行工具。

给一个链接，自动下载音频并转录为文字。支持多语言自动识别，中文支持简繁体切换。

支持的音频来源

来源	示例
Bilibili	`https://www.bilibili.com/video/BVxxxxx`
YouTube	`https://www.youtube.com/watch?v=xxxxx`
抖音	`https://www.douyin.com/video/xxxxx`
TikTok	`https://www.tiktok.com/@user/video/xxxxx`
Twitter/X	`https://x.com/user/status/xxxxx`
Vimeo	`https://vimeo.com/xxxxx`
直链音频	`https://example.com/audio.mp3`
其他	任何 yt-dlp 支持的平台

系统要求

Python 3.10+（唯一前置条件）
无需单独安装 FFmpeg — 依赖包自带
无需 C/C++ 编译器 — 所有依赖均有预编译包

支持的操作系统

平台	架构	支持
Windows	x86-64	✅
Linux	x86-64	✅
Linux	ARM64	✅
macOS	Intel (x86-64)	✅
macOS	Apple Silicon (M系列)	✅

GPU 加速（可选）

默认使用 CPU 运行，无需额外配置。如需 GPU 加速，需安装：

NVIDIA CUDA 12
cuDNN 9 for CUDA 12

安装

从 PyPI 安装：

pip install audio-to-text-cli

安装后如果提示 audio-to-text 命令找不到，可以用 python -m audio_to_text.cli 代替，或将 Python Scripts 目录加入系统 PATH。

或从源码安装：

git clone https://github.com/afuaide/audio-to-text-cli.git
cd audio-to-text-cli
pip install -e .

使用方法

安装后会注册 audio-to-text 命令：

# 基本用法 — 自动检测语言，中文默认输出简体
audio-to-text "https://www.bilibili.com/video/BV1wqLV6JEga/"

# 用较小的模型（更快，精度稍低）
audio-to-text -m small "https://www.youtube.com/watch?v=xxxxx"

# 指定语言（跳过自动检测，更快）
audio-to-text -l zh "URL"

# 输出到文件
audio-to-text -o result.txt "URL"

# 中文输出繁体
audio-to-text -c traditional "URL"

# 中文不做简繁转换，保留模型原始输出
audio-to-text -c raw "URL"

# 保留下载的音频文件
audio-to-text --keep-audio "URL"

# 指定音频下载目录
audio-to-text -d ./downloads "URL"

也可以通过 python -m 运行：

python -m audio_to_text.cli "URL"

参数说明

参数	说明	默认值
`url`	音频URL或视频平台链接	必填
`-m, --model`	Whisper 模型大小: `tiny` / `base` / `small` / `medium` / `large-v3`	`large-v3`
`-l, --language`	语言代码 (`zh` 中文, `en` 英语, `ja` 日语, `ko` 韩语等)	自动检测
`-c, --chinese`	中文输出格式: `simplified`(简体) / `traditional`(繁体) / `raw`(不转换)	`simplified`
`-o, --output`	转录结果保存路径	终端输出
`-d, --download-dir`	音频下载目录	临时目录
`--keep-audio`	保留下载的音频文件	否

模型选择

模型	大小	速度	精度	建议场景
`tiny`	~75MB	极快	一般	快速预览
`base`	~150MB	很快	尚可	日常使用
`small`	~500MB	快	较好	推荐起步
`medium`	~1.5GB	中等	很好	精度优先
`large-v3`	~3GB	较慢	最佳	追求最高精度

首次使用某个模型会自动从 HuggingFace 下载，之后会缓存到本地。

依赖说明

依赖	作用
`yt-dlp`	从视频平台提取音频
`faster-whisper`	语音转文字引擎（本地离线运行）
`requests`	下载直链音频
`imageio-ffmpeg`	自带 ffmpeg，用于音频格式转换
`opencc-python-reimplemented`	中文简繁体转换

所有依赖通过 pip install 安装，无需额外安装系统软件。

Project details

These details have not been verified by PyPI

Project links

Homepage

Release history Release notifications | RSS feed

This version

1.0.1

May 21, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

audio_to_text_cli-1.0.1.tar.gz (7.6 kB view details)

Uploaded May 21, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

audio_to_text_cli-1.0.1-py3-none-any.whl (9.0 kB view details)

Uploaded May 21, 2026 Python 3

File details

Details for the file audio_to_text_cli-1.0.1.tar.gz.

File metadata

Download URL: audio_to_text_cli-1.0.1.tar.gz
Upload date: May 21, 2026
Size: 7.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for audio_to_text_cli-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`1bd08cf57d7d06b316d5ceb4bcc225972a3f9b8241f25800809900e53eaf1aa0`
MD5	`f7c782f7fcc3fe5a251d3aa2b833a90a`
BLAKE2b-256	`55498b06af449a512f58e535566eab555e35c2a59fe3de68f40e1e0c3b80b961`

See more details on using hashes here.

Provenance

The following attestation bundles were made for audio_to_text_cli-1.0.1.tar.gz:

Publisher: publish.yml on afuaide/audio-to-text

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: audio_to_text_cli-1.0.1.tar.gz
- Subject digest: 1bd08cf57d7d06b316d5ceb4bcc225972a3f9b8241f25800809900e53eaf1aa0
- Sigstore transparency entry: 1590648747
- Sigstore integration time: May 21, 2026
Source repository:
- Permalink: afuaide/audio-to-text@7d0b0118c898e7dca0743dc9acf5d84db0f3451b
- Branch / Tag: refs/tags/v1.0.1
- Owner: https://github.com/afuaide
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7d0b0118c898e7dca0743dc9acf5d84db0f3451b
- Trigger Event: release

File details

Details for the file audio_to_text_cli-1.0.1-py3-none-any.whl.

File metadata

Download URL: audio_to_text_cli-1.0.1-py3-none-any.whl
Upload date: May 21, 2026
Size: 9.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for audio_to_text_cli-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c874ebf441183e1835337645b91979b95d1480cfd3eb4e3a7c3e76eb6b38a318`
MD5	`2f8ce305837f05d82797329f9f0b2ee1`
BLAKE2b-256	`4c9a25fef9bd2302b822135bafa3fad12607ffe9ac3c716af5f781322be1a1f8`

See more details on using hashes here.

Provenance

The following attestation bundles were made for audio_to_text_cli-1.0.1-py3-none-any.whl:

Publisher: publish.yml on afuaide/audio-to-text

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: audio_to_text_cli-1.0.1-py3-none-any.whl
- Subject digest: c874ebf441183e1835337645b91979b95d1480cfd3eb4e3a7c3e76eb6b38a318
- Sigstore transparency entry: 1590648801
- Sigstore integration time: May 21, 2026
Source repository:
- Permalink: afuaide/audio-to-text@7d0b0118c898e7dca0743dc9acf5d84db0f3451b
- Branch / Tag: refs/tags/v1.0.1
- Owner: https://github.com/afuaide
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7d0b0118c898e7dca0743dc9acf5d84db0f3451b
- Trigger Event: release

audio-to-text-cli 1.0.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

audio-to-text-cli

支持的音频来源

系统要求

支持的操作系统

GPU 加速（可选）

安装

使用方法

参数说明

模型选择

依赖说明

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance