python silk voice library
Project description
pilk
Python binding for the SILK codec, with support for encoding and decoding WeChat voice messages
pilk: python + silk
Related project: weixin-wxposed-silk-voice
Installation
pip install pilk
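Introduction
SILK is a speech codec developed by Skype. The most recent version that can be found online was released in 2012.
The original SILK source code has been uploaded to Release, together with the specification documents.
Support for Tencent-family voice files comes from silk-v3-decoder.
Release also contains an x64-win build of silk-v3-decoder, recompiled with Chinese support, together with its source code.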
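Relationship between the SILK format and Tencent-family voice files:
Here, WeChat voice is used as the only example of Tencent-family voice.
- A standard SILK file starts with b'#!SILK_V3' and ends with b'\xFF\xFF'; the voice data sits in between.
- A WeChat voice file inserts b'\x02' at the beginning of a standard SILK file and drops the trailing b'\xFF\xFF'; the data in between is unchanged.
Both kinds are referred to below as voice files.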
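The header and trailer differences described above can be sketched as plain byte manipulation. This is a minimal illustration, not part of pilk: the names `is_tencent`, `to_tencent`, and `to_standard` are hypothetical, and the code assumes the whole file fits in memory and is well formed.

```python
SILK_MAGIC = b'#!SILK_V3'      # start of a standard SILK file
TENCENT_PREFIX = b'\x02'       # extra byte WeChat puts before the magic
END_MARK = b'\xff\xff'         # trailer of a standard SILK file


def is_tencent(data: bytes) -> bool:
    """Return True if the bytes look like a WeChat-style voice file."""
    return data.startswith(TENCENT_PREFIX + SILK_MAGIC)


def to_tencent(data: bytes) -> bytes:
    """Standard SILK -> WeChat layout: add the prefix, drop the trailer."""
    if is_tencent(data):
        return data
    if not data.startswith(SILK_MAGIC):
        raise ValueError('not a SILK v3 file')
    body = data[len(SILK_MAGIC):]
    if body.endswith(END_MARK):
        body = body[:-len(END_MARK)]
    return TENCENT_PREFIX + SILK_MAGIC + body


def to_standard(data: bytes) -> bytes:
    """WeChat layout -> standard SILK: drop the prefix, add the trailer."""
    if data.startswith(SILK_MAGIC):
        return data
    if not is_tencent(data):
        raise ValueError('not a SILK v3 file')
    return data[len(TENCENT_PREFIX):] + END_MARK
```

Only the first byte and the last two bytes change; the frame data is copied untouched in both directions.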
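Voice data
The voice data is split into many independent frames. Each frame begins with two bytes (little-endian) that store the size of the rest of the frame, and by default each frame holds 20 ms of audio.
Based on this, a function that returns the duration of a voice file can be written (pilk already includes this function):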
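def get_duration(silk_path: str, frame_ms: int = 20) -> int:
    """Return the duration of a silk file, in ms."""
    with open(silk_path, 'rb') as silk:
        # A Tencent file has an extra leading b'\x02' before b'#!SILK_V3'
        tencent = silk.read(1) == b'\x02'
        # Skip the header: 10 bytes for Tencent, 9 bytes for standard SILK
        silk.seek(10 if tencent else 9)
        i = 0
        while True:
            size = silk.read(2)
            if len(size) != 2:
                break
            # The frame size is a 16-bit little-endian integer
            size = int.from_bytes(size, 'little')
            if not tencent and size == 0xffff:
                # End marker b'\xFF\xFF' of a standard SILK file
                break
            i += 1
            # Skip the frame payload
            silk.seek(silk.tell() + size)
        return i * frame_ms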
According to the SILK format specification, frame_ms can be 20, 40, 60, 80, or 100.
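To make the frame layout concrete, the sketch below walks the frames of an in-memory voice file and lists each frame's payload size. The helper names `frame_sizes` and `duration_ms` are hypothetical and are not pilk APIs; the frame payloads in the usage lines are dummy bytes, not real SILK data.

```python
import struct


def frame_sizes(data: bytes) -> list:
    """Return the payload size of every frame in a voice file held in memory."""
    tencent = data[:1] == b'\x02'
    pos = 10 if tencent else 9            # skip b'\x02#!SILK_V3' or b'#!SILK_V3'
    sizes = []
    while pos + 2 <= len(data):
        # Two-byte little-endian size prefix
        (size,) = struct.unpack_from('<H', data, pos)
        if not tencent and size == 0xFFFF:
            break                          # standard SILK end marker
        sizes.append(size)
        pos += 2 + size                    # size prefix + payload
    return sizes


def duration_ms(data: bytes, frame_ms: int = 20) -> int:
    """Duration is the number of frames times the per-frame duration."""
    return len(frame_sizes(data)) * frame_ms


# Two dummy frames of 300 and 5 bytes in a standard SILK container
frames = struct.pack('<H', 300) + bytes(300) + struct.pack('<H', 5) + bytes(5)
standard = b'#!SILK_V3' + frames + b'\xff\xff'
print(frame_sizes(standard), duration_ms(standard))
```

The 300-byte frame is included on purpose: its size needs both bytes of the prefix, so it exercises the little-endian decoding.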
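Quick start
See the API docstrings in your IDE for details.
Before using pilk, you should also understand that conversion between audio files (mp3, aac, m4a, flac, wav, ...) and voice files goes through raw PCM data.
The conversion chain is: audio file ⇔ PCM ⇔ voice file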
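- Audio (video) file ➜ PCM
  This step uses ffmpeg, which of course must be installed first.
  ffmpeg -y -i <audio/video input file> -vn -ar <sample rate> -ac 1 -f s16le <PCM output file>
  - -y: optional; overwrite <PCM output file> without asking if it already exists
  - -i: always required; followed by <audio/video input file>
  - -vn: do not process video data. Recommended: video is not processed even without it (video data cannot be converted to PCM), but ffmpeg may print a warning
  - -ar: sample rate; the allowed values are [8000, 12000, 16000, 24000, 32000, 44100, 48000]. You can think of it as the sound quality
  - -ac: number of channels; must be 1 here, as required by SILK
  - -f: force the given output format; in general it must be s16le, meaning 16-bit signed integer little-endian data
  - example1:
    ffmpeg -y -i mv.mp4 -vn -ar 44100 -ac 1 -f s16le mv.pcm
  - example2:
    ffmpeg -y -i music.mp3 -ar 44100 -ac 1 -f s16le music.pcm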
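The same command can be assembled from Python and run with `subprocess`. The following is a minimal sketch: `build_to_pcm_command` and `media_to_pcm` are hypothetical helper names, the default rate of 24000 is an arbitrary choice from the allowed list, and running it assumes an `ffmpeg` executable is on the PATH.

```python
import subprocess

# Sample rates listed above as valid for -ar
SILK_SAMPLE_RATES = (8000, 12000, 16000, 24000, 32000, 44100, 48000)


def build_to_pcm_command(src: str, dst: str, rate: int = 24000) -> list:
    """Build the ffmpeg argument list for media -> mono s16le PCM."""
    if rate not in SILK_SAMPLE_RATES:
        raise ValueError(f'unsupported sample rate: {rate}')
    return [
        'ffmpeg', '-y',        # overwrite the output without asking
        '-i', src,             # input media file
        '-vn',                 # ignore video streams
        '-ar', str(rate),      # output sample rate
        '-ac', '1',            # mono
        '-f', 's16le',         # raw 16-bit little-endian PCM
        dst,
    ]


def media_to_pcm(src: str, dst: str, rate: int = 24000) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_to_pcm_command(src, dst, rate), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces or non-ASCII characters.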
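- PCM ➜ audio file
  ffmpeg -y -f s16le -i <PCM input file> -ar <sample rate> -ac <channels> <audio output file>
  - -f: must be s16le here, again as required by SILK
  - -ar: same as above
  - -ac: same meaning as above; any value is allowed
  - <audio output file>: the extension must be accurate; when no format is specified, ffmpeg infers the output format from the extension of the output file
  - Note: ffmpeg options apply to the next file on the command line, and raw PCM carries no header. If the PCM file was not produced at ffmpeg's default raw-input rate of 44100 Hz, also pass -ar <sample rate> before -i so the input is read at the correct rate.
  - example3:
    ffmpeg -y -f s16le -i test.pcm test.mp3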
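For the special case of WAV output, the PCM ➜ audio step can also be done without ffmpeg, because a WAV file is essentially a header in front of the same 16-bit little-endian samples. The sketch below uses Python's standard `wave` module in place of ffmpeg; `pcm_to_wav` is a hypothetical helper name, and the caller is assumed to know the sample rate of the PCM data.

```python
import wave


def pcm_to_wav(pcm_path: str, wav_path: str, sample_rate: int, channels: int = 1) -> None:
    """Wrap raw s16le PCM in a WAV container."""
    with open(pcm_path, 'rb') as f:
        pcm = f.read()
    with wave.open(wav_path, 'wb') as wav:
        wav.setnchannels(channels)      # SILK-related PCM is mono
        wav.setsampwidth(2)             # 16-bit samples = 2 bytes
        wav.setframerate(sample_rate)   # must match the rate of the PCM data
        wav.writeframes(pcm)
```

If the sample rate passed here differs from the rate the PCM was produced at, the result plays too fast or too slow, which is the same pitfall as a mismatched `-ar`.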
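ffmpeg can also be replaced with a Python ffmpeg binding; PyAV is recommended. This is left for you to explore.
With audio file ⇔ PCM covered, the next step is to use pilk for the PCM ⇔ voice file conversion.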
silk encoding
import pilk
# Important: pcm_rate MUST match the `-ar` value used when ffmpeg converted the audio to the PCM file
duration = pilk.encode("test.pcm", "test.silk", pcm_rate=44100, tencent=True)
print("Voice duration:", duration)
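To try the encoder without ffmpeg, a test input can be synthesized directly. The sketch below writes a sine tone as mono 16-bit little-endian PCM, the same layout that `-ac 1 -f s16le` produces. `write_sine_pcm` is a hypothetical helper, and the tone parameters are arbitrary.

```python
import math
import struct


def write_sine_pcm(path: str, seconds: float = 1.0, rate: int = 24000,
                   freq: float = 440.0, amplitude: float = 0.5) -> int:
    """Write a mono s16le sine tone and return the number of samples written."""
    n = int(seconds * rate)
    peak = int(amplitude * 32767)          # scale to the signed 16-bit range
    samples = [
        int(peak * math.sin(2 * math.pi * freq * i / rate))
        for i in range(n)
    ]
    with open(path, 'wb') as f:
        # '<h' = little-endian signed 16-bit, one value per sample
        f.write(struct.pack('<%dh' % n, *samples))
    return n
```

A file written this way with `rate=24000` could then be passed to `pilk.encode` with `pcm_rate=24000`, following the call shown above.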
silk decoding
import pilk
# The sample rate of the decoded PCM must match the `-ar` value used when ffmpeg converts it back to audio
duration = pilk.decode("test.silk", "test.pcm")
print("Voice duration:", duration)
Converting any media file to SILK with Python
import os
import pilk
from pydub import AudioSegment


def convert_to_silk(media_path: str) -> str:
    """Convert the input media file to silk and return the silk path."""
    media = AudioSegment.from_file(media_path)
    # Output files go to the current directory, named after the input file
    pcm_path = os.path.basename(media_path)
    pcm_path = os.path.splitext(pcm_path)[0]
    silk_path = pcm_path + '.silk'
    pcm_path += '.pcm'
    # Export mono 16-bit little-endian PCM at the source sample rate
    media.export(pcm_path, 's16le', parameters=['-ar', str(media.frame_rate), '-ac', '1']).close()
    pilk.encode(pcm_path, silk_path, pcm_rate=media.frame_rate, tencent=True)
    return silk_path
Using PyAV (recommended)
import os
import av
import pilk


def to_pcm(in_path: str) -> tuple[str, int]:
    """Convert any media file to pcm; return the pcm path and its sample rate."""
    out_path = os.path.splitext(in_path)[0] + '.pcm'
    with av.open(in_path) as in_container:
        in_stream = in_container.streams.audio[0]
        sample_rate = in_stream.codec_context.sample_rate
        with av.open(out_path, 'w', 's16le') as out_container:
            out_stream = out_container.add_stream(
                'pcm_s16le',
                rate=sample_rate,
                layout='mono'
            )
            try:
                for frame in in_container.decode(in_stream):
                    # Let the encoder assign timestamps
                    frame.pts = None
                    for packet in out_stream.encode(frame):
                        out_container.mux(packet)
            except Exception:
                # Best effort: keep whatever was decoded before the error
                pass
    return out_path, sample_rate


def convert_to_silk(media_path: str) -> str:
    """Convert any media file to silk and return the silk path."""
    pcm_path, sample_rate = to_pcm(media_path)
    silk_path = os.path.splitext(pcm_path)[0] + '.silk'
    pilk.encode(pcm_path, silk_path, pcm_rate=sample_rate, tencent=True)
    # The intermediate PCM file is no longer needed
    os.remove(pcm_path)
    return silk_path