python silk voice library

Project description

pilk

Python SILK codec binding with support for encoding and decoding WeChat voice messages

pilk: python + silk

Related project: weixin-wxposed-silk-voice

Installation

pip install pilk

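Introduction

SILK is a speech coding format developed by Skype. The latest version that can be found online was released in 2012.

The original SILK source code has been uploaded to Release, including the specification document

Support for Tencent-family voice files comes from silk-v3-decoder

Release also contains a recompiled x64-win build of silk-v3-decoder with Chinese support, plus the source code

Relationship between the SILK format and Tencent-family voice files:

Here WeChat voice is the only Tencent-family example considered

  1. A standard SILK file starts with b'#!SILK_V3' and ends with b'\xFF\xFF', with the voice data in between
  2. A WeChat voice file inserts b'\x02' at the start of a standard SILK file and drops the trailing b'\xFF\xFF'; the data in between is unchanged

Both kinds are referred to below as voice files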
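The two layouts differ only in a one-byte prefix and a two-byte terminator, so converting between them can be sketched on raw bytes. This is a minimal illustration of the relationship described above, not part of pilk: the names `to_standard`, `to_tencent` and `is_tencent` are hypothetical, and the sketch assumes a standard file always carries the trailing b'\xFF\xFF'.

```python
SILK_MAGIC = b"#!SILK_V3"
TENCENT_PREFIX = b"\x02"
SILK_TERMINATOR = b"\xff\xff"


def is_tencent(data: bytes) -> bool:
    """True if the bytes start with the WeChat-style 0x02 prefix and the SILK magic."""
    return data.startswith(TENCENT_PREFIX + SILK_MAGIC)


def to_standard(data: bytes) -> bytes:
    """WeChat voice bytes -> standard SILK bytes (drop 0x02, append 0xFFFF)."""
    if data.startswith(SILK_MAGIC):
        return data  # already in the standard layout
    if not is_tencent(data):
        raise ValueError("not a SILK v3 voice file")
    return data[len(TENCENT_PREFIX):] + SILK_TERMINATOR


def to_tencent(data: bytes) -> bytes:
    """Standard SILK bytes -> WeChat voice bytes (prepend 0x02, drop 0xFFFF)."""
    if is_tencent(data):
        return data  # already in the WeChat layout
    if not data.startswith(SILK_MAGIC):
        raise ValueError("not a SILK v3 voice file")
    # Assumption: a standard file ends with the terminator, as described above.
    if data.endswith(SILK_TERMINATOR):
        data = data[:-len(SILK_TERMINATOR)]
    return TENCENT_PREFIX + data
```

Both functions leave the voice data between header and terminator untouched, which matches the description that the middle part is identical in the two layouts.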

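Voice data

Voice data is split into many independent frames. Each frame begins with two bytes that store the size of the rest of the frame, and each frame holds 20 ms of audio by default

From this, a function that returns the duration of a voice file can be written (pilk already includes this function)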

def get_duration(silk_path: str, frame_ms: int = 20) -> int:
    """Return the duration of a silk file in ms."""
    with open(silk_path, 'rb') as silk:
        # A WeChat voice file carries an extra b'\x02' before b'#!SILK_V3'
        tencent = silk.read(1) == b'\x02'
        # Skip the header: 9 bytes for b'#!SILK_V3', plus 1 for the prefix
        silk.seek(10 if tencent else 9)
        i = 0
        while True:
            size = silk.read(2)
            if len(size) != 2:
                break
            # Little-endian 16-bit size; the parentheses are required
            # because `+` binds tighter than `<<`
            size = size[0] + (size[1] << 8)
            if not tencent and size == 0xffff:
                break
            i += 1
            silk.seek(silk.tell() + size)
        return i * frame_ms

According to the SILK format specification, frame_ms can be 20, 40, 60, 80, or 100
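To see the frame layout without a real recording, the walk over frames can be tried on synthetic bytes. The sketch below is an illustration only: `count_frames` and `make_voice` are hypothetical helper names, the payloads are zero bytes rather than real SILK data, and the size field is read as a little-endian unsigned 16-bit integer. One payload is larger than 255 bytes so that the high byte of the size field is exercised.

```python
import io
import struct


def count_frames(data: bytes) -> int:
    """Count the frames in voice bytes (standard or WeChat-style layout)."""
    tencent = data[:1] == b"\x02"
    stream = io.BytesIO(data)
    stream.seek(10 if tencent else 9)  # skip optional 0x02 and b'#!SILK_V3'
    frames = 0
    while True:
        header = stream.read(2)
        if len(header) != 2:
            break  # end of data
        (size,) = struct.unpack("<H", header)
        if not tencent and size == 0xFFFF:
            break  # terminator of a standard SILK file
        frames += 1
        stream.seek(size, io.SEEK_CUR)  # jump over the frame payload
    return frames


def make_voice(payload_sizes, tencent: bool = False) -> bytes:
    """Build fake voice bytes whose frames hold zero-filled payloads."""
    body = b"".join(struct.pack("<H", n) + bytes(n) for n in payload_sizes)
    if tencent:
        return b"\x02#!SILK_V3" + body
    return b"#!SILK_V3" + body + b"\xff\xff"


sample = make_voice([30, 300, 45])
print(count_frames(sample) * 20)  # 3 frames of 20 ms each -> 60
```

Multiplying the frame count by frame_ms gives the duration, which is the same arithmetic the duration function relies on.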

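Quick start

For details, see the API docstrings in your IDE

Before using pilk, you should also know that conversion between audio files (mp3, aac, m4a, flac, wav, ...) and voice files goes through PCM raw data

The conversion chain is: audio file ⇔ PCM ⇔ voice file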

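  1. Audio (video) file ➜ PCM

    This uses ffmpeg, so you need to have ffmpeg installed first

    ffmpeg -y -i <audio (video) input file> -vn -ar <sample rate> -ac 1 -f s16le <PCM output file>
    
    1. -y: optional; overwrite <PCM output file> without asking if it already exists
    2. -i: fixed; followed by <audio (video) input file>
    3. -vn: do not process video data. Recommended: video is not processed even without it (video cannot be converted to PCM), but ffmpeg may print a warning
    4. -ar: sets the sample rate; the allowed values are [8000, 12000, 16000, 24000, 32000, 44100, 48000]. You can think of it simply as the sound quality
    5. -ac: sets the number of channels; it must be 1 here, as required by SILK
    6. -f: forces the given output format; in general it must be s16le, meaning 16-bit short integer Little-Endian data
    7. example1: ffmpeg -y -i mv.mp4 -vn -ar 44100 -ac 1 -f s16le mv.pcm
    8. example2: ffmpeg -y -i music.mp3 -ar 44100 -ac 1 -f s16le music.pcm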
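  2. PCM ➜ audio file

    ffmpeg -y -f s16le -ar <sample rate> -ac 1 -i <PCM input file> <audio output file>
    
    1. -f: must be s16le here, which is likewise required by SILK
    2. -ar: same as above. Raw PCM has no header, so place it before -i to tell ffmpeg the rate of the input; it must match the rate the PCM was produced with
    3. -ac: same meaning as above. Before -i it describes the input, which is mono here; an extra -ac after -i may set any channel count for the output
    4. <audio output file>: the extension must be accurate; when no format is given, ffmpeg picks the output format from the extension of the output file
    5. example3: ffmpeg -y -f s16le -ar 44100 -ac 1 -i test.pcm test.mp3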
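The media ➜ PCM command above can also be driven from Python with the standard library. The following is a minimal sketch, assuming ffmpeg is on the PATH; `media_to_pcm_cmd` and `media_to_pcm` are hypothetical helper names, and the default rate of 24000 is only an illustrative choice from the allowed list.

```python
import subprocess

# Sample rates listed above as valid for the -ar option
SILK_PCM_RATES = (8000, 12000, 16000, 24000, 32000, 44100, 48000)


def media_to_pcm_cmd(src: str, dst: str, rate: int = 24000) -> list:
    """Build the ffmpeg argument list for media -> mono s16le PCM."""
    if rate not in SILK_PCM_RATES:
        raise ValueError("unsupported sample rate: %d" % rate)
    return [
        "ffmpeg", "-y",        # overwrite the output without asking
        "-i", src,             # input media file
        "-vn",                 # ignore any video stream
        "-ar", str(rate),      # output sample rate
        "-ac", "1",            # mono, as SILK requires
        "-f", "s16le",         # raw 16-bit little-endian samples
        dst,
    ]


def media_to_pcm(src: str, dst: str, rate: int = 24000) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(media_to_pcm_cmd(src, dst, rate), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces, and checking the rate up front keeps it within the values listed above.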

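ffmpeg can also be replaced with a Python ffmpeg binding; PyAV is recommended. You can explore that on your own, so it is not covered in detail here.

With audio file ⇔ PCM covered, the next step is using pilk to convert PCM ⇔ voice file

SILK encoding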

import pilk

# IMPORTANT: pcm_rate must match the `-ar` value used when converting the audio to PCM with ffmpeg
duration = pilk.encode("test.pcm", "test.silk", pcm_rate=44100, tencent=True)

print("Voice duration:", duration)
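To try the encoder without ffmpeg or an existing recording, a test tone can be generated as raw PCM with the standard library. This is a sketch under stated assumptions: `sine_pcm`, `pcm_duration_ms` and `encode_tone` are hypothetical names, the 440 Hz tone and 24000 Hz rate are arbitrary choices, and the `pilk.encode` call simply mirrors the one shown above.

```python
import math
import struct


def sine_pcm(rate: int = 24000, seconds: float = 1.0, freq: float = 440.0) -> bytes:
    """Return mono 16-bit little-endian PCM bytes for a sine tone."""
    count = int(rate * seconds)
    amplitude = 0.5 * 32767  # half of full scale to leave headroom
    samples = (
        int(amplitude * math.sin(2 * math.pi * freq * n / rate))
        for n in range(count)
    )
    return struct.pack("<%dh" % count, *samples)


def pcm_duration_ms(pcm: bytes, rate: int) -> int:
    """Duration of mono s16le PCM: 2 bytes per sample, `rate` samples per second."""
    return len(pcm) // 2 * 1000 // rate


def encode_tone(pcm_path: str = "tone.pcm", silk_path: str = "tone.silk",
                rate: int = 24000):
    """Write the tone to disk and encode it; requires pilk to be installed."""
    import pilk  # imported lazily so the helpers above work without it
    with open(pcm_path, "wb") as f:
        f.write(sine_pcm(rate))
    return pilk.encode(pcm_path, silk_path, pcm_rate=rate, tencent=True)
```

Because the tone is produced at a known rate, the same value can be passed as pcm_rate, which sidesteps the most common mistake of encoding with a rate that differs from the PCM data.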

SILK decoding

import pilk

# When converting the decoded PCM to an audio file with ffmpeg later,
# `-ar` must match the sample rate the PCM was decoded at
duration = pilk.decode("test.silk", "test.pcm")

print("Voice duration:", duration)
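Decoded PCM has no header, so most players cannot open it directly. If ffmpeg is not at hand, wrapping it in a WAV container with the standard `wave` module is one option. This is a minimal sketch: `pcm_to_wav_bytes` and `pcm_file_to_wav` are hypothetical names, and the sample rate is passed in explicitly because it has to be the rate the PCM was actually decoded at.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, rate: int) -> bytes:
    """Wrap mono 16-bit little-endian PCM in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # SILK voice data is mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


def pcm_file_to_wav(pcm_path: str, wav_path: str, rate: int) -> None:
    """File-based convenience wrapper around pcm_to_wav_bytes."""
    with open(pcm_path, "rb") as src:
        data = pcm_to_wav_bytes(src.read(), rate)
    with open(wav_path, "wb") as dst:
        dst.write(data)
```

If the rate passed here differs from the rate of the PCM data, the file still plays, but too fast or too slow.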

Converting any media file to SILK with Python

Using pydub (depends on ffmpeg)

import os, pilk
from pydub import AudioSegment


def convert_to_silk(media_path: str) -> str:
    """Convert the input media file to silk and return the silk path."""
    media = AudioSegment.from_file(media_path)
    # Output files are written to the current directory, named after the input
    pcm_path = os.path.basename(media_path)
    pcm_path = os.path.splitext(pcm_path)[0]
    silk_path = pcm_path + '.silk'
    pcm_path += '.pcm'
    # Export mono s16le PCM at the source sample rate
    media.export(pcm_path, 's16le', parameters=['-ar', str(media.frame_rate), '-ac', '1']).close()
    pilk.encode(pcm_path, silk_path, pcm_rate=media.frame_rate, tencent=True)
    return silk_path

Using PyAV (recommended)

import os
from typing import Tuple

import av

import pilk


def to_pcm(in_path: str) -> Tuple[str, int]:
    """Convert any media file to PCM; return the PCM path and sample rate."""
    out_path = os.path.splitext(in_path)[0] + '.pcm'
    with av.open(in_path) as in_container:
        in_stream = in_container.streams.audio[0]
        sample_rate = in_stream.codec_context.sample_rate
        with av.open(out_path, 'w', 's16le') as out_container:
            out_stream = out_container.add_stream(
                'pcm_s16le',
                rate=sample_rate,
                layout='mono'
            )
            try:
                for frame in in_container.decode(in_stream):
                    frame.pts = None
                    for packet in out_stream.encode(frame):
                        out_container.mux(packet)
            except Exception:
                # Best effort: keep whatever was written before the error
                pass
    return out_path, sample_rate


def convert_to_silk(media_path: str) -> str:
    """Convert any media file to silk and return the silk path."""
    pcm_path, sample_rate = to_pcm(media_path)
    silk_path = os.path.splitext(pcm_path)[0] + '.silk'
    pilk.encode(pcm_path, silk_path, pcm_rate=sample_rate, tencent=True)
    os.remove(pcm_path)
    return silk_path

Download files

Source Distribution

pilk-0.2.4.tar.gz (226.5 kB), Source

Built Distributions

pilk-0.2.4-cp311-cp311-win_amd64.whl (127.0 kB), CPython 3.11, Windows x86-64
pilk-0.2.4-cp310-cp310-win_amd64.whl (126.4 kB), CPython 3.10, Windows x86-64
pilk-0.2.4-cp39-cp39-win_amd64.whl (126.4 kB), CPython 3.9, Windows x86-64
pilk-0.2.4-cp38-cp38-win_amd64.whl (126.4 kB), CPython 3.8, Windows x86-64
pilk-0.2.4-cp37-cp37m-win_amd64.whl (126.4 kB), CPython 3.7m, Windows x86-64
pilk-0.2.4-cp36-cp36m-win_amd64.whl (133.4 kB), CPython 3.6m, Windows x86-64

File details

Details for the file pilk-0.2.4.tar.gz.

File metadata

  • Download URL: pilk-0.2.4.tar.gz
  • Upload date:
  • Size: 226.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.5

File hashes

Hashes for pilk-0.2.4.tar.gz
Algorithm Hash digest
SHA256 d4a1bcf93dc6ef5e95e0cfd728ed4ef4d49f9c0476d70816fecbe456cc762e7f
MD5 d65845b1fffef26198e1151fa3e0dc40
BLAKE2b-256 66bb938dd697b6bc2d851ffec4ffe82ed20078e58bd0ef049e84f4d038d6f991

File details

Details for the file pilk-0.2.4-cp311-cp311-win_amd64.whl.

File metadata

  • Download URL: pilk-0.2.4-cp311-cp311-win_amd64.whl
  • Upload date:
  • Size: 127.0 kB
  • Tags: CPython 3.11, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.5

File hashes

Hashes for pilk-0.2.4-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 a6692096607e8d77d348aeec00df633788c85b527aa83cbd803ddf508936db7f
MD5 73134994069886d1e6807bd7c679ec7f
BLAKE2b-256 ff0680cac61bc7f791bcaae552ba90fb868f7af7503ef1c74e423cc20fca1a53

File details

Details for the file pilk-0.2.4-cp310-cp310-win_amd64.whl.

File metadata

  • Download URL: pilk-0.2.4-cp310-cp310-win_amd64.whl
  • Upload date:
  • Size: 126.4 kB
  • Tags: CPython 3.10, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.5

File hashes

Hashes for pilk-0.2.4-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 1e9d4a4f9dbc28f8b753babb70533007043b01cb7005915c956e670613ddaa2b
MD5 9ffcea6c15ed5f9293bb353f2fcc843e
BLAKE2b-256 4668bb558c4c48bccb12f8bbd0d55196cbc5e8aace33aa80828a189b0515eabc

File details

Details for the file pilk-0.2.4-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: pilk-0.2.4-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 126.4 kB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.5

File hashes

Hashes for pilk-0.2.4-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 3730c51fc0bf96214df98006d3f0822a17c1b634506c801e66dc025bb41a2664
MD5 3c08ba22ce7e44cf1ee37481a1708ab7
BLAKE2b-256 c009b2db1a9b3a6ae898fb5d13fd600113f8f80bec480f6503d766d582b38157

File details

Details for the file pilk-0.2.4-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: pilk-0.2.4-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 126.4 kB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.5

File hashes

Hashes for pilk-0.2.4-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 163af75478a7560b40549e7b0c58c83f9b88399deb3f301f6f763a617db0ae92
MD5 780b399b1d5674411996e2c04c193dba
BLAKE2b-256 38e86d29300efd4d6614bd77019bdd7871de0d27d0b935e21bcd1c6d41c99696

File details

Details for the file pilk-0.2.4-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: pilk-0.2.4-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 126.4 kB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.5

File hashes

Hashes for pilk-0.2.4-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 b9b40f49e0dedc1cb5501e13ed17b3bdaa7c6733bd871a22a063e7aa472cd2ef
MD5 d1509f23085f9796b2310008ad6bc253
BLAKE2b-256 2f810300f145b50d2e10883ca092aa5d6e8118c9a7a046d3bc822a1e668084cd

File details

Details for the file pilk-0.2.4-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: pilk-0.2.4-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 133.4 kB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.5

File hashes

Hashes for pilk-0.2.4-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 df3a219ae1f1cd4a640d3e6570b53616df7150a331a143246b46d49f3232e11e
MD5 3b0691ce9bab6485b94316b0d9b35ebe
BLAKE2b-256 eb59dc9575e72d6c879aca267101884ba13b589dc4102622bb1bc87909c78a24
