Live2D Mouth-sync artifact

pymouth

pymouth is a Python library for Live2D lip sync. Given an audio file, or even an ndarray produced by an AI model, it can easily make your Live2D avatar open its mouth to sing, dance, and rap.
Demo video

Quick Start

Environment

Python>=3.10
VTubeStudio>=1.28.0 (optional)

Installation

pip install pymouth

Get Started

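Before you start, turn on the Server switch in VTubeStudio. The port usually defaults to 8001.
You need to determine which lip-sync parameters your Live2D model supports.
Note: a simple way to check is given below, but it modifies (resets) some of the mouth parameters of your Live2D model, so back up your model before using it.
If you know your model inside out, you can skip this step.
- After confirming the parameter reset, if the following message appears, your model supports only decibel-based lip sync.
- After confirming the parameter reset, if the following message appears, your model supports only vowel-based lip sync.
- If VTubeStudio finds all parameters and the reset succeeds, both methods are supported. Just choose one of them in the code that follows.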

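Below are two demos, one for each method.
Replace some.wav with an audio file of your own.
samplerate: the sampling rate of the audio data.
output_device: the index of the output device. See audio_devices_utils.py for reference.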
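
If you are unsure what to pass as samplerate for a WAV file, the standard-library wave module can read it from the file header. The sketch below is independent of pymouth: the helper names wav_samplerate and write_silence are made up for illustration, and the approach only covers uncompressed PCM WAV files.

```python
import wave


def wav_samplerate(path: str) -> int:
    """Return the sampling rate (frames per second) stored in a PCM WAV header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def write_silence(path: str, samplerate: int, seconds: float = 0.1) -> None:
    """Write a short mono 16-bit silent WAV file, only so the sketch has input."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(samplerate)
        wav_file.writeframes(b"\x00\x00" * int(samplerate * seconds))
```

The value returned by wav_samplerate('some.wav') could then be passed as the samplerate argument instead of a hard-coded 44100.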

Decibel-based lip sync

import time
from pymouth import VTSAdapter, DBAnalyser

def main():
    with VTSAdapter(DBAnalyser) as a:
        a.action(audio='some.wav', samplerate=44100, output_device=2)
        time.sleep(100000)  # do something


if __name__ == "__main__":
    main()

Vowel-based lip sync

import time
from pymouth import VTSAdapter, VowelAnalyser

def main():
    with VTSAdapter(VowelAnalyser) as a:
        a.action(audio='some.wav', samplerate=44100, output_device=2)
        time.sleep(100000)  # do something


if __name__ == "__main__":
    main()

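The first time you run the program, VTubeStudio shows a plugin authorization dialog. Once you approve it, the plugin creates a pymouth_vts_token.txt file under the runtime path. Later runs do not ask again unless the token file is lost or the authorization is removed in VTubeStudio.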

API Changes

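Since version 1.2.0, the coroutine (async/await) calling style has been removed from all functions. Coroutine calls are contagious, which makes user code harder to maintain.
Only blocking and non-blocking calls are provided now. The non-blocking call is implemented with an internal single-threaded thread pool, so no matter how many times a.action is called, the audio is played in the order of the calls.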
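
The ordering guarantee described here is what a one-worker thread pool provides in general. The sketch below does not use pymouth: fake_play is a hypothetical stand-in for audio playback, and the code only illustrates that tasks submitted to a single-worker ThreadPoolExecutor run one at a time in submission order while submit() itself returns immediately.

```python
import time
from concurrent.futures import ThreadPoolExecutor

played: list[str] = []


def fake_play(name: str, seconds: float) -> None:
    """Hypothetical stand-in for playing one audio clip."""
    time.sleep(seconds)
    played.append(name)


def submit_all(clips: list[tuple[str, float]]) -> list[str]:
    """Queue every clip without blocking, then wait for the pool to drain."""
    played.clear()
    with ThreadPoolExecutor(max_workers=1) as pool:
        for name, seconds in clips:
            pool.submit(fake_play, name, seconds)  # returns immediately
    # Leaving the with-block waits for all queued work to finish.
    return list(played)
```

Even though the first clip takes the longest, the clips finish in the order they were submitted, because the single worker never starts a task before the previous one has ended.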

If your program is still started from a coroutine, you can refer to the example below.

import asyncio
from pymouth import VTSAdapter, VowelAnalyser


async def main():
    with VTSAdapter(VowelAnalyser) as a:
        a.action(audio='aiueo.wav', samplerate=44100, output_device=2)  # no-block
        # a.action_block(audio='aiueo.wav', samplerate=44100, output_device=2) # block
        await asyncio.sleep(100000)


if __name__ == "__main__":
    asyncio.run(main())

About AI (deprecated: the example below still uses the old coroutine-based calls and needs slight changes for 1.2.0 and later)

Below is a fairly complete example that uses pymouth as the consumer of an AI TTS.

import asyncio.queues as queues
import logging
import time
import asyncio
from asyncio import QueueFull
from melo.api import TTS
from pymouth import VTSAdapter, DBAnalyser
from concurrent.futures.thread import ThreadPoolExecutor


class SpeakMsg:
    def __init__(self, msg: str, required: bool):
        self.msg = msg
        self.required = required
        self.create_timestamp = time.time()
        self.create_datetime = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(self.create_timestamp))


class Speaker:
    def __init__(self):
        self.queue = queues.Queue(2)
        self.ready = True

    def finished_callback(self):
        self.ready = True

    async def start(self):
        tts_model = TTS(language='ZH', device='cuda:0')
        speaker_ids = tts_model.hps.data.spk2id

        plugin_info = {"plugin_name": "kanojyo2",
                       "developer": "organics",
                       "authentication_token_path": "./pymouth_vts_token.txt",
                       "plugin_icon": None}

        async with VTSAdapter(DBAnalyser, plugin_info=plugin_info) as a:
            while True:
                msg: SpeakMsg = await self.queue.get()
                audio = tts_model.tts_to_file(msg.msg, speaker_ids['ZH'], output_path=None, speed=1.0)

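                # a.action() returns immediately, but the audio may still be playing. Consuming the next message before playback finishes may not be what you expect.
                # pymouth does manage playback order itself (it keeps its own playback queue and plays only one clip at a time), but blocking consumption as below may be the better choice.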
                while not self.ready:
                    await asyncio.sleep(1)
                self.ready = False

                await a.action(audio=audio,
                               samplerate=tts_model.hps.data.sampling_rate,
                               output_device=2,
                               finished_callback=self.finished_callback)

    async def speak(self, msg: str, required=True):
        if required:
            await self.queue.put(SpeakMsg(msg, required))
        else:
            try:
                self.queue.put_nowait(SpeakMsg(msg, required))
            except QueueFull:
                logging.warning('Queue is full')


speakers = Speaker()
event_loop = asyncio.get_event_loop()


def producer_callback(msg: str):
    async def mm():
        await speakers.speak(msg)

    # Producers may run on different threads, so the call has to go through event_loop across threads.
    asyncio.run_coroutine_threadsafe(mm(), event_loop)


def main():
    with ThreadPoolExecutor(2) as executor:
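        # This implementation is a reference, not a recommendation. Coroutines do not cover the program's whole lifetime because threads may suit CPU-intensive work such as AI better than coroutines.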
        executor.submit(event_loop.run_until_complete, speakers.start())
        # do something


if __name__ == "__main__":
    main()
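
For 1.2.0 and later, one way the consumer loop above might be adapted is to drop asyncio and use a worker thread with queue.Queue. This is only a sketch under assumptions: play_blocking is a hypothetical stand-in for a blocking call such as a.action_block(...), the TTS step is omitted, and none of this is taken from pymouth itself.

```python
import logging
import queue
import threading
from typing import Callable


class Speaker:
    """Consume messages one at a time on a background thread."""

    def __init__(self, play_blocking: Callable[[str], None], maxsize: int = 2):
        self._play = play_blocking  # hypothetical stand-in for a.action_block(...)
        self._queue = queue.Queue(maxsize)
        self._worker = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._worker.start()

    def _run(self) -> None:
        while True:
            msg = self._queue.get()
            if msg is None:  # sentinel: stop consuming
                break
            self._play(msg)  # blocks until this message has been played

    def speak(self, msg: str, required: bool = True) -> bool:
        """Queue a message; optional messages are dropped when the queue is full."""
        if required:
            self._queue.put(msg)  # waits for a free slot
            return True
        try:
            self._queue.put_nowait(msg)
            return True
        except queue.Full:
            logging.warning("Queue is full")
            return False

    def stop(self) -> None:
        """Ask the worker to finish the queued messages and exit."""
        self._queue.put(None)
        self._worker.join()
```

Because the worker blocks on each message, the ready flag and the polling loop of the coroutine version are no longer needed, and producers on any thread can call speak() directly.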

More Details

High Level

Only two lines of code matter:

with VTSAdapter(DBAnalyser) as a:
    a.action(audio='some.wav', samplerate=44100, output_device=2)  # no-block
    # a.action_block(audio='aiueo.wav', samplerate=44100, output_device=2) # block

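a.action() is non-blocking and returns immediately; the thread pool and queue are maintained internally.
a.action_block() blocks and does not return until the audio has been played and processed. It is plain synchronous code with no threads of its own; any threading is up to the caller.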

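The parameters of VTSAdapter are described below:

param	required	default	description
`analyser`	Y		The analyser. Must be a subclass of Analyser; `DBAnalyser` and `VowelAnalyser` are currently supported.
`db_vts_mouth_param`		`'MouthOpen'`	Used only with `DBAnalyser`. The VTS parameter that controls mouth_input; change it if your model does not use the default.
`vowel_vts_mouth_param`		`dict[str,str]`	Used only with `VowelAnalyser`. The VTS parameters that control mouth_input; change them if your model does not use the defaults.
`ws_uri`		`str`	The websocket URI. Default: ws://localhost:8001
`plugin_info`		`dict`	Plugin information; can be customized.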

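a.action() starts processing the audio data. Its parameters are described below:

param	required	default	description
`audio`	Y		The audio data: a file path, a SoundFile object, or an ndarray.
`samplerate`	Y		The sampling rate, which depends on the audio data. If you cannot obtain the sampling rate of the audio data, try the sampling rate of the output device.
`output_device`	Y		The output device index, which depends on your hardware or virtual devices. audio_devices_utils.py can print the audio devices of the current system.
`finished_callback`		`None`	Called when audio processing has finished.
`auto_play`		`True`	Whether to play the audio automatically. Defaults to True, which plays the audio (the audio is written to the specified `output_device`).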
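
Since a.action() returns immediately, finished_callback is the natural hook for learning when processing has ended. A common pattern, sketched below under assumptions, is to have the callback set a threading.Event and wait on that instead of sleeping for a fixed time. fake_action is a hypothetical stand-in that mimics a non-blocking call invoking a no-argument callback from another thread; it is not pymouth code.

```python
import threading
import time
from typing import Callable


def fake_action(seconds: float, finished_callback: Callable[[], None]) -> None:
    """Hypothetical non-blocking call: the work happens on another thread."""

    def work() -> None:
        time.sleep(seconds)  # pretend to play audio
        finished_callback()

    threading.Thread(target=work, daemon=True).start()


def play_and_wait(seconds: float, timeout: float) -> bool:
    """Start the fake action and block until its callback fires or the timeout expires."""
    done = threading.Event()
    fake_action(seconds, finished_callback=done.set)
    return done.wait(timeout)  # True if the callback ran in time
```

With the real adapter, the same idea would mean passing done.set as finished_callback and calling done.wait() where the demos call time.sleep(100000).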

Low Level

Get Started demonstrates the High Level API. If you do not use VTubeStudio, or want more flexibility, you can try the Low Level API. Below is a demo.

import time

from pymouth import DBAnalyser


def callback(y: float, data):
    # y is the Y coordinate of the model's mouth,
    # for example 0.4212883452
    print(y)  # do something


with DBAnalyser() as a:
    a.action_noblock('zh.wav', 44100, output_device=2, callback=callback)  # no block
    # a.action_block()  # block
    print("end")
    time.sleep(1000000)
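
To give a feel for where a value such as 0.42 could come from, here is one possible way to turn the loudness of an audio chunk into a mouth opening between 0 and 1. This is an illustrative sketch, not pymouth's algorithm: the function names, the -60 dB floor, and the linear mapping are all assumptions.

```python
import math


def rms(samples: list[float]) -> float:
    """Root-mean-square amplitude of one chunk of samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mouth_open(samples: list[float], floor_db: float = -60.0) -> float:
    """Map chunk loudness to 0.0 (closed) through 1.0 (fully open).

    Loudness is measured in dB relative to full scale; anything at or
    below floor_db counts as silence. floor_db is an illustrative choice.
    """
    level = rms(samples)
    if level <= 0.0:
        return 0.0
    db = 20.0 * math.log10(level)  # 0 dB when the RMS amplitude is 1.0
    return min(1.0, max(0.0, 1.0 - db / floor_db))
```

A chunk with an RMS amplitude of 0.1 sits at -20 dB, which this mapping places at about two thirds open.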

import time

from pymouth import VowelAnalyser


def callback(md: dict[str, float], data):
    """
    md looks like:
    {
        'VoiceSilence': 0,
        'VoiceA': 0.6547555255,
        'VoiceI': 0.2872873444,
        'VoiceU': 0.1034789232,
        'VoiceE': 0.3927834533,
        'VoiceO': 0.1927834548,
    }
    """
    print(md)  # do something


with VowelAnalyser() as a:
    a.action_noblock('zh.wav', 44100, output_device=2, callback=callback)  # no block
    # a.action_block() # block
    print("end")
    time.sleep(1000000)
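
If your renderer can only show one mouth shape at a time, the callback could reduce md to a single key. The helper below is a hypothetical sketch that is not part of pymouth; the 0.2 threshold is an arbitrary illustrative value, and the key names are the ones shown in the docstring above.

```python
def dominant_vowel(md: dict[str, float], threshold: float = 0.2) -> str:
    """Return the vowel key with the largest weight, or 'VoiceSilence' if all are quiet.

    threshold is an illustrative cut-off, not a value taken from pymouth.
    """
    vowels = {k: v for k, v in md.items() if k != "VoiceSilence"}
    if not vowels:
        return "VoiceSilence"
    best = max(vowels, key=vowels.get)
    return best if vowels[best] >= threshold else "VoiceSilence"
```

Calling dominant_vowel(md) inside the callback would then give one label per audio chunk.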

TODO

Complete the documentation
Test case

Special Thanks

Release history

1.3.3: Feb 9, 2026
1.3.2: Jan 22, 2026
1.3.1: Apr 4, 2025
1.3.0: Apr 4, 2025
1.2.0: Dec 9, 2024 (this version)
1.1.1: Oct 23, 2024
1.1.0: Oct 19, 2024
1.0.3: Sep 21, 2024
1.0.2: Sep 16, 2024
0.1.1: Jun 13, 2024
