pylipsync

A Python implementation of Hecomi's uLipSync for audio-based lip sync analysis. This library analyzes audio and determines phoneme targets for lip synchronization in real-time applications.

Installation

Install from PyPI

pip install pylipsync

Install from Local Clone

Alternatively, clone the repository and install:

git clone https://github.com/spava002/pyLipSync.git
cd pyLipSync
pip install -e .

Quick Start

Get started with just a few lines of code:

from pylipsync import PhonemeAnalyzer

analyzer = PhonemeAnalyzer()

segments = analyzer.extract_phoneme_segments("path/to/your/audio.mp3")

for segment in segments:
    print(f"{segment.start}-{segment.end}: {segment.dominant_phoneme.name}")
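Per-frame output often repeats the same phoneme many times in a row. The sketch below shows one way to collapse such runs into longer spans. It relies only on the `start`, `end`, and `dominant_phoneme.name` fields used above; the `Segment` and `Phoneme` classes and the `merge_runs` helper are hypothetical stand-ins written for this illustration, not part of pylipsync.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Phoneme:
    # Hypothetical stand-in for the library's phoneme object.
    name: str


@dataclass
class Segment:
    # Hypothetical stand-in exposing only the fields used in the Quick Start.
    start: float
    end: float
    dominant_phoneme: Phoneme


def merge_runs(segments):
    """Collapse consecutive segments that share a dominant phoneme.

    Returns a list of (start, end, name) tuples, one per run.
    """
    runs = []
    for name, group in groupby(segments, key=lambda s: s.dominant_phoneme.name):
        group = list(group)
        runs.append((group[0].start, group[-1].end, name))
    return runs


demo = [
    Segment(0.0, 0.1, Phoneme("aa")),
    Segment(0.1, 0.2, Phoneme("aa")),
    Segment(0.2, 0.3, Phoneme("silence")),
    Segment(0.3, 0.4, Phoneme("oh")),
]

for start, end, name in merge_runs(demo):
    print(f"{start}-{end}: {name}")
```

With real output, the list returned by `extract_phoneme_segments` could be passed to `merge_runs` in place of `demo`, provided its segments expose the same three fields.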

Advanced Usage

For more control over the analysis, you can customize the analyzer and extraction parameters:

from pylipsync import PhonemeAnalyzer, CompareMethod
import librosa as lb

# Initialize with custom settings
analyzer = PhonemeAnalyzer(
    compare_method=CompareMethod.COSINE_SIMILARITY,  # One of: L1_NORM, L2_NORM, COSINE_SIMILARITY
    silence_threshold=0.3
)

# Method 1: Pass file path directly
segments = analyzer.extract_phoneme_segments(
    "path/to/your/audio.mp3",
    window_size_ms=64.0,    # Analysis window size
    fps=60,                 # Output frame rate
    return_seconds=True     # Return times in seconds
)

# Method 2: Pre-load audio as NumPy array
audio, sr = lb.load("path/to/your/audio.mp3", sr=None)
segments = analyzer.extract_phoneme_segments(
    audio,
    sr,                     # Required when passing NumPy array
    window_size_ms=64.0,
    fps=60,
    return_audio=True       # Include audio chunk in each segment
)

for segment in segments:
    print(f"{segment.start}-{segment.end} | Dominant Phoneme: {segment.dominant_phoneme.name}")

See examples/advanced_usage.py for a complete guide with all configuration options.
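To get a feel for how `window_size_ms` and `fps` interact, the arithmetic below estimates the window length, hop length, and overlap for a given sample rate. It assumes that `window_size_ms` is the length of each analysis window and that `fps` is the number of segments produced per second of audio; the library's actual framing may differ, and `frame_layout` is a hypothetical helper, not a pylipsync function.

```python
def frame_layout(sample_rate, window_size_ms=64.0, fps=60):
    """Estimate window size, hop size, and overlap fraction.

    Assumption: one window of `window_size_ms` is analyzed per output frame,
    and output frames are spaced 1/fps seconds apart.
    """
    window_samples = int(round(sample_rate * window_size_ms / 1000.0))
    hop_samples = sample_rate / fps
    # Fraction of each window shared with the next one (0 when windows do not overlap).
    overlap = max(0.0, 1.0 - hop_samples / window_samples)
    return window_samples, hop_samples, overlap


window, hop, overlap = frame_layout(48000, window_size_ms=64.0, fps=60)
print(f"window={window} samples, hop={hop:.1f} samples, overlap={overlap:.0%}")
```

Under these assumptions, a 64 ms window at 60 fps means consecutive windows share most of their samples, which is consistent with the overlapping-window processing described under How It Works.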

Default Phonemes

The library includes pre-configured phoneme templates for:

  • aa - "A" sounds
  • ee - "E" sounds
  • ih - "I" sounds
  • oh - "O" sounds
  • ou - "U" sounds
  • silence - silence/no speech

These templates are ready to use without any additional setup.

Adding New Phonemes

To add additional phonemes (e.g., consonants like "th", "sh", "f"):

  1. Create a folder containing one subfolder per phoneme, each named after its phoneme (or extend the existing phonemes/audio/ folder):

    phonemes/audio/
    ├── aa/
    ├── ee/
    ├── th/          # New phoneme!
    │   └── th_sound.mp3
    └── sh/          # Another new one!
        └── sh_sound.mp3
    
  2. Add audio samples to each folder (.mp3, .wav, .ogg, .flac, etc.)

  3. Use your custom templates:

    analyzer = PhonemeAnalyzer(
        audio_templates_path="/path/to/my_custom_audio"  # Not necessary if expanding within phonemes/audio/
    )
    

Note: The folder name becomes the phoneme identifier in the output.
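Before pointing the analyzer at a custom folder, it can help to check that every phoneme subfolder actually contains audio. The sketch below walks a directory laid out as shown above and reports the samples it finds. `list_phoneme_samples` and `empty_phonemes` are hypothetical helpers, and the suffix set covers only the extensions named in this section; neither reflects how pylipsync itself scans the folder.

```python
import tempfile
from pathlib import Path

# Assumption: only the extensions listed in this README are checked.
AUDIO_SUFFIXES = {".mp3", ".wav", ".ogg", ".flac"}


def list_phoneme_samples(root):
    """Map each phoneme folder name to the sorted audio file names inside it."""
    layout = {}
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            layout[folder.name] = sorted(
                f.name
                for f in folder.iterdir()
                if f.is_file() and f.suffix.lower() in AUDIO_SUFFIXES
            )
    return layout


def empty_phonemes(layout):
    """Return the phoneme names that have no audio samples yet."""
    return [name for name, files in layout.items() if not files]


# Demo on a throwaway directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for name, files in {"aa": ["aa_1.wav"], "th": ["th_sound.mp3"], "sh": []}.items():
        folder = Path(tmp) / name
        folder.mkdir()
        for file_name in files:
            (folder / file_name).touch()
    layout = list_phoneme_samples(tmp)

print(layout)
print("Missing samples:", empty_phonemes(layout))
```

For a real project, replace the temporary directory with the path you intend to pass as `audio_templates_path`.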

How It Works

  1. Template Loading: The library loads pre-computed MFCC templates from phonemes/template.json
  2. Audio Processing: Input audio is processed in overlapping windows using MFCC extraction
  3. Phoneme Matching: Each segment is compared against all phoneme templates using the selected comparison method
  4. Target Calculation: Returns normalized confidence scores (0-1) for each phoneme per segment
  5. Silence Detection: Segments below the silence threshold have all phoneme targets set to 0
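The matching and normalization steps above can be illustrated with a small, dependency-free sketch. It is not the library's implementation: the feature vectors are toy values rather than real MFCCs, `volume` is a hypothetical loudness measure compared against the silence threshold, and normalizing scores so they sum to 1 is an assumption about what "normalized" means here.

```python
import math


def l1_distance(a, b):
    """Sum of absolute differences (smaller means more similar)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance (smaller means more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (larger means more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def phoneme_targets(features, templates, volume, silence_threshold=0.3):
    """Score one segment against every template using cosine similarity.

    Returns a dict of 0-1 scores per phoneme, or all zeros when the
    (hypothetical) volume measure falls below the silence threshold.
    """
    if volume < silence_threshold:
        return {name: 0.0 for name in templates}
    # Clamp negative similarities to 0 so every score stays within [0, 1].
    raw = {
        name: max(0.0, cosine_similarity(features, template))
        for name, template in templates.items()
    }
    total = sum(raw.values())
    return {name: (score / total if total else 0.0) for name, score in raw.items()}


# Toy two-dimensional "templates"; real MFCC vectors have more coefficients.
templates = {"aa": [1.0, 0.0], "ee": [0.0, 1.0]}

voiced = phoneme_targets([1.0, 0.0], templates, volume=0.8)
quiet = phoneme_targets([1.0, 0.0], templates, volume=0.1)
print(voiced)
print(quiet)
```

The two distance functions are included to show how the L1 and L2 options differ from cosine similarity: distances shrink as vectors get closer, so a distance-based method would need to be inverted before being turned into a confidence score.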

Credits

This is a Python implementation of uLipSync by Hecomi.

License

MIT License - see LICENSE file for details.
