pylipsync

A Python implementation of Hecomi's uLipSync for audio-based lip sync analysis. This library analyzes audio and determines phoneme targets for lip synchronization in real-time applications.

Installation

Install from PyPI

pip install pylipsync

Install from Local Clone

Alternatively, clone the repository and install:

git clone https://github.com/spava002/pyLipSync.git
cd pyLipSync
pip install -e .

Quick Start

The library comes with built-in audio templates for common phonemes, so you can start using it immediately:

from pylipsync import PhonemeAnalyzer, CompareMethod

# Initialize PhonemeAnalyzer - works out of the box with default templates
analyzer = PhonemeAnalyzer(
    compare_method=CompareMethod.COSINE_SIMILARITY  # Options: L1_NORM, L2_NORM, COSINE_SIMILARITY
)

# Method 1: Pass audio file path directly (simplest)
segments = analyzer.extract_phoneme_segments(
    "path/to/your/audio.mp3",
    window_size_ms=64.0,    # Window size in milliseconds
    fps=60,                 # Frames per second for output
    return_seconds=True     # Return times in seconds (default: False = sample indices)
)

# Get the most prominent phoneme for each segment
for segment in segments:
    most_prominent_phoneme = segment.most_prominent_phoneme() if not segment.is_silence() else None
    print(f"({segment.start:.4f}-{segment.end:.4f}) | Most Prominent Phoneme: {most_prominent_phoneme}")
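For animation it is often more convenient to work with runs of identical phonemes than with individual segments. The following is a minimal sketch of that post-processing step, not part of pylipsync itself. `FakeSegment` is a hypothetical stand-in that exposes only the attributes used in the loop above (`start`, `end`, `is_silence()`, `most_prominent_phoneme()`), so the example runs without the library or an audio file; `collapse_runs` is likewise an illustrative name.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FakeSegment:
    """Hypothetical stand-in exposing only the fields used in the loop above."""
    start: float
    end: float
    phoneme: Optional[str]  # None models a silent segment

    def is_silence(self) -> bool:
        return self.phoneme is None

    def most_prominent_phoneme(self) -> Optional[str]:
        return self.phoneme


def collapse_runs(segments) -> List[Tuple[float, float, Optional[str]]]:
    """Merge adjacent segments that share the same most prominent phoneme."""
    runs = []
    for seg in segments:
        label = None if seg.is_silence() else seg.most_prominent_phoneme()
        if runs and runs[-1][2] == label:
            # Extend the previous run instead of starting a new one.
            runs[-1] = (runs[-1][0], seg.end, label)
        else:
            runs.append((seg.start, seg.end, label))
    return runs


demo = [
    FakeSegment(0.00, 0.25, None),
    FakeSegment(0.25, 0.50, "aa"),
    FakeSegment(0.50, 0.75, "aa"),
    FakeSegment(0.75, 1.00, "oh"),
]
runs = collapse_runs(demo)
for start, end, label in runs:
    print(f"{start:.2f}-{end:.2f}: {label}")
```

Passing the real `segments` list to `collapse_runs` should work the same way, provided the segment objects behave as shown in the Quick Start loop.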

Alternatively, load the audio yourself and pass the samples and sample rate:

import librosa as lb

# Method 2: Load audio first
audio, sr = lb.load("path/to/your/audio.mp3", sr=None)

segments = analyzer.extract_phoneme_segments(
    audio,
    sr,
    window_size_ms=64.0,
    fps=60,
    return_audio=True       # Include audio chunk in each segment (default: False)
)

for segment in segments:
    print(f"Segment audio shape: {segment.audio.shape if segment.audio is not None else 'None'}")
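To get a feel for what `window_size_ms` and `fps` mean in samples, the arithmetic can be sketched as follows. This assumes the window length is `sample_rate * window_size_ms / 1000` and that one analysis window is produced every `sample_rate / fps` samples; the library's internal rounding may differ, and `window_layout` is a hypothetical helper, not a pylipsync function.

```python
def window_layout(sample_rate: int, window_size_ms: float, fps: int):
    """Return (window length, hop length) in samples under the stated assumptions."""
    window_samples = int(round(sample_rate * window_size_ms / 1000.0))
    hop_samples = int(round(sample_rate / fps))
    return window_samples, hop_samples


# 64 ms windows at 60 frames per second on 48 kHz audio
window, hop = window_layout(48000, 64.0, 60)
overlap = window - hop  # samples shared by two consecutive windows
print(f"window={window} hop={hop} overlap={overlap}")
```

Under these assumptions, consecutive windows overlap whenever `window_size_ms` is larger than `1000 / fps` (about 16.7 ms at 60 fps).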

Default Phonemes

The library includes pre-configured phoneme templates for:

  • aa - "A" sounds
  • ee - "E" sounds
  • ih - "I" sounds
  • oh - "O" sounds
  • ou - "U" sounds
  • silence - silence/no speech

These templates are ready to use without any additional setup.
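In a typical lip sync setup, each phoneme identifier is mapped to a mouth shape on the character rig. The sketch below shows one way to do that for the default vowel phonemes. The blend-shape names (`mouth_A` and so on) and the `shape_weights` helper are hypothetical and not part of pylipsync; substitute whatever your rig exposes.

```python
# Default vowel identifiers listed above; "silence" is handled as "no shape active".
DEFAULT_PHONEMES = ("aa", "ee", "ih", "oh", "ou")

# Hypothetical blend-shape names; replace them with the ones your rig uses.
PHONEME_TO_SHAPE = {
    "aa": "mouth_A",
    "ee": "mouth_E",
    "ih": "mouth_I",
    "oh": "mouth_O",
    "ou": "mouth_U",
}


def shape_weights(phoneme):
    """Return 1.0 for the shape matching `phoneme` and 0.0 for every other shape."""
    weights = {shape: 0.0 for shape in PHONEME_TO_SHAPE.values()}
    if phoneme in PHONEME_TO_SHAPE:
        weights[PHONEME_TO_SHAPE[phoneme]] = 1.0
    return weights


print(shape_weights("oh"))
print(shape_weights(None))  # silence or unknown phoneme: all shapes at rest
```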

Adding New Phonemes

To add more phonemes (e.g., consonants such as "th", "sh", or "f"):

  1. Create a folder containing one subfolder per phoneme, each named after its phoneme (or extend the existing phonemes/audio/ folder):

    phonemes/audio/
    ├── aa/
    ├── ee/
    ├── th/          # New phoneme!
    │   └── th_sound.mp3
    └── sh/          # Another new one!
        └── sh_sound.mp3
    
  2. Add audio samples to each folder (.mp3, .wav, .ogg, .flac, etc.)

  3. Use your custom templates:

    analyzer = PhonemeAnalyzer(
        audio_templates_path="/path/to/my_custom_audio"  # Not necessary if expanding within phonemes/audio/
    )
    

Note: The folder name becomes the phoneme identifier in the output.
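Before pointing the analyzer at a custom folder, it can help to check which phoneme identifiers the layout will produce and whether any folder is missing samples. The following is a minimal sketch using only the standard library. `list_phoneme_samples` is a hypothetical helper, the suffix set covers only the four extensions named above, and how pylipsync itself treats empty folders is not something this example assumes.

```python
import tempfile
from pathlib import Path

# Only the extensions named explicitly above; extend as needed.
AUDIO_SUFFIXES = {".mp3", ".wav", ".ogg", ".flac"}


def list_phoneme_samples(root):
    """Map each subfolder name to the sorted audio file names it contains."""
    samples = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue
        samples[folder.name] = sorted(
            f.name
            for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() in AUDIO_SUFFIXES
        )
    return samples


# Demo on a throwaway directory that mimics the layout shown above.
with tempfile.TemporaryDirectory() as tmp:
    layout = [("aa", "aa_sound.wav"), ("th", "th_sound.mp3"), ("sh", None)]
    for name, filename in layout:
        folder = Path(tmp) / name
        folder.mkdir()
        if filename:
            (folder / filename).touch()

    found = list_phoneme_samples(tmp)
    empty = [name for name, files in found.items() if not files]
    print(found)
    print("Folders without samples:", empty)
```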

How It Works

  1. Template Loading: The library loads pre-computed MFCC templates from phonemes/template.json
  2. Audio Processing: Input audio is processed in overlapping windows using MFCC extraction
  3. Phoneme Matching: Each segment is compared against all phoneme templates using the selected comparison method
  4. Target Calculation: Returns normalized confidence scores (0-1) for each phoneme per segment
  5. Silence Detection: Segments below the silence threshold have all phoneme targets set to 0
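
Steps 3 to 5 can be illustrated with a small pure-Python sketch. It is not the library's implementation: the three functions mirror the `L1_NORM`, `L2_NORM`, and `COSINE_SIMILARITY` options by name only, and the normalization (dividing clamped cosine scores by their sum), the `volume` argument, and the `silence_threshold` default are all assumptions made for illustration. Short lists of floats stand in for MFCC vectors.

```python
import math


def l1_distance(a, b):
    """Sum of absolute differences (smaller means more similar)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance (smaller means more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (larger means more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def phoneme_targets(features, templates, volume, silence_threshold=0.01):
    """Score features against every template; return all zeros for silence."""
    if volume < silence_threshold:
        return {name: 0.0 for name in templates}
    # Clamp negative similarities to zero so every score stays in [0, 1].
    scores = {
        name: max(cosine_similarity(features, template), 0.0)
        for name, template in templates.items()
    }
    total = sum(scores.values())
    if total == 0.0:
        return scores
    # Assumed normalization: scale the scores so they sum to 1.
    return {name: score / total for name, score in scores.items()}


# Toy two-dimensional "templates" standing in for MFCC vectors.
templates = {"aa": [1.0, 0.0], "ee": [0.0, 1.0]}
loud = phoneme_targets([3.0, 0.0], templates, volume=0.5)
quiet = phoneme_targets([3.0, 0.0], templates, volume=0.001)
print(loud)
print(quiet)
```

Note that the two distance measures and the similarity measure point in opposite directions, so a real implementation has to convert distances into scores before normalizing; that conversion is left out here.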

Credits

This is a Python implementation of uLipSync by Hecomi.

License

MIT License - see LICENSE file for details.
