pylipsync

A Python implementation of Hecomi's uLipSync for audio-based lip sync analysis. This library analyzes audio and determines phoneme targets for lip synchronization in real-time applications.

Installation

Install from PyPI

pip install pylipsync

Install from GitHub

You can install directly from the GitHub repository:

pip install git+https://github.com/spava002/pyLipSync.git

Install from Local Clone

Alternatively, clone the repository and install:

git clone https://github.com/spava002/pyLipSync.git
cd pyLipSync
pip install -e .

Quick Start

The library comes with built-in audio templates for common phonemes, so you can start using it immediately:

import librosa as lb
from pylipsync import LipSync, CompareMethod

# Initialize LipSync - works out of the box with default templates
lipsync = LipSync(
    compare_method=CompareMethod.COSINE_SIMILARITY  # Options: L1_NORM, L2_NORM, COSINE_SIMILARITY
)

# Load your audio file
audio, sr = lb.load("path/to/your/audio.mp3", sr=None)

# Process audio and get phoneme segments
segments = lipsync.process_audio_segments(
    audio,
    sr,
    window_size_ms=64.0,  # Window size in milliseconds
    fps=60                # Frames per second for output
)

# Get the most prominent phoneme for each segment
for segment in segments:
    most_prominent_phoneme = segment.most_prominent_phoneme() if not segment.is_silence() else None
    print(f"({segment.start_time:.4f}-{segment.end_time:.4f})s | Most Prominent Phoneme: {most_prominent_phoneme}")
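
At 60 fps the loop above prints one line per frame, so the same phoneme often repeats across many consecutive segments. A minimal sketch of collapsing those repeats into longer runs is shown below. The `merge_runs` helper and the sample `frames` data are hypothetical and not part of pylipsync; the helper works on plain `(start, end, label)` tuples so it can be run on its own.

```python
from itertools import groupby


def merge_runs(labelled_segments):
    """Collapse consecutive (start, end, label) tuples that share a label."""
    merged = []
    for label, group in groupby(labelled_segments, key=lambda seg: seg[2]):
        group = list(group)
        # A run starts where its first segment starts and ends where its last one ends.
        merged.append((group[0][0], group[-1][1], label))
    return merged


# Hypothetical per-frame results; None stands for a silent segment.
frames = [
    (0.0000, 0.0167, None),
    (0.0167, 0.0333, "aa"),
    (0.0333, 0.0500, "aa"),
    (0.0500, 0.0667, "oh"),
]

runs = merge_runs(frames)
for start, end, label in runs:
    print(f"({start:.4f}-{end:.4f})s | {label}")
```

With real output, the tuples could be built inside the loop above from `segment.start_time`, `segment.end_time`, and the `most_prominent_phoneme` value.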

Default Phonemes

The library includes pre-configured phoneme templates for:

  • aa - "A" sounds
  • ee - "E" sounds
  • ih - "I" sounds
  • oh - "O" sounds
  • ou - "U" sounds
  • silence - silence/no speech

These templates are ready to use without any additional setup.

Adding New Phonemes

To add more phonemes (e.g., consonants like "th", "sh", "f"):

  1. Create a folder containing one subfolder per phoneme, each named after its phoneme (or extend the existing audio/ folder):

    audio/
    ├── aa/
    ├── ee/
    ├── th/          # New phoneme!
    │   └── th_sound.mp3
    └── sh/          # Another new one!
        └── sh_sound.mp3
    
  2. Add audio samples to each folder (.mp3, .wav, .ogg, .flac, etc.)

  3. Use your custom templates:

    lipsync = LipSync(
        audio_templates_path="/path/to/my_custom_audio" # Not necessary if expanding within the audio/ folder
    )
    

Note: The folder name becomes the phoneme identifier in the output.
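
Before pointing `audio_templates_path` at a custom folder, it can help to check which phoneme names and samples the layout would yield. The sketch below only walks the directory tree with `pathlib`; it is not how pylipsync itself loads templates, and the `list_phoneme_samples` helper, the `AUDIO_SUFFIXES` set, and the temporary example tree are all hypothetical.

```python
import tempfile
from pathlib import Path

# Assumed set of extensions for this check; extend it to match your samples.
AUDIO_SUFFIXES = {".mp3", ".wav", ".ogg", ".flac"}


def list_phoneme_samples(templates_root):
    """Map each subfolder name (the phoneme identifier) to its audio file names."""
    root = Path(templates_root)
    samples = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        samples[folder.name] = sorted(
            f.name
            for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() in AUDIO_SUFFIXES
        )
    return samples


# Build a throwaway tree that mirrors the layout shown in step 1.
with tempfile.TemporaryDirectory() as tmp:
    for phoneme, filename in [("aa", "aa_sound.wav"), ("th", "th_sound.mp3")]:
        folder = Path(tmp) / phoneme
        folder.mkdir()
        (folder / filename).touch()
    (Path(tmp) / "th" / "notes.txt").touch()  # non-audio files are ignored
    found = list_phoneme_samples(tmp)

print(found)
```

An empty list for a phoneme is a hint that its folder contains no recognised audio samples.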

How It Works

  1. Template Loading: The library loads pre-computed MFCC templates from data/phonemes.json
  2. Audio Processing: Input audio is processed in overlapping windows using MFCC extraction
  3. Phoneme Matching: Each segment is compared against all phoneme templates using the selected comparison method
  4. Target Calculation: Returns normalized confidence scores (0-1) for each phoneme per segment
  5. Silence Detection: Segments below the silence threshold have all phoneme targets set to 0
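
Steps 2 to 5 above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function names, the `volume` input, the `silence_threshold` default, the hop of `sample_rate / fps` samples, and the choice to normalize scores so they sum to 1 are all hypothetical. Only cosine similarity is shown; the L1 and L2 options would need a distance-to-score conversion that is not covered here.

```python
import math


def frame_windows(num_samples, sample_rate, window_size_ms, fps):
    """Return (start, end) sample indices, one window per output frame."""
    window = int(sample_rate * window_size_ms / 1000)
    hop = sample_rate / fps  # assumed: one window starts per output frame
    bounds = []
    for i in range(int(num_samples / hop)):
        start = int(i * hop)
        bounds.append((start, min(start + window, num_samples)))
    return bounds


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def phoneme_targets(features, templates, volume, silence_threshold=0.01):
    """Score one segment against every template and normalize to 0-1."""
    if volume < silence_threshold:
        # Silent segment: every phoneme target is set to 0.
        return {name: 0.0 for name in templates}
    scores = {
        name: max(cosine_similarity(features, template), 0.0)
        for name, template in templates.items()
    }
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in templates}
    return {name: score / total for name, score in scores.items()}


# One second of 48 kHz audio, 64 ms windows, 60 output frames per second.
windows = frame_windows(48000, 48000, 64.0, 60)

# Toy 2-dimensional "MFCC" templates, for illustration only.
templates = {"aa": [1.0, 0.0], "ee": [0.0, 1.0]}
voiced = phoneme_targets([1.0, 1.0], templates, volume=0.5)
silent = phoneme_targets([1.0, 1.0], templates, volume=0.0)
```

In this example each window spans 3072 samples while consecutive windows start 800 samples apart, which is where the overlap in step 2 comes from.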

Credits

This is a Python implementation of uLipSync by Hecomi.

License

MIT License - see LICENSE file for details.
