pylipsync
A Python implementation of Hecomi's uLipSync for audio-based lip sync analysis. This library analyzes audio and determines phoneme targets for lip synchronization in real-time applications.
Installation
Install from PyPI
pip install pylipsync
Install from Local Clone
Alternatively, clone the repository and install:
git clone https://github.com/spava002/pyLipSync.git
cd pyLipSync
pip install -e .
Quick Start
The library comes with built-in audio templates for common phonemes, so you can start using it immediately:
from pylipsync import PhonemeAnalyzer, CompareMethod
# Initialize PhonemeAnalyzer - works out of the box with default templates
analyzer = PhonemeAnalyzer(
    compare_method=CompareMethod.COSINE_SIMILARITY  # Options: L1_NORM, L2_NORM, COSINE_SIMILARITY
)
# Method 1: Pass audio file path directly (simplest)
segments = analyzer.extract_phoneme_segments(
    "path/to/your/audio.mp3",
    window_size_ms=64.0,  # Window size in milliseconds
    fps=60,  # Frames per second for output
    return_seconds=True  # Return times in seconds (default: False = sample indices)
)
# Get the most prominent phoneme for each segment
for segment in segments:
    most_prominent_phoneme = segment.most_prominent_phoneme() if not segment.is_silence() else None
    print(f"({segment.start:.4f}-{segment.end:.4f}) | Most Prominent Phoneme: {most_prominent_phoneme}")
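For animation it is often convenient to collapse consecutive segments that share the same most prominent phoneme into longer runs. The following is a minimal sketch and not part of pylipsync: `merge_runs` is a hypothetical helper that works on plain `(start, end, phoneme)` tuples, which could be built from `segment.start`, `segment.end`, and `segment.most_prominent_phoneme()` as in the loop above, with `None` standing in for silence.

```python
from typing import List, Optional, Tuple

# (start, end, phoneme); phoneme=None is used here to mean silence (an assumption)
Run = Tuple[float, float, Optional[str]]


def merge_runs(rows: List[Run]) -> List[Run]:
    """Collapse adjacent rows that carry the same phoneme label."""
    merged: List[Run] = []
    for start, end, phoneme in rows:
        if merged and merged[-1][2] == phoneme:
            # Same label as the previous run: keep its start, extend its end.
            prev_start = merged[-1][0]
            merged[-1] = (prev_start, end, phoneme)
        else:
            merged.append((start, end, phoneme))
    return merged


if __name__ == "__main__":
    rows = [(0.0, 0.1, "aa"), (0.1, 0.2, "aa"), (0.2, 0.3, None), (0.3, 0.4, "oh")]
    for run in merge_runs(rows):
        print(run)
```

Working on tuples rather than segment objects keeps the helper independent of the library's internal types.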
Alternative: Pre-load audio:
import librosa as lb
# Method 2: Load audio first
audio, sr = lb.load("path/to/your/audio.mp3", sr=None)
segments = analyzer.extract_phoneme_segments(
    audio,
    sr,
    window_size_ms=64.0,
    fps=60,
    return_audio=True  # Include audio chunk in each segment (default: False)
)
for segment in segments:
    print(f"Segment audio shape: {segment.audio.shape if segment.audio is not None else 'None'}")
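To get a feel for how `window_size_ms` and `fps` interact, the sketch below computes sample ranges for analysis windows. It assumes one window per output frame, with frame `i` starting at `i * sample_rate / fps` samples and spanning `window_size_ms` of audio. This is an illustrative guess at the framing, not a description of pylipsync's internals, and `window_ranges` is a hypothetical helper.

```python
from typing import List, Tuple


def window_ranges(
    num_samples: int,
    sample_rate: int,
    window_size_ms: float = 64.0,
    fps: int = 60,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices, one window per output frame.

    Assumption: windows that would run past the end of the audio are dropped.
    """
    if sample_rate <= 0 or fps <= 0:
        raise ValueError("sample_rate and fps must be positive")
    window = int(sample_rate * window_size_ms / 1000.0)
    if window <= 0:
        raise ValueError("window_size_ms is too small for this sample rate")

    ranges: List[Tuple[int, int]] = []
    frame = 0
    while True:
        start = round(frame * sample_rate / fps)
        if start + window > num_samples:
            break
        ranges.append((start, start + window))
        frame += 1
    return ranges


if __name__ == "__main__":
    # One second of 48 kHz audio with the Quick Start settings.
    ranges = window_ranges(48000, 48000, window_size_ms=64.0, fps=60)
    print(len(ranges), ranges[:2])
```

Under these assumptions, at 48 kHz a 64 ms window covers 3072 samples while frames start every 800 samples, so consecutive windows overlap.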
Default Phonemes
The library includes pre-configured phoneme templates for:
- aa - "A" sounds
- ee - "E" sounds
- ih - "I" sounds
- oh - "O" sounds
- ou - "U" sounds
- silence - silence/no speech
These templates are ready to use without any additional setup.
Adding New Phonemes
To add additional phonemes (e.g., consonants like "th", "sh", "f"):
- Create a folder containing one subfolder per phoneme name (or extend the existing phonemes/audio/ folder):
  phonemes/audio/
  ├── aa/
  ├── ee/
  ├── th/  # New phoneme!
  │   └── th_sound.mp3
  └── sh/  # Another new one!
      └── sh_sound.mp3
- Add audio samples to each folder (.mp3, .wav, .ogg, .flac, etc.)
- Use your custom templates:
  analyzer = PhonemeAnalyzer(
      audio_templates_path="/path/to/my_custom_audio"  # Not necessary if expanding within phonemes/audio/
  )
Note: The folder name becomes the phoneme identifier in the output.
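Before pointing the analyzer at a custom folder, it can help to check that every phoneme subfolder actually contains audio. The snippet below is a small standalone sketch using only the standard library; `list_phoneme_samples` is a hypothetical helper, and the extension set only covers the formats named above (the library may accept others).

```python
from pathlib import Path
from typing import Dict, List

# Extensions named in the steps above; treated here as an assumption.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg", ".flac"}


def list_phoneme_samples(root: str) -> Dict[str, List[str]]:
    """Map each phoneme folder name to the sorted audio file names inside it."""
    samples: Dict[str, List[str]] = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue  # Only subfolders are treated as phonemes.
        samples[folder.name] = sorted(
            f.name
            for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() in AUDIO_EXTENSIONS
        )
    return samples


if __name__ == "__main__":
    for phoneme, files in list_phoneme_samples("phonemes/audio").items():
        status = ", ".join(files) if files else "no audio samples found"
        print(f"{phoneme}: {status}")
```

An empty list for a phoneme signals a folder that has no usable samples yet.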
How It Works
- Template Loading: The library loads pre-computed MFCC templates from phonemes/template.json
- Audio Processing: Input audio is processed in overlapping windows using MFCC extraction
- Phoneme Matching: Each segment is compared against all phoneme templates using the selected comparison method
- Target Calculation: Returns normalized confidence scores (0-1) for each phoneme per segment
- Silence Detection: Segments below the silence threshold have all phoneme targets set to 0
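The matching, target calculation, and silence steps above can be sketched in plain Python. This is an illustration under stated assumptions, not pylipsync's actual code: feature vectors are plain lists standing in for MFCCs, distances are mapped to scores with `1 / (1 + distance)`, scores are normalized to sum to 1, and silence is decided by comparing a caller-supplied RMS volume with a hypothetical `silence_threshold`.

```python
import math
from typing import Dict, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def phoneme_targets(
    features: Sequence[float],
    templates: Dict[str, Sequence[float]],
    rms: float,
    silence_threshold: float = 0.01,  # hypothetical default
    method: str = "cosine",
) -> Dict[str, float]:
    """Score one window against every template; all zeros when silent."""
    if rms < silence_threshold:
        return {name: 0.0 for name in templates}

    raw: Dict[str, float] = {}
    for name, template in templates.items():
        if method == "cosine":
            raw[name] = max(0.0, cosine_similarity(features, template))
        elif method == "l1":
            raw[name] = 1.0 / (1.0 + l1_distance(features, template))
        elif method == "l2":
            raw[name] = 1.0 / (1.0 + l2_distance(features, template))
        else:
            raise ValueError(f"unknown method: {method}")

    # Assumed normalization: scale scores so they sum to 1.
    total = sum(raw.values())
    return {name: (score / total if total > 0 else 0.0) for name, score in raw.items()}


if __name__ == "__main__":
    templates = {"aa": [1.0, 0.0], "oh": [0.0, 1.0]}
    print(phoneme_targets([1.0, 0.0], templates, rms=0.2))
```

Cosine similarity compares only the direction of the feature vectors, while the L1 and L2 variants also react to differences in magnitude, which is one reason a choice of comparison method can be useful.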
Credits
This is a Python implementation of uLipSync by Hecomi.
License
MIT License - see LICENSE file for details.
File details
Details for the file pylipsync-0.2.0.tar.gz.
File metadata
- Download URL: pylipsync-0.2.0.tar.gz
- Upload date:
- Size: 66.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4d308b85893f78e7bf201caa09a49febf1d6354a04fa8c83ab3ef57e17b2c327 |
| MD5 | e7244ea7e9164c3eb3a9acb7a4aa812a |
| BLAKE2b-256 | 57653276243e73895e1410fc8a7571c0124c6d3cf8c27d870aa146b9555b1242 |
File details
Details for the file pylipsync-0.2.0-py3-none-any.whl.
File metadata
- Download URL: pylipsync-0.2.0-py3-none-any.whl
- Upload date:
- Size: 63.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8670d8e2f77d7f82346cf9659e5354ff2e4c0aaaf80b7b17fbb95e2156f12e86 |
| MD5 | 78b3a06d1091343262a391a01ba1321b |
| BLAKE2b-256 | c88aa4df5d593ab889b65ba27be1dda8ac761fa62d323962645b3f10336be5c5 |