SyncNet: Audio-visual synchronization detection using deep learning. Updated version of https://github.com/joonson/syncnet_python for modern Python versions.
SyncNet Python
Audio-visual synchronization detection using deep learning.
This is an updated version of the original SyncNet implementation by Joon Son Chung, compatible with modern Python versions (3.9+).
Overview
SyncNet Python is a PyTorch implementation of the SyncNet model, which detects audio-visual synchronization in videos. It can identify lip-sync errors by analyzing the correspondence between mouth movements and spoken audio.
Features
- 🎥 Audio-Visual Sync Detection: Accurately detect synchronization between audio and video
- 🔍 Face Detection: Automatic face detection and tracking using S3FD
- 🚀 Batch Processing: Process multiple videos efficiently
- 🐍 Python API: Easy-to-use Python interface
- 📊 Confidence Scores: Get confidence metrics for sync quality
Installation
pip install syncnet-python
Additional Requirements
- FFmpeg: Required for video processing
  # Ubuntu/Debian
  sudo apt-get install ffmpeg
  # macOS
  brew install ffmpeg
- Model Weights: Download pre-trained weights
  - Download sfd_face.pth and syncnet_v2.model
  - Place them in a weights/ directory
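The two requirements above can be checked before building a pipeline. The following is a minimal sketch, not part of the package: the helper name `missing_prerequisites` is hypothetical, and it assumes the weights sit in a local `weights/` directory under the file names listed above.

```python
import shutil
from pathlib import Path

# File names are taken from this README; the helper itself is hypothetical.
EXPECTED_WEIGHTS = ("sfd_face.pth", "syncnet_v2.model")


def missing_prerequisites(weights_dir="weights"):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    for name in EXPECTED_WEIGHTS:
        candidate = Path(weights_dir) / name
        if not candidate.is_file():
            problems.append(f"missing weight file: {candidate}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

Running the script prints one line per missing item and nothing when FFmpeg and both weight files are found.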
Quick Start
from syncnet_python import SyncNetPipeline
# Initialize pipeline
pipeline = SyncNetPipeline(
    s3fd_weights="weights/sfd_face.pth",
    syncnet_weights="weights/syncnet_v2.model",
    device="cuda",  # or "cpu"
)
# Process video
results = pipeline.inference(
    video_path="video.mp4",
    audio_path=None,  # Extract from video
)
# Get results
offset, confidence = results['offset'], results['confidence']
print(f"AV Offset: {offset} frames")
print(f"Confidence: {confidence:.3f}")
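Because the offset is reported in frames, it can help to convert it to milliseconds and to collect results for several videos in one pass. The sketch below uses only the `inference` call and the `offset` and `confidence` keys shown above. The helpers `offset_to_ms` and `summarize` are hypothetical, and the 25 fps default is an assumption: pass the real frame rate of your videos if it differs.

```python
def offset_to_ms(offset_frames, fps=25.0):
    """Convert an offset in frames to milliseconds (fps is an assumption)."""
    return float(offset_frames) * 1000.0 / fps


def summarize(pipeline, video_paths, fps=25.0):
    """Run the pipeline on each video and collect one result row per file.

    `pipeline` is any object exposing inference(video_path=..., audio_path=...)
    that returns a dict with 'offset' and 'confidence', as in the Quick Start.
    """
    rows = []
    for path in video_paths:
        results = pipeline.inference(video_path=path, audio_path=None)
        rows.append(
            {
                "video": path,
                "offset_frames": results["offset"],
                "offset_ms": offset_to_ms(results["offset"], fps),
                "confidence": float(results["confidence"]),
            }
        )
    return rows
```

With the `pipeline` object from the Quick Start, `summarize(pipeline, ["video1.mp4", "video2.mp4"])` would return one dictionary per video.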
Command Line Usage
# Process single video
syncnet-python video.mp4
# Process multiple videos
syncnet-python video1.mp4 video2.mp4 --output results.json
# Use CPU instead of GPU
syncnet-python video.mp4 --device cpu
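The same commands can be driven from a script. This is a rough sketch that assembles the argument list from the `--output` and `--device` flags shown above. The helper names are hypothetical, and combining both flags in one call is an assumption not confirmed by the examples.

```python
import subprocess


def build_command(videos, output=None, device=None):
    """Assemble a syncnet-python command line from the documented flags."""
    cmd = ["syncnet-python", *videos]
    if output is not None:
        cmd += ["--output", output]
    if device is not None:
        cmd += ["--device", device]
    return cmd


def run(videos, output=None, device=None):
    """Invoke the CLI; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(videos, output, device), check=True)
```

For example, `run(["video1.mp4", "video2.mp4"], output="results.json")` mirrors the second command above.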
Requirements
- Python 3.9+
- PyTorch 2.0+
- CUDA (optional but recommended)
- FFmpeg
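Since CUDA is optional, the `device` argument can be chosen at runtime. A minimal sketch, assuming you want to fall back to the CPU whenever PyTorch is missing or reports no usable GPU; `pick_device` and `python_ok` are hypothetical helpers, not part of the package.

```python
import sys


def python_ok(minimum=(3, 9)):
    """True if the running interpreter meets the minimum Python version."""
    return sys.version_info >= minimum


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed; the pipeline would fail later anyway,
        # but the sketch stays importable for diagnostics.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

The returned string can be passed directly as `device=pick_device()` when constructing the pipeline.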
Credits
This package is based on the original SyncNet implementation by Joon Son Chung.
Citation
If you use this code in your research, please cite the original paper:
@inproceedings{chung2016out,
title={Out of time: automated lip sync in the wild},
author={Chung, Joon Son and Zisserman, Andrew},
booktitle={Asian Conference on Computer Vision},
year={2016}
}
License
MIT License - see LICENSE file for details.
File details
Details for the file syncnet_python-0.1.1.tar.gz.
File metadata
- Download URL: syncnet_python-0.1.1.tar.gz
- Upload date:
- Size: 4.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9ad06c28d4d69fa369648eb4ead6077cbaffadde4912a754728b720a8b14e924 |
| MD5 | 2aac6495bdf14fda605f545e2730e260 |
| BLAKE2b-256 | cb9345d1c1d52f9eaeacd9a883b21ba1788f94c822d71bab6d6499508e9c4796 |
File details
Details for the file syncnet_python-0.1.1-py3-none-any.whl.
File metadata
- Download URL: syncnet_python-0.1.1-py3-none-any.whl
- Upload date:
- Size: 23.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3f38b0c91bda7387c887c3c2b0fa11ca4396c262720e8b7636d293a46e409785 |
| MD5 | 35e42e66c47b688ec8de0bace026c6f6 |
| BLAKE2b-256 | c886cf8f0e5817bf47097967173650a0f4639bc03f81af7042a8f9c6d9b71566 |