SyncNet: Audio-visual synchronization detection using deep learning

SyncNet Python

Audio-visual synchronization detection using deep learning.

Overview

SyncNet Python is a PyTorch implementation of the SyncNet model, which detects audio-visual synchronization in videos. It can identify lip-sync errors by analyzing the correspondence between mouth movements and spoken audio.

Features

  • 🎥 Audio-Visual Sync Detection: Estimate the offset, in frames, between the audio and video tracks
  • 🔍 Face Detection: Automatic face detection and tracking using S3FD
  • 🚀 Batch Processing: Process multiple videos in a single run
  • 🐍 Python API: Easy-to-use Python interface
  • 📊 Confidence Scores: Get a confidence score for each sync estimate

Installation

pip install syncnet-python

Additional Requirements

  1. FFmpeg: Required for video processing

    # Ubuntu/Debian
    sudo apt-get install ffmpeg
    
    # macOS
    brew install ffmpeg
    
  2. Model Weights: Download the pre-trained weights

    • Download sfd_face.pth (S3FD face detector) and syncnet_v2.model (SyncNet)
    • Place both files in a weights/ directory
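
Before running the pipeline, it can help to confirm that both requirements above are in place. The following is a minimal sketch using only the standard library, not part of the package: the helper name `missing_prerequisites` is hypothetical, and it assumes the weights live in the `weights/` directory suggested above.

```python
import shutil
from pathlib import Path

# File names are taken from the list above.
REQUIRED_WEIGHTS = ("sfd_face.pth", "syncnet_v2.model")


def missing_prerequisites(weights_dir="weights"):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration, not part of syncnet-python.
    """
    problems = []
    # FFmpeg must be discoverable on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # Both weight files must exist in the chosen directory.
    for name in REQUIRED_WEIGHTS:
        candidate = Path(weights_dir) / name
        if not candidate.is_file():
            problems.append(f"missing weight file: {candidate}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

An empty result only shows that the files exist; it does not verify that the weights are valid or complete.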

Quick Start

from syncnet_python import SyncNetPipeline

# Initialize pipeline
pipeline = SyncNetPipeline(
    s3fd_weights="weights/sfd_face.pth",
    syncnet_weights="weights/syncnet_v2.model",
    device="cuda"  # or "cpu"
)

# Process video
results = pipeline.inference(
    video_path="video.mp4",
    audio_path=None  # Extract from video
)

# Get results
offset, confidence = results['offset'], results['confidence']
print(f"AV Offset: {offset} frames")
print(f"Confidence: {confidence:.3f}")
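
The offset above is reported in frames, so converting it to milliseconds requires a frame rate. The sketch below loops over several videos using the same `inference` call and result keys shown in Quick Start. The helper name `offsets_in_ms` and the default of 25 fps are assumptions for illustration; pass the frame rate your videos are actually processed at.

```python
def offsets_in_ms(pipeline, video_paths, fps=25.0):
    """Run inference on each video and report the offset in frames and milliseconds.

    `pipeline` is any object exposing inference(video_path=..., audio_path=...)
    as in Quick Start. `fps` is an assumed frame rate, not a value read from
    the package or from the video file.
    """
    report = {}
    for path in video_paths:
        results = pipeline.inference(video_path=path, audio_path=None)
        report[path] = {
            "offset_frames": results["offset"],
            # One frame lasts 1000 / fps milliseconds.
            "offset_ms": results["offset"] * 1000.0 / fps,
            "confidence": results["confidence"],
        }
    return report
```

With the `pipeline` object from Quick Start, `offsets_in_ms(pipeline, ["video1.mp4", "video2.mp4"])` would return one entry per path.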

Command Line Usage

# Process single video
syncnet-python video.mp4

# Process multiple videos
syncnet-python video1.mp4 video2.mp4 --output results.json

# Use CPU instead of GPU
syncnet-python video.mp4 --device cpu
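
The same commands can be driven from a script. This is a rough sketch built on the standard `subprocess` module; it uses only the `--output` and `--device` flags shown above, and it assumes (without confirmation from the source) that the two flags can be combined in one invocation. The helper names `build_command` and `run_cli` are hypothetical.

```python
import subprocess


def build_command(videos, output=None, device=None):
    """Assemble the argument list for the syncnet-python CLI."""
    cmd = ["syncnet-python", *videos]
    if output is not None:
        cmd += ["--output", output]
    if device is not None:
        cmd += ["--device", device]
    return cmd


def run_cli(videos, output=None, device=None):
    """Run the CLI; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_command(videos, output, device), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.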

Requirements

  • Python 3.9+
  • PyTorch 2.0+
  • CUDA (optional but recommended)
  • FFmpeg
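
Since CUDA is optional, the `device` argument can be chosen at runtime instead of being hard-coded. The following is a minimal sketch, assuming a standard PyTorch install; `pick_device` and `python_is_supported` are hypothetical helper names, not part of the package.

```python
import sys


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to run on a GPU; fall back to "cpu".
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def python_is_supported(version_info=sys.version_info):
    """Check the Python 3.9+ requirement listed above."""
    return tuple(version_info[:2]) >= (3, 9)
```

The returned string can be passed as `device=pick_device()` when constructing the pipeline.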

Citation

If you use this code in your research, please cite:

@inproceedings{chung2016out,
  title={Out of time: automated lip sync in the wild},
  author={Chung, Joon Son and Zisserman, Andrew},
  booktitle={Asian Conference on Computer Vision},
  year={2016}
}

License

MIT License - see LICENSE file for details.
