Skip to main content

High-performance face detection, recognition, and manipulation library with GPU acceleration

Project description

Visagene

Python CUDA License

Visagene is a high-performance face detection, recognition, and manipulation library with GPU acceleration support. It supports both ONNX Runtime and TensorRT for inference, providing features such as face detection, feature extraction, face swapping, and image enhancement.

Key Features

  • Face Detection: High-precision face detection with bounding boxes and keypoints
  • Feature Extraction: Extract face embeddings for recognition and comparison
  • Face Swapping: Natural face replacement from source to target images
  • Image Enhancement: Face quality improvement using GFPGANv1.4
  • Segmentation: Precise segmentation of facial features (eyes, nose, mouth, etc.)
  • Paste Back: Natural blending of processed faces back to original images

Technical Highlights

  • GPU Acceleration: Fast GPU processing using CuPy
  • Flexible Inference: Support for both ONNX Runtime and TensorRT
  • Memory Efficient: Optimized GPU memory usage
  • Type Safe: Data schemas defined with Pydantic
  • Extensible: Easy to add new models by inheriting base classes

Installation

Prerequisites

  • Python 3.12 or higher
  • CUDA 12.x
  • cuDNN 8.x or higher

Install via pip

pip install visagene

Development Setup

# Clone the repository
git clone https://github.com/yourusername/visagene.git
cd visagene

# Install development dependencies
pip install -e ".[dev]"

Usage

Basic Face Detection

import pixtreme as px
import visagene_source as vg

# Load image
image = px.imread("path/to/image.jpg")
image = px.to_float32(image)

# Initialize face detector
detector = vg.OnnxDetector(model_path="models/detection.onnx")

# Detect faces
faces = detector.get(image)

print(f"Detected {len(faces)} faces")
for face in faces:
    print(f"Bounding box: {face.bbox}")
    print(f"Confidence score: {face.score}")

Face Swapping Pipeline

# Initialize models
detector = vg.OnnxDetector(model_path="models/detection.onnx")
extractor = vg.OnnxExtractor(model_path="models/embedding.onnx")
swapper = vg.OnnxSwapper(model_path="models/swap.onnx")
enhancer = vg.OnnxEnhancer(model_path="models/enhance.onnx")

# Load source and target images
source_image = px.imread("source.jpg")
target_image = px.imread("target.jpg")

# Extract source face embedding
source_faces = detector.get(source_image)
source_embedding = extractor.get(source_faces[0])

# Detect target face and swap
target_faces = detector.get(target_image)
swapped_face = swapper.get(target_faces[0].image, source_embedding)

# Enhance face quality
enhanced_face = enhancer.get(swapped_face)

# Paste back to original image
result = vg.paste_back(target_image, enhanced_face, target_faces[0].matrix)

High-Speed Inference with TensorRT

# Use TensorRT versions of models
detector = vg.TrtDetector(model_path="models/detection.trt")
extractor = vg.TrtExtractor(model_path="models/embedding.trt")
swapper = vg.TrtSwapper(model_path="models/swap.trt")
enhancer = vg.TrtEnhancer(model_path="models/enhance.trt")

# Usage is identical
faces = detector.get(image)

Model Architecture

Class Hierarchy

BaseModelLoader
├── BaseDetector
│   ├── OnnxDetector
│   └── TrtDetector
├── BaseExtractor
│   ├── OnnxExtractor
│   └── TrtExtractor
├── BaseSwapper
│   ├── OnnxSwapper
│   └── TrtSwapper
├── BaseEnhancer
│   ├── OnnxEnhancer
│   └── TrtEnhancer
└── BaseSegmentation
    └── OnnxSegmentation

Data Schema

The library uses Pydantic for type-safe data structures:

class VisageneFace(BaseModel):
    bbox: cp.ndarray      # Bounding box (x1, y1, x2, y2)
    score: float          # Detection confidence score
    kps: cp.ndarray       # Facial keypoints
    matrix: cp.ndarray    # Affine transformation matrix
    image: cp.ndarray     # Cropped face image

Dependencies

Core Dependencies

  • numpy: Numerical computing library
  • cupy-cuda12x (>=13.4.1): CUDA-accelerated array library
  • onnxruntime-gpu (>=1.22.0): ONNX inference engine
  • tensorrt-cu12 (>=10.11.0.33): NVIDIA TensorRT inference engine for CUDA 12
  • pixtreme[filter,upscale] (>=0.8.5): High-performance image processing library
  • pydantic: Data validation and schema definition

Development Dependencies

  • black: Code formatter
  • pytest: Testing framework
  • flake8: Linter
  • isort: Import sorter
  • cython: C-extensions for Python
  • build tools: setuptools, wheel, packaging

Model Requirements

The library requires pre-trained ONNX models for operation. The models are not included in the repository.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Acknowledgments

  • ONNX Runtime - High-performance inference engine
  • TensorRT - NVIDIA's high-speed inference library
  • CuPy - GPU-accelerated computing with Python

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

visagene-0.4.3.tar.gz (1.1 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

visagene-0.4.3-py3-none-any.whl (1.1 MB view details)

Uploaded Python 3

File details

Details for the file visagene-0.4.3.tar.gz.

File metadata

  • Download URL: visagene-0.4.3.tar.gz
  • Upload date:
  • Size: 1.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.18

File hashes

Hashes for visagene-0.4.3.tar.gz
Algorithm Hash digest
SHA256 8183f6223f9276452e06c87f7422c637cce2dcd7c853787fe106598ace80a0af
MD5 e887123ca13e23d8ee305f0907f72ac5
BLAKE2b-256 b4bf38c12a0a4a48a11f9fb8dd2a091780590530af484a1ad64c20ad99276a6a

See more details on using hashes here.

File details

Details for the file visagene-0.4.3-py3-none-any.whl.

File metadata

  • Download URL: visagene-0.4.3-py3-none-any.whl
  • Upload date:
  • Size: 1.1 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.18

File hashes

Hashes for visagene-0.4.3-py3-none-any.whl
Algorithm Hash digest
SHA256 97f14b21e3c9f1af002db28eda360c0e02eaef3ae61c723884c2bbcda8ed1a58
MD5 9e4b2b97c6baf2ea5b700ac27caf9084
BLAKE2b-256 db35533f73c087ba2470664e29ccf0905da9de931ca31ebc98a5adb434874fc2

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page