Run Sapiens Human Foundation models in PyTorch

Sapiens-Pytorch-Inference

Minimal code and examples for running inference with the Sapiens human foundation models in PyTorch.

Why

  • Make the models easy to run through a SapiensPredictor class that can run multiple tasks simultaneously.
  • Add several examples for running the models on images, videos, and a webcam in real time.
  • Download models automatically from Hugging Face if they are not available locally.
  • Add a script for ONNX export. However, ONNX inference is not recommended because it is slow.
  • Add object detection so that the model can be run on each detected person. However, this mode is disabled because it produces worse results.

Caution

  • Use the 1B models, since the smaller models are less accurate (especially for segmentation).
  • Exported ONNX models are too slow.
  • Input sizes other than 768x1024 don't produce good results.
  • Running Sapiens models on a cropped person produces worse results, even if you crop a wider rectangle around the person.

Installation

pip install sapiens-inferece

Or, clone this repository:

git clone https://github.com/ibaiGorordo/Sapiens-Pytorch-Inference.git
cd Sapiens-Pytorch-Inference
pip install -r requirements.txt

Usage

import cv2
from imread_from_url import imread_from_url
from sapiens_inference import SapiensPredictor, SapiensConfig, SapiensDepthType, SapiensNormalType

# Load the model
config = SapiensConfig()
config.depth_type = SapiensDepthType.DEPTH_03B  # Disabled by default
config.normal_type = SapiensNormalType.NORMAL_1B  # Disabled by default
predictor = SapiensPredictor(config)

# Load the image
img = imread_from_url("https://github.com/ibaiGorordo/Sapiens-Pytorch-Inference/blob/assets/test2.png?raw=true")

# Estimate the maps
result = predictor(img)

cv2.namedWindow("Combined", cv2.WINDOW_NORMAL)
cv2.imshow("Combined", result)
cv2.waitKey(0)

SapiensPredictor

The SapiensPredictor class runs multiple tasks simultaneously. It has the following methods:

  • SapiensPredictor(config: SapiensConfig) - Load the model with the specified configuration.
  • __call__(img: np.ndarray) -> np.ndarray - Estimate the maps for the input image.

SapiensConfig

The SapiensConfig class configures the models. It has the following attributes:

  • dtype: torch.dtype - Data type to use. Default: torch.float32.
  • device: torch.device - Device to use. Default: cuda if available, otherwise cpu.
  • depth_type: SapiensDepthType - Depth model to use. Options: OFF, DEPTH_03B, DEPTH_06B, DEPTH_1B, DEPTH_2B. Default: OFF.
  • normal_type: SapiensNormalType - Normal model to use. Options: OFF, NORMAL_03B, NORMAL_06B, NORMAL_1B, NORMAL_2B. Default: OFF.
  • segmentation_type: SapiensSegmentationType - Segmentation model to use (Always enabled for the mask). Options: SEGMENTATION_03B, SEGMENTATION_06B, SEGMENTATION_1B. Default: SEGMENTATION_1B.
  • detector_config: DetectorConfig - Configuration for the object detector. Default: {model_path: str = "models/yolov8m.pt", person_id: int = 0, confidence: float = 0.25}. Disabled because it produces worse results.
  • minimum_person_height: float - Minimum height ratio of the person to detect. Default: 0.5 (50%). Not used if the object detector is disabled.
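
The attributes above can be set on a SapiensConfig instance before it is passed to SapiensPredictor, in the same way the Usage section sets depth_type and normal_type. The following is a minimal sketch, not part of the package: build_config and its use_gpu parameter are hypothetical names, and the sketch assumes that device and dtype can be assigned after construction like the other attributes.

```python
def build_config(use_gpu: bool = True):
    """Hypothetical helper: return a SapiensConfig with depth and normals enabled."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be installed.
    import torch
    from sapiens_inference import SapiensConfig, SapiensDepthType, SapiensNormalType

    config = SapiensConfig()

    # Depth and normal estimation are OFF by default; the 1B models are
    # the ones recommended in the caution notes.
    config.depth_type = SapiensDepthType.DEPTH_1B
    config.normal_type = SapiensNormalType.NORMAL_1B

    # Assumption: device and dtype are plain attributes that can be
    # overridden. The documented defaults are cuda (if available) and
    # torch.float32.
    if not use_gpu or not torch.cuda.is_available():
        config.device = torch.device("cpu")
        config.dtype = torch.float32

    return config


# Example (requires the package and model weights):
#   from sapiens_inference import SapiensPredictor
#   predictor = SapiensPredictor(build_config())
#   result = predictor(img)
```

The segmentation model is left at its default (SEGMENTATION_1B), since it is always enabled for the mask.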

Examples

  • Image Sapiens Predictor (Normal, Depth, Segmentation):
python image_predictor.py
  • Video Sapiens Predictor (Normal, Depth, Segmentation):
python video_predictor.py
  • Webcam Sapiens Predictor (Normal, Depth, Segmentation):
python webcam_predictor.py
  • Image Normal Estimation:
python image_normal_estimation.py
  • Image Human Part Segmentation:
python image_segmentation.py
  • Video Normal Estimation:
python video_normal_estimation.py
  • Video Human Part Segmentation:
python video_segmentation.py
  • Webcam Normal Estimation:
python webcam_normal_estimation.py
  • Webcam Human Part Segmentation:
python webcam_segmentation.py

Export to ONNX

To export the model to ONNX, run the following script:

python export_onnx.py seg03b

The available models are seg03b, seg06b, seg1b, depth03b, depth06b, depth1b, depth2b, normal03b, normal06b, normal1b, normal2b.
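
When exporting several models, it can help to validate the model name before invoking the script. The sketch below only builds the command line shown above from the documented list of names; AVAILABLE_MODELS, export_command, and export_commands are hypothetical helpers, not part of the repository, and the sketch assumes export_onnx.py is in the current working directory.

```python
import sys

# Model names accepted by export_onnx.py, as listed in this README.
AVAILABLE_MODELS = (
    "seg03b", "seg06b", "seg1b",
    "depth03b", "depth06b", "depth1b", "depth2b",
    "normal03b", "normal06b", "normal1b", "normal2b",
)


def export_command(model_name: str) -> list[str]:
    """Return the argument list for exporting one model to ONNX."""
    if model_name not in AVAILABLE_MODELS:
        raise ValueError(
            f"Unknown model {model_name!r}; expected one of {AVAILABLE_MODELS}"
        )
    # Use the running interpreter so the active environment is reused.
    return [sys.executable, "export_onnx.py", model_name]


def export_commands(prefix: str = "") -> list[list[str]]:
    """Return export commands for every model whose name starts with prefix."""
    return [export_command(name) for name in AVAILABLE_MODELS if name.startswith(prefix)]


# Example: run the export for all depth models from the repository root.
#   import subprocess
#   for cmd in export_commands("depth"):
#       subprocess.run(cmd, check=True)
```

Passing the commands to subprocess.run with check=True stops the loop at the first failed export instead of continuing silently.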

Original Models

The original models are available on Hugging Face: https://huggingface.co/facebook/sapiens/tree/main/sapiens_lite_host
