A library for real-time pose estimation.


rtmlib


rtmlib is a super lightweight library for performing pose estimation with RTMPose models, WITHOUT heavy dependencies such as mmcv, mmpose, or mmdet.

Basically, rtmlib requires only these dependencies:

  • numpy
  • opencv-python
  • opencv-contrib-python
  • onnxruntime

Optionally, you can switch among common backends (opencv, onnxruntime, openvino, tensorrt) to accelerate inference.

  • For openvino users: add <your python path>\envs\<your env name>\Lib\site-packages\openvino\libs to your PATH environment variable, as sketched below.
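
If you prefer to set this at runtime instead of editing the system PATH, here is a minimal sketch (the libs location is an assumption based on a typical venv/conda layout on Windows; substitute your own path):

import os
import sys

# Assumed openvino libs location inside the active environment.
openvino_libs = os.path.join(sys.prefix, 'Lib', 'site-packages', 'openvino', 'libs')
os.environ['PATH'] = openvino_libs + os.pathsep + os.environ.get('PATH', '')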

Installation

  • Install from PyPI:
pip install rtmlib -i https://pypi.org/simple
  • Install from source:
git clone https://github.com/Tau-J/rtmlib.git
cd rtmlib

pip install -r requirements.txt

pip install -e .

# [optional]
# pip install onnxruntime-gpu
# pip install openvino

Quick Start

Here is a simple demo showing how to use rtmlib to perform pose estimation on a single image.

import cv2

from rtmlib import Wholebody, draw_skeleton

device = 'cpu'  # cpu, cuda, mps
backend = 'onnxruntime'  # opencv, onnxruntime, openvino
img = cv2.imread('./demo.jpg')

openpose_skeleton = False  # True for openpose-style, False for mmpose-style

wholebody = Wholebody(to_openpose=openpose_skeleton,
                      mode='balanced',  # 'performance', 'lightweight', 'balanced'. Default: 'balanced'
                      backend=backend, device=device)

keypoints, scores = wholebody(img)
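# keypoints: (num_persons, num_keypoints, 2) array of (x, y) coordinates
# scores: (num_persons, num_keypoints) per-keypoint confidences
# (133 keypoints for Wholebody; shapes assumed from the output format, verify on your data)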

# visualize
img_show = img.copy()

# if you want to use a black background instead of the original image:
# import numpy as np
# img_show = np.zeros(img_show.shape, dtype=np.uint8)

img_show = draw_skeleton(img_show, keypoints, scores, kpt_thr=0.5)


cv2.imshow('img', img_show)
cv2.waitKey()
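
A Solution object can be reused across frames, so video is just a loop around the call above. A minimal webcam sketch (plain OpenCV capture code around the same wholebody call; nothing here is rtmlib-specific beyond that):

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    keypoints, scores = wholebody(frame)
    frame = draw_skeleton(frame, keypoints, scores, kpt_thr=0.5)
    cv2.imshow('rtmlib', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()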

WebUI

Run webui.py:

# Please make sure you have installed gradio
# pip install gradio

python webui.py


APIs

For high-level APIs (Solution), you can pass either the mode argument or the det and pose arguments to specify the detector and pose estimator you want to use.

# By mode
wholebody = Wholebody(mode='performance',  # 'performance', 'lightweight', 'balanced'. Default: 'balanced'
                      backend=backend,
                      device=device)

# By det and pose
body = Body(det='https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/yolox_x_8xb8-300e_humanart-a39d44ed.zip',
            det_input_size=(640, 640),
            pose='https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-x_simcc-body7_pt-body7_700e-384x288-71d7b7e9_20230629.zip',
            pose_input_size=(288, 384),
            backend=backend,
            device=device)

For low-level APIs (Model), you can specify the model you want to use by passing the onnx_model argument.

# By onnx_model (.onnx)
pose_model = RTMPose(onnx_model='/path/to/your_model.onnx',  # download link or local path
                     backend=backend, device=device)

# By onnx_model (.zip)
pose_model = RTMPose(onnx_model='https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-m_simcc-body7_pt-body7_420e-256x192-e48f03d0_20230504.zip',  # download link or local path
                     backend=backend, device=device)
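
You can also chain low-level models manually, which is roughly what the Solution classes do internally. The following is a sketch under assumptions: the YOLOX class, the model_input_size argument, and passing detector boxes to RTMPose via bboxes are inferred from how Body composes its parts; check the rtmlib source for the exact signatures.

from rtmlib import YOLOX, RTMPose  # YOLOX export assumed; verify in your version

det_model = YOLOX(onnx_model='/path/to/yolox.onnx',        # placeholder path
                  model_input_size=(640, 640),
                  backend=backend, device=device)
pose_model = RTMPose(onnx_model='/path/to/rtmpose.onnx',   # placeholder path
                     model_input_size=(192, 256),
                     backend=backend, device=device)

bboxes = det_model(img)                             # person boxes (xyxy assumed)
keypoints, scores = pose_model(img, bboxes=bboxes)  # bboxes keyword assumed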

Model Zoo

By default, rtmlib automatically downloads and applies models with the best performance.

More models can be found in RTMPose Model Zoo.

Detectors

Person

Notes:

  • Models trained on HumanArt can detect both real humans and cartoon characters.
  • Models trained on COCO can only detect real humans.

ONNX Model   Input Size   AP (person)   Description
YOLOX-l      640x640      -             trained on COCO
YOLOX-nano   416x416      38.9          trained on HumanArt+COCO
YOLOX-tiny   416x416      47.7          trained on HumanArt+COCO
YOLOX-s      640x640      54.6          trained on HumanArt+COCO
YOLOX-m      640x640      59.1          trained on HumanArt+COCO
YOLOX-l      640x640      60.2          trained on HumanArt+COCO
YOLOX-x      640x640      61.3          trained on HumanArt+COCO

Pose Estimators

Body 17 Keypoints

ONNX Model   Input Size   AP (COCO)   Description
RTMPose-t    256x192      65.9        trained on 7 datasets
RTMPose-s    256x192      69.7        trained on 7 datasets
RTMPose-m    256x192      74.9        trained on 7 datasets
RTMPose-l    256x192      76.7        trained on 7 datasets
RTMPose-l    384x288      78.3        trained on 7 datasets
RTMPose-x    384x288      78.8        trained on 7 datasets
RTMO-s       640x640      68.6        trained on 7 datasets
RTMO-m       640x640      72.6        trained on 7 datasets
RTMO-l       640x640      74.8        trained on 7 datasets

Body 26 Keypoints

ONNX Model   Input Size   AUC (Body8)   Description
RTMPose-t    256x192      66.35         trained on 7 datasets
RTMPose-s    256x192      68.62         trained on 7 datasets
RTMPose-m    256x192      71.91         trained on 7 datasets
RTMPose-l    256x192      73.19         trained on 7 datasets
RTMPose-m    384x288      73.56         trained on 7 datasets
RTMPose-l    384x288      74.38         trained on 7 datasets
RTMPose-x    384x288      74.82         trained on 7 datasets

WholeBody 133 Keypoints

ONNX Model   Input Size   AP (Whole)   Description
DWPose-t     256x192      48.5         trained on COCO-Wholebody+UBody
DWPose-s     256x192      53.8         trained on COCO-Wholebody+UBody
DWPose-m     256x192      60.6         trained on COCO-Wholebody+UBody
DWPose-l     256x192      63.1         trained on COCO-Wholebody+UBody
DWPose-l     384x288      66.5         trained on COCO-Wholebody+UBody
RTMW-m       256x192      58.2         trained on 14 datasets
RTMW-l       256x192      66.0         trained on 14 datasets
RTMW-l       384x288      70.1         trained on 14 datasets
RTMW-x       384x288      70.2         trained on 14 datasets

Visualization

(Side-by-side example images in the project README compare MMPose-style and OpenPose-style skeleton rendering.)
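
To get OpenPose-style output, convert the keypoints in the Solution and draw the matching skeleton. A short sketch; the openpose_skeleton flag on draw_skeleton is an assumption mirroring the to_openpose option shown in the Quick Start, so check the rtmlib docs if it differs:

wholebody = Wholebody(to_openpose=True, backend=backend, device=device)
keypoints, scores = wholebody(img)
img_show = draw_skeleton(img.copy(), keypoints, scores,
                         openpose_skeleton=True, kpt_thr=0.5)  # flag assumed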

Citation

@misc{rtmlib,
  title={rtmlib},
  author={Jiang, Tao},
  year={2023},
  howpublished = {\url{https://github.com/Tau-J/rtmlib}},
}

@misc{jiang2023,
  doi = {10.48550/ARXIV.2303.07399},
  url = {https://arxiv.org/abs/2303.07399},
  author = {Jiang, Tao and Lu, Peng and Zhang, Li and Ma, Ningsheng and Han, Rui and Lyu, Chengqi and Li, Yining and Chen, Kai},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose},
  publisher = {arXiv},
  year = {2023},
  copyright = {Creative Commons Attribution 4.0 International}
}

@misc{lu2023rtmo,
  title={{RTMO}: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation},
  author={Peng Lu and Tao Jiang and Yining Li and Xiangtai Li and Kai Chen and Wenming Yang},
  year={2023},
  eprint={2312.07526},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Acknowledgement

Our code is based on several open-source repos.
