RTMPose series models without mmcv, mmpose, mmdet etc.
rtmlib
rtmlib is a super lightweight library to conduct pose estimation based on RTMPose models WITHOUT any dependencies like mmcv, mmpose, mmdet, etc.
Basically, rtmlib only requires these dependencies:
- numpy
- opencv-python
- opencv-contrib-python
- onnxruntime
Optionally, you can use other common backends such as opencv, onnxruntime, openvino, or tensorrt to accelerate inference.
- For openvino users, please add the path <your python path>\envs\<your env name>\Lib\site-packages\openvino\libs to your PATH environment variable.
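As a quick check of the path above, the following sketch builds the expected directory and tests whether it already appears in a PATH-style string. The helper names, the `C:\conda` root, and the `pose` environment name are placeholders for illustration and are not part of rtmlib.

```python
from pathlib import PureWindowsPath


def openvino_libs_dir(python_root: str, env_name: str) -> PureWindowsPath:
    """Build the openvino libs directory for a given Python root and env name."""
    return (PureWindowsPath(python_root) / "envs" / env_name
            / "Lib" / "site-packages" / "openvino" / "libs")


def is_on_path(directory: PureWindowsPath, path_value: str) -> bool:
    """Return True if `directory` is one of the ';'-separated entries."""
    entries = [PureWindowsPath(p) for p in path_value.split(";") if p]
    return directory in entries


# Placeholder root and environment name, chosen only for this example.
libs = openvino_libs_dir(r"C:\conda", "pose")
print(libs)
print(is_on_path(libs, r"C:\Windows;C:\conda\envs\pose\Lib\site-packages\openvino\libs"))
```

On a real Windows system, pass `os.environ["PATH"]` as `path_value`.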
Installation
- install from pypi:
pip install rtmlib -i https://pypi.org/simple
- install from source code:
git clone https://github.com/Tau-J/rtmlib.git
cd rtmlib
pip install -r requirements.txt
pip install -e .
# [optional]
# pip install onnxruntime-gpu
# pip install openvino
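After installing, it can be useful to see which optional backends are importable in the current environment. The sketch below uses only the standard library; `available_backends` is a hypothetical helper, not an rtmlib function, and it only checks that each module can be found, not that it works with your hardware.

```python
import importlib.util

# Import names of the backend packages (module name, not pip name).
BACKEND_MODULES = {
    "opencv": "cv2",
    "onnxruntime": "onnxruntime",
    "openvino": "openvino",
}


def available_backends() -> dict:
    """Map each backend name to True if its module can be found."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in BACKEND_MODULES.items()
    }


for name, installed in available_backends().items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```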
Quick Start
Here is a simple demo to show how to use rtmlib to conduct pose estimation on a single image.
import cv2
from rtmlib import Wholebody, draw_skeleton
device = 'cpu' # cpu, cuda
backend = 'onnxruntime' # opencv, onnxruntime, openvino
img = cv2.imread('./demo.jpg')
openpose_skeleton = False # True for openpose-style, False for mmpose-style
wholebody = Wholebody(to_openpose=openpose_skeleton,
                      mode='balanced',  # 'performance', 'lightweight', 'balanced'. Default: 'balanced'
                      backend=backend, device=device)
keypoints, scores = wholebody(img)
# visualize
img_show = img.copy()
# if you want to use black background instead of original image,
# img_show = np.zeros(img_show.shape, dtype=np.uint8)  # requires `import numpy as np`
img_show = draw_skeleton(img_show, keypoints, scores, kpt_thr=0.5)
cv2.imshow('img', img_show)
cv2.waitKey()
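To illustrate what a threshold such as `kpt_thr=0.5` does, here is a minimal sketch that keeps only the keypoints whose score exceeds it. It assumes `keypoints` is indexed as [person][joint] giving (x, y) and `scores` as [person][joint]; the helper name `confident_keypoints` and the strict `>` comparison are assumptions for illustration, not rtmlib's implementation. Plain lists stand in for the model output so the example runs without a model.

```python
def confident_keypoints(keypoints, scores, kpt_thr=0.5):
    """Return, per person, a list of (joint_index, x, y) with score > kpt_thr."""
    result = []
    for person_kpts, person_scores in zip(keypoints, scores):
        kept = [
            (j, float(x), float(y))
            for j, ((x, y), s) in enumerate(zip(person_kpts, person_scores))
            if s > kpt_thr
        ]
        result.append(kept)
    return result


# Toy data standing in for the output of `wholebody(img)`: one person, three joints.
keypoints = [[(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)]]
scores = [[0.9, 0.2, 0.7]]
print(confident_keypoints(keypoints, scores))
# -> [[(0, 10.0, 20.0), (2, 50.0, 60.0)]]
```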
WebUI
Run webui.py:
# Please make sure you have installed gradio
# pip install gradio
python webui.py
APIs
- Solutions (High-level APIs)
- Models (Low-level APIs)
- Visualization
For high-level APIs (Solution), you can choose to pass mode or det+pose arguments to specify the detector and pose estimator you want to use.
from rtmlib import Body, Wholebody

# By mode
wholebody = Wholebody(mode='performance',  # 'performance', 'lightweight', 'balanced'. Default: 'balanced'
                      backend=backend,
                      device=device)
# By det and pose
body = Body(det='https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/yolox_x_8xb8-300e_humanart-a39d44ed.zip',
            det_input_size=(640, 640),
            pose='https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-x_simcc-body7_pt-body7_700e-384x288-71d7b7e9_20230629.zip',
            pose_input_size=(288, 384),
            backend=backend,
            device=device)
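The two calling styles above can be kept apart with a small validation helper. This is only a sketch: `solution_kwargs` is a hypothetical function, and the rule that an explicit det+pose pair takes precedence over mode is an assumption made for this example, not documented rtmlib behavior.

```python
MODES = ("performance", "lightweight", "balanced")


def solution_kwargs(mode=None, det=None, pose=None):
    """Return keyword arguments for a Solution: either `mode` or `det` + `pose`."""
    if det is None and pose is None:
        mode = mode or "balanced"  # 'balanced' is the documented default
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        return {"mode": mode}
    if det is None or pose is None:
        raise ValueError("det and pose must be passed together")
    # Assumption: an explicit det + pose pair overrides any mode.
    return {"det": det, "pose": pose}


print(solution_kwargs(mode="performance"))
print(solution_kwargs(det="/path/to/det.onnx", pose="/path/to/pose.onnx"))
```

The resulting dictionary could then be unpacked into a call such as `Body(**kwargs, backend=backend, device=device)`.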
For low-level APIs (Model), you can specify the model you want to use by passing the onnx_model argument.
from rtmlib import RTMPose

# By onnx_model (.onnx)
pose_model = RTMPose(onnx_model='/path/to/your_model.onnx',  # download link or local path
                     backend=backend, device=device)
# By onnx_model (.zip)
pose_model = RTMPose(onnx_model='https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-m_simcc-body7_pt-body7_420e-256x192-e48f03d0_20230504.zip',  # download link or local path
                     backend=backend, device=device)
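Since onnx_model accepts either a download link or a local path, in .onnx or .zip form, it can help to classify a value before passing it on. The sketch below does this with the standard library only; `describe_model_ref` is a hypothetical helper and does not reflect how rtmlib resolves models internally.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def describe_model_ref(onnx_model: str) -> tuple:
    """Return (source, kind): source is 'url' or 'local', kind is 'onnx' or 'zip'."""
    parsed = urlparse(onnx_model)
    is_url = parsed.scheme in ("http", "https")
    # For URLs, look only at the path part so query strings are ignored.
    path = parsed.path if is_url else onnx_model
    suffix = PurePosixPath(path).suffix.lower()
    if suffix not in (".onnx", ".zip"):
        raise ValueError(f"expected a .onnx or .zip model, got {suffix!r}")
    return ("url" if is_url else "local", suffix.lstrip("."))


print(describe_model_ref("/path/to/your_model.onnx"))
print(describe_model_ref(
    "https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/"
    "rtmpose-m_simcc-body7_pt-body7_420e-256x192-e48f03d0_20230504.zip"
))
```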
Model Zoo
By default, rtmlib will automatically download and apply models with the best performance.
More models can be found in RTMPose Model Zoo.
Detectors
Person
Notes:
- Models trained on HumanArt can detect both real humans and cartoon characters.
- Models trained on COCO can only detect real humans.
ONNX Model | Input Size | Description |
---|---|---|
YOLOX-l | 640x640 | trained on COCO val2017 |
YOLOX-nano | 416x416 | trained on HumanArt |
YOLOX-tiny | 416x416 | trained on HumanArt |
YOLOX-s | 640x640 | trained on HumanArt |
YOLOX-m | 640x640 | trained on HumanArt |
YOLOX-l | 640x640 | trained on HumanArt |
YOLOX-x | 640x640 | trained on HumanArt |
Pose Estimators
Body
ONNX Model | Input Size | Description |
---|---|---|
RTMPose-t | 256x192 | Body 17 Keypoints |
RTMPose-s | 256x192 | Body 17 Keypoints |
RTMPose-m | 256x192 | Body 17 Keypoints |
RTMPose-l | 384x288 | Body 17 Keypoints |
RTMPose-x | 384x288 | Body 17 Keypoints |
RTMO-s | 640x640 | Body 17 Keypoints |
RTMO-m | 640x640 | Body 17 Keypoints |
RTMO-l | 640x640 | Body 17 Keypoints |
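Note that the Body example earlier pairs a model named with 384x288 with pose_input_size=(288, 384), so the order in the table is reversed relative to the argument. The sketch below converts a table entry into such a tuple, assuming the table lists height x width and the argument expects (width, height); `input_size_arg` is a hypothetical helper, not part of rtmlib.

```python
def input_size_arg(table_size: str) -> tuple:
    """Convert a table entry such as '384x288' into a (288, 384) tuple."""
    # Assumption: the table gives height x width, the argument wants (width, height).
    first, second = (int(v) for v in table_size.lower().split("x"))
    return (second, first)


print(input_size_arg("384x288"))  # -> (288, 384)
print(input_size_arg("640x640"))  # -> (640, 640)
```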
WholeBody
ONNX Model | Input Size | Description |
---|---|---|
RTMW-m | 256x192 | Wholebody 133 Keypoints |
RTMW-l | 256x192 | Wholebody 133 Keypoints |
RTMW-l | 384x288 | Wholebody 133 Keypoints |
RTMW-x | 384x288 | Wholebody 133 Keypoints |
Visualization
MMPose-style | OpenPose-style |
---|---|
Citation
@misc{rtmlib,
title={rtmlib},
author={Jiang, Tao},
year={2023},
howpublished = {\url{https://github.com/Tau-J/rtmlib}},
}
@misc{jiang2023,
doi = {10.48550/ARXIV.2303.07399},
url = {https://arxiv.org/abs/2303.07399},
author = {Jiang, Tao and Lu, Peng and Zhang, Li and Ma, Ningsheng and Han, Rui and Lyu, Chengqi and Li, Yining and Chen, Kai},
keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose},
publisher = {arXiv},
year = {2023},
copyright = {Creative Commons Attribution 4.0 International}
}
@misc{lu2023rtmo,
title={{RTMO}: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation},
author={Peng Lu and Tao Jiang and Yining Li and Xiangtai Li and Kai Chen and Wenming Yang},
year={2023},
eprint={2312.07526},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Acknowledgement
Our code is based on these repos: