# rtmlib
rtmlib is a super lightweight library to conduct pose estimation based on RTMPose models WITHOUT any dependencies like mmcv, mmpose, mmdet, etc.
Basically, rtmlib only requires these dependencies:
- numpy
- opencv-python
- opencv-contrib-python
- onnxruntime
Optionally, you can accelerate inference with other common backends: OpenCV, ONNXRuntime, OpenVINO, or TensorRT.
- For OpenVINO users, please add

  `<your python path>\envs\<your env name>\Lib\site-packages\openvino\libs`

  to your environment `PATH`.
## Installation

- Install from PyPI:

  ```shell
  pip install rtmlib -i https://pypi.org/simple
  ```

- Install from source:

  ```shell
  git clone https://github.com/Tau-J/rtmlib.git
  cd rtmlib
  pip install -r requirements.txt
  pip install -e .

  # [optional]
  # pip install onnxruntime-gpu
  # pip install openvino
  ```
## Quick Start

Run `webui.py`:

```shell
# Please make sure you have installed gradio
# pip install gradio

python webui.py
```
Here is a simple demo showing how to use rtmlib for pose estimation on a single image:

```python
import cv2

from rtmlib import Wholebody, draw_skeleton

device = 'cpu'  # cpu, cuda
backend = 'onnxruntime'  # opencv, onnxruntime, openvino

img = cv2.imread('./demo.jpg')

openpose_skeleton = False  # True for openpose-style, False for mmpose-style

wholebody = Wholebody(to_openpose=openpose_skeleton,
                      mode='balanced',  # 'performance', 'lightweight', 'balanced'. Default: 'balanced'
                      backend=backend, device=device)

keypoints, scores = wholebody(img)

# visualize
img_show = img.copy()

# if you want to use black background instead of the original image,
# img_show = np.zeros(img_show.shape, dtype=np.uint8)

img_show = draw_skeleton(img_show, keypoints, scores, kpt_thr=0.5)

cv2.imshow('img', img_show)
cv2.waitKey()
```
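For intuition, here is a hedged NumPy sketch (not rtmlib's actual drawing code) of what the `kpt_thr` threshold does: keypoints whose score falls below it are simply skipped when the skeleton is drawn. The shapes follow the demo above, with `keypoints` of shape `(num_person, num_kpt, 2)` and `scores` of shape `(num_person, num_kpt)`:

```python
import numpy as np

# Toy outputs: one person, three keypoints with (x, y) coordinates.
keypoints = np.array([[[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]]])
scores = np.array([[0.9, 0.3, 0.7]])

kpt_thr = 0.5
visible = scores >= kpt_thr   # boolean mask over keypoints, shape (1, 3)
drawn = keypoints[visible]    # only confident keypoints survive, shape (2, 2)
print(drawn)
```

Raising `kpt_thr` therefore trades completeness of the skeleton for fewer spurious joints.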
## APIs

- Solutions (high-level APIs)
- Models (low-level APIs)
- Visualization
## TODO
- Support MMPose-style skeleton visualization
- Support OpenPose-style skeleton visualization
- Support WholeBody
- Support Hand
- Support Face
- Support Animal
- Support ONNXRuntime backend
- Support auto download and cache models
- Lightweight models
- Support 3 modes (`performance`, `lightweight`, `balanced`) to select from
- Support alias to choose model
- Support naive PoseTracker
- Support OpenVINO backend
- Support TensorRT backend
- Gradio interface
- Compatible with ControlNet
- Support RTMO
## Model Zoo

By default, rtmlib automatically downloads and uses the model with the best performance, but you can also specify the model you want by passing the `onnx_model` argument.

More models can be found in the RTMPose Model Zoo.
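As an illustrative sketch (the checkpoint filename below is a placeholder, not a real download link), selecting a specific model via the `onnx_model` argument described above might look like this:

```python
# Hypothetical configuration: the filename stands in for a model file
# obtained from the RTMPose Model Zoo (local path or URL).
model_cfg = dict(
    onnx_model='rtmpose-m-256x192.onnx',  # placeholder checkpoint name
    backend='onnxruntime',                # opencv, onnxruntime, openvino
    device='cpu',                         # cpu, cuda
)

# The config would then be unpacked into a solution class, e.g.:
# wholebody = Wholebody(**model_cfg)
```

Match the model's input size to the tables below (e.g. 256x192 for RTMPose-m) so preprocessing is consistent with how the model was trained.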
### Detectors

#### Person

Notes:

- Models trained on HumanArt can detect both real humans and cartoon characters.
- Models trained on COCO can only detect real humans.

| ONNX Model | Input Size | Description |
| :--------: | :--------: | :---------: |
| YOLOX-l | 640x640 | trained on COCO val2017 |
| YOLOX-nano | 416x416 | trained on HumanArt |
| YOLOX-tiny | 416x416 | trained on HumanArt |
| YOLOX-s | 640x640 | trained on HumanArt |
| YOLOX-m | 640x640 | trained on HumanArt |
| YOLOX-l | 640x640 | trained on HumanArt |
| YOLOX-x | 640x640 | trained on HumanArt |
### Pose Estimators

#### Body

| ONNX Model | Input Size | Description |
| :--------: | :--------: | :---------: |
| RTMPose-t | 256x192 | Body 17 Keypoints |
| RTMPose-s | 256x192 | Body 17 Keypoints |
| RTMPose-m | 256x192 | Body 17 Keypoints |
| RTMPose-l | 384x288 | Body 17 Keypoints |
| RTMPose-x | 384x288 | Body 17 Keypoints |
| RTMO-s | 640x640 | Body 17 Keypoints |
| RTMO-m | 640x640 | Body 17 Keypoints |
| RTMO-l | 640x640 | Body 17 Keypoints |
#### WholeBody

| ONNX Model | Input Size | Description |
| :--------: | :--------: | :---------: |
| RTMW-m | 256x192 | Wholebody 133 Keypoints |
| RTMW-l | 256x192 | Wholebody 133 Keypoints |
| RTMW-l | 384x288 | Wholebody 133 Keypoints |
| RTMW-x | 384x288 | Wholebody 133 Keypoints |
## Visualization

*(Side-by-side image examples: MMPose-style and OpenPose-style skeleton rendering.)*
## Citation

```bibtex
@misc{rtmlib,
  title = {rtmlib},
  author = {Jiang, Tao},
  year = {2023},
  howpublished = {\url{https://github.com/Tau-J/rtmlib}}
}

@misc{jiang2023,
  doi = {10.48550/ARXIV.2303.07399},
  url = {https://arxiv.org/abs/2303.07399},
  author = {Jiang, Tao and Lu, Peng and Zhang, Li and Ma, Ningsheng and Han, Rui and Lyu, Chengqi and Li, Yining and Chen, Kai},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
  title = {RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose},
  publisher = {arXiv},
  year = {2023},
  copyright = {Creative Commons Attribution 4.0 International}
}

@misc{lu2023rtmo,
  title = {{RTMO}: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation},
  author = {Peng Lu and Tao Jiang and Yining Li and Xiangtai Li and Kai Chen and Wenming Yang},
  year = {2023},
  eprint = {2312.07526},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV}
}
```
## Acknowledgement
Our code is based on these repos: