RTMPose series models without mmcv, mmpose, mmdet etc.

rtmlib

rtmlib is a lightweight library for pose estimation with RTMPose models. It does not depend on mmcv, mmpose, mmdet, or similar packages.

Basically, rtmlib only requires these dependencies:

  • numpy
  • opencv-python
  • opencv-contrib-python
  • onnxruntime

Optionally, you can use other common backends like opencv, onnxruntime, openvino, tensorrt to accelerate the inference process.

  • For openvino users, please add <your python path>\envs\<your env name>\Lib\site-packages\openvino\libs to your PATH environment variable.

Installation

  • install from pypi:
pip install rtmlib -i https://pypi.org/simple
  • install from source code:
git clone https://github.com/Tau-J/rtmlib.git
cd rtmlib

pip install -r requirements.txt

pip install -e .

# [optional]
# pip install onnxruntime-gpu
# pip install openvino

Quick Start

Here is a simple demo to show how to use rtmlib to conduct pose estimation on a single image.

import cv2

from rtmlib import Wholebody, draw_skeleton

device = 'cpu'  # cpu, cuda
backend = 'onnxruntime'  # opencv, onnxruntime, openvino
img = cv2.imread('./demo.jpg')

openpose_skeleton = False  # True for openpose-style, False for mmpose-style

wholebody = Wholebody(to_openpose=openpose_skeleton,
                      mode='balanced',  # 'performance', 'lightweight', 'balanced'. Default: 'balanced'
                      backend=backend, device=device)

keypoints, scores = wholebody(img)

# visualize
img_show = img.copy()

# To draw on a black background instead of the original image
# (requires `import numpy as np`):
# img_show = np.zeros(img_show.shape, dtype=np.uint8)

img_show = draw_skeleton(img_show, keypoints, scores, kpt_thr=0.5)

cv2.imshow('img', img_show)
cv2.waitKey()
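
To work with the results without drawing them, the keypoints can be filtered by score in the same spirit as `kpt_thr`. The sketch below assumes that `keypoints` is indexed as `[person][keypoint] -> (x, y)` and `scores` as `[person][keypoint]`; this layout is an assumption, so please check it against the arrays you actually get back. `confident_keypoints` is a hypothetical helper, not an rtmlib function, and the `>=` comparison may differ from what `draw_skeleton` does internally.

```python
def confident_keypoints(keypoints, scores, kpt_thr=0.5):
    """Keep only keypoints whose score reaches kpt_thr.

    Returns one list per person with (keypoint_index, x, y) tuples.
    Works on nested lists and on array-like objects that can be iterated.
    """
    people = []
    for person_kpts, person_scores in zip(keypoints, scores):
        kept = [
            (idx, float(x), float(y))
            for idx, ((x, y), score) in enumerate(zip(person_kpts, person_scores))
            if score >= kpt_thr  # assumed comparison, see note above
        ]
        people.append(kept)
    return people


if __name__ == '__main__':
    # Tiny made-up input: two people, two keypoints each.
    demo_keypoints = [[(10.0, 20.0), (30.0, 40.0)], [(1.0, 2.0), (3.0, 4.0)]]
    demo_scores = [[0.9, 0.2], [0.5, 0.7]]
    print(confident_keypoints(demo_keypoints, demo_scores))
```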

WebUI

Run webui.py:

# Please make sure you have installed gradio
# pip install gradio

python webui.py

APIs

For high-level APIs (Solution), you can choose to pass mode or det+pose arguments to specify the detector and pose estimator you want to use.

from rtmlib import Body, Wholebody

# `backend` and `device` are defined as in Quick Start.

# By mode
wholebody = Wholebody(mode='performance',  # 'performance', 'lightweight', 'balanced'. Default: 'balanced'
                      backend=backend,
                      device=device)

# By det and pose
body = Body(det='https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/yolox_x_8xb8-300e_humanart-a39d44ed.zip',
            det_input_size=(640, 640),
            pose='https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-x_simcc-body7_pt-body7_700e-384x288-71d7b7e9_20230629.zip',
            pose_input_size=(288, 384),
            backend=backend,
            device=device)
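
In the example above, the pose model file is named `384x288` while `pose_input_size` is `(288, 384)`, which suggests that file names use height x width and the argument uses (width, height). Under that assumption, the input size could be derived from a model file name as sketched below. `size_from_model_name` is a hypothetical helper, not part of rtmlib.

```python
import re

# Matches a "-<digits>x<digits>-" segment such as "-384x288-".
_SIZE_PATTERN = re.compile(r'-(\d+)x(\d+)-')


def size_from_model_name(name):
    """Return (width, height) parsed from a model file name, or None.

    Assumption: the name encodes the size as <height>x<width>.
    """
    match = _SIZE_PATTERN.search(name)
    if match is None:
        return None
    height, width = int(match.group(1)), int(match.group(2))
    return (width, height)


if __name__ == '__main__':
    print(size_from_model_name(
        'rtmpose-x_simcc-body7_pt-body7_700e-384x288-71d7b7e9_20230629.zip'))
```

Detector file names such as the YOLOX one above carry no size segment, so the helper returns `None` for them and `det_input_size` still has to be given explicitly.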

For low-level APIs (Model), you can specify the model you want to use by passing the onnx_model argument.

from rtmlib import RTMPose

# By onnx_model (.onnx)
pose_model = RTMPose(onnx_model='/path/to/your_model.onnx',  # download link or local path
                     backend=backend, device=device)

# By onnx_model (.zip)
pose_model = RTMPose(onnx_model='https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-m_simcc-body7_pt-body7_420e-256x192-e48f03d0_20230504.zip',  # download link or local path
                     backend=backend, device=device)
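
When low-level models are used directly, the detector and the pose estimator have to be chained by hand. The following is a rough sketch of that two-stage flow. It assumes that the detector is a callable returning bounding boxes and that the pose model accepts them through a `bboxes` keyword and returns `(keypoints, scores)`. This calling convention is not stated in this README, so please verify it against the rtmlib source. `estimate_top_down` is a hypothetical helper.

```python
def estimate_top_down(image, detector, pose_model):
    """Run a person detector, then a pose model on the detected boxes.

    `detector` and `pose_model` are any callables that follow the
    conventions assumed here; they are passed in so the function itself
    does not depend on a particular library.
    """
    bboxes = detector(image)
    # Assumption: the pose model takes the boxes via a `bboxes` keyword
    # and returns a (keypoints, scores) pair.
    keypoints, scores = pose_model(image, bboxes=bboxes)
    return bboxes, keypoints, scores
```

With rtmlib, `detector` would be a detection model and `pose_model` an `RTMPose` instance created as shown above, provided they follow the assumed convention.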

Model Zoo

By default, rtmlib automatically downloads and applies the best-performing models.

More models can be found in RTMPose Model Zoo.

Detectors

Person

Notes:

  • Models trained on HumanArt can detect both real humans and cartoon characters.
  • Models trained on COCO can only detect real humans.

ONNX Model    Input Size    Description
YOLOX-l       640x640       trained on COCO val2017
YOLOX-nano    416x416       trained on HumanArt
YOLOX-tiny    416x416       trained on HumanArt
YOLOX-s       640x640       trained on HumanArt
YOLOX-m       640x640       trained on HumanArt
YOLOX-l       640x640       trained on HumanArt
YOLOX-x       640x640       trained on HumanArt

Pose Estimators

Body

ONNX Model    Input Size    Description
RTMPose-t     256x192       Body 17 Keypoints
RTMPose-s     256x192       Body 17 Keypoints
RTMPose-m     256x192       Body 17 Keypoints
RTMPose-l     384x288       Body 17 Keypoints
RTMPose-x     384x288       Body 17 Keypoints
RTMO-s        640x640       Body 17 Keypoints
RTMO-m        640x640       Body 17 Keypoints
RTMO-l        640x640       Body 17 Keypoints

WholeBody

ONNX Model    Input Size    Description
RTMW-m        256x192       Wholebody 133 Keypoints
RTMW-l        256x192       Wholebody 133 Keypoints
RTMW-l        384x288       Wholebody 133 Keypoints
RTMW-x        384x288       Wholebody 133 Keypoints

Visualization

[Figures: four example results, each shown in MMPose-style and OpenPose-style]

Citation

@misc{rtmlib,
  title={rtmlib},
  author={Jiang, Tao},
  year={2023},
  howpublished = {\url{https://github.com/Tau-J/rtmlib}},
}

@misc{jiang2023,
  doi = {10.48550/ARXIV.2303.07399},
  url = {https://arxiv.org/abs/2303.07399},
  author = {Jiang, Tao and Lu, Peng and Zhang, Li and Ma, Ningsheng and Han, Rui and Lyu, Chengqi and Li, Yining and Chen, Kai},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
  title = {RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose},
  publisher = {arXiv},
  year = {2023},
  copyright = {Creative Commons Attribution 4.0 International}
}

@misc{lu2023rtmo,
      title={{RTMO}: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation},
      author={Peng Lu and Tao Jiang and Yining Li and Xiangtai Li and Kai Chen and Wenming Yang},
      year={2023},
      eprint={2312.07526},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

Our code is based on these repos:
