MPOSE2021: a Dataset for Short-time Pose-based Human Action Recognition
This repository contains MPOSE2021, a dataset specifically designed for short-time, pose-based Human Action Recognition (HAR).
MPOSE2021 is developed as an evolution of the MPOSE dataset [1-3]. It is built from human pose data detected by OpenPose [4] (PoseNet support coming soon!) on popular HAR datasets, i.e. Weizmann [5], i3DPost [6], IXMAS [7], KTH [8], UTKinetic-Action3D (RGB only) [9] and UTD-MHAD (RGB only) [10], alongside original video datasets, i.e. ISLD and ISLD-Additional-Sequences [1]. Since these datasets have heterogeneous action labels, each dataset's labels are remapped to a common, homogeneous list of actions.
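The remapping step described above can be sketched with a simple per-dataset lookup table. The mapping below is purely illustrative (the label names are hypothetical, not the actual MPOSE2021 tables); it only shows the mechanism of translating heterogeneous source labels into one shared action vocabulary.

```python
# Illustrative sketch of label remapping: each source dataset's native
# action labels are mapped onto one common action list. The entries
# below are hypothetical examples, not the real MPOSE2021 mapping.
COMMON_ACTIONS = ["walk", "run", "wave", "jump"]

REMAP = {
    "kth": {"walking": "walk", "running": "run", "handwaving": "wave"},
    "weizmann": {"walk": "walk", "run": "run", "wave1": "wave", "jump": "jump"},
}

def remap_label(dataset: str, raw_label: str) -> str:
    """Translate a dataset-specific label into the common action list."""
    common = REMAP[dataset][raw_label]
    assert common in COMMON_ACTIONS, f"unmapped label: {raw_label}"
    return common
```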
This repository allows users to generate pose data for MPOSE2021 in a Python-friendly format. Generated sequences have between 20 and 30 frames each. Sequences are obtained by cutting the so-called precursor videos (videos from the above-mentioned datasets) with non-overlapping sliding windows. Frames in which OpenPose cannot detect any subject are automatically discarded. Each resulting sample contains one subject at a time performing a fraction of a single action. Overall, MPOSE2021 contains 15,429 samples, divided into 20 actions, performed by 100 subjects.
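The non-overlapping window cutting described above can be sketched as follows. This is an illustrative sketch, not the library's exact cutting procedure: it assumes pose frames are stored as an array of shape (T, K, C) and discards any tail shorter than the minimum window length.

```python
import numpy as np

def cut_sequences(frames, min_len=20, max_len=30):
    """Cut a precursor video's pose frames into non-overlapping windows.

    `frames` has shape (T, K, C): T frames, K keypoints, C channels
    (e.g. x, y, confidence). A tail shorter than `min_len` is discarded.
    Hypothetical sketch; the mpose package's own cutting may differ.
    """
    samples = []
    t = 0
    while t + min_len <= len(frames):
        window = frames[t:t + max_len]  # take up to max_len frames
        samples.append(window)
        t += len(window)                # non-overlapping: advance past window
    return samples
```

For a 75-frame precursor video this yields two 30-frame samples; the remaining 15 frames fall below the 20-frame minimum and are dropped.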
The overview of the action composition of MPOSE2021 is provided here.
The steps to install the mpose library and obtain sequences are explained below.
Check our Colab Notebook Tutorial for quick hands-on examples.
Installation
Install mpose as a Python package from PyPI:
pip install mpose
Getting Started
This minimal working example initializes the dataset, downloads the data, and retrieves the train/test splits as NumPy arrays:
# import package
import mpose
# initialize and download data
dataset = mpose.MPOSE(pose_extractor='openpose',
split=1,
transform='scale_and_center',
data_dir='./data/')
# print data info
dataset.get_info()
# get data samples (as numpy arrays)
X_train, y_train, X_test, y_test = dataset.get_dataset()
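The `scale_and_center` transform passed above normalizes the pose coordinates. As a rough intuition, such a normalization centers each sequence on its mean keypoint position and rescales it to unit extent. The function below is a hypothetical sketch of this kind of transform, not the mpose library's actual implementation; it assumes input of shape (frames, keypoints, 2) with (x, y) coordinates.

```python
import numpy as np

def scale_and_center(pose, eps=1e-8):
    """Center a pose sequence on its centroid and scale to unit extent.

    Illustrative sketch of a 'scale_and_center'-style normalization;
    the mpose package's own transform may differ in its details.
    `pose` has shape (frames, keypoints, 2).
    """
    xy = pose.astype(float)
    center = xy.mean(axis=(0, 1), keepdims=True)  # sequence centroid
    centered = xy - center
    scale = np.abs(centered).max() + eps          # largest coordinate extent
    return centered / scale
```

A transform like this makes sequences invariant to the subject's position and size in the frame, which is what allows pose samples from heterogeneous source datasets to be trained on jointly.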
References
MPOSE2021 was introduced in a paper published in Pattern Recognition (Elsevier) and is intended for scientific research purposes. If you use MPOSE2021 in your research, please also cite [1-10].
@article{mazzia2021action,
title={Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition},
author={Mazzia, Vittorio and Angarano, Simone and Salvetti, Francesco and Angelini, Federico and Chiaberge, Marcello},
journal={Pattern Recognition},
pages={108487},
year={2021},
publisher={Elsevier}
}
[1] F. Angelini, Z. Fu, Y. Long, L. Shao and S. M. Naqvi, "2D Pose-based Real-time Human Action Recognition with Occlusion-handling," in IEEE Transactions on Multimedia. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8853267&isnumber=4456689
[2] F. Angelini, J. Yan and S. M. Naqvi, "Privacy-preserving Online Human Behaviour Anomaly Detection Based on Body Movements and Objects Positions," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom, 2019, pp. 8444-8448. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8683026&isnumber=8682151
[3] F. Angelini and S. M. Naqvi, "Joint RGB-Pose Based Human Action Recognition for Anomaly Detection Applications," 2019 22nd International Conference on Information Fusion (FUSION), Ottawa, ON, Canada, 2019, pp. 1-7. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9011277&isnumber=9011156
[4] Cao, Zhe, et al. "OpenPose: realtime multi-person 2D pose estimation using Part Affinity Fields." IEEE transactions on pattern analysis and machine intelligence 43.1 (2019): 172-186.
[5] Gorelick, Lena, et al. "Actions as space-time shapes." IEEE transactions on pattern analysis and machine intelligence 29.12 (2007): 2247-2253.
[6] Starck, Jonathan, and Adrian Hilton. "Surface capture for performance-based animation." IEEE computer graphics and applications 27.3 (2007): 21-31.
[7] Weinland, Daniel, Mustafa Özuysal, and Pascal Fua. "Making action recognition robust to occlusions and viewpoint changes." European Conference on Computer Vision. Springer, Berlin, Heidelberg, 2010.
[8] Schuldt, Christian, Ivan Laptev, and Barbara Caputo. "Recognizing human actions: a local SVM approach." Proceedings of the 17th International Conference on Pattern Recognition, ICPR 2004. Vol. 3. IEEE, 2004.
[9] L. Xia, C. C. Chen and J. K. Aggarwal. "View invariant human action recognition using histograms of 3D joints," 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 20-27, 2012.
[10] C. Chen, R. Jafari, and N. Kehtarnavaz. "UTD-MHAD: A Multimodal Dataset for Human Action Recognition Utilizing a Depth Camera and a Wearable Inertial Sensor". Proceedings of IEEE International Conference on Image Processing, Canada, 2015.