mmaction2

OpenMMLab Video Understanding Toolbox and Benchmark

These details have not been verified by PyPI

Project links

Homepage

Project description

Introduction

MMAction2 is an open-source toolbox for video understanding based on PyTorch. It is a part of the OpenMMLab project.

The 1.x branch works with PyTorch 1.6+.
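The version requirement above can be checked programmatically. The following is a minimal sketch, assuming the version string has the usual `major.minor.patch[+local]` form that `torch.__version__` reports. The helper name `meets_minimum` is hypothetical and is not part of MMAction2 or PyTorch.

```python
import re


def meets_minimum(version: str, minimum: tuple = (1, 6)) -> bool:
    """Return True if `version` (e.g. "1.13.1+cu117") is at least `minimum`."""
    # Compare only the leading major.minor fields; the patch number and any
    # local build suffix such as "+cu117" are ignored.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= minimum


# Usage (requires PyTorch to be installed):
#   import torch
#   meets_minimum(torch.__version__)
```

Comparing integer tuples rather than raw strings avoids the pitfall where "1.10" sorts before "1.6" lexicographically.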

Action Recognition Results on Kinetics-400

Skeleton-based Action Recognition Results on NTU-RGB+D-120

Skeleton-based Spatio-Temporal Action Detection and Action Recognition Results on Kinetics-400

Spatio-Temporal Action Detection Results on AVA-2.1

Major Features

Modular design: We decompose a video understanding framework into different components. One can easily construct a customized video understanding framework by combining different modules.
Support four major video understanding tasks: MMAction2 implements various algorithms for multiple video understanding tasks, including action recognition, action localization, spatio-temporal action detection, and skeleton-based action recognition. We support 27 different algorithms and 20 different datasets for the four major tasks.
Well tested and documented: We provide detailed documentation and API reference, as well as unit tests.
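
To illustrate the modular design described above, here is a minimal sketch of a registry-and-config pattern, in which components are registered by name and assembled from a plain dictionary. All names in it (`Registry`, `BACKBONES`, `HEADS`, `ToyBackbone`, `ToyHead`, `build_recognizer`) are hypothetical and written only for illustration; they are not MMAction2's actual API.

```python
from typing import Any, Callable, Dict


class Registry:
    """Hypothetical name -> class mapping used to build components from configs."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._modules: Dict[str, Callable[..., Any]] = {}

    def register(self, cls):
        # Used as a decorator: stores the class under its own name.
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg: Dict[str, Any]) -> Any:
        # Copy so the caller's config dict is not mutated.
        args = dict(cfg)
        type_name = args.pop("type")
        if type_name not in self._modules:
            raise KeyError(f"{type_name!r} is not registered in {self.name}")
        return self._modules[type_name](**args)


BACKBONES = Registry("backbones")
HEADS = Registry("heads")


@BACKBONES.register
class ToyBackbone:
    def __init__(self, depth: int) -> None:
        self.depth = depth


@HEADS.register
class ToyHead:
    def __init__(self, num_classes: int) -> None:
        self.num_classes = num_classes


def build_recognizer(cfg: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Assemble a toy model from independently configured parts."""
    return {
        "backbone": BACKBONES.build(cfg["backbone"]),
        "head": HEADS.build(cfg["head"]),
    }


config = {
    "backbone": {"type": "ToyBackbone", "depth": 50},
    "head": {"type": "ToyHead", "num_classes": 400},
}
model = build_recognizer(config)
```

In this sketch, swapping one component for another only requires changing its `type` entry in the config; the assembly code stays the same.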

What's New

(2022-10-11) We support Video Swin Transformer on Kinetics400 and additionally train a Swin-L model on Kinetics700 to extract video features for downstream tasks.

Release: v1.0.0rc1 was released on Oct 14, 2022. Please refer to changelog.md for details and release history.

Installation

Please refer to install.md for more detailed instructions.

Supported Methods

Action Recognition
C3D (CVPR'2014)	TSN (ECCV'2016)	I3D (CVPR'2017)	I3D Non-Local (CVPR'2018)	R(2+1)D (CVPR'2018)
TRN (ECCV'2018)	TSM (ICCV'2019)	TSM Non-Local (ICCV'2019)	SlowOnly (ICCV'2019)	SlowFast (ICCV'2019)
CSN (ICCV'2019)	TIN (AAAI'2020)	TPN (CVPR'2020)	X3D (CVPR'2020)
MultiModality: Audio (ArXiv'2020)	TANet (ArXiv'2020)	TimeSformer (ICML'2021)	VideoSwin (CVPR'2022)
Action Localization
SSN (ICCV'2017)	BSN (ECCV'2018)	BMN (ICCV'2019)
Spatio-Temporal Action Detection
ACRN (ECCV'2018)	SlowOnly+Fast R-CNN (ICCV'2019)	SlowFast+Fast R-CNN (ICCV'2019)	LFB (CVPR'2019)
Skeleton-based Action Recognition
ST-GCN (AAAI'2018)	2s-AGCN (CVPR'2019)	PoseC3D (CVPR'2022)

Results and models are available in the README.md of each method's config directory. A summary can be found on the model zoo page.

We will keep up with the latest progress of the community and support more popular algorithms and frameworks. If you have any feature requests, please feel free to leave a comment in Issues.

Supported Datasets

Action Recognition
HMDB51 (Homepage) (ICCV'2011)	UCF101 (Homepage) (CRCV-IR-12-01)	ActivityNet (Homepage) (CVPR'2015)	Kinetics-[400/600/700] (Homepage) (CVPR'2017)
SthV1 (Homepage) (ICCV'2017)	SthV2 (Homepage) (ICCV'2017)	Diving48 (Homepage) (ECCV'2018)	Jester (Homepage) (ICCV'2019)
Moments in Time (Homepage) (TPAMI'2019)	Multi-Moments in Time (Homepage) (ArXiv'2019)	HVU (Homepage) (ECCV'2020)	OmniSource (Homepage) (ECCV'2020)
FineGYM (Homepage) (CVPR'2020)
Action Localization
THUMOS14 (Homepage) (THUMOS Challenge 2014)	ActivityNet (Homepage) (CVPR'2015)
Spatio-Temporal Action Detection
UCF101-24* (Homepage) (CRCV-IR-12-01)	JHMDB* (Homepage) (ICCV'2015)	AVA (Homepage) (CVPR'2018)
Skeleton-based Action Recognition
PoseC3D-FineGYM (Homepage) (ArXiv'2021)	PoseC3D-NTURGB+D (Homepage) (ArXiv'2021)	PoseC3D-UCF101 (Homepage) (ArXiv'2021)	PoseC3D-HMDB51 (Homepage) (ArXiv'2021)

Datasets marked with * are not fully supported yet, but related dataset preparation steps are provided. A summary can be found on the Supported Datasets page.

Data Preparation

Please refer to data_preparation.md for a general overview of data preparation.

FAQ

Please refer to FAQ for frequently asked questions.

Projects built on MMAction2

Many research works and projects from the community are built on MMAction2, such as:

Video Swin Transformer. [paper][github]
Evidential Deep Learning for Open Set Action Recognition, ICCV 2021 Oral. [paper][github]
Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective, ICCV 2021 Oral. [paper][github]

See projects.md for the full list of related projects.

License

This project is released under the Apache 2.0 license.

Citation

If you find this project useful in your research, please consider citing it:

@misc{2020mmaction2,
    title={OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark},
    author={MMAction2 Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmaction2}},
    year={2020}
}

Contributing

We appreciate all contributions to improve MMAction2. Please refer to CONTRIBUTING.md in MMCV for details on the contributing guidelines.

Acknowledgement

MMAction2 is an open-source project with contributions from researchers and engineers at various colleges and companies. We appreciate all the contributors who implement their methods or add new features, as well as the users who give valuable feedback. We hope the toolbox and benchmark can serve the growing research community by providing a flexible toolkit to reimplement existing methods and develop new models.

Projects in OpenMMLab

MMEngine: OpenMMLab foundational library for training deep learning models.
MMCV: OpenMMLab foundational library for computer vision.
MIM: MIM installs OpenMMLab packages.
MMClassification: OpenMMLab image classification toolbox and benchmark.
MMDetection: OpenMMLab detection toolbox and benchmark.
MMDetection3D: OpenMMLab's next-generation platform for general 3D object detection.
MMRotate: OpenMMLab rotated object detection toolbox and benchmark.
MMSegmentation: OpenMMLab semantic segmentation toolbox and benchmark.
MMOCR: OpenMMLab text detection, recognition, and understanding toolbox.
MMPose: OpenMMLab pose estimation toolbox and benchmark.
MMHuman3D: OpenMMLab 3D human parametric model toolbox and benchmark.
MMSelfSup: OpenMMLab self-supervised learning toolbox and benchmark.
MMRazor: OpenMMLab model compression toolbox and benchmark.
MMFewShot: OpenMMLab fewshot learning toolbox and benchmark.
MMAction2: OpenMMLab's next-generation action understanding toolbox and benchmark.
MMTracking: OpenMMLab video perception toolbox and benchmark.
MMFlow: OpenMMLab optical flow toolbox and benchmark.
MMEditing: OpenMMLab image and video editing toolbox.
MMGeneration: OpenMMLab image and video generative models toolbox.
MMDeploy: OpenMMLab model deployment framework.

Release history

Version	Release date
1.2.0	Oct 12, 2023
1.1.0	Jul 4, 2023
1.0.0	Apr 7, 2023
1.0.0rc3 (pre-release)	Feb 10, 2023
1.0.0rc2 (pre-release)	Jan 10, 2023
1.0.0rc1 (pre-release, this version)	Oct 14, 2022
1.0.0rc0 (pre-release)	Sep 1, 2022
0.24.1	Jul 29, 2022
0.24.0	May 5, 2022
0.23.0	Apr 2, 2022
0.22.0	Mar 7, 2022
0.21.0	Dec 31, 2021
0.20.0	Oct 30, 2021
0.19.0	Oct 7, 2021
0.18.0	Sep 2, 2021
0.17.0	Aug 3, 2021
0.16.0	Jul 1, 2021
0.15.0	May 31, 2021
0.14.0	May 3, 2021
0.13.0	Apr 1, 2021
0.12.0	Mar 1, 2021
0.11.0	Feb 1, 2021
0.10.0	Jan 5, 2021
0.9.0	Dec 1, 2020
0.8.0	Oct 31, 2020
0.7.0	Oct 10, 2020
0.6.0	Sep 2, 2020
0.5.0	Jul 21, 2020

Download files

Source Distribution

mmaction2-1.0.0rc1.tar.gz (289.7 kB), uploaded Oct 14, 2022

Built Distribution

mmaction2-1.0.0rc1-py2.py3-none-any.whl (536.1 kB), uploaded Oct 14, 2022, for Python 2 and Python 3

Hashes for mmaction2-1.0.0rc1.tar.gz
Algorithm	Hash digest
SHA256	`f37321fa86d4e8157510ebe374b67382020f784d56384026144495050aa41b68`
MD5	`1e8a68db1b63d2feabac08f5f3e7bdd3`
BLAKE2b-256	`990587223019d9245d361c5d8b1e79068fc6e1adfa54abf3af258b6075db7c97`

Hashes for mmaction2-1.0.0rc1-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`505a1dc65b35c1a2cfe4a62044a505d26a72858d2d51b39676a0bebc048a770e`
MD5	`29b351983e49deee5293d6c8b95510f6`
BLAKE2b-256	`9311e5c0fdcd54241782a9770f3cf1970902b7ce9e4dfa072ecd9f41cd244b3f`
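
The digests listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the file path in the usage comment assumes the archive was saved to the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# SHA256 digest listed above for mmaction2-1.0.0rc1.tar.gz
EXPECTED_SDIST_SHA256 = (
    "f37321fa86d4e8157510ebe374b67382020f784d56384026144495050aa41b68"
)

# Usage, assuming the archive was downloaded to the current directory:
#   matches_published_hash("mmaction2-1.0.0rc1.tar.gz", EXPECTED_SDIST_SHA256)
```

A mismatch means the file differs from the one that was uploaded, for example because of a truncated download.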