OpenMMLab Video Understanding Toolbox and Benchmark

MMAction2 is an open-source toolbox for video understanding based on PyTorch. It is a part of the OpenMMLab project.

The master branch works with PyTorch 1.3+.
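To check an environment against the PyTorch 1.3+ requirement, a comparison along the following lines may help. This is only a sketch: `meets_minimum` is a hypothetical helper, not part of MMAction2, and it assumes the version string starts with `major.minor` (as in `1.8.1+cu111`).

```python
import re


def meets_minimum(version, minimum=(1, 3)):
    """Return True if a version string such as '1.8.1+cu111' is >= minimum.

    Hypothetical helper for illustration; only major and minor are compared.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= tuple(minimum)


# With PyTorch installed, one would pass torch.__version__ here.
print(meets_minimum("1.8.1+cu111"))
```

Comparing integer tuples rather than raw strings avoids the pitfall where `"1.10"` sorts before `"1.3"` lexicographically.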

Action Recognition Results on Kinetics-400

Spatio-Temporal Action Detection Results on AVA-2.1

Skeleton-based Action Recognition Results on NTU-RGB+D-120

Major Features

  • Modular design

    We decompose the video understanding framework into different components and one can easily construct a customized video understanding framework by combining different modules.

  • Support for various datasets

    The toolbox directly supports multiple datasets, including UCF101, Kinetics-[400/600/700], Something-Something V1&V2, Moments in Time, Multi-Moments in Time, THUMOS14, etc.

  • Support for multiple video understanding frameworks

    MMAction2 implements popular frameworks for video understanding:

    • For action recognition, various algorithms are implemented, including TSN, TSM, TIN, R(2+1)D, I3D, SlowOnly, SlowFast, CSN, Non-local, etc.

    • For temporal action localization, we implement BSN, BMN, SSN.

    • For spatio-temporal action detection, we implement SlowOnly, SlowFast.

  • Well tested and documented

    We provide detailed documentation and API reference, as well as unit tests.
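The modular design described above can be pictured with a small registry sketch, in which components are registered by name and a model is assembled from a config dictionary. Everything here (`Registry`, `ToyBackbone`, `ToyHead`, `Recognizer`) is a hypothetical stand-in written for illustration and is not MMAction2's actual API.

```python
class Registry:
    """Maps class names to classes so components can be built from configs."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register(self, cls):
        # Used as a decorator: store the class under its own name.
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        # Copy so the caller's config is not mutated by pop().
        cfg = dict(cfg)
        cls = self._modules[cfg.pop("type")]
        return cls(**cfg)


BACKBONES = Registry("backbone")
HEADS = Registry("head")


@BACKBONES.register
class ToyBackbone:
    def __init__(self, depth):
        self.depth = depth


@HEADS.register
class ToyHead:
    def __init__(self, num_classes):
        self.num_classes = num_classes


class Recognizer:
    """Combines one backbone and one head, each selected by config."""

    def __init__(self, backbone, head):
        self.backbone = BACKBONES.build(backbone)
        self.head = HEADS.build(head)


model_cfg = dict(
    backbone=dict(type="ToyBackbone", depth=50),
    head=dict(type="ToyHead", num_classes=400),
)
model = Recognizer(**model_cfg)
print(type(model.backbone).__name__, type(model.head).__name__)
```

Swapping a component then means changing only the `type` entry in the config, which is the kind of flexibility a modular toolbox aims for.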


v0.16.0 was released on 01/07/2021. Please refer to the changelog for details and release history.


| Model | Input | IO backend | Batch size x GPUs | MMAction2 (s/iter) | MMAction (s/iter) | Temporal-Shift-Module (s/iter) | PySlowFast (s/iter) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| TSN | 256p rawframes | Memcached | 32x8 | 0.32 | 0.38 | 0.42 | x |
| TSN | 256p dense-encoded video | Disk | 32x8 | 0.61 | x | x | TODO |
| I3D heavy | 256p videos | Disk | 8x8 | 0.34 | x | x | 0.44 |
| I3D | 256p rawframes | Memcached | 8x8 | 0.43 | 0.56 | x | x |
| TSM | 256p rawframes | Memcached | 8x8 | 0.31 | x | 0.41 | x |
| Slowonly | 256p videos | Disk | 8x8 | 0.32 | TODO | x | 0.34 |
| Slowfast | 256p videos | Disk | 8x8 | 0.69 | x | x | 1.04 |
| R(2+1)D | 256p videos | Disk | 8x8 | 0.45 | x | x | x |

Details can be found in benchmark.
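To read the training-speed figures as relative speedups, the per-iteration times can be divided by the MMAction2 time for the same row. The sketch below copies only the rows where at least one other codebase reports a number; the dictionary layout and the `speedups` helper are illustrative assumptions, not part of the benchmark tooling.

```python
# Seconds per iteration, copied from the benchmark rows above.
# Entries marked "x" or "TODO" in the table are simply omitted.
timings = {
    ("TSN", "256p rawframes"): {
        "MMAction2": 0.32, "MMAction": 0.38, "Temporal-Shift-Module": 0.42},
    ("I3D heavy", "256p videos"): {"MMAction2": 0.34, "PySlowFast": 0.44},
    ("I3D", "256p rawframes"): {"MMAction2": 0.43, "MMAction": 0.56},
    ("TSM", "256p rawframes"): {
        "MMAction2": 0.31, "Temporal-Shift-Module": 0.41},
    ("Slowonly", "256p videos"): {"MMAction2": 0.32, "PySlowFast": 0.34},
    ("Slowfast", "256p videos"): {"MMAction2": 0.69, "PySlowFast": 1.04},
}


def speedups(row):
    """Return other_time / mmaction2_time for each other codebase in a row."""
    base = row["MMAction2"]
    return {name: t / base for name, t in row.items() if name != "MMAction2"}


result = {key: speedups(row) for key, row in timings.items()}

for (model, inp), ratios in result.items():
    for name, ratio in ratios.items():
        print(f"{model} ({inp}): {ratio:.2f}x relative to {name}")
```

A ratio above 1 means the other codebase needs more time per iteration than MMAction2 under the listed settings.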


Supported methods for Action Recognition:

Supported methods for Temporal Action Detection:

  • BSN (ECCV'2018)
  • BMN (ICCV'2019)
  • SSN (ICCV'2017)

Supported methods for Spatial Temporal Action Detection:

Supported methods for Skeleton-based Action Recognition:

Results and models are available in each method's config directory. A summary can be found in the model zoo page.

We will keep up with the latest progress of the community, and support more popular algorithms and frameworks. If you have any feature requests, please feel free to leave a comment in Issues.


Supported datasets:

Supported datasets for Action Recognition:

Supported datasets for Temporal Action Detection

Supported datasets for Spatial Temporal Action Detection

Supported datasets for Skeleton-based Action Detection

Datasets marked with 🔲 are not fully supported yet, but related dataset preparation steps are provided.


Please refer to the installation documentation for installation instructions.

Data Preparation

Please refer to the data preparation documentation for a general overview of data preparation. The supported datasets are listed in the documentation as well.

Get Started

Please see the getting started documentation for the basic usage of MMAction2. Tutorials are also available in the documentation.

A Colab tutorial is also provided. You may preview the notebook or run it directly on Colab.


Please refer to FAQ for frequently asked questions.


This project is released under the Apache 2.0 license.


If you find this project useful in your research, please consider citing:

    title={OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark},
    author={MMAction2 Contributors},
    howpublished = {\url{}},


We appreciate all contributions to improve MMAction2. Please refer to the contributing guideline in MMCV for more details.


MMAction2 is an open-source project with contributions from researchers and engineers at various colleges and companies. We appreciate all the contributors who implement their methods or add new features, as well as users who give valuable feedback. We hope that the toolbox and benchmark can serve the growing research community by providing a flexible toolkit for reimplementing existing methods and developing new models.

Projects in OpenMMLab

  • MMCV: OpenMMLab foundational library for computer vision.
  • MMClassification: OpenMMLab image classification toolbox and benchmark.
  • MMDetection: OpenMMLab detection toolbox and benchmark.
  • MMDetection3D: OpenMMLab's next-generation platform for general 3D object detection.
  • MMSegmentation: OpenMMLab semantic segmentation toolbox and benchmark.
  • MMAction2: OpenMMLab's next-generation video understanding toolbox and benchmark.
  • MMTracking: OpenMMLab video perception toolbox and benchmark.
  • MMPose: OpenMMLab pose estimation toolbox and benchmark.
  • MMEditing: OpenMMLab image and video editing toolbox.
  • MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding.
  • MMGeneration: OpenMMLab image and video generative models toolbox.
