A video decoder for PyTorch

TorchCodec

TorchCodec is a Python library for decoding videos into PyTorch tensors, on CPU and CUDA GPUs. It aims to be fast, easy to use, and well integrated into the PyTorch ecosystem. If you want to use PyTorch to train ML models on videos, TorchCodec is how you turn those videos into data.

We achieve these capabilities through:

  • Pythonic APIs that mirror Python and PyTorch conventions.
  • Relying on FFmpeg to do the decoding. TorchCodec uses the version of FFmpeg you already have installed. FFmpeg is a mature library with broad coverage available on most systems. It is, however, not easy to use. TorchCodec abstracts FFmpeg's complexity to ensure it is used correctly and efficiently.
  • Returning data as PyTorch tensors, ready to be fed into PyTorch transforms or used directly to train models.

Note: TorchCodec is still in the development stage, and some APIs may change in future versions depending on user feedback. If you have suggestions or run into issues, please let us know by opening an issue!

Using TorchCodec

Here's a condensed summary of what you can do with TorchCodec. For more detailed examples, check out our documentation!

Decoding

from torchcodec.decoders import VideoDecoder

device = "cpu"  # or e.g. "cuda" !
decoder = VideoDecoder("path/to/video.mp4", device=device)

decoder.metadata
# VideoStreamMetadata:
#   num_frames: 250
#   duration_seconds: 10.0
#   bit_rate: 31315.0
#   codec: h264
#   average_fps: 25.0
#   ... (truncated output)

# Simple Indexing API
decoder[0]  # uint8 tensor of shape [C, H, W]
decoder[0 : -1 : 20]  # uint8 stacked tensor of shape [N, C, H, W]

# Indexing, with PTS and duration info:
decoder.get_frames_at(indices=[2, 100])
# FrameBatch:
#   data (shape): torch.Size([2, 3, 270, 480])
#   pts_seconds: tensor([0.0667, 3.3367], dtype=torch.float64)
#   duration_seconds: tensor([0.0334, 0.0334], dtype=torch.float64)

# Time-based indexing with PTS and duration info
decoder.get_frames_played_at(seconds=[0.5, 10.4])
# FrameBatch:
#   data (shape): torch.Size([2, 3, 270, 480])
#   pts_seconds: tensor([ 0.4671, 10.3770], dtype=torch.float64)
#   duration_seconds: tensor([0.0334, 0.0334], dtype=torch.float64)
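
As a rough illustration of the two indexing styles above, the sketch below reproduces them in plain Python, without TorchCodec. It assumes that slice indexing follows standard Python slice semantics and that the frame "played at" time t is the one with the latest presentation timestamp not after t. The helper names (frame_indices_for_slice, index_of_frame_played_at) are hypothetical and are not part of the TorchCodec API.

```python
from bisect import bisect_right
from typing import List, Sequence


def frame_indices_for_slice(num_frames: int, frame_slice: slice) -> List[int]:
    """Resolve a slice such as 0:-1:20 to concrete frame indices.

    Assumption: slices are resolved as for a Python sequence of length num_frames.
    """
    return list(range(*frame_slice.indices(num_frames)))


def index_of_frame_played_at(
    pts_seconds: Sequence[float], stream_end_seconds: float, t: float
) -> int:
    """Return the index of the frame on screen at time t.

    Assumption: pts_seconds is sorted, and frame i is displayed from its own
    timestamp until the next frame's timestamp (or the end of the stream).
    """
    if not pts_seconds or t < pts_seconds[0] or t >= stream_end_seconds:
        raise ValueError(f"t={t} is outside the stream")
    # bisect_right counts the timestamps <= t; the last of those is the answer.
    return bisect_right(pts_seconds, t) - 1


# A hypothetical 10-second, 25 fps stream: 250 evenly spaced timestamps.
pts = [i / 25 for i in range(250)]
indices = frame_indices_for_slice(len(pts), slice(0, -1, 20))  # 0, 20, ..., 240
played = index_of_frame_played_at(pts, stream_end_seconds=10.0, t=0.5)  # frame 12
```

Under these assumptions, decoder[0 : -1 : 20] on a 250-frame video would select 13 frames, and a request for t = 0.5 would land on the frame whose timestamp is 0.48, which is why the pts_seconds returned by a time-based query need not equal the requested seconds.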

Clip sampling

from torchcodec.samplers import clips_at_regular_timestamps

clips_at_regular_timestamps(
    decoder,
    seconds_between_clip_starts=1.5,
    num_frames_per_clip=4,
    seconds_between_frames=0.1
)
# FrameBatch:
#   data (shape): torch.Size([9, 4, 3, 270, 480])
#   pts_seconds: tensor([[ 0.0000,  0.0667,  0.1668,  0.2669],
#         [ 1.4681,  1.5682,  1.6683,  1.7684],
#         [ 2.9696,  3.0697,  3.1698,  3.2699],
#         ... (truncated), dtype=torch.float64)
#   duration_seconds: tensor([[0.0334, 0.0334, 0.0334, 0.0334],
#         [0.0334, 0.0334, 0.0334, 0.0334],
#         [0.0334, 0.0334, 0.0334, 0.0334],
#         ... (truncated), dtype=torch.float64)
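
To make the sampler's three parameters more concrete, here is a minimal sketch of the timestamps such a call asks for, written in plain Python. It is not TorchCodec's implementation: the function name clip_timestamps and the sampling range arguments are hypothetical, the range end is assumed to be exclusive, and the handling of clips that run past the end of the video is not modelled.

```python
from typing import List


def clip_timestamps(
    sampling_start: float,
    sampling_end: float,
    seconds_between_clip_starts: float,
    num_frames_per_clip: int,
    seconds_between_frames: float,
) -> List[List[float]]:
    """Return one list of requested timestamps (in seconds) per clip."""
    if seconds_between_clip_starts <= 0:
        raise ValueError("seconds_between_clip_starts must be positive")
    clips = []
    clip_index = 0
    while True:
        # Multiply instead of accumulating to avoid floating-point drift.
        clip_start = sampling_start + clip_index * seconds_between_clip_starts
        if clip_start >= sampling_end:
            break
        clips.append(
            [clip_start + k * seconds_between_frames for k in range(num_frames_per_clip)]
        )
        clip_index += 1
    return clips


# Same parameters as the call above, over a hypothetical 13-second range.
clips = clip_timestamps(0.0, 13.0, 1.5, 4, 0.1)
```

With these inputs the sketch yields 9 clips of 4 timestamps each, starting at 0.0, 1.5, 3.0 and so on. The pts_seconds in the output above are slightly different from the requested values (for example 1.4681 rather than 1.5) because they are the timestamps of the frames that are on screen at the requested instants.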

You can use the following snippet to generate a video with FFmpeg and try out TorchCodec:

fontfile=/usr/share/fonts/dejavu-sans-mono-fonts/DejaVuSansMono-Bold.ttf
output_video_file=/tmp/output_video.mp4

ffmpeg -f lavfi -i \
    color=size=640x400:duration=10:rate=25:color=blue \
    -vf "drawtext=fontfile=${fontfile}:fontsize=30:fontcolor=white:x=(w-text_w)/2:y=(h-text_h)/2:text='Frame %{frame_num}'" \
    ${output_video_file}
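
If you would rather drive FFmpeg from Python, the same command can be assembled as an argument list. This is only a sketch: build_test_video_command is a hypothetical helper, the font path is the one used above and may not exist on your system, and the command is built but not executed here.

```python
from typing import List


def build_test_video_command(output_video_file: str, fontfile: str) -> List[str]:
    """Build the FFmpeg invocation shown above as a list of arguments."""
    # The second literal is deliberately not an f-string, so %{frame_num}
    # reaches FFmpeg unchanged and is expanded by the drawtext filter.
    drawtext = (
        f"drawtext=fontfile={fontfile}:fontsize=30:fontcolor=white:"
        "x=(w-text_w)/2:y=(h-text_h)/2:text='Frame %{frame_num}'"
    )
    return [
        "ffmpeg",
        "-f", "lavfi",
        "-i", "color=size=640x400:duration=10:rate=25:color=blue",
        "-vf", drawtext,
        output_video_file,
    ]


command = build_test_video_command(
    "/tmp/output_video.mp4",
    "/usr/share/fonts/dejavu-sans-mono-fonts/DejaVuSansMono-Bold.ttf",
)
```

Passing the list to subprocess.run(command, check=True) runs it without a shell, so no extra quoting is needed. The source is 10 seconds long at 25 fps, so the resulting video should contain 250 frames.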

Installing TorchCodec

Installing CPU-only TorchCodec

  1. Install the latest stable version of PyTorch following the official instructions. For other versions, refer to the table below for compatibility between versions of torch and torchcodec.

  2. Install FFmpeg, if it's not already installed. Linux distributions usually come with FFmpeg pre-installed. TorchCodec supports all major FFmpeg versions in [4, 7].

    If FFmpeg is not already installed, or you need a more recent version, an easy way to install it is to use conda:

    conda install ffmpeg
    # or
    conda install ffmpeg -c conda-forge
    
  3. Install TorchCodec:

    pip install torchcodec
    

The following table indicates the compatibility between versions of torchcodec, torch and Python.

torchcodec           torch            Python
main / nightly       main / nightly   >=3.9, <=3.12
not yet supported    2.5              >=3.9, <=3.12
0.0.3                2.4              >=3.8, <=3.12
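
To check the FFmpeg requirement from step 2 programmatically, something like the following sketch may help. It parses the first line printed by ffmpeg -version and compares the major version against the [4, 7] range stated above. The helper names are hypothetical, and the check assumes that the ffmpeg executable on your PATH belongs to the same installation as the FFmpeg libraries TorchCodec will load, which is not guaranteed.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_ffmpeg_major(version_output: str) -> Optional[int]:
    """Extract the major version from `ffmpeg -version` output.

    Returns None for builds whose version string has no numeric release
    (for example git snapshots such as "N-109542-g...").
    """
    match = re.search(r"ffmpeg version n?(\d+)\.", version_output)
    return int(match.group(1)) if match else None


def installed_ffmpeg_major() -> Optional[int]:
    """Return the major version of the ffmpeg found on PATH, if any."""
    executable = shutil.which("ffmpeg")
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "-version"], capture_output=True, text=True, check=False
    )
    return parse_ffmpeg_major(result.stdout)


def is_supported_for_cpu(major: Optional[int]) -> bool:
    """True if the major version is within [4, 7], the range given above."""
    return major is not None and 4 <= major <= 7
```

For example, is_supported_for_cpu(installed_ffmpeg_major()) returns False both when no ffmpeg is found and when the version cannot be determined, so treat a False result as "needs a closer look" rather than as proof of incompatibility.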

Installing CUDA-enabled TorchCodec

First, make sure you have a GPU with NVDEC hardware that can decode the format you want. Refer to Nvidia's GPU support matrix for more details.

  1. Install the CUDA Toolkit. PyTorch and TorchCodec support CUDA Toolkit versions 11.8, 12.1, or 12.4. In particular, TorchCodec depends on the CUDA libraries libnpp and libnvrtc (which are part of the CUDA Toolkit).

  2. Install the PyTorch build that corresponds to your CUDA Toolkit version using the official instructions.

  3. Install or compile FFmpeg with NVDEC support. TorchCodec with CUDA should work with FFmpeg versions in [5, 7].

    If FFmpeg is not already installed, or you need a more recent version, an easy way to install it is to use conda:

    conda install ffmpeg
    # or
    conda install ffmpeg -c conda-forge
    

    If you are building FFmpeg from source, you can follow Nvidia's guide to configuring and installing FFmpeg with NVDEC support.

    After installing FFmpeg, make sure it has NVDEC support when you list the supported decoders:

    ffmpeg -decoders | grep -i nvidia
    # This should show a line like this:
    # V..... h264_cuvid           Nvidia CUVID H264 decoder (codec h264)
    

    To check that the FFmpeg libraries work correctly with NVDEC, you can decode a sample video:

    ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i test/resources/nasa_13013.mp4 -f null -
    
  4. Install TorchCodec by passing in an --index-url parameter that corresponds to your CUDA Toolkit version, for example:

    # This corresponds to CUDA Toolkit version 12.4. It should be the same one
    # you used when you installed PyTorch (if you installed PyTorch with pip).
    pip install torchcodec --index-url=https://download.pytorch.org/whl/cu124
    

    Note that without passing in the --index-url parameter, pip installs the CPU-only version of TorchCodec.
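
Two of the steps above lend themselves to small scripts: spotting NVDEC-backed decoders in the output of ffmpeg -decoders, and deriving the --index-url value from a CUDA Toolkit version. The sketch below shows one way to do both. The function names are hypothetical, the decoder check simply looks for names ending in _cuvid (as in the sample line above), and the URL pattern is extrapolated from the cu124 example, so wheels may not exist for every version you pass in.

```python
from typing import List


def cuvid_decoders(decoders_output: str) -> List[str]:
    """Return decoder names ending in "_cuvid" from `ffmpeg -decoders` output."""
    names = []
    for line in decoders_output.splitlines():
        fields = line.split()
        # Decoder rows look like: " V..... h264_cuvid   Nvidia CUVID H264 decoder ..."
        if len(fields) >= 2 and fields[1].endswith("_cuvid"):
            names.append(fields[1])
    return names


def torchcodec_index_url(cuda_version: str) -> str:
    """Map a CUDA Toolkit version such as "12.4" to a pip index URL.

    Assumption: other versions follow the same "cu<major><minor>" pattern as
    the cu124 example above. Raises ValueError if there is no minor version.
    """
    major, minor = cuda_version.split(".")[:2]
    return f"https://download.pytorch.org/whl/cu{major}{minor}"


sample_output = (
    " V..... h264_cuvid           Nvidia CUVID H264 decoder (codec h264)\n"
    " V....D h264                 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10\n"
)
found = cuvid_decoders(sample_output)
url = torchcodec_index_url("12.4")
```

An empty list from cuvid_decoders suggests that the FFmpeg build lacks NVDEC support, and the URL produced for "12.4" matches the one used in step 4.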

Benchmark Results

The following was generated by running our benchmark script on a lightly loaded 22-core machine with an Nvidia A100 with 5 NVDEC decoders.

[Figure: benchmark results]

The top row is a Mandelbrot video generated with FFmpeg that has a resolution of 1280x720 at 60 fps and is 120 seconds long. The bottom row is a promotional video from NASA that has a resolution of 960x540 at 29.7 fps and is 206 seconds long. Both videos were encoded with libx264 and the yuv420p pixel format.

Planned future work

We are actively working on the following features:

Let us know if you have any feature requests by opening an issue!

Contributing

We welcome contributions to TorchCodec! Please see our contributing guide for more details.

License

TorchCodec is released under the BSD 3 license.

However, TorchCodec may be used with code not written by Meta which may be distributed under different licenses.

For example, if you build TorchCodec with ENABLE_CUDA=1 or use the CUDA-enabled release of torchcodec, please review CUDA's license here: Nvidia licenses.
