A video decoder for PyTorch

TorchCodec

TorchCodec is a Python library for decoding video and audio data into PyTorch tensors, on CPU and CUDA GPU. It aims to be fast, easy to use, and well integrated into the PyTorch ecosystem. If you want to use PyTorch to train ML models on videos and audio, TorchCodec is how you turn these into data.

We achieve these capabilities through:

  • Pythonic APIs that mirror Python and PyTorch conventions.
  • Relying on FFmpeg to do the decoding. TorchCodec uses the version of FFmpeg you already have installed. FFmpeg is a mature library with broad coverage available on most systems. It is, however, not easy to use. TorchCodec abstracts FFmpeg's complexity to ensure it is used correctly and efficiently.
  • Returning data as PyTorch tensors, ready to be fed into PyTorch transforms or used directly to train models.

Using TorchCodec

Here's a condensed summary of what you can do with TorchCodec. For more detailed examples, check out our documentation!

Decoding

from torchcodec.decoders import VideoDecoder

device = "cpu"  # or e.g. "cuda"
decoder = VideoDecoder("path/to/video.mp4", device=device)

decoder.metadata
# VideoStreamMetadata:
#   num_frames: 250
#   duration_seconds: 10.0
#   bit_rate: 31315.0
#   codec: h264
#   average_fps: 25.0
#   ... (truncated output)

# Simple Indexing API
decoder[0]  # uint8 tensor of shape [C, H, W]
decoder[0 : -1 : 20]  # uint8 stacked tensor of shape [N, C, H, W]

# Indexing, with PTS and duration info:
decoder.get_frames_at(indices=[2, 100])
# FrameBatch:
#   data (shape): torch.Size([2, 3, 270, 480])
#   pts_seconds: tensor([0.0667, 3.3367], dtype=torch.float64)
#   duration_seconds: tensor([0.0334, 0.0334], dtype=torch.float64)

# Time-based indexing with PTS and duration info
decoder.get_frames_played_at(seconds=[0.5, 10.4])
# FrameBatch:
#   data (shape): torch.Size([2, 3, 270, 480])
#   pts_seconds: tensor([ 0.4671, 10.3770], dtype=torch.float64)
#   duration_seconds: tensor([0.0334, 0.0334], dtype=torch.float64)
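The pts_seconds values printed above can be reproduced with a little arithmetic. The sketch below is a conceptual model, not TorchCodec's implementation: it assumes a constant frame rate of 30000/1001 fps (inferred from the 0.0334 s frame durations in the output) and a first frame at 0 s. The names FRAME_DURATION, pts_of_index and pts_played_at are illustrative and not part of the TorchCodec API.

```python
import math

# Assumed constant frame rate (about 29.97 fps), inferred from the printed durations.
FRAME_DURATION = 1001 / 30000  # seconds per frame


def pts_of_index(index: int) -> float:
    """Presentation timestamp of frame `index`, assuming the first frame is at 0 s."""
    return index * FRAME_DURATION


def pts_played_at(seconds: float) -> float:
    """Timestamp of the frame on screen at `seconds`.

    A frame is treated as played during [pts, pts + duration), so the
    requested time is rounded down to the start of its frame.
    """
    return pts_of_index(math.floor(seconds / FRAME_DURATION))


# Index-based lookup, mirroring get_frames_at(indices=[2, 100])
print([round(pts_of_index(i), 4) for i in (2, 100)])

# Time-based lookup, mirroring get_frames_played_at(seconds=[0.5, 10.4])
print([round(pts_played_at(t), 4) for t in (0.5, 10.4)])
```

Under these assumptions, the two lookups give the same rounded timestamps as the FrameBatch outputs above. Real videos can have variable frame rates or a non-zero start time, in which case only the decoder's own metadata gives the right answer.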

Clip sampling

from torchcodec.samplers import clips_at_regular_timestamps

clips_at_regular_timestamps(
    decoder,
    seconds_between_clip_starts=1.5,
    num_frames_per_clip=4,
    seconds_between_frames=0.1
)
# FrameBatch:
#   data (shape): torch.Size([9, 4, 3, 270, 480])
#   pts_seconds: tensor([[ 0.0000,  0.0667,  0.1668,  0.2669],
#         [ 1.4681,  1.5682,  1.6683,  1.7684],
#         [ 2.9696,  3.0697,  3.1698,  3.2699],
#         ... (truncated), dtype=torch.float64)
#   duration_seconds: tensor([[0.0334, 0.0334, 0.0334, 0.0334],
#         [0.0334, 0.0334, 0.0334, 0.0334],
#         [0.0334, 0.0334, 0.0334, 0.0334],
#         ... (truncated), dtype=torch.float64)
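To make the three sampling parameters concrete, here is a minimal sketch of the timestamp grid they describe. It is an illustration only, not the sampler's actual code: the function name clip_timestamps and the 13-second sampling_range_end are hypothetical, and the rule "one clip start at every multiple of seconds_between_clip_starts before the end of the range" is an assumption.

```python
def clip_timestamps(sampling_range_end, seconds_between_clip_starts,
                    num_frames_per_clip, seconds_between_frames):
    """Requested timestamps, one row per clip, starting from 0 s."""
    clips = []
    clip_index = 0
    while clip_index * seconds_between_clip_starts < sampling_range_end:
        start = clip_index * seconds_between_clip_starts
        # Frames inside a clip are spaced `seconds_between_frames` apart.
        clips.append([start + k * seconds_between_frames
                      for k in range(num_frames_per_clip)])
        clip_index += 1
    return clips


# Hypothetical 13-second sampling range; the other values match the call above.
clips = clip_timestamps(13.0, 1.5, 4, 0.1)
print(len(clips), "clips of", len(clips[0]), "frames")
print([round(t, 1) for t in clips[1]])
```

Each requested timestamp is then resolved to the frame that is played at that time, which is why the pts_seconds in the output (for example 1.4681) can be slightly earlier than the requested time (1.5). How the sampler chooses the end of the range and handles clips that run past the end of the video is not covered by this sketch.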

You can use the following snippet to generate a video with FFmpeg and try out TorchCodec:

fontfile=/usr/share/fonts/dejavu-sans-mono-fonts/DejaVuSansMono-Bold.ttf
output_video_file=/tmp/output_video.mp4

ffmpeg -f lavfi -i \
    color=size=640x400:duration=10:rate=25:color=blue \
    -vf "drawtext=fontfile=${fontfile}:fontsize=30:fontcolor=white:x=(w-text_w)/2:y=(h-text_h)/2:text='Frame %{frame_num}'" \
    ${output_video_file}
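If you would rather drive FFmpeg from Python, the same command can be assembled as an argument list. This is a sketch under the assumption that an ffmpeg executable is on your PATH. The helper name build_ffmpeg_command is hypothetical, and the snippet only builds and prints the command rather than running it.

```python
import shlex


def build_ffmpeg_command(fontfile, output_video_file):
    """Return the argument list for the FFmpeg invocation shown above."""
    # The inner single quotes are part of FFmpeg's filter syntax, not shell quoting.
    drawtext = (
        f"drawtext=fontfile={fontfile}:fontsize=30:fontcolor=white:"
        "x=(w-text_w)/2:y=(h-text_h)/2:text='Frame %{frame_num}'"
    )
    return [
        "ffmpeg", "-f", "lavfi",
        "-i", "color=size=640x400:duration=10:rate=25:color=blue",
        "-vf", drawtext,
        output_video_file,
    ]


command = build_ffmpeg_command(
    "/usr/share/fonts/dejavu-sans-mono-fonts/DejaVuSansMono-Bold.ttf",
    "/tmp/output_video.mp4",
)
print(shlex.join(command))
# To actually generate the video: subprocess.run(command, check=True)
```

Passing a list to subprocess.run avoids shell quoting issues with the parentheses and quotes in the filter expression.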

Installing TorchCodec

Installing CPU-only TorchCodec

  1. Install the latest stable version of PyTorch following the official instructions. For other versions, refer to the table below for compatibility between versions of torch and torchcodec.

  2. Install FFmpeg, if it's not already installed. Linux distributions usually come with FFmpeg pre-installed. TorchCodec supports all major FFmpeg versions in [4, 7].

    If FFmpeg is not already installed, or you need a more recent version, an easy way to install it is to use conda:

    conda install ffmpeg
    # or
    conda install ffmpeg -c conda-forge
    
  3. Install TorchCodec:

    pip install torchcodec
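
As a quick sanity check for step 2, the sketch below reads the major version reported by the ffmpeg command-line tool and compares it with the supported range [4, 7]. It assumes the ffmpeg executable on your PATH belongs to the same installation whose libraries TorchCodec will load, which is not guaranteed. The function names are illustrative, and the regular expression only handles release-style version strings.

```python
import re
import shutil
import subprocess

SUPPORTED_MAJOR_VERSIONS = range(4, 8)  # FFmpeg 4 through 7, inclusive


def parse_ffmpeg_major(version_output):
    """Extract the major version from the output of `ffmpeg -version`.

    Returns None when no release number is found (e.g. some git builds).
    """
    match = re.search(r"ffmpeg version n?(\d+)\.", version_output)
    return int(match.group(1)) if match else None


def installed_ffmpeg_major():
    """Major version of the `ffmpeg` found on PATH, or None if unavailable."""
    if shutil.which("ffmpeg") is None:
        return None
    result = subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True)
    return parse_ffmpeg_major(result.stdout)


major = installed_ffmpeg_major()
if major is None:
    print("Could not determine the FFmpeg version")
else:
    status = "supported" if major in SUPPORTED_MAJOR_VERSIONS else "outside the supported range"
    print(f"FFmpeg {major} is {status}")
```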
    

The following table indicates the compatibility between versions of torchcodec, torch and Python.

torchcodec        torch             Python
main / nightly    main / nightly    >=3.9, <=3.13
0.3               2.7               >=3.9, <=3.13
0.2               2.6               >=3.9, <=3.13
0.1               2.5               >=3.9, <=3.12
0.0.3             2.4               >=3.8, <=3.12
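
The table can also be encoded as a small lookup if you want to validate an environment in a script. This is a sketch only: the rows are copied from the table above (the main / nightly row is left out), and the names COMPATIBILITY and is_compatible are hypothetical, not something TorchCodec provides.

```python
# torchcodec release -> (torch release, minimum Python, maximum Python)
COMPATIBILITY = {
    "0.3": ("2.7", (3, 9), (3, 13)),
    "0.2": ("2.6", (3, 9), (3, 13)),
    "0.1": ("2.5", (3, 9), (3, 12)),
    "0.0.3": ("2.4", (3, 8), (3, 12)),
}


def _matches(version, release):
    """True if `version` is `release` itself or a patch of it ("2.7.1" vs "2.7")."""
    return version == release or version.startswith(release + ".")


def is_compatible(torchcodec_version, torch_version, python_version):
    """Check a (torchcodec, torch, Python) combination against the table."""
    for release, (torch_release, py_min, py_max) in COMPATIBILITY.items():
        if _matches(torchcodec_version, release):
            return (_matches(torch_version, torch_release)
                    and py_min <= tuple(python_version[:2]) <= py_max)
    return False  # torchcodec release not listed in the table


print(is_compatible("0.3", "2.7.1", (3, 13)))
print(is_compatible("0.1", "2.5.0", (3, 13)))
```

In a real environment you could feed it importlib.metadata.version("torchcodec"), importlib.metadata.version("torch") and sys.version_info.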

Installing CUDA-enabled TorchCodec

First, make sure you have a GPU with NVDEC hardware that can decode the format you want. Refer to Nvidia's GPU support matrix for more details.

  1. Install PyTorch corresponding to your CUDA Toolkit using the official instructions. You'll need the libnpp and libnvrtc CUDA libraries, which are usually part of the CUDA Toolkit.

  2. Install or compile FFmpeg with NVDEC support. TorchCodec with CUDA should work with FFmpeg versions in [4, 7].

    If FFmpeg is not already installed, or you need a more recent version, an easy way to install it is to use conda:

    conda install ffmpeg
    # or
    conda install ffmpeg -c conda-forge
    

    If you are building FFmpeg from source, you can follow Nvidia's guide to configuring and installing FFmpeg with NVDEC support.

    After installing FFmpeg, make sure it has NVDEC support by listing the supported decoders:

    ffmpeg -decoders | grep -i nvidia
    # This should show a line like this:
    # V..... h264_cuvid           Nvidia CUVID H264 decoder (codec h264)
    

    To check that the FFmpeg libraries work correctly with NVDEC, you can decode a sample video:

    ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i test/resources/nasa_13013.mp4 -f null -
    
  3. Install TorchCodec by passing an --index-url parameter that corresponds to your CUDA Toolkit version, for example:

    # This corresponds to CUDA Toolkit version 12.6. It should be the same one
    # you used when you installed PyTorch (if you installed PyTorch with pip).
    pip install torchcodec --index-url=https://download.pytorch.org/whl/cu126
    

    Note that without passing in the --index-url parameter, pip installs the CPU-only version of TorchCodec.
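
The decoder check from step 2 can be scripted as well. The sketch below parses text in the format printed by ffmpeg -decoders and collects the CUVID decoders. It assumes NVDEC-backed decoders are named with a _cuvid suffix, as in the sample line above, and the function name nvdec_decoders is illustrative.

```python
def nvdec_decoders(decoders_output):
    """Names of CUVID (NVDEC-backed) decoders listed by `ffmpeg -decoders`."""
    names = []
    for line in decoders_output.splitlines():
        fields = line.split()
        # Decoder rows look like: "<flags> <name> <description...>"
        if len(fields) >= 2 and fields[1].endswith("_cuvid"):
            names.append(fields[1])
    return names


# Sample row taken from the expected output shown in step 2.
sample = " V..... h264_cuvid           Nvidia CUVID H264 decoder (codec h264)"
print(nvdec_decoders(sample))
```

To run it against your own installation, pass in the standard output of subprocess.run(["ffmpeg", "-decoders"], capture_output=True, text=True). An empty list suggests that your FFmpeg build lacks NVDEC support.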

Benchmark Results

The following was generated by running our benchmark script on a lightly loaded 22-core machine with an Nvidia A100 with 5 NVDEC decoders.

[Figure: benchmark results]

The top row is a Mandelbrot video generated with FFmpeg that has a resolution of 1280x720 at 60 fps and is 120 seconds long. The bottom row is a promotional video from NASA that has a resolution of 960x540 at 29.7 fps and is 206 seconds long. Both videos were encoded with libx264 and the yuv420p pixel format. All decoders, except for TorchVision, used FFmpeg 6.1.2. TorchVision used FFmpeg 4.2.2.

For TorchCodec, the "approx" label means that it was using approximate mode for seeking.

Contributing

We welcome contributions to TorchCodec! Please see our contributing guide for more details.

License

TorchCodec is released under the BSD 3-Clause license.

However, TorchCodec may be used with code not written by Meta which may be distributed under different licenses.

For example, if you build TorchCodec with ENABLE_CUDA=1 or use the CUDA-enabled release of torchcodec, please review CUDA's license here: Nvidia licenses.

Built Distributions

  • torchcodec-0.4.0-cp313-cp313t-manylinux_2_28_x86_64.whl (1.3 MB): CPython 3.13t, manylinux (glibc 2.28+), x86-64
  • torchcodec-0.4.0-cp313-cp313t-macosx_11_0_arm64.whl (3.3 MB): CPython 3.13t, macOS 11.0+, ARM64
  • torchcodec-0.4.0-cp313-cp313-manylinux_2_28_x86_64.whl (1.3 MB): CPython 3.13, manylinux (glibc 2.28+), x86-64
  • torchcodec-0.4.0-cp313-cp313-macosx_11_0_arm64.whl (3.4 MB): CPython 3.13, macOS 11.0+, ARM64
  • torchcodec-0.4.0-cp312-cp312-manylinux_2_28_x86_64.whl (1.3 MB): CPython 3.12, manylinux (glibc 2.28+), x86-64
  • torchcodec-0.4.0-cp312-cp312-macosx_11_0_arm64.whl (3.5 MB): CPython 3.12, macOS 11.0+, ARM64
  • torchcodec-0.4.0-cp311-cp311-manylinux_2_28_x86_64.whl (1.3 MB): CPython 3.11, manylinux (glibc 2.28+), x86-64
  • torchcodec-0.4.0-cp311-cp311-macosx_11_0_arm64.whl (3.0 MB): CPython 3.11, macOS 11.0+, ARM64
  • torchcodec-0.4.0-cp310-cp310-manylinux_2_28_x86_64.whl (1.3 MB): CPython 3.10, manylinux (glibc 2.28+), x86-64
  • torchcodec-0.4.0-cp310-cp310-macosx_11_0_arm64.whl (2.8 MB): CPython 3.10, macOS 11.0+, ARM64
  • torchcodec-0.4.0-cp39-cp39-manylinux_2_28_x86_64.whl (1.3 MB): CPython 3.9, manylinux (glibc 2.28+), x86-64
  • torchcodec-0.4.0-cp39-cp39-macosx_11_0_arm64.whl (2.7 MB): CPython 3.9, macOS 11.0+, ARM64
