PyTorch bindings for the NVIDIA Optical Flow SDK, providing hardware-accelerated optical flow computation with end-to-end PyTorch integration on NVIDIA GPUs.
Torch Optical Flow
Please read more about the NVIDIA Optical Flow SDK here: https://developer.nvidia.com/optical-flow-sdk
What's this repo about?
- Hardware-accelerated optical flow using a dedicated processor in NVIDIA GPUs. No gradients are computed; this is for inference only.
- Frame interpolation, ROI, and other additional SDK features are not supported.
- Configurable speed preset (slow, medium, fast) and grid size (1, 2, 4).
- Support for the ABGR8 input format, i.e. RGB images.
- End-to-end GPU processing with PyTorch.
- Bidirectional optical flow computation (forward and backward) in a single call. This is supported by the SDK, but not exposed in the wrappers NVIDIA provides.
The package comes with basic functionality for optical flow:
- .flo reader and writer
- Optical Flow common metrics
- Visualization utilities
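As an illustration of what a common optical flow metric looks like, here is a minimal NumPy sketch of the average endpoint error (EPE). The function names `endpoint_error` and `average_epe` are hypothetical and are not taken from this package; check the package's own metrics module for the real API.

```python
import numpy as np


def endpoint_error(pred, gt):
    """Per-pixel Euclidean distance between two (H, W, 2) flow fields."""
    pred = np.asarray(pred, dtype=np.float32)
    gt = np.asarray(gt, dtype=np.float32)
    return np.linalg.norm(pred - gt, axis=-1)


def average_epe(pred, gt):
    """Mean endpoint error over all pixels."""
    return float(endpoint_error(pred, gt).mean())


# Toy data: ground truth is zero flow, prediction is (3, 4) everywhere,
# so every pixel is off by a vector of length 5.
gt = np.zeros((4, 4, 2), dtype=np.float32)
pred = gt.copy()
pred[..., 0] = 3.0
pred[..., 1] = 4.0
print(average_epe(pred, gt))
```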
Requirements
System Requirements
- NVIDIA GPU with Optical Flow SDK support (Turing, Ampere or Ada)
- Tested on Linux (Ubuntu). The SDK is compatible with Windows too; read optical_flow_sdk>Read_Me.pdf for Windows instructions.
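A quick way to check the GPU generation is to look at its CUDA compute capability. The sketch below maps a `(major, minor)` pair to the three architectures listed above; `nvof_architecture` is a hypothetical helper, and it only covers the architectures this README names, so a `None` result means "not listed here" rather than "unsupported".

```python
def nvof_architecture(major, minor):
    """Return the architecture name for a compute capability, or None.

    Only covers the architectures listed in this README:
    Turing (7.5), Ampere (8.0, 8.6, 8.7) and Ada (8.9).
    """
    if (major, minor) == (7, 5):
        return "Turing"
    if (major, minor) == (8, 9):
        return "Ada"
    if major == 8:
        return "Ampere"
    return None


# With PyTorch installed, the pair can be read from the device:
#   major, minor = torch.cuda.get_device_capability(0)
print(nvof_architecture(7, 5))
print(nvof_architecture(8, 6))
print(nvof_architecture(6, 1))
```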
Software Requirements
- CUDA toolkit >= 10.2
- NVIDIA Linux driver >= 528.85 (as reported by nvidia-smi)
- GCC >= 5.1
- CMake >= 3.14
- When you pip install torch, it comes with its own CUDA binaries. Install a CUDA toolkit whose version is the same as or higher than the one bundled with your PyTorch installation.
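The toolkit-versus-PyTorch rule above can be checked with a small comparison. This is a sketch only: `toolkit_matches_torch` is a hypothetical helper, and the version strings are passed in by hand. In practice the PyTorch side comes from `torch.version.cuda` and the toolkit side from `nvcc --version`.

```python
def parse_version(text):
    """Turn '12.4' or '12.4.1' into a comparable (major, minor) tuple."""
    return tuple(int(part) for part in text.split(".")[:2])


def toolkit_matches_torch(toolkit_version, torch_cuda_version):
    """True if the installed toolkit is the same as or newer than PyTorch's CUDA."""
    return parse_version(toolkit_version) >= parse_version(torch_cuda_version)


# torch.version.cuda reports the CUDA version PyTorch was built with,
# e.g. "12.1" (it is None for CPU-only builds).
print(toolkit_matches_torch("12.4", "12.1"))
print(toolkit_matches_torch("11.8", "12.1"))
```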
Installation
Precompiled binaries (experimental)
uv add torch-nvidia-of-sdk
# or
uv add torch-nvidia-of-sdk[full] # To have headless opencv for visualization examples
If you still use pip
pip install torch-nvidia-of-sdk
# or
pip install torch-nvidia-of-sdk[full] # To have headless opencv for visualization examples
Build from Source (Recommended if precompiled binaries do not work)
This repository uses uv. A one-shot command to build, install and test the package would be:
rm -rf build _skbuild .venv && CC=gcc CXX=g++ uv sync --extra full --reinstall-package torch-nvidia-of-sdk && uv run --extra full examples/minimal_example.py
--reinstall-package forces uv to re-compile the package. Clearing caches is not really needed but I'm paranoid.
--extra full is analogous to the pip extras syntax, pip install torch-nvidia-of-sdk[full]. It just adds headless OpenCV for visualization.
Compiling your own wheel
CC=gcc CXX=g++ uv build --wheel --package torch-nvidia-of-sdk will build a wheel in dist/ that you can install with pip.
Quick Start
Try the minimal example to get started quickly:
# Run the minimal example (uses sample frames from assets/)
uv run --extra full examples/minimal_example.py
This will:
- Load two sample frames from the assets/ directory
- Compute optical flow using NVOF
- Generate visualizations and save results to output/
See examples/README.md for more examples and tutorials.
Basic Usage
import torch
import numpy as np
from of import TorchNVOpticalFlow
from of.io import read_flo, write_flo
from of.visualization import flow_to_color
# Load your images (RGB format, uint8)
img1 = torch.from_numpy(np.array(...)).cuda() # Shape: (H, W, 3)
img2 = torch.from_numpy(np.array(...)).cuda()
# Initialize optical flow engine
flow_engine = TorchNVOpticalFlow(
    width=img1.shape[1],
    height=img1.shape[0],
    gpu_id=0,
    preset="medium",  # "slow", "medium", or "fast"
    grid_size=1,  # 1, 2, or 4
)
# Compute optical flow
flow = flow_engine.compute_flow(img1, img2, upsample=True)
# Flow is a (H, W, 2) tensor where flow[..., 0] is x-displacement, flow[..., 1] is y-displacement
print(f"Flow shape: {flow.shape}")
# Visualize flow as RGB image
flow_rgb = flow_to_color(flow.cpu().numpy())
# Save flow to .flo file
write_flo("output_flow.flo", flow)
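To give an idea of what a flow-to-color conversion does, here is a minimal NumPy sketch that maps flow direction to hue and flow magnitude to brightness. It is not the package's `flow_to_color` implementation, whose exact color scheme is not described here; `flow_to_hsv_rgb` is a hypothetical name used for illustration.

```python
import numpy as np


def flow_to_hsv_rgb(flow):
    """Map an (H, W, 2) flow field to an (H, W, 3) uint8 RGB image.

    Hue encodes direction, value encodes magnitude (normalized to the
    largest displacement in the field), saturation is fixed at 1.
    """
    flow = np.asarray(flow, dtype=np.float64)
    u, v = flow[..., 0], flow[..., 1]
    mag = np.hypot(u, v)
    hue = (np.arctan2(v, u) + np.pi) / (2.0 * np.pi)  # in [0, 1]
    val = mag / max(mag.max(), 1e-8)                  # in [0, 1]

    # Standard HSV -> RGB conversion with S = 1:
    # f(n) = V - V * clip(min(k, 4 - k), 0, 1), with k = (n + 6 * H) mod 6
    def channel(n):
        k = (n + hue * 6.0) % 6.0
        return val - val * np.clip(np.minimum(k, 4.0 - k), 0.0, 1.0)

    rgb = np.stack([channel(5), channel(3), channel(1)], axis=-1)
    return np.round(rgb * 255.0).astype(np.uint8)


# Two pixels: no motion, and one pixel of motion to the right.
demo_flow = np.array([[[0.0, 0.0], [1.0, 0.0]]], dtype=np.float32)
demo_rgb = flow_to_hsv_rgb(demo_flow)
print(demo_rgb.shape, demo_rgb.dtype)
```

Pixels with no motion come out black, which makes static regions easy to spot.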
API Reference
Core Class: TorchNVOpticalFlow
Constructor
TorchNVOpticalFlow(
    width: int,
    height: int,
    gpu_id: int = 0,
    preset: str = "medium",
    grid_size: int = 1,
    bidirectional: bool = False
)
Parameters:
- width: Width of input images in pixels
- height: Height of input images in pixels
- gpu_id: CUDA device ID (default: 0)
- preset: Speed/quality preset. Options:
  - "slow": Highest quality, slowest
  - "medium": Balanced (recommended)
  - "fast": Fastest, lower quality
- grid_size: Output grid size. Options: 1, 2, or 4
  - 1: Full resolution output (default)
  - 2/4: Downsampled output (faster, use with upsample=True to restore resolution)
- bidirectional: Enable bidirectional flow computation (forward and backward)
Methods
compute_flow(input, reference, upsample=True)
Compute forward optical flow between two frames.
Parameters:
- input: First frame as CUDA tensor of shape (H, W, 4), dtype uint8, RGBA format
- reference: Second frame as CUDA tensor of shape (H, W, 4), dtype uint8, RGBA format
- upsample: If True and grid_size > 1, upsample flow to full resolution (default: True)
Returns:
- torch.Tensor: Optical flow of shape (H, W, 2), dtype float32
  - flow[..., 0]: Horizontal displacement (x)
  - flow[..., 1]: Vertical displacement (y)
Example:
flow = flow_engine.compute_flow(img1_rgba, img2_rgba, upsample=True)
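The parameter list above asks for (H, W, 4) RGBA input, while images are often loaded as (H, W, 3) RGB. Whether compute_flow also accepts 3-channel input is not stated here, so the following sketch shows one way to append an opaque alpha channel before uploading to the GPU. `rgb_to_rgba` is a hypothetical helper, not part of this package.

```python
import numpy as np


def rgb_to_rgba(rgb, alpha=255):
    """Append a constant alpha channel to an (H, W, 3) uint8 image."""
    rgb = np.asarray(rgb)
    if rgb.ndim != 3 or rgb.shape[-1] != 3 or rgb.dtype != np.uint8:
        raise ValueError("expected an (H, W, 3) uint8 array")
    alpha_plane = np.full(rgb.shape[:2] + (1,), alpha, dtype=np.uint8)
    return np.concatenate([rgb, alpha_plane], axis=-1)


frame_rgb = np.zeros((2, 3, 3), dtype=np.uint8)
frame_rgba = rgb_to_rgba(frame_rgb)
print(frame_rgba.shape)

# The result can then be moved to the GPU, for example:
#   img1_rgba = torch.from_numpy(frame_rgba).cuda()
```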
compute_flow_bidirectional(input, reference, upsample=True)
Compute both forward and backward optical flow.
Parameters:
- input: First frame as CUDA tensor of shape (H, W, 4), dtype uint8, RGBA format
- reference: Second frame as CUDA tensor of shape (H, W, 4), dtype uint8, RGBA format
- upsample: If True and grid_size > 1, upsample flows to full resolution (default: True)
Returns:
- Tuple[torch.Tensor, torch.Tensor]: Forward and backward flows, each of shape (H, W, 2)
Example:
forward_flow, backward_flow = flow_engine.compute_flow_bidirectional(
    img1_rgba, img2_rgba, upsample=True
)
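A typical use of a forward/backward flow pair is a consistency check that flags occluded or unreliable pixels. The sketch below is a generic NumPy version of that idea, not something this package provides: `fb_consistency_mask` and the `tol` threshold are hypothetical, and it assumes full-resolution (H, W, 2) flows already copied to the CPU, e.g. with `forward_flow.cpu().numpy()`.

```python
import numpy as np


def fb_consistency_mask(forward, backward, tol=1.0):
    """Return an (H, W) boolean mask, True where the flows agree.

    A pixel p is consistent when forward(p) + backward(p + forward(p))
    is close to zero. The backward flow is sampled with nearest-neighbour
    lookup, clamped to the image borders.
    """
    forward = np.asarray(forward, dtype=np.float32)
    backward = np.asarray(backward, dtype=np.float32)
    h, w = forward.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Landing position of every pixel of the first frame in the second frame.
    x2 = np.clip(np.rint(xs + forward[..., 0]).astype(int), 0, w - 1)
    y2 = np.clip(np.rint(ys + forward[..., 1]).astype(int), 0, h - 1)
    residual = forward + backward[y2, x2]
    return np.linalg.norm(residual, axis=-1) <= tol


# Toy data: everything moves one pixel to the right, and the backward
# flow undoes that exactly, so every pixel is consistent.
fwd = np.zeros((4, 4, 2), dtype=np.float32)
fwd[..., 0] = 1.0
bwd = -fwd
print(fb_consistency_mask(fwd, bwd).all())
```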
output_shape()
Get the output shape for the current configuration.
Returns:
- List[int]: Output shape as [height, width, 2]
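For intuition about how grid_size affects the output, here is a sketch of the expected shape. It assumes the output grid is the input size divided by grid_size and rounded up, which is how NVIDIA's SDK samples size their output buffers; this README does not confirm that the wrapper does the same. `expected_output_shape` is a hypothetical helper, and the value returned by output_shape() is authoritative.

```python
def expected_output_shape(height, width, grid_size, upsample=False):
    """Estimate the flow shape for a given input size and grid size.

    Assumption: without upsampling, each dimension is ceil(dim / grid_size).
    """
    if grid_size not in (1, 2, 4):
        raise ValueError("grid_size must be 1, 2, or 4")
    if upsample or grid_size == 1:
        return [height, width, 2]
    # -(-a // b) is integer ceiling division.
    return [-(-height // grid_size), -(-width // grid_size), 2]


print(expected_output_shape(1080, 1920, grid_size=4))
print(expected_output_shape(1080, 1920, grid_size=4, upsample=True))
```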
I/O Utilities (of.io)
read_flo(filepath)
Read optical flow from .flo file (Middlebury format).
Parameters:
- filepath: Path to .flo file (str or Path)
Returns:
- np.ndarray: Flow array of shape (H, W, 2), dtype float32
write_flo(filepath, flow)
Write optical flow to .flo file (Middlebury format).
Parameters:
- filepath: Output file path (str or Path)
- flow: Flow array of shape (H, W, 2) (numpy array or torch tensor)
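For readers unfamiliar with the Middlebury .flo layout, the following standalone sketch writes and reads it by hand: a float32 magic number (202021.25), int32 width, int32 height, then row-major interleaved (x, y) float32 values, all little-endian. The `_sketch` functions are hypothetical and only illustrate the format; use of.io.read_flo and of.io.write_flo in real code.

```python
import struct
import tempfile
from pathlib import Path

import numpy as np

FLO_MAGIC = 202021.25  # bytes spell "PIEH" as a little-endian float32


def write_flo_sketch(path, flow):
    """Write an (H, W, 2) flow field in Middlebury .flo layout."""
    flow = np.asarray(flow, dtype="<f4")
    height, width, _ = flow.shape
    with open(path, "wb") as f:
        f.write(struct.pack("<f", FLO_MAGIC))
        f.write(struct.pack("<ii", width, height))
        f.write(flow.tobytes())


def read_flo_sketch(path):
    """Read a Middlebury .flo file into an (H, W, 2) float32 array."""
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<f", f.read(4))
        if magic != FLO_MAGIC:
            raise ValueError("not a Middlebury .flo file")
        width, height = struct.unpack("<ii", f.read(8))
        data = np.frombuffer(f.read(width * height * 2 * 4), dtype="<f4")
    return data.reshape(height, width, 2).copy()


# Round trip through a temporary file.
original = np.arange(2 * 3 * 2, dtype=np.float32).reshape(2, 3, 2)
with tempfile.TemporaryDirectory() as tmp:
    flo_path = Path(tmp) / "demo.flo"
    write_flo_sketch(flo_path, original)
    file_size = flo_path.stat().st_size
    restored = read_flo_sketch(flo_path)
print(restored.shape, file_size)
```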
Examples
This repository includes several examples in the examples/ directory.
See examples/README.md for detailed documentation and usage instructions.