Tools for reading and writing videos, and loading them efficiently with PyTorch.
Project description
parallel-video-io
Tools for reading and writing videos and for efficient frame-level loading with PyTorch.
This repository provides small, focused utilities around video I/O and a PyTorch-friendly iterable dataset + dataloader that make it easy to stream frames from many videos or directories of image frames in parallel.
Key features
- Read frames from videos (random access or sequential) using imageio/ffmpeg.
- Write sequences of numpy frames to H.264 MP4 files with sane defaults.
- PyTorch-compatible
VideoCollectionDatasetandVideoCollectionDataLoaderthat provide a simple iterator that uses multiple processes to load data from different videos under the hood. SimpleVideoCollectionLoader: an even easier API that combines dataset and dataloader creation in one step.
Table of contents
Installation
Install from PyPI:
pip install parallel-video-io
To clone a copy and install in editable mode:
git clone git@github.com:sibocw/parallel-video-io.git
cd parallel-video-io
# with pip
pip install -e . --config-settings editable_mode=compat
# ... or with Poetry
poetry install
Make sure ffmpeg is available on your $PATH (required by imageio-ffmpeg).
Quick examples
These examples use NumPy arrays for frames in (height, width, channels) order and uint8 dtype.
Reading video metadata
from pvio.io import get_video_metadata, check_num_frames
# To get the number of frames in a video
n_frames = check_num_frames("example.mp4")
print(n_frames) # integer
# To get more information
# This function actually caches these information in a JSON file. To control whether you
# want to use caching, modify the `cache_metadata` and `use_cached_metadata` arguments.
meta = get_video_metadata("example.mp4")
print(meta) # dict containing the keys "n_frames", "frame_size", and "fps"
Reading video frames
from pvio.io import read_frames_from_video
# You can read a whole video
frames, fps = read_frames_from_video("example.mp4")
# ... or just some frames
frames, fps = read_frames_from_video("example.mp4", frame_indices=[0, 5])
Writing a video
import numpy as np
from pvio.io import write_frames_to_video
# Create dummy 32x32 RGB frames (H, W, C)
frames = [np.full((32, 32, 3), fill_value=i, dtype=np.uint8) for i in range(10)]
# Save them to file
# More complex video writing parameters are available - see the docstring for details
write_frames_to_video("example.mp4", frames, fps=25.0)
Notes: the writer verifies that all frames share the same (height, width). FFmpeg can automatically resize frames to meet codec alignment requirements; for deterministic results, use dimensions divisible by 16.
Using the PyTorch dataset and dataloader
The VideoCollectionDataset iterates frames either from video files or from directories containing individual image frames. Then, you can use VideoCollectionDataLoader to load frames in parallel. This can be very handy for inference pipelines of neural networks that independently process all frames in a video. TorchCodec is used under the hood.
from pvio.video import EncodedVideo # for "real" videos (e.g. MP4 files)
from pvio.video import ImageDirVideo # for directories containing individual images
from pvio.torch_tools import VideoCollectionDataset, VideoCollectionDataLoader
# Create Video objects for video files
video1 = EncodedVideo("path/to/video1.mp4")
video2 = EncodedVideo("path/to/video2.mp4")
ds = VideoCollectionDataset([video1, video2])
# ... or from directories containing individual frames as images
video3 = ImageDirVideo("path/to/frames_dir1")
# (hint: you can use a custom regular expression to control how frame IDs are parsed)
video4 = ImageDirVideo("path/to/frames_dir2", frame_id_regex=r"frame\D*(\d+)(?!\d)")
ds = VideoCollectionDataset([video3, video4])
# You can optionally provide a transform function that will be applied to each frame
# after loading (frames already in float tensor format, ranged [0, 1], in CHW format)
# (hint: these can also be from torchvision.transforms)
def my_transform(frame):
return frame * 2.0 # example: double pixel values
ds = VideoCollectionDataset([video1, video2], transform=my_transform)
# You can set a buffer_size parameter when creating EncodedVideo objects.
# This is the number of frames to decode at once (default 64).
# Larger buffer size = faster loading at the cost of memory usage.
video_with_buffer = EncodedVideo("path/to/video.mp4", buffer_size=128)
ds = VideoCollectionDataset([video_with_buffer])
# Wrap dataset in a DataLoader
# (you can supply other torch.utils.data.DataLoader keyword arguments if you wish)
loader = VideoCollectionDataLoader(ds, batch_size=8, num_workers=4)
# Now you can iterate over the entire dataset in batches through a single iterator
# Behind the scenes, frames are distributed across workers for efficient loading
for batch in loader:
frames = batch["frames"] # torch.Tensor: B x C x H x W
video_indices = batch["video_indices"] # list of int (video indices)
frame_indices = batch["frame_indices"] # list of int
Using SimpleVideoCollectionLoader
If you don't mind breaking the standard Dataset + DataLoader pattern with torch.utils.data, you can use SimpleVideoCollectionLoader, which combines dataset and dataloader creation. This dataloader can also automatically create the appropriate Video objects from paths:
from pvio.torch_tools import SimpleVideoCollectionLoader
# Video specification can be mixed: path to real videos, path to directories of images,
# and pre-created Video objects are all allowed.
videos = ["path/to/video1.mp4", "path/to/dir1/", EncodedVideo("path/to/video2.mp4")]
# Supply all Video backend parameters, VideoCollectionDataset parameters, and DataLoader
# parameters in one call.
loader = SimpleVideoCollectionLoader(
videos,
batch_size=8,
num_workers=4,
transform=my_transform, # optional
buffer_size=64, # optional (for video files)
frame_id_regex=r"frame\D*(\d+)(?!\d)", # optional (for image directories)
)
# Iterate over the entire dataset in batches through a single iterator
for batch in loader:
frames = batch["frames"] # torch.Tensor: B x C x H x W
video_indices = batch["video_indices"] # list of int (video indices)
frame_indices = batch["frame_indices"] # list of int
Testing
The test suite uses pytest. Run it from the repository root:
pytest tests
The tests are organized into:
test_io.py- Tests for video I/O functionstest_torch_tools.py- Unit tests for VideoCollectionDatasettest_integration.py- Integration tests with parallel loadingtest_readme_examples.py- Tests to make sure the examples in this README work
There are a few tests that write small MP4 files using imageio/ffmpeg; ensure ffmpeg is available in the environment where tests run.
Notes & troubleshooting
- FFmpeg macroblock constraints: some ffmpeg builds require frame dimensions to be divisible by 16. If you see a warning about
macro_block_size=16and unexpected resizing, choose frame sizes divisible by 16 in production pipelines. - If you plan to decode many large videos, enabling metadata caching will speed up repeated indexing (the package writes a
.metadata.jsonfor each video under the same directory whenget_video_metadatais called). - If you have a non-standard data format, you can implement your own backend by creating a subclass of
pvio.torch_tools.Video.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file parallel_video_io-0.1.5.tar.gz.
File metadata
- Download URL: parallel_video_io-0.1.5.tar.gz
- Upload date:
- Size: 63.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.21 {"installer":{"name":"uv","version":"0.9.21","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2fc8ee577632330a049ffc7fca1c79e6955f88f7b5be0a676a47bac4249dc050
|
|
| MD5 |
a8d18fcd076f99c772ef44c9a9a34f67
|
|
| BLAKE2b-256 |
08c962b52bb7916b65fe9edd6e42e2d22a19061d8ad1f242679cc91116b19d45
|
File details
Details for the file parallel_video_io-0.1.5-py3-none-any.whl.
File metadata
- Download URL: parallel_video_io-0.1.5-py3-none-any.whl
- Upload date:
- Size: 17.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.21 {"installer":{"name":"uv","version":"0.9.21","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
3576a911cfc9e9ff71f1a8cdb31918a50c021195448bd2401e214e52b8a5b89f
|
|
| MD5 |
2ee446a416c2ec468ac9a3e500e8187b
|
|
| BLAKE2b-256 |
e0ba08ef3d6e677e2ba387f01c8c2e5afb8b5ab69b9841f0dcf36a8488249f73
|