Pydantic media reference for images and video frames with lazy loading and optimized batch decoding

MediaRef

The portable frame-level media reference primitive — container-agnostic, fps-free, RFC-based.

(uri, pts_ns) is the entire schema. URIs follow RFC 3986 (with RFC 2397 for embedded data); pts_ns is an int64 nanosecond presentation timestamp. The schema is frozen for the life of MediaRef Spec 1.x. Works in any container (Parquet, mcap, rosbag, HDF5) and any standard media format (JPEG, PNG, H.264, H.265, AV1).
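To make the two-field schema concrete, here is a minimal standard-library sketch of a (uri, pts_ns) pair, its JSON form, and an RFC 2397 data URI. It is an illustration only, not the library's implementation: the class `RefSketch`, the helper names, and the choice of `None` as the pts_ns value for still images are all assumptions made for this example.

```python
import base64
import json
from dataclasses import asdict, dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass(frozen=True)
class RefSketch:
    """Hypothetical stand-in for the (uri, pts_ns) pair; not the library's class."""

    uri: str
    pts_ns: Optional[int] = None  # assumed: None means "no timestamp" (still image)


def to_json(ref: RefSketch) -> str:
    # Compact separators give a short string that fits any string-based store.
    return json.dumps(asdict(ref), separators=(",", ":"))


def from_json(text: str) -> RefSketch:
    return RefSketch(**json.loads(text))


def png_data_uri(payload: bytes) -> str:
    # RFC 2397 layout: data:[<mediatype>][;base64],<data>
    return "data:image/png;base64," + base64.b64encode(payload).decode("ascii")


frame = RefSketch(uri="video.mp4", pts_ns=1_000_000_000)
restored = from_json(to_json(frame))

# Only the 8-byte PNG signature is embedded here, to keep the example short.
embedded = RefSketch(uri=png_data_uri(b"\x89PNG\r\n\x1a\n"))
scheme = urlsplit(embedded.uri).scheme
```

Because both fields are plain JSON scalars, the round trip is lossless, and the URI scheme alone tells a reader whether the bytes are embedded or stored elsewhere.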

Quick Start

from mediaref import MediaRef, DataURI, batch_decode
import numpy as np

# 1. Create references — local file, HTTP(S), cloud, or video frame.
ref = MediaRef(uri="image.png")
ref = MediaRef(uri="https://example.com/image.jpg")
ref = MediaRef(uri="s3://bucket/image.jpg")             # any fsspec scheme
ref = MediaRef(uri="video.mp4", pts_ns=1_000_000_000)   # frame at 1.0s

# 2. Load.
rgb = ref.to_ndarray()      # (H, W, 3) RGB
pil = ref.to_pil_image()

# 3. Embed bytes inside a MediaRef (self-contained reference).
ref = MediaRef(uri=DataURI.from_image(rgb, format="png"))

# 4. Batch-decode many frames from one video — opens the container once.
refs = [MediaRef(uri="video.mp4", pts_ns=int(i*1e9)) for i in range(10)]
frames = batch_decode(refs)

# 5. Serialize for storage in any string-based format.
json_str = ref.model_dump_json()   # '{"uri":"...","pts_ns":...}'

See API Reference for full details — DataURI, batch_decode, cloud URIs, HuggingFace datasets integration, lerobot interop, the mediaref CLI.
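The Quick Start builds pts_ns values with `int(i*1e9)`, which is exact for whole seconds. For fractional seconds, or for timestamps read from a container as integer ticks of a rational time base, exact rational arithmetic avoids float rounding surprises. The sketch below uses only the standard library; `seconds_to_pts_ns` and `ticks_to_pts_ns` are hypothetical helper names, not part of MediaRef.

```python
from fractions import Fraction

NS_PER_SECOND = 1_000_000_000


def seconds_to_pts_ns(seconds) -> int:
    """Convert seconds (int, float, decimal string, or Fraction) to integer nanoseconds."""
    return round(Fraction(seconds) * NS_PER_SECOND)


def ticks_to_pts_ns(pts: int, time_base: Fraction) -> int:
    """Convert a container timestamp (pts ticks, each worth time_base seconds) to nanoseconds."""
    return round(Fraction(pts) * time_base * NS_PER_SECOND)


one_second = seconds_to_pts_ns(1)
from_string = seconds_to_pts_ns("1.001")
from_ticks = ticks_to_pts_ns(30030, Fraction(1, 30000))
```

Passing a decimal string or a `Fraction` keeps the value exact up to the final rounding step, so the same instant always maps to the same int64.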

Why MediaRef?

1. Separate heavy media from lightweight metadata. Store 1 TB of videos separately and keep only a few KB of references in your tables. MediaRef is decoupled, format-agnostic, and works wherever you can store a string. Already used in production: the D2E research project stores 1 TB+ of gameplay data referenced by MediaRef via OWAMcap.

2. Permanent schema built on RFCs. (uri, pts_ns) is frozen for the life of Spec 1.x. No proprietary formats, no breaking changes.

3. Sparse-frame batch decoding. When loading many frames from a single video, batch_decode() opens the container once and seeks monotonically — 4.9× faster decoding throughput and 2.2× better I/O efficiency vs per-frame decoding on a sparse-frame ML dataloader workload. Methodology: D2E paper Section 3 / Appendix A.
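The "open once, seek monotonically" idea in point 3 can be sketched as a planning step: group references by URI, then sort each group by timestamp while remembering the caller's original order. This is an illustration of the general technique under stated assumptions, not the internals of batch_decode(); `plan_batches` and the tuple-based `Ref` type are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

Ref = Tuple[str, Optional[int]]  # (uri, pts_ns), mirroring the schema


def plan_batches(refs: Sequence[Ref]) -> Dict[str, List[Tuple[int, int]]]:
    """Group video refs by URI and sort each group by pts_ns.

    Returns {uri: [(pts_ns, original_index), ...]}. A decoder can then open
    each container once, walk the timestamps in increasing order, and write
    every decoded frame back to its original position.
    """
    plan = defaultdict(list)
    for index, (uri, pts_ns) in enumerate(refs):
        if pts_ns is not None:  # still images have no timestamp and are skipped
            plan[uri].append((pts_ns, index))
    for entries in plan.values():
        entries.sort()
    return dict(plan)


refs = [
    ("a.mp4", 3_000_000_000),
    ("b.mp4", 1_000_000_000),
    ("a.mp4", 1_000_000_000),
    ("image.png", None),
]
plan = plan_batches(refs)
```

Keeping the original index next to each timestamp is what lets the result come back in request order even though decoding happens in timestamp order.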

[Figure: Decoding benchmark]

Installation

pip install mediaref                  # core: image loading + cloud-storage URIs (fsspec)
pip install 'mediaref[video]'         # + PyAV for video frame decoding
pip install 'mediaref[hf]'            # + HuggingFace datasets feature registration
pip install 'mediaref[video,hf]'      # all extras

For uv: uv add 'mediaref[video,hf]'. MediaRef follows semantic versioning; the wire schema (uri, pts_ns) is frozen for the life of Spec 1.x.

Optional TorchCodec backend. batch_decode(refs, decoder="torchcodec") uses TorchCodec for CUDA-accelerated decoding. TorchCodec expects specific FFmpeg shared libraries, and these may not match the copies bundled with PyAV. If you see libavcodec.so.NN: cannot open shared object file after pip install torchcodec, repair the install with patch-torchcodec, which patches torchcodec's RPATH to point at PyAV's bundled FFmpeg:

pip install patch-torchcodec && patch-torchcodec
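Before choosing a decoder, it can help to check which optional dependencies are importable in the current environment. The sketch below uses only `importlib`. The mapping from extras to import names is an assumption for this example: PyAV is imported as `av` and HuggingFace datasets as `datasets`, while `available_extras` is a hypothetical helper, not a MediaRef function.

```python
from importlib.util import find_spec
from typing import Dict

# Assumed mapping: label -> top-level module that should be importable.
OPTIONAL_MODULES = {
    "video": "av",             # PyAV
    "hf": "datasets",          # HuggingFace datasets
    "torchcodec": "torchcodec",
}


def available_extras() -> Dict[str, bool]:
    """Report which optional modules can be found, without importing them."""
    return {
        label: find_spec(module) is not None
        for label, module in OPTIONAL_MODULES.items()
    }


status = available_extras()
```

Note that `find_spec` only confirms a module can be located. It does not load shared libraries, so a TorchCodec install with mismatched FFmpeg libraries would still show up as available here.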

Documentation

API Reference — full API: MediaRef, DataURI, batch_decode, cloud URIs, HuggingFace integration, lerobot interop, the CLI.
MediaRef Specification 1.0 — wire format, URI grammar, pts_ns semantics, conformance criteria.
Comparisons — how MediaRef relates to datasets.Video and lerobot's VideoFrame.
Playback Semantics — how frame selection works at specific timestamps.

Examples

ROS bag conversion — convert ROS1/ROS2 bags with CompressedImage topics to MediaRef-referenced video, recovering 70–90% storage via inter-frame compression. Works without a ROS install (uses rosbags).

Datasets shipped with MediaRef

These are projects from the author's own work that use MediaRef on the storage path. External adopters welcome — open a PR to add yours.

Dataset	Domain	Scale
open-world-agents/D2E-Original	Game agents (29 PC games)	273.4 hours, 1.83 TB
open-world-agents/D2E-480p	Game agents (downsampled)	—
maum-ai/CostNav-Teleop-Dataset	Delivery-robot navigation / teleop	—

Tagging a HuggingFace dataset with mediaref makes it discoverable at huggingface.co/datasets?other=mediaref.

Citation

If you reference MediaRef in writing, the CITATION.cff file at the repo root has the canonical metadata. BibTeX:

@software{mediaref,
  author  = {Choi, Suhwan},
  title   = {MediaRef: a portable frame-level media reference primitive},
  version = {1.0.0},
  year    = {2026},
  doi     = {10.5281/zenodo.19892316},
  url     = {https://github.com/open-world-agents/MediaRef}
}

The DOI above is the Zenodo concept DOI, which always resolves to the latest published release. To cite v1.0.0 specifically, use 10.5281/zenodo.19892317.

Acknowledgments

The video decoder interface design references TorchCodec's API design.

License

MediaRef is released under the MIT License.

Release history

Version	Release date
1.1.1 (this version)	Jun 11, 2026
1.1.0	May 6, 2026
1.0.0	Apr 29, 2026
0.5.3	Mar 14, 2026
0.5.2	Feb 20, 2026
0.5.1	Feb 20, 2026
0.5.0	Feb 6, 2026
0.4.4	Nov 25, 2025
0.4.3	Nov 17, 2025
0.4.2	Nov 17, 2025
0.4.1	Oct 30, 2025
0.4.0 (yanked)	Oct 30, 2025
0.3.1	Oct 29, 2025
0.3.0	Oct 29, 2025
0.2.0	Oct 28, 2025
0.1.0	Oct 28, 2025
