Codec-aware video preprocessing for training and inference

codec-video-prep

Codec-aware video preprocessing for training and inference. It extracts codec-level bitcost information (how many bits the encoder spent on each block) from H.264 / HEVC videos and turns it into patch canvases ready for downstream vision models.

What it does

Patched FFmpeg decoder – Instruments the H.264 / HEVC decoder to export per-macroblock (H.264) or per-CTU (HEVC) bitcost maps during decoding.
Fast C++ extension (cv_reader_fast) – Decodes video with loop-filter / IDCT skipped and optionally returns bitcost data as NumPy arrays.
Readiness grouping – Groups frames by compressibility (bitcost) so that hard-to-compress, high-bitcost regions get more patches.
Top-K patch selection – Selects the most informative 2×2 patch blocks from each group and packs them into JPG/PNG canvases.
One-command pipeline – From a raw video to a folder of canvases + metadata in a single call.
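
As a rough illustration of the top-K idea in the list above, the sketch below scores non-overlapping 2×2 blocks of a per-patch bitcost map by their summed bitcost and keeps the K highest. It is a NumPy toy on synthetic data, not the package's actual selector: `block_scores` and `top_k_blocks` are hypothetical names, and the real pipeline may score and rank blocks differently.

```python
import numpy as np

def block_scores(bitcost: np.ndarray, block: int = 2) -> np.ndarray:
    """Sum a 2-D bitcost map over non-overlapping block x block tiles."""
    h, w = bitcost.shape
    h, w = h - h % block, w - w % block  # drop ragged edges
    tiles = bitcost[:h, :w].reshape(h // block, block, w // block, block)
    return tiles.sum(axis=(1, 3))

def top_k_blocks(bitcost: np.ndarray, k: int, block: int = 2) -> list[tuple[int, int]]:
    """Return (row, col) tile coordinates of the k highest-scoring tiles."""
    scores = block_scores(bitcost, block)
    k = min(k, scores.size)
    flat = np.argsort(scores, axis=None)[::-1][:k]  # descending by score
    rows, cols = np.unravel_index(flat, scores.shape)
    return [(int(r), int(c)) for r, c in zip(rows, cols)]

# Synthetic 4x6 map: one expensive tile and one moderately expensive tile.
demo = np.zeros((4, 6))
demo[2:4, 4:6] = 10.0  # tile (1, 2), total 40
demo[0:2, 0:2] = 3.0   # tile (0, 0), total 12
selected = top_k_blocks(demo, k=2)
```

Here `selected` lists the tile at row 1, column 2 first, followed by the tile at row 0, column 0.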

Install

From wheel (recommended)

python -m pip install codec_video_prep-*.whl

Verify the installation:

codec-video-prep-doctor

Build from source

Build the patched FFmpeg shared libraries:

bash scripts/build_patched_ffmpeg.sh

Build and install the Python package:

python -m pip install -e .

Quick start (CLI)

codec-video-prep \
  --video /path/to/video.mp4 \
  --out_dir ./preinfer_out \
  --num_sampled_frames 1024 \
  --group_size 32 \
  --images_per_group 4 \
  --max_pixels 153664

The output directory will contain:

canvas_*.jpg – Packed patch canvases
meta.json – Full metadata, timing, and group info
frame_ids.npy – Sampled frame indices
src_patch_position.npy – Patch source positions
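
The files listed above can be read back with the standard library and NumPy. The helper below is a minimal sketch; `load_preinfer_outputs` is a hypothetical name, not part of the package, and nothing is assumed about the array shapes or the keys inside meta.json.

```python
import json
from pathlib import Path

import numpy as np

def load_preinfer_outputs(out_dir):
    """Collect the canvases, metadata, and arrays from an output directory."""
    out = Path(out_dir)
    return {
        "canvases": sorted(out.glob("canvas_*.jpg")),
        "meta": json.loads((out / "meta.json").read_text(encoding="utf-8")),
        "frame_ids": np.load(out / "frame_ids.npy"),
        "src_patch_position": np.load(out / "src_patch_position.npy"),
    }
```

For example, `load_preinfer_outputs("./preinfer_out")["frame_ids"]` returns the sampled frame indices as a NumPy array.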

Python API

High-level one-shot call

from codec_video_prep import run_preinfer

result = run_preinfer(
    video="/path/to/video.mp4",
    out_dir="./preinfer_out",
    num_sampled_frames=1024,
    group_size=32,
    images_per_group=4,
    patch=14,
    max_pixels=153664,
    min_group_frames=8,
    max_group_frames=64,
    bitcost_grid="adaptive",
)

print(result.out_dir)       # output directory
print(result.meta_path)     # path to meta.json
print(result.timings)       # timing breakdown
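
The `patch` and `max_pixels` values in the call above are related by simple arithmetic: 153664 = 392 × 392, and a 392 × 392 area holds 28 × 28 = 784 patches of 14 × 14 pixels. The sketch below checks this. Treating `max_pixels` as a pixel budget that is divided into `patch` × `patch` tiles is an assumption made for illustration, and `patches_within_budget` is a hypothetical helper, not a package function.

```python
def patches_within_budget(max_pixels: int, patch: int) -> int:
    """Number of patch x patch tiles whose total area fits in max_pixels."""
    return max_pixels // (patch * patch)

n = patches_within_budget(153664, 14)  # 784, i.e. 28 * 28
```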

Low-level fast decoder

from codec_video_prep import cv_reader_fast

# Decode all frames with bitcost export
frames = cv_reader_fast.read_video_fast(
    path="/path/to/video.mp4",
    thread_count=16,
    export_bitcost=1,
    thread_type="auto",
)

# Decode selected frames only
selected = cv_reader_fast.read_video_fast_selected(
    path="/path/to/video.mp4",
    frame_ids=[0, 30, 60, 90],
    thread_count=16,
    export_bitcost=1,
)

Each frame dict contains:

| Key | Description |
| --- | --- |
| `frame_idx` | Frame index |
| `pict_type` | `'I'`, `'P'` or `'B'` |
| `width` / `height` | Frame resolution |
| `codec_name` | Decoder name (`h264`, `hevc`, …) |
| `bitcost` | Dict with MB/CTU bitcost arrays (when `export_bitcost=1`) |
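
A list of such frame dicts can be summarized with a few lines of Python. The sketch below counts picture types and totals whatever NumPy arrays appear under `bitcost`. The key names inside the `bitcost` dict are not documented here, so the code does not rely on them; `summarize_frames` is a hypothetical helper and the `"mb"` key in the demo data is made up.

```python
from collections import Counter

import numpy as np

def summarize_frames(frames):
    """Count picture types and total the bitcost arrays in each frame dict."""
    pict_types = Counter(f["pict_type"] for f in frames)
    totals = {}
    for f in frames:
        cost = f.get("bitcost") or {}  # absent when export_bitcost is off
        arrays = [v for v in cost.values() if isinstance(v, np.ndarray)]
        totals[f["frame_idx"]] = float(sum(a.sum() for a in arrays))
    return pict_types, totals

# Synthetic stand-ins for decoder output.
fake = [
    {"frame_idx": 0, "pict_type": "I", "bitcost": {"mb": np.full((2, 2), 5)}},
    {"frame_idx": 1, "pict_type": "P", "bitcost": {"mb": np.ones((2, 2))}},
    {"frame_idx": 2, "pict_type": "P"},
]
types, totals = summarize_frames(fake)
```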

Project structure

├── src/codec_video_prep/    # Python package
│   ├── api.py                        # run_preinfer() entrypoint
│   ├── cli.py                        # codec-video-prep CLI
│   ├── doctor.py                     # codec-video-prep-doctor diagnostics
│   ├── config.py                     # PreinferConfig
│   └── libs/                         # Bundled FFmpeg .so files
├── codec_selector/                   # Frame sampling / grouping / patch selection
│   ├── core/                         # Pipeline, probe, decode, config
│   ├── plugins/                      # Samplers, scorers, groupers, selectors, packers
│   └── codec_patch_gop/              # Legacy GOP-based utilities
├── native/                           # C++ Python extension
│   └── cv_reader_fast.cpp            # Fast decoder with bitcost export
├── ffmpeg_patch/                     # FFmpeg source patches
│   ├── h264_*.c                      # H.264 bitcost instrumentation
│   ├── hevc_*.c                      # HEVC bitcost instrumentation
│   └── patch.sh                      # Patch application script
├── scripts/
│   ├── build_patched_ffmpeg.sh       # Build patched FFmpeg libs
│   └── build_manylinux_wheel.sh      # Build manylinux wheel
├── setup.py                          # setuptools build (C++ extension + FFmpeg libs)
└── pyproject.toml                    # PEP 517 project metadata

Build a manylinux wheel

PIP_INDEX_URL=https://mirrors.aliyun.com/pypi/simple \
PIP_TRUSTED_HOST=mirrors.aliyun.com \
bash scripts/build_manylinux_wheel.sh

Output:

wheelhouse/codec_video_prep-0.1.0-cp310-cp310-manylinux2014_x86_64.whl

Install and check:

python -m pip install wheelhouse/codec_video_prep-*.whl
codec-video-prep-doctor

To target a different Python ABI, set PY_TAG:

PY_TAG=cp311-cp311 bash scripts/build_manylinux_wheel.sh

Diagnostics

codec-video-prep-doctor checks:

cv_reader_fast C++ extension can be imported
Bundled FFmpeg shared libraries are present
Threading defaults (auto thread type, 16 threads)
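
The first check can be approximated from Python when the doctor command is not on the PATH. The function below is a sketch of an import probe, not the doctor's actual implementation; `extension_importable` is a hypothetical name. It mirrors `from codec_video_prep import cv_reader_fast` by accepting either an attribute or a submodule.

```python
import importlib

def extension_importable(package="codec_video_prep", name="cv_reader_fast"):
    """Return True if `from package import name` would succeed."""
    try:
        module = importlib.import_module(package)
        if hasattr(module, name):
            return True
        importlib.import_module(f"{package}.{name}")
    except Exception:  # ImportError, or OSError from a missing shared library
        return False
    return True
```

A `False` result usually means the package is not installed in the active environment or a shared library failed to load.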

Backward compatibility

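The old import path and CLI names are kept as aliases:

compressed_video_preinfer – import path (alias of codec_video_prep)
cv-preinfer – CLI (alias of codec-video-prep)
cv-preinfer-doctor – CLI (alias of codec-video-prep-doctor)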
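
Code that has to run against both old and new installs can try the new import name first and fall back to the legacy one. The helper below is a generic sketch using only `importlib`; `import_first_available` is a hypothetical name, not something the package provides.

```python
import importlib

def import_first_available(*names):
    """Import and return the first module in `names` that is installed."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of these modules could be imported: {names}")

# Prefer the new name, fall back to the legacy alias:
# prep = import_first_available("codec_video_prep", "compressed_video_preinfer")
```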

Requirements

Python >= 3.10
numpy >= 1.23, < 2.0
opencv-python-headless < 4.12
Pillow
Patched FFmpeg shared libraries (bundled in the wheel; when building from source, built by scripts/build_patched_ffmpeg.sh)

Release history

0.2.5 – May 28, 2026
0.2.4 – May 27, 2026
0.2.3 – May 22, 2026
0.2.2 – May 21, 2026
0.2.1 – May 21, 2026
0.2.0 – May 21, 2026
0.1.1 – May 19, 2026
0.1.0 – May 19, 2026 (this version)

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

codec_video_prep-0.1.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (22.8 MB)

Uploaded May 19, 2026; CPython 3.10; manylinux: glibc 2.17+ x86-64

File details

Details for the file codec_video_prep-0.1.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.

File metadata

Download URL: codec_video_prep-0.1.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
Upload date: May 19, 2026
Size: 22.8 MB
Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for codec_video_prep-0.1.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `ad8a983ecc8573f55ab09bb0e2bbb5d10df0ed86b0f1e60b39aadece759b2358` |
| MD5 | `8498b9dcc276118d165c9e79d8ea7cff` |
| BLAKE2b-256 | `eb3b3225bb1429bb55e7bdf2b2041adafbb3677611274cb5a5f52fa70e4463e0` |
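
A downloaded wheel can be checked against the SHA256 digest listed above with Python's `hashlib`. This is a minimal sketch; `sha256_of` is a hypothetical helper, and `wheel_path` in the commented usage line stands for wherever the file was saved.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

EXPECTED = "ad8a983ecc8573f55ab09bb0e2bbb5d10df0ed86b0f1e60b39aadece759b2358"
# matches = sha256_of(wheel_path) == EXPECTED
```

Reading in chunks keeps memory use flat regardless of the file size.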

