Codec-aware video preprocessing for training and inference

codec-video-prep

Codec-aware video preprocessing for training and inference. Extracts codec-level bitcost information from H.264 / HEVC videos and turns it into patch-canvases ready for downstream vision models.

What it does

  • Patched FFmpeg decoder – Instruments the H.264 / HEVC decoder to export per-macroblock (H.264) or per-CTU (HEVC) bitcost maps during decoding.
  • Fast C++ extension (cv_reader_fast) – Decodes video with loop-filter / IDCT skipped and optionally returns bitcost data as NumPy arrays.
  • Readiness grouping – Groups frames by compressibility (bitcost) so that hard-to-compress, high-bitcost regions get more patches.
  • Top-K patch selection – Selects the most informative 2×2 patch blocks from each group and packs them into JPG/PNG canvases.
  • One-command pipeline – From a raw video to a folder of canvases + metadata in a single call.
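The selection step can be pictured with a small sketch. This is not the package's implementation: the function name `top_k_blocks`, the plain sum over each 2×2 block, and the ranking by total bitcost are assumptions made for illustration, and the input is a synthetic per-patch bitcost grid.

```python
import heapq


def top_k_blocks(bitcost, k):
    """Return (row, col) block coordinates of the k costliest 2x2 blocks."""
    rows, cols = len(bitcost), len(bitcost[0])
    scored = []
    # Walk the grid in 2x2 steps, ignoring a trailing odd row or column.
    for r in range(0, rows - 1, 2):
        for c in range(0, cols - 1, 2):
            total = (bitcost[r][c] + bitcost[r][c + 1]
                     + bitcost[r + 1][c] + bitcost[r + 1][c + 1])
            scored.append((total, (r // 2, c // 2)))
    # Keep the k blocks with the highest summed bitcost, highest first.
    best = heapq.nlargest(k, scored, key=lambda item: item[0])
    return [pos for _, pos in best]


# Synthetic 4x4 grid of per-patch bitcost values (hypothetical data).
grid = [
    [1, 1, 9, 9],
    [1, 1, 9, 9],
    [5, 5, 0, 0],
    [5, 5, 0, 0],
]
selected = top_k_blocks(grid, k=2)
print(selected)  # [(0, 1), (1, 0)]
```

The function only uses `grid[r][c]` indexing, so a 2-D NumPy array can be passed in place of the nested list.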

Install

From wheel (recommended)

python -m pip install codec_video_prep-*.whl

Verify the installation:

codec-video-prep-doctor

Build from source

  1. Build the patched FFmpeg shared libraries:
bash scripts/build_patched_ffmpeg.sh
  2. Build and install the Python package:
python -m pip install -e .

Quick start (CLI)

codec-video-prep \
  --video /path/to/video.mp4 \
  --out_dir ./preinfer_out \
  --num_sampled_frames 1024 \
  --group_size 32 \
  --images_per_group 4 \
  --max_pixels 153664

Output directory will contain:

  • canvas_*.jpg – Packed patch canvases
  • meta.json – Full metadata, timing, and group info
  • frame_ids.npy – Sampled frame indices
  • src_patch_position.npy – Patch source positions
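A downstream script might gather these files as follows. This is a minimal sketch that relies only on the file names listed above; the helper name `load_preinfer_outputs` is hypothetical, and the schema of meta.json is not assumed, so it is returned as a plain dict.

```python
import json
from pathlib import Path


def load_preinfer_outputs(out_dir):
    """Collect the files listed above from a pipeline output directory."""
    out = Path(out_dir)
    with open(out / "meta.json", encoding="utf-8") as fh:
        meta = json.load(fh)
    # Sort so canvases come back in a deterministic order.
    canvases = sorted(out.glob("canvas_*.jpg"))
    arrays = {
        name: out / f"{name}.npy"
        for name in ("frame_ids", "src_patch_position")
    }
    missing = [str(path) for path in arrays.values() if not path.exists()]
    if missing:
        raise FileNotFoundError(f"missing output files: {missing}")
    return {"meta": meta, "canvases": canvases, "arrays": arrays}
```

The paths under `arrays` can then be passed to `numpy.load`, for example `numpy.load(outputs["arrays"]["frame_ids"])`.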

Python API

High-level one-shot call

from codec_video_prep import run_preinfer

result = run_preinfer(
    video="/path/to/video.mp4",
    out_dir="./preinfer_out",
    num_sampled_frames=1024,
    group_size=32,
    images_per_group=4,
    patch=14,
    max_pixels=153664,
    min_group_frames=8,
    max_group_frames=64,
    bitcost_grid="adaptive",
)

print(result.out_dir)       # output directory
print(result.meta_path)     # path to meta.json
print(result.timings)       # timing breakdown
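The values `patch=14` and `max_pixels=153664` used above fit together arithmetically. The sketch below assumes, without confirmation from the source, that `max_pixels` is a per-canvas pixel budget and that `patch` is the side length of a square patch in pixels; `patch_budget` is a hypothetical helper, not part of the package.

```python
import math


def patch_budget(max_pixels: int, patch: int) -> dict:
    """Estimate how many patch x patch tiles fit inside a pixel budget."""
    n_patches = max_pixels // (patch * patch)
    # Side (in patches) of the largest square canvas within the budget.
    side = math.isqrt(n_patches)
    return {
        "patches": n_patches,
        "square_side_patches": side,
        "square_side_pixels": side * patch,
    }


budget = patch_budget(max_pixels=153664, patch=14)
print(budget)
# {'patches': 784, 'square_side_patches': 28, 'square_side_pixels': 392}
```

Under that reading, 153664 pixels corresponds exactly to a 392×392 canvas, or 28×28 patches of 14 pixels each.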

Low-level fast decoder

from codec_video_prep import cv_reader_fast

# Decode all frames with bitcost export
frames = cv_reader_fast.read_video_fast(
    path="/path/to/video.mp4",
    thread_count=16,
    export_bitcost=1,
    thread_type="auto",
)

# Decode selected frames only
selected = cv_reader_fast.read_video_fast_selected(
    path="/path/to/video.mp4",
    frame_ids=[0, 30, 60, 90],
    thread_count=16,
    export_bitcost=1,
)

Each frame dict contains:

Key             Description
frame_idx       Frame index
pict_type       'I', 'P' or 'B'
width / height  Frame resolution
codec_name      Decoder name (h264, hevc, …)
bitcost         Dict with MB/CTU bitcost arrays (when export_bitcost=1)
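As an illustration of consuming these dicts, the sketch below counts picture types and lists the frames that carry bitcost data. It runs on synthetic stand-in frames; the inner key `"mb"` is hypothetical, since the layout of the bitcost dict is not documented here, and treating a missing or `None` bitcost as "not exported" is likewise an assumption.

```python
from collections import Counter


def summarize_frames(frames):
    """Count picture types and list frames that carry bitcost data."""
    counts = Counter(frame["pict_type"] for frame in frames)
    # A missing, None, or empty bitcost entry is treated as "not exported".
    with_bitcost = [f["frame_idx"] for f in frames if f.get("bitcost")]
    return {"pict_types": dict(counts), "with_bitcost": with_bitcost}


# Synthetic stand-ins; real frame dicts would come from read_video_fast().
demo_frames = [
    {"frame_idx": 0, "pict_type": "I", "width": 64, "height": 64,
     "codec_name": "h264", "bitcost": {"mb": [[3, 1]]}},  # "mb" is hypothetical
    {"frame_idx": 1, "pict_type": "P", "width": 64, "height": 64,
     "codec_name": "h264", "bitcost": None},
    {"frame_idx": 2, "pict_type": "B", "width": 64, "height": 64,
     "codec_name": "h264", "bitcost": {"mb": [[0, 2]]}},
]
summary = summarize_frames(demo_frames)
print(summary)
# {'pict_types': {'I': 1, 'P': 1, 'B': 1}, 'with_bitcost': [0, 2]}
```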

Project structure

├── src/codec_video_prep/    # Python package
│   ├── api.py                        # run_preinfer() entrypoint
│   ├── cli.py                        # codec-video-prep CLI
│   ├── doctor.py                     # codec-video-prep-doctor diagnostics
│   ├── config.py                     # PreinferConfig
│   └── libs/                         # Bundled FFmpeg .so files
├── codec_selector/                   # Frame sampling / grouping / patch selection
│   ├── core/                         # Pipeline, probe, decode, config
│   ├── plugins/                      # Samplers, scorers, groupers, selectors, packers
│   └── codec_patch_gop/              # Legacy GOP-based utilities
├── native/                           # C++ Python extension
│   └── cv_reader_fast.cpp            # Fast decoder with bitcost export
├── ffmpeg_patch/                     # FFmpeg source patches
│   ├── h264_*.c                      # H.264 bitcost instrumentation
│   ├── hevc_*.c                      # HEVC bitcost instrumentation
│   └── patch.sh                      # Patch application script
├── scripts/
│   ├── build_patched_ffmpeg.sh       # Build patched FFmpeg libs
│   └── build_manylinux_wheel.sh      # Build manylinux wheel
├── setup.py                          # setuptools build (C++ extension + FFmpeg libs)
└── pyproject.toml                    # PEP 517 project metadata

Build a manylinux wheel

PIP_INDEX_URL=https://mirrors.aliyun.com/pypi/simple \
PIP_TRUSTED_HOST=mirrors.aliyun.com \
bash scripts/build_manylinux_wheel.sh

Output:

wheelhouse/codec_video_prep-0.1.0-cp310-cp310-manylinux2014_x86_64.whl

Install and check:

python -m pip install wheelhouse/codec_video_prep-*.whl
codec-video-prep-doctor

To target a different Python ABI, set PY_TAG:

PY_TAG=cp311-cp311 bash scripts/build_manylinux_wheel.sh

Diagnostics

codec-video-prep-doctor checks:

  • cv_reader_fast C++ extension can be imported
  • Bundled FFmpeg shared libraries are present
  • Threading defaults (auto thread type, 16 threads)
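When the doctor command is not at hand, the first check can be approximated from Python. This is a rough sketch, not what codec-video-prep-doctor runs internally; `module_available` is a hypothetical helper, and the dotted name `codec_video_prep.cv_reader_fast` is inferred from the import shown in the Python API section.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the import system can locate `name`."""
    try:
        # For dotted names, find_spec imports the parent package first.
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or has no usable spec.
        return False


print(module_available("codec_video_prep.cv_reader_fast"))
```

The call prints `True` only when the package and its compiled extension are installed in the active environment.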

Backward compatibility

The old import path and CLI names are kept as aliases:

  • compressed_video_preinfer
  • cv-preinfer
  • cv-preinfer-doctor
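Code that must run against both old and new installs can try the current import name first and fall back to the alias. The helper `import_first_available` below is a hypothetical convenience written for this example, not something the package provides.

```python
import importlib


def import_first_available(*names):
    """Import and return the first module in `names` that is installed."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of these modules could be imported: {names}")
```

A caller would then write `pkg = import_first_available("codec_video_prep", "compressed_video_preinfer")` and use `pkg.run_preinfer(...)` regardless of which name resolved.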

Requirements

  • Python ≥ 3.10
  • numpy >= 1.23, < 2.0
  • opencv-python-headless < 4.12
  • Pillow
  • Patched FFmpeg shared libraries (bundled in the wheel; for source installs, build them with scripts/build_patched_ffmpeg.sh)

Built Distributions

  • codec_video_prep-0.2.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (28.1 MB) – CPython 3.13, manylinux (glibc 2.17+), x86-64
  • codec_video_prep-0.2.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (28.1 MB) – CPython 3.12, manylinux (glibc 2.17+), x86-64
  • codec_video_prep-0.2.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (28.1 MB) – CPython 3.11, manylinux (glibc 2.17+), x86-64
  • codec_video_prep-0.2.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (28.1 MB) – CPython 3.10, manylinux (glibc 2.17+), x86-64
  • codec_video_prep-0.2.0-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (28.1 MB) – CPython 3.9, manylinux (glibc 2.17+), x86-64
