
Codec-aware video preprocessing for training and inference


codec-video-prep

Codec-aware video preprocessing for training and inference. Extracts codec-level bitcost information from H.264 / HEVC videos and turns it into patch-canvases ready for downstream vision models.

What it does

  • Patched FFmpeg decoder – Instruments the H.264 / HEVC decoder to export per-macroblock (H.264) or per-CTU (HEVC) bitcost maps during decoding.
  • Fast C++ extension (cv_reader_fast) – Decodes video with loop-filter / IDCT skipped and optionally returns bitcost data as NumPy arrays.
  • Readiness grouping – Groups frames by compressibility (bitcost) so that hard-to-compress, high-bitcost regions get more patches.
  • Top-K patch selection – Selects the most informative 2×2 patch blocks from each group and packs them into JPG/PNG canvases.
  • One-command pipeline – From a raw video to a folder of canvases + metadata in a single call.
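
The grouping and Top-K steps above can be pictured with a small NumPy sketch. This is not the package's implementation: the function names split_patch_budget and top_k_blocks, the proportional allocation rule, and the block-sum score are all assumptions made for illustration only.

```python
import numpy as np


def split_patch_budget(group_bitcost, total_patches):
    """Hypothetical rule: share a patch budget in proportion to group bitcost."""
    cost = np.asarray(group_bitcost, dtype=np.float64)
    if cost.sum() <= 0:
        # No bitcost signal: fall back to an even split.
        cost = np.ones_like(cost)
    share = cost / cost.sum() * total_patches
    alloc = np.floor(share).astype(int)
    # Hand the patches lost to rounding to the largest fractional remainders.
    leftover = total_patches - int(alloc.sum())
    order = np.argsort(-(share - alloc), kind="stable")
    alloc[order[:leftover]] += 1
    return alloc


def top_k_blocks(bitcost_map, k):
    """Hypothetical scorer: rank non-overlapping 2x2 blocks by summed bitcost."""
    m = np.asarray(bitcost_map, dtype=np.float64)
    h, w = (m.shape[0] // 2) * 2, (m.shape[1] // 2) * 2  # drop an odd edge row/column
    blocks = m[:h, :w].reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))
    flat = np.argsort(-blocks, axis=None, kind="stable")[:k]
    rows, cols = np.unravel_index(flat, blocks.shape)
    # Return (row, col) of each block's top-left cell in the original grid.
    return [(int(r) * 2, int(c) * 2) for r, c in zip(rows, cols)]
```

Groups with a higher summed bitcost receive a larger share of the budget, and within a bitcost map the blocks with the most bits are picked first. The real scorers and selectors live under codec_selector/plugins/ and may use different criteria.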

Install

From wheel (recommended)

python -m pip install codec_video_prep-*.whl

Verify the installation:

codec-video-prep-doctor

Build from source

  1. Build the patched FFmpeg shared libraries:
bash scripts/build_patched_ffmpeg.sh
  2. Build and install the Python package:
python -m pip install -e .

Quick start (CLI)

codec-video-prep \
  --video /path/to/video.mp4 \
  --out_dir ./preinfer_out \
  --num_sampled_frames 1024 \
  --group_size 32 \
  --images_per_group 4 \
  --max_pixels 153664

The output directory will contain:

  • canvas_*.jpg – Packed patch canvases
  • meta.json – Full metadata, timing, and group info
  • frame_ids.npy – Sampled frame indices
  • src_patch_position.npy – Patch source positions
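
A minimal sketch of reading these files back, assuming only the file names listed above. The helper name load_preinfer_outputs is hypothetical, and because the keys of meta.json and the array shapes are not documented here, the sketch loads them without interpreting them.

```python
import json
from pathlib import Path

import numpy as np


def load_preinfer_outputs(out_dir):
    """Load the files listed above from an output directory."""
    out = Path(out_dir)
    with open(out / "meta.json", encoding="utf-8") as fh:
        meta = json.load(fh)
    return {
        "meta": meta,
        "frame_ids": np.load(out / "frame_ids.npy"),
        "src_patch_position": np.load(out / "src_patch_position.npy"),
        # Sorted so canvases come back in a stable, name-based order.
        "canvases": sorted(out.glob("canvas_*.jpg")),
    }
```

Calling load_preinfer_outputs("./preinfer_out") after the CLI run above would return the parsed metadata, the two arrays, and the canvas paths.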

Python API

High-level one-shot call

from codec_video_prep import run_preinfer

result = run_preinfer(
    video="/path/to/video.mp4",
    out_dir="./preinfer_out",
    num_sampled_frames=1024,
    group_size=32,
    images_per_group=4,
    patch=14,
    max_pixels=153664,
    min_group_frames=8,
    max_group_frames=64,
    bitcost_grid="adaptive",
)

print(result.out_dir)       # output directory
print(result.meta_path)     # path to meta.json
print(result.timings)       # timing breakdown
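
As a rough sanity check on the numbers in this call, the sketch below works out the budget they would imply. It assumes (the source does not say so) that max_pixels is a per-canvas pixel budget, that patch is the side length of a square patch in pixels, and that every group of group_size frames yields images_per_group canvases. estimate_budget is a hypothetical helper, not part of the package, and the group count is only nominal because min_group_frames and max_group_frames suggest group sizes can vary.

```python
import math


def estimate_budget(num_sampled_frames, group_size, images_per_group,
                    patch, max_pixels):
    """Back-of-envelope counts under the assumptions stated above."""
    patches_per_canvas = max_pixels // (patch * patch)
    num_groups = math.ceil(num_sampled_frames / group_size)
    num_canvases = num_groups * images_per_group
    return {
        "patches_per_canvas": patches_per_canvas,
        "num_groups": num_groups,
        "num_canvases": num_canvases,
        "total_patches": patches_per_canvas * num_canvases,
    }


budget = estimate_budget(1024, 32, 4, patch=14, max_pixels=153664)
print(budget)
# 153664 = 392 * 392 = (28 * 14) ** 2, so 28 * 28 = 784 patches fit per canvas.
```

Under these assumptions the example settings give 32 groups, 128 canvases, and 784 patches per canvas.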

Low-level fast decoder

from codec_video_prep import cv_reader_fast

# Decode all frames with bitcost export
frames = cv_reader_fast.read_video_fast(
    path="/path/to/video.mp4",
    thread_count=16,
    export_bitcost=1,
    thread_type="auto",
)

# Decode selected frames only
selected = cv_reader_fast.read_video_fast_selected(
    path="/path/to/video.mp4",
    frame_ids=[0, 30, 60, 90],
    thread_count=16,
    export_bitcost=1,
)

Each frame dict contains:

Key             Description
--------------  -------------------------------------------------------
frame_idx       Frame index
pict_type       'I', 'P' or 'B'
width / height  Frame resolution
codec_name      Decoder name (h264, hevc, …)
bitcost         Dict with MB/CTU bitcost arrays (when export_bitcost=1)
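
A short sketch of consuming these records, using only the keys listed above. It assumes the decoder returns an iterable of dicts and that the bitcost key is absent or None when bitcost export is off. The inner layout of the bitcost dict is not documented here, so the sketch does not index into it. summarize_frames is a hypothetical helper.

```python
from collections import Counter


def summarize_frames(frames):
    """Count picture types and collect the indices of frames carrying bitcost."""
    pict_counts = Counter(f["pict_type"] for f in frames)
    with_bitcost = [f["frame_idx"] for f in frames if f.get("bitcost") is not None]
    return dict(pict_counts), with_bitcost


# Stand-in records using only the documented keys; real ones would come
# from cv_reader_fast.read_video_fast(...).
demo = [
    {"frame_idx": 0, "pict_type": "I", "bitcost": {}},
    {"frame_idx": 1, "pict_type": "P", "bitcost": {}},
    {"frame_idx": 2, "pict_type": "B"},
]
print(summarize_frames(demo))
```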

Project structure

├── src/codec_video_prep/    # Python package
│   ├── api.py                        # run_preinfer() entrypoint
│   ├── cli.py                        # codec-video-prep CLI
│   ├── doctor.py                     # codec-video-prep-doctor diagnostics
│   ├── config.py                     # PreinferConfig
│   └── libs/                         # Bundled FFmpeg .so files
├── codec_selector/                   # Frame sampling / grouping / patch selection
│   ├── core/                         # Pipeline, probe, decode, config
│   ├── plugins/                      # Samplers, scorers, groupers, selectors, packers
│   └── codec_patch_gop/              # Legacy GOP-based utilities
├── native/                           # C++ Python extension
│   └── cv_reader_fast.cpp            # Fast decoder with bitcost export
├── ffmpeg_patch/                     # FFmpeg source patches
│   ├── h264_*.c                      # H.264 bitcost instrumentation
│   ├── hevc_*.c                      # HEVC bitcost instrumentation
│   └── patch.sh                      # Patch application script
├── scripts/
│   ├── build_patched_ffmpeg.sh       # Build patched FFmpeg libs
│   └── build_manylinux_wheel.sh      # Build manylinux wheel
├── setup.py                          # setuptools build (C++ extension + FFmpeg libs)
└── pyproject.toml                    # PEP 517 project metadata

Build a manylinux wheel

PIP_INDEX_URL=https://mirrors.aliyun.com/pypi/simple \
PIP_TRUSTED_HOST=mirrors.aliyun.com \
bash scripts/build_manylinux_wheel.sh

Output:

wheelhouse/codec_video_prep-0.1.0-cp310-cp310-manylinux2014_x86_64.whl

Install and check:

python -m pip install wheelhouse/codec_video_prep-*.whl
codec-video-prep-doctor

To target a different Python ABI, set PY_TAG:

PY_TAG=cp311-cp311 bash scripts/build_manylinux_wheel.sh

Diagnostics

codec-video-prep-doctor checks:

  • cv_reader_fast C++ extension can be imported
  • Bundled FFmpeg shared libraries are present
  • Threading defaults (auto thread type, 16 threads)
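
For environments where the doctor command is not on PATH, the first two checks can be approximated by hand. This is a sketch, not the code in doctor.py: the helper names are hypothetical, and the libs/ location and the *.so* file pattern are assumptions based on the project structure shown earlier.

```python
import importlib
from pathlib import Path


def can_import(module_name):
    """Return True if the module imports cleanly, False otherwise."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


def bundled_libs(package_name, subdir="libs", pattern="*.so*"):
    """List shared libraries under <package>/<subdir>; empty if not installed."""
    try:
        pkg = importlib.import_module(package_name)
    except ImportError:
        return []
    pkg_file = getattr(pkg, "__file__", None)
    if pkg_file is None:
        return []
    return sorted((Path(pkg_file).parent / subdir).glob(pattern))


if __name__ == "__main__":
    print("extension importable:", can_import("codec_video_prep.cv_reader_fast"))
    print("bundled libraries:", bundled_libs("codec_video_prep"))
```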

Backward compatibility

The old import path and CLI names are kept as aliases:

  • compressed_video_preinfer (import path)
  • cv-preinfer (CLI)
  • cv-preinfer-doctor (CLI)

Requirements

  • Python >= 3.10
  • numpy >= 1.23, < 2.0
  • opencv-python-headless < 4.12
  • Pillow
  • Patched FFmpeg shared libraries (bundled in the wheel; when building from source, build them with scripts/build_patched_ffmpeg.sh)
