Codec-aware video preprocessing for training and inference

Project description

codec-video-prep

Codec-aware video preprocessing for training and inference. Extracts codec-level bitcost information from H.264 / HEVC / VP9 videos and turns it into patch-canvases ready for downstream vision models.

What it does

Patched FFmpeg decoder – Instruments the H.264, HEVC, and VP9 decoders to export bitcost maps during decoding (per macroblock for H.264, per CTU for HEVC).
Fast C++ extension (cv_reader_fast) – Decodes video with loop-filter / IDCT skipped and optionally returns bitcost data as NumPy arrays.
Readiness grouping – Groups frames by compressibility (bitcost) so that hard-to-compress, high-bitcost regions get more patches.
Top-K patch selection – Selects the most informative 2×2 patch blocks from each group and packs them into JPG/PNG canvases.
One-command pipeline – From a raw video to a folder of canvases + metadata in a single call.
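
To illustrate the Top-K patch selection step, here is a minimal sketch. It assumes a per-patch score grid (for example, bitcost resampled to the patch grid), scores non-overlapping 2×2 blocks by their sum, and keeps the K best. The function name `top_k_blocks`, the sum-based score, and the non-overlapping tiling are all assumptions for illustration, not the package's actual selector.

```python
from typing import List, Tuple


def top_k_blocks(scores: List[List[float]], k: int) -> List[Tuple[int, int, float]]:
    """Return the k highest-scoring non-overlapping 2x2 blocks.

    Each result is (row, col, score), where (row, col) is the top-left
    patch of the block. Hypothetical helper, not part of codec-video-prep.
    """
    rows = len(scores)
    cols = len(scores[0]) if rows else 0
    blocks = []
    # Step by 2 so blocks tile the grid without overlapping; a trailing
    # odd row or column is ignored.
    for r in range(0, rows - 1, 2):
        for c in range(0, cols - 1, 2):
            total = (scores[r][c] + scores[r][c + 1]
                     + scores[r + 1][c] + scores[r + 1][c + 1])
            blocks.append((r, c, total))
    # Highest score first; the sort is stable, so ties keep raster order.
    blocks.sort(key=lambda b: b[2], reverse=True)
    return blocks[:k]
```

The selected blocks would then be cropped from the source frames and packed into a canvas; that packing step is not shown here.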

Install

From wheel (recommended)

python -m pip install codec_video_prep-*.whl

Verify the installation:

codec-video-prep-doctor

Build from source

Build the patched FFmpeg shared libraries:
- Pixel-capable (recommended — supports both bitcost and BGR pixel export):
```
bash scripts/build_pixel_ffmpeg.sh
```
- Legacy skip-IDCT (faster bitcost-only scan, no pixel output):
```
bash scripts/build_patched_ffmpeg.sh
```
Build and install the Python package:

python -m pip install -e .

Quick start (CLI)

codec-video-prep \
  --video /path/to/video.mp4 \
  --out_dir ./preinfer_out \
  --num_sampled_frames 1024 \
  --group_size 32 \
  --images_per_group 4 \
  --max_pixels 153664

Output directory will contain:

canvas_*.jpg – Packed patch canvases
meta.json – Full metadata, timing, and group info
frame_ids.npy – Sampled frame indices
src_patch_position.npy – Patch source positions
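
A quick way to inspect such an output directory from Python is sketched below. It relies only on the file names listed above; the helper name `summarize_output_dir` is hypothetical, and the internal layout of `meta.json` and the `.npy` arrays is not assumed.

```python
import json
from pathlib import Path


def summarize_output_dir(out_dir):
    """Collect the canvases and metadata found in a pipeline output folder."""
    out = Path(out_dir)
    meta_path = out / "meta.json"
    meta = None
    if meta_path.exists():
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
    return {
        # Packed patch canvases, sorted by file name
        "canvases": sorted(p.name for p in out.glob("canvas_*.jpg")),
        # Parsed meta.json, or None if the file is missing
        "meta": meta,
        # Presence flags for the NumPy side files
        "has_frame_ids": (out / "frame_ids.npy").exists(),
        "has_patch_positions": (out / "src_patch_position.npy").exists(),
    }
```

The two `.npy` files can be read with `numpy.load`, for example `numpy.load("./preinfer_out/frame_ids.npy")`.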

Decode backends

Two decode backends are available:

| Backend | Description | Best for |
| --- | --- | --- |
| `ffmpeg_native` (default) | FFmpeg subprocess decode + `cv_reader_fast` bitcost scan | General use |
| `cv_reader_pixels` | Single-pass decode via `cv_reader_fast` that returns both bitcost and BGR pixels | Speed (~1.8–1.9× faster end-to-end than `ffmpeg_native`) |

Switch backend:

codec-video-prep --decode_backend cv_reader_pixels ...

Parallel segment decoding

For long videos with dense frame sampling, the bitcost-scan step dominates total time. The workload can be split into N decode segments that run in parallel worker processes (via `ProcessPoolExecutor`):

codec-video-prep \
  --decode_backend cv_reader_pixels \
  --parallel_segments 4 \
  --threads_per_segment 4 \
  --segment_guard_frames 30 \
  ...

| Parameter | Default | Description |
| --- | --- | --- |
| `--parallel_segments` | `0` (disabled) | Number of parallel segments. Set to `0` or `1` to use serial decoding. |
| `--threads_per_segment` | `4` | FFmpeg `thread_count` inside each worker process. |
| `--segment_guard_frames` | `30` | Extra frames decoded before/after each segment boundary to compensate for seek-to-keyframe inaccuracy. |

Note: Parallel segment decoding incurs process-spawn overhead. For short clips (up to a few thousand frames), serial decoding is usually faster; the benefit appears on long videos with dense sampling (e.g. 10k+ frames).
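
The idea behind segment splitting with guard frames can be sketched as follows. This is an illustrative planner only: the function name `plan_segments` and the even split by number of sampled frames are assumptions, and the package's internal partitioning may differ.

```python
from typing import List, Sequence, Tuple

# (seek_frame, end_frame, frame_ids) for one worker
Segment = Tuple[int, int, List[int]]


def plan_segments(frame_ids: Sequence[int], n_segments: int,
                  guard_frames: int = 30) -> List[Segment]:
    """Split sampled frame ids into contiguous chunks, one per worker.

    Each chunk gets a decode window widened by guard_frames on both sides,
    so a seek that lands on an earlier keyframe still covers every id.
    """
    ids = sorted(set(frame_ids))
    if not ids:
        return []
    n = max(1, min(n_segments, len(ids)))
    base, extra = divmod(len(ids), n)
    segments: List[Segment] = []
    start = 0
    for i in range(n):
        # The first `extra` chunks take one additional frame id
        size = base + (1 if i < extra else 0)
        chunk = ids[start:start + size]
        start += size
        seek = max(0, chunk[0] - guard_frames)  # never seek before frame 0
        end = chunk[-1] + guard_frames          # caller may clamp to video length
        segments.append((seek, end, chunk))
    return segments
```

Each tuple maps onto the `seek_frame`, `end_frame`, and `frame_ids` arguments of a per-segment decode call, and the tuples could be submitted to a `concurrent.futures.ProcessPoolExecutor` with one task per segment.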

Python API

High-level one-shot call

from codec_video_prep import run_preinfer

result = run_preinfer(
    video="/path/to/video.mp4",
    out_dir="./preinfer_out",
    num_sampled_frames=1024,
    group_size=32,
    images_per_group=4,
    patch=14,
    max_pixels=153664,
    min_group_frames=8,
    max_group_frames=64,
    bitcost_grid="adaptive",
    decode_backend="cv_reader_pixels",   # or "ffmpeg_native"
    parallel_segments=4,                  # 0 = serial
    threads_per_segment=4,
    segment_guard_frames=30,
)

print(result.out_dir)       # output directory
print(result.meta_path)     # path to meta.json
print(result.timings)       # timing breakdown
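
The grouping parameters (`group_size`, `min_group_frames`, `max_group_frames`) suggest variable-length groups driven by bitcost. One way such grouping could work is sketched below: a group is closed once its accumulated bitcost reaches a target, subject to minimum and maximum lengths. The function `group_by_bitcost`, the `target_cost` parameter, and the greedy rule are assumptions made for illustration; the package's actual grouper is not documented here.

```python
from typing import List, Sequence


def group_by_bitcost(costs: Sequence[float], target_cost: float,
                     min_frames: int = 8, max_frames: int = 64) -> List[List[int]]:
    """Greedily split frame indices into groups of roughly equal total bitcost.

    costs[i] is the total bitcost of sampled frame i (hypothetical input).
    """
    groups: List[List[int]] = []
    current: List[int] = []
    acc = 0.0
    for idx, cost in enumerate(costs):
        current.append(idx)
        acc += cost
        full = len(current) >= max_frames
        enough = acc >= target_cost and len(current) >= min_frames
        if full or enough:
            groups.append(current)
            current, acc = [], 0.0
    if current:
        # Trailing frames form a final, possibly short, group
        groups.append(current)
    return groups
```

Under this rule, expensive stretches of video produce shorter groups. With a fixed number of canvases per group, those stretches therefore receive more patches per frame.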

Low-level fast decoder

from codec_video_prep import cv_reader_fast

# Decode all frames with bitcost export
frames = cv_reader_fast.read_video_fast(
    path="/path/to/video.mp4",
    thread_count=16,
    export_bitcost=1,
    thread_type="auto",
)

# Decode selected frames only (bitcost + optional pixels)
selected = cv_reader_fast.read_video_fast_selected(
    path="/path/to/video.mp4",
    frame_ids=[0, 30, 60, 90],
    thread_count=16,
    export_bitcost=1,
    export_pixels=1,   # also return BGR pixels
    out_w=224,         # optional resize width
    out_h=224,         # optional resize height
)

# Segment seek + decode (used internally for parallel workers)
segment = cv_reader_fast.read_video_fast_selected_segment(
    path="/path/to/video.mp4",
    frame_ids=[30, 60, 90],
    seek_frame=0,       # seek target (decoder lands on the nearest keyframe at or before this)
    end_frame=120,      # stop after this frame index
    thread_count=4,
    export_bitcost=1,
    export_pixels=1,
    out_w=224,
    out_h=224,
)

Each frame dict contains:

| Key | Description |
| --- | --- |
| `frame_idx` | Frame index |
| `pict_type` | `'I'`, `'P'` or `'B'` |
| `width` / `height` | Frame resolution |
| `codec_name` | Decoder name (`h264`, `hevc`, `vp9`, …) |
| `bitcost` | Dict with MB/CTU bitcost arrays (when `export_bitcost=1`) |
| `pixels` | `(H, W, 3)` uint8 BGR array (when `export_pixels=1`) |
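
As a small example of consuming these frame dicts, the sketch below tallies picture types and resolutions. It uses only the keys listed above and does not assume anything about the inner layout of the `bitcost` dict; `summarize_frames` is a hypothetical helper.

```python
from collections import Counter


def summarize_frames(frames):
    """Summarize a list of frame dicts returned by the fast decoder."""
    pict_types = Counter(f["pict_type"] for f in frames)
    resolutions = {(f["width"], f["height"]) for f in frames}
    # `bitcost` is only present when export_bitcost=1 was requested
    with_bitcost = sum(1 for f in frames if f.get("bitcost") is not None)
    return {
        "count": len(frames),
        "pict_types": dict(pict_types),
        "resolutions": sorted(resolutions),
        "with_bitcost": with_bitcost,
    }
```

A result with more than one entry in `resolutions` would indicate a mid-stream resolution change, which is worth checking before packing patches.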

Project structure

├── src/codec_video_prep/    # Python package
│   ├── api.py                        # run_preinfer() entrypoint
│   ├── cli.py                        # codec-video-prep CLI
│   ├── doctor.py                     # codec-video-prep-doctor diagnostics
│   ├── config.py                     # PreinferConfig
│   └── libs/                         # Bundled FFmpeg .so files
├── codec_selector/                   # Frame sampling / grouping / patch selection
│   ├── core/                         # Pipeline, probe, decode, config
│   ├── plugins/                      # Samplers, scorers, groupers, selectors, packers
│   └── codec_patch_gop/              # Legacy GOP-based utilities
├── native/                           # C++ Python extension
│   └── cv_reader_fast.cpp            # Fast decoder with bitcost + pixel export, segment seek API
├── ffmpeg_patch/                     # FFmpeg source patches
│   ├── bitcost_only/                 # Pixel-capable patches (H.264 + HEVC + VP9, keeps full IDCT)
│   │   ├── h264_cabac.c / h264_cavlc.c
│   │   ├── hevcdec.c / hevcdec.h / hevc_refs.c
│   │   ├── vp9.c / vp9dec.h / vp9shared.h
│   │   └── h264_bitcost_only.patch
│   └── full_skip/                    # Legacy skip-IDCT patches (faster, no pixel output)
│       ├── h264_*.c
│       ├── hevc_*.c
│       └── patch.sh
├── scripts/
│   ├── build_patched_ffmpeg.sh       # Build legacy skip-IDCT FFmpeg libs
│   ├── build_pixel_ffmpeg.sh         # Build pixel-capable FFmpeg libs
│   └── build_manylinux_wheel.sh      # Build manylinux wheel
├── setup.py                          # setuptools build (C++ extension + FFmpeg libs)
└── pyproject.toml                    # PEP 517 project metadata

Build a manylinux wheel

PIP_INDEX_URL=https://mirrors.aliyun.com/pypi/simple \
PIP_TRUSTED_HOST=mirrors.aliyun.com \
bash scripts/build_manylinux_wheel.sh

Output (the file name reflects the package version and Python tag being built), for example:

wheelhouse/codec_video_prep-0.1.0-cp310-cp310-manylinux2014_x86_64.whl

Install and check:

python -m pip install wheelhouse/codec_video_prep-*.whl
codec-video-prep-doctor

To target a different Python ABI, set PY_TAG:

PY_TAG=cp311-cp311 bash scripts/build_manylinux_wheel.sh

Diagnostics

codec-video-prep-doctor checks:

cv_reader_fast C extension can be imported
Bundled FFmpeg shared libraries are present
Threading defaults (auto thread type, 16 threads)
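
The first two checks can be approximated by hand when the doctor command itself is unavailable. The sketch below is not the doctor's implementation; the helper names are hypothetical, and the location of the bundled libraries is taken from the project structure (`codec_video_prep/libs/`).

```python
import importlib.util
from pathlib import Path


def module_available(name: str) -> bool:
    """Return True if `name` can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the name is malformed
        return False


def list_shared_libs(libs_dir):
    """List shared-library files (e.g. libavcodec.so.60) in a directory."""
    d = Path(libs_dir)
    if not d.is_dir():
        return []
    return sorted(
        p.name for p in d.iterdir()
        if p.name.endswith(".so") or ".so." in p.name
    )
```

For example, `module_available("codec_video_prep")` reports whether the package is installed in the current environment.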

Backward compatibility

The old import path and CLI names are kept as aliases:

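compressed_video_preinfer – import path (alias for codec_video_prep)
cv-preinfer – CLI (alias for codec-video-prep)
cv-preinfer-doctor – CLI (alias for codec-video-prep-doctor)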

Requirements

Python >= 3.10
numpy >= 1.23, < 2.0
opencv-python-headless < 4.12
Pillow
Patched FFmpeg shared libraries (bundled in the wheel; when building from source, build them with scripts/build_pixel_ffmpeg.sh or scripts/build_patched_ffmpeg.sh)

Project details

Release history

0.2.5 – May 28, 2026
0.2.4 – May 27, 2026
0.2.3 – May 22, 2026 (this version)
0.2.2 – May 21, 2026
0.2.1 – May 21, 2026
0.2.0 – May 21, 2026
0.1.1 – May 19, 2026
0.1.0 – May 19, 2026

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

codec_video_prep-0.2.3-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (29.8 MB)
Uploaded May 22, 2026; CPython 3.13; manylinux: glibc 2.17+ x86-64

codec_video_prep-0.2.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (29.8 MB)
Uploaded May 22, 2026; CPython 3.12; manylinux: glibc 2.17+ x86-64

codec_video_prep-0.2.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (29.8 MB)
Uploaded May 22, 2026; CPython 3.11; manylinux: glibc 2.17+ x86-64

codec_video_prep-0.2.3-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (29.8 MB)
Uploaded May 22, 2026; CPython 3.10; manylinux: glibc 2.17+ x86-64

codec_video_prep-0.2.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (29.8 MB)
Uploaded May 22, 2026; CPython 3.9; manylinux: glibc 2.17+ x86-64

File details

Details for the file codec_video_prep-0.2.3-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.

File metadata

Download URL: codec_video_prep-0.2.3-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
Upload date: May 22, 2026
Size: 29.8 MB
Tags: CPython 3.13, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for codec_video_prep-0.2.3-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `94694ff70361b69d6166e1789cd64497221e164c8fcb0c056ee8d13f98ca3d87` |
| MD5 | `6e8f1b74a08a63a7741746b2623119d0` |
| BLAKE2b-256 | `238839eec96d8dc4b7947d9446974329895acc4e52ca3b1ffebf3e3a3aaac510` |
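
A downloaded wheel can be checked against its published SHA256 digest with the standard library. The helper name `sha256_of` is illustrative; the file is read in chunks so a ~30 MB wheel is not loaded into memory at once.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Compare the returned string with the SHA256 digest listed for the matching wheel file; any mismatch means the download is corrupt or not the published artifact.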

File details

Details for the file codec_video_prep-0.2.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.

File metadata

Download URL: codec_video_prep-0.2.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
Upload date: May 22, 2026
Size: 29.8 MB
Tags: CPython 3.12, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for codec_video_prep-0.2.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `82d61a75d8ff5ab4788c3d43f981ff415c2f4e96208ef82769dccf1e35f1a530` |
| MD5 | `07403ce835c5004ceeca8b9cad977747` |
| BLAKE2b-256 | `7cf59fe35278f91afc2f662806042048ad1c1e687bf1b3f8a2e1be347acbfd27` |

File details

Details for the file codec_video_prep-0.2.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.

File metadata

Download URL: codec_video_prep-0.2.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
Upload date: May 22, 2026
Size: 29.8 MB
Tags: CPython 3.11, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for codec_video_prep-0.2.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `6632d267ce49c46c206503cd7a5e2bcfbb61d32431205d2a5270f01954b59b6f` |
| MD5 | `f6e1071ee9f86d1ed3b41427a52653ff` |
| BLAKE2b-256 | `abdeb19991eb078a4c46af3898d7b7339491eb0aeeb79d7cf3538946da6303ca` |

File details

Details for the file codec_video_prep-0.2.3-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.

File metadata

Download URL: codec_video_prep-0.2.3-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
Upload date: May 22, 2026
Size: 29.8 MB
Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for codec_video_prep-0.2.3-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `46859fa3dfde5567be84164085621f457ee0cebd4a26260804b4de7340d56d26` |
| MD5 | `3451dc838e6285394682c4e5c1b7f6fb` |
| BLAKE2b-256 | `c475050d8342d739850d35692375426f3f7fa0e04ee2ef7d5d56c2a7922222e5` |

File details

Details for the file codec_video_prep-0.2.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.

File metadata

Download URL: codec_video_prep-0.2.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
Upload date: May 22, 2026
Size: 29.8 MB
Tags: CPython 3.9, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for codec_video_prep-0.2.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `6d466a0718d7d80bac4e2b280e8b6e77fd02be7a09c9ef7a698a9accf0ba034f` |
| MD5 | `4ce776d0a2786dce95ec2c40c571003a` |
| BLAKE2b-256 | `b117816ba6cf297448afc347b46653f5b90bc951a208fb7e68d3a990d113db4e` |
