Zero-PyAV GPU H.264/HEVC encode (NVIDIA Video Codec SDK / NVENC) for pdum.rfb

habemus-papadum-nvenc (import pdum.nvenc)

Encodes GPU-resident NV12 frames to H.264/HEVC Annex B with NVIDIA's Video Codec SDK encoder, with no PyAV dependency and no host copy. It is the companion GPU encoder for pdum.rfb (PyPI: habemus-papadum-rfb) and a uv workspace member of this repo. Full assessment: docs/nvenc_sdk_evaluation.md.

Why it exists:

  1. Builds and runs on CPython 3.14. Upstream PyNvVideoCodec pins pybind11 2.10.0, which does not support 3.14, and ships no cp314 wheel or sdist; this package builds against pybind11 v3.0.4.
  2. GPU-resident, PyAV-free. Encodes any __cuda_array_interface__ tensor (CuPy / PyTorch / Numba) directly, sidestepping the PyAV-18 requirement entirely.
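
As a rough illustration of what "any __cuda_array_interface__ tensor" implies, the sketch below reads the protocol dictionary and checks that it describes a contiguous NV12 buffer. The helper name is_contiguous_nv12 and the FakeDeviceArray stand-in (with a made-up pointer) are hypothetical and exist only so the example runs without a GPU; whether the binding performs exactly these checks is an assumption.

```python
def is_contiguous_nv12(obj, width, height):
    """Check that obj's CUDA Array Interface describes a contiguous NV12 frame.

    Hypothetical helper for illustration; not part of pdum.nvenc.
    """
    cai = obj.__cuda_array_interface__
    shape_ok = tuple(cai["shape"]) == (height * 3 // 2, width)
    dtype_ok = cai["typestr"] == "|u1"  # unsigned 8-bit samples
    # In the protocol, strides of None (or absent) means C-contiguous.
    strides = cai.get("strides")
    contiguous = strides is None or tuple(strides) == (width, 1)
    return shape_ok and dtype_ok and contiguous


class FakeDeviceArray:
    """Stand-in for a CuPy/PyTorch/Numba array; the pointer is made up."""

    def __init__(self, shape, ptr=0x7F0000000000, strides=None):
        self.__cuda_array_interface__ = {
            "shape": shape,
            "typestr": "|u1",
            "data": (ptr, False),  # (device pointer, read_only flag)
            "strides": strides,
            "version": 3,
        }


frame = FakeDeviceArray((1080 * 3 // 2, 1920))
print(is_contiguous_nv12(frame, 1920, 1080))
```

With a real CuPy array, cp.empty((1080 * 3 // 2, 1920), dtype=cp.uint8) exposes the same dictionary, with a real device pointer in "data".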

What's ours vs NVIDIA's

src/cpp/nvenc_ext.cpp      OURS — the only hand-written C++; thin pybind11 binding +
                           all NVTX ranges. Wraps NvEncoderCuda; no NVIDIA edits.
src/pdum/nvenc/__init__.py OURS — Python surface + the ABI loader (picks 12.1/13.0).
CMakeLists.txt             OURS — pybind11 3.0.4 (the 3.14 fix), dual ABI, optional NVTX.
build-wheel.sh             OURS — self-contained wheel build (auditwheel).
third_party/               VERBATIM, UNMODIFIED NVIDIA SDK (MIT). See PROVENANCE.md.

The NVIDIA source under third_party/ is copied byte-for-byte from PyNvVideoCodec 2.1.0 with its MIT headers intact; we made zero edits to it.

Dual NVENC ABI

The wheel ships two extensions built from the same source — _nvenc_121 (NVENC SDK 12.1) and _nvenc_130 (13.0) — and pdum/nvenc/__init__.py loads whichever the host driver supports (newest first, via a cheap NvEncodeAPIGetMaxSupportedVersion probe), so one wheel works across old and new drivers.
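
The newest-first selection described above could look roughly like the following. This is a sketch, not the package's loader: pick_extension, probe_driver_max, and the CANDIDATES table are hypothetical names, and the shared-library name libnvidia-encode.so.1 is an assumption about a typical Linux driver install. The (major << 4) | minor encoding is how NVIDIA's sample code compares its API version with the value reported by NvEncodeAPIGetMaxSupportedVersion.

```python
import ctypes

# Candidates, newest first: (extension name, NVENC API major, minor).
CANDIDATES = [("_nvenc_130", 13, 0), ("_nvenc_121", 12, 1)]


def encode_version(major, minor):
    """Pack a version the way the NVENC API reports it: (major << 4) | minor."""
    return (major << 4) | minor


def pick_extension(driver_max, candidates=CANDIDATES):
    """Return the first (newest) extension the driver supports, or None."""
    for name, major, minor in candidates:
        if encode_version(major, minor) <= driver_max:
            return name
    return None


def probe_driver_max(libname="libnvidia-encode.so.1"):
    """Ask the driver for its maximum supported NVENC API version.

    Needs an NVIDIA driver on the host; the library name is an assumption.
    """
    lib = ctypes.CDLL(libname)
    version = ctypes.c_uint32(0)
    status = lib.NvEncodeAPIGetMaxSupportedVersion(ctypes.byref(version))
    if status != 0:  # 0 is NV_ENC_SUCCESS
        raise RuntimeError(f"NvEncodeAPIGetMaxSupportedVersion failed: {status}")
    return version.value


# A driver that reports 12.2 supports the 12.1 build but not the 13.0 build.
print(pick_extension(encode_version(12, 2)))
```

On a machine with a driver installed, pick_extension(probe_driver_max()) would yield the extension name to import.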

Build & test (local, CMake)

cmake -S . -B build -G Ninja -DUSE_NVTX=OFF      # or -DUSE_NVTX=ON for profiling
cmake --build build -j

Build wheels (maintainer)

./build-wheel.sh                 # cp314 -> dist/habemus_papadum_nvenc-*.whl
./build-wheel.sh --nvtx          # profiling wheel (NVTX ranges on)
PYTHON_VERSIONS="3.12 3.13 3.14" ./build-wheel.sh

The wheel bundles its C/C++ runtime dependencies but not libcuda or libnvidia-encode; those must come from the host NVIDIA driver. Publishing to PyPI is done by scripts/publish.sh (which calls build-wheel.sh), not from CI.

Usage

import cupy as cp
from pdum.nvenc import NvencEncoder

enc = NvencEncoder(1920, 1080, codec="h264", preset="p3", tuning="ll", fps=30, gop=30)
nv12 = cp.empty((1080 * 3 // 2, 1920), dtype=cp.uint8)   # contiguous NV12
# ... render into nv12 ...
annexb = enc.encode(nv12, force_idr=True)                # bytes; H.264 Annex B
annexb += enc.flush()
enc.close()
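
The (1080 * 3 // 2, 1920) shape in the example follows from the NV12 layout: a full-resolution Y plane followed by a half-height plane of interleaved U/V samples. The sketch below spells out that arithmetic; nv12_layout is a hypothetical helper, not part of pdum.nvenc.

```python
def nv12_layout(width, height):
    """Describe a contiguous NV12 buffer for an even-sized frame.

    Hypothetical helper for illustration; not part of pdum.nvenc.
    """
    if width % 2 or height % 2:
        raise ValueError("NV12 needs even width and height")
    uv_rows = height // 2  # chroma is subsampled 2x vertically
    return {
        "rows": height + uv_rows,      # equals height * 3 // 2
        "cols": width,                 # U/V interleave keeps the row width
        "y_bytes": width * height,
        "uv_offset": width * height,   # UV plane starts right after Y
        "uv_bytes": width * uv_rows,
    }


layout = nv12_layout(1920, 1080)
print(layout["rows"], layout["cols"])  # 1620 1920
```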

NvencEncoder(cuda_context=0) retains the device primary context — the same one CuPy/PyTorch use — so device pointers are valid to NVENC with no cross-context copy.

NvencEncoder(extra_output_delay=0) (the default) is zero-latency: each frame's access unit is returned by its own encode() call (synchronous 1-in-1-out), which is what a low-latency stream wants. Raise it (NVIDIA's helper defaults to 3) to overlap encode with rendering for throughput, at a matching cost in frames of latency.
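
To make the latency trade-off concrete, here is a toy model of an encoder whose output lags its input by a fixed number of frames. It only tracks frame ids and is not how the binding or NVIDIA's helper is implemented; DelayedEncoderModel is a hypothetical class, and the real delay may also depend on other encoder settings.

```python
from collections import deque


class DelayedEncoderModel:
    """Toy model: a frame's output appears once `delay` later frames are queued."""

    def __init__(self, extra_output_delay=0):
        self.delay = extra_output_delay
        self.pending = deque()

    def encode(self, frame_id):
        """Submit one frame; return the ids whose output is now available."""
        self.pending.append(frame_id)
        out = []
        while len(self.pending) > self.delay:
            out.append(self.pending.popleft())
        return out

    def flush(self):
        """Drain everything still in flight."""
        out = list(self.pending)
        self.pending.clear()
        return out


zero_latency = DelayedEncoderModel(0)
print(zero_latency.encode(0))                  # [0]

pipelined = DelayedEncoderModel(3)
print([pipelined.encode(i) for i in range(4)])  # [[], [], [], [0]]
print(pipelined.flush())                        # [1, 2, 3]
```

In the model, a delay of 0 gives 1-in-1-out behaviour, while a delay of 3 returns nothing for the first three calls and leaves three frames for flush().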

NVTX profiling

Built with --nvtx / -DUSE_NVTX=ON, the binding emits ranges at the Python boundary (pdum.encode, pdum.read_cai, pdum.copy_to_nvenc, pdum.submit, pdum.collect_output) and activates NVIDIA's internal ranges (EncodeFrame, DoEncode, MapResources, CopyToDeviceFrame_*). Profile with Nsight Systems: nsys profile -t nvtx,cuda python your_script.py.

Scope / caveats

  • Fixed-resolution NV12 in, Annex B out, one encoder per instance. No reconfigure / SEI / EncoderBackend wiring yet — that's the pdum.rfb integration.
  • Input uses GetNextInputFrame + CopyToDeviceFrame (one intra-GPU copy, no host round-trip). True zero-copy via NvEncoder::RegisterResource is a follow-up.
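
Because the output is plain Annex B, a consumer can inspect it without any codec library. The sketch below splits a byte string on start codes and reports H.264 NAL unit types, for example to confirm that a force_idr=True call produced an IDR slice (type 5). The helper names are hypothetical, the sample bytes are hand-made rather than real encoder output, and HEVC uses a different header layout that this sketch does not handle.

```python
def split_annexb(data):
    """Return NAL units (without start codes) from an Annex B byte string."""
    nals = []
    i = data.find(b"\x00\x00\x01")
    while i != -1:
        start = i + 3
        j = data.find(b"\x00\x00\x01", start)
        end = len(data) if j == -1 else j
        # A 4-byte start code (00 00 00 01) leaves a zero byte on the previous
        # unit; NAL units never end in 0x00, so stripping zeros is safe.
        nal = data[start:end].rstrip(b"\x00")
        if nal:
            nals.append(nal)
        i = j
    return nals


def h264_nal_types(data):
    """H.264 only: the NAL unit type is the low 5 bits of the first byte."""
    return [nal[0] & 0x1F for nal in split_annexb(data)]


def contains_idr(data):
    return 5 in h264_nal_types(data)


# Hand-made sample: SPS (7), PPS (8), IDR slice (5); payload bytes are dummies.
sample = (
    b"\x00\x00\x00\x01\x67\x42"
    b"\x00\x00\x00\x01\x68\xce"
    b"\x00\x00\x01\x65\x88\x84"
)
print(h264_nal_types(sample))  # [7, 8, 5]
```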

Built distributions

No source distribution is available for this release (0.3.0). Wheels:

  • habemus_papadum_nvenc-0.3.0-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (285.0 kB): CPython 3.14, manylinux glibc 2.27+ / 2.28+, x86-64
  • habemus_papadum_nvenc-0.3.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (285.0 kB): CPython 3.13, manylinux glibc 2.27+ / 2.28+, x86-64
  • habemus_papadum_nvenc-0.3.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (284.7 kB): CPython 3.12, manylinux glibc 2.27+ / 2.28+, x86-64
