Zero-PyAV GPU H.264/HEVC encode (NVIDIA Video Codec SDK / NVENC) for pdum.rfb
habemus-papadum-nvenc (import pdum.nvenc)
GPU NV12 → H.264/HEVC Annex B via NVIDIA's Video Codec SDK encoder, with no
PyAV and no host copy. The companion GPU encoder for
pdum.rfb (PyPI: habemus-papadum-rfb); a uv workspace member of
this repo. Full assessment: docs/nvenc_sdk_evaluation.md.
Why it exists:
- Builds + runs on CPython 3.14. Upstream PyNvVideoCodec pins pybind11 2.10.0 (no 3.14) and ships no cp314 wheel/sdist; this package builds against pybind11 v3.0.4.
- GPU-resident, PyAV-free. Encodes any __cuda_array_interface__ tensor (CuPy / PyTorch / Numba) directly, sidestepping the PyAV-18 requirement entirely.
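For readers new to the protocol: `__cuda_array_interface__` is a plain dict attribute that describes a device buffer (shape, dtype string, device pointer, strides). The sketch below shows one way a consumer could check that such an object describes a contiguous NV12 frame. It is an illustration only, not the binding's code: `FakeDeviceArray`, `Nv12View`, and `describe_nv12` are hypothetical names, and the stand-in array carries a made-up pointer so the example runs without a GPU.

```python
from typing import Any, NamedTuple


class Nv12View(NamedTuple):
    ptr: int      # raw device pointer taken from the interface dict
    width: int
    height: int


class FakeDeviceArray:
    """Hypothetical stand-in for a CuPy/PyTorch/Numba array (no GPU needed)."""

    def __init__(self, ptr: int, rows: int, cols: int) -> None:
        self._cai = {
            "shape": (rows, cols),
            "typestr": "|u1",        # uint8
            "data": (ptr, False),    # (device pointer, read_only flag)
            "strides": None,         # None means C-contiguous
            "version": 3,
        }

    @property
    def __cuda_array_interface__(self) -> dict:
        return self._cai


def describe_nv12(obj: Any, width: int, height: int) -> Nv12View:
    """Validate that obj looks like a contiguous uint8 NV12 frame."""
    cai = obj.__cuda_array_interface__
    if cai["typestr"] != "|u1":
        raise TypeError("NV12 input must be uint8")
    expected = (height * 3 // 2, width)  # Y rows plus half as many UV rows
    if tuple(cai["shape"]) != expected:
        raise ValueError(f"expected shape {expected}, got {tuple(cai['shape'])}")
    strides = cai.get("strides")
    if strides is not None and tuple(strides) != (width, 1):
        raise ValueError("NV12 input must be C-contiguous")
    ptr, _read_only = cai["data"]
    return Nv12View(ptr=ptr, width=width, height=height)
```

With a real library the same function would accept, for example, a CuPy array of shape `(1080 * 3 // 2, 1920)` and dtype `uint8`.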
What's ours vs NVIDIA's
src/cpp/nvenc_ext.cpp OURS — the only hand-written C++; thin pybind11 binding +
all NVTX ranges. Wraps NvEncoderCuda; no NVIDIA edits.
src/pdum/nvenc/__init__.py OURS — Python surface + the ABI loader (picks 12.1/13.0).
CMakeLists.txt OURS — pybind11 3.0.4 (the 3.14 fix), dual ABI, optional NVTX.
build-wheel.sh OURS — self-contained wheel build (auditwheel).
third_party/ VERBATIM, UNMODIFIED NVIDIA SDK (MIT). See PROVENANCE.md.
The NVIDIA source under third_party/ is copied byte-for-byte from PyNvVideoCodec 2.1.0 with its MIT headers intact; we made zero edits to it.
Dual NVENC ABI
The wheel ships two extensions built from the same source — _nvenc_121 (NVENC SDK
12.1) and _nvenc_130 (13.0) — and pdum/nvenc/__init__.py loads whichever the host
driver supports (newest first, via a cheap NvEncodeAPIGetMaxSupportedVersion probe),
so one wheel works across old and new drivers.
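The newest-first selection can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual contents of pdum/nvenc/__init__.py: `probe_max_supported`, `choose_abi`, and `pack_version` are hypothetical helper names, the library name `libnvidia-encode.so.1` assumes Linux, and the `(major << 4) | minor` packing follows the comparison NVIDIA's sample NvEncoder class performs against NvEncodeAPIGetMaxSupportedVersion.

```python
import ctypes
from typing import Optional, Sequence, Tuple

# (extension name, SDK major, SDK minor), newest first.
CANDIDATES: Tuple[Tuple[str, int, int], ...] = (
    ("_nvenc_130", 13, 0),
    ("_nvenc_121", 12, 1),
)


def pack_version(major: int, minor: int) -> int:
    """Pack a version the way the driver reports its maximum."""
    return (major << 4) | minor


def probe_max_supported(libname: str = "libnvidia-encode.so.1") -> Optional[int]:
    """Ask the driver for its max NVENC API version; None if unavailable."""
    try:
        fn = ctypes.CDLL(libname).NvEncodeAPIGetMaxSupportedVersion
    except (OSError, AttributeError):
        return None  # no driver library, or symbol missing
    fn.argtypes = [ctypes.POINTER(ctypes.c_uint32)]
    fn.restype = ctypes.c_int  # NVENCSTATUS; 0 is NV_ENC_SUCCESS
    version = ctypes.c_uint32(0)
    return version.value if fn(ctypes.byref(version)) == 0 else None


def choose_abi(
    max_supported: Optional[int],
    candidates: Sequence[Tuple[str, int, int]] = CANDIDATES,
) -> Optional[str]:
    """Return the first (newest) extension the driver can serve."""
    if max_supported is None:
        return None
    for name, major, minor in candidates:
        if pack_version(major, minor) <= max_supported:
            return name
    return None
```

A loader built this way would call `choose_abi(probe_max_supported())` once at import time and then import the chosen extension module.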
Build & test (local, CMake)
cmake -S . -B build -G Ninja -DUSE_NVTX=OFF # or -DUSE_NVTX=ON for profiling
cmake --build build -j
Build wheels (maintainer)
./build-wheel.sh # cp314 -> dist/habemus_papadum_nvenc-*.whl
./build-wheel.sh --nvtx # profiling wheel (NVTX ranges on)
PYTHON_VERSIONS="3.12 3.13 3.14" ./build-wheel.sh
The wheel bundles its C/C++ runtime deps but not libcuda / libnvidia-encode
— those come from the host NVIDIA driver, as they must. Publishing to PyPI is done
by scripts/publish.sh (which calls this), not from CI.
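One way to confirm that a built wheel leaves the driver libraries out is to list its contents. The following is a hedged sketch, not part of build-wheel.sh: `bundled_driver_libs` is a hypothetical helper, and the file-name pattern assumes the usual auditwheel layout where vendored libraries keep their `lib<name>` prefix.

```python
import re
import zipfile
from typing import BinaryIO, List, Union

# Matches libcuda.so*, libcuda-<hash>.so*, libnvidia-encode.so*, ...
# but not libcudart, which is a separate (bundleable) runtime library.
_DRIVER_LIB = re.compile(r"^(libcuda|libnvidia-encode)[.-]")


def bundled_driver_libs(wheel: Union[str, BinaryIO]) -> List[str]:
    """Return archive members that look like driver libraries (ideally none)."""
    with zipfile.ZipFile(wheel) as zf:
        return [
            name
            for name in zf.namelist()
            if _DRIVER_LIB.match(name.rsplit("/", 1)[-1])
        ]
```

An empty list means the wheel relies on the host driver for those two libraries, as intended.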
Usage
import cupy as cp
from pdum.nvenc import NvencEncoder
enc = NvencEncoder(1920, 1080, codec="h264", preset="p3", tuning="ll", fps=30, gop=30)
nv12 = cp.empty((1080 * 3 // 2, 1920), dtype=cp.uint8) # contiguous NV12
# ... render into nv12 ...
annexb = enc.encode(nv12, force_idr=True) # bytes; H.264 Annex B
annexb += enc.flush()
enc.close()
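The `(1080 * 3 // 2, 1920)` shape above follows from the NV12 layout: a full-resolution Y plane followed by a half-height plane of interleaved U/V bytes. The sketch below works through that arithmetic on the host with plain bytes; `Nv12Layout`, `nv12_layout`, and `gray_frame` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class Nv12Layout(NamedTuple):
    rows: int       # rows of the 2-D uint8 buffer
    cols: int       # bytes per row (equals width when contiguous)
    y_size: int     # bytes in the luma plane
    uv_offset: int  # byte offset where interleaved U/V starts
    total: int      # total bytes in the frame


def nv12_layout(width: int, height: int) -> Nv12Layout:
    """Compute plane sizes for a contiguous NV12 frame."""
    if width % 2 or height % 2:
        raise ValueError("NV12 needs even width and height")
    y_size = width * height
    uv_size = width * (height // 2)  # U and V interleaved at half height
    return Nv12Layout(
        rows=height * 3 // 2,
        cols=width,
        y_size=y_size,
        uv_offset=y_size,
        total=y_size + uv_size,
    )


def gray_frame(width: int, height: int, luma: int = 128) -> bytearray:
    """Build a host-side NV12 frame: constant luma, neutral chroma (128)."""
    lay = nv12_layout(width, height)
    buf = bytearray(lay.total)
    buf[: lay.y_size] = bytes([luma]) * lay.y_size
    buf[lay.uv_offset :] = bytes([128]) * (lay.total - lay.uv_offset)
    return buf
```

For 1920x1080 this gives 1620 rows of 1920 bytes, which matches the buffer allocated in the usage example.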
NvencEncoder(cuda_context=0) retains the device primary context — the same one
CuPy/PyTorch use — so device pointers are valid to NVENC with no cross-context copy.
NvencEncoder(extra_output_delay=0) (the default) is zero-latency: each frame's
access unit is returned by its own encode() call (synchronous 1-in-1-out), which is
what a low-latency stream wants. Raise it (NVIDIA's helper defaults to 3) to overlap
encode with rendering for throughput, at a matching cost in frames of latency.
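The latency trade-off can be pictured with a toy model. This is a simplification under assumptions, not the encoder itself: `DelayedEncoderModel` is a hypothetical class that tracks frame indices instead of bitstream bytes, and it assumes the output for frame i becomes available on the call that submits frame i + extra_output_delay.

```python
from collections import deque
from typing import Deque, List


class DelayedEncoderModel:
    """Toy model of output delay; returns frame indices, not encoded bytes."""

    def __init__(self, extra_output_delay: int = 0) -> None:
        self.delay = extra_output_delay
        self.pending: Deque[int] = deque()

    def encode(self, frame_id: int) -> List[int]:
        """Submit one frame; return frames whose output is now available."""
        self.pending.append(frame_id)
        out: List[int] = []
        while len(self.pending) > self.delay:
            out.append(self.pending.popleft())
        return out

    def flush(self) -> List[int]:
        """Drain everything still in flight."""
        out = list(self.pending)
        self.pending.clear()
        return out
```

With a delay of 0 every `encode()` call returns its own frame; with a delay of 3 the first three calls return nothing and `flush()` is needed to collect the last three frames.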
NVTX profiling
Built with --nvtx / -DUSE_NVTX=ON, the binding emits ranges at the Python
boundary (pdum.encode, pdum.read_cai, pdum.copy_to_nvenc, pdum.submit,
pdum.collect_output) and activates NVIDIA's internal ranges (EncodeFrame,
DoEncode, MapResources, CopyToDeviceFrame_*). Profile with Nsight Systems:
nsys profile -t nvtx,cuda python your_script.py.
Scope / caveats
- Fixed-resolution NV12 in, Annex B out, one encoder per instance. No reconfigure / SEI / EncoderBackend wiring yet; that's the pdum.rfb integration.
- Input uses GetNextInputFrame + CopyToDeviceFrame (one intra-GPU copy, no host round-trip). True zero-copy via NvEncoder::RegisterResource is a follow-up.
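Since the output is a raw Annex B byte stream, a consumer can inspect it by splitting on start codes. The sketch below is a generic illustration and not part of this package: `split_annexb`, `h264_nal_type`, and `hevc_nal_type` are hypothetical helpers. The NAL type positions come from the H.264 and HEVC header layouts (low 5 bits of the first byte for H.264, bits 1 to 6 for HEVC).

```python
from typing import List

START_CODE = b"\x00\x00\x01"  # the 4-byte form is this with a leading zero


def split_annexb(data: bytes) -> List[bytes]:
    """Split an Annex B stream into NAL units (start codes removed)."""
    nals: List[bytes] = []
    i = data.find(START_CODE)
    while i != -1:
        start = i + len(START_CODE)
        j = data.find(START_CODE, start)
        end = len(data) if j == -1 else j
        # A NAL unit never ends in 0x00, so trailing zeros belong to the
        # next 4-byte start code (or to padding) and can be dropped.
        nal = data[start:end].rstrip(b"\x00")
        if nal:
            nals.append(nal)
        i = j
    return nals


def h264_nal_type(nal: bytes) -> int:
    """nal_unit_type for H.264: 5 = IDR slice, 7 = SPS, 8 = PPS."""
    return nal[0] & 0x1F


def hevc_nal_type(nal: bytes) -> int:
    """nal_unit_type for HEVC (6 bits after the forbidden_zero_bit)."""
    return (nal[0] >> 1) & 0x3F
```

For an H.264 stream, a frame encoded with `force_idr=True` would typically begin with SPS and PPS units followed by an IDR slice, which this helper makes easy to confirm.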
Download files
Built Distributions
File details
Details for the file habemus_papadum_nvenc-0.3.0-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.
File metadata
- Download URL: habemus_papadum_nvenc-0.3.0-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
- Upload date:
- Size: 285.0 kB
- Tags: CPython 3.14, manylinux: glibc 2.27+ x86-64, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6174860082ec6aa0c6c934c43ee695ad277d88ad29106277e359b7545e190293 |
| MD5 | 1882667235fde53df4fc2b6c29d520e4 |
| BLAKE2b-256 | ffd86015c4c2b03880b46ede0638cfa8f1226a790bb350d33e1151ad137a26ff |
File details
Details for the file habemus_papadum_nvenc-0.3.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.
File metadata
- Download URL: habemus_papadum_nvenc-0.3.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
- Upload date:
- Size: 285.0 kB
- Tags: CPython 3.13, manylinux: glibc 2.27+ x86-64, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bd11ccb6c01deca35bf80943b2bac0129624a845f0335b6c936bbd78bafd5464 |
| MD5 | b402b5bee3b9680bb3bc112db58590f9 |
| BLAKE2b-256 | b5686fe6f6c605058ad51d8dda072726981d6d8cc6e14a4bfceb267ed9e4feaf |
File details
Details for the file habemus_papadum_nvenc-0.3.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.
File metadata
- Download URL: habemus_papadum_nvenc-0.3.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl
- Upload date:
- Size: 284.7 kB
- Tags: CPython 3.12, manylinux: glibc 2.27+ x86-64, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 331d8cf584b7427eb96869578edaca7a4209314441d2923c1d9de3cb6d459937 |
| MD5 | 6fe5bb927d2d5c9702ec21c489656f63 |
| BLAKE2b-256 | 04c435de5be5e3661a79fc453fa8fdc51c101ebb755083d2db6357bb976b72a2 |
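A downloaded wheel can be checked against the digests listed above with the standard library. This is a minimal sketch: `file_digests` is a hypothetical helper name, and it computes only the SHA256 and BLAKE2b-256 values (BLAKE2b-256 is BLAKE2b with a 32-byte digest).

```python
import hashlib
from typing import Dict


def file_digests(path: str, chunk_size: int = 1 << 20) -> Dict[str, str]:
    """Compute SHA256 and BLAKE2b-256 hex digests of a file, streaming."""
    hashers = {
        "SHA256": hashlib.sha256(),
        "BLAKE2b-256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as f:
        # Read in chunks so large files do not need to fit in memory.
        for block in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(block)
    return {name: h.hexdigest() for name, h in hashers.items()}
```

Comparing `file_digests(path)["SHA256"]` with the SHA256 row for the matching file is enough to detect a corrupted or substituted download.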