Zero-PyAV macOS H.264 encode (Apple VideoToolbox) for pdum.rfb
habemus-papadum-vtenc (import pdum.vtenc)
macOS host NV12 → H.264 Annex B via Apple's VideoToolbox (VTCompressionSession),
with no PyAV and no ffmpeg. The companion encoder for
pdum.rfb (PyPI: habemus-papadum-rfb) on Apple Silicon — the
counterpart of habemus-papadum-nvenc on NVIDIA. A uv workspace
member of this repo. Design notes:
docs/mlx_metal_videotoolbox_encoder_design.md.
Why it exists:
- Hardware H.264 on macOS without PyAV. VideoToolbox is the Apple-Silicon hardware encoder; this binds it directly, so the GPU path needs no ffmpeg layer.
- MLX-friendly. Its encode() takes any Python buffer-protocol object, so an evaluated MLX mx.array (Apple-Silicon unified memory) feeds it directly.
What's ours
Everything is ours — there is no vendored SDK (VideoToolbox/CoreVideo/CoreMedia are macOS system frameworks):
src/cpp/vtenc_ext.mm OURS — the only native code; a thin pybind11 binding over
VTCompressionSession (Objective-C++).
src/pdum/vtenc/__init__.py OURS — Python surface + single-extension loader.
CMakeLists.txt OURS — pybind11 3.0.4; -framework links; one _vtenc module.
build-wheel.sh OURS — self-contained wheel build (delocate).
Behaviour (matches the pdum.rfb invariants)
- NV12 in → H.264 Annex B out (start codes, in-band SPS/PPS on every IDR, which is what the browser's WebCodecs VideoDecoder wants).
- Low-latency, no frame reordering (no B-frames ⇒ output order == input order) and synchronous 1-in-1-out: each encode() returns its own frame's access unit (CompleteFrames after each submit), which is required for correct seq attribution.
- BT.601 limited range VUI (matches pdum.rfb's gpu.rgb_to_nv12 kernel), so a browser decodes the color correctly.
- Fixed-resolution, even dimensions; one VTCompressionSession per instance.
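To make the "start codes, in-band SPS/PPS on every IDR" invariant concrete, here is a minimal sketch of how a consumer might check it on the bytes returned by encode(). It is not part of pdum.vtenc; split_annexb and is_self_contained_keyframe are hypothetical helper names, and the only facts relied on are the H.264 Annex B start codes and the standard NAL unit type numbers (5 = IDR slice, 7 = SPS, 8 = PPS).

```python
START_CODE = b"\x00\x00\x01"  # a 4-byte start code is this with one extra leading zero


def split_annexb(data: bytes) -> list[bytes]:
    """Split an Annex B byte stream into NAL units (start codes removed)."""
    nals = []
    pos = data.find(START_CODE)
    while pos >= 0:
        start = pos + len(START_CODE)
        nxt = data.find(START_CODE, start)
        end = len(data) if nxt < 0 else nxt
        # Trailing zeros belong to the next 4-byte start code (or to padding);
        # a valid NAL unit never ends in 0x00.
        nal = data[start:end].rstrip(b"\x00")
        if nal:
            nals.append(nal)
        pos = nxt
    return nals


def nal_type(nal: bytes) -> int:
    """The NAL unit type is the low 5 bits of the first header byte."""
    return nal[0] & 0x1F


def is_self_contained_keyframe(access_unit: bytes) -> bool:
    """True if the access unit carries SPS (7), PPS (8) and an IDR slice (5)."""
    types = {nal_type(n) for n in split_annexb(access_unit)}
    return {5, 7, 8} <= types
```

A check like is_self_contained_keyframe(enc.encode(nv12, force_idr=True)) would then be expected to hold for every forced keyframe, assuming the behaviour described above.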
Usage
import numpy as np
from pdum.vtenc import VtEncoder
enc = VtEncoder(1920, 1080, fps=30, bitrate=12_000_000)
nv12 = np.zeros((1080 * 3 // 2, 1920), dtype=np.uint8) # contiguous NV12 (Y then UV)
# ... fill nv12 (e.g. from an evaluated MLX array) ...
annexb = enc.encode(nv12, force_idr=True) # bytes; H.264 Annex B
annexb += enc.flush()
print(enc.codec_string) # e.g. "avc1.420028" (from the SPS)
enc.close()
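Because each encode() call returns its own frame's access unit, a streaming loop can tag output with the input sequence number directly. The sketch below uses only the methods shown above (encode with force_idr, and flush); encode_stream and its gop parameter are hypothetical names for illustration, not part of the package.

```python
def encode_stream(enc, frames, gop=60):
    """Encode frames in order and pair each access unit with its sequence number.

    enc    -- any object with encode(frame, force_idr=...) and flush(),
              such as the VtEncoder shown above
    frames -- iterable of NV12 buffers
    gop    -- hypothetical keyframe interval: force an IDR every `gop` frames
    """
    units = []
    for seq, frame in enumerate(frames):
        # 1-in-1-out: the bytes returned here belong to frame `seq`.
        access_unit = enc.encode(frame, force_idr=(seq % gop == 0))
        units.append((seq, access_unit))
    # With no frame reordering, flush() is expected to add little or nothing,
    # but it is kept separate so the caller can decide how to attribute it.
    tail = enc.flush()
    return units, tail
```

Forcing a periodic IDR is a design choice for late-joining viewers, not something the encoder requires; a caller could equally pass force_idr only on demand.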
encode() accepts any contiguous (H*3//2, W) uint8 buffer-protocol object — numpy or
an evaluated MLX mx.array (call mx.eval(frame) first; MLX is lazy).
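For readers unfamiliar with the layout, the following sketch builds a contiguous NV12 frame (full-resolution Y plane, then one interleaved U,V pair per 2x2 block) from RGB pixels using the usual BT.601 limited-range coefficients. It is plain Python for clarity, not speed, and it is not the pdum.rfb kernel; rgb_to_nv12 here is a hypothetical stand-in and the 2x2 averaging is one common subsampling choice among several.

```python
def _to_byte(value: float) -> int:
    """Round and clamp to the 0..255 range of a uint8."""
    return max(0, min(255, int(round(value))))


def rgb_to_nv12(rgb, width: int, height: int) -> bytearray:
    """Convert rows of 8-bit (r, g, b) tuples to a flat NV12 buffer.

    Layout: width*height bytes of Y, then (height/2) rows of
    interleaved U,V bytes, each row `width` bytes long.
    """
    if width % 2 or height % 2:
        raise ValueError("NV12 requires even width and height")
    out = bytearray(width * height * 3 // 2)

    # Y plane: one luma sample per pixel (BT.601, limited range 16..235).
    for y in range(height):
        for x in range(width):
            r, g, b = rgb[y][x]
            out[y * width + x] = _to_byte(
                16 + (65.481 * r + 128.553 * g + 24.966 * b) / 255
            )

    # UV plane: one (U, V) pair per 2x2 block, computed from the block average.
    uv_base = width * height
    for by in range(0, height, 2):
        for bx in range(0, width, 2):
            block = [rgb[by + dy][bx + dx] for dy in (0, 1) for dx in (0, 1)]
            r = sum(p[0] for p in block) / 4
            g = sum(p[1] for p in block) / 4
            b = sum(p[2] for p in block) / 4
            i = uv_base + (by // 2) * width + bx
            out[i] = _to_byte(128 + (-37.797 * r - 74.203 * g + 112.0 * b) / 255)
            out[i + 1] = _to_byte(128 + (112.0 * r - 93.786 * g - 18.214 * b) / 255)
    return out
```

The result is a flat bytearray; np.frombuffer(buf, dtype=np.uint8).reshape(height * 3 // 2, width) would give it the (H*3//2, W) shape described above.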
VtEncoder.codec_string is the avc1.PPCCLL string derived from the actual emitted
SPS (VideoToolbox picks the level from the resolution, so it is not a constant — 1080p
Baseline is avc1.420028, not avc1.42E01F). Empty until the first keyframe.
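As an illustration of where the PPCCLL digits come from, the three bytes following the SPS NAL header are profile_idc, the constraint flags, and level_idc, written as uppercase hex. The sketch below shows that mapping; codec_string_from_sps is a hypothetical helper, and this is not necessarily how the extension computes the property internally.

```python
def codec_string_from_sps(sps_nal: bytes) -> str:
    """Build an avc1.PPCCLL string from an SPS NAL unit (start code removed).

    Byte 0 is the NAL header (type 7 for SPS); bytes 1..3 are
    profile_idc, constraint flags and level_idc.
    """
    if len(sps_nal) < 4 or (sps_nal[0] & 0x1F) != 7:
        raise ValueError("not an SPS NAL unit")
    profile_idc, constraints, level_idc = sps_nal[1], sps_nal[2], sps_nal[3]
    return f"avc1.{profile_idc:02X}{constraints:02X}{level_idc:02X}"


# Profile 0x42 (Baseline), no constraint flags, level_idc 0x28 = 40 (level 4.0).
print(codec_string_from_sps(bytes([0x67, 0x42, 0x00, 0x28])))  # avc1.420028
```

The level byte is the level number times ten, so 0x28 is level 4.0 and 0x1F is level 3.1, which is why the string changes with resolution.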
Build & test (local, CMake)
cmake -S . -B build -G Ninja
cmake --build build -j
Build wheels (maintainer)
./build-wheel.sh # cp314 -> dist/habemus_papadum_vtenc-*.whl
PYTHON_VERSIONS="3.12 3.13 3.14" ./build-wheel.sh
Requires only Xcode Command Line Tools (clang + the macOS SDK frameworks); the full
Metal toolchain is not needed for v1. The wheel bundles nothing beyond the extension —
the frameworks come from macOS, as they must. Publishing to PyPI is done by
scripts/publish.sh, not from CI.
Scope / caveats
- Fixed-resolution NV12 in, Annex B out, one encoder per instance. H.264 only (HEVC is a follow-up). No EncoderBackend/serve() wiring yet; that's the pdum.rfb integration.
- Input is a host-visible (CPU / unified-memory) NV12 buffer, memcpy'd into an encoder-owned CVPixelBuffer. Wrapping an MLX unified-memory buffer as the CVPixelBuffer backing directly (true zero-copy) is a follow-up.