GPU-resident video decode via NVIDIA DeepStream — single-wheel install bundles the DS shared libs in a flat _libs/ directory and sets GST_PLUGIN_PATH + LIBV4L2_PLUGIN_DIR on import.
deepstream-videodecode
GPU-resident video decode via NVIDIA DeepStream — single-wheel install
that bundles the DS shared libraries in a flat _libs/ directory and
configures GStreamer + libv4l plugin discovery on Python import.
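The import-time configuration described above can be pictured with a minimal sketch. It is not the package's actual `_runtime.py`; the function name `configure_plugin_paths` is hypothetical, and the only details taken from this page are the two environment variables, the flat `_libs/` directory, and its `v4l_plugins/` subdirectory.

```python
import os
from pathlib import Path


def configure_plugin_paths(libs_dir, env=None):
    """Point GStreamer and libv4l at a bundled libs directory.

    Hypothetical helper: illustrates the idea only, the real package
    may resolve and order these paths differently.
    """
    env = os.environ if env is None else env
    libs = str(Path(libs_dir))

    # Prepend the bundled directory so its plugins are found first,
    # keeping any entries the user already had (without duplicates).
    existing = [p for p in env.get("GST_PLUGIN_PATH", "").split(os.pathsep)
                if p and p != libs]
    env["GST_PLUGIN_PATH"] = os.pathsep.join([libs] + existing)

    # libv4l loads its plugins from a single directory.
    env["LIBV4L2_PLUGIN_DIR"] = str(Path(libs_dir) / "v4l_plugins")
    return env
```

Passing a plain dict as `env` makes the effect easy to inspect without touching the real process environment.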
Building the libs zip and wheel (from scratch)
Step 1 — Start the ds_cuda container
Start the container once, mounting the DeepStream source tree at the same path inside the container as on the host:
docker run -d \
--name ds_cuda \
--gpus all \
--network host \
--ipc host \
-e NVIDIA_DRIVER_CAPABILITIES=all \
-v /path/to/ds-rel-39:/path/to/ds-rel-39 \
nvidia/cuda:13.0.3-devel-ubuntu22.04 \
sleep infinity
Install build dependencies inside the container (one-time):
docker exec ds_cuda apt-get update -qq
docker exec ds_cuda apt-get install -y \
zip \
libgstreamer1.0-dev \
libgstreamer-plugins-base1.0-dev \
libjsoncpp-dev
Step 2 — Build the libraries and create the zip
docker exec ds_cuda bash /path/to/ds-rel-39/build_ds_vllm_libs.sh \
/path/to/ds-rel-39/deepstream_decode_libs-9.0.0+cuda13.ubuntu2204-x86_64.zip
build_ds_vllm_libs.sh (included in this repo) builds all 15 required
.so files from source inside the container and zips only those targets.
The zip appears on the host at the same path (volume mount).
Step 3 — Upload zip to GitLab Package Registry
Upload once — re-upload only when DS libs change (bump the version to match):
export TOKEN=glpat-xxxxxxxxxxxxxxxxxxxx # Personal Access Token (api scope)
export PROJECT_ID=364387
export ZIP=/path/to/ds-rel-39/deepstream_decode_libs-9.0.0+cuda13.ubuntu2204-x86_64.zip
curl -X PUT \
--http1.1 \
--header "PRIVATE-TOKEN: $TOKEN" \
--header "Content-Type: application/octet-stream" \
--data-binary @"$ZIP" \
"https://gitlab-master.nvidia.com/api/v4/projects/$PROJECT_ID/packages/generic/deepstream_decode_libs/9.0.0/deepstream_decode_libs-9.0.0-cuda13-ubuntu2204-x86_64.zip"
Note: the registry filename uses hyphens (no +) to avoid URL encoding issues.
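The renaming rule can be sketched in Python. Both helper names are hypothetical; the mapping is inferred only from the two filenames shown in Step 3 (the `+` and the dots of the local version segment become hyphens), and the URL layout follows the `curl` command above.

```python
from urllib.parse import quote


def registry_filename(local_name):
    """Map a local zip name containing '+' to the hyphenated registry name."""
    stem, _, ext = local_name.rpartition(".")
    if "+" not in stem:
        return local_name  # nothing to rewrite
    public, local = stem.split("+", 1)
    # Dots inside the local version segment also become hyphens.
    return f"{public}-{local.replace('.', '-')}.{ext}"


def generic_package_url(base, project_id, package, version, filename):
    """Build a generic-package upload URL in the shape used by the curl call."""
    return (
        f"{base}/api/v4/projects/{project_id}/packages/generic/"
        f"{quote(package, safe='')}/{quote(version, safe='')}/"
        f"{quote(filename, safe='')}"
    )


name = registry_filename(
    "deepstream_decode_libs-9.0.0+cuda13.ubuntu2204-x86_64.zip"
)
url = generic_package_url(
    "https://gitlab-master.nvidia.com", 364387,
    "deepstream_decode_libs", "9.0.0", name,
)
```

Keeping the `+` out of the filename means no path segment needs percent-encoding, so the URL reads the same in the shell, in CI, and in the registry UI.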
Step 4 — Build the wheel (CI or local)
Via CI (automatic on every push to main):
The .gitlab-ci.yml pipeline downloads the zip from the Package Registry
and builds the wheel. Download the .whl from the pipeline artifacts page.
Locally (if needed):
# Prerequisites
sudo apt install patchelf binutils unzip python3 python3-pip
pip install build
cp /path/to/ds-rel-39/deepstream_decode_libs-9.0.0+cuda13.ubuntu2204-x86_64.zip ./
./build_wheel.sh
# → dist/nvidia_deepstream_videodecode-9.0.0+cu13.ubuntu2204-py3-none-manylinux_2_34_x86_64.whl
Install (consumer side)
apt update
apt install gstreamer1.0-tools gstreamer1.0-plugins-{base,good,bad,ugly} \
gstreamer1.0-libav python3-gi python3-gst-1.0 libv4l-0 \
cuda-libraries-13-0
pip install nvidia_deepstream_videodecode-*.whl
Quickstart
# Built-in selftest — verifies lib resolution, plugin discovery, CUDA context.
deepstream-videodecode-selftest
# Decode a file
python3 examples/decode_example.py /path/to/video.mp4
# File + live RTSP source
python3 examples/decode_example.py /path/to/video.mp4 \
--rtsp rtsp://10.24.217.130:8554/ --workers 4 --frames 16
Successful output ends with frames shape : (N, H, W, 3) on
torch.uint8, cuda:0 — the GPU tensor is ready for downstream consumers
with no D2H copy.
Public API
from nvidia.deepstream_videodecode import (
DecodePool, # pool of N file-decode pipelines on N threads
StreamHandle, # one persistent pipeline for an RTSP/URI stream
DecodeFrames, # @dataclass: frames, n_kept, n_total, fps, error
probe_metadata, # GStreamer-only metadata probe (no decode, no PyAV)
lib_dir, # path to the bundled _libs/ directory
)
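The fields listed for DecodeFrames suggest checking `error` before touching `frames`. The sketch below uses a stand-in dataclass that mirrors only the field names from the comment above; the field types, the convention that `error` is empty on success, and the helper `frames_or_raise` are all assumptions, not the library's definitions.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DecodeFramesSketch:
    """Stand-in with the same field names as DecodeFrames (types assumed)."""
    frames: Any                  # assumed: CUDA tensor of shape (n_kept, H, W, 3)
    n_kept: int                  # frames kept after index filtering
    n_total: int                 # frames seen by the decoder
    fps: float
    error: Optional[str] = None  # assumed: falsy when decode succeeded


def frames_or_raise(result):
    """Return result.frames, or raise if the decode reported a problem."""
    if result.error:
        raise RuntimeError(f"decode failed: {result.error}")
    if result.n_kept == 0:
        raise RuntimeError(
            f"no frames kept ({result.n_total} decoded); check target_indices"
        )
    return result.frames
```

The helper only reads attributes, so it should accept a real DecodeFrames as well, provided the field semantics match the assumptions stated here.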
probe_metadata(data) -> (frame_count, fps, duration_sec, width, height, codec)
Read container metadata from raw bytes using GStreamer only —
no frames are decoded, no external library (PyAV / libmediainfo) needed.
DecodePool.decode(data, *, target_indices, codec="", max_frames, timeout_sec) -> DecodeFrames
Decode raw container bytes on a pool worker and keep the frames whose
decode-order index is in target_indices.
import numpy as np
from nvidia.deepstream_videodecode import DecodePool, probe_metadata

pool = DecodePool(num_workers=8)
with open("/path/to/video.mp4", "rb") as f:
    data = f.read()
# Probe container metadata first; no frames are decoded here.
fc, fps, dur, w, h, codec = probe_metadata(data)
# Pick 8 evenly spaced decode-order indices across the clip.
indices = np.linspace(0, fc - 1, 8, dtype=int).tolist()
out = pool.decode(data, target_indices=indices, codec=codec,
                  max_frames=len(indices))
# out.frames: CUDA tensor (out.n_kept, H, W, 3) uint8, with no D2H copy.
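For clips shorter than the requested sample count, evenly spaced indices repeat. The following dependency-free sketch deduplicates them; `uniform_indices` is a hypothetical helper, and whether `DecodePool.decode` tolerates repeated entries in `target_indices` is not stated on this page.

```python
def uniform_indices(frame_count, n):
    """Return up to n evenly spaced, unique frame indices in [0, frame_count)."""
    if frame_count <= 0 or n <= 0:
        return []
    if n == 1:
        return [0]
    last = frame_count - 1
    # Integer arithmetic floors each position, matching truncation to int.
    picks = {(i * last) // (n - 1) for i in range(n)}
    return sorted(picks)


print(uniform_indices(100, 8))  # [0, 14, 28, 42, 56, 70, 84, 99]
print(uniform_indices(3, 8))    # [0, 1, 2]
```

Passing `max_frames=len(indices)` with the deduplicated list keeps the requested frame budget equal to the number of distinct frames actually asked for.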
What ships in the wheel
nvidia/
└── deepstream_videodecode/
├── __init__.py
├── _ds_dec.py # DecodePool / StreamHandle API
├── _runtime.py # _libs/ path resolver
├── _selftest.py # deepstream-videodecode-selftest CLI
├── _version.py
└── _libs/ # flat layout
├── libnvbufsurface.so
├── libnvbufsurftransform.so (~26 MB)
├── libnvbuf_fdmap.so
├── libnvds_meta.so
├── libnvdsbufferpool.so
├── libnvdsgst_helper.so
├── libnvdsgst_meta.so
├── libgstnvdsseimeta.so
├── libgstnvcustomhelper.so
├── libnvv4l2.so
├── libcuvidv4l2.so
├── libv4l2.so.0 (symlink → libnvv4l2.so)
├── libgstnvvideo4linux2.so (GStreamer plugin)
├── libgstnvvideoconvert.so (GStreamer plugin)
└── v4l_plugins/
└── libcuvidv4l2_plugin.so (libv4l plugin)
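The layout above can be checked mechanically. This is a small sketch, not the packaged selftest: `check_libs_layout` is a hypothetical helper, the file names are copied from the listing, and the directory to inspect is passed in explicitly because this page does not say whether `lib_dir` is a function or a plain path.

```python
from pathlib import Path

# File names copied from the layout listing above.
EXPECTED_LIBS = [
    "libnvbufsurface.so", "libnvbufsurftransform.so", "libnvbuf_fdmap.so",
    "libnvds_meta.so", "libnvdsbufferpool.so", "libnvdsgst_helper.so",
    "libnvdsgst_meta.so", "libgstnvdsseimeta.so", "libgstnvcustomhelper.so",
    "libnvv4l2.so", "libcuvidv4l2.so", "libv4l2.so.0",
    "libgstnvvideo4linux2.so", "libgstnvvideoconvert.so",
]
V4L_PLUGIN = "libcuvidv4l2_plugin.so"


def check_libs_layout(libs_dir):
    """Report files missing from _libs/ and a libv4l plugin left in the flat dir."""
    libs = Path(libs_dir)
    missing = [name for name in EXPECTED_LIBS if not (libs / name).exists()]
    if not (libs / "v4l_plugins" / V4L_PLUGIN).exists():
        missing.append(f"v4l_plugins/{V4L_PLUGIN}")
    # The plugin belongs only in v4l_plugins/, not next to the main libs.
    misplaced = [V4L_PLUGIN] if (libs / V4L_PLUGIN).exists() else []
    return {"missing": missing, "misplaced": misplaced}
```

An empty `missing` list and an empty `misplaced` list indicate the directory matches the listing; a dangling `libv4l2.so.0` symlink shows up as missing because `exists()` follows symlinks.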
Troubleshooting
GStreamer element creation failed: ['nvdec', 'nvvconv']
GStreamer caches a plugin registry at ~/.cache/gstreamer-1.0/. If the
cache was built before cuda-libraries-13-0 was installed, the plugin is
recorded there as failed to load and is not rescanned while that cached entry is reused. Delete the cache to force a rescan:
rm -rf ~/.cache/gstreamer-1.0/
python3 -c "import nvidia.deepstream_videodecode" # forces rescan
gst-inspect-1.0 nvv4l2decoder # should now print Factory Details
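The same cache reset can be done from Python, for example in a container entrypoint. This is a sketch under two assumptions: the cache lives under `$XDG_CACHE_HOME` when that variable is set (otherwise `~/.cache`), and no `GST_REGISTRY` override is in use. Both function names are hypothetical.

```python
import os
import shutil
from pathlib import Path


def gst_registry_cache_dir(env=None):
    """Best-guess location of the GStreamer 1.0 registry cache directory."""
    env = os.environ if env is None else env
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base) / "gstreamer-1.0"


def clear_gst_registry_cache(cache_dir=None):
    """Delete the registry cache directory; return True if something was removed."""
    target = Path(cache_dir) if cache_dir is not None else gst_registry_cache_dir()
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

Call it before the first import of nvidia.deepstream_videodecode in the process, so the next plugin scan starts from an empty registry.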
libnppig.so.13: cannot open shared object file
CUDA NPP runtime is missing:
apt install -y --no-install-recommends cuda-libraries-13-0
deepstream-videodecode-selftest says "DeepStream libs not found"
The _libs/ directory is empty or missing. Reinstall:
pip install --force-reinstall --no-deps nvidia_deepstream_videodecode-*.whl
Opening in BLOCKING MODE printed during decode
Informational message from nvv4l2decoder — not an error. Silence with:
GST_DEBUG=2 python3 your_script.py
dlsym failed: libcuvidv4l2.so: undefined symbol: libv4l2_plugin
You're on an old wheel that placed libcuvidv4l2_plugin.so alongside
the main libs. Current builds isolate it in v4l_plugins/. Rebuild and
reinstall.
File details
Details for the file nvidia_deepstream_videodecode_cu13-9.0.0-py3-none-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: nvidia_deepstream_videodecode_cu13-9.0.0-py3-none-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 15.8 MB
- Tags: Python 3, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6a1d2f8d7ea3ab67b225e4fa0475a9fe44a6d4c791580dca05a7fae4f2aa3f29 |
| MD5 | 3a03009cfdb734ee09a32b2bec08b289 |
| BLAKE2b-256 | b2e1083e07755d9e86d15377238ae0b0fde8d4c0df5e52d047214c7e03f6fe1c |
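A downloaded wheel can be compared against the SHA256 digest listed above before installing it. The sketch below uses only the standard library; the helper names are hypothetical, and the path and digest are supplied by the caller.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """True if the file's SHA256 equals the published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use flat regardless of the wheel's size.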