
GPU-resident video decode via NVIDIA DeepStream — single-wheel install bundles the DS shared libs in a flat _libs/ directory and sets GST_PLUGIN_PATH + LIBV4L2_PLUGIN_DIR on import.


deepstream-videodecode

GPU-resident video decode via NVIDIA DeepStream — single-wheel install that bundles the DS shared libraries in a flat _libs/ directory and configures GStreamer + libv4l plugin discovery on Python import.


Building the libs zip and wheel (from scratch)

Step 1 — Start the ds_cuda container

Start the container once, mounting the DeepStream source tree at the same path inside the container as on the host:

docker run -d \
  --name ds_cuda \
  --gpus all \
  --network host \
  --ipc host \
  -e NVIDIA_DRIVER_CAPABILITIES=all \
  -v /path/to/ds-rel-39:/path/to/ds-rel-39 \
  nvidia/cuda:13.0.3-devel-ubuntu22.04 \
  sleep infinity

Install build dependencies inside the container (one-time):

docker exec ds_cuda apt-get update -qq
docker exec ds_cuda apt-get install -y \
    zip \
    libgstreamer1.0-dev \
    libgstreamer-plugins-base1.0-dev \
    libjsoncpp-dev

Step 2 — Build the libraries and create the zip

docker exec ds_cuda bash /path/to/ds-rel-39/build_ds_vllm_libs.sh \
    /path/to/ds-rel-39/deepstream_decode_libs-9.0.0+cuda13.ubuntu2204-x86_64.zip

build_ds_vllm_libs.sh (included in this repo) builds all 15 required .so files from source inside the container and zips only those targets. The zip appears on the host at the same path (volume mount).

Step 3 — Upload zip to GitLab Package Registry

Upload once — re-upload only when DS libs change (bump the version to match):

export TOKEN=glpat-xxxxxxxxxxxxxxxxxxxx   # Personal Access Token (api scope)
export PROJECT_ID=364387
export ZIP=/path/to/ds-rel-39/deepstream_decode_libs-9.0.0+cuda13.ubuntu2204-x86_64.zip

curl -X PUT \
     --http1.1 \
     --header "PRIVATE-TOKEN: $TOKEN" \
     --header "Content-Type: application/octet-stream" \
     --data-binary @"$ZIP" \
     "https://gitlab-master.nvidia.com/api/v4/projects/$PROJECT_ID/packages/generic/deepstream_decode_libs/9.0.0/deepstream_decode_libs-9.0.0-cuda13-ubuntu2204-x86_64.zip"

Note: the registry filename uses hyphens (no +) to avoid URL encoding issues.
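
The renaming described in the note can be sketched in Python. This is only an illustration: `registry_filename` and `upload_url` are hypothetical helper names, not part of this repo, and the rule (replace the `+` and the dots of the local version tag with hyphens) is inferred from the two filenames shown above.

```python
from urllib.parse import quote


def registry_filename(local_name: str) -> str:
    """Map '<pkg>-<ver>+<tag.with.dots>-<arch>.zip' to a name without '+'."""
    base, plus, local = local_name.partition("+")
    if not plus:
        return local_name  # no local version tag, nothing to rewrite
    # 'cuda13.ubuntu2204-x86_64.zip' -> tag='cuda13.ubuntu2204', rest='x86_64.zip'
    tag, dash, rest = local.partition("-")
    return f"{base}-{tag.replace('.', '-')}{dash}{rest}"


def upload_url(host: str, project_id: int, package: str,
               version: str, local_name: str) -> str:
    """Build the generic Package Registry URL used by the curl command."""
    filename = quote(registry_filename(local_name))
    return (f"https://{host}/api/v4/projects/{project_id}"
            f"/packages/generic/{package}/{version}/{filename}")


url = upload_url(
    "gitlab-master.nvidia.com", 364387, "deepstream_decode_libs", "9.0.0",
    "deepstream_decode_libs-9.0.0+cuda13.ubuntu2204-x86_64.zip",
)
print(url)
```

Deriving the registry name from the local zip name keeps the two from drifting apart when the version is bumped.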

Step 4 — Build the wheel (CI or local)

Via CI (automatic on every push to main): The .gitlab-ci.yml pipeline downloads the zip from the Package Registry and builds the wheel. Download the .whl from the pipeline artifacts page.

Locally (if needed):

# Prerequisites
sudo apt install patchelf binutils unzip python3 python3-pip
pip install build

cp /path/to/ds-rel-39/deepstream_decode_libs-9.0.0+cuda13.ubuntu2204-x86_64.zip ./
./build_wheel.sh
# → dist/nvidia_deepstream_videodecode-9.0.0+cu13.ubuntu2204-py3-none-manylinux_2_34_x86_64.whl

Install (consumer side)

apt update
apt install gstreamer1.0-tools gstreamer1.0-plugins-{base,good,bad,ugly} \
            gstreamer1.0-libav python3-gi python3-gst-1.0 libv4l-0 \
            cuda-libraries-13-0
pip install nvidia_deepstream_videodecode-*.whl

Quickstart

# Built-in selftest — verifies lib resolution, plugin discovery, CUDA context.
deepstream-videodecode-selftest

# Decode a file
python3 examples/decode_example.py /path/to/video.mp4

# File + live RTSP source
python3 examples/decode_example.py /path/to/video.mp4 \
    --rtsp rtsp://10.24.217.130:8554/ --workers 4 --frames 16

Successful output ends with frames shape : (N, H, W, 3) on torch.uint8, cuda:0 — the GPU tensor is ready for downstream consumers with no D2H copy.

Public API

from nvidia.deepstream_videodecode import (
    DecodePool,        # pool of N file-decode pipelines on N threads
    StreamHandle,      # one persistent pipeline for an RTSP/URI stream
    DecodeFrames,      # @dataclass: frames, n_kept, n_total, fps, error
    probe_metadata,    # GStreamer-only metadata probe (no decode, no PyAV)
    lib_dir,           # path to the bundled _libs/ directory
)

probe_metadata(data) -> (frame_count, fps, duration_sec, width, height, codec)

Read container metadata from raw bytes using GStreamer only — no frames are decoded, no external library (PyAV / libmediainfo) needed.

DecodePool.decode(data, *, target_indices, codec="", max_frames, timeout_sec) -> DecodeFrames

Decode raw container bytes on a pool worker and keep the frames whose decode-order index is in target_indices.

import numpy as np
from nvidia.deepstream_videodecode import DecodePool, probe_metadata

pool = DecodePool(num_workers=8)
with open("/path/to/video.mp4", "rb") as f:
    data = f.read()   # raw container bytes

# Container metadata only; no frames are decoded here.
fc, fps, dur, w, h, codec = probe_metadata(data)
# Eight evenly spaced decode-order indices across the clip.
indices = np.linspace(0, fc - 1, 8, dtype=int).tolist()

out = pool.decode(data, target_indices=indices, codec=codec,
                  max_frames=len(indices))
# out.frames: CUDA tensor (out.n_kept, H, W, 3) uint8, no D2H copy.
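
The example above spreads a fixed number of indices evenly over the clip. Another way to build target_indices is to sample at a fixed rate (for example one frame per second) from the frame_count and fps that probe_metadata returns. The helper below is a hypothetical sketch, not part of the package; it only computes a list of integers and assumes a constant frame rate.

```python
def indices_at_rate(frame_count, fps, sample_fps, max_frames=None):
    """Return decode-order indices spaced roughly 1/sample_fps seconds apart.

    Assumes a constant frame rate; every returned index is < frame_count.
    """
    if frame_count <= 0 or fps <= 0 or sample_fps <= 0:
        return []
    # Frames between two samples; never skip less than one frame.
    step = max(fps / sample_fps, 1.0)
    n = int((frame_count - 1) // step) + 1
    indices = [int(i * step) for i in range(n)]
    if max_frames is not None:
        indices = indices[:max_frames]
    return indices


# A 10-second clip at 30 fps, sampled once per second.
print(indices_at_rate(300, 30.0, 1.0))
```

The resulting list can be passed as target_indices, with max_frames set to its length as in the example above.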

What ships in the wheel

nvidia/
└── deepstream_videodecode/
    ├── __init__.py
    ├── _ds_dec.py        # DecodePool / StreamHandle API
    ├── _runtime.py       # _libs/ path resolver
    ├── _selftest.py      # deepstream-videodecode-selftest CLI
    ├── _version.py
    └── _libs/                           # flat layout
        ├── libnvbufsurface.so
        ├── libnvbufsurftransform.so     (~26 MB)
        ├── libnvbuf_fdmap.so
        ├── libnvds_meta.so
        ├── libnvdsbufferpool.so
        ├── libnvdsgst_helper.so
        ├── libnvdsgst_meta.so
        ├── libgstnvdsseimeta.so
        ├── libgstnvcustomhelper.so
        ├── libnvv4l2.so
        ├── libcuvidv4l2.so
        ├── libv4l2.so.0                 (symlink → libnvv4l2.so)
        ├── libgstnvvideo4linux2.so      (GStreamer plugin)
        ├── libgstnvvideoconvert.so      (GStreamer plugin)
        └── v4l_plugins/
            └── libcuvidv4l2_plugin.so   (libv4l plugin)
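
Given this layout, the plugin discovery that happens on import can be pictured roughly as follows. This is a sketch under assumptions, not the contents of _runtime.py: the function name `plugin_env` is hypothetical, and the mapping (GST_PLUGIN_PATH to _libs/, LIBV4L2_PLUGIN_DIR to _libs/v4l_plugins/) is inferred from the package description and the tree above.

```python
import os
from pathlib import Path


def plugin_env(lib_dir, env=None):
    """Return a copy of env with plugin search paths pointing at lib_dir.

    Hypothetical illustration; the real package may differ in detail.
    """
    env = dict(os.environ if env is None else env)
    lib_dir = Path(lib_dir)
    # Put the bundled directory first, keep any existing entries after it.
    existing = env.get("GST_PLUGIN_PATH", "")
    parts = [str(lib_dir)]
    parts += [p for p in existing.split(os.pathsep) if p and p != str(lib_dir)]
    env["GST_PLUGIN_PATH"] = os.pathsep.join(parts)
    # libv4l looks for its plugins in a separate directory.
    env["LIBV4L2_PLUGIN_DIR"] = str(lib_dir / "v4l_plugins")
    return env


demo = plugin_env("/opt/pkg/_libs", {"GST_PLUGIN_PATH": "/usr/lib/gst"})
print(demo["GST_PLUGIN_PATH"])
print(demo["LIBV4L2_PLUGIN_DIR"])
```

Returning a modified copy instead of mutating os.environ makes the sketch easy to inspect; an import hook would apply the same values to the process environment before GStreamer is initialised.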

Troubleshooting

GStreamer element creation failed: ['nvdec', 'nvvconv']

GStreamer caches a plugin registry at ~/.cache/gstreamer-1.0/. If the cache was built before cuda-libraries-13-0 was installed, it records "this plugin failed to load" and never retries.

rm -rf ~/.cache/gstreamer-1.0/
python3 -c "import nvidia.deepstream_videodecode"   # forces rescan
gst-inspect-1.0 nvv4l2decoder           # should now print Factory Details
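
The same cache reset can be done from Python, for example at the start of a container entrypoint. The helper names below are hypothetical. The default location matches the path above; honouring XDG_CACHE_HOME is an assumption about how GStreamer resolves the user cache directory.

```python
import os
import shutil
from pathlib import Path


def gst_registry_cache_dir(env=None) -> Path:
    """Best-guess location of the GStreamer registry cache directory."""
    env = os.environ if env is None else env
    # Assumption: XDG_CACHE_HOME overrides ~/.cache when it is set.
    base = env.get("XDG_CACHE_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache")
    return Path(base) / "gstreamer-1.0"


def clear_gst_registry_cache(cache_dir=None) -> bool:
    """Delete the cache directory; return True if something was removed."""
    target = Path(cache_dir) if cache_dir is not None else gst_registry_cache_dir()
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False
```

Call clear_gst_registry_cache() before the first import of nvidia.deepstream_videodecode so that the next plugin scan starts from an empty registry.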

libnppig.so.13: cannot open shared object file

CUDA NPP runtime is missing:

apt install -y --no-install-recommends cuda-libraries-13-0

deepstream-videodecode-selftest says "DeepStream libs not found"

The _libs/ directory is empty or missing. Reinstall:

pip install --force-reinstall --no-deps nvidia_deepstream_videodecode-*.whl
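
Before reinstalling, it can help to see which files are actually missing. The sketch below compares a directory against the file list from "What ships in the wheel"; `EXPECTED_LIBS` and `missing_libs` are hypothetical names, and the list is copied from this README rather than read from the package.

```python
from pathlib import Path

# File names taken from the "What ships in the wheel" listing.
EXPECTED_LIBS = [
    "libnvbufsurface.so",
    "libnvbufsurftransform.so",
    "libnvbuf_fdmap.so",
    "libnvds_meta.so",
    "libnvdsbufferpool.so",
    "libnvdsgst_helper.so",
    "libnvdsgst_meta.so",
    "libgstnvdsseimeta.so",
    "libgstnvcustomhelper.so",
    "libnvv4l2.so",
    "libcuvidv4l2.so",
    "libv4l2.so.0",
    "libgstnvvideo4linux2.so",
    "libgstnvvideoconvert.so",
    "v4l_plugins/libcuvidv4l2_plugin.so",
]


def missing_libs(lib_dir):
    """Return expected files absent from lib_dir (broken symlinks count)."""
    root = Path(lib_dir)
    return [name for name in EXPECTED_LIBS if not (root / name).exists()]
```

Pass it the path of the installed _libs/ directory; an empty list means every expected file resolves.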

Opening in BLOCKING MODE printed during decode

Informational message from nvv4l2decoder — not an error. Silence with:

GST_DEBUG=2 python3 your_script.py

dlsym failed: libcuvidv4l2.so: undefined symbol: libv4l2_plugin

You're on an old wheel that placed libcuvidv4l2_plugin.so alongside the main libs. Current builds isolate it in v4l_plugins/. Rebuild and reinstall.


Built Distribution

nvidia_deepstream_videodecode_cu13-9.0.0-py3-none-manylinux_2_34_x86_64.whl (15.8 MB)

Python 3, manylinux (glibc 2.34+), x86-64

File hashes

Algorithm    Hash digest
SHA256       6a1d2f8d7ea3ab67b225e4fa0475a9fe44a6d4c791580dca05a7fae4f2aa3f29
MD5          3a03009cfdb734ee09a32b2bec08b289
BLAKE2b-256  b2e1083e07755d9e86d15377238ae0b0fde8d4c0df5e52d047214c7e03f6fe1c
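
A downloaded wheel can be checked against the SHA256 digest above with the standard library. This is a minimal sketch; `sha256_of` is a hypothetical helper name and the wheel path in the comment is a placeholder.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hex SHA256 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (placeholder path):
#   expected = "6a1d2f8d7ea3ab67b225e4fa0475a9fe44a6d4c791580dca05a7fae4f2aa3f29"
#   assert sha256_of("/path/to/downloaded.whl") == expected
```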
