neuromeka_stereo

TensorRT stereo inference utilities packaged for PyPI.

1. Install

1.1 Python package

pip install neuromeka_stereo

1.2 TensorRT runtime (plan execution)

Choose the Python TensorRT package that matches your driver CUDA major version.

CUDA 12.x drivers:

python -m pip install --extra-index-url https://pypi.nvidia.com tensorrt-cu12==10.14.1.48.post1
python -m pip install "cuda-python<13"  # CUDA 12.x drivers

CUDA 13.0+ drivers:

python -m pip install --extra-index-url https://pypi.nvidia.com "tensorrt-cu13==10.15.1.*"
python -m pip install cuda-python

If the exact TensorRT version differs, replace the version string with the latest available from the NVIDIA PyPI index.
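Choosing between the two wheels can be scripted. The helper below is illustrative only (tensorrt_package_for is not part of this package); it maps a driver CUDA version string, e.g. as reported by nvidia-smi, to the matching wheel name:

```python
def tensorrt_package_for(cuda_version: str) -> str:
    """Map a driver CUDA version string (e.g. "12.4") to the matching
    TensorRT wheel name on the NVIDIA PyPI index."""
    major = int(cuda_version.split(".")[0])
    if major >= 13:
        return "tensorrt-cu13"
    if major == 12:
        return "tensorrt-cu12"
    raise ValueError(f"Unsupported CUDA major version: {major}")

print(tensorrt_package_for("12.4"))  # tensorrt-cu12
print(tensorrt_package_for("13.0"))  # tensorrt-cu13
```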

1.3 Assets (manual placement required)

Until official download URLs are published, plan/onnx files must be placed manually. If a required plan file is missing or empty, StereoInference raises an error with the expected path and filename.

Place plan files here (preferred for this repo):

src/neuromeka_stereo/assets/

Or place them in the cache directory:

~/.cache/neuromeka_stereo

Expected filenames:

  • foundation_stereo_480x640_RTX4060.plan
  • foundation_stereo_480x640_RTX5060.plan
  • foundation_stereo_480x640_RTX5090.plan
  • foundation_stereo_480x640_RTX5070Ti.plan

Default selection:

  • If trt_path is omitted, NEUROMEKA_STEREO_DEFAULT_PLAN is used.
  • If that is not set, the default is foundation_stereo_480x640_RTX4060.plan.
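For example, the default plan can be overridden through the environment before constructing StereoInference (the plan name shown is one of the expected assets listed above):

```python
import os

# Read when trt_path is omitted; must be set before StereoInference is created.
os.environ["NEUROMEKA_STEREO_DEFAULT_PLAN"] = "foundation_stereo_480x640_RTX5090.plan"
```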

2. Usage

2.1 StereoInference API

Basic usage:

from neuromeka_stereo import StereoInference

fs = StereoInference(
    trt_path="foundation_stereo_480x640_RTX4060.plan",
    trt_width=640,
    trt_height=480,
    intrinsics=intrinsics,
    extrinsics=extrinsics,
    color_intrinsics=color_intrinsics,
    extrinsics_ir_to_color=extrinsics_ir_to_color,
    z_far=10.0
)

disp = fs.infer(left_ir, right_ir)
depth_m = fs.infer(left_ir, right_ir, return_depth=True)
depth_aligned = fs.align_depth_splat(depth_m, out_size=(color_height, color_width))

StereoInference(trt_path=None, trt_width=0, trt_height=0, intrinsics=None, extrinsics=None, color_intrinsics=None, extrinsics_ir_to_color=None, z_far=0.0)

  • trt_path: Path to a .plan/.engine file. If the value is not an existing path, it is resolved as an asset name against src/neuromeka_stereo/assets/ and ~/.cache/neuromeka_stereo. If omitted, the default asset foundation_stereo_480x640_RTX4060.plan is used. If no matching local asset exists, a FileNotFoundError is raised.
  • trt_width, trt_height: Optional input size override. Required for dynamic engines that do not expose a fixed input shape. If the engine has a fixed shape, that shape is used instead.
  • intrinsics: Camera intrinsics for the left IR image. Must provide fx and the source resolution (width, height). The class scales fx to the inference resolution internally. RealSense intrinsics objects are supported.
  • extrinsics: Stereo extrinsics between IR1 and IR2. Must provide a translation vector (meters). The baseline is the translation norm.
  • color_intrinsics: Intrinsics for the RGB camera used in alignment.
  • extrinsics_ir_to_color: Extrinsics from IR1 to RGB for alignment.
  • z_far: Depth clip in meters. Set to 0 to disable clipping.
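As a quick illustration of the baseline note above (the translation values below are made-up examples, not real calibration):

```python
import numpy as np

# Baseline is the norm of the IR1->IR2 translation vector (meters).
translation = np.array([-0.0501, 0.0002, 0.0001])  # example values only
baseline = float(np.linalg.norm(translation))
print(round(baseline, 4))  # ~0.0501
```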

infer(left_ir, right_ir, return_depth=False)

  • left_ir, right_ir: NumPy arrays with matching H x W. Supported shapes are (H, W), (H, W, 1), (H, W, 3). Grayscale inputs are replicated to 3 channels. Values are converted to float32 internally.
  • If input size does not match the engine input, a ValueError is raised. Resize inputs to the engine shape before calling infer().
  • Default output is a disparity map (float32) in pixels with shape (H, W).
  • If return_depth=True, intrinsics and extrinsics must be set and the disparity is converted to metric depth: depth = (fx * baseline) / disp. The returned depth is in meters and optionally clipped by z_far.
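The disparity-to-depth conversion can be sketched in NumPy; fx and baseline below are example values, not real calibration:

```python
import numpy as np

# Disparity (pixels) -> metric depth, matching depth = fx * baseline / disp.
fx = 382.5          # focal length at the inference resolution (example)
baseline = 0.0501   # meters (example)
z_far = 10.0

disp = np.array([[19.2, 38.4], [0.0, 96.0]], dtype=np.float32)
with np.errstate(divide="ignore"):
    # Zero disparity means no valid match; report depth 0 there.
    depth = np.where(disp > 0, fx * baseline / disp, 0.0)
depth = np.clip(depth, 0.0, z_far)  # z_far clip; skipped when z_far == 0
```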

align_depth_splat(depth_ir, color_intrinsics=None, extrinsics_ir_to_color=None, ir_intrinsics=None, out_size=None, splat_size=2, z_far=None)

  • Aligns a depth map (IR space) into the RGB frame using a splat projection.
  • depth_ir: Depth map from infer(..., return_depth=True).
  • color_intrinsics: Intrinsics for the RGB camera (optional if set on the instance).
  • extrinsics_ir_to_color: Extrinsics from IR1 to RGB (optional if set on the instance).
  • ir_intrinsics: Intrinsics for the IR camera. If omitted, uses self.intrinsics.
  • out_size: (H, W) output size override for the aligned map.
  • splat_size: Splat kernel size (default 2 for a 2x2 splat).
  • z_far: Optional clip distance. Defaults to self.z_far if not provided.
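For intuition, here is a minimal NumPy sketch of a splat reprojection. It illustrates the idea only, not the package's actual implementation; align_depth_splat_sketch and its arguments are hypothetical:

```python
import numpy as np

def align_depth_splat_sketch(depth_ir, K_ir, K_rgb, R, t, out_size, splat=2):
    """Back-project IR depth to 3D, transform to the RGB frame, project with
    the RGB intrinsics, and splat each point into a splat x splat window,
    keeping the nearest depth per output pixel."""
    h, w = depth_ir.shape
    v, u = np.mgrid[0:h, 0:w]
    z = depth_ir.ravel()
    valid = z > 0
    u, v, z = u.ravel()[valid], v.ravel()[valid], z[valid]

    # Back-project to 3D in the IR frame, then transform to the RGB frame.
    x = (u - K_ir[0, 2]) / K_ir[0, 0] * z
    y = (v - K_ir[1, 2]) / K_ir[1, 1] * z
    pts = R @ np.stack([x, y, z]) + t[:, None]

    # Project into the RGB image plane.
    uc = K_rgb[0, 0] * pts[0] / pts[2] + K_rgb[0, 2]
    vc = K_rgb[1, 1] * pts[1] / pts[2] + K_rgb[1, 2]

    out_h, out_w = out_size
    aligned = np.full((out_h, out_w), np.inf, dtype=np.float32)
    for du in range(splat):
        for dv in range(splat):
            ui = np.round(uc).astype(int) + du
            vi = np.round(vc).astype(int) + dv
            ok = (ui >= 0) & (ui < out_w) & (vi >= 0) & (vi < out_h)
            # Keep the nearest surface where splats overlap.
            np.minimum.at(aligned, (vi[ok], ui[ok]), pts[2][ok])
    aligned[np.isinf(aligned)] = 0.0
    return aligned
```

With an identity transform and identical intrinsics, the sketch reproduces the input depth, which is a handy sanity check when wiring up real calibration.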

Notes:

  • Create one StereoInference instance and reuse it for streaming to avoid reloading the TensorRT engine each frame.
  • Call ensure_loaded() to force engine loading and shape detection early.
  • Call unload() (alias close()/release()) to free TensorRT engine/context and GPU buffers when not needed; the next infer() or ensure_loaded() will reload the engine.
  • Plan files are GPU model and TensorRT major-version specific.
  • Use set_calibration(intrinsics=..., extrinsics=...) if camera calibration values change after initialization.
  • Use set_alignment(color_intrinsics=..., extrinsics_ir_to_color=...) if RGB alignment parameters change after initialization.
  • Make sure intrinsics.width/height match the raw input resolution before any resizing; the class scales to the inference resolution.
  • Note: With a 480x640 FP16 plan, RTX 5060 measured ~2794 MiB VRAM after load (actual usage varies by environment, driver, and other processes).
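The fx scaling mentioned in the notes can be illustrated as follows (the numbers are examples only):

```python
# How fx is scaled from the source resolution to the inference resolution
# (the class does this internally; shown here for clarity).
src_width, trt_width = 1280, 640
fx_src = 636.2                          # example fx at the source resolution
fx_trt = fx_src * trt_width / src_width
print(round(fx_trt, 2))  # 318.1
```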

2.2 RealSense demo (camera quickstart)

This example connects to a RealSense camera and runs inference on the IR stereo pair. It also compares the output with RealSense depth for visualization.

Install the dependency and run:

python -m pip install "neuromeka_stereo[demo]"
# or: python -m pip install opencv-python pyrealsense2
neuromeka-stereo-realsense \
  --trt_path foundation_stereo_480x640_RTX4060.plan \
  --width 640 --height 480 --fps 30 --z_far 10

If you are running from the repo:

python examples/run_realsense_demo.py \
  --trt_path foundation_stereo_480x640_RTX4060.plan \
  --width 640 --height 480 --fps 30 --z_far 10

Notes:

  • Ensure the plan file exists in src/neuromeka_stereo/assets/ or ~/.cache/neuromeka_stereo before running.
  • The plan input size is fixed; set --width/--height to match the plan.
  • In the demo, intrinsics comes from the RealSense IR stream and is scaled internally to the inference resolution. extrinsics is IR1->IR2, and z_far comes from --z_far (default 10m).
  • The display shows RGB, RS Depth, FS Depth (Raw), and FS Depth (Aligned).

3. Optional

Rebuild plan from ONNX

If you also have an ONNX file on the new machine, build a fresh plan. Place the ONNX file under src/neuromeka_stereo/assets/onnx/ or adjust the path below.

Install trtexec (choose a CUDA repo that matches your target runtime).

CUDA 12.9 repo (TensorRT 10.14.x example):

sudo apt-get install -y --allow-downgrades \
  libnvinfer-bin=10.14.1.48-1+cuda12.9 \
  libnvinfer10=10.14.1.48-1+cuda12.9 \
  libnvinfer-plugin10=10.14.1.48-1+cuda12.9 \
  libnvonnxparsers10=10.14.1.48-1+cuda12.9 \
  libnvinfer-lean10=10.14.1.48-1+cuda12.9 \
  libnvinfer-vc-plugin10=10.14.1.48-1+cuda12.9 \
  libnvinfer-dispatch10=10.14.1.48-1+cuda12.9

CUDA 13.x repo (13.0/13.1, TensorRT 10.15.x example):

sudo apt-get install -y --allow-downgrades \
  libnvinfer-bin=10.15.1.29-1+cuda13.1 \
  libnvinfer10=10.15.1.29-1+cuda13.1 \
  libnvinfer-plugin10=10.15.1.29-1+cuda13.1 \
  libnvonnxparsers10=10.15.1.29-1+cuda13.1 \
  libnvinfer-lean10=10.15.1.29-1+cuda13.1 \
  libnvinfer-vc-plugin10=10.15.1.29-1+cuda13.1 \
  libnvinfer-dispatch10=10.15.1.29-1+cuda13.1

If the exact version string differs, check the available versions and replace the numbers above:

apt-cache policy libnvinfer-bin | sed -n '1,20p'

Then build the plan with trtexec:

/usr/src/tensorrt/bin/trtexec \
  --onnx=./src/neuromeka_stereo/assets/onnx/foundation_stereo_23-51-11_640x480.onnx \
  --saveEngine=./src/neuromeka_stereo/assets/foundation_stereo_480x640_RTX4060.plan \
  --fp16 \
  --shapes=left:1x3x480x640,right:1x3x480x640 \
  --skipInference

Local development setup (optional)

If you want to edit the package in-place or use the full dev stack:

cd /home/user/neuromeka-repo/nrmk_foundation_stereo
python -m pip install -e .
conda env create -f environment.yml
conda activate nrmk_fs

Appendix: Folder layout

./
  src/
    neuromeka_stereo/
      fs_infer.py
      assets/
        (empty by default; place plan/onnx here)
      FoundationStereo_TRT/
        core/
        depth_anything_pretrained_models/
        Utils.py
  examples/
    run_realsense_demo.py
  pyproject.toml
  environment.yml
  README.md

Appendix: Benchmark (D435 IR, 480x640)

Preliminary results; latencies are approximate and subject to revision.

GPU         Backend        Input    Latency (sec)  Notes
RTX 4060    TensorRT plan  480x640  ~0.42          TRT plan execution
RTX 5060    TensorRT plan  480x640  ~0.30          TRT plan execution
RTX 5070Ti  TensorRT plan  480x640  ~0.24          TRT plan execution
RTX 5090    TensorRT plan  480x640  ~0.07          TRT plan execution
RTX 4060    Torch          480x640  ~1.12          Torch inference
