
GPU-resident SIFT with zero-copy DLPack handoff -- pure Python, OpenCV-accurate, faster end-to-end


PySIFT

GPU-Resident Deterministic SIFT for Deep Learning Vision Pipelines

Python 3.9+ | License: MIT | CUDA

A pure-Python, GPU-resident SIFT implementation that matches OpenCV SIFT accuracy while running 26% faster end-to-end, with a 4x speedup in matching. Zero-copy DLPack interop keeps tensors on the GPU across the full pipeline -- no PCIe round-trips.
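As a rough illustration of the zero-copy handoff, the sketch below uses only the public DLPack entry points of CuPy and PyTorch (`torch.from_dlpack` and `cupy.from_dlpack`). It does not call PySIFT itself: the helper name `roundtrip_shares_memory` and the 1000 x 128 array standing in for a batch of descriptors are illustrative assumptions. Running it requires a CUDA GPU with both libraries installed.

```python
def roundtrip_shares_memory(n_rows: int = 1000, dim: int = 128) -> bool:
    """Return True if a CuPy array and its PyTorch view share one GPU buffer."""
    # Imported lazily so this module can be loaded on a machine without a GPU.
    import cupy as cp
    import torch

    # Stand-in for GPU-resident descriptors (the shape is an assumption).
    desc_cp = cp.zeros((n_rows, dim), dtype=cp.float32)

    # CuPy -> PyTorch through the DLPack protocol: no device-to-host copy.
    desc_t = torch.from_dlpack(desc_cp)

    # Write through the PyTorch view and wait for the GPU to finish.
    desc_t[0, 0] = 1.0
    torch.cuda.synchronize()

    # Both objects point at the same device address, and the write is
    # visible through the original CuPy array.
    same_buffer = desc_t.data_ptr() == desc_cp.data.ptr
    visible = float(desc_cp[0, 0]) == 1.0

    # PyTorch -> CuPy works the same way.
    back_cp = cp.from_dlpack(desc_t)
    return same_buffer and visible and back_cp.data.ptr == desc_cp.data.ptr
```

Because both objects alias the same memory, an in-place update on one side changes the other; clone the tensor first if an independent copy is needed.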

Architecture

[Figure: PySIFT architecture diagram]

Key Features

  • GPU-resident pipeline -- Detection, description, matching, RANSAC, and blending all execute on the GPU via CuPy + Numba CUDA kernels
  • Zero-copy DLPack handoff -- CuPy arrays pass to PyTorch tensors without memory copies, enabling seamless integration with deep learning pipelines
  • OpenCV-accurate -- Numerically equivalent to OpenCV SIFT (Lowe 2004), verified across HPatches, Oxford 5K, IMC Phototourism, and MegaDepth-1500
  • Modular descriptor/matcher backends -- Swap in HardNet, HyNet (learned descriptors) or LightGlue (learned matching) with a single config flag
  • Deterministic -- Bitwise reproducible results via warp-shuffle reductions (no atomicAdd non-determinism)
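The determinism point rests on floating-point addition not being associative: accumulating the same values in a different order can round differently, which is what happens when threads reach an `atomicAdd` in varying order. A reduction with a fixed addition order avoids this. The sketch below emulates the idea on the CPU in plain Python; it is not PySIFT's kernel, and the function names and test values are illustrative assumptions.

```python
WARP_SIZE = 32

def warp_tree_sum(lanes):
    """Emulate a shuffle-down reduction over one 32-lane warp.

    At each step lane i adds the value held by lane i + offset, with the
    offset halving from 16 down to 1. The order of additions is fixed by
    the lane index, so the rounded result is the same on every run.
    """
    if len(lanes) != WARP_SIZE:
        raise ValueError("expected exactly one value per lane")
    vals = list(lanes)
    offset = WARP_SIZE // 2
    while offset > 0:
        for i in range(offset):
            vals[i] = vals[i] + vals[i + offset]
        offset //= 2
    return vals[0]

def arrival_order_sum(values):
    """Emulate atomicAdd: accumulate in whatever order the values arrive."""
    total = 0.0
    for v in values:
        total += v
    return total

# The same 32 numbers in two different arrival orders.
ones = [1.0] * 30
order_a = [1e16] + ones + [-1e16]   # the large value arrives first
order_b = ones + [1e16, -1e16]      # the small values arrive first

sum_a = arrival_order_sum(order_a)  # the ones are absorbed by 1e16
sum_b = arrival_order_sum(order_b)  # the ones survive
tree = warp_tree_sum(order_a)       # fixed order, same result every run
```

Here `sum_a` and `sum_b` differ even though the inputs are identical as a set, while the tree reduction always returns the same value for a given lane assignment. The goal is reproducibility, not a more accurate sum.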

Qualitative Results

[Figure: stitching results]

Installation

Prerequisites: CUDA dependencies

PySIFT requires an NVIDIA GPU with CUDA. Two dependencies must be installed manually because they are CUDA-version-specific:

# Check your CUDA version
nvcc --version

# 1. Install CuPy (pick ONE matching your CUDA version)
pip install cupy-cuda12x   # CUDA 12.x
pip install cupy-cuda11x   # CUDA 11.x

# 2. Install PyTorch with CUDA (pick ONE; a plain `pip install torch` may give a CPU-only build)
pip install torch --index-url https://download.pytorch.org/whl/cu124   # CUDA 12.4
pip install torch --index-url https://download.pytorch.org/whl/cu121   # CUDA 12.1
pip install torch --index-url https://download.pytorch.org/whl/cu118   # CUDA 11.8

Important: Both CuPy and PyTorch-CUDA are required runtime dependencies but cannot be auto-installed by pip because the correct package varies by CUDA version. Install them before installing PySIFT.
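The version matching above can be scripted. The following is a minimal sketch that parses the `release X.Y` string printed by `nvcc --version` and looks up the matching commands; the wheel names and index URLs are copied from the commands listed earlier, while the helper names `parse_nvcc_release` and `suggest_installs` are hypothetical. CUDA versions not listed above simply produce no PyTorch suggestion.

```python
import re
from typing import List, Optional, Tuple

# Wheel names and index URLs taken from the install commands above.
CUPY_WHEELS = {12: "cupy-cuda12x", 11: "cupy-cuda11x"}
TORCH_INDEXES = {
    (12, 4): "https://download.pytorch.org/whl/cu124",
    (12, 1): "https://download.pytorch.org/whl/cu121",
    (11, 8): "https://download.pytorch.org/whl/cu118",
}

def parse_nvcc_release(nvcc_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def suggest_installs(nvcc_output: str) -> List[str]:
    """Return the pip commands that match the detected CUDA version."""
    version = parse_nvcc_release(nvcc_output)
    if version is None:
        return []
    commands = []
    cupy_wheel = CUPY_WHEELS.get(version[0])
    if cupy_wheel is not None:
        commands.append(f"pip install {cupy_wheel}")
    torch_index = TORCH_INDEXES.get(version)
    if torch_index is not None:
        commands.append(f"pip install torch --index-url {torch_index}")
    return commands

sample = "Cuda compilation tools, release 12.4, V12.4.131"
suggested = suggest_installs(sample)
```

In practice the input string could come from `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout`.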

Install PySIFT

# From GitHub
pip install git+https://github.com/SivaIITM/PySIFT.git

# Or from source
git clone https://github.com/SivaIITM/PySIFT.git
cd PySIFT
pip install -e .

Full install (all dependencies at once)

pip install cupy-cuda12x   # or cupy-cuda11x
pip install -r requirements.txt   # needs requirements.txt from a PySIFT checkout
pip install git+https://github.com/SivaIITM/PySIFT.git

Recommended: depth-aware stitching

PySIFT uses MiDaS monocular depth estimation to split inliers into depth bands and fits a separate homography to each band. This significantly improves stitching quality for scenes with foreground/background parallax. The depth model depends on timm; without timm installed, stitching falls back to a single global homography.

pip install "timm>=0.9"
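To make the depth-band idea described above concrete, here is a small sketch of the grouping step only. The source does not say how PySIFT chooses its bands, so this assumes equal-count bands over a per-match depth value; the function name `split_into_depth_bands` and the sample depths are hypothetical. Fitting a homography to each band is left out.

```python
from typing import Dict, List, Sequence

def split_into_depth_bands(
    depths: Sequence[float], n_bands: int = 2
) -> Dict[int, List[int]]:
    """Group match indices into n_bands equal-count bands by depth value.

    Band 0 holds the matches with the lowest depth values, the last band
    those with the highest. Each band would then get its own homography.
    """
    if n_bands < 1:
        raise ValueError("n_bands must be at least 1")
    bands: Dict[int, List[int]] = {b: [] for b in range(n_bands)}
    # Match indices sorted from lowest to highest depth value.
    order = sorted(range(len(depths)), key=lambda i: depths[i])
    for rank, idx in enumerate(order):
        # Integer arithmetic maps ranks 0..len-1 onto bands 0..n_bands-1.
        bands[rank * n_bands // len(order)].append(idx)
    return bands

# One depth value per inlier match (illustrative numbers).
inlier_depths = [0.9, 0.1, 0.8, 0.2]
bands = split_into_depth_bands(inlier_depths, n_bands=2)
```

A homography needs at least four point correspondences, so a band with fewer matches would have to reuse a neighbouring band's transform or the global one.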

Optional dependencies

# Learned descriptors (HardNet, HyNet, OriNet)
pip install kornia

# YAML config file support
pip install pyyaml

# Or install all optional deps at once
pip install -e ".[all]"

Quick Start

Python API

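import cv2
from pysift import PySIFT, GPUPyStitch

# Load inputs with OpenCV (file names are placeholders; this assumes
# PySIFT accepts NumPy arrays as returned by cv2.imread)
img_left = cv2.imread("left.jpg")
img_right = cv2.imread("right.jpg")
gray_image = cv2.cvtColor(img_left, cv2.COLOR_BGR2GRAY)

# Feature extraction
sift = PySIFT()
keypoints, descriptors = sift.detectAndCompute(gray_image)

# Panoramic stitching (2 or 3 images)
stitcher = GPUPyStitch()
panorama = stitcher.stitch(img_left, img_right)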

CLI

# Basic stitching
pysift-stitch left.jpg right.jpg

# 3-image panorama with output directory
pysift-stitch left.jpg center.jpg right.jpg -o results/

# With config file
pysift-stitch left.jpg right.jpg --config config.yaml

# Learned pipeline
pysift-stitch left.jpg right.jpg --descriptor hardnet --matcher lightglue
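For batch jobs it can be handy to drive the CLI from Python. The sketch below builds an argument list using only the flags that appear in the examples above (`-o`, `--config`, `--descriptor`, `--matcher`); the helper names are hypothetical, and whether all flags may be combined in a single call is an assumption.

```python
import subprocess
from typing import List, Optional, Sequence

def build_stitch_command(
    images: Sequence[str],
    output_dir: Optional[str] = None,
    config: Optional[str] = None,
    descriptor: Optional[str] = None,
    matcher: Optional[str] = None,
) -> List[str]:
    """Build the argument list for one pysift-stitch invocation."""
    if len(images) not in (2, 3):
        raise ValueError("stitching is documented for 2 or 3 images")
    cmd = ["pysift-stitch", *images]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    if config is not None:
        cmd += ["--config", config]
    if descriptor is not None:
        cmd += ["--descriptor", descriptor]
    if matcher is not None:
        cmd += ["--matcher", matcher]
    return cmd

def run_stitch(images: Sequence[str], **options: Optional[str]) -> int:
    """Run pysift-stitch and return its exit code."""
    return subprocess.run(build_stitch_command(images, **options)).returncode

learned = build_stitch_command(
    ["left.jpg", "right.jpg"], descriptor="hardnet", matcher="lightglue"
)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.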

Configuration Presets

| Preset | Orientation | Descriptor | Matcher | Use Case |
|---|---|---|---|---|
| Classic | histogram | sift | ratio | Fastest. Full Lowe 2004 pipeline |
| Modern | histogram | sift | lightglue | Best accuracy with proven detection |
| Learned | orinet | hardnet | lightglue | Fully modern pipeline |
| Mobile | histogram | sift | ratio | Large phone images (auto-resize + denoise) |

See config.yaml for all parameters and presets.
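The `ratio` matcher in the Classic and Mobile presets presumably refers to the nearest-neighbour ratio test from Lowe 2004. As a reference point, here is a brute-force sketch of that test in plain Python. It is not PySIFT's GPU matcher, the function name is hypothetical, and the 0.8 threshold follows Lowe's paper rather than any documented PySIFT default.

```python
import math
from typing import List, Sequence, Tuple

def ratio_match(
    query: Sequence[Sequence[float]],
    train: Sequence[Sequence[float]],
    ratio: float = 0.8,
) -> List[Tuple[int, int]]:
    """Return (query_index, train_index) pairs that pass the ratio test."""
    matches = []
    for qi, q in enumerate(query):
        # Distances to every train descriptor, nearest first.
        dists = sorted((math.dist(q, t), ti) for ti, t in enumerate(train))
        if len(dists) < 2:
            continue  # the test needs a second-nearest neighbour
        (d1, best), (d2, _) = dists[0], dists[1]
        # Keep the match only if the best is clearly closer than the runner-up.
        if d1 < ratio * d2:
            matches.append((qi, best))
    return matches

# Toy 2-D "descriptors": the first query has one clear neighbour, the
# second is equally close to two train points and is rejected.
query = [(0.0, 0.0), (5.0, 5.0)]
train = [(0.0, 1.0), (10.0, 10.0), (5.0, 6.0), (5.0, 4.0)]
matches = ratio_match(query, train)
```

Rejecting ambiguous matches this way trades some recall for fewer false correspondences going into RANSAC.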

Requirements

Hardware

  • NVIDIA GPU with CUDA support (tested on RTX 3050 4GB and above)
  • CUDA Toolkit 11.x or 12.x

Software

| Package | Version | Purpose |
|---|---|---|
| Python | >= 3.9 | Runtime |
| PyTorch | >= 2.0 | Tensor ops, SVD, CUDA graphs |
| CuPy | >= 12.0 | GPU arrays, CUDA kernels |
| Numba | >= 0.57 | JIT-compiled CUDA kernels |
| NumPy | >= 1.22 | CPU array operations |
| OpenCV | >= 4.5 | Image I/O, CLAHE |
| kornia | >= 0.7 | Optional: HardNet, HyNet, OriNet |
| timm | >= 0.9 | Optional: MiDaS depth estimation |
| PyYAML | any | Optional: config file support |

Citation

@article{sivakumar2026pysift,
  title   = {PySIFT: GPU-Resident Deterministic SIFT for Deep Learning Vision Pipelines},
  author  = {Sivakumar, K.S.},
  journal = {arXiv preprint},
  year    = {2026}
}

License

This project is licensed under the MIT License -- see the LICENSE file for details.
