GPU-resident SIFT with zero-copy DLPack handoff -- pure Python, OpenCV-accurate, faster end-to-end
PySIFT
GPU-Resident Deterministic SIFT for Deep Learning Vision Pipelines
A pure-Python, GPU-resident SIFT implementation that matches OpenCV SIFT accuracy while running 26% faster end-to-end, with a 4x speedup in matching. Zero-copy DLPack interop keeps tensors on the GPU across the full pipeline -- no PCIe round-trips.
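To make the zero-copy claim concrete, here is a minimal sketch of a DLPack handoff between CuPy and PyTorch. It uses only the public `torch.from_dlpack` and `cupy.from_dlpack` calls and does not touch PySIFT itself; the function name `dlpack_roundtrip` is hypothetical, and the function returns `None` on machines without a working CUDA stack.

```python
def dlpack_roundtrip():
    """Share one GPU buffer between CuPy and PyTorch (None if no CUDA stack)."""
    try:
        import cupy as cp
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    x = cp.arange(6, dtype=cp.float32).reshape(2, 3)  # CuPy array on the GPU
    t = torch.from_dlpack(x)                          # PyTorch tensor over the same buffer
    same_buffer = t.data_ptr() == x.data.ptr          # identical device pointer: no copy

    t.mul_(2)                                         # in-place edit on the PyTorch side
    torch.cuda.synchronize()
    expected = cp.arange(6, dtype=cp.float32).reshape(2, 3) * 2
    seen_by_cupy = bool((x == expected).all())        # the edit is visible through CuPy

    y = cp.from_dlpack(t)                             # back to CuPy, again without a copy
    return same_buffer, seen_by_cupy, y.data.ptr == x.data.ptr


if __name__ == "__main__":
    print(dlpack_roundtrip())
```

Because both objects alias the same device memory, an in-place write on one side is visible on the other; copy explicitly if independent buffers are needed.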
Key Features
- GPU-resident pipeline -- Detection, description, matching, RANSAC, and blending all execute on the GPU via CuPy + Numba CUDA kernels
- Zero-copy DLPack handoff -- CuPy arrays pass to PyTorch tensors without memory copies, enabling seamless integration with deep learning pipelines
- OpenCV-accurate -- Numerically equivalent to OpenCV SIFT (Lowe 2004), verified across HPatches, Oxford 5K, IMC Phototourism, and MegaDepth-1500
- Modular descriptor/matcher backends -- Swap in HardNet or HyNet (learned descriptors), or LightGlue (learned matching), with a single config flag
- Deterministic -- Bitwise-reproducible results via warp-shuffle reductions, which add values in a fixed order (atomicAdd accumulates floats in an arbitrary order, so its results can vary between runs)
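The determinism point rests on floating-point addition not being associative: the order of accumulation changes the result. The sketch below is a plain-Python illustration, not PySIFT's kernel code. `sequential_sum` stands in for atomic accumulation in whatever order threads arrive, and `warp_reduce_sum` mimics the fixed tree order of a 32-lane shuffle-down reduction. Both function names are hypothetical.

```python
def sequential_sum(values):
    """Accumulate left to right, like atomic adds landing in arrival order."""
    total = 0.0
    for v in values:
        total += v
    return total


def warp_reduce_sum(lanes):
    """Fixed-order tree reduction over 32 lanes (shuffle-down pattern)."""
    vals = list(lanes)
    if len(vals) != 32:
        raise ValueError("expected exactly 32 lane values")
    offset = 16
    while offset > 0:
        for i in range(offset):
            vals[i] += vals[i + offset]  # lane i adds the value from lane i + offset
        offset //= 2
    return vals[0]


# The same four numbers, arriving in two different orders.
arrival_a = [1e16, 1.0, -1e16, 1.0]
arrival_b = [1e16, -1e16, 1.0, 1.0]
print(sequential_sum(arrival_a))  # → 1.0
print(sequential_sum(arrival_b))  # → 2.0

# The tree reduction always pairs the same lanes, so repeated runs agree bit for bit.
lanes = arrival_a + [0.0] * 28
print(warp_reduce_sum(lanes) == warp_reduce_sum(lanes))  # → True
```

The tree order is not more accurate in general; it is reproducible because the pairing of operands never depends on thread scheduling.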
Installation
Prerequisites: CUDA dependencies
PySIFT requires an NVIDIA GPU with CUDA. Two dependencies must be installed manually because they are CUDA-version-specific:
```bash
# Check your CUDA version
nvcc --version

# 1. Install CuPy (pick ONE matching your CUDA version)
pip install cupy-cuda12x # CUDA 12.x
pip install cupy-cuda11x # CUDA 11.x

# 2. Install PyTorch with CUDA (pick ONE; a plain `pip install torch` can give a CPU-only build, e.g. on Windows)
pip install torch --index-url https://download.pytorch.org/whl/cu124 # CUDA 12.4
pip install torch --index-url https://download.pytorch.org/whl/cu121 # CUDA 12.1
pip install torch --index-url https://download.pytorch.org/whl/cu118 # CUDA 11.8
```
Important: Both CuPy and PyTorch-CUDA are required runtime dependencies but cannot be auto-installed by pip because the correct package varies by CUDA version. Install them before installing PySIFT.
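Before installing PySIFT, it can help to confirm that the GPU stack is actually in place. The following is a small sketch using only the standard library plus `torch.cuda.is_available()`; the helper name `check_gpu_stack` is hypothetical and not part of PySIFT.

```python
import importlib.util


def check_gpu_stack():
    """Report which GPU dependencies are importable and whether PyTorch sees CUDA."""
    status = {
        name: importlib.util.find_spec(name) is not None
        for name in ("cupy", "torch", "numba")
    }
    status["torch_cuda"] = False
    if status["torch"]:
        import torch
        # False here usually means a CPU-only build or no visible NVIDIA GPU.
        status["torch_cuda"] = bool(torch.cuda.is_available())
    return status


if __name__ == "__main__":
    for name, ok in check_gpu_stack().items():
        print(f"{name:12s} {'ok' if ok else 'MISSING'}")
```

A package that is importable can still target the wrong CUDA version, so treat a clean report as necessary rather than sufficient.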
Install PySIFT
```bash
# From GitHub
pip install git+https://github.com/SivaIITM/PySIFT.git

# Or from source
git clone https://github.com/SivaIITM/PySIFT.git
cd PySIFT
pip install -e .
```
Full install (all dependencies at once)
```bash
pip install cupy-cuda12x # or cupy-cuda11x
pip install -r requirements.txt # needs requirements.txt from a local checkout
pip install git+https://github.com/SivaIITM/PySIFT.git
```
Recommended: depth-aware stitching
PySIFT uses MiDaS monocular depth estimation to split inliers into depth bands, giving each band its own homography. This improves stitching quality for scenes with foreground/background parallax, where a single homography cannot align both depths at once. MiDaS depth estimation depends on timm; without timm, stitching falls back to a single global homography.
```bash
# Quote the requirement so the shell does not treat ">" as a redirect
pip install "timm>=0.9"
```
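As a rough illustration of the depth-band idea, the sketch below groups match indices by the depth sampled at each inlier. It is not PySIFT's implementation: the function name `split_into_depth_bands`, the `n_bands` parameter, and the equal-count banding rule are all assumptions made for this example.

```python
def split_into_depth_bands(depths, n_bands=2):
    """Group match indices into n_bands equal-count bands, nearest depths first.

    depths: one depth value per inlier match (assumed already sampled
    from a depth map at the keypoint location).
    """
    order = sorted(range(len(depths)), key=lambda i: depths[i])
    bands = [[] for _ in range(n_bands)]
    for rank, idx in enumerate(order):
        band = min(rank * n_bands // len(order), n_bands - 1)
        bands[band].append(idx)
    return bands


depths = [0.1, 0.9, 0.2, 0.8]
print(split_into_depth_bands(depths))  # → [[0, 2], [3, 1]]
```

Each band would then get its own homography fit. A homography needs at least four point correspondences, so a band with fewer matches would have to fall back to the global estimate.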
Optional dependencies
```bash
# Learned descriptors (HardNet, HyNet, OriNet)
pip install kornia

# YAML config file support
pip install pyyaml

# Or install all optional deps at once
pip install -e ".[all]"
```
Quick Start
Python API
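```python
import cv2
from pysift import PySIFT, GPUPyStitch

# Load inputs with OpenCV (placeholder file names; assumes NumPy image arrays are accepted)
img_left = cv2.imread("left.jpg")
img_right = cv2.imread("right.jpg")
gray_image = cv2.cvtColor(img_left, cv2.COLOR_BGR2GRAY)

# Feature extraction
sift = PySIFT()
keypoints, descriptors = sift.detectAndCompute(gray_image)

# Panoramic stitching (2 or 3 images)
stitcher = GPUPyStitch()
panorama = stitcher.stitch(img_left, img_right)
```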
CLI
```bash
# Basic stitching
pysift-stitch left.jpg right.jpg

# 3-image panorama with output directory
pysift-stitch left.jpg center.jpg right.jpg -o results/

# With config file
pysift-stitch left.jpg right.jpg --config config.yaml

# Learned pipeline
pysift-stitch left.jpg right.jpg --descriptor hardnet --matcher lightglue
```
Configuration Presets
| Preset | Orientation | Descriptor | Matcher | Use Case |
|---|---|---|---|---|
| Classic | histogram | sift | ratio | Fastest. Full Lowe 2004 pipeline |
| Modern | histogram | sift | lightglue | Best accuracy with proven detection |
| Learned | orinet | hardnet | lightglue | Fully modern pipeline |
| Mobile | histogram | sift | ratio | Large phone images (auto-resize + denoise) |
See config.yaml for all parameters and presets.
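The `ratio` matcher in the Classic and Mobile presets refers to the nearest-neighbour ratio test from Lowe 2004. For readers unfamiliar with it, here is a minimal CPU sketch in plain Python. It shows the logic only and is not PySIFT's GPU matcher; the function name and the default threshold of 0.75 are assumptions.

```python
import math


def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """Keep a match only if the best neighbour is clearly closer than the second best."""
    matches = []
    for i, da in enumerate(desc_a):
        # Distances from descriptor i to every candidate, nearest first.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the test needs two neighbours to compare
        (best, j_best), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((i, j_best))
    return matches


desc_a = [(0.0, 0.0), (5.0, 5.0)]
desc_b = [(0.0, 0.1), (10.0, 10.0), (5.0, 5.2)]
print(ratio_test_matches(desc_a, desc_b))  # → [(0, 0), (1, 2)]

# An ambiguous descriptor, equally close to two candidates, is rejected.
print(ratio_test_matches([(1.0, 0.0)], [(0.0, 0.0), (2.0, 0.0)]))  # → []
```

Lowering `ratio` keeps fewer but more distinctive matches; raising it admits more matches at the cost of more outliers for RANSAC to reject.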
Requirements
Hardware
- NVIDIA GPU with CUDA support (tested on RTX 3050 4GB and above)
- CUDA Toolkit 11.x or 12.x
Software
| Package | Version | Purpose |
|---|---|---|
| Python | >= 3.9 | Runtime |
| PyTorch | >= 2.0 | Tensor ops, SVD, CUDA graphs |
| CuPy | >= 12.0 | GPU arrays, CUDA kernels |
| Numba | >= 0.57 | JIT-compiled CUDA kernels |
| NumPy | >= 1.22 | CPU array operations |
| OpenCV | >= 4.5 | Image I/O, CLAHE |
| kornia | >= 0.7 | Optional: HardNet, HyNet, OriNet |
| timm | >= 0.9 | Optional: MiDaS depth estimation |
| PyYAML | any | Optional: config file support |
Citation
```bibtex
@article{sivakumar2026pysift,
  title   = {PySIFT: GPU-Resident Deterministic SIFT for Deep Learning Vision Pipelines},
  author  = {Sivakumar, K.S.},
  journal = {arXiv preprint},
  year    = {2026}
}
```
License
This project is licensed under the MIT License -- see the LICENSE file for details.