GPU Virtual Memory Stitching SDK: CUDA VMM allocator with chunk caching and DLPack tensors for PyTorch
GPU Virtual Memory Management SDK
CUDA virtual memory management (VMM) with physical chunk caching and DLPack-backed PyTorch tensors.
Ships as a pre-compiled wheel — no compiler or build tools required on install.
Requirements
| Component | Requirement |
|---|---|
| Python | 3.12 (cp312) |
| PyTorch | any CUDA build |
| CUDA | 12.x driver and runtime |
| Platform | Linux x86_64 (glibc ≥ 2.34 — Ubuntu 22.04+, RHEL 9+) |
Install
# 1. Install PyTorch for your CUDA version (https://pytorch.org)
pip install torch
# 2. On HPC, load CUDA if it is not in your path
module load cuda
# 3. Install deep-variance
pip install deep-variance
Usage
import torch
from deep_variance import (
    vmm_empty,
    vmm_empty_nd,
    set_cache_limit,
    cache_stats,
)
# 1-D allocation: 1 M float32 elements on CUDA device 0
t = vmm_empty(1_000_000, dtype=torch.float32, device="cuda:0")
# N-D allocation: (100, 1000) float32
t = vmm_empty_nd((100, 1000), dtype=torch.float32)
# Tune the physical chunk cache (2 GB per pool)
set_cache_limit(device_id=0, chunk_bytes=0, max_bytes=2 * 1024**3)
# Inspect cache utilisation
print(cache_stats())
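The cache keeps freed physical chunks around for reuse instead of returning them to the driver, evicting old entries once `max_bytes` is exceeded. A minimal pure-Python sketch of a byte-limited chunk cache — illustrative only; the class name, methods, and FIFO eviction policy are assumptions, not the library's actual implementation:

```python
from collections import OrderedDict

class ChunkCache:
    """Illustrative byte-limited cache of freed physical chunks."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.cached_bytes = 0
        self._chunks = OrderedDict()  # chunk_id -> size, oldest first

    def put(self, chunk_id, size):
        """Cache a freed chunk, evicting oldest entries over the limit."""
        self._chunks[chunk_id] = size
        self.cached_bytes += size
        while self.cached_bytes > self.max_bytes:
            _, evicted_size = self._chunks.popitem(last=False)
            self.cached_bytes -= evicted_size

    def take(self, size):
        """Pop and return a cached chunk id of the requested size, or None."""
        for chunk_id, sz in self._chunks.items():
            if sz == size:
                del self._chunks[chunk_id]
                self.cached_bytes -= size
                return chunk_id
        return None

cache = ChunkCache(max_bytes=4096)
cache.put("a", 2048)
cache.put("b", 2048)
cache.put("c", 2048)      # exceeds 4096 bytes, so "a" (oldest) is evicted
print(cache.take(2048))   # "b" — oldest remaining chunk of that size
print(cache.cached_bytes) # 2048 — only "c" is still cached
```

A later allocation of the same size can then be served from the cache without another round trip to the CUDA driver.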
Environment check
deep-variance-check # check C++, torch, CUDA
deep-variance-check --module-load # also run `module load cuda` if CUDA not visible
Or from Python:
from deep_variance import check_environment, ensure_cuda_visible
ensure_cuda_visible(use_module=True) # attempt `module load cuda` if needed
report = check_environment()
for name, (ok, msg) in report.items():
    print(f"{name}: {'ok' if ok else 'MISSING'} — {msg}")
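Since `check_environment()` returns a mapping of check name to an `(ok, message)` pair, failures are easy to collect programmatically. A sketch with hypothetical report contents (the check names and messages below are illustrative, not the tool's actual output):

```python
# Hypothetical report in the {name: (ok, message)} shape
# that check_environment() returns:
report = {
    "c++": (True, "libstdc++ found"),
    "torch": (True, "torch with CUDA build detected"),
    "cuda": (False, "CUDA driver/runtime not visible"),
}

# Collect the names of failed checks
failures = [name for name, (ok, _) in report.items() if not ok]
print("environment ok" if not failures else f"missing: {', '.join(failures)}")
```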
Analytics (opt-out)
Usage telemetry is enabled by default. Events are sent from a background daemon thread and never block the caller. All network and I/O errors are silently ignored. No personally identifiable information is collected.
To opt out, set the environment variable before importing:
export DEEP_VARIANCE_NO_TELEMETRY=1
Or disable at runtime:
from deep_variance import disable_analytics, analytics_summary
disable_analytics() # stop for this process
print(analytics_summary()) # inspect counts collected so far
To write events to a local SQLite file instead of the network endpoint:
export DEEP_VARIANCE_ANALYTICS_DB=/path/to/events.db
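A local events database can then be inspected with the standard-library `sqlite3` module. The table name and columns below are assumptions for illustration — the actual schema written by `DEEP_VARIANCE_ANALYTICS_DB` is not documented here:

```python
import sqlite3

# Hypothetical schema; substitute the real table layout once known.
conn = sqlite3.connect(":memory:")  # use the path from DEEP_VARIANCE_ANALYTICS_DB
conn.execute("CREATE TABLE events (ts REAL, name TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (0.0, "vmm_empty", "{}"),
        (1.0, "vmm_empty", "{}"),
        (2.0, "cache_stats", "{}"),
    ],
)

# Count recorded events per name, most frequent first
rows = conn.execute(
    "SELECT name, COUNT(*) FROM events GROUP BY name ORDER BY COUNT(*) DESC"
).fetchall()
print(rows)  # [('vmm_empty', 2), ('cache_stats', 1)]
```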
Development
git clone <repo>
cd deepvariance-ms-sdk
pip install -e ".[dev]"
pytest # unit tests (no GPU required)
pytest --run-cuda-live # + CUDA-live tests (requires CUDA GPU)
License
MIT
File details
Details for the file deep_variance-1.0.4-cp312-cp312-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: deep_variance-1.0.4-cp312-cp312-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 4.0 MB
- Tags: CPython 3.12, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `ae75a4ae0c4f3f1a3fb0d2424069fb77df823c051df9cf8321c70e77fd95e585` |
| MD5 | `98bd44bdf750fcaa3083db8b3ccc0a97` |
| BLAKE2b-256 | `75d264e26584e7d7e932ac4e7ef82744302bfb7e286b0a409d9014b59ccb1d6f` |
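A downloaded wheel can be checked against the SHA256 digest above with the standard-library `hashlib` module; a small sketch that streams the file in chunks rather than loading it whole:

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 without loading it all into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

# Compare against the digest published in the table above:
expected = "ae75a4ae0c4f3f1a3fb0d2424069fb77df823c051df9cf8321c70e77fd95e585"
# sha256_of("deep_variance-1.0.4-cp312-cp312-manylinux_2_34_x86_64.whl") == expected
```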