Predict GPU execution time & memory for PyTorch models, without running them.
Blink
GPU Performance Predictor for Deep Learning Models
Blink predicts the execution time and peak memory usage of PyTorch neural networks on GPU hardware before you actually run or deploy them.
It combines classical ML (XGBoost, Random Forest) with a Graph Neural Network (GNN) that encodes the computational graph of any model architecture, acting as a "virtual profiler."
Quick Start
Installation
Blink is published on PyPI. You can install the core API, or install with optional dependency groups:
# Core prediction API only
pip install blink-gpu
# Include Streamlit Dashboard, SHAP explainability, and Plotly
pip install "blink-gpu[full]"
# Include FastAPI REST Server
pip install "blink-gpu[api]"
# Install everything
pip install "blink-gpu[all]"
Note: Blink does not install PyTorch for you. Install torch and torchvision separately, choosing the build that matches your CUDA version.
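Because the optional groups pull in different packages, it can help to check what is importable in the current environment before launching the dashboard or server. The sketch below uses only the standard library. The grouping of module names is an assumption inferred from the install comments above, not something read from Blink's packaging metadata.

```python
from importlib.util import find_spec

# Assumed mapping from feature to importable module names (inferred from the
# install comments above; Blink's actual extras may differ).
FEATURE_MODULES = {
    "core": ["torch", "torchvision"],
    "dashboard": ["streamlit", "shap", "plotly"],
    "api": ["fastapi"],
}


def missing_modules(modules):
    """Return the module names from `modules` that cannot be imported."""
    return [name for name in modules if find_spec(name) is None]


def report(feature_modules=FEATURE_MODULES):
    """Map each feature to the list of modules it is still missing."""
    return {
        feature: missing_modules(mods)
        for feature, mods in feature_modules.items()
    }


if __name__ == "__main__":
    for feature, missing in report().items():
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{feature:10s} {status}")
```

An empty list for a feature means every module assumed for it was found; it does not verify versions or CUDA compatibility.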
Python Usage
import torchvision.models as tv
from blink import BlinkPredictor, BlinkAnalyzer
# 1. Analyze any PyTorch model architecture
model = tv.resnet18(weights=None)
print(BlinkAnalyzer().summary(model))
# → Parameters: 11,689,512 | FLOPs: 1,814 M | Conv layers: 20 | Size: 44.59 MB
# 2. Predict execution time and memory for a batch size
predictor = BlinkPredictor()
result = predictor.predict(model, batch_size=32)
print(f"Exec time: {result['exec_time_ms']:.1f} ms")
print(f"Memory : {result['memory_mb']:.1f} MB")
# → Exec time: 18.3 ms | Memory: 184.3 MB
# 3. Sweep multiple batch sizes
sweep = predictor.predict_batch("resnet50", batch_sizes=[1, 16, 32, 64])
Command Line Interface (CLI)
Blink comes with a built-in CLI for quick profiling without writing scripts:
# Predict via CLI
$ blink predict resnet50 --batch-size 32
Blink prediction for 'resnet50'
Batch   Exec (ms)   Memory (MB)   CI-Exec (80%)
------------------------------------------------------------
32      28.45       294.5         [22.1 - 36.6]
# Launch the Streamlit Dashboard
$ blink dashboard --port 8501
# Launch the FastAPI REST Server
$ blink server --host 0.0.0.0 --port 8000
Streamlit Dashboard & Explainability
Blink includes a rich, interactive web dashboard. Run blink dashboard to access:
- Live Predictions: Instantly predict performance for custom PyTorch code or TorchVision models.
- SHAP Explainability ("Why this prediction?"): Interactive waterfall charts showing which architectural features (e.g., FLOPs, Conv layers, Model Depth) pushed the predicted execution time and memory footprint up or down.
- Batch Size Optimizer: Find the maximum batch size that fits within your specific GPU memory budget (e.g., 8GB, 16GB, 24GB).
- Compare Architectures: Side-by-side performance comparison of different models.
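The Batch Size Optimizer idea can be sketched as a search over batch sizes against a memory budget. The code below is an illustration, not Blink's implementation: `predict_memory_mb` is a hypothetical stand-in for a memory predictor (here a made-up linear model), and the search assumes predicted memory never decreases as batch size grows.

```python
from typing import Callable


def max_batch_size(
    predict_memory_mb: Callable[[int], float],
    budget_mb: float,
    upper_limit: int = 4096,
) -> int:
    """Largest batch size in [1, upper_limit] whose predicted memory fits.

    Returns 0 if even batch size 1 does not fit. Assumes predicted memory is
    non-decreasing in batch size, which is what makes binary search valid.
    """
    if predict_memory_mb(1) > budget_mb:
        return 0
    lo, hi = 1, upper_limit  # invariant: batch size `lo` always fits
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if predict_memory_mb(mid) <= budget_mb:
            lo = mid
        else:
            hi = mid - 1
    return lo


def toy_memory_model(batch_size: int) -> float:
    """Hypothetical predictor: 100 MB fixed cost plus 6 MB per sample."""
    return 100.0 + 6.0 * batch_size


if __name__ == "__main__":
    for budget_gb in (8, 16, 24):
        best = max_batch_size(toy_memory_model, budget_gb * 1024)
        print(f"{budget_gb} GB budget -> max batch size {best}")
```

Binary search keeps the number of predictor calls logarithmic in `upper_limit`, which matters if each call runs a model. If the predictor is not monotonic, a linear scan over candidate batch sizes would be the safer choice.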
REST API & Docker Deployment
Blink can be deployed as a microservice to provide GPU cost estimates to other applications.
Docker Compose (Recommended)
You can start both the Streamlit Dashboard and the FastAPI backend with one Docker Compose command.
git clone https://github.com/Aniketxmishra/Blink_Main.git
cd Blink_Main
docker compose up -d
- Dashboard: http://localhost:8501
- REST API: http://localhost:8000/docs (Swagger UI)
REST API Example
curl -X POST "http://localhost:8000/api/v2/predict" \
-H "Content-Type: application/json" \
-d '{"model_name": "resnet50", "batch_size": 32}'
# Response:
# {
# "model_name": "resnet50",
# "batch_size": 32,
# "predictions": {
# "exec_time_ms": 28.45,
# "exec_time_bounds": [22.1, 36.6],
# "memory_usage_mb": 294.5,
# ...
# }
# }
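The same call can be made from Python with the standard library. This is a minimal sketch that assumes a Blink server is reachable at the URL from the curl example and that the response carries the fields shown in the sample above; `build_predict_request`, `summarize`, and `predict` are illustrative helper names, not part of Blink.

```python
import json
import urllib.request

# Endpoint taken from the curl example; adjust host and port to your deployment.
PREDICT_URL = "http://localhost:8000/api/v2/predict"


def build_predict_request(model_name, batch_size, url=PREDICT_URL):
    """Build (but do not send) the POST request used in the curl example."""
    payload = {"model_name": model_name, "batch_size": batch_size}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(response):
    """Format the fields shown in the sample response; others are ignored."""
    pred = response["predictions"]
    low, high = pred["exec_time_bounds"]
    return (
        f"{response['model_name']} @ batch {response['batch_size']}: "
        f"{pred['exec_time_ms']} ms [{low} - {high}], "
        f"{pred['memory_usage_mb']} MB"
    )


def predict(model_name, batch_size, timeout=10.0):
    """Send the request to a running server and decode the JSON reply."""
    request = build_predict_request(model_name, batch_size)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.load(reply)
```

With the server running, `print(summarize(predict("resnet50", 32)))` prints a one-line summary. `urlopen` raises `urllib.error.URLError` if the server is unreachable, so callers may want to catch that.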
How it Works (Architecture)
     PyTorch Model
           │
           ▼
┌─────────────────────┐
│  Feature Extractor  │ ← layer counts, FLOPs, params, depth, width, skip connections
│  + GNN Extractor    │ ← graph-based architecture encoding (ArchitectureGNN)
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│  Prediction Models  │
│  ─────────────────  │
│  · XGBoost (tuned)  │ ← main predictor (best MAPE) + SHAP Explainer
│  · Random Forest    │ ← latency confidence intervals (Quantile Regression)
│  · GNN Predictor    │ ← graph-native, generalizes across architectures
└──────────┬──────────┘
           │
           ▼
Predicted: exec_time_ms, memory_mb
Model Performance on Held-out Data:
- Execution Time (XGBoost): ~8% MAPE
- Memory Usage (XGBoost): ~6% MAPE
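For readers unfamiliar with the metric, MAPE (mean absolute percentage error) averages the absolute prediction errors, each expressed as a percentage of the measured value. The sketch below shows the standard definition; the sample numbers are invented for illustration and are not Blink's benchmark data.

```python
from typing import Sequence


def mape(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Measured values must be non-zero, since each error is divided by them.
    """
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(m - p) / abs(m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


if __name__ == "__main__":
    # Invented execution times in milliseconds (illustration only).
    measured_ms = [10.0, 20.0, 40.0]
    predicted_ms = [11.0, 18.0, 42.0]
    print(f"MAPE: {mape(measured_ms, predicted_ms):.2f}%")
```

Under this definition, an 8% MAPE means predictions are off by 8% of the measured value on average, not that every individual prediction falls within 8%.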
Development & Paper Reproducibility
Blink was developed alongside a research study evaluating the efficacy of static and graph-based features for GPU performance prediction.
To reproduce the paper's figures and ablation study:
git clone https://github.com/Aniketxmishra/Blink_Main.git
cd Blink_Main
pip install -e ".[full]"
python scripts/ablation_study.py
python scripts/generate_paper_figures.py
Outputs will be saved to the results/ directory.
License
MIT License; see LICENSE for details. Made by Aniket Mishra.