Predict GPU execution time & memory for PyTorch models — without running them.

Blink 🔭

GPU Performance Predictor for Deep Learning Models

Blink predicts the execution time and memory usage of PyTorch neural networks on a GPU without running them. It combines classical ML models (XGBoost, Random Forest) with a Graph Neural Network (GNN) that encodes the computational graph of an arbitrary model architecture.
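To make "encodes the computational graph" concrete, the sketch below represents a tiny residual block as a graph: one node per operation and one directed edge per tensor flowing between operations. The op vocabulary, the `encode_graph` helper, and the one-hot node features are illustrative assumptions, not Blink's actual encoding. The edge list uses the two-row source/target layout that PyTorch Geometric expects for `edge_index`.

```python
# Illustrative only: OP_TYPES and encode_graph are hypothetical names,
# not part of Blink's API.
OP_TYPES = ["input", "conv", "relu", "add", "output"]


def encode_graph(ops, edges):
    """Return (node_features, edge_index) for a small computational graph.

    ops   -- list of op-type strings, one per node
    edges -- list of (source, target) node-index pairs
    """
    # One-hot encode each node's op type.
    node_features = [
        [1.0 if op == known else 0.0 for known in OP_TYPES] for op in ops
    ]
    # Two parallel rows: edge_index[0] holds sources, edge_index[1] targets.
    edge_index = [[s for s, _ in edges], [t for _, t in edges]]
    return node_features, edge_index


# A residual block: input -> conv -> relu -> conv -> add -> output,
# with a skip connection from the input straight to the add node.
ops = ["input", "conv", "relu", "conv", "add", "output"]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4), (4, 5)]
node_features, edge_index = encode_graph(ops, edges)
```

The skip connection shows up only as the extra edge `(0, 4)`, which is the kind of structure a flat list of layer counts cannot express.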

Overview

Given a PyTorch model and a batch size, Blink answers:

How long will a forward pass take on this GPU?
How much GPU memory will it consume?

This is useful for:

Batch size optimization before deployment
Hardware cost estimation for training runs
NAS (Neural Architecture Search) — filtering architectures by predicted cost
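As a sketch of the NAS use case, candidates can be discarded on predicted cost alone, before any of them is trained or profiled. The `predictions` dictionary below stands in for Blink's output; the architecture names and numbers are made up for illustration, and `filter_by_budget` is a hypothetical helper.

```python
def filter_by_budget(predictions, max_time_ms, max_memory_mb):
    """Keep architectures whose predicted cost fits both budgets."""
    return [
        name
        for name, cost in predictions.items()
        if cost["execution_time_ms"] <= max_time_ms
        and cost["memory_mb"] <= max_memory_mb
    ]


# Made-up predictions for three hypothetical candidates.
predictions = {
    "candidate_a": {"execution_time_ms": 9.0, "memory_mb": 1500},
    "candidate_b": {"execution_time_ms": 25.0, "memory_mb": 1200},
    "candidate_c": {"execution_time_ms": 11.0, "memory_mb": 4100},
}
survivors = filter_by_budget(predictions, max_time_ms=15.0, max_memory_mb=2048)
print(survivors)  # ['candidate_a']
```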

Architecture

PyTorch Model
      │
      ▼
┌─────────────────────┐
│  Feature Extractor  │  ← layer counts, FLOPs, params, depth, width, skip connections
│  + GNN Extractor    │  ← graph-based architecture encoding (ArchitectureGNN)
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│  Prediction Models  │
│  ─────────────────  │
│  · XGBoost (tuned)  │  ← main predictor (best MAPE)
│  · Random Forest    │  ← ensemble comparison
│  · GNN Predictor    │  ← graph-native, generalizes across architectures
│  · Linear / Ridge   │  ← baselines
└─────────┬───────────┘
          │
          ▼
   Predicted: execution_time_ms, memory_mb
   + Uncertainty bounds (lower / upper)
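The static features named in the diagram can be computed from layer shapes alone. The sketch below counts parameters and FLOPs for convolution and linear layers using the standard formulas. The dictionary-based layer spec and the function names are assumptions made for illustration; Blink's `feature_extractor.py` works on `nn.Module` objects and may count differently (for example, one multiply-accumulate as one FLOP rather than two).

```python
def conv2d_cost(c_in, c_out, kernel, h_out, w_out, bias=True):
    """Parameters and per-sample FLOPs of a square-kernel 2D convolution."""
    params = c_out * c_in * kernel * kernel + (c_out if bias else 0)
    # One multiply-accumulate per weight per output position, counted as
    # 2 FLOPs; the bias additions are ignored.
    flops = 2 * c_in * kernel * kernel * c_out * h_out * w_out
    return params, flops


def linear_cost(f_in, f_out, bias=True):
    """Parameters and per-sample FLOPs of a fully connected layer."""
    params = f_in * f_out + (f_out if bias else 0)
    flops = 2 * f_in * f_out
    return params, flops


COST_FUNCTIONS = {"conv2d": conv2d_cost, "linear": linear_cost}


def static_features(layers, batch_size=1):
    """Aggregate per-layer costs into a flat feature dictionary."""
    total_params = 0
    total_flops = 0
    for layer in layers:
        if layer["kind"] not in COST_FUNCTIONS:
            raise ValueError(f"unsupported layer kind: {layer['kind']}")
        params, flops = COST_FUNCTIONS[layer["kind"]](**layer["args"])
        total_params += params
        total_flops += flops * batch_size  # FLOPs scale with batch size
    return {"depth": len(layers), "params": total_params, "flops": total_flops}


layers = [
    {"kind": "conv2d", "args": dict(c_in=3, c_out=16, kernel=3, h_out=32, w_out=32)},
    {"kind": "linear", "args": dict(f_in=16 * 32 * 32, f_out=10)},
]
features = static_features(layers, batch_size=4)
```

Parameter count is independent of batch size while FLOPs grow linearly with it, which is one reason batch size has to be an explicit input to the predictor.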

Project Structure

Blink/
├── dashboard.py             # 🖥️  Main Streamlit web app  (run this)
├── prediction_api.py        # 🌐  Flask REST API
│
├── ── Core ML Modules ──
│   ├── model_profiler.py    # GPU profiler (CUDA events)
│   ├── feature_extractor.py # Static feature extraction from nn.Module
│   ├── gnn_extractor.py     # GNN-based graph feature extraction
│   ├── gnn_model.py         # ArchitectureGNN model definition (PyG)
│   ├── prediction_model.py  # Train XGBoost / RF / Linear models
│   ├── train_gnn.py         # Train the GNN predictor
│   ├── train_memory_model.py # Train memory prediction model
│   ├── gpu_predictor.py     # Inference class with caching & batch support
│   ├── model_analyser.py    # Model complexity analysis utilities
│   ├── advanced_features.py # Extended feature engineering
│   ├── dynamic_predictor.py # Dynamic / online prediction
│   ├── gpu_info.py          # GPU metadata (pynvml)
│   ├── workload_scheduler.py # Batch workload scheduler
│   └── performance_monitor.py
│
├── scripts/                 # 🔬  Experiment & data scripts
│   ├── collect_data.py      # Profile CNN/Transformer/custom models → data/raw/
│   ├── enhance_dataset.py   # Augment dataset (more batch sizes / models)
│   ├── diverse_architectures.py  # Profile diverse arch families
│   ├── ablation_study.py    # 5-condition ablation (Table II in paper)
│   ├── generate_paper_figures.py # Reproduce all paper figures
│   └── generate_paper_tables.py  # Reproduce paper tables
│
├── tests/                   # ✅  Test suite
│   ├── test_diverse_models.py
│   ├── test_predictors.py
│   ├── test_profiler.py
│   ├── test_gnn_scaling.py
│   └── evaluate_gnn_vs_xgb.py
│
├── data/
│   ├── raw/                 # Raw profiling CSVs (gitignored)
│   ├── processed/           # Feature-engineered CSVs
│   ├── enriched/            # Final training-ready dataset
│   └── feedback_log.csv     # Online feedback loop log
│
├── models/                  # Serialized model artifacts (gitignored)
│   ├── xgboost_(tuned)_model.joblib
│   ├── random_forest_model.joblib
│   ├── gnn_predictor.pth
│   ├── memory_model.joblib
│   └── ...
│
├── results/
│   ├── figures/             # Paper figures (PNG)
│   ├── ablation_study_table.csv
│   ├── gnn_scaling_table.csv
│   └── ...
│
├── templates/index.html     # HTML template for web interface
├── legacy/                  # Archived / superseded scripts
├── requirements.txt
└── .gitignore

Installation

# 1. Clone the repo
git clone <your-repo-url>
cd Blink

# 2. Create a virtual environment
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # Linux/macOS

# 3. Install dependencies
pip install -r requirements.txt

# 4. Install PyTorch Geometric (match your CUDA version)
# See: https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html
pip install torch-geometric

Requirements: NVIDIA GPU with CUDA, Python ≥ 3.10
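Before launching anything, it can help to confirm the interpreter version and that the main dependencies are importable. The module list below is an assumption inferred from the tools this README mentions (the authoritative list is `requirements.txt`), and `check_environment` is a hypothetical helper, not part of Blink.

```python
import importlib.util
import sys

# Assumed import names; requirements.txt is the authoritative list.
MODULES = ["torch", "torch_geometric", "xgboost", "streamlit", "flask", "pynvml"]


def check_environment(modules=MODULES, min_python=(3, 10)):
    """Report whether Python is new enough and which modules are missing."""
    # find_spec returns None for a top-level module that cannot be imported.
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    return {
        "python_ok": sys.version_info >= min_python,
        "missing": missing,
    }


report = check_environment()
print(report)
```

This check does not verify that CUDA is usable; it only catches a wrong interpreter or an incomplete install early.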

Usage

1. Launch the Dashboard

streamlit run dashboard.py

Features: live model prediction, batch size optimizer, model comparison, performance monitor.

2. Collect Profiling Data

python scripts/collect_data.py --batch-sizes 1 4 16 32 64
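Profiling measurements are only useful as training targets if they are stable, which usually means discarding warm-up iterations and aggregating several timed runs. The harness below sketches that protocol with wall-clock timing from the standard library. It is not Blink's profiler, which according to the project structure uses CUDA events, and on a GPU the device must be synchronized before each clock read. `time_callable` and its parameters are hypothetical.

```python
import statistics
import time


def time_callable(fn, warmup=3, repeats=10):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # Warm-up runs are executed but not timed.
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload; a real run would call the model's forward pass here.
elapsed_ms = time_callable(lambda: sum(i * i for i in range(10_000)))
```

The median is used rather than the mean so that a single slow run (for example, one interrupted by another process) does not skew the recorded value.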

3. Train Prediction Models

# Train XGBoost / RF / Linear baseline models
python prediction_model.py

# Train GNN predictor
python train_gnn.py

# Train memory model
python train_memory_model.py

4. Run Ablation Study

python scripts/ablation_study.py

5. Predict via Python API

from gpu_predictor import GPUPredictor
import torchvision.models as models

predictor = GPUPredictor()
# weights=None builds the architecture with random weights; the older
# pretrained=False argument is deprecated since torchvision 0.13.
model = models.resnet50(weights=None)
result = predictor.predict_for_custom_model(model, batch_size=16)
print(result)
# {'execution_time_ms': 12.4, 'memory_mb': 1820, 'confidence_lower': 11.1, ...}
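The `confidence_lower` field suggests the predictor returns an interval around each estimate. How Blink derives it is not documented here; one common approach for tree ensembles, sketched below as an assumption, takes percentiles over the individual trees' predictions. The per-tree values are made up, and `interval_from_ensemble` is a hypothetical helper.

```python
import statistics


def interval_from_ensemble(tree_predictions, coverage=0.8):
    """Point estimate plus a central interval from per-tree predictions."""
    # quantiles(n=100) returns the 1st..99th percentile cut points.
    cuts = statistics.quantiles(tree_predictions, n=100, method="inclusive")
    tail = round((1.0 - coverage) / 2 * 100)  # 10 for 80% coverage
    return {
        "estimate": statistics.fmean(tree_predictions),
        "lower": cuts[tail - 1],
        "upper": cuts[100 - tail - 1],
    }


# Made-up execution-time predictions (ms) from ten trees.
per_tree_ms = [11.2, 11.9, 12.1, 12.3, 12.4, 12.4, 12.6, 12.8, 13.0, 13.7]
bounds = interval_from_ensemble(per_tree_ms)
```

A wide interval relative to the estimate is a signal that the model is far from its training data and that a real profiling run is worth the cost.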

Data Pipeline

collect_data.py
    └─▶ data/raw/*.csv          (GPU profiling measurements)
            │
            ▼
feature_extractor.py
    └─▶ data/processed/*.csv    (static model features)
            │
            ▼
enhance_dataset.py
    └─▶ data/enriched/*.csv     (augmented, training-ready)
            │
            ▼
prediction_model.py / train_gnn.py
    └─▶ models/                 (trained predictors)
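Conceptually, the pipeline joins measured targets with static features so that each training row pairs what a model looks like with how it ran. The sketch below does that join with the standard library on in-memory CSV text. The column names (`model`, `batch_size`, `params`, `flops`) and the sample values are assumptions for illustration, not the schema of Blink's files.

```python
import csv
import io

# Made-up profiling measurements (the role of data/raw/*.csv).
RAW_CSV = """model,batch_size,execution_time_ms,memory_mb
net_a,16,12.4,1820
net_a,32,23.9,3310
net_b,16,4.1,640
"""

# Made-up static features (the role of data/processed/*.csv).
FEATURES_CSV = """model,params,flops
net_a,25600000,4100000000
net_b,3500000,300000000
"""


def join_on_model(raw_text, features_text):
    """Attach each model's static features to its profiling rows."""
    features = {
        row["model"]: row for row in csv.DictReader(io.StringIO(features_text))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        # Only the 'model' key overlaps, and it has the same value in both.
        merged.append({**features[row["model"]], **row})
    return merged


rows = join_on_model(RAW_CSV, FEATURES_CSV)
```

`csv.DictReader` yields every value as a string, so a real pipeline would convert the numeric columns before training.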

Model Performance

Results on a held-out test set (20% of the data):

Model	Exec Time MAPE	Memory MAPE	Notes
XGBoost (tuned)	~8%	~6%	Best overall
Random Forest	~11%	~9%	Robust baseline
GNN Predictor	~10%	~8%	Best on unseen architectures
Linear Regression	~22%	~19%	Baseline

(Full ablation study results: results/ablation_study_table.csv)
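For reference, MAPE (mean absolute percentage error) averages the relative error of each prediction, so a MAPE of about 8% means predictions are off by about 8% of the measured value on average. A minimal implementation follows; the sample numbers are made up and `mape` is a hypothetical helper, not a function from Blink.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    # Undefined when a measured value is zero, so inputs must be non-zero.
    relative_errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


measured_ms = [10.0, 20.0, 40.0]
predicted_ms = [11.0, 18.0, 40.0]
error_percent = mape(measured_ms, predicted_ms)
print(error_percent)
```

Because the error is relative, a 1 ms miss on a 10 ms model counts as much as a 2 ms miss on a 20 ms model.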

Dashboard

The Streamlit dashboard (dashboard.py) provides:

Tab	Description
🎯 Prediction	Predict execution time & memory for standard or custom models
⚡ Batch Optimizer	Find optimal batch size within a memory budget
📊 Model Comparison	Compare predictions across multiple architectures
📈 Performance Monitor	Live GPU utilization and prediction history
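The batch optimizer's task can be sketched as a search over candidate batch sizes that keeps the largest one whose predicted memory fits the budget. The memory model below is a made-up linear stand-in for a real predictor call, and `largest_batch_within_budget` is a hypothetical helper; the dashboard's actual search strategy is not described here.

```python
def largest_batch_within_budget(predict_memory_mb, budget_mb, candidates):
    """Return the largest candidate batch size that fits, or None."""
    fitting = [b for b in candidates if predict_memory_mb(b) <= budget_mb]
    return max(fitting) if fitting else None


def fake_memory_model(batch_size):
    # Stand-in: 500 MB of fixed overhead plus 80 MB per sample.
    return 500 + 80 * batch_size


best = largest_batch_within_budget(
    fake_memory_model, budget_mb=4096, candidates=[1, 4, 16, 32, 64]
)
print(best)  # 32
```

Scanning every candidate avoids assuming that predicted memory grows monotonically with batch size, which a learned predictor does not guarantee.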

Paper Reproducibility

To reproduce all paper figures and tables:

python scripts/generate_paper_figures.py
python scripts/generate_paper_tables.py
python scripts/ablation_study.py

Figures are saved to results/figures/; the ablation table is written to results/ablation_study_table.csv.

License

MIT License — see LICENSE for details.

Release history

Version	Released
0.2.0	Mar 27, 2026
0.1.7	Mar 25, 2026
0.1.6	Mar 25, 2026
0.1.5	Mar 25, 2026
0.1.0 (this version)	Mar 4, 2026

Download files

Source Distribution

blink_gpu-0.1.0.tar.gz (17.0 kB)

Uploaded Mar 4, 2026 Source

Built Distribution

blink_gpu-0.1.0-py3-none-any.whl (10.9 kB)

Uploaded Mar 4, 2026 Python 3

File details

Details for the file blink_gpu-0.1.0.tar.gz.

File metadata

Download URL: blink_gpu-0.1.0.tar.gz
Upload date: Mar 4, 2026
Size: 17.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.4

File hashes

Hashes for blink_gpu-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`a8005ab7f8e210f1d127ebe2d5c8c7dbde541b2b9b8496360a6b74bac47b55b4`
MD5	`3b74c1523cdbc819bb135025a6dca76e`
BLAKE2b-256	`bfe8c76923fe63db0ae8771f01f3eb6a520689dff18ba03135ff5632b6e1c518`

File details

Details for the file blink_gpu-0.1.0-py3-none-any.whl.

File metadata

Download URL: blink_gpu-0.1.0-py3-none-any.whl
Upload date: Mar 4, 2026
Size: 10.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.4

File hashes

Hashes for blink_gpu-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`aa9d5eda9449a670ed7da2f5787294d042104f002f849afa6a33ab043e4762da`
MD5	`8223cc0fee97ec93d6ca8bb7672ee5be`
BLAKE2b-256	`f0d57b434549a276ef26486990e998fdd44f19b7681af5077ef0bc23f963742c`

blink-gpu 0.1.0
