
Predict GPU execution time & memory for PyTorch models without running them.

Project description

Blink 🔭

GPU Performance Predictor for Deep Learning Models

Blink predicts execution time and memory usage of PyTorch neural networks on GPU without actually running them. It combines classical ML (XGBoost, Random Forest) with a Graph Neural Network (GNN) that encodes the computational graph of any model architecture.




Overview

Given a PyTorch model and a batch size, Blink answers:

  • How long will a forward pass take on this GPU?
  • How much GPU memory will it consume?

This is useful for:

  • Batch size optimization before deployment
  • Hardware cost estimation for training runs
  • NAS (Neural Architecture Search): filtering architectures by predicted cost
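
The filtering use case can be sketched as follows. This is a minimal illustration, not part of Blink: `filter_by_cost` and the `predict` callable are hypothetical, and `predict` is assumed to return a dict with `execution_time_ms` and `memory_mb` keys. The numbers are invented stand-ins for real predictions.

```python
from typing import Callable, Dict, Iterable, List

Prediction = Dict[str, float]


def filter_by_cost(
    candidates: Iterable[str],
    predict: Callable[[str], Prediction],
    max_time_ms: float,
    max_memory_mb: float,
) -> List[str]:
    """Keep only candidates whose predicted cost fits both budgets."""
    kept = []
    for name in candidates:
        p = predict(name)
        if p["execution_time_ms"] <= max_time_ms and p["memory_mb"] <= max_memory_mb:
            kept.append(name)
    return kept


# Stand-in predictions (made-up numbers, for illustration only).
fake_predictions = {
    "small_cnn": {"execution_time_ms": 3.0, "memory_mb": 400.0},
    "wide_cnn": {"execution_time_ms": 9.0, "memory_mb": 2500.0},
    "deep_cnn": {"execution_time_ms": 20.0, "memory_mb": 1200.0},
}

survivors = filter_by_cost(
    fake_predictions,              # iterating a dict yields its keys
    fake_predictions.__getitem__,  # hypothetical predictor: a table lookup
    max_time_ms=10.0,
    max_memory_mb=2000.0,
)
print(survivors)  # ['small_cnn']
```

In a real search loop the lookup would be replaced by a call to an actual predictor, so that candidates over budget are discarded before any GPU time is spent on them.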

Architecture

PyTorch Model
      │
      ▼
┌─────────────────────┐
│  Feature Extractor  │  ← layer counts, FLOPs, params, depth, width, skip connections
│  + GNN Extractor    │  ← graph-based architecture encoding (ArchitectureGNN)
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│  Prediction Models  │
│  ─────────────────  │
│  · XGBoost (tuned)  │  ← main predictor (best MAPE)
│  · Random Forest    │  ← ensemble comparison
│  · GNN Predictor    │  ← graph-native, generalizes across architectures
│  · Linear / Ridge   │  ← baselines
└─────────┬───────────┘
          │
          ▼
   Predicted: execution_time_ms, memory_mb
   + Uncertainty bounds (lower / upper)
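
To make the graph-based encoding step more concrete, the sketch below turns a tiny residual block into node features plus an edge list, the general form that graph neural network libraries consume. The layer vocabulary, the one-hot features, and the `encode_graph` helper are assumptions for illustration only; how ArchitectureGNN actually builds its input is not described here.

```python
from typing import List, Tuple

# Hypothetical layer vocabulary used for one-hot node features.
LAYER_TYPES = ["input", "conv", "relu", "add", "linear"]


def encode_graph(
    layers: List[str], edges: List[Tuple[int, int]]
) -> Tuple[List[List[int]], List[List[int]]]:
    """Return (node_features, edge_index) for a layer graph.

    node_features[i] is a one-hot vector over LAYER_TYPES.
    edge_index is [sources, targets], i.e. the 2 x num_edges COO layout
    that PyTorch Geometric uses for its edge_index tensor.
    """
    type_to_index = {name: i for i, name in enumerate(LAYER_TYPES)}
    node_features = []
    for layer in layers:
        one_hot = [0] * len(LAYER_TYPES)
        one_hot[type_to_index[layer]] = 1
        node_features.append(one_hot)
    sources = [src for src, _ in edges]
    targets = [dst for _, dst in edges]
    return node_features, [sources, targets]


# input -> conv -> relu -> conv -> add, plus a skip connection input -> add.
layers = ["input", "conv", "relu", "conv", "add"]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]
x, edge_index = encode_graph(layers, edges)
```

The skip connection shows up as the extra edge `(0, 4)`, which is information a flat list of layer counts cannot express.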

Project Structure

Blink/
├── dashboard.py             # 🖥️  Main Streamlit web app  (run this)
├── prediction_api.py        # 🌐  Flask REST API
│
├── ── Core ML Modules ──
│   ├── model_profiler.py    # GPU profiler (CUDA events)
│   ├── feature_extractor.py # Static feature extraction from nn.Module
│   ├── gnn_extractor.py     # GNN-based graph feature extraction
│   ├── gnn_model.py         # ArchitectureGNN model definition (PyG)
│   ├── prediction_model.py  # Train XGBoost / RF / Linear models
│   ├── train_gnn.py         # Train the GNN predictor
│   ├── train_memory_model.py # Train memory prediction model
│   ├── gpu_predictor.py     # Inference class with caching & batch support
│   ├── model_analyser.py    # Model complexity analysis utilities
│   ├── advanced_features.py # Extended feature engineering
│   ├── dynamic_predictor.py # Dynamic / online prediction
│   ├── gpu_info.py          # GPU metadata (pynvml)
│   ├── workload_scheduler.py # Batch workload scheduler
│   └── performance_monitor.py
│
├── scripts/                 # 🔬  Experiment & data scripts
│   ├── collect_data.py      # Profile CNN/Transformer/custom models → data/raw/
│   ├── enhance_dataset.py   # Augment dataset (more batch sizes / models)
│   ├── diverse_architectures.py  # Profile diverse arch families
│   ├── ablation_study.py    # 5-condition ablation (Table II in paper)
│   ├── generate_paper_figures.py # Reproduce all paper figures
│   └── generate_paper_tables.py  # Reproduce paper tables
│
├── tests/                   # ✅  Test suite
│   ├── test_diverse_models.py
│   ├── test_predictors.py
│   ├── test_profiler.py
│   ├── test_gnn_scaling.py
│   └── evaluate_gnn_vs_xgb.py
│
├── data/
│   ├── raw/                 # Raw profiling CSVs (gitignored)
│   ├── processed/           # Feature-engineered CSVs
│   ├── enriched/            # Final training-ready dataset
│   └── feedback_log.csv     # Online feedback loop log
│
├── models/                  # Serialized model artifacts (gitignored)
│   ├── xgboost_(tuned)_model.joblib
│   ├── random_forest_model.joblib
│   ├── gnn_predictor.pth
│   ├── memory_model.joblib
│   └── ...
│
├── results/
│   ├── figures/             # Paper figures (PNG)
│   ├── ablation_study_table.csv
│   ├── gnn_scaling_table.csv
│   └── ...
│
├── templates/index.html     # HTML template for web interface
├── legacy/                  # Archived / superseded scripts
├── requirements.txt
└── .gitignore

Installation

# 1. Clone the repo
git clone <your-repo-url>
cd Blink

# 2. Create a virtual environment
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # Linux/macOS

# 3. Install dependencies
pip install -r requirements.txt

# 4. Install PyTorch Geometric (match your CUDA version)
# See: https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html
pip install torch-geometric

Requirements: NVIDIA GPU with CUDA, Python ≥ 3.10


Usage

1. Launch the Dashboard

streamlit run dashboard.py

Features: live model prediction, batch size optimizer, model comparison, performance monitor.

2. Collect Profiling Data

python scripts/collect_data.py --batch-sizes 1 4 16 32 64

3. Train Prediction Models

# Train XGBoost / RF / Linear baseline models
python prediction_model.py

# Train GNN predictor
python train_gnn.py

# Train memory model
python train_memory_model.py

4. Run Ablation Study

python scripts/ablation_study.py

5. Predict via Python API

from gpu_predictor import GPUPredictor
import torchvision.models as models

predictor = GPUPredictor()
model = models.resnet50(weights=None)  # random init; `pretrained=` is deprecated since torchvision 0.13
result = predictor.predict_for_custom_model(model, batch_size=16)
print(result)
# {'execution_time_ms': 12.4, 'memory_mb': 1820, 'confidence_lower': 11.1, ...}

Data Pipeline

collect_data.py
    └─▶ data/raw/*.csv          (GPU profiling measurements)
            │
            ▼
feature_extractor.py
    └─▶ data/processed/*.csv    (static model features)
            │
            ▼
enhance_dataset.py
    └─▶ data/enriched/*.csv     (augmented, training-ready)
            │
            ▼
prediction_model.py / train_gnn.py
    └─▶ models/                 (trained predictors)
Model Performance

Results on held-out test set (20% split):

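| Model             | Exec Time MAPE | Memory MAPE | Notes                        |
|-------------------|----------------|-------------|------------------------------|
| XGBoost (tuned)   | ~8%            | ~6%         | Best overall                 |
| Random Forest     | ~11%           | ~9%         | Robust baseline              |
| GNN Predictor     | ~10%           | ~8%         | Best on unseen architectures |
| Linear Regression | ~22%           | ~19%        | Baseline                     |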

(Full ablation study results: results/ablation_study_table.csv)
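
For reference, MAPE (mean absolute percentage error) is the mean of |measured - predicted| / |measured|, expressed as a percentage. The sketch below computes it on made-up numbers, not on Blink's results; the `mape` helper is hypothetical and is not taken from the project's code.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (true for measured times and memory).
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Illustrative values only: measured vs. predicted execution time in ms.
measured_ms = [10.0, 20.0, 40.0]
predicted_ms = [11.0, 18.0, 40.0]
error = mape(measured_ms, predicted_ms)
print(f"{error:.2f}%")  # 6.67%
```

Because each error is divided by the measured value, a 1 ms miss on a 10 ms model counts as much as a 2 ms miss on a 20 ms model, which keeps small and large models comparable in a single score.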


Dashboard

The Streamlit dashboard (dashboard.py) provides:

| Tab                    | Description                                                      |
|------------------------|------------------------------------------------------------------|
| 🎯 Prediction          | Predict execution time & memory for standard or custom models    |
| ⚡ Batch Optimizer     | Find optimal batch size within a memory budget                   |
| 📊 Model Comparison    | Compare predictions across multiple architectures                |
| 📈 Performance Monitor | Live GPU utilization and prediction history                      |
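
The idea behind the Batch Optimizer tab, picking the largest batch size whose predicted memory fits a budget, can be sketched as below. This is not the dashboard's implementation: `largest_batch_within_budget` is a hypothetical helper, and `fake_memory_mb` is an invented linear memory model standing in for a real predictor.

```python
from typing import Callable, Iterable, Optional


def largest_batch_within_budget(
    batch_sizes: Iterable[int],
    predict_memory_mb: Callable[[int], float],
    budget_mb: float,
) -> Optional[int]:
    """Return the largest candidate batch size predicted to fit, or None."""
    fitting = [b for b in batch_sizes if predict_memory_mb(b) <= budget_mb]
    return max(fitting) if fitting else None


def fake_memory_mb(batch_size: int) -> float:
    # Invented model: 500 MB fixed cost plus 80 MB per sample.
    return 500.0 + 80.0 * batch_size


best = largest_batch_within_budget([1, 4, 16, 32, 64], fake_memory_mb, budget_mb=4000.0)
print(best)  # 32
```

Checking every candidate rather than stopping at the first failure avoids assuming that predicted memory grows monotonically with batch size, which a learned predictor does not guarantee.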

Paper Reproducibility

To reproduce all paper figures and tables:

python scripts/generate_paper_figures.py
python scripts/generate_paper_tables.py
python scripts/ablation_study.py

Outputs are saved under results/ (figures in results/figures/).


License

MIT License; see LICENSE for details.

Download files


Source Distribution

blink_gpu-0.1.0.tar.gz (17.0 kB)

Uploaded Source

Built Distribution


blink_gpu-0.1.0-py3-none-any.whl (10.9 kB)

Uploaded Python 3
