# preflight

Pre-flight checks for PyTorch pipelines. Catch silent failures before they waste your GPU.
Most deep learning bugs don't crash your training loop; they silently produce a garbage model. NaNs in your data, labels leaking between train and val, wrong channel ordering, dead gradients. You won't know until hours later, after the GPU bill has landed.
`preflight` is a pre-training validation tool you run in 30 seconds before starting any training job. It's not a linter. It's a pre-flight check, similar to the kind pilots run before the expensive thing takes off.
## Install

```shell
pip install preflight-ml
```
## Quickstart
Create a small Python file that exposes your dataloader:
```python
# my_dataloader.py
import torch
from torch.utils.data import DataLoader, TensorDataset

x = torch.randn(200, 3, 224, 224)
y = torch.randint(0, 10, (200,))
dataloader = DataLoader(TensorDataset(x, y), batch_size=32)
```
Run preflight:

```shell
preflight run --dataloader my_dataloader.py
```
Output:

```
preflight — pre-training check report

╭────────────────────────┬──────────┬────────┬──────────────────────────────────────────────────╮
│ Check                  │ Severity │ Status │ Message                                          │
├────────────────────────┼──────────┼────────┼──────────────────────────────────────────────────┤
│ nan_inf_detection      │ FATAL    │ PASS   │ No NaN or Inf values found in 10 sampled batches │
│ normalisation_sanity   │ WARN     │ PASS   │ Normalisation looks reasonable (mean=0.001)      │
│ channel_ordering       │ WARN     │ PASS   │ Channel ordering looks correct (NCHW)            │
│ label_leakage          │ FATAL    │ PASS   │ No val_dataloader provided — skipped             │
│ split_sizes            │ INFO     │ PASS   │ train=200 samples                                │
│ vram_estimation        │ WARN     │ INFO   │ No CUDA GPU detected — skipped                   │
│ class_imbalance        │ WARN     │ PASS   │ Class distribution looks balanced                │
│ shape_mismatch         │ FATAL    │ PASS   │ No model provided — skipped                      │
│ gradient_check         │ FATAL    │ PASS   │ No model+loss provided — skipped                 │
╰────────────────────────┴──────────┴────────┴──────────────────────────────────────────────────╯

0 fatal 0 warnings 9 passed

Pre-flight passed. Safe to start training.
```
## Checks
preflight runs 10 checks across three severity tiers. A FATAL failure exits with code 1 and blocks CI.
| Check | Severity | What it catches |
|---|---|---|
| `nan_inf_detection` | FATAL | NaN or Inf values anywhere in sampled batches |
| `label_leakage` | FATAL | Samples appearing in both train and val sets |
| `shape_mismatch` | FATAL | Dataset output shape incompatible with model input |
| `gradient_check` | FATAL | Zero gradients, dead layers, exploding gradients |
| `normalisation_sanity` | WARN | Data that looks unnormalised (raw pixel values etc.) |
| `channel_ordering` | WARN | NHWC tensors when PyTorch expects NCHW |
| `vram_estimation` | WARN | Estimated peak VRAM exceeds 90% of GPU memory |
| `class_imbalance` | WARN | Severe class imbalance beyond configurable threshold |
| `split_sizes` | INFO | Empty or degenerate train/val splits |
| `duplicate_samples` | INFO | Identical samples within a split |
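To make the fatal tier concrete, here is a minimal sketch of roughly what a NaN/Inf scan over sampled batches looks like. The helper `has_nan_or_inf` is illustrative only, not preflight's actual implementation or API:

```python
import itertools

import torch
from torch.utils.data import DataLoader, TensorDataset

def has_nan_or_inf(dataloader, num_batches=10):
    """Scan up to num_batches batches for NaN or Inf in floating-point tensors."""
    for batch in itertools.islice(dataloader, num_batches):
        tensors = batch if isinstance(batch, (list, tuple)) else [batch]
        for t in tensors:
            if torch.is_tensor(t) and t.is_floating_point() and not torch.isfinite(t).all():
                return True
    return False

x = torch.randn(64, 3, 8, 8)
x[0, 0, 0, 0] = float("nan")  # poison a single value
loader = DataLoader(TensorDataset(x, torch.zeros(64)), batch_size=16)
print(has_nan_or_inf(loader))  # True
```

A single poisoned value out of millions is exactly the kind of failure that never crashes a training loop but degrades the model, which is why this check samples real batches instead of inspecting code.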
## With a model
Pass a model file to enable shape, gradient, and VRAM checks:
```python
# my_model.py
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 10))
```
```python
# my_loss.py
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()
```
```shell
preflight run \
  --dataloader my_dataloader.py \
  --model my_model.py \
  --loss my_loss.py \
  --val-dataloader my_val_dataloader.py
```
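The command above references a `my_val_dataloader.py` that isn't shown. Assuming it follows the same convention as the training file, i.e. a module-level variable named `dataloader` (an assumption, the exact variable name preflight looks for isn't documented in this README), it could be as small as:

```python
# my_val_dataloader.py
import torch
from torch.utils.data import DataLoader, TensorDataset

# Small validation split; the variable name `dataloader` is assumed to
# follow the same convention as my_dataloader.py.
x_val = torch.randn(50, 3, 224, 224)
y_val = torch.randint(0, 10, (50,))
dataloader = DataLoader(TensorDataset(x_val, y_val), batch_size=32)
```

Providing a val dataloader is what enables the `label_leakage` check, which is otherwise skipped.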
## Configuration

Add a `.preflight.toml` to your repo root to configure thresholds and disable checks:
```toml
[thresholds]
imbalance_threshold = 0.05
nan_sample_batches = 20

[checks]
vram_estimation = false

[ignore]
# check = "class_imbalance"
# reason = "intentional: rare event dataset"
```
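As an illustration of what a threshold like `imbalance_threshold` could govern (preflight's exact formula isn't documented here, so this is one plausible reading): if the rarest class's share of the samples falls below the threshold, the `class_imbalance` check fires.

```python
from collections import Counter

import torch

def minority_share(labels):
    """Fraction of the dataset belonging to the rarest class."""
    counts = Counter(labels.tolist())
    return min(counts.values()) / len(labels)

# 980 samples of class 0 and 20 of class 1: the minority class holds 2%,
# which would trip a 0.05 threshold under this reading.
labels = torch.cat([torch.zeros(980, dtype=torch.long), torch.ones(20, dtype=torch.long)])
print(minority_share(labels))  # 0.02
```

For genuinely rare-event datasets, the commented `[ignore]` entry above is the escape hatch rather than loosening the threshold globally.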
## CI integration
Add to your GitHub Actions workflow:
```yaml
- name: Install preflight
  run: pip install preflight-ml

- name: Run pre-flight checks
  run: preflight run --dataloader scripts/dataloader.py --format json
```
The `--format json` flag outputs machine-readable results. The exit code is 1 if any FATAL check fails, 0 otherwise.
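Because the CI contract is just the exit code, a wrapper script can gate a pipeline without parsing the report at all. A hedged sketch (the `preflight` invocation in the comment is as documented above; the stand-in commands exist only so the exit-code contract can be demonstrated without preflight installed):

```python
import subprocess
import sys

def passed_preflight(cmd):
    """Return True if the command exits 0, mirroring preflight's exit-code contract."""
    return subprocess.run(cmd, capture_output=True).returncode == 0

# In CI the real invocation would be:
#   passed_preflight(["preflight", "run", "--dataloader", "scripts/dataloader.py", "--format", "json"])
# Stand-in commands demonstrating the contract:
print(passed_preflight([sys.executable, "-c", "raise SystemExit(0)"]))  # True
print(passed_preflight([sys.executable, "-c", "raise SystemExit(1)"]))  # False
```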
## List all checks

```shell
preflight checks
```
## What preflight does NOT do
- It does not replace unit tests. Use pytest for code logic.
- It does not guarantee a correct model. Passing preflight is a minimum safety bar, not a certification.
- It does not run your full training loop. Use it as a gate before training starts.
- It does not modify your code unless you pass `--fix`.
## Roadmap

- `--fix` flag — auto-patch common issues (channel ordering, normalisation)
- Dataset snapshot + drift detection (`preflight diff baseline.json new_data.pt`)
- Full dry-run mode (one batch through model + loss + backward)
- Jupyter magic command (`%load_ext preflight`)
- `preflight-monai` plugin for medical imaging checks
- `preflight-sktime` plugin for time series checks
## Contributing
See CONTRIBUTING.md. New checks are welcome. Each one needs a passing test, a failing test, and a fix hint.
## License
MIT — see LICENSE.