# Orchard ML

Type-safe deep learning framework for reproducible computer vision research.
## Table of Contents

- Overview
- Hardware Requirements
- Quick Start
- Colab Notebooks
- Experiment Management
- Documentation
- Citation
- Roadmap
- License
## Overview
Orchard ML is a research-grade PyTorch training framework engineered for reproducible, scalable computer vision experiments across diverse domains. Built on MedMNIST v2 medical imaging datasets and expanded to astronomical imaging (Galaxy10 DECals) and standard benchmarks (CIFAR-10/100), it provides a domain-agnostic platform supporting multi-resolution architectures (28×28, 32×32, 64×64, 224×224), automated hyperparameter optimization, and cluster-safe execution.
**Key Differentiators:**

- **Type-Safe Configuration Engine**: Pydantic V2-based declarative manifests eliminate runtime configuration errors
- **Idempotent Lifecycle Orchestration**: `RootOrchestrator` coordinates a 7-phase initialization sequence (seeding, filesystem, logging, infrastructure locks, telemetry) via a context manager with full dependency injection
- **Zero-Conflict Execution**: Kernel-level file locking (`fcntl`) prevents concurrent runs from corrupting shared resources
- **Intelligent Hyperparameter Search**: `Optuna` integration with TPE sampling and median pruning
- **Hardware-Agnostic**: Auto-detection and optimization for `CPU`/`CUDA`/`MPS` backends
- **Audit-Grade Traceability**: BLAKE2b-hashed run directories with full `YAML` snapshots
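The zero-conflict guarantee can be illustrated with a few lines of stdlib `fcntl`. This is a hedged sketch of the general pattern, not Orchard ML's actual implementation; the function name and lock-file path are made up for illustration:

```python
import fcntl
import os

def acquire_run_lock(lock_path: str) -> int:
    """Illustrative sketch: take an exclusive, non-blocking kernel-level
    lock so a second concurrent run fails fast instead of corrupting
    shared resources. Not Orchard ML's actual API."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        raise RuntimeError(f"Another run already holds {lock_path}")
    # Keep the descriptor open for the lifetime of the run;
    # closing it releases the lock automatically.
    return fd
```

Because the lock lives in the kernel, it is released even if the process crashes, which is what makes this pattern cluster-safe.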
**Supported Architectures:**

| Resolution | Architecture | Parameters | Use Case |
|---|---|---|---|
| 28 / 32 / 64 / 224 | ResNet-18 | ~11M | Multi-resolution baseline, transfer learning |
| 28 / 32 / 64 | MiniCNN | ~95K | Fast prototyping, ablation studies |
| 224×224 | EfficientNet-B0 | ~4.0M | Efficient compound scaling |
| 224×224 | ConvNeXt-Tiny | ~27.8M | Modern ConvNet design |
| 224×224 | ViT-Tiny | ~5.5M | Patch-based attention, multiple weight variants |
> [!TIP]
> **1000+ additional architectures via `timm`:** Any model in the `timm` registry can be used by prefixing the name with `timm/` in your recipe:
>
> ```yaml
> architecture:
>   name: "timm/mobilenetv3_small_100"  # ~1.5M params, edge-friendly
>   pretrained: true
> ```
>
> This works with MobileNet, DenseNet, RegNet, EfficientNet-V2, and any other architecture supported by `timm`. See `recipes/config_timm_mobilenetv3.yaml` for a ready-to-use example.
## Hardware Requirements

### CPU Training (28×28 / 32×32 / 64×64)

- **Supported Resolutions**: 28×28, 32×32, 64×64
- **Time**: ~2.5 hours (`ResNet-18`, 28×28, 60 epochs, 16 cores)
- **Time**: ~5-10 minutes (`MiniCNN`, 28×28, 60 epochs, 16 cores)
- **Architectures**: `ResNet-18`, `MiniCNN`
- **Use Case**: Development, testing, limited hardware environments
### GPU Training (All Resolutions)

- **28×28 Resolution**:
  - `MiniCNN`: ~2-3 minutes (60 epochs)
  - `ResNet-18`: ~10-15 minutes (60 epochs)
- **32×32 Resolution (CIFAR-10/100)**:
  - `MiniCNN`: ~3-5 minutes (60 epochs)
  - `ResNet-18`: ~15-20 minutes (60 epochs)
- **64×64 Resolution**:
  - `MiniCNN`: ~3-5 minutes (60 epochs)
  - `ResNet-18`: ~15-20 minutes (60 epochs)
- **224×224 Resolution**:
  - `EfficientNet-B0`: ~30 minutes per trial (15 epochs)
  - `ViT-Tiny`: ~25-35 minutes per trial (15 epochs)
- **VRAM**: 8GB recommended for 224×224 resolution
- **Architectures**: `ResNet-18`, `EfficientNet-B0`, `ConvNeXt-Tiny`, `ViT-Tiny`
> [!WARNING]
> 224×224 training on CPU is not recommended: it would take 10+ hours per trial. High-resolution training requires GPU acceleration. Only 28×28 resolution has been tested and validated for CPU training.
> [!NOTE]
> **Apple Silicon (`MPS`)**: The codebase includes `MPS` backend support (device detection, seeding, memory management), but it has not been tested on real hardware. If you encounter issues, please open an issue.
> [!NOTE]
> **Data Format**: Orchard ML operates on `NPZ` archives as its canonical data format. All datasets are downloaded or converted to `NPZ` before entering the training pipeline. Custom datasets in other formats (HDF5, DICOM, TIFF) can be integrated by adding a conversion step in a dedicated fetcher module; see the Galaxy10 fetcher for a reference implementation.
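As an illustration of the canonical format, a custom dataset can be packed into an `NPZ` archive with NumPy. The array key names below (`images`, `labels`) are assumptions for illustration, not a documented schema; check the fetcher modules for the real layout:

```python
import numpy as np

# Hypothetical example: pack an image stack and labels into an NPZ archive.
# Key names are illustrative; the actual schema lives in the fetcher modules.
images = np.zeros((100, 28, 28, 3), dtype=np.uint8)  # N × H × W × C
labels = np.zeros((100,), dtype=np.int64)

np.savez_compressed("my_dataset.npz", images=images, labels=labels)

archive = np.load("my_dataset.npz")
print(sorted(archive.files))  # ['images', 'labels']
```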
**Representative Benchmarks (RTX 5070 Laptop GPU):**

| Task | Architecture | Resolution | Device | Time | Notes |
|---|---|---|---|---|---|
| Smoke Test | MiniCNN | 28×28 | CPU/GPU | <30s | 1-epoch sanity check |
| Quick Training | MiniCNN | 28×28 | GPU | ~2-3 min | 60 epochs |
| Quick Training | MiniCNN | 28×28 | CPU (16 cores) | ~5-10 min | 60 epochs, CPU-validated |
| Mid-Res Training | MiniCNN | 64×64 | GPU | ~3-5 min | 60 epochs |
| Transfer Learning | ResNet-18 | 28×28 | GPU | ~5 min | 60 epochs |
| Transfer Learning | ResNet-18 | 28×28 | CPU (16 cores) | ~2.5h | 60 epochs, CPU-validated |
| High-Res Training | EfficientNet-B0 | 224×224 | GPU | ~30 min/trial | 15 epochs per trial, GPU required |
| High-Res Training | ViT-Tiny | 224×224 | GPU | ~25-35 min/trial | 15 epochs per trial, GPU required |
| Optimization Study | EfficientNet-B0 | 224×224 | GPU | ~2h | 4 trials (early stop at AUC≥0.9999) |
| Optimization Study | Various | 224×224 | GPU | ~1.5-5h | 20 trials, highly variable |
> [!NOTE]
> **Timing Variance**: Optimization times depend heavily on early stopping criteria, pruning configuration, and dataset complexity:
> - **Early Stopping**: Studies may finish in 1-3 hours if performance thresholds are met quickly (e.g., AUC ≥ 0.9999 after 4 trials)
> - **Full Exploration**: Without early stopping, 20 trials can extend to 5+ hours
> - **Pruning Impact**: Median pruning can save 30-50% of total time by terminating underperforming trials
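The pruning rule is simple to state: a trial is stopped early when its intermediate score falls below the median of what previous trials reported at the same epoch. A minimal, library-free sketch of that decision follows (Optuna's actual `MedianPruner` adds warm-up trials and reporting-interval options on top of this):

```python
from statistics import median

def should_prune(history: dict[int, list[float]], epoch: int, score: float) -> bool:
    """Return True if `score` at `epoch` falls below the median of the
    scores previous trials reported at the same epoch. Illustrative only."""
    previous = history.get(epoch, [])
    if not previous:  # nothing to compare against yet
        return False
    return score < median(previous)

# Scores three earlier trials reported at epoch 5 (higher is better, e.g. AUC)
history = {5: [0.91, 0.88, 0.95]}
print(should_prune(history, 5, 0.85))  # True  -> terminate this trial early
print(should_prune(history, 5, 0.93))  # False -> keep training
```

This is where the 30-50% time saving comes from: underperforming trials stop consuming GPU time at an intermediate epoch instead of running to completion.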
## Quick Start

### Step 1: Environment Setup

**Option A: Install from source (recommended)**

```bash
git clone https://github.com/tomrussobuilds/orchard-ml.git
```

Navigate into the project directory and install in editable mode:

```bash
cd orchard-ml
pip install -e .
```

With development tools (linting, testing, type checking):

```bash
pip install -e ".[dev]"
```

**Option B: Install from PyPI**

```bash
pip install orchard-ml
orchard init             # generates recipe.yaml with all defaults
orchard run recipe.yaml
```
### Step 2: Verify Installation (Optional)

```bash
# Run 1-epoch sanity check (~30 seconds, CPU/GPU)
# Downloads BloodMNIST 28×28 by default
python -m tests.smoke_test
```

You can skip this step: datasets are auto-downloaded on the first run.
### Step 3: Training Workflow

Orchard ML uses the `orchard` CLI as the single entry point for all workflows. Pipeline behavior is controlled entirely by the YAML recipe:

- **Training only**: Use a `config_*.yaml` file (no `optuna:` section)
- **Optimization + Training**: Use an `optuna_*.yaml` file (has an `optuna:` section)
- **With Export**: Add an `export:` section to your config

```bash
orchard --version      # Verify installation
orchard run --help     # Show available options
```
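The section-based dispatch above can be sketched in a few lines. `select_workflow` and the parsed-recipe dict are hypothetical stand-ins for whatever the CLI does internally; only the section names (`optuna:`, `export:`) come from the documentation:

```python
def select_workflow(recipe: dict) -> str:
    """Hypothetical sketch: map the sections of a parsed YAML recipe to a
    pipeline. An `optuna:` section prepends optimization; an `export:`
    section appends export. Not the actual Orchard ML internals."""
    stages = ["optimize"] if "optuna" in recipe else []
    stages.append("train")
    if "export" in recipe:
        stages.append("export")
    return " + ".join(stages)

print(select_workflow({"training": {}}))                # train
print(select_workflow({"training": {}, "optuna": {}}))  # optimize + train
print(select_workflow({"training": {}, "export": {}}))  # train + export
```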
**Training Only (Quick start)**

```bash
# 28×28 resolution (CPU-compatible)
orchard run recipes/config_mini_cnn.yaml           # ~2-3 min GPU, ~5-10 min CPU
orchard run recipes/config_resnet_18.yaml          # ~10-15 min GPU, ~2.5h CPU

# 32×32 resolution (CIFAR-10/100)
orchard run recipes/config_cifar10_mini_cnn.yaml   # ~3-5 min GPU
orchard run recipes/config_cifar10_resnet_18.yaml  # ~10-15 min GPU

# 64×64 resolution (CPU/GPU)
orchard run recipes/config_mini_cnn_64.yaml        # ~3-5 min GPU

# 224×224 resolution (GPU required)
orchard run recipes/config_efficientnet_b0.yaml    # ~30 min GPU
orchard run recipes/config_vit_tiny.yaml           # ~25-35 min GPU

# Override any config value on the fly
orchard run recipes/config_mini_cnn.yaml --set training.epochs=20 --set training.seed=99
```
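A dotted-path override like `training.epochs=20` amounts to walking nested config keys. A minimal sketch of that mechanic, where the function name and naive type coercion are assumptions rather than the CLI's real implementation (the real CLI presumably validates values through the Pydantic schema):

```python
def apply_override(config: dict, assignment: str) -> None:
    """Apply a `section.key=value` override in place. Illustrative sketch
    only: integers are coerced naively, everything else stays a string."""
    path, raw = assignment.split("=", 1)
    *parents, leaf = path.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = int(raw) if raw.lstrip("-").isdigit() else raw

config = {"training": {"epochs": 60, "seed": 42}}
apply_override(config, "training.epochs=20")
apply_override(config, "training.seed=99")
print(config)  # {'training': {'epochs': 20, 'seed': 99}}
```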
**What happens:**

- Dataset auto-downloaded to `./dataset/`
- Training runs for 60 epochs with early stopping
- Results saved to a timestamped directory in `outputs/`
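The BLAKE2b-hashed, timestamped run directories can be illustrated with stdlib `hashlib`. The name layout and the choice to hash the recipe text are guesses for illustration; Orchard ML's actual scheme may differ:

```python
import hashlib
from datetime import datetime

def run_dir_name(recipe_text: str, now: datetime) -> str:
    """Illustrative sketch: a timestamp plus a short BLAKE2b digest of the
    recipe makes run directories unique and traceable to their config."""
    digest = hashlib.blake2b(recipe_text.encode(), digest_size=4).hexdigest()
    return f"{now:%Y%m%d_%H%M%S}_{digest}"

name = run_dir_name("training:\n  epochs: 60\n", datetime(2026, 1, 15, 9, 30, 0))
print(name)  # e.g. 20260115_093000_<8 hex chars>
```

The digest ties the directory to the exact configuration that produced it, while the timestamp keeps runs chronologically sortable.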
**Hyperparameter Optimization + Training (Full pipeline)**

```bash
# 28×28 resolution - fast iteration
orchard run recipes/optuna_mini_cnn.yaml            # ~5 min GPU, ~5-10 min CPU
orchard run recipes/optuna_resnet_18.yaml           # ~15 min GPU

# 32×32 resolution - CIFAR-10/100
orchard run recipes/optuna_cifar100_mini_cnn.yaml   # ~1-2h GPU
orchard run recipes/optuna_cifar100_resnet_18.yaml  # ~3-4h GPU

# 224×224 resolution - requires GPU
orchard run recipes/optuna_efficientnet_b0.yaml     # ~1.5-5h*, GPU
orchard run recipes/optuna_vit_tiny.yaml            # ~3-5h*, GPU

# *Time varies due to early stopping (may finish in 1-3h if target AUC reached)
```
**What happens:**

- **Optimization**: Explores hyperparameter combinations with `Optuna`
- **Training**: Full 60-epoch training with the best hyperparameters found
- **Artifacts**: Interactive plots, `best_config.yaml`, model weights
> [!TIP]
> **Model Search**: Enable `optuna.enable_model_search: true` in your `YAML` config to let `Optuna` automatically explore all registered architectures for the target resolution. Use `optuna.model_pool` to restrict the search to a subset of architectures (e.g. `["vit_tiny", "efficientnet_b0"]`).
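Put together as a recipe fragment, the two keys might look like this. Only `optuna.enable_model_search` and `optuna.model_pool` come from the tip above; the surrounding structure is an assumption:

```yaml
optuna:
  enable_model_search: true                     # let Optuna pick the architecture
  model_pool: ["vit_tiny", "efficientnet_b0"]   # optional: restrict candidates
```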
**View optimization results:**

```bash
firefox outputs/*/figures/param_importances.html     # Which hyperparameters matter most
firefox outputs/*/figures/optimization_history.html  # Trial progression
```
**Model Export (Production deployment)**

All training configs (`config_*.yaml`) include ONNX export by default:

```bash
orchard run recipes/config_efficientnet_b0.yaml
# → Training + ONNX export to outputs/*/exports/model.onnx
```

See the Export Guide for configuration options (format, quantization, validation).
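A hedged sketch of what an `export:` section might contain. Every key name below is an assumption based on the options the Export Guide mentions (format, quantization, validation), not a verified schema; consult the Export Guide for the real parameters:

```yaml
export:
  format: onnx       # assumed key: target export format
  quantize: false    # assumed key: optional post-training quantization
  validate: true     # assumed key: compare exported vs native model outputs
```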
## Colab Notebooks

Try Orchard ML directly in Google Colab, no local setup required:

| Notebook | Description | Runtime | Time |
|---|---|---|---|
| | MiniCNN training on BloodMNIST 28×28: end-to-end training, evaluation, and ONNX export | CPU | ~15 min |
| | Automatic architecture search (EfficientNet-B0, ViT-Tiny, ConvNeXt-Tiny, ResNet-18) on Galaxy10 224×224 with Optuna | T4 GPU | ~30-45 min |
## Experiment Management

Every run generates a complete artifact suite for full traceability. Both training-only and optimization workflows share the same `RunPath` orchestrator, producing BLAKE2b-hashed, timestamped directories.
Browse Sample Artifacts — Excel reports, YAML configs, and diagnostic plots from real training runs.
See the full artifact tree for the complete directory layout — logs, model weights, and HTML plots are generated locally and not tracked in the repo.
Browse Recipe Configs — Ready-to-use YAML configurations for every architecture and workflow.
Copy the closest recipe, tweak the parameters, and run:

```bash
cp recipes/config_efficientnet_b0.yaml my_run.yaml
# edit hyperparameters, swap dataset/model, add or remove sections (optuna, export, tracking)
orchard run my_run.yaml
```
## Documentation
| Guide | Covers |
|---|---|
| Framework Guide | System architecture diagrams, design principles, component deep-dives |
| Architecture Guide | Supported model architectures, weight transfer, grayscale adaptation, MixUp |
| Configuration Guide | Full parameter reference, usage patterns, adding new datasets |
| Optimization Guide | Optuna integration, search space config, pruning strategies, visualization |
| Docker Guide | Container build instructions, GPU-accelerated execution, reproducibility mode |
| Export Guide | ONNX export pipeline, quantization options, validation and benchmarking |
| Tracking Guide | MLflow local setup, dashboard and run comparison, programmatic querying |
| Artifact Guide | Output directory structure, training vs optimization artifact differences |
| Testing Guide | Test suite, quality automation scripts, CI/CD pipeline details |
| `orchard/` / `tests/` | Internal package structure, module responsibilities, extension points |
## Citation

```bibtex
@software{orchardml2026,
  author = {Tommaso Russo},
  title  = {Orchard ML: Type-Safe Deep Learning Framework},
  year   = {2026},
  url    = {https://github.com/tomrussobuilds/orchard-ml},
  note   = {PyTorch framework with Pydantic V2 configuration and Optuna optimization}
}
```
## Roadmap

- **Expanded Dataset Domains**: Climate, remote sensing, microscopy
- **Multi-modal Support**: Detection, segmentation hooks
- **Distributed Training**: `DDP`, `FSDP` support for multi-GPU
## License
MIT License - See LICENSE for details.
## Contributing

Contributions welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Ensure all tests pass: `pytest tests/ -v`
5. Submit a pull request

For detailed guidelines, see CONTRIBUTING.md.
## Contact
For questions or collaboration: GitHub Issues