Evolutionary Neural Architecture Search — compress any PyTorch model with one function call
dNATY
Evolutionary AI Model Compression
46.5% fewer FLOPs • 1.6× faster inference • 98.85% accuracy retained
Automated model compression using multi-objective evolutionary search. Zero manual engineering. One function call: compress(model, dataset).
Why dNATY?
Problem: Most production models are oversized — too slow for real-time inference, too expensive to run.
Existing solutions require:
- Manual architecture tuning
- Days of hyperparameter search
- Accuracy/speed trade-offs
dNATY addresses this with evolutionary search guided by episodic memory: operators that worked before are tried more often. In the benchmarks below, it finds Pareto-optimal solutions in 4 to 6 minutes.
Proven Results (CIFAR-100)
| Model | FLOPs Reduction | Speedup | Accuracy Retained | Time |
|---|---|---|---|---|
| ResNet-50 | -46.5% | 1.6× | 98.85% | 5 min |
| EfficientNet-B0 | -40% | 1.4× | 98.85% | 6 min |
| MobileNetV3-Large | -98% | 1.8× | 97.2% | 4 min |
Project Structure
dNATY/
├── dnaty/ # Core compression framework
├── dnaty_saas/ # Production API (FastAPI)
├── frontend/ # Web UI (React + TypeScript)
├── notebooks/ # Experiments & benchmarks
├── scripts/ # Demo & utilities
├── tests/ # Unit tests
└── pyproject.toml # Package config
Quick Start
pip install dnaty
from dnaty import compress
from dnaty.experiments.fast_dataset import FastDataset
# Your existing model (any PyTorch model with Linear layers)
model = your_trained_model
# Your data
ds = FastDataset("MNIST", device="cpu", train_subset=10_000)
# Compress
result = compress(model, ds, target_flops=0.5, n_generations=30)
print(result.summary())
# CompressResult | arch=[128, 64] | FLOPs -46.5% (327680 -> 175104) |
# params -52.3% (328K -> 156K) | acc=0.9821
How It Works
dNATY runs a population of candidate architectures through an evolution loop:
- Mutate — apply structural operators (add/remove neurons, merge layers, etc.)
- Train — locally train each candidate for a few epochs
- Select — NSGA-II Pareto selection: maximize accuracy, minimize FLOPs
- Remember — episodic memory records which operators helped most; they get picked more often next round
The memory mechanism is dNATY's core innovation. Over generations, the search becomes smarter — not random.
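The loop above can be illustrated with a small, dependency-free toy. This is only a sketch under stated assumptions, not dNATY's implementation: the operator names (`widen`, `shrink`, `drop_layer`), the `toy_accuracy` proxy that stands in for local training, and the reward rule (credit an operator when its child reaches the Pareto front) are all hypothetical. Selection here keeps only the first non-dominated front and truncates at random, whereas full NSGA-II also ranks later fronts and uses crowding distance.

```python
import random

# Hypothetical structural operators on a list of hidden-layer widths.
def widen(arch, rng):
    i = rng.randrange(len(arch))
    return arch[:i] + [arch[i] * 2] + arch[i + 1:]

def shrink(arch, rng):
    i = rng.randrange(len(arch))
    return arch[:i] + [max(4, arch[i] // 2)] + arch[i + 1:]

def drop_layer(arch, rng):
    if len(arch) == 1:
        return list(arch)  # never remove the last hidden layer
    i = rng.randrange(len(arch))
    return arch[:i] + arch[i + 1:]

OPERATORS = {"widen": widen, "shrink": shrink, "drop_layer": drop_layer}

def flops(arch, n_in=784, n_out=10):
    """Multiply-accumulate count of a fully connected net with these widths."""
    sizes = [n_in, *arch, n_out]
    return sum(a * b for a, b in zip(sizes, sizes[1:]))

def toy_accuracy(arch):
    """Stand-in for 'train a few epochs'; saturates with total width. Not a real metric."""
    capacity = sum(arch)
    return capacity / (capacity + 64)

def dominates(a, b):
    """a and b are (accuracy, flops); a dominates b if no worse in both and better in one."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])

def pareto_front(candidates):
    scored = [(arch, (toy_accuracy(arch), flops(arch))) for arch in candidates]
    return [arch for arch, s in scored if not any(dominates(t, s) for _, t in scored)]

def pick_operator(memory, rng):
    names = list(OPERATORS)
    # The +1 keeps every operator explorable even with no recorded successes.
    weights = [1.0 + memory[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

def evolve(seed_arch, n_generations=10, n_pop=8, seed=0):
    rng = random.Random(seed)
    memory = {name: 0.0 for name in OPERATORS}  # operator -> accumulated credit
    population = [tuple(seed_arch)]
    for _ in range(n_generations):
        born_from = {}  # child architecture -> operator that produced it
        for _ in range(n_pop):
            parent = list(rng.choice(population))
            name = pick_operator(memory, rng)                # Mutate (memory-weighted)
            child = tuple(OPERATORS[name](parent, rng))
            born_from.setdefault(child, name)
        candidates = list(dict.fromkeys(population + list(born_from)))
        front = pareto_front(candidates)                     # Train (toy proxy) + Select
        for arch in front:
            if arch in born_from and arch not in population:
                memory[born_from[arch]] += 1.0               # Remember
        population = front if len(front) <= n_pop else rng.sample(front, n_pop)
    return population, memory

population, memory = evolve([256, 128])
for arch in sorted(population, key=flops):
    print(list(arch), flops(arch), round(toy_accuracy(arch), 3))
print(memory)
```

Because the operator weights grow with recorded successes, later generations sample useful operators more often while the constant offset preserves some exploration.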
API
compress(model, train_data, **kwargs) -> CompressResult
| Parameter | Default | Description |
|---|---|---|
| model | required | Any nn.Module with Linear layers |
| train_data | required | DataLoader or FastDataset |
| target_flops | 0.5 | Target fraction of original FLOPs (0.5 = 50% less) |
| n_generations | 30 | Evolutionary generations |
| n_pop | 15 | Population size |
| device | auto | 'cpu' or 'cuda' |
| seed | None | Fix for reproducibility |
CompressResult
result.model # compressed nn.Module, ready to use
result.accuracy # validation accuracy
result.flops_reduction # e.g. 0.465 = 46.5% fewer FLOPs
result.flops_reduction_pct # same as percentage
result.params_reduction_pct
result.arch # hidden layer sizes found [128, 64]
result.summary() # one-line human-readable summary
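As a quick check on how the fraction and percentage fields relate, the snippet below recomputes the reduction from the raw FLOP counts shown in the Quick Start summary line. The helper name `flops_reduction` is hypothetical and is not part of dNATY's API; it only illustrates the arithmetic, assuming the reduction is defined as one minus the ratio of compressed to original FLOPs.

```python
def flops_reduction(original_flops: int, compressed_flops: int) -> float:
    """Fraction of FLOPs removed, e.g. 0.25 means 25% fewer FLOPs."""
    if original_flops <= 0:
        raise ValueError("original_flops must be positive")
    return 1.0 - compressed_flops / original_flops

# Raw counts copied from the Quick Start summary line.
original, compressed = 327_680, 175_104

reduction = flops_reduction(original, compressed)  # fraction, like result.flops_reduction
reduction_pct = 100.0 * reduction                  # percentage, like result.flops_reduction_pct

print(f"fraction removed: {reduction:.4f}")
print(f"percent removed:  {reduction_pct:.2f}%")
```

The computed value is close to the 46.5% reported in the summary; the small difference would be consistent with rounding or truncation in the printed output.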
SaaS API
dNATY ships with a production-ready FastAPI backend.
cd dnaty_saas
cp .env.example .env # fill DATABASE_URL, JWT_SECRET, ANTHROPIC_API_KEY
uvicorn main:app --reload
POST /api/v1/compress
{
"description": "classifies defects in parts, needs to run on a Raspberry Pi",
"dataset": "MNIST",
"target_flops": 0.5,
"n_generations": 30
}
Response 202:
{ "job_id": "a3f2c1b0", "status": "queued", "message": "..." }
GET /api/v1/compress/{job_id}
{
"status": "completed",
"result": {
"accuracy": 0.9821,
"flops_reduction": 0.465,
"arch": [128, 64],
"explanation": "...", // Claude-generated explanation
"deployment_code": "..." // ready-to-use Python code
}
}
Set ANTHROPIC_API_KEY in .env to enable Claude explanations. Without it, the endpoint still works and returns template text instead.
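The submit-then-poll flow described above could be driven from a small client like the following sketch, which uses only the Python standard library. Several details are assumptions rather than documented behavior: the base URL (uvicorn's default `http://localhost:8000`), the optional bearer token (the `.env` file mentions `JWT_SECRET`, but how a token is obtained is not described here), and the polling interval. The helper names `build_payload`, `submit_job`, and `wait_for_result` are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default host and port

def build_payload(description, dataset="MNIST", target_flops=0.5, n_generations=30):
    """Request body with the same fields as the POST example above."""
    return {
        "description": description,
        "dataset": dataset,
        "target_flops": target_flops,
        "n_generations": n_generations,
    }

def _request(method, path, body=None, token=None):
    """Send one JSON request and decode the JSON reply."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    if token:  # assumption: the API accepts a bearer token when auth is enabled
        req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))

def submit_job(payload, send=_request, token=None):
    """POST the job and return the job_id from the 202 response."""
    reply = send("POST", "/api/v1/compress", payload, token)
    return reply["job_id"]

def wait_for_result(job_id, send=_request, token=None, poll_seconds=2.0, max_polls=150):
    """Poll the job until its status is 'completed', then return the result object."""
    for _ in range(max_polls):
        reply = send("GET", f"/api/v1/compress/{job_id}", None, token)
        if reply["status"] == "completed":
            return reply["result"]
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not complete after {max_polls} polls")
```

With the server running, `wait_for_result(submit_job(build_payload("classifies defects in parts")))` would return the `result` object shown above. The `send` parameter is injectable so the flow can be exercised without a live server.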
Getting Started
Web UI
cd frontend
npm install
npm run dev
Jupyter Notebooks
- CIFAR-100 Baseline — notebooks/colab_cifar100_notebook.ipynb
- ImageNet Benchmark — notebooks/colab_imagenet_simple_FIXED.ipynb
- CPU Latency — notebooks/benchmark_cpu_latency.py
CLI Demo
python scripts/demo_compress.py # 20 gens, MNIST (~5 min CPU)
python scripts/demo_compress.py --full # 30 gens, more accurate
python scripts/demo_compress.py --dataset FashionMNIST
Benchmarks
| Metric | Value |
|---|---|
| FLOPs reduction vs. initial arch | -46.5% |
| FLOPs reduction vs. RandomNAS | better in Pareto front |
| Speedup to target accuracy | 1.6× fewer generations |
| Continual learning: backward transfer (BWT) vs. EWC | 6.9× less forgetting |
All numbers are reproducible with python scripts/prove_it.py.
License
BSL 1.1 — free for non-commercial use; contact pedrol.vergueiro@gmail.com for commercial licensing.