Evolutionary Neural Architecture Search — compress any PyTorch model with one function call

dNATY

Evolutionary AI Model Compression

46.5% fewer FLOPs · 1.6× faster search · 98.59% accuracy retained · no GPU required

Compress any PyTorch model with one function call.
dNATY uses multi-objective evolutionary search to find smaller, faster architectures — automatically.

pip install dnaty

Quickstart

import torch.nn as nn
from dnaty import compress
from dnaty.experiments.fast_dataset import FastDataset

# 1. Your model — any nn.Module with Linear layers
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 512), nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 10)
)

# 2. Load dataset (cached in RAM — zero I/O across generations)
ds = FastDataset("MNIST", device="cpu", train_subset=10_000)

# 3. Compress
result = compress(model, ds, target_flops=0.5, n_generations=30)

print(result.summary())
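# Example output (numbers from the 30K-sample, seed=42 benchmark run; yours will vary without a fixed seed):
# CompressResult | arch=[301, 153, 128] | FLOPs -46.5% (1,133,056 → 605,802)
#   | params -46.5% (536K → 286K) | acc=0.9859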

The compressed model is a regular nn.Module — drop it into your existing pipeline:

result.model          # nn.Module, ready for inference
result.accuracy       # 0.9859
result.flops_reduction_pct  # 46.5
result.arch           # [301, 153, 128]  ← hidden layer sizes found
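
The FLOP figures in the summary can be checked by hand. The sketch below assumes two operations (one multiply, one add) per weight of each fully connected layer, with biases ignored. That counting convention is an assumption rather than something the package documents, although it reproduces 605,802 for arch=[301, 153, 128]; under the same convention the 1,133,056 baseline matches hidden sizes [512, 256, 128]. `mlp_flops` is a hypothetical helper, not part of dNATY.

```python
def mlp_flops(input_size, hidden, n_classes):
    """Count 2 FLOPs (multiply + add) per weight of a fully connected MLP; biases ignored."""
    sizes = [input_size, *hidden, n_classes]
    # Each consecutive pair of sizes is one Linear layer with in * out weights.
    return 2 * sum(a * b for a, b in zip(sizes, sizes[1:]))


compressed = mlp_flops(784, [301, 153, 128], 10)
baseline = mlp_flops(784, [512, 256, 128], 10)
reduction_pct = 100 * (1 - compressed / baseline)

print(compressed, baseline, round(reduction_pct, 1))  # 605802 1133056 46.5
```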

Why dNATY?

The problem: most models ship larger than they need to be. That means slower inference, higher cloud bills, and models too heavy for edge devices (cameras, drones, robots). Shrinking them by hand is days of trial-and-error with no guarantee you found the best size/accuracy trade-off.

What you get with dNATY:

Smaller, cheaper models — ~46% fewer FLOPs on MNIST, accuracy kept (98.59%)
No GPU — the search runs on CPU in minutes, so it works in CI and on the edge hardware you already have
No retraining — point it at a model + dataset, get a deployable nn.Module back
One function call — compress(model, dataset); export to .pth / .onnx

How is this different from pruning / quantization / distillation?

Those methods shrink the model you already have. dNATY searches for a different, smaller architecture that does the same job, for example one with narrower or fewer hidden layers. The approaches are complementary, not competing:

Method	What it does	Catch
Quantization	Lower-precision weights (fp32→int8)	Same architecture & op count. Stack it on top of dNATY.
Pruning	Zeroes individual weights	Needs sparse runtimes to actually run faster; manual tuning
Distillation	Trains a small student model	You design the student + write the training loop
DARTS	Gradient-based architecture search	Needs a GPU + hours of config
Random NAS	Random architecture sampling	No memory — re-tries bad ideas
dNATY	Evolves a smaller architecture, memory-guided	CPU-only, one call, no retraining

The engine is episodic memory-guided evolutionary search (NSGA-II, multi-objective): operators that helped in past generations get sampled more often, so it converges faster than random search — no gradients, no GPU.
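
As a rough illustration of "operators that helped get sampled more often", the sketch below keeps a score per mutation operator and samples in proportion to it. It is a minimal sketch of the idea, not dNATY's implementation: the `OperatorMemory` class, the additive reward, and the prior of 1.0 are all assumptions made for the example. The operator names are taken from the mutation list in "How it works".

```python
import random


class OperatorMemory:
    """Tracks how often each mutation operator improved a candidate (hypothetical helper)."""

    def __init__(self, operators, prior=1.0):
        # The prior keeps every operator selectable, even one that has not helped yet.
        self.scores = {op: prior for op in operators}

    def record(self, op, improved):
        # Reward only operators whose mutation improved fitness.
        if improved:
            self.scores[op] += 1.0

    def probabilities(self):
        total = sum(self.scores.values())
        return {op: score / total for op, score in self.scores.items()}

    def sample(self, rng):
        ops = list(self.scores)
        weights = [self.scores[op] for op in ops]
        return rng.choices(ops, weights=weights, k=1)[0]


memory = OperatorMemory(["widen", "narrow", "split", "merge"])
for _ in range(6):
    memory.record("narrow", improved=True)   # "narrow" helped six times
memory.record("widen", improved=False)       # no reward for a failed mutation

rng = random.Random(42)
probs = memory.probabilities()               # "narrow" now holds 7 of 10 score units
picked = memory.sample(rng)
```

Random search corresponds to leaving every score at the prior, so all operators stay equally likely regardless of past outcomes.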

Benchmark: dNATY vs alternatives

Results on MNIST (30K training samples, CPU, seed=42).

Method	FLOPs reduction	Accuracy	Setup effort	GPU needed
dNATY	−46.5%	98.59%	1 function call	No
RandomNAS	−41.2%	98.54%	1 function call	No
`torch.nn.utils.prune`	−30–40%*	varies	manual per-layer	No
DARTS	−35–50%	varies	hours of config	Yes
Manual knowledge distillation	−20–60%*	varies	custom training loop	No

* highly dependent on model and manual choices

Continual learning (Split-MNIST, 5 tasks, 3 seeds)

Method	Backward Transfer (BWT, closer to 0 is better)	Forgetting
dNATY	−0.145	best
EWC	−0.999	near-total forgetting
MLP (no CL)	−0.998	baseline

dNATY achieves 6.9× less catastrophic forgetting than EWC.
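
For readers unfamiliar with the metric, the sketch below computes backward transfer using the common definition: the average change in accuracy on each earlier task between the moment it was learned and the end of training. Whether `prove_it.py` uses exactly this definition is an assumption, and the accuracy matrix is made up for illustration. The last line reproduces the 6.9× ratio from the BWT values in the table.

```python
def backward_transfer(acc):
    """acc[i][j] = accuracy on task j after training on tasks 0..i."""
    n_tasks = len(acc)
    final = acc[-1]
    # Compare final accuracy on each earlier task with its accuracy right after learning it.
    deltas = [final[j] - acc[j][j] for j in range(n_tasks - 1)]
    return sum(deltas) / len(deltas)


# Illustrative 3-task accuracy matrix (made-up values, not a dNATY measurement).
acc = [
    [0.99, 0.00, 0.00],
    [0.90, 0.98, 0.00],
    [0.85, 0.92, 0.97],
]
bwt = backward_transfer(acc)  # negative: earlier tasks were partly forgotten

# Ratio behind the "6.9×" figure, using the BWT values from the table above.
ratio = round(-0.999 / -0.145, 1)
```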

[Figure: CPU latency comparison]

All numbers reproducible: python scripts/prove_it.py

Measured across real datasets

Compression depends on how oversized your model is — dNATY finds the right size, it doesn't force a fixed cut. Measured on CPU (held-out accuracy):

Dataset	FLOPs ↓	Accuracy	Note
MNIST	−50.4%	97.0%	oversized MLP → big cut
Fashion-MNIST	−54.6%	86.4%	oversized MLP → big cut
UCI Wine Quality	−78.4%	63.7%	extra capacity useless → shrinks hard
UCI Adult / Census	−2.7%	84.0%	already lean → small cut (correct)
UCI Covertype	−1.5%	78.1%	already lean → small cut (correct)
CIFAR-10 (MLP)	−1.2%	46.4%	MLP unfit for RGB — conv NAS is WIP

Full table, config, and reproduction: BENCHMARKS_REAL.md.

Real examples

MNIST — MLP compression

import torch.nn as nn
from dnaty import compress
from dnaty.experiments.fast_dataset import FastDataset

model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 512), nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

ds = FastDataset("MNIST", device="cpu", train_subset=30_000)
result = compress(model, ds, target_flops=0.5, n_generations=50, seed=42)

print(result.summary())
# FLOPs -46.5% (1,133,056 → 605,802) | acc=0.9859 | arch=[301, 153, 128]

CIFAR-10 — image classification

import torch.nn as nn
from dnaty import compress
from dnaty.experiments.fast_dataset import FastDataset

model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3072, 1024), nn.ReLU(),
    nn.Linear(1024, 512),  nn.ReLU(),
    nn.Linear(512, 10),
)

ds = FastDataset("CIFAR10", device="cpu", train_subset=50_000)
result = compress(model, ds, target_flops=0.5, n_generations=30, seed=0)

print(result.summary())
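# Prints the one-line summary. Expect a small cut here: the CIFAR-10 (MLP) row under
# "Measured across real datasets" reports FLOPs −1.2% at 46.4% accuracy.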

Custom DataLoader

dNATY works with any standard torch.utils.data.DataLoader:

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

from dnaty import compress

X = torch.randn(5_000, 128)
y = torch.randint(0, 2, (5_000,))
loader = DataLoader(TensorDataset(X, y), batch_size=256, shuffle=True)

model = nn.Sequential(
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 2)
)

result = compress(model, loader, target_flops=0.4, n_generations=20)

Deterministic results with seed

result = compress(model, ds, target_flops=0.5, n_generations=30, seed=42)
# Run again with the same seed → identical result

API reference

`compress(model, train_data, **kwargs) → CompressResult`

Parameter	Type	Default	Description
`model`	`nn.Module`	required	Any model with `nn.Linear` layers
`train_data`	`FastDataset` or `DataLoader`	required	Training data
`target_flops`	`float`	`0.5`	Target FLOPs as a fraction of the original (`0.5` = half the original FLOPs)
`n_generations`	`int`	`30`	Evolutionary generations to run
`n_pop`	`int`	`15`	Population size (diversity vs. speed)
`device`	`str`	auto	`'cpu'` or `'cuda'`
`seed`	`int`	`None`	Fix for reproducibility
`verbose`	`bool`	`True`	Print generation-by-generation progress

`CompressResult`

result.model                # nn.Module — compressed model, ready for inference
result.accuracy             # float — validation accuracy
result.flops_reduction      # float — e.g. 0.465 = 46.5% fewer FLOPs
result.flops_reduction_pct  # float — percentage version
result.params_reduction_pct # float — parameter reduction percentage
result.original_flops       # int — FLOPs of the input model
result.compressed_flops     # int — FLOPs of the compressed model
result.original_params      # int — parameters of the input model
result.compressed_params    # int — parameters of the compressed model
result.arch                 # list[int] — hidden layer sizes found
result.generations          # int — generations that were run
result.summary()            # str — one-line human-readable summary

`FastDataset`

Zero-overhead dataset loading — loads everything into RAM once, serves batches via direct indexing.

from dnaty.experiments.fast_dataset import FastDataset

ds = FastDataset(
    name="MNIST",            # "MNIST" | "FashionMNIST" | "CIFAR10"
    device="cpu",            # "cpu" or "cuda"
    train_subset=10_000,     # use a subset of training data (None = full)
    val_size=10_000,         # validation split size
    data_dir="./data",       # where to download/cache
)
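
The "load once, index directly" idea can be sketched in plain Python. This is an illustration of the pattern only, not FastDataset's code: the `InMemoryBatches` class and its arguments are hypothetical, and a real implementation would hold tensors rather than lists.

```python
import random


class InMemoryBatches:
    """Holds all samples in memory and yields batches by index (illustrative, not FastDataset)."""

    def __init__(self, features, labels, batch_size, seed=0):
        assert len(features) == len(labels)
        self.features = features
        self.labels = labels
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def __iter__(self):
        order = list(range(len(self.features)))
        self.rng.shuffle(order)  # fresh order each epoch, no disk access
        for start in range(0, len(order), self.batch_size):
            idx = order[start:start + self.batch_size]
            yield [self.features[i] for i in idx], [self.labels[i] for i in idx]


data = InMemoryBatches(
    features=[[float(i)] for i in range(10)],
    labels=[i % 2 for i in range(10)],
    batch_size=4,
)
batch_sizes = [len(x) for x, _ in data]  # the last batch holds the remainder
```

Because every candidate in every generation iterates over the same in-memory data, the loading cost is paid once rather than once per training run.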

`DnatyEvolver` (advanced)

Direct access to the evolutionary engine for custom search loops:

from dnaty.evolution.evolver import DnatyEvolver

evolver = DnatyEvolver(
    n_pop=20,
    n_generations=50,
    input_size=784,
    n_classes=10,
    init_hidden=[512, 256],
    device="cpu",
    verbose=True,
)
# train_data and val_data are your own training and validation data (not defined in this snippet)
evolver.run(train_data, val_data)

best = evolver.population[0]
print(best.model, best.acc, best.count_flops())

How it works

Initial architecture
        │
        ▼
┌─────────────────────────────────────────┐
│  Population of N candidate architectures │
│  (mutations: add/remove neurons, merge   │
│   layers, split, widen, narrow, skip)    │
└──────────────┬──────────────────────────┘
               │  each generation:
               │
        ┌──────▼──────┐
        │   Mutate    │  ← episodic memory weights operator probabilities
        └──────┬──────┘
               │
        ┌──────▼──────┐
        │    Train    │  3 epochs per candidate (AMP on GPU, fp32 on CPU)
        └──────┬──────┘
               │
        ┌──────▼──────┐
        │   Select    │  NSGA-II Pareto front: max acc + min FLOPs
        └──────┬──────┘
               │
        ┌──────▼──────┐
        │   Remember  │  operators that helped get higher probability next round
        └──────┬──────┘
               │
               ▼
     Best compressed model

The episodic memory is dNATY's core differentiator. Unlike random search or gradient-based NAS, the search improves over generations by remembering what worked.
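
The Select step in the diagram keeps candidates on the Pareto front of accuracy (maximize) and FLOPs (minimize). The sketch below extracts only the first non-dominated front; full NSGA-II also ranks the later fronts and uses crowding distance to break ties, which is omitted here. The function names and the candidate values are illustrative assumptions, not dNATY internals.

```python
def dominates(a, b):
    """a dominates b if it is no worse on both objectives and strictly better on one."""
    acc_a, flops_a = a
    acc_b, flops_b = b
    no_worse = acc_a >= acc_b and flops_a <= flops_b
    strictly_better = acc_a > acc_b or flops_a < flops_b
    return no_worse and strictly_better


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates (the first front)."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# (accuracy, FLOPs) pairs; illustrative values only.
population = [
    (0.985, 600_000),
    (0.980, 900_000),  # dominated: less accurate and more expensive than the first
    (0.970, 400_000),
    (0.960, 450_000),  # dominated by (0.970, 400_000)
]
front = pareto_front(population)
```

The two survivors represent different trade-offs: one is more accurate, the other is cheaper, and neither is strictly better than the other.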

Installation

pip install dnaty              # stable (recommended)
pip install dnaty==1.0.1       # pin to specific version
pip install git+https://github.com/pedrovergueiroo/dNATY  # latest from source

Requirements: Python 3.10+, PyTorch 2.0+, NumPy 1.24+

Optional dev dependencies:

pip install "dnaty[dev]"   # adds pytest, matplotlib, jupyter (quotes keep zsh from expanding the brackets)

Project structure

dNATY/
├── dnaty/
│   ├── compress.py              # public API: compress()
│   ├── evolution/evolver.py     # DnatyEvolver — main search loop
│   ├── core/
│   │   ├── arch.py              # DynamicMLP — mutable architecture
│   │   └── individual.py        # Individual = model + memory + fitness
│   ├── operators/mutations.py   # 8 structural operators
│   ├── training/local_train.py  # fast local trainer (AMP, FP32)
│   └── experiments/
│       └── fast_dataset.py      # FastDataset — zero-I/O loader
├── dnaty_saas/                  # Production API (FastAPI + PostgreSQL)
├── frontend/                    # Web UI (React + TypeScript + Tailwind)
├── notebooks/                   # CIFAR-100, ImageNet experiments
├── scripts/
│   ├── prove_it.py              # reproduces all benchmark numbers
│   └── demo_compress.py         # interactive demo
└── tests/                       # pytest suite

Reproducing the benchmarks

# Full benchmark suite (~25 min on CPU)
python scripts/prove_it.py

# Quick demo (~5 min)
python scripts/demo_compress.py

# Run tests
pytest tests/

Results are written to results/ as JSON files.

SaaS API

dNATY ships with a production-ready API backend (FastAPI + PostgreSQL + Stripe).

cd dnaty_saas
cp .env.example .env    # configure DATABASE_URL, JWT_SECRET, etc.
pip install -r requirements.txt
uvicorn main:app --reload

POST /api/v1/compress — submit a compression job
GET /api/v1/compress/{job_id} — poll status and get results
See /docs (Swagger) when the server is running.

License

Business Source License 1.1 — free for non-commercial use.
Contact pedrol.vergueiro@gmail.com for commercial licensing.

Release history

1.1.0 (this version): Jun 5, 2026
1.0.1: May 29, 2026
1.0.0: May 29, 2026

Download files

Source Distribution

dnaty-1.1.0.tar.gz (62.1 kB)

Uploaded Jun 5, 2026 Source

Built Distribution

dnaty-1.1.0-py3-none-any.whl (36.1 kB)

Uploaded Jun 5, 2026 Python 3

File details

Details for the file dnaty-1.1.0.tar.gz.

File metadata

Download URL: dnaty-1.1.0.tar.gz
Upload date: Jun 5, 2026
Size: 62.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for dnaty-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`6c1aef0f73fa7ec192783e195d1032a792b8493c813b6bc6be06e9d3298cfb1b`
MD5	`0f781a62b6adc365ef3a3ae9f848b1a0`
BLAKE2b-256	`f4b5b9d6edd1f9794025e14c32a5f2772489e55aa19c45e3fa8fb64652e310e4`
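
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a generic sketch: the helper names are hypothetical, and the file path passed to `verify` is whatever location you saved the download to.

```python
import hashlib

# SHA256 digest listed above for dnaty-1.1.0.tar.gz.
EXPECTED = "6c1aef0f73fa7ec192783e195d1032a792b8493c813b6bc6be06e9d3298cfb1b"


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files are never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED):
    """Return True when the file's SHA-256 digest matches the expected value."""
    return sha256_of(path) == expected
```

After downloading, `verify("dnaty-1.1.0.tar.gz")` should return True for an intact file.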

File details

Details for the file dnaty-1.1.0-py3-none-any.whl.

File metadata

Download URL: dnaty-1.1.0-py3-none-any.whl
Upload date: Jun 5, 2026
Size: 36.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for dnaty-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d66e8023702c5d53e9eef312399240401c3c72c0c2c3ace81dc8c66fcb51af5a`
MD5	`3891dcf467e1b59e7a4299eea7d6a8f7`
BLAKE2b-256	`3ea65ae5b6e2d528fbfe15d6b925c284a671d2dbe10d0de5146846b2a1e88ffa`
