Configuration, experimentation, and hyperparameter optimization for Python. No runtime magic. No launcher. Just Python modules you compose.

# Ato: A Scope-Based Config Layer for ML

Describe experiments as composable config views. Get reproducibility for free.

Ato is a scope-based configuration layer for ML experiments. It treats configuration not as static files, but as a sequence of transformations (a chain of reasoning) with explicit priorities and dependencies.

```shell
pip install ato
```
## Quick Start

```python
from ato.scope import Scope
from ato.adict import ADict

scope = Scope()

@scope.observe(default=True)
def defaults(config):
    config.lr = 1e-3
    config.epochs = 50
    config.model = 'resnet50'

@scope.observe(priority=1)
def high_lr(config):
    config.lr = 3e-3

@scope.observe(priority=2, chain_with='high_lr')
def long_run(config):
    config.epochs = 200

@scope.manual
def docs(manual):
    manual.lr = 'Learning rate for optimizer'
    manual.epochs = 'Number of training epochs'

@scope
def train(config):
    print(f'Training {config.model} for {config.epochs} epochs with lr={config.lr}')

if __name__ == '__main__':
    train()
```

Run it:

```shell
python train.py           # lr=1e-3, epochs=50
python train.py high_lr   # lr=3e-3, epochs=50
python train.py long_run  # lr=3e-3, epochs=200 (chain_with auto-applies high_lr)
python train.py lr=0.01   # CLI override
python train.py manual    # Show view application order + docs
```
## Table of Contents

- Core Concepts
- CLI Syntax
- Fingerprinting (Reproducibility)
- SQL Tracker (Local Experiment Tracking)
- Hyperparameter Optimization
- FAQ
## Core Concepts

### Views and Priority

Views are functions that modify configuration. Lower priority is applied first.

```python
@scope.observe(priority=-1)  # Applied first
def base(config):
    config.lr = 0.001

@scope.observe(priority=0)   # Applied second
def mid(config):
    config.lr = 0.01

@scope.observe(priority=1)   # Applied last
def high(config):
    config.lr = 0.1
```

Application order:

Default Views (priority=0) → Named Views (by priority) → CLI Arguments → Lazy Views

CLI arguments always have the highest priority.
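The ordering rule can be sketched in plain Python (an illustration only, not Ato's actual internals): views run in ascending priority order, then CLI overrides are applied last so they always win.

```python
# Hypothetical sketch (not Ato's implementation): apply views in ascending
# priority order, then let CLI overrides have the final say.
def apply_views(views, cli_overrides):
    config = {}
    for _, view in sorted(views, key=lambda pair: pair[0]):  # lower priority first
        view(config)
    config.update(cli_overrides)  # CLI always wins
    return config

views = [
    (1, lambda c: c.update(lr=0.1)),     # applied last among views
    (-1, lambda c: c.update(lr=0.001)),  # applied first
    (0, lambda c: c.update(lr=0.01)),
]
print(apply_views(views, {}))            # lr set by the priority=1 view
print(apply_views(views, {'lr': 0.5}))   # CLI override replaces it
```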
### ADict: Structure-Aware Dict

ADict is not just a dict: it's a structure-aware config object.

```python
from ato.adict import ADict

# Nested access
config = ADict()
config.model.backbone.layers = [64, 128, 256]  # Auto-creates nested structure

# Structural hashing (tracks structure, not values)
config1 = ADict(lr=0.1, epochs=100)
config2 = ADict(lr=0.01, epochs=200)
config1.get_structural_hash() == config2.get_structural_hash()  # True (same structure)

config3 = ADict(lr=0.1, epochs='100')  # epochs is str!
config1.get_structural_hash() == config3.get_structural_hash()  # False (different structure)

# Access tracking
config = ADict(lr=0.1, epochs=100, unused=999)
_ = config.lr
config.get_minimal_config()  # {'lr': 0.1} - only accessed keys

# Freeze/Defrost
config.freeze()   # Read-only
config.defrost()  # Editable

# File I/O
config = ADict.from_file('config.yaml')
config.dump('config.json')
```
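The structural-hashing idea above can be sketched in plain Python (an illustration only, not ADict's actual algorithm): hash the nested key names and value types, never the values themselves.

```python
import hashlib
import json

# Illustrative sketch, not ADict's real implementation: the hash depends on
# key names and value *types*, so changing a value leaves it unchanged.
def structural_hash(d):
    def shape(value):
        if isinstance(value, dict):
            return {k: shape(v) for k, v in sorted(value.items())}
        return type(value).__name__
    return hashlib.sha256(json.dumps(shape(d)).encode()).hexdigest()

a = {'lr': 0.1, 'epochs': 100}
b = {'lr': 0.01, 'epochs': 200}   # different values, same structure
c = {'lr': 0.1, 'epochs': '100'}  # epochs is a str, so the structure differs
print(structural_hash(a) == structural_hash(b))  # True
print(structural_hash(a) == structural_hash(c))  # False
```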
### Config Chaining

Declare dependencies with chain_with to auto-apply prerequisite views.

```python
@scope.observe()
def base_setup(config):
    config.project_name = 'my_project'
    config.data_dir = '/data'

@scope.observe(chain_with='base_setup')  # base_setup applied first
def advanced_training(config):
    config.distributed = True

@scope.observe(chain_with=['base_setup', 'gpu_setup'])  # Multiple dependencies
def multi_node_training(config):
    config.nodes = 4
```

```shell
python train.py advanced_training
# Result: base_setup → advanced_training
```

### Lazy Evaluation

Views with lazy=True are executed after CLI arguments are applied.

```python
@scope.observe()
def base_config(config):
    config.dataset = 'imagenet'

@scope.observe(lazy=True)  # Executed after CLI args
def computed_config(config):
    if config.dataset == 'imagenet':
        config.num_classes = 1000
    elif config.dataset == 'cifar10':
        config.num_classes = 10
```

```shell
python train.py dataset:=cifar10 computed_config
# Result: num_classes=10
```
Python 3.11+ Context Manager:

```python
@scope.observe()
def my_config(config):
    config.model = 'resnet50'
    config.num_layers = 50
    with Scope.lazy():  # Executed after CLI
        if config.model == 'resnet101':
            config.num_layers = 101
```

### MultiScope: Namespace Isolation

Manage completely separate configuration namespaces.

```python
from ato.scope import Scope, MultiScope

model_scope = Scope(name='model')
data_scope = Scope(name='data')
scope = MultiScope(model_scope, data_scope)

@model_scope.observe(default=True)
def model_config(model):
    model.backbone = 'resnet50'
    model.lr = 0.1  # Model learning rate

@data_scope.observe(default=True)
def data_config(data):
    data.dataset = 'cifar10'
    data.lr = 0.001  # Data augmentation LR (no conflict!)

@scope
def train(model, data):  # Parameter names match scope names
    print(f'Model LR: {model.lr}, Data LR: {data.lr}')
```

```shell
python train.py model.backbone:=resnet101 data.dataset:=imagenet
```
## CLI Syntax

### Basic Syntax

| Type | Syntax | Example |
|---|---|---|
| Apply view | `view_name` | `python train.py high_lr long_run` |
| Python expression | `key=value` | `lr=0.01`, `layers=[1,2,3]`, `enable=True` |
| String literal | `key:=value` | `model:=resnet50`, `name:="Hello World"` |
| Nested key | `a.b.c=value` | `model.backbone:=resnet101` |

### String Assignment (`:=` Syntax)

Unlike Python expressions (`=`), `:=` assigns values directly as strings.

```shell
# Simple string without quotes
python train.py model:=resnet50

# String with spaces requires quotes
python train.py prompt:="Hello World"
python train.py prompt:='Hello World'

# Mixed usage
python train.py lr=0.01 layers=[1,2,3] name:=experiment_1
```

### MultiScope CLI

```shell
python train.py model.backbone:=resnet101 data.batch_size=64
```
## Fingerprinting (Reproducibility)

### Code Fingerprinting

Tracks logic changes, ignoring comments, whitespace, and variable names.

```python
@scope.trace(trace_id='train_step')
@scope
def train_v1(config):
    loss = model(data)
    return loss

@scope.trace(trace_id='train_step')
@scope
def train_v2(config):
    # Added comment
    loss = model(data)  # Same logic
    return loss

# train_v1 and train_v2 have identical fingerprints (same logic)
```

When the fingerprint changes:

```python
@scope.trace(trace_id='train_step')
@scope
def train_v3(config):
    loss = model(data) * 2  # Logic changed!
    return loss
```
### Runtime Fingerprinting

Tracks actual function outputs.

```python
import numpy as np

# Basic: Track full output
@scope.runtime_trace(trace_id='predictions')
@scope
def evaluate(model, data):
    return model.predict(data)

# init_fn: Fix randomness
@scope.runtime_trace(
    trace_id='predictions',
    init_fn=lambda: np.random.seed(42)
)
@scope
def evaluate_with_dropout(model, data):
    return model.predict(data)

# inspect_fn: Track specific parts
@scope.runtime_trace(
    trace_id='predictions',
    inspect_fn=lambda preds: preds[:100]  # Only first 100
)
@scope
def evaluate_large_output(model, data):
    return model.predict(data)
```

When to use:
- Code fingerprinting: Track code changes, verify refactoring
- Runtime fingerprinting: Detect non-determinism, debug silent failures
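The core idea behind comment- and whitespace-insensitive fingerprints can be sketched in plain Python (an illustration only, not Ato's actual algorithm, and this simplified version does not normalize variable names): parse the source to an AST, dump it to a canonical string, and hash that.

```python
import ast
import hashlib

# Illustrative sketch, not Ato's real fingerprinting: hashing the AST dump
# ignores comments and whitespace, so only logic changes alter the hash.
def code_fingerprint(source):
    tree = ast.parse(source)
    return hashlib.sha256(ast.dump(tree).encode()).hexdigest()

v1 = "def step(x):\n    loss = x * 2\n    return loss\n"
v2 = "def step(x):\n    # added a comment\n    loss = x * 2  # same logic\n    return loss\n"
v3 = "def step(x):\n    loss = x * 4\n    return loss\n"

print(code_fingerprint(v1) == code_fingerprint(v2))  # True: comments don't count
print(code_fingerprint(v1) == code_fingerprint(v3))  # False: logic changed
```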
## SQL Tracker (Local Experiment Tracking)

Lightweight experiment tracking with SQLite. No server required.

### Logging

```python
from ato.db_routers.sql.manager import SQLLogger
from ato.adict import ADict

config = ADict(
    experiment=ADict(
        project_name='image_classification',
        sql=ADict(db_path='sqlite:///experiments.db')
    ),
    lr=0.001,
    batch_size=32
)

logger = SQLLogger(config)
run_id = logger.run(tags=['baseline', 'resnet50'])

for epoch in range(100):
    loss = train_one_epoch()
    acc = validate()
    logger.log_metric('train_loss', loss, step=epoch)
    logger.log_metric('val_accuracy', acc, step=epoch)

logger.log_artifact(run_id, 'checkpoints/model_best.pt', data_type='model')
logger.finish(status='completed')
```

### Querying

```python
from ato.db_routers.sql.manager import SQLFinder

finder = SQLFinder(config)

# Get all runs
runs = finder.get_runs_in_project('image_classification')

# Find best run
best_run = finder.find_best_run(
    project_name='image_classification',
    metric_key='val_accuracy',
    mode='max'
)

# Find similar experiments (same config structure)
similar = finder.find_similar_runs(run_id=123)
```
## Hyperparameter Optimization

Built-in Hyperband algorithm with successive halving.

```python
from ato.adict import ADict
from ato.hyperopt.hyperband import HyperBand
from ato.scope import Scope

scope = Scope()

search_spaces = ADict(
    lr=ADict(
        param_type='FLOAT',
        param_range=(1e-5, 1e-1),
        num_samples=20,
        space_type='LOG'
    ),
    batch_size=ADict(
        param_type='INTEGER',
        param_range=(16, 128),
        num_samples=5,
        space_type='LOG'
    ),
    model=ADict(
        param_type='CATEGORY',
        categories=['resnet50', 'resnet101', 'efficientnet_b0']
    )
)

hyperband = HyperBand(
    scope,
    search_spaces,
    halving_rate=0.3,
    num_min_samples=3,
    mode='max'
)

@hyperband.main
def train(config):
    model = create_model(config.model)
    optimizer = Adam(lr=config.lr)
    val_acc = train_and_evaluate(model, optimizer)
    return val_acc

if __name__ == '__main__':
    best_result = train()
    print(f'Best config: {best_result.config}')
    print(f'Best metric: {best_result.metric}')
```
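Successive halving itself is easy to sketch in plain Python (an illustration only, not the HyperBand class above): score every candidate on a small budget, keep the top fraction, and repeat with a larger budget until few enough survivors remain.

```python
# Illustrative sketch of successive halving (not Ato's HyperBand class):
# keep the top `halving_rate` fraction of configs each round, stopping
# once at most `num_min_samples` survivors remain.
def successive_halving(configs, evaluate, halving_rate=0.3, num_min_samples=3):
    budget = 1
    while len(configs) > num_min_samples:
        scored = sorted(configs, key=lambda c: evaluate(c, budget), reverse=True)
        keep = max(num_min_samples, int(len(scored) * halving_rate))
        configs = scored[:keep]
        budget *= 2  # survivors get a larger training budget next round
    return configs

# Toy objective: the closer lr is to 0.1 the better; budget is ignored here.
candidates = [{'lr': lr} for lr in (0.001, 0.003, 0.01, 0.03, 0.1, 0.3)]
survivors = successive_halving(candidates, lambda c, b: -abs(c['lr'] - 0.1))
print(survivors)
```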
## FAQ

### Does Ato replace Hydra?

No. Hydra focuses on hierarchical composition and overrides; Ato focuses on priority-based reasoning and causality tracking. Use them together or separately.

### Does Ato conflict with MLflow/W&B?

No. MLflow/W&B provide dashboards and cloud tracking; Ato provides local causality tracking (config reasoning + code fingerprinting). Use them together: MLflow/W&B for metrics/dashboards, Ato for "why did this change?"

### Do I need a server?

No. Ato uses local SQLite. Zero setup, zero network calls.

### Can I use Ato with existing config files?

Yes. Ato is format-agnostic:
- Load YAML/JSON/TOML → Ato fingerprints the result
- Use argparse → Ato integrates seamlessly
- Import OpenMMLab configs → `_base_` inheritance is handled

### What's the performance overhead?

Minimal:
- Config fingerprinting: microseconds
- Code fingerprinting: once at decoration time
- Runtime fingerprinting: depends on `inspect_fn` complexity
- SQLite logging: milliseconds per metric
## Requirements

- Python >= 3.7 (lazy evaluation requires Python >= 3.8; the `Scope.lazy()` context manager requires Python >= 3.11)
- SQLAlchemy (for SQL Tracker)
- PyYAML, toml (for config serialization)

See pyproject.toml for full dependencies.

## Contributing

Contributions are welcome! Submit issues or pull requests.

```shell
git clone https://github.com/Dirac-Robot/ato.git
cd ato
pip install -e .
```

### Running Tests

```shell
python -m pytest unit_tests/
```

## License

MIT License