Run reproducible experiments from YAML configuration files
Expyrun
Run fully reproducible experiments from YAML configuration files.
Expyrun is a command-line tool that launches your code from a YAML configuration file and automatically stores everything required to reproduce the run in a dedicated output directory.
It helps you:
- Centralize experiment configuration
- Track code and dependency versions
- Reproduce experiments exactly
- Organize outputs cleanly
Project Status
This library was originally developed to fit my own needs as a
researcher.
Its design and implementation are therefore somewhat opinionated and
tailored toward research workflows.
Expyrun is currently in beta.
Contributions are very welcome!
Do not hesitate to open an issue if you encounter a bug, have a
suggestion, or would like to discuss improvements.
Features
- YAML-based configuration
- Configuration inheritance
- Environment variable resolution (${MY_VAR})
- Self-referencing config values (e.g., experiment names based on hyperparameters)
- Automatic experiment directory creation
- Frozen requirements.txt snapshot
- Source code snapshot
- Automatic stdout/stderr logging
- Command-line hyperparameter overrides
[!WARNING] Current limitation: lists of objects are not yet supported in the configuration file.
Installation
Install with pip
pip install expyrun
Install from source
git clone https://github.com/raphaelreme/expyrun.git
cd expyrun
pip install .
Getting Started
Expyrun is a command-line tool. Once installed:
expyrun -h # Display Expyrun help
expyrun path/to/config.yml # Run the experiments described by the YAML configuration
expyrun path/to/config.yml --debug # Run in a debug-specific folder and using the original code without duplication
1. Create an entry point
Your code must expose a function with the following signature:
def entry_point(name: str, config: dict) -> None:
    ...
- name: the experiment name
- config: the parsed configuration dictionary
Expyrun will import and execute this function.
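A minimal sketch of such an entry point is shown here. The `seed` and `device` keys are illustrative assumptions about what your configuration contains, and the function body is only an example of what an experiment might do with them.

```python
import random


def entry_point(name: str, config: dict) -> None:
    """Hypothetical entry point, called with the experiment name and the parsed config."""
    # `seed` and `device` are illustrative keys; use whatever your YAML file defines.
    seed = config.get("seed", 0)
    device = config.get("device", "cpu")

    random.seed(seed)  # Seed every source of randomness your code relies on
    print(f"Running {name} on {device} with seed {seed}")


if __name__ == "__main__":
    # Direct call for a quick local check, without going through Expyrun
    entry_point("my_experiment", {"seed": 666, "device": "cpu"})
```

Keeping the function callable on its own, as in the last lines, makes it easy to check the code before launching it through Expyrun.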
2. Minimal configuration file
__run__:
  __main__: package.module:entry_point
  __output_dir__: /path/to/output_dir
  __name__: my_experiment

# Additional configuration passed to your function
# seed: 666
# data: /path/to/data
# device: cuda
__run__ section fields
| Key | Required | Description |
|---|---|---|
| __main__ | Yes | Entry point in the form package.module:function |
| __output_dir__ | Yes | Base directory where experiments are stored |
| __name__ | Yes | Experiment name (used to build output path) |
| __code__ | No | Optional path to the source code |
By default, Expyrun searches for your package in the current working
directory.
You can override this using __code__.
[!NOTE] As of now, Expyrun only duplicates the package of the __main__ entry point, which is searched for inside the __code__ folder. Consequently, all of your code should be contained in a single package (which may consist of multiple subpackages).
What Expyrun Generates
For each run, Expyrun creates:
{output_dir}/{name}/exp.{i}/ # If run without --debug (default)
{output_dir}/DEBUG/{name}/exp.{i} # if run with --debug
Inside:
- config.yml: parsed configuration
- raw_config.yml: original configuration
- frozen_requirements.txt: environment snapshot
- outputs.log: stdout/stderr log
- A copy of your source code package
From inside your entry function, the working directory is automatically
set to the experiment folder.
You can safely write outputs (models, logs, metrics, etc.) directly to
the current directory.
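Because the working directory is the experiment folder, relative paths are enough to store results. The sketch below illustrates this; the `save_metrics` helper and the `metrics.json` file name are assumptions made for the example, not part of Expyrun.

```python
import json
from pathlib import Path


def save_metrics(metrics: dict, filename: str = "metrics.json") -> Path:
    """Write metrics relative to the current working directory."""
    # Under Expyrun, the current directory is the experiment folder,
    # so the file ends up next to config.yml and outputs.log.
    path = Path.cwd() / filename
    path.write_text(json.dumps(metrics, indent=2))
    return path
```

Called from inside the entry point, `save_metrics({"accuracy": 0.9})` would write the file into the current `exp.{i}` folder.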
[!NOTE] Expyrun does not copy external dependencies such as datasets (usually too heavy). You are responsible for keeping data paths valid when reproducing experiments.
Configuration File Format
Expyrun reserves three special sections in YAML files.
__default__
Inherit configuration from other YAML files.
__default__: path/to/base.yml
Or:
__default__:
- base.yml
- other.yml
Paths may be:
- Absolute: /path/to/file.yml
- Relative to CWD: path/to/file.yml
- Relative to the config file: ./path/to/file.yml
This allows you to build modular experiment configurations.
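Conceptually, inheritance amounts to merging the child configuration on top of its parents. The following sketch shows one common way to do this with a recursive dictionary merge. It is an illustration only: `merge_configs` is a hypothetical helper, and Expyrun's actual merge rules (for example, precedence among several parents) may differ.

```python
def merge_configs(parent: dict, child: dict) -> dict:
    """Return a new dict where values from `child` override those from `parent`."""
    merged = dict(parent)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge them key by key
            merged[key] = merge_configs(merged[key], value)
        else:
            # Scalars, lists and new keys simply replace the parent value
            merged[key] = value
    return merged


base = {"seed": 666, "training": {"lr": 0.0001, "epochs": 50}}
override = {"training": {"lr": 0.001}}
print(merge_configs(base, override))
# {'seed': 666, 'training': {'lr': 0.001, 'epochs': 50}}
```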
__new_key_policy__
Defines how new keys are handled when inheriting.
Options:
- "raise": Error
- "warn": Warning (default)
- "pass": Silently accept
A new key is one that is not defined in any parent config.
[!NOTE] This does not apply to a base configuration (with no parent).
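To make the three policies concrete, here is a rough sketch of how new keys could be detected and handled. The helper names, the dotted-path reporting, and the exception types are assumptions chosen for the example; they do not describe Expyrun's internals.

```python
import warnings


def find_new_keys(parent: dict, child: dict, prefix: str = "") -> list:
    """List dotted paths of keys present in `child` but absent from `parent`."""
    new_keys = []
    for key, value in child.items():
        path = f"{prefix}{key}"
        if key not in parent:
            new_keys.append(path)
        elif isinstance(value, dict) and isinstance(parent[key], dict):
            new_keys.extend(find_new_keys(parent[key], value, prefix=f"{path}."))
    return new_keys


def apply_new_key_policy(new_keys: list, policy: str = "warn") -> None:
    """Handle new keys according to the policy: "raise", "warn" or "pass"."""
    if not new_keys or policy == "pass":
        return
    message = f"New keys not defined in any parent config: {new_keys}"
    if policy == "raise":
        raise KeyError(message)  # Exception type chosen for illustration
    if policy == "warn":
        warnings.warn(message)
        return
    raise ValueError(f"Unknown policy: {policy}")
```

For instance, `find_new_keys({"seed": 666}, {"seed": 1, "training_exp": 0})` returns `["training_exp"]`, which the chosen policy then accepts, reports, or rejects.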
__run__
Defines how the experiment should be executed.
__run__:
  __main__: package.module:function
  __name__: experiment_name
  __output_dir__: /base/output/path
  __code__: optional/path/to/code
User-defined configuration
Any parameters that your experiment needs to run. For example:
seed: 666
training:
  lr: 0.0001
  epochs: 50
datasets:
  - Cifar10
  - Cifar100
  - ImageNet
Concrete Example
[!TIP] See the example/ directory in the repository for a minimal working example.
Project structure
my_project/
├── data/
├── src/
│   ├── __init__.py
│   ├── utils.py
│   ├── data.py
│   ├── methods.py
│   └── experiments/
│       ├── __init__.py
│       ├── train.py
│       └── eval.py
└── configs/
    ├── data.yml
    ├── methods.yml
    └── experiments/
        ├── common.yml
        ├── train.yml
        └── eval.yml
data.yml
data:
  location: $DATA_FOLDER
  train_size: 0.7
methods.yml
ResNet:
  layers: 50
  epochs: 200
  lr: 0.001
ViT:
  epochs: 30
  lr: 0.0005
  patch_size: 16
common.yml
seed: 666
device: cuda
train.yml
__default__:
  - ../data.yml
  - ../methods.yml
  - ./common.yml
__run__:
  __main__: src.experiments.train:main
  __output_dir__: $OUTPUT_DIR
  __name__: training/{seed}  # Name can depend on the seed
eval.yml
__new_key_policy__: pass  # Allow new keys
__default__: ./train.yml  # Inherit from train and therefore from common, data and methods
__run__:
  __main__: src.experiments.eval:main
  __name__: evaluation/{seed}
training_exp: 0  # Id of the training exp to reload
training_folder: $OUTPUT_DIR/training/{seed}/exp.{training_exp}/
Running Experiments
From the root of my_project:
# Set up the required env variables (could be inside ~/.bashrc)
export OUTPUT_DIR=/path/to/output
export DATA_FOLDER=/path/to/data
# Then run expyrun
expyrun configs/experiments/train.yml
With debug mode:
expyrun configs/experiments/train.yml --debug
Override parameters from the CLI:
expyrun configs/experiments/eval.yml --training_exp 3
Output Structure Example
After running, you typically get:
$OUTPUT_DIR/
├── training/
│   └── 666/
│       └── exp.0/
│           ├── config.yml
│           ├── raw_config.yml
│           ├── frozen_requirements.txt
│           ├── outputs.log
│           ├── src/
│           └── checkpoints/
│               ├── ViT.ckpt
│               └── ResNet.ckpt
└── evaluation/
    └── 666/
        └── exp.0/
            └── ...
Reproducing Experiments
Exact reproduction
# Reproduces this previous experiment in the next available exp.{i} folder
expyrun $OUTPUT_DIR/training/666/exp.0/config.yml
Modify hyperparameters
expyrun $OUTPUT_DIR/training/666/exp.0/raw_config.yml --ResNet.lr 0.005 --seed 111
- config.yml: parsed, fixed configuration
- raw_config.yml: original configuration, recommended when modifying parameters. If you change a hyperparameter that affects the experiment name (e.g., seed), the directory will automatically adapt.
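An override such as `--ResNet.lr 0.005` addresses a nested key through a dotted path. The sketch below shows how a dotted key can be applied to a nested dictionary. `set_by_dotted_key` is a hypothetical helper written for this illustration; how Expyrun parses and types command-line values is not shown here.

```python
def set_by_dotted_key(config: dict, dotted_key: str, value) -> None:
    """Set `value` in a nested dict, following a dotted path such as "ResNet.lr"."""
    *parents, last = dotted_key.split(".")
    node = config
    for part in parents:
        # Walk down the sections, creating missing ones along the way
        node = node.setdefault(part, {})
    node[last] = value


config = {"seed": 666, "ResNet": {"layers": 50, "lr": 0.001}}
set_by_dotted_key(config, "ResNet.lr", 0.005)
set_by_dotted_key(config, "seed", 111)
print(config)
# {'seed': 111, 'ResNet': {'layers': 50, 'lr': 0.005}}
```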
Parsing Variables
Expyrun resolves environment variables inside YAML, as well as self references:
data_path: $DATA_FOLDER
dataset: ${DATASET}_raw
seed: 555
output_path: $OUTPUT_FOLDER/{seed}
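The two kinds of substitution can be approximated with the standard library, as in the sketch below. It is a simplified illustration under stated assumptions: the `resolve` helper is hypothetical, it only handles top-level keys, and Expyrun's own resolution (for example, its handling of unset variables) may behave differently.

```python
import os


def resolve(value: str, config: dict) -> str:
    """Expand $VAR / ${VAR} from the environment, then {key} from the config."""
    expanded = os.path.expandvars(value)  # Unset variables are left unchanged
    return expanded.format_map(config)  # Self-references such as {seed}


os.environ["OUTPUT_FOLDER"] = "/tmp/outputs"  # Illustrative value
config = {"seed": 555}
print(resolve("$OUTPUT_FOLDER/{seed}", config))
# /tmp/outputs/555
```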
Environment Variables
Expyrun defines the following variables:
EXPYRUN_CWD
The original working directory from which Expyrun was launched.
This can be useful if your code needs to know where execution started before Expyrun switches to the experiment directory.
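For example, a relative path given by the user can be resolved against the launch directory rather than the experiment folder. In this sketch, `resolve_from_launch_dir` is a hypothetical helper, and the fallback to the current directory is an assumption made so the code also works outside Expyrun.

```python
import os
from pathlib import Path


def resolve_from_launch_dir(relative_path: str) -> Path:
    """Resolve a path against the directory Expyrun was launched from."""
    # Fall back to the current directory when the code runs outside Expyrun
    launch_dir = Path(os.environ.get("EXPYRUN_CWD", os.getcwd()))
    return launch_dir / relative_path
```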
Tips
- Consider using dataclasses and dacite to convert configuration dictionaries into strongly-typed Python objects.
- Keep datasets versioned or documented externally for full reproducibility.
- Use inheritance (__default__) to build clean experiment hierarchies.
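As an illustration of the first tip, the sketch below converts a configuration dictionary into dataclasses using only the standard library. The class names and fields are assumptions based on the example configuration above; dacite can automate the nested conversion that is written by hand here.

```python
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    lr: float
    epochs: int


@dataclass
class ExperimentConfig:
    seed: int
    training: TrainingConfig

    @classmethod
    def from_dict(cls, config: dict) -> "ExperimentConfig":
        # Build nested dataclasses by hand; dacite automates this step
        return cls(seed=config["seed"], training=TrainingConfig(**config["training"]))


config = {"seed": 666, "training": {"lr": 0.0001, "epochs": 50}}
typed = ExperimentConfig.from_dict(config)
print(typed.training.lr)
# 0.0001
```

A misspelled or missing key then fails early, when the configuration is loaded, instead of deep inside the experiment.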
License
MIT License