Declarative machine learning experiments.
dmlx
Declarative Machine Learning eXperiments
Introduction
dmlx is a declarative framework for machine learning (ML) experiments.
Typically, ML codebases use the standard Python library `argparse` to parse
parameters from the command line and pass them deep into the models and
other components. dmlx standardizes this process and provides an elegant
framework for experiment declaration and basic management, with the
following main features:
- Declarative Experiment Components: Declarative interfaces are provided for defining reusable and reproducible experiment components and hyperparameters, such as model path, dataset getter and random seed.
- click-powered Command Line Interface: `click` is integrated to provide powerful command line functionality, including parameter properties.
- Automatic Parameter Collection: Parameter properties are wired to command line inputs and collected for experiment reproducibility.
- Experiment Archive Management: Archive directories are created automatically to hold experiment data for further analysis.
- ML Framework Independent: `dmlx` is independent of ML frameworks, so you can use whichever ML framework you like (PyTorch/TensorFlow/scikit-learn/...).
Example
An example ML codebase using dmlx is illustrated below:
- my_innovative_approach/
  - model/
    - baseline.py
    - ours.py
  - dataset/
    - dataset_foo.py
    - dataset_bar.py
  - experiments/
    - ...
  - approach.py
  - train.py
  - analyze.py

- Firstly, models are defined as submodules of the `model` module, and dataset loaders are defined as submodules of the `dataset` module. These components should expect normal Python arguments; the component factories defined later using `component()` will parse the command line parameters and pass the resulting arguments to the real components.

  ```python
  # model/xxx.py

  class Model:
      def __init__(self, alpha: float, beta: float, ...) -> None:
          ...
  ```

  ```python
  # dataset/dataset_yyy.py

  def get_dataset_yyy(...):
      ...
  ```

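For illustration, a model component that expects only normal Python arguments could look like the sketch below. It is not taken from dmlx: the toy linear model, the default for `beta`, and the JSON-based `save`/`load` methods are all assumptions, chosen so that the class offers the `model(x)`, `model.save(...)` and `model.load(...)` calls used by the example scripts.

```python
import json
from pathlib import Path


class Model:
    """Toy linear model y = alpha * x + beta (illustrative, not part of dmlx)."""

    def __init__(self, alpha: float, beta: float = 0.0) -> None:
        # Plain keyword arguments: nothing here knows about the command line.
        self.alpha = alpha
        self.beta = beta

    def __call__(self, x: float) -> float:
        return self.alpha * x + self.beta

    def save(self, path) -> None:
        # Assumed storage format: a small JSON file holding the parameters.
        Path(path).write_text(json.dumps({"alpha": self.alpha, "beta": self.beta}))

    def load(self, path) -> None:
        params = json.loads(Path(path).read_text())
        self.alpha = params["alpha"]
        self.beta = params["beta"]
```

Because the class takes ordinary arguments, it can be constructed and tested directly, without any command line parsing involved.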
- Secondly, the components (models/datasets) and other parameters can be declared as properties on a composed approach using `dmlx`. The parameter properties, declared by `argument()` and `option()`, will define corresponding command line parameters and store them as instance attributes. The component properties, declared by `component()`, will create the actual component objects and store them as instance attributes.

  ```python
  # approach.py

  from dmlx.context import argument, option, component


  class Approach:
      model = component(
          argument("model_locator", default="ours"),  # click argument
          "model",  # module base
          "Model",  # default factory name
      )
      dataset = component(
          option("dataset_locator", "-d", "--dataset"),  # click option
          "dataset",  # module base
      )
      epochs = option("-e", "--epochs", type=int, default=800)  # click option

      def run(self):
          for epoch in range(self.epochs):
              for x, y_true in self.dataset:
                  y_pred = self.model(x)
                  yield x, y_true, y_pred
  ```

- Thirdly, `dmlx.experiment.Experiment` can be used to declare your experiment. The experiment object will create an underlying `click` command, and the experiment context will collect the parameters (`model_locator`, `dataset_locator` and `epochs`) and wire them to command line inputs.

  ```python
  # train.py

  from dmlx.experiment import Experiment

  experiment = Experiment()

  with experiment.context():
      from approach import Approach


  @experiment.main()
  def main(**args):
      experiment.init()
      approach = Approach()
      with (experiment.path / "train.log").open("w") as log_file:
          for x, y_true, y_pred in approach.run():
              # compute_metrics is your own metric function (not shown here).
              metrics = compute_metrics(y_pred, y_true)
              log_file.write(repr(metrics) + "\n")
      approach.model.save(experiment.path / "model.bin")


  experiment.run()
  ```

- Finally, you can invoke `train.py` from the command line to actually conduct the experiment. Component parameters accept string locators of the form `path.to.module[:factory_name][?[k_0=v_0][;k_n=v_n...]]`, with values parsed by `json.loads`.

  ```shell
  python train.py 'ours?alpha=0.1' \
      --dataset 'dataset_foo:get_dataset_foo?
          version = "2.0";
          shots = 5;
          # ...
      ' \
      --epochs 500
  ```

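To make the locator syntax concrete, here is a minimal sketch of how such a string could be split and turned into a component. This is not dmlx's implementation: `Locator`, `parse_locator` and `build_component` are hypothetical names, and the whitespace trimming, the skipping of empty `;` segments, and the joining of the module base and module path with a dot are assumptions made for the example.

```python
import importlib
import json
from typing import Any, Dict, NamedTuple, Optional


class Locator(NamedTuple):
    module: str              # e.g. "ours" or "path.to.module"
    factory: Optional[str]   # e.g. "get_dataset_foo", or None if omitted
    params: Dict[str, Any]   # keyword arguments decoded with json.loads


def parse_locator(locator: str) -> Locator:
    """Parse 'module[:factory][?k=v;k=v...]' (hypothetical helper, not dmlx API)."""
    # Split on the first "?" so that a ":" inside a JSON value is never
    # mistaken for the module/factory separator.
    head, _, query = locator.partition("?")
    module, _, factory = head.partition(":")
    params: Dict[str, Any] = {}
    # Limitation of this sketch: a ";" inside a JSON string value would be
    # treated as a separator.
    for item in query.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing ";" and blank segments
        key, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        params[key.strip()] = json.loads(raw)
    return Locator(module.strip(), factory.strip() or None, params)


def build_component(module_base: str, locator: str,
                    default_factory: Optional[str] = None):
    """Import '<module_base>.<module>' and call the factory with the parsed params."""
    parsed = parse_locator(locator)
    factory_name = parsed.factory or default_factory
    if factory_name is None:
        raise ValueError("no factory name given and no default available")
    # Assumption: the module base and the locator's module path join with ".".
    module = importlib.import_module(f"{module_base}.{parsed.module}")
    return getattr(module, factory_name)(**parsed.params)


print(parse_locator("ours?alpha=0.1"))
print(parse_locator('dataset_foo:get_dataset_foo? version = "2.0"; shots = 5; '))
```

With the standard library standing in for a project package, `build_component("json", "decoder:JSONDecoder?strict=false")` returns a `json.JSONDecoder` constructed with `strict=False`, which shows how JSON-decoded values become ordinary keyword arguments.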
- After calling `experiment.init()`, an experiment directory will be created in `experiments/`, to which `experiment.path` will point, and the experiment meta will be dumped into `meta.json` in that directory. Extra data can also be saved to the experiment directory, as shown in `train.py`, where a log file `train.log` holding epoch metrics and a model archive `model.bin` are created. This experiment archive can then be loaded to perform extensive inspections, such as visualization and further statistical analysis, where properties defined on `Approach` will be automatically restored:

  ```python
  # analyze.py

  from dmlx.experiment import Experiment

  experiment = Experiment()

  with experiment.context():
      from approach import Approach


  @experiment.main()
  def main(**args):
      print("Loaded args:", args)
      print("Loaded meta:", experiment.meta)
      approach = Approach()
      approach.model.load(experiment.path / "model.bin")
      # Now, `args`, `approach.model`, `approach.dataset` and other properties
      # are all restored, ready for extensive inspections.


  experiment.load("/path/to/the/experiment")
  ```

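Since the archive is an ordinary directory, its files can also be read with the standard library alone. The sketch below assumes the layout produced by the example `train.py`: a `meta.json` file plus a `train.log` with one `repr(metrics)` per line. It further assumes that the metrics are plain Python literals (for instance a dict of floats), so that `ast.literal_eval` can decode them. `read_archive` is a hypothetical helper, not part of dmlx, and the contents of `meta.json` are treated as opaque JSON.

```python
import ast
import json
from pathlib import Path


def read_archive(experiment_dir):
    """Return (meta, metrics) read from an experiment directory (illustrative)."""
    root = Path(experiment_dir)
    # meta.json is written by experiment.init(); its schema is not assumed here.
    meta = json.loads((root / "meta.json").read_text())
    # train.log holds one repr(metrics) per line, as written by train.py.
    with (root / "train.log").open() as log_file:
        metrics = [ast.literal_eval(line.strip()) for line in log_file if line.strip()]
    return meta, metrics
```

Inside `analyze.py`, something like `meta, metrics = read_archive(experiment.path)` would then hand the per-iteration metrics to whatever plotting or statistics code you prefer.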