Automates machine learning and other computer experiments
experitur
Automates machine learning and other computer experiments. Includes grid search and resuming aborted experiments. No lock-in, all your data is easily accessible in a text-based, machine-readable format.
Lab notebook
Every experiment is described in a lab book. This is a text file with a YAML header, e.g. a Markdown file or a YAML file without further content:
---
# In this part of the document called the "experiment section", enclosed by "---", you describe the experiment(s).
id: example
parameter_grid:
    parameter_1: [1,2,3]
    parameter_2: [a,b,c]
---
# An example experiment
In this part of the document, you can write down any content you like. Markdown files are allowed to contain a YAML header, so this could be Markdown.
Parameter grid
The core of an experiment is its parameter grid. It works like sklearn.model_selection.ParameterGrid. Each parameter has a list of values that it can take. The trials are generated from the cross product of these value lists, one trial per combination.
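The cross product described above can be sketched in plain Python. This is only an illustration of the idea, not experitur's own code; the helper name expand_grid is hypothetical, and sorting the parameter names is an assumption made here to get a stable order.

```python
from itertools import product


def expand_grid(parameter_grid):
    """Yield one parameter dict per combination of the listed values."""
    # Assumption: iterate over parameter names in sorted order.
    names = sorted(parameter_grid)
    for values in product(*(parameter_grid[name] for name in names)):
        yield dict(zip(names, values))


# Two parameters with two values each give 2 x 2 = 4 trials.
trials = list(expand_grid({"a": [1, 2], "b": ["a", "b"]}))
for trial in trials:
    print(trial)
```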
Run function
Each experiment has a run setting (unless it is an abstract experiment). It is a string pointing to a Python function, in the form <fully.qualified.name>:<function_name>. Upon execution, the function gets called with the working directory and the parameters of the current trial. It may return a result dictionary.
Signature: (working_directory: str, parameters: dict) -> dict
---
# example_labbook.md
id: example_experiment
run: "echo:run"
parameter_grid:
    a: [1,2]
    b: [a,b]
---
# echo.py
from pprint import pformat
def run(working_directory, parameters):
    print("working_directory:", working_directory)
    print("parameters:", pformat(parameters))
    return {}
Now, you can run the experiment:
$ experitur run example_labbook.md
Running example_labbook.md...
Independent parameters: ['a', 'b']
Trial a-1_b-a: 0%| | 0/4 [00:00<?, ?/s]
a: 1
b: a
working_directory: example_labbook/example_experiment/a-1_b-a
parameters: {'a': 1, 'b': 'a'}
Trial a-1_b-b: 50%|█████████████████████████████████████████████████████████████████▌ | 2/4 [00:01<00:01, 1.98/s]
a: 1
b: b
working_directory: example_labbook/example_experiment/a-1_b-b
parameters: {'a': 1, 'b': 'b'}
Trial a-2_b-a: 75%|██████████████████████████████████████████████████████████████████████████████████████████████████▎ | 3/4 [00:02<00:00, 1.52/s]
a: 2
b: a
working_directory: example_labbook/example_experiment/a-2_b-a
parameters: {'a': 2, 'b': 'a'}
Trial a-2_b-b: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:03<00:00, 1.31/s]
a: 2
b: b
working_directory: example_labbook/example_experiment/a-2_b-b
parameters: {'a': 2, 'b': 'b'}
Overall: 4.035s
a-1_b-a: 1.009s (25%)
a-1_b-b: 1.008s (24%)
a-2_b-a: 1.007s (24%)
a-2_b-b: 1.007s (24%)
As you can see, run was called four times, once for every combination of [1,2] x [a,b].
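A string such as "echo:run" can be turned into a callable with the standard library. The following is a minimal sketch of that lookup, assuming the part before the colon is an importable module path; resolve_run is a hypothetical helper, not part of experitur's API.

```python
from importlib import import_module


def resolve_run(spec):
    """Return the function named by a '<module>:<function_name>' string."""
    module_name, _, function_name = spec.partition(":")
    if not module_name or not function_name:
        raise ValueError("expected '<module>:<function_name>', got %r" % spec)
    # Import the module, then look the function up by name.
    return getattr(import_module(module_name), function_name)


# Demonstrated with a standard-library function instead of echo:run.
sqrt = resolve_run("math:sqrt")
print(sqrt(9.0))
```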
Multiple experiments
The experiment section can hold multiple experiments in a list:
---
- id: experiment_1
  parameter_grid:
    ...
- id: experiment_2
  parameter_grid:
    ...
---
Experiment inheritance
One experiment may inherit the settings of another, using the base property:
---
- id: experiment_1
  parameter_grid:
    a: [1, 2, 3]
- id: experiment_2
  base: experiment_1
  parameter_grid:
    b: [x, y, z]
  # In effect, experiment_2 also has a parameter 'a' that takes the values 1,2,3.
---
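The effect of base can be pictured as a dictionary merge. The sketch below is an assumption inferred from the comment in the example (the child's parameter grid is added to the base's grid, and other settings of the child override the base); resolve_inheritance is a hypothetical helper and experitur's actual merge rules may differ.

```python
from copy import deepcopy


def resolve_inheritance(experiments):
    """Return copies of the experiments with base settings merged in."""
    by_id = {experiment["id"]: experiment for experiment in experiments}

    def resolve(experiment):
        base_id = experiment.get("base")
        if base_id is None:
            return deepcopy(experiment)
        # Start from the (recursively resolved) base experiment.
        merged = resolve(by_id[base_id])
        for key, value in experiment.items():
            if key == "parameter_grid":
                # Assumption: grids are merged key by key.
                merged.setdefault("parameter_grid", {}).update(deepcopy(value))
            else:
                merged[key] = deepcopy(value)
        return merged

    return [resolve(experiment) for experiment in experiments]


resolved = resolve_inheritance([
    {"id": "experiment_1", "parameter_grid": {"a": [1, 2, 3]}},
    {"id": "experiment_2", "base": "experiment_1",
     "parameter_grid": {"b": ["x", "y", "z"]}},
])
print(resolved[1]["parameter_grid"])
```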
Parameter substitution
experitur includes a recursive parameter substitution engine. Each value string is treated as a recursive format string and is resolved using the whole parameter set of a trial.
---
id: parsub
run: "echo:run"
parameter_grid:
    a_1: [foo]
    a_2: [bar]
    a: ["{a_{b}}"]
    b: [1,2]
---
$ experitur run parsub.md
Running parsub.md...
Independent parameters: ['b']
Trial 0: b-1
0% (0/2) [ ] eta --:-- /
a: foo
a_1: foo
a_2: bar
b: 1
parsub/parsub/b-1
{'a': 'foo', 'a_1': 'foo', 'a_2': 'bar', 'b': 1}
Trial 1: b-2
50% (1/2) [####### ] eta --:-- -
a: bar
a_1: foo
a_2: bar
b: 2
parsub/parsub/b-2
{'a': 'bar', 'a_1': 'foo', 'a_2': 'bar', 'b': 2}
Overall: 0.002s
b-1: 0.000s (18%)
b-2: 0.000s (14%)
This way, you can easily run complicated setups with settings that depend on other settings.
Recursive format strings work like string.Formatter with two exceptions:
- Recursive field names: The field name itself may be a format string:
  format("{foo_{bar}}", bar="baz", foo_baz="foo") -> "foo"
- Literal output: If the format string consists solely of a replacement field and does not contain a format specification, no to-string conversion is performed:
  format("{}", 1) -> 1
  This allows the use of format strings for non-string values.
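The two exceptions can be illustrated with a small stand-alone function. This is a simplified sketch, not experitur's implementation: recursive_format is a hypothetical name, only named fields are handled, and format specifications and conversions are not supported.

```python
import re

# Matches an innermost replacement field, i.e. one without nested braces.
_INNERMOST = re.compile(r"\{([^{}]*)\}")


def recursive_format(template, parameters):
    """Resolve nested replacement fields from the inside out."""
    while True:
        match = _INNERMOST.search(template)
        if match is None:
            return template
        value = parameters[match.group(1)]
        if match.span() == (0, len(template)):
            # Literal output: the whole string is one field,
            # so the value is returned without str() conversion.
            return value
        template = template[:match.start()] + str(value) + template[match.end():]


parameters = {"a_1": "foo", "a_2": "bar", "b": 2}
print(recursive_format("{a_{b}}", parameters))  # inner {b} is resolved first
print(recursive_format("{b}", parameters))      # stays an int
```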
Files
When experitur executes a lab book, it creates the following file structure in the directory where the lab book is located:
/
+- lab_book.md
+- lab_book/
| +- experiment1_id/
| | +- trial1_id/
| | | +- experitur.yaml
| | ...
| ...
<lab_book_name>/<experiment_id>/<trial_id>/experitur.yaml contains the parameters and the results from a trial, e.g.:
experiment_id: example_experiment
parameters_post: {a: 1, b: a}
parameters_pre: {a: 1, b: a}
result: {}
success: true
time_end: 2019-01-31 13:50:51.003637
time_start: 2019-01-31 13:50:50.002264
trial_id: a-1_b-a
Most items should be self-explanatory. parameters_pre are the parameters passed to the run function, parameters_post are the parameters after the run function has had the chance to update them. trial_id is derived from the parameters that are varied in the parameter grid. This way, you can easily interpret the file structure.
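The naming scheme can be reproduced with a few lines of Python. The sketch below infers the <name>-<value> pattern joined by underscores from the example output (a-1_b-a, b-1); trial_id and trial_file are hypothetical helpers, and the exact rules experitur applies (ordering, escaping of special characters) are an assumption.

```python
from pathlib import PurePosixPath


def trial_id(parameters, varied):
    """Build an ID such as 'a-1_b-a' from the varied parameters."""
    return "_".join("%s-%s" % (name, parameters[name]) for name in sorted(varied))


def trial_file(lab_book, experiment_id, parameters, varied):
    """Return the expected location of experitur.yaml for one trial."""
    # The output directory is named after the lab book, minus its extension.
    root = PurePosixPath(lab_book).with_suffix("")
    return root / experiment_id / trial_id(parameters, varied) / "experitur.yaml"


path = trial_file("example_labbook.md", "example_experiment",
                  {"a": 1, "b": "a"}, ["a", "b"])
print(path)
```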
Collecting results
Use experitur collect <lab_book>.md to collect all the results (including parameters and metadata) of all trials of a lab book into a single CSV file located at <lab_book>/results.csv.
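Because the collected results are plain CSV, they can be read back with the standard library alone. A minimal sketch follows; load_results is a hypothetical helper, and the column names in the file depend on your experiment, so none are assumed here.

```python
import csv


def load_results(path):
    """Read a results CSV into a list of dicts, one per trial."""
    with open(path, newline="") as csv_file:
        # Each row maps column name -> value; all values are strings.
        return list(csv.DictReader(csv_file))
```

For the lab book above, this would be called as load_results("example_labbook/results.csv") after running experitur collect.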
Calling functions and default parameters
Your run function might call other functions that have default parameters. experitur gives you some utility functions that extract these default parameters and add them to the list of parameters.
- extract_parameters(prefix: str, parameters: dict) -> dict: Extract all parameters that start with prefix.

      extract_parameters("p_", {"p_a": 1, "p_b": 2}) == {"a": 1, "b": 2}

- apply_parameters(prefix: str, parameters: dict, callable_: callable, *args, **kwargs): Call callable_ with the parameters starting with prefix.

      apply_parameters("p_", {"p_a": 1, "p_b": 2}, fun, 10, c=20)
      # is the same as
      fun(10, a=1, b=2, c=20)

- set_default_parameters(prefix, parameters, [callable_,] **defaults): Set default values for parameters that were not set previously. Values in defaults override default parameters of callable_.

      def foo(a=1, b=2, c=3):
          pass

      set_default_parameters("foo_", parameters, foo, c=4)
      # is the same as
      parameters.setdefault("foo_a", 1)
      parameters.setdefault("foo_b", 2)
      parameters.setdefault("foo_c", 4)
It is a good idea to make extensive use of set_default_parameters and apply_parameters. This way, your result files always contain the full set of parameters.
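To make the described behaviour concrete, here is a plain-Python re-implementation sketch of the three helpers. These are illustrative stand-ins written from the descriptions above, not experitur's source; in particular, letting explicit keyword arguments win over prefixed parameters in apply_parameters is an assumption.

```python
import inspect


def extract_parameters(prefix, parameters):
    """Return the parameters starting with prefix, with the prefix removed."""
    return {name[len(prefix):]: value
            for name, value in parameters.items()
            if name.startswith(prefix)}


def apply_parameters(prefix, parameters, callable_, *args, **kwargs):
    """Call callable_ with the prefixed parameters as keyword arguments."""
    # Assumption: explicitly passed kwargs take precedence.
    merged = {**extract_parameters(prefix, parameters), **kwargs}
    return callable_(*args, **merged)


def set_default_parameters(prefix, parameters, callable_=None, **defaults):
    """Fill in parameters that are not set yet (modifies parameters in place)."""
    combined = {}
    if callable_ is not None:
        # Collect the default values declared in the function signature.
        for name, param in inspect.signature(callable_).parameters.items():
            if param.default is not inspect.Parameter.empty:
                combined[name] = param.default
    combined.update(defaults)  # explicit defaults override signature defaults
    for name, value in combined.items():
        parameters.setdefault(prefix + name, value)


def foo(a=1, b=2, c=3):
    return a, b, c


parameters = {"foo_a": 10}
set_default_parameters("foo_", parameters, foo, c=4)
print(parameters)
print(apply_parameters("foo_", parameters, foo))
```

Since set_default_parameters writes the defaults back into the parameter dict, they end up in the recorded parameters of the trial, which is the point of the recommendation above.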
For a simple example, see examples/str_split.md.
Installation
experitur is packaged on PyPI.
pip install experitur
Be warned that this package is currently under heavy development and anything might change at any time!
Compatibility
experitur is tested with Python 3.5, 3.6 and 3.7.