digitallab is a Python package for conducting large-scale computational experiments.
Digital Lab (digitallab)
Introduction
digitallab is a Python package for conducting large-scale computational experiments. It builds on the sacred framework and extends it with batches of experiments, repeated runs of an experiment with different seeds, and parallel execution. It also provides tools for evaluating experiments through plots and tables.
Dependencies
Python packages:
- numpy
- tqdm
- sacred
- pandas
- seaborn
- matplotlib
- pytables
- pick
Optional dependencies:
For using MongoDB:
- pymongo
- MongoDB database
For using TinyDB:
- tinydb (~=3.15.2)
- tinydb-serialization (~=1.0.4)
- hashfs
Installation
Via pip
Run
pip install --user digitallab
From source
Clone the project to your hard drive and run the command
python3 setup.py install --user
in the project folder.
Example
Conducting experiments
Assume we want to compare the run time and quality of three methods (fast, slow, special). fast and slow take the same arguments, while special requires an extra parameter.
We want to compare the methods on two instances, "A" and "B". The three methods are defined as follows:
import numpy as np


def slow(config):
    # Seed from the config so that every repetition is reproducible.
    np.random.seed(config["seed"])
    return_dict = dict()
    if config["instance"] == "A":
        # np.maximum clamps the sampled run time at zero
        # (np.max(x, 0) would treat 0 as an axis, not as a lower bound).
        return_dict["runtime"] = np.maximum(np.random.normal(1000, scale=300), 0)
        return_dict["value"] = np.random.normal(1, scale=0.5)
    else:
        return_dict["runtime"] = np.maximum(np.random.normal(10000, scale=300), 0)
        return_dict["value"] = np.random.normal(10, scale=0.5)
    return return_dict


def fast(config):
    np.random.seed(config["seed"])
    return_dict = dict()
    if config["instance"] == "A":
        return_dict["runtime"] = np.maximum(np.random.normal(200, scale=100), 0)
        return_dict["value"] = np.random.normal(2, scale=0.7)
    else:
        return_dict["runtime"] = np.maximum(np.random.normal(2000, scale=100), 0)
        return_dict["value"] = np.random.normal(20, scale=0.7)
    return return_dict


def special(config):
    np.random.seed(config["seed"])
    return_dict = dict()
    if config["instance"] == "A":
        return_dict["runtime"] = np.maximum(np.random.normal(500, scale=100), 0)
        # The extra parameter "scale" controls the spread of the result.
        return_dict["value"] = np.random.normal(1.5, scale=config["scale"])
    else:
        return_dict["runtime"] = np.maximum(np.random.normal(5000, scale=100), 0)
        return_dict["value"] = np.random.normal(15, scale=config["scale"])
    return return_dict
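Before involving digitallab at all, a method of this shape can be called directly with a hand-written config dictionary. The sketch below uses a hypothetical stand-in, toy_method, which is not part of digitallab and only mirrors the structure of the methods above. It illustrates why seeding from config["seed"] matters: the same config gives the same result, while a different seed gives different draws.

```python
import numpy as np


def toy_method(config):
    """Hypothetical stand-in with the same structure as slow/fast/special."""
    # Seeding from the config makes the result depend only on the config.
    np.random.seed(config["seed"])
    mean = 1000 if config["instance"] == "A" else 10000
    return {
        # Clamp at zero so a sampled run time can never be negative.
        "runtime": max(np.random.normal(mean, scale=300), 0),
        "value": np.random.normal(1, scale=0.5),
    }


config = {"instance": "A", "seed": 42}
first = toy_method(config)
second = toy_method(config)
other = toy_method({"instance": "A", "seed": 43})

print(first == second)  # compares two runs with the same seed
print(first == other)   # compares two runs with different seeds
```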
Then we can run the experiments. For the purpose of this example we use TinyDB. For real workloads, however, MongoDB is highly recommended and should be the preferred database for storing experimental results.
from dlab.lab import Lab
# create the lab
lab = Lab("example").add_tinydb_storage("example_db")
Next we define two dictionaries that describe our experiments. digitallab passes every possible combination of parameters to our experiment function. Each parameter combination is submitted as many times as the field number_of_repetitions specifies, each time with a different seed. The seed is added to each config under the field seed. Seeds are reproducible: if the results are deleted and the experiments repeated, the same seeds are assigned again.
Mandatory keys in a settings file are experiment, sub_experiment, version, and number_of_repetitions.
standard_setting = {
    "experiment": "test",
    "sub_experiment": "standard",
    "version": "1",
    "number_of_repetitions": 10,
    "method": ["slow", "fast"],
    "instance": ["A", "B"]
}
special_setting = {
    "experiment": "test",
    "sub_experiment": "special",
    "version": "1",
    "number_of_repetitions": 10,
    "method": "special",
    "scale": [0.1, 0.5, 1],
    "instance": ["A", "B"]
}
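To make the expansion described above concrete, here is a minimal sketch in plain Python of how such a setting could be multiplied out into individual configs. It is an illustration only and not digitallab's implementation: the function expand_setting, the constant META_KEYS, and the hash-based seeding scheme are all assumptions chosen to mimic the documented behavior (every parameter combination, repeated number_of_repetitions times, with seeds that stay the same across reruns).

```python
import hashlib
import itertools
import json

# Keys that describe the experiment itself rather than a parameter to vary.
META_KEYS = ("experiment", "sub_experiment", "version", "number_of_repetitions")


def expand_setting(setting):
    """Yield one config per parameter combination and repetition.

    Illustrative only: digitallab's real expansion and seeding may differ.
    """
    meta = {k: setting[k] for k in META_KEYS}
    # Treat scalars as one-element lists so every parameter can be multiplied out.
    params = {
        k: v if isinstance(v, list) else [v]
        for k, v in setting.items()
        if k not in META_KEYS
    }
    names = sorted(params)
    for values in itertools.product(*(params[n] for n in names)):
        combo = dict(zip(names, values))
        for repetition in range(setting["number_of_repetitions"]):
            # Assumed scheme: derive the seed from the config content, so
            # rerunning the same setting reproduces the same seeds.
            key = json.dumps({**meta, **combo, "repetition": repetition}, sort_keys=True)
            seed = int(hashlib.sha256(key.encode()).hexdigest(), 16) % 2**32
            yield {**meta, **combo, "seed": seed}


standard_setting = {
    "experiment": "test", "sub_experiment": "standard", "version": "1",
    "number_of_repetitions": 10,
    "method": ["slow", "fast"], "instance": ["A", "B"],
}
special_setting = {
    "experiment": "test", "sub_experiment": "special", "version": "1",
    "number_of_repetitions": 10,
    "method": "special", "scale": [0.1, 0.5, 1], "instance": ["A", "B"],
}

configs = list(expand_setting(standard_setting))
special_configs = list(expand_setting(special_setting))
print(len(configs))          # 2 methods * 2 instances * 10 repetitions = 40
print(len(special_configs))  # 1 method * 3 scales * 2 instances * 10 repetitions = 60
```

Under these assumptions the two settings together describe 100 individual runs, which gives a rough idea of how quickly a batch grows when more list-valued parameters are added.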
Finally, we can define our experiment function and run the experiments:
@lab.experiment
def main(_config):
    if _config["method"] == "fast":
        return fast(_config)
    elif _config["method"] == "slow":
        return slow(_config)
    elif _config["method"] == "special":
        return special(_config)
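Note that an if/elif chain of this kind returns None silently when the method field holds an unexpected name. One possible alternative, sketched here in plain Python without the @lab.experiment decorator, is a dispatch table that fails loudly. The stub functions and the names METHODS and run_method are placeholders for illustration and are not part of digitallab.

```python
def slow(config):
    return {"method": "slow"}  # stub standing in for the real method


def fast(config):
    return {"method": "fast"}  # stub standing in for the real method


def special(config):
    return {"method": "special"}  # stub standing in for the real method


# Map each method name used in the settings to the function implementing it.
METHODS = {"slow": slow, "fast": fast, "special": special}


def run_method(config):
    """Look up the configured method by name and call it with the config."""
    name = config["method"]
    if name not in METHODS:
        # Fail early instead of storing an empty result for a misspelled name.
        raise ValueError(f"unknown method: {name!r}")
    return METHODS[name](config)


print(run_method({"method": "fast"}))
```

Adding a fourth method then only requires a new entry in METHODS rather than another elif branch.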
Evaluating experiments
To be done...
ToDos
The project is a work in progress, and some tasks remain to be done:
- Documentation
- Examples
- Add support for SQL
- Faster caching!
- Experiments should not run if they do not have a matching experiment name
- UI (perhaps)