Digital Lab (digitallab)

Introduction

digitallab is a Python package for conducting large-scale computational experiments. The underlying framework is based on the sacred module. digitallab extends sacred's functionality by allowing batches of experiments, repetitions of experiments with different seeds, and parallel execution of experiments. It also provides tools to evaluate experiments via plots or tables.

Dependencies

Python packages:

numpy
tqdm
sacred
pandas
seaborn
matplotlib
pytables
pick

Optional dependencies:

For using MongoDB:

pymongo
MongoDB database

For using TinyDB:

tinydb (~=3.15.2)
tinydb-serialization (~=1.0.4)
hashfs
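
Because the storage backends are optional, it can help to check which ones are usable before creating a lab. The following is a minimal sketch that uses only the standard library. The helper name available_backends is hypothetical and not part of digitallab, and the import names tinydb_serialization and hashfs are assumed to correspond to the packages listed above.

```python
from importlib.util import find_spec

# Import names assumed for the optional packages listed above.
OPTIONAL_BACKENDS = {
    "MongoDB": ["pymongo"],
    "TinyDB": ["tinydb", "tinydb_serialization", "hashfs"],
}


def available_backends():
    """Report, per backend, whether all of its Python packages can be imported."""
    return {
        backend: all(find_spec(module) is not None for module in modules)
        for backend, modules in OPTIONAL_BACKENDS.items()
    }


print(available_backends())
```

The check only looks for installed packages. It does not compare versions, and it does not test whether a MongoDB server is reachable.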

Installation

Via pip

Run

pip install --user digitallab

From source

Clone the project to your hard drive and run the command

python3 setup.py install --user

in the project folder.

Example

Conducting experiments

Assume we want to compare the run times and quality of three methods (fast, slow, special). fast and slow take the same arguments, while special requires an extra parameter, scale. We want to compare the methods on two instances, "A" and "B". The three methods are defined as follows:

import numpy as np


def slow(config):
    # Seed NumPy's global generator so that each repetition is reproducible.
    np.random.seed(config["seed"])
    return_dict = dict()
    if config["instance"] == "A":
        # np.maximum clips negative draws to zero; np.max(x, 0) would treat 0 as an axis.
        return_dict["runtime"] = np.maximum(np.random.normal(1000, scale=300), 0)
        return_dict["value"] = np.random.normal(1, scale=0.5)
    else:
        return_dict["runtime"] = np.maximum(np.random.normal(10000, scale=300), 0)
        return_dict["value"] = np.random.normal(10, scale=0.5)
    return return_dict


def fast(config):
    np.random.seed(config["seed"])
    return_dict = dict()
    if config["instance"] == "A":
        return_dict["runtime"] = np.maximum(np.random.normal(200, scale=100), 0)
        return_dict["value"] = np.random.normal(2, scale=0.7)
    else:
        return_dict["runtime"] = np.maximum(np.random.normal(2000, scale=100), 0)
        return_dict["value"] = np.random.normal(20, scale=0.7)
    return return_dict


def special(config):
    np.random.seed(config["seed"])
    return_dict = dict()
    if config["instance"] == "A":
        return_dict["runtime"] = np.maximum(np.random.normal(500, scale=100), 0)
        # special is the only method that reads the extra "scale" parameter.
        return_dict["value"] = np.random.normal(1.5, scale=config["scale"])
    else:
        return_dict["runtime"] = np.maximum(np.random.normal(5000, scale=100), 0)
        return_dict["value"] = np.random.normal(15, scale=config["scale"])
    return return_dict
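
Before handing such methods to a lab, it can be useful to call one directly with a hand-written config and confirm that a fixed seed reproduces the result. The sketch below uses a hypothetical stand-in, toy_method, that follows the same pattern as the methods above. The config dictionary is written by hand here purely for illustration.

```python
import numpy as np


def toy_method(config):
    # Hypothetical stand-in that follows the same pattern as the methods above.
    np.random.seed(config["seed"])
    # Clip negative draws to zero, since a run time cannot be negative.
    runtime = float(max(np.random.normal(200, scale=100), 0))
    value = float(np.random.normal(2, scale=0.7))
    return {"runtime": runtime, "value": value}


# Hand-written config for a quick check outside of any lab.
config = {"instance": "A", "seed": 42}

first = toy_method(config)
second = toy_method(config)

# The same seed yields the same draws, so both results are identical.
print(first == second)
```

Because each call reseeds the global generator, the result depends only on the config and not on the order in which the methods are called.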

Then we can run the experiments. This example uses TinyDB; however, MongoDB is highly recommended and should be the preferred database for storing experimental results.

from dlab.lab import Lab

# create the lab
lab = Lab("example").add_tinydb_storage("example_db")

Next, we define two dictionaries that describe our experiments. digitallab passes every possible combination of the listed parameter values to our experiment function. Each combination is submitted as many times as specified by the field number_of_repetitions, each time with a different seed. The seed is added to each config as the field seed. The seeds are reproducible: if the results are deleted and the experiments are run again, the same seeds are used.

Mandatory keys in a settings file are experiment, sub_experiment, version, and number_of_repetitions.

standard_setting = {
    "experiment": "test",
    "sub_experiment": "standard",
    "version": "1",
    "number_of_repetitions": 10,
    "method": ["slow", "fast"],
    "instance": ["A", "B"]
}

special_setting = {
    "experiment": "test",
    "sub_experiment": "special",
    "version": "1",
    "number_of_repetitions": 10,
    "method": "special",
    "scale": [0.1, 0.5, 1],
    "instance": ["A", "B"]
}
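
To make the number of resulting runs concrete, the expansion of such a dictionary into individual configs can be sketched with itertools.product. This is an illustration only and not digitallab's implementation: the helper expand_setting is hypothetical, and using the repetition index as the seed is an assumption made for the example.

```python
from itertools import product


def expand_setting(setting):
    """Expand a settings dictionary into one config per run (illustrative only)."""
    repetitions = setting["number_of_repetitions"]
    # Wrap scalar values in a list so that every field takes part in the product.
    fields = {
        key: value if isinstance(value, list) else [value]
        for key, value in setting.items()
        if key != "number_of_repetitions"
    }
    configs = []
    for combination in product(*fields.values()):
        for repetition in range(repetitions):
            config = dict(zip(fields.keys(), combination))
            # Assumption: the repetition index doubles as the seed.
            config["seed"] = repetition
            configs.append(config)
    return configs


standard_setting = {
    "experiment": "test", "sub_experiment": "standard", "version": "1",
    "number_of_repetitions": 10,
    "method": ["slow", "fast"], "instance": ["A", "B"],
}
special_setting = {
    "experiment": "test", "sub_experiment": "special", "version": "1",
    "number_of_repetitions": 10,
    "method": "special", "scale": [0.1, 0.5, 1], "instance": ["A", "B"],
}

# 2 methods x 2 instances x 10 repetitions
print(len(expand_setting(standard_setting)))  # 40
# 1 method x 3 scales x 2 instances x 10 repetitions
print(len(expand_setting(special_setting)))   # 60
```

Under these assumptions, the two settings above describe 100 runs in total.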

Finally, we can define our experiment function and run the experiments:

@lab.experiment
def main(_config):
    if _config["method"] == "fast":
        return fast(_config)
    elif _config["method"] == "slow":
        return slow(_config)
    elif _config["method"] == "special":
        return special(_config)
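
Note that the if/elif chain returns None when the method field matches none of the branches. One possible alternative is a lookup table that fails loudly on an unknown method name. The sketch below leaves out the @lab.experiment decorator so that it runs on its own; the stub functions and the names METHODS and dispatch are hypothetical.

```python
# Stubs standing in for the three methods defined earlier.
def fast(config):
    return {"method": "fast"}


def slow(config):
    return {"method": "slow"}


def special(config):
    return {"method": "special"}


# Map each method name to the function that implements it.
METHODS = {"fast": fast, "slow": slow, "special": special}


def dispatch(config):
    """Call the function registered for config["method"], or raise ValueError."""
    try:
        method = METHODS[config["method"]]
    except KeyError:
        raise ValueError(f"unknown method: {config.get('method')!r}") from None
    return method(config)


print(dispatch({"method": "fast"}))
```

With this layout, adding a fourth method means adding one entry to METHODS rather than another branch.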

Evaluating experiments

To be done...

ToDos

The project is a work in progress, and some tasks remain:

Documentation
Examples
Add support for SQL
Faster caching!
Experiments should not run if they do not have a matching experiment name
UI (perhaps)

Release history

1.4.3.3 (this version): Dec 12, 2021
1.4.3.2: Aug 12, 2021
1.4.3.1: Aug 12, 2021
1.4.3.0: Jul 5, 2021
1.4.2.0: Jun 14, 2021
1.4.1.0: Jun 9, 2021
1.4.0.0 (yanked, reason: unstable): Jun 4, 2021
1.3.1.1: Jun 4, 2021
1.3.1.0: May 21, 2021
1.3.0.6: May 14, 2021
1.3.0.4: May 14, 2021
1.3.0.3: May 10, 2021
1.3.0.2: May 4, 2021
1.3.0.1: Apr 28, 2021
1.3.0.0: Apr 28, 2021
Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

digitallab-1.4.3.3-py3-none-any.whl (67.9 kB), uploaded Dec 12, 2021, Python 3

Hashes for digitallab-1.4.3.3-py3-none-any.whl

Algorithm	Hash digest
SHA256	`612555afe5f41ea16a65ea0cb8f6dee4438ed6fd55043db6ba5716b63c106825`
MD5	`5b76a3aafb9bc19cb626a518f012ca32`
BLAKE2b-256	`9e3517052aa509722cae5c150718679097007bda91ef01e0003fd35f28b6c348`