A framework for running computational experiments, powered by marimo notebooks

expd: experiment design

This package is powered by a collection of marimo notebooks.

expd is a command-line and GUI utility for running reproducible computational experiments, with independent trials optionally executed in parallel.

Quickstart. Install expd with

pip install expd

  1. Create a project:

expd init my_project && cd my_project

  2. Fill out the template marimo notebook to implement the trial function:

marimo edit src/trial.py

  3. Run an experiment by executing the trial function across varying inputs:

expd run

  4. View past experiment results:

expd view

CLI

Commands. The expd CLI comes with four commands. Each command is accompanied by a marimo notebook.

command        description
expd init      initialize project structure, creating a marimo notebook trial.py
expd explore   interactively explore the trial function implemented in trial.py
expd run       design and run an experiment based on trial.py
expd view      view past expd run results

Projects

Running expd init my_project creates a directory my_project with the following structure:

my_project/
├── outputs
└── src
    └── trial.py

Defining a trial

The file trial.py is a marimo notebook that implements the core logic of your experiments. It starts as a template, which you'll need to fill out.

Types. In the cell named types, you'll define the inputs to and outputs of each trial as dataclasses.

Each input must have a type annotation. The supported scalar annotations are:

  • int
  • float
  • bool

Numerical (NumPy) arrays are also supported with the annotation

  • np.ndarray

In addition to numerical values, arbitrary categorical values are supported. These can be single values or lists of arbitrary Python objects, selected from a given finite set. For example, define a string representing a filename with

from dataclasses import dataclass
import expd as ex

# list_of_files is assumed to be a list of filename strings defined earlier in the notebook
@dataclass
class Inputs(ex.Model):
    file: ex.T[str, ex.Categorical(options=list_of_files)]

or a boolean vector with

from dataclasses import dataclass
import expd as ex

@dataclass
class Inputs(ex.Model):
    vector: ex.T[bool, ex.Categorical(options=[True, False], length=3)]

Make sure to see the examples for common use cases.

Trial function. In the cell named trial, you'll define the trial function which maps inputs to outputs.

Other expd commands will use these cells to run your experiment and visualize its results.
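As a rough illustration of the shape of a trial, here is a minimal sketch using plain dataclasses. It deliberately does not use ex.Model or the template generated by expd init, whose exact layout is not shown here. The field names (n, scale, square, total) and the body of trial are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Inputs:
    # Hypothetical inputs, one per supported scalar annotation.
    n: int         # number of terms to sum
    scale: float   # multiplier applied to each term
    square: bool   # whether to square each term before summing


@dataclass
class Outputs:
    # Hypothetical output of a single trial.
    total: float


def trial(inputs: Inputs) -> Outputs:
    """Map one set of inputs to one set of outputs."""
    terms = range(1, inputs.n + 1)
    if inputs.square:
        total = sum(inputs.scale * k * k for k in terms)
    else:
        total = sum(inputs.scale * k for k in terms)
    return Outputs(total=total)


# Run a single trial by hand, as you might while prototyping.
result = trial(Inputs(n=3, scale=0.5, square=True))
print(result)  # Outputs(total=7.0)
```

Keeping the trial a pure function of its inputs is what lets trials run independently, and optionally in parallel.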

Tip: expd only uses the types and trial cells. While prototyping, you may use other cells to execute the trial function on different inputs. expd won't execute these extra cells when running an experiment.

Running an experiment

An experiment is a collection of independent executions of the trial function with varying inputs.

Create an experiment with:

expd run

This opens the experiment runner notebook in your browser.

Strategies. The runner notebook will prompt you to choose a generation strategy for each input. You can choose to grid numerical values or sample them randomly, and you can choose to grid or sample subsets of categorical values.
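To make the difference between gridding and sampling concrete, here is a small standard-library sketch. It is not expd's implementation; the helper names (grid, sample_uniform, grid_categorical) are hypothetical, and the uniform distribution is an assumption about what "sample randomly" means.

```python
import itertools
import random


def grid(low, high, num_points):
    """Evenly spaced values from low to high, inclusive (num_points >= 2)."""
    step = (high - low) / (num_points - 1)
    return [low + i * step for i in range(num_points)]


def sample_uniform(low, high, num_samples, rng):
    """Random values drawn uniformly between low and high."""
    return [rng.uniform(low, high) for _ in range(num_samples)]


def grid_categorical(options, length):
    """Every vector of the given length whose entries come from options."""
    return [list(combo) for combo in itertools.product(options, repeat=length)]


rng = random.Random(0)  # seeded so that the sampled values are repeatable

grid_values = grid(0.0, 1.0, 5)
sampled_values = sample_uniform(0.0, 1.0, 5, rng)
bool_vectors = grid_categorical([True, False], length=3)

print(grid_values)        # [0.0, 0.25, 0.5, 0.75, 1.0]
print(len(bool_vectors))  # 8
```

A grid covers the input range deterministically, while sampling trades coverage guarantees for the ability to explore many inputs without a combinatorial blow-up.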

Number of trials. The number of trials in your experiment is the size of your experiment grid times the number of samples per stochastic strategy, both of which are configurable in the notebook UI.
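The arithmetic can be sketched as follows, under one reading of the rule above: every grid point is repeated once per random sample. The input names (learning_rate, batch_size, noise) and the build_trials helper are hypothetical and are not part of expd.

```python
import itertools
import random

# Hypothetical gridded inputs: 3 x 2 = 6 grid points.
grid_axes = {
    "learning_rate": [0.01, 0.1, 1.0],
    "batch_size": [16, 32],
}
samples_per_point = 4  # draws of the randomly sampled input per grid point


def build_trials(grid_axes, samples, seed=0):
    """Cross every grid point with `samples` random draws of a 'noise' input."""
    rng = random.Random(seed)
    names = list(grid_axes)
    trials = []
    for point in itertools.product(*(grid_axes[name] for name in names)):
        for _ in range(samples):
            trial_inputs = dict(zip(names, point))
            trial_inputs["noise"] = rng.uniform(0.0, 1.0)
            trials.append(trial_inputs)
    return trials


trials = build_trials(grid_axes, samples_per_point)
grid_size = len(grid_axes["learning_rate"]) * len(grid_axes["batch_size"])
print(grid_size, samples_per_point, len(trials))  # 6 4 24
```

Because the grid size is a product over gridded inputs, adding one more gridded input multiplies the trial count rather than adding to it.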

Experiment key. Every experiment is given a unique key based on a UTC timestamp, and the results of the experiment are saved to the outputs/ directory.
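For illustration, a key in the style of the example directory name in this README (2024-10-17-22-51-20) could be produced as sketched below. The format string is inferred from that one example, and make_experiment_key is a hypothetical helper, not an expd function.

```python
from datetime import datetime, timezone


def make_experiment_key(now=None):
    """Format a UTC timestamp as YYYY-MM-DD-HH-MM-SS (assumed key format)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%d-%H-%M-%S")


# A fixed timestamp, so that the result is reproducible.
key = make_experiment_key(datetime(2024, 10, 17, 22, 51, 20, tzinfo=timezone.utc))
print(key)  # 2024-10-17-22-51-20
```

Keys in this form sort lexicographically in chronological order, which makes the outputs/ directory easy to browse.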

Parallelization. You may optionally choose to parallelize the execution of your experiment across trials. To do this, select a number of cores greater than one.

Note: Currently, expd parallelizes trials using Python threads. This can yield speed-ups over single-threaded execution if your trial function calls expensive subroutines that release the GIL. Libraries such as NumPy, Torch, and TensorFlow release the GIL for many numerical operations.

In the future we may add an option to use multiple processes.
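The sketch below shows thread-based parallelism over independent trials using only the standard library. It is not expd's runner: the trial function and run_trials helper are hypothetical, and hashlib stands in for a numerical routine because CPython's hashlib releases the GIL while hashing large buffers.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def trial(seed: int) -> str:
    """A stand-in trial: hash a 4 MiB buffer derived from the seed."""
    payload = bytes([seed % 256]) * (4 * 1024 * 1024)
    # Hashing a large buffer releases the GIL, so threads can overlap here.
    return hashlib.sha256(payload).hexdigest()


def run_trials(seeds, num_threads):
    """Run independent trials on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(trial, seeds))


seeds = list(range(8))
serial = [trial(seed) for seed in seeds]
threaded = run_trials(seeds, num_threads=4)
print(serial == threaded)  # True
```

If the trial function spends most of its time in pure-Python code that holds the GIL, threads will not help, and a single core is likely to perform about as well.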

Resuming experiments. If an experiment was interrupted (for example, because your computer crashed, or you shut the experiment down), expd can resume it from the last completed trial. Just click the resume experiment button and choose the key that you'd like to resume.
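One way to picture resumption is to look for trials that have not yet written their results. The sketch below assumes the trials/trial-N/trial-meta.json layout shown later in this README and assumes that the presence of trial-meta.json marks a trial as complete. Both are guesses about expd's internals, and pending_trials is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def pending_trials(experiment_dir, num_trials):
    """Return indices of trials without a trial-meta.json (assumed marker)."""
    trials_dir = Path(experiment_dir) / "trials"
    return [
        index
        for index in range(num_trials)
        if not (trials_dir / f"trial-{index}" / "trial-meta.json").exists()
    ]


# Simulate an interrupted four-trial experiment in which two trials finished.
with tempfile.TemporaryDirectory() as tmp:
    experiment_dir = Path(tmp) / "2024-10-17-22-51-20"
    for index in (0, 1):
        trial_dir = experiment_dir / "trials" / f"trial-{index}"
        trial_dir.mkdir(parents=True)
        (trial_dir / "trial-meta.json").write_text("{}")
    remaining = pending_trials(experiment_dir, num_trials=4)

print(remaining)  # [2, 3]
```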

Viewing experiment results

View the results of a previous experiment by running expd view in a project directory.

Experiment results are stored in the outputs/ folder, with each experiment stored in a directory keyed by a timestamp. Each experiment stores each trial's inputs and outputs, as well as metadata describing the experiment setup and the hardware specifications of the machine on which the experiment was run.

The viewer notebook launched by expd view presents this information to you in a friendly UI, and provides instructions on how to recreate any given trial from an experiment.

├── outputs
│   └── 2024-10-17-22-51-20
│       ├── experiment_meta.json
│       ├── requirements.txt
│       ├── src
│       │   ├── __pycache__
│       │   │   └── trial.cpython-310.pyc
│       │   └── trial.py
│       ├── strategies
│       │   ├── categorical-uniform
│       │   └── scalar-grid
│       └── trials
│           ├── trial-0
│           │   ├── inputs
│           │   │   ├── a.ckpt
│           │   │   └── b.ckpt
│           │   ├── outputs
│           │   │   ├── a.ckpt
│           │   │   ├── b.ckpt
│           │   │   └── dictionary.ckpt
│           │   └── trial-meta.json

Examples

The examples directory contains several example projects that show how to define various kinds of inputs, including common use cases such as cycling through a list of filenames and saving an image in the trial function.

