
Project description

A pytest plugin to help developers of research-oriented software projects keep track of the results of their numerical experiments.


What it does

pytest-experiments allows research-oriented programmers to easily persist data about numerical experiments so that they can track metrics over time.

  1. Know which experiments you’ve run, when you ran them, and what their inputs and results were over time and across project development.

  2. Review and compare experiments over time to ensure that continued development is improving your results.

How it works

An experiment is a python function that runs your method or algorithm against some input and reports one or more metrics of interest.

Experiments are essentially unit tests of numerical methods. As in a unit test, we provide the function or method under test with some input and assert that its output conforms to some concrete expectations. Unlike a unit test, the method under test also produces metrics that we are interested in but for which concrete expectations do not exist. We store these metrics, along with some metadata, in a database so that we can track our results over time.

We use pytest to collect and execute our experiments. This plugin offers a notebook fixture to facilitate recording your experiments. Here is a very simple example experiment:

import pytest
from my_numerical_method_package import (
    my_implementation,  # your numerical method
    result_is_valid,    # returns True iff the result is well-formed
    performance_metric, # the performance metric we care about
)

@pytest.mark.experiment  # [optional] mark this test as an experiment
@pytest.mark.parametrize("x", [1, 2, 3])  # The inputs; we will run this experiment for x=1, x=2, and x=3
def test_my_numerical_method(notebook, x):  # Request the notebook fixture
    result = my_implementation(x)
    assert result_is_valid(result)  # our concrete expectations about the result
    notebook.record(performance=performance_metric(result))  # record the performance

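Because experiments are ordinary pytest tests, the optional experiment marker lets you run just your experiments using pytest’s standard marker selection (a stock pytest feature rather than documented plugin behavior, so treat this as an assumed workflow):

$ pytest -m experiment
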
At the end of the test, the notebook fixture will save the experiment metadata, the inputs, and whatever was passed to notebook.record to a database. By default, this is a SQLite database called experiments.db.
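
Once a few experiments have run, you can inspect the stored results with any SQLite client. Below is a minimal sketch using only Python’s standard library; the plugin’s table layout is not assumed here, so the script first discovers whatever tables exist and then prints their rows:

import sqlite3

# Connect to the default database created by the notebook fixture.
conn = sqlite3.connect("experiments.db")

# Discover the tables rather than assuming the plugin's schema.
tables = [
    name for (name,) in
    conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Print every row of every table to see what was recorded.
for name in tables:
    print(f"--- {name} ---")
    for row in conn.execute(f"SELECT * FROM {name}"):
        print(row)

conn.close()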

A machine learning example

For example, suppose we are building a machine learning classifier. The method under test would be our model; the inputs would be the training and validation datasets, along with any hyper-parameters of the method. The model is initialized with the hyper-parameters and trained on the training data, and the output is the set of predictions on the validation set.

We want the model to return probabilities, so we have a concrete expectation that the predictions should all be between 0 and 1. If any are not, our code is wrong and the experiment should fail.

However, we are not only interested in returning probabilities; we also want our model to make good predictions (e.g., predictions with high accuracy and high fairness). We might have some concrete expectations about these metrics: for example, we may wish to reject any result whose metrics are strictly worse than some baseline. But it is not easy or meaningful to specify a criterion on the accuracy and fairness values for deciding when to stop developing our model. In fact, the metrics collected during an experiment may inform subsequent development.
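
As an illustration, here is a hedged sketch of what such an experiment might look like using scikit-learn; the dataset, model, and metric below are placeholders chosen for the example, not part of pytest-experiments:

import pytest
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

@pytest.mark.experiment
@pytest.mark.parametrize("C", [0.1, 1.0, 10.0])  # hyper-parameter grid
def test_classifier(notebook, C):
    # Placeholder data: a small synthetic classification problem.
    X, y = make_classification(n_samples=500, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

    model = LogisticRegression(C=C).fit(X_train, y_train)
    proba = model.predict_proba(X_val)[:, 1]

    # Concrete expectation: predicted probabilities must lie in [0, 1].
    assert ((proba >= 0) & (proba <= 1)).all()

    # Metrics with no hard pass/fail threshold: record them for later review.
    preds = (proba >= 0.5).astype(int)
    notebook.record(accuracy=accuracy_score(y_val, preds))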

See the demo directory for a detailed example-based walkthrough.

Installation

You can install “pytest-experiments” via pip from PyPI:

$ pip install pytest-experiments

Contributing

Contributions are very welcome. This project uses poetry for packaging.

To get set up, simply clone the repo and run:

poetry install
poetry run pre-commit install

The first command will install the package along with all development dependencies in a virtual environment. The second command will install the pre-commit hook, which will automatically format source files with black.

Tests can be run with pytest.

Please document any added code with docstrings. New modules can be auto-documented by running:

sphinx-apidoc -e -o docs/source src/pytest_experiments

Documentation can be compiled with make (for example, to HTML):

cd docs/
make html

License

Distributed under the terms of the MIT license, “pytest-experiments” is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Acknowledgements

This pytest plugin was generated with Cookiecutter along with @hackebrot’s cookiecutter-pytest-plugin template.
