
A pytest plugin to help developers of research-oriented software projects keep track of the results of their numerical experiments.



What it does

pytest-experiments allows research-oriented programmers to easily persist data about numerical experiments so that they can track metrics over time.

  1. Know which experiments you’ve run, when you ran them, and what their inputs and results were as the project develops.

  2. Review and compare experiments over time to ensure that continued development is improving your results.

How it works

An experiment is a Python function that runs your method or algorithm against some input and reports one or more metrics of interest.

Experiments are essentially unit tests of numerical methods. As in a unit test, we provide a function or method under test with some input and assert that its output conforms to some concrete expectations. Unlike in a unit test, the method under test also produces metrics that we are interested in but for which no concrete expectations exist. We store these metrics, along with some metadata, in a database so that we can track our results over time.

We use pytest to collect and execute our experiments. This plugin offers a notebook fixture to facilitate recording your experiments. Here is a very simple example experiment:

import pytest
from my_numerical_method_package import (
    my_implementation,  # your numerical method
    result_is_valid,    # returns True iff the result is well-formed
    performance_metric, # the performance metric we care about
)

@pytest.mark.experiment  # [optional] mark this test as an experiment
@pytest.mark.parametrize("x", [1, 2, 3])  # The inputs; we will run this experiment for x=1, x=2, and x=3
def test_my_numerical_method(notebook, x):  # Request the notebook fixture
    result = my_implementation(x)
    assert result_is_valid(result)  # our concrete expectations about the result
    notebook.record(performance=performance_metric(result))  # record the performance

At the end of the test, the notebook fixture saves the experiment metadata, the inputs, and whatever was passed to notebook.record to a database. By default, this is a SQLite database called experiments.db.
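Because the default store is an ordinary SQLite file, it can be inspected with Python's standard sqlite3 module. The table layout the plugin uses is not described here, so the following minimal sketch assumes nothing about the schema and only lists whichever tables exist. The helper name list_tables is hypothetical and is not part of pytest-experiments.

```python
import sqlite3


def list_tables(db_path="experiments.db"):
    """Return the names of the tables in the SQLite file at db_path.

    Hypothetical helper for illustration only. It makes no assumption
    about the schema that pytest-experiments creates.
    """
    connection = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalogue of schema objects.
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]
```

Calling list_tables() after a test run should show the tables the plugin wrote. Note that sqlite3.connect creates an empty file if the path does not exist yet, so an empty list may simply mean that no experiment has been recorded.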

A machine learning example

For example, suppose we are building a machine learning classifier. The method under test would be our model, and the input would be the training and validation datasets together with any hyper-parameters of the model. The model is initialized with the hyper-parameters and trained on the training data, and the output is its predictions on the validation set.

We want the model to return probabilities, so we have a concrete expectation that the predictions should all be between 0 and 1. If any are not, our code is wrong and the experiment should fail.

However, we are not only interested in returning probabilities; we also want our model to return good predictions (e.g. predictions with high accuracy and high fairness). We might have some concrete expectations about these metrics: for example, we may wish to reject any result whose metrics are strictly worse than some baseline. But it is neither easy nor meaningful to specify, in terms of the accuracy and fairness values, a criterion for when we should stop developing our model. In fact, the metrics collected during the experiment may inform subsequent development.

See the demo directory for a detailed example-based walkthrough.

Installation

You can install “pytest-experiments” via pip from PyPI:

$ pip install pytest-experiments

Contributing

Contributions are very welcome. This project uses poetry for packaging.

To get set up, simply clone the repo and run:

poetry install
poetry run pre-commit install

The first command will install the package along with all development dependencies in a virtual environment. The second command will install the pre-commit hook which will automatically format source files with black.

Tests can be run with pytest.

Please document any code added with docstrings. New modules can be auto-documented by running:

sphinx-apidoc -e -o docs/source src/pytest_experiments

Documentation can be compiled (for example to html with make):

cd docs/
make html

License

Distributed under the terms of the MIT license, “pytest-experiments” is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Acknowledgements

This pytest plugin was generated with Cookiecutter along with @hackebrot’s cookiecutter-pytest-plugin template.
