A framework for managing machine learning experiments.

Keep track of your machine learning experiments with ScalarStop.

ScalarStop is a Python framework for reproducible machine learning research.

It was written and open-sourced at Neocrym, where it is used to train thousands of models every week.

ScalarStop can help you:

  • organize datasets and models with content-addressable names.

  • save/load datasets and models to/from the filesystem.

  • record hyperparameters and metrics to a relational database.
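
To make the idea of a content-addressable name concrete, here is a minimal sketch. It is not ScalarStop's actual implementation: the function `content_addressable_name`, the choice of SHA-256, and the 16-character truncation are all assumptions made for illustration. The point is only that the name is derived from the hyperparameter values, so the same values always produce the same name.

```python
import hashlib
import json


def content_addressable_name(group_name: str, hyperparams: dict) -> str:
    """Return a name that depends only on group_name and the hyperparameter values.

    Hypothetical helper for illustration; not part of ScalarStop.
    """
    # Sort keys so that dictionary ordering does not change the hash.
    canonical = json.dumps(hyperparams, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Truncate the digest to keep names readable; the length is arbitrary.
    return f"{group_name}-{digest[:16]}"


name_a = content_addressable_name("mnist", {"batch_size": 32, "shuffle": True})
name_b = content_addressable_name("mnist", {"shuffle": True, "batch_size": 32})
name_c = content_addressable_name("mnist", {"batch_size": 64, "shuffle": True})
print(name_a == name_b)  # same hyperparameters, so the names match
print(name_a == name_c)  # different hyperparameters, so the names differ
```

Because the name is a function of the content, two people who build a dataset with the same hyperparameters arrive at the same name without coordinating.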

System requirements

ScalarStop is a Python package that requires Python 3.8 or newer.

Currently, ScalarStop only supports tracking tf.data.Dataset datasets and tf.keras.Model models. As such, ScalarStop requires TensorFlow 2.8.0 or newer.

We encourage anyone who would like to add support for other machine learning frameworks to contribute to ScalarStop. :)

Installation

ScalarStop is available on PyPI.

Selecting a TensorFlow package variant

If you are using TensorFlow on a CPU, you can install ScalarStop with the command:

python3 -m pip install scalarstop[tensorflow]

If you are using TensorFlow with GPUs, you can install ScalarStop with the command:

python3 -m pip install scalarstop[tensorflow-gpu]

Selecting a PostgreSQL psycopg2 package variant

If you intend to use ScalarStop with PostgreSQL, you should also install either psycopg2-binary (which works out of the box) or psycopg2 (which you compile from source).

Therefore, your installation command could look like one of the following:

python3 -m pip install scalarstop[tensorflow,psycopg2]
python3 -m pip install scalarstop[tensorflow,psycopg2-binary]
python3 -m pip install scalarstop[tensorflow-gpu,psycopg2]
python3 -m pip install scalarstop[tensorflow-gpu,psycopg2-binary]
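
After installing, a quick way to confirm which optional pieces are present is to ask Python whether the modules can be found. The sketch below uses only the standard library; the function name `check_environment` is a hypothetical helper, not something ScalarStop provides.

```python
import importlib.util
import sys


def check_environment() -> dict:
    """Report whether the interpreter and optional packages look usable.

    Hypothetical helper for illustration; not part of ScalarStop.
    """
    return {
        # ScalarStop requires Python 3.8 or newer.
        "python_ok": sys.version_info >= (3, 8),
        # find_spec() returns None when a top-level module cannot be found.
        "tensorflow": importlib.util.find_spec("tensorflow") is not None,
        # psycopg2 and psycopg2-binary both install a module named psycopg2.
        "psycopg2": importlib.util.find_spec("psycopg2") is not None,
    }


for name, available in check_environment().items():
    print(f"{name}: {available}")
```

This only checks that the modules are discoverable. It does not import them or verify their versions.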

Development

If you would like to make changes to ScalarStop, you can clone the repository from GitHub.

git clone https://github.com/scalarstop/scalarstop.git
cd scalarstop
python3 -m pip install .

Usage

Read the ScalarStop Tutorial to learn the core concepts behind ScalarStop and how to structure your datasets and models.

Afterwards, you might want to dig deeper into the ScalarStop Documentation. A typical ScalarStop workflow involves four steps:

1. Organize your datasets with scalarstop.datablob.

2. Describe your machine learning model architectures using scalarstop.model_template.

3. Load, train, and save machine learning models with scalarstop.model.

4. Save hyperparameters and training metrics to a SQLite or PostgreSQL database using scalarstop.train_store.
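
The fourth step can be pictured with a small sketch that uses Python's built-in sqlite3 module directly. This is not how scalarstop.train_store is implemented or called: the table names, column names, and the model name below are assumptions chosen for illustration. It only shows the general idea of storing hyperparameters and per-epoch metrics as rows in a relational database.

```python
import json
import sqlite3

# An in-memory database keeps this sketch self-contained.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per model, one row per training epoch.
conn.execute("CREATE TABLE model (name TEXT PRIMARY KEY, hyperparams TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE model_epoch ("
    "model_name TEXT NOT NULL REFERENCES model (name), "
    "epoch_num INTEGER NOT NULL, "
    "metrics TEXT NOT NULL, "
    "PRIMARY KEY (model_name, epoch_num))"
)

# Hyperparameters and metrics are stored as JSON text.
model_name = "small_dense-abc123"  # made-up name for this example
conn.execute(
    "INSERT INTO model VALUES (?, ?)",
    (model_name, json.dumps({"hidden_units": 32, "learning_rate": 0.01})),
)
for epoch_num, loss in enumerate([0.9, 0.5, 0.3]):
    conn.execute(
        "INSERT INTO model_epoch VALUES (?, ?, ?)",
        (model_name, epoch_num, json.dumps({"loss": loss})),
    )
conn.commit()

# Find the epoch with the lowest loss for this model.
rows = conn.execute(
    "SELECT epoch_num, metrics FROM model_epoch WHERE model_name = ?",
    (model_name,),
).fetchall()
best_epoch, best_metrics = min(rows, key=lambda row: json.loads(row[1])["loss"])
print(best_epoch, json.loads(best_metrics)["loss"])
```

Keeping metrics in a database rather than in log files means that questions such as "which model had the lowest validation loss" become ordinary queries.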

Contributing to ScalarStop

We warmly welcome contributions to ScalarStop. Here are the technical details for getting started with adding code to ScalarStop.

Getting started

First, clone this repository from GitHub. All development happens on the main branch.

git clone https://github.com/scalarstop/scalarstop.git

Then, run make install to install Python dependencies in a Poetry virtualenv.

You can run make help to see the other commands that are available.

Checking your code

Run make fmt to automatically format code.

Run make lint to run Pylint and MyPy to check for errors.

Generating documentation

Documentation is important! Here is how to add to it.

Generating Sphinx documentation

You can generate a local copy of the Sphinx documentation published at scalarstop.com by running make docs.

The generated documentation can be found at docs/_build/dirhtml. To view it, start an HTTP server in that directory, for example:

make docs
cd docs/_build/dirhtml
python3 -m http.server 5000

Then visit http://localhost:5000 in your browser to preview changes to the documentation.

If you want to use Sphinx’s ability to automatically generate hyperlinks to the Sphinx documentation of other Python projects, then you should configure intersphinx settings at the path docs/conf.py. If you need to download an objects.inv file, make sure to update the make update-sphinx command in the Makefile.

Editing the tutorial notebook

The main ScalarStop tutorial is in a Jupyter notebook. If you have made changes to ScalarStop, you should rerun the Jupyter notebook on your machine with your changes to make sure that it still runs without error.

Running unit tests

Run make test to run all unit tests.

If you want to run a specific unit test, try running python3 -m poetry run python -m unittest -k {name of your test}.

Unit tests with SQLite3

If you are running tests using a Python interpreter that does not have the SQLite3 JSON1 extension, then TrainStore unit tests involving SQLite3 will be skipped. This is likely to happen if you are using Python 3.8 on Windows. If you suspect that you are missing the SQLite3 JSON1 extension, the Django documentation has some suggestions for how to fix it.
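
One way to check whether your interpreter's SQLite build has JSON support is to try calling a JSON function and see whether SQLite recognizes it. The sketch below uses only the standard library; `sqlite_has_json1` is a hypothetical helper name, not part of ScalarStop or its test suite.

```python
import sqlite3


def sqlite_has_json1() -> bool:
    """Return True if this interpreter's SQLite build can run JSON functions."""
    conn = sqlite3.connect(":memory:")
    try:
        # json() is only defined when SQLite's JSON functions are compiled in.
        conn.execute("SELECT json(?)", ('{"a": "b"}',))
        return True
    except sqlite3.OperationalError:
        # SQLite reports an unknown function as an OperationalError.
        return False
    finally:
        conn.close()


print("SQLite version:", sqlite3.sqlite_version)
print("JSON1 available:", sqlite_has_json1())
```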

Unit tests with PostgreSQL

By default, tests involving PostgreSQL are skipped. To enable them, run make test in a shell where the environment variable TRAIN_STORE_CONNECTION_STRING is set to a SQLAlchemy database connection URL, which looks something like "postgresql://scalarstop:changeme@localhost:5432/train_store". The connection URL should point to a running PostgreSQL server on which the database and user already exist.
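
Before running the tests, it can help to confirm that the connection URL is shaped the way you expect. The following sketch splits the URL into its parts with the standard library. The helper `describe_connection_string` is hypothetical, and the fallback value is simply the example URL from the paragraph above.

```python
import os
from urllib.parse import urlsplit


def describe_connection_string(url: str) -> dict:
    """Split a database URL into its parts for a quick sanity check.

    Hypothetical helper for illustration; the password is left out on
    purpose so that it is not printed.
    """
    parts = urlsplit(url)
    return {
        "dialect": parts.scheme,
        "username": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


# Fall back to the example URL when the environment variable is unset.
url = os.environ.get(
    "TRAIN_STORE_CONNECTION_STRING",
    "postgresql://scalarstop:changeme@localhost:5432/train_store",
)
print(describe_connection_string(url))
```

This only inspects the string. It does not open a connection, so a well-formed URL can still fail if the server is not running.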

The docker-compose.yml file in the root of this repository can set up a PostgreSQL instance on your local machine. If you have Docker and Docker Compose installed, you can start the PostgreSQL database by running docker-compose up in the same directory as the docker-compose.yml file.

Measuring test coverage

You can run make test-with-coverage to collect Python line and branch coverage information. Afterwards, run make coverage-html to generate an HTML report of unit test coverage. You can view the report in a web browser at the path htmlcov/index.html.

Credits

ScalarStop’s documentation is built with Sphinx using @pradyunsg’s Furo theme and is hosted by Read the Docs.
