logexp

Introduction

logexp is a simple experiment manager for machine learning. You can manage your experiments and executions from the command-line interface.

  • Features
    • track experiments: logexp tracks experiments and their environment.
    • manage parameters: Import and export worker parameters in JSON format.
    • capture stdout / stderr: stdout and stderr are captured automatically during execution.
    • search logs: Search your runs with the jq command.
    • written in pure Python: logexp has no external dependencies.

Installation

Install the library with pip:

pip install logexp

Tutorial

In this tutorial we'll implement a simple machine-learning worker with scikit-learn, then walk through some operations for managing experiments and executions.

1. Create worker

This worker trains a RandomForestClassifier and saves the trained model.

A worker must inherit from logexp.BaseWorker. Define worker parameters in the config method; they are logged automatically. Write your task in the run method and, if you want a quick summary of the result, return a logexp.Report.

BaseWorker.storage is the artifact storage. You can save any file through it.

$ cat << 'EOF' > iris.py
import logexp
import numpy as np
import pickle
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

ex = logexp.Experiment("sklearn-iris")

@ex.worker("train-rfc")
class TrainRandomForest(logexp.BaseWorker):
    def config(self):
        self.rfc_params = {
            "n_estimators": 100,
            "min_samples_leaf": 1,
            "random_state": 0,
        }
        self.test_size = 0.3
        self.random_seed = 0

    def run(self):
        np.random.seed(self.random_seed)

        X, y = load_iris(return_X_y=True)

        X_train, X_valid, y_train, y_valid = \
            train_test_split(X, y, test_size=self.test_size)

        model = RandomForestClassifier(**self.rfc_params)
        model.fit(X_train, y_train)

        with self.storage.open("rfc.pkl", "wb") as f:
            pickle.dump(model, f)

        train_accuracy = model.score(X_train, y_train)
        valid_accuracy = model.score(X_valid, y_valid)

        report = logexp.Report()
        report["train_size"] = len(X_train)
        report["valid_size"] = len(X_valid)
        report["train_accuracy"] = train_accuracy
        report["valid_accuracy"] = valid_accuracy

        return report
EOF

2. Initialize experiment

The following command creates the log-store directory (./.logexp by default) and prints the experiment ID.

$ logexp init -m iris -e sklearn-iris
experiment id: 0

3. Edit parameters

Export the default parameters in JSON format:

$ logexp params -m iris -e sklearn-iris -w train-rfc > params.json
$ cat params.json
{
  "rfc_params": {
    "n_estimators": 100,
    "min_samples_leaf": 1,
    "random_state": 0
  },
  "test_size": 0.3,
  "random_seed": 0
}

You can also export the parameters of a specific run:

$ logexp params -r [ RUN_ID ]

Edit the params.json file as needed.
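If you would rather script the edit than change the file by hand, the sketch below shows one way to do it with the standard library only. The helper names override_params and rewrite_params_file are hypothetical and not part of logexp; the only assumption about logexp is that the exported file is plain JSON shaped like the example above.

```python
import copy
import json
from pathlib import Path


def override_params(params, overrides):
    """Return a copy of params with overrides merged in, recursing into nested dicts."""
    merged = copy.deepcopy(params)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections such as "rfc_params" key by key.
            merged[key] = override_params(merged[key], value)
        else:
            merged[key] = value
    return merged


def rewrite_params_file(src, dst, overrides):
    """Read a params JSON file, apply overrides, and write the result to dst."""
    params = json.loads(Path(src).read_text())
    updated = override_params(params, overrides)
    Path(dst).write_text(json.dumps(updated, indent=2) + "\n")
    return updated


# In-memory demo using the default parameters exported above.
defaults = {
    "rfc_params": {"n_estimators": 100, "min_samples_leaf": 1, "random_state": 0},
    "test_size": 0.3,
    "random_seed": 0,
}
updated = override_params(defaults, {"rfc_params": {"n_estimators": 200}, "test_size": 0.2})
print(json.dumps(updated, indent=2))
```

To work on the real file, something like rewrite_params_file("params.json", "params_200.json", {"rfc_params": {"n_estimators": 200}}) would write a second parameter file that can be passed to the next step.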

4. Run worker

Run the worker with the logexp run command. It prints a quick result like the one below:

$ logexp run -m iris -e 0 -w train-rfc -p params.json
** WORKER REPORT **
{
  "train_size": 105,
  "valid_size": 45,
  "train_accuracy": 1.0,
  "valid_accuracy": 0.9777777777777777
}

** SUMMARY **
run_id     : 7fcd37ef38104715ad60bd55b7e1023d
name       :
module     : iris
experiment : sklearn-iris
worker     : train-rfc
status     : finished
artifacts  : {'rootdir': '/src/.logexp/0/train-rfc/7fcd37ef38104715ad60bd55b7e1023d/artifacts'}
start_time : 2020-01-19 05:14:05.246681
end_time   : 2020-01-19 05:14:05.430199
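The artifacts entry in the summary points at the directory where the run's files were saved. A minimal sketch for loading the pickled model back, assuming that a file written with self.storage.open("rfc.pkl", "wb") ends up directly under that rootdir (this layout is an assumption, not something the summary confirms). The helpers list_artifacts and load_artifact are hypothetical names, not logexp functions.

```python
import pickle
from pathlib import Path


def list_artifacts(rootdir):
    """Return the sorted names of the entries in an artifacts directory."""
    return sorted(path.name for path in Path(rootdir).iterdir())


def load_artifact(rootdir, filename="rfc.pkl"):
    """Unpickle one file from a run's artifacts directory.

    Assumes the file sits directly under rootdir. Only unpickle files
    you created yourself, since pickle can execute code while loading.
    """
    path = Path(rootdir) / filename
    with path.open("rb") as f:
        return pickle.load(f)
```

With the run above, load_artifact("/src/.logexp/0/train-rfc/7fcd37ef38104715ad60bd55b7e1023d/artifacts") would then return the trained RandomForestClassifier, provided scikit-learn is importable in the loading process.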

5. View logs

The following command lists executions:

$ logexp list -e 0 --sort start_time
run_id                           name exp_id exp_name     worker    status   start_time          end_time            note
================================ ==== ====== ============ ========= ======== =================== =================== ====
7fcd37ef38104715ad60bd55b7e1023d      0      sklearn-iris train-rfc finished 2020-01-19 05:14:05 2020-01-19 05:14:05
5300f7fc32b949bba6775c5899e09ae9      0      sklearn-iris train-rfc finished 2020-01-19 05:44:04 2020-01-19 05:44:04

The logexp logs command exports all logs in JSON format. With the jq command, you can run more complex searches.

$ logexp logs -e 0 | jq '
  map(select(.status == "finished"))
    | sort_by(.report.valid_accuracy)
    | reverse
    | .[]
    | {run_id: .uuid, valid_accuracy: .report.valid_accuracy}'
{
  "run_id": "7fcd37ef38104715ad60bd55b7e1023d",
  "valid_accuracy": 0.9777777777777777
}
{
  "run_id": "5300f7fc32b949bba6775c5899e09ae9",
  "valid_accuracy": 0.9555555555555556
}
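If jq is not available, the same filter can be approximated in Python. The sketch below assumes the exported logs are a JSON array of run objects carrying the status, uuid, and report fields that the jq filter above reads; rank_finished_runs is a hypothetical helper, and it assumes every finished run reports the chosen metric.

```python
import json


def rank_finished_runs(logs, metric="valid_accuracy"):
    """Keep finished runs and sort them by a report metric, best first."""
    finished = [run for run in logs if run.get("status") == "finished"]
    finished.sort(key=lambda run: run["report"][metric], reverse=True)
    return [{"run_id": run["uuid"], metric: run["report"][metric]} for run in finished]


# Stand-in records with only the fields the jq filter reads;
# the real export likely contains more keys per run.
sample_logs = [
    {
        "uuid": "5300f7fc32b949bba6775c5899e09ae9",
        "status": "finished",
        "report": {"valid_accuracy": 0.9555555555555556},
    },
    {
        "uuid": "7fcd37ef38104715ad60bd55b7e1023d",
        "status": "finished",
        "report": {"valid_accuracy": 0.9777777777777777},
    },
]

for row in rank_finished_runs(sample_logs):
    print(json.dumps(row))
```

To use it on real data, redirect the export to a file (logexp logs -e 0 > logs.json) and pass the result of json.load to rank_finished_runs.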
