
Module to record/document your model changes

Project description

What is ml-clerk?

ml-clerk is, at its core, a lightweight logging library that helps you keep track of performance metrics, model parameters, timing, or any other artifact throughout your data science experiments.

I'm a data scientist. Why should I care?

As a data scientist, you are constantly running new experiments, and you should be recording the results of each one. ml-clerk does exactly that: it records the results of your experiment automatically, so you don't have to keep track of them by hand.

OK, now that I'm convinced, where does ml-clerk log stuff?

ml-clerk can record to one of two places: an Excel workbook or Google Sheets. Excel setup is easier, so start with that. For Google Sheets, see the Google Sheets Usage and Google sheets permissions sections below.

Train a model - for demo purposes only

# Imports
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

dataset = load_iris()

# Define data
X = dataset['data']
y = dataset['target']

# Train the model
logistic_regression_classifier = LogisticRegression()
logistic_regression_classifier.fit(X, y)

predictions = logistic_regression_classifier.predict(X)
probabilities = logistic_regression_classifier.predict_proba(X)
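
The introduction mentions tracking performance and timing alongside parameters. A minimal sketch of producing those two values for the demo model follows. `accuracy` and `timed` are hypothetical helpers, not part of ml-clerk, and it is an assumption that `clerk.record` accepts extra keyword arguments such as `accuracy=` the same way it accepts `predictions=`.

```python
import time


def accuracy(y_true, y_pred):
    """Fraction of predictions equal to the true label (works on lists or arrays)."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("accuracy is undefined for empty inputs")
    return sum(1 for truth, guess in pairs if truth == guess) / len(pairs)


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Toy labels standing in for y and predictions from the demo above
    y_true = [0, 1, 2, 2, 1]
    y_pred = [0, 1, 2, 1, 1]
    score, seconds = timed(accuracy, y_true, y_pred)
    print(f"accuracy={score:.2f} computed in {seconds:.6f}s")
```

With the demo model, `timed(logistic_regression_classifier.fit, X, y)` would time training, and `accuracy(y, predictions)` would score the model on its own training data.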

Excel Usage

# Record artifacts to Excel
from ml_clerk import Clerk

# Set up the clerk in Excel mode
clerk = Clerk(excel_mode=True)

# file_path is the Excel workbook to record in; sheet_name is the sheet within that workbook
clerk.set_up(file_path='hello_world.xlsx', sheet_name='Sheet1')

# Record all the artifacts in one go
clerk.record(predictions=predictions, probabilities=probabilities, model_parameters=logistic_regression_classifier.get_params())
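
Spreadsheet cells hold scalars and strings, so arrays and parameter dictionaries like the ones passed to `record` above have to be turned into cell-friendly values at some point. The sketch below shows one way that could be done. `to_cell` and `to_row` are hypothetical helpers for illustration only; this is not ml-clerk's actual implementation, and the library may lay out its rows differently.

```python
import json


def to_cell(value):
    """Return a spreadsheet-friendly value: scalars pass through, the rest become JSON text."""
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    # NumPy arrays expose tolist(); convert them to plain nested lists first
    if hasattr(value, "tolist"):
        value = value.tolist()
    return json.dumps(value, sort_keys=True, default=str)


def to_row(**artifacts):
    """Map each named artifact to a single cell value, keeping argument order."""
    return {name: to_cell(value) for name, value in artifacts.items()}


if __name__ == "__main__":
    row = to_row(
        predictions=[0, 1, 2],
        model_parameters={"C": 1.0, "penalty": "l2"},
    )
    print(row)
```

Serializing to JSON text keeps one artifact per cell, which makes a run easy to read back later with `json.loads`.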

Set up Google Sheets permissions before using Google Sheets mode. The Google sheets permissions section below walks through the steps.

Google Sheets Usage

# Record artifacts to Google Sheets
from ml_clerk import Clerk

# Set up the clerk in Google Sheets mode
clerk = Clerk(google_sheets_mode=True)

# file_path is the URL of the Google Sheets document to record in; sheet_name is the sheet (tab) within it
clerk.set_up(file_path='#your google sheets url', sheet_name='#your sheet name')

# Record all the artifacts in one go
clerk.record(predictions=predictions, probabilities=probabilities, model_parameters=logistic_regression_classifier.get_params())

Google sheets permissions

  1. Follow this guide to set up permissions for your Google account: https://erikrood.com/Posts/py_gsheets.html
  2. Make sure the Google Sheets API is enabled for your project in the Google Cloud console.
  3. Update the .env file with the path to your token.json credentials file.
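
Step 3 can be checked before the first run. The sketch below reads a `.env` file and confirms that the credentials path it names exists. The variable name `GOOGLE_TOKEN_PATH` is a placeholder assumption: the text above does not say which key ml-clerk reads, so substitute whatever the library expects. `read_env` and `token_path_from_env` are likewise hypothetical helpers, not part of ml-clerk.

```python
from pathlib import Path


def read_env(env_path=".env"):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in Path(env_path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def token_path_from_env(env_path=".env", key="GOOGLE_TOKEN_PATH"):
    """Return the credentials path named in .env, or raise if it is missing.

    The default key name is an assumption made for this sketch.
    """
    values = read_env(env_path)
    if key not in values:
        raise KeyError(f"{key} is not set in {env_path}")
    token_path = Path(values[key]).expanduser()
    if not token_path.is_file():
        raise FileNotFoundError(f"credentials file not found: {token_path}")
    return token_path
```

Calling `token_path_from_env()` from the project directory raises a clear error if the key is absent or the file has moved, which is usually easier to diagnose than an authentication failure later on.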
