Local ML workbench configured for Databricks using uv


ML Workbench

Installation

Install ML Workbench using pip:

pip install ml_workbench

Basic Usage

Command Line Interface (CLI)

Run experiments directly from YAML configuration files:

# Run all experiments in a YAML file
cli-experiment experiment.yaml

# Run specific experiment(s)
cli-experiment experiment.yaml --experiments my_experiment

# Run with variable substitution
cli-experiment experiment.yaml --var path=/data/datasets

# Inspect configuration without running
cli-experiment experiment.yaml --show-config
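Variable substitution lets one YAML file serve multiple environments. The exact placeholder syntax is defined by ML Workbench; purely as an illustration, here is how `--var path=/data/datasets` could be resolved with Python's `string.Template` (the `${path}` placeholder form and the `substitute_vars` helper are hypothetical):

```python
from string import Template

def substitute_vars(raw_yaml: str, variables: dict) -> str:
    """Replace ${name} placeholders with values passed via --var name=value.
    Hypothetical helper: the real ML Workbench syntax may differ."""
    return Template(raw_yaml).substitute(variables)

raw = "dataset_path: ${path}/train.csv"
print(substitute_vars(raw, {"path": "/data/datasets"}))
# dataset_path: /data/datasets/train.csv
```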

Python API

Use the Python API for programmatic control:

from ml_workbench import YamlConfig, Experiment, Runner

# Load configuration
config = YamlConfig("experiment.yaml")

# Create experiment
experiment = Experiment(config, "my_experiment")

# Run experiment
runner = Runner(experiment, verbose=True)
results = runner.run()

# Access results
print(f"Best model: {results['best_model']}")
print(f"Best score: {results['best_model_score']}")

Documentation

CLI Experiment Guide

Complete guide to using the command-line interface for running experiments. Learn how to execute experiments from YAML files, use variable substitution, inspect configurations, and view dataset statistics without running experiments.

Runner Class Documentation

Comprehensive documentation for the Runner class, the core execution engine for ML experiments. Includes workflow orchestration, dataset management, preprocessing pipelines, model training, evaluation metrics, feature analysis, and MLflow integration.

YAML Configuration Specification

Detailed specification for YAML configuration files. Covers all sections including datasets, features, models, experiments, and MLflow settings. Includes validation rules and complete examples for defining ML experiments declaratively.
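To give a rough sense of the shape such a file takes, the top-level sections named in the specification can be outlined as follows (everything inside them is illustrative, not the actual schema):

```yaml
# Hypothetical outline only -- consult the YAML Configuration Specification
# for the real field names and validation rules.
datasets:      # where data is loaded from
features:      # feature definitions and preprocessing
models:        # models and hyperparameters to try
experiments:   # which datasets/features/models each experiment combines
mlflow:        # tracking settings
```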

Implementation Summary

Technical overview of the Runner implementation, including architecture decisions, testing strategy, and implementation details. Useful for understanding the internal workings of the ML Workbench.

Packaging and CodeArtifact Guide

Step-by-step guide for packaging and publishing ML Workbench to AWS CodeArtifact. Covers prerequisites, authentication, version management, and CI/CD integration for distributing the package within your organization.

Setup

Environment Configuration for MLflow Databricks Integration

To point MLflow at your Databricks workspace (dev-internal), create a .env file in the project root with the following configuration:

# Set MLflow tracking URI to your Databricks workspace
MLFLOW_TRACKING_URI="databricks"

# Set the Databricks host that matches your workspace (this one is for dev-internal)
DATABRICKS_HOST="https://dbc-787720e9-26e6.cloud.databricks.com"

# Getting Your Databricks Token
# - Go to your Databricks workspace: https://dbc-787720e9-26e6.cloud.databricks.com
# - Click on your profile icon (top-right)
# - Select "Settings"
# - In "User" section, select "Developer"
# - Go to Access Tokens tab
# - Click Generate New Token
# - Give it a name (e.g., "MLFlow Local Development") and expiry
# - Copy the token (you'll only see it once!)
DATABRICKS_TOKEN="dapi123456781234567890"   # <- replace with your own

Steps to set up:

  1. Copy .env.template to .env:

    cp .env.template .env
    
  2. Edit .env and replace DATABRICKS_TOKEN with your personal access token (see instructions in the comments above).

  3. The .env file is already in .gitignore, so your token won't be committed to version control.

Once configured, MLflow will automatically log experiments to your Databricks workspace when you run experiments using the ML Workbench.
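ML Workbench presumably reads the .env file itself; if you need the same values in a standalone script, a minimal stdlib-only loader looks like this (the parsing is deliberately naive, and quoted values containing `#` are not handled):

```python
import os

def load_dotenv(path: str = ".env") -> None:
    """Naive .env loader: handles KEY=value and KEY="value" lines plus '#' comments.
    A simplified sketch; python-dotenv (or ML Workbench itself) handles edge cases."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop an inline comment, then surrounding quotes.
            value = value.split("#", 1)[0].strip().strip('"').strip("'")
            os.environ.setdefault(key.strip(), value)

if os.path.exists(".env"):
    load_dotenv()
```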

Git Pre-commit Hook for Automatic Version Increment

This project includes a pre-commit hook that automatically increments the patch version (the last number) in pyproject.toml on each commit. For example, 0.0.2 becomes 0.0.3.

To set up the pre-commit hook:

Option 1: Use the setup script (recommended)

./scripts/setup-pre-commit.sh

Option 2: Manual installation

cp scripts/pre-commit .git/hooks/pre-commit && chmod +x .git/hooks/pre-commit

Verify the hook is set up correctly:

ls -la .git/hooks/pre-commit

You should see the file is executable (-rwxr-xr-x).

How it works:

  • On each commit, the hook automatically:
    • Reads the current version from pyproject.toml
    • Increments the patch version (e.g., 0.0.2 → 0.0.3)
    • Updates pyproject.toml with the new version
    • Stages the updated file so it's included in your commit

Note: The hook only increments the patch version (last number). To bump minor or major versions, manually edit pyproject.toml before committing.
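The bump the hook performs can be sketched in a few lines of Python (a simplified stand-in for the actual logic in scripts/pre-commit, which may differ):

```python
import re

def bump_patch(pyproject_text: str) -> str:
    """Increment the last component of version = "X.Y.Z" in pyproject.toml text.
    Illustrative sketch of the hook's behaviour, not the real script."""
    def repl(match: re.Match) -> str:
        major, minor, patch = match.group(1), match.group(2), int(match.group(3))
        return f'version = "{major}.{minor}.{patch + 1}"'
    return re.sub(r'version\s*=\s*"(\d+)\.(\d+)\.(\d+)"', repl, pyproject_text, count=1)

print(bump_patch('version = "0.0.2"'))  # version = "0.0.3"
```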


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ml_workbench-0.2.2.tar.gz (1.2 MB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ml_workbench-0.2.2-py3-none-any.whl (26.0 kB)

Uploaded Python 3

File details

Details for the file ml_workbench-0.2.2.tar.gz.

File metadata

  • Download URL: ml_workbench-0.2.2.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.6

File hashes

Hashes for ml_workbench-0.2.2.tar.gz:

  • SHA256: f741a72241ef88565e8260a2247530e3206518b49871a6f7c7fa5f6dffd0a1b2
  • MD5: af56681e747021f68bd7e5f523816561
  • BLAKE2b-256: 85a3a33de5b4b33f8009132181dbf1b4b614eeadc2cab7983ee7861ce354bb43

See more details on using hashes here.
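To check a downloaded archive against the published digests, Python's hashlib is enough:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the published digest, e.g. for the sdist:
expected = "f741a72241ef88565e8260a2247530e3206518b49871a6f7c7fa5f6dffd0a1b2"
# assert sha256_of("ml_workbench-0.2.2.tar.gz") == expected
```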

File details

Details for the file ml_workbench-0.2.2-py3-none-any.whl.

File metadata

File hashes

Hashes for ml_workbench-0.2.2-py3-none-any.whl:

  • SHA256: f52d998e554e819b9335a080af1b3e6a7c32323c992fb5555eae99e2d657f89a
  • MD5: e1335977587f39d0979e48616b6ae3cd
  • BLAKE2b-256: 5ec7c82f218cb165dd01e9070a02facb957ab0b9331bed1708440e4a726b360a

See more details on using hashes here.
