mlcompass

An LLM agent that sits next to you through your whole ML pipeline — from data, through training, all the way to deployment.

🚧 Pre-alpha (v0.0.1) — under active development. APIs will change before v0.1.

What it does

mlcompass is a single CLI that follows your ML project from start to finish, keeping context across every step.

data.csv          train.py             results.csv         production
   │                  │                     │                  │
   ▼                  ▼                     ▼                  ▼
 advise   ────►   audit + watch  ────►  evaluate  ────►  deploy
                      compare

Each command writes to and reads from a shared project context (.mlcompass/), so by the time you reach deploy, the tool already knows your dataset, your model choice, your training history, and your evaluation results.

Seven commands, one tool

| Command | When you run it | What you get |
| --- | --- | --- |
| `init` | Starting a new project | A `.mlcompass/` folder that tracks decisions |
| `advise` | You have a CSV, what now? | Models to try, features to derive, pitfalls to avoid |
| `audit` | Before you press train | Static analysis of training script (seed, val, etc.) |
| `watch` | While training runs | Live plateau / overfit / NaN detection |
| `compare` | After several runs | Hypothesis-driven diff between two runs |
| `evaluate` | Training done | Threshold tuning, confusion matrix, hard examples |
| `deploy` | Going to production | Latency estimate, dependency check, ONNX advice |

Quick example — `advise` mode

mlcompass init churn-project
mlcompass advise data/customers.csv --target churn

Output:

📊 Dataset analysis (data/customers.csv)
   • 10,000 rows × 23 columns
   • Target: churn (binary, 12% positive)
   • 4 categorical, 18 numerical, 1 datetime
   • 3 columns with >50% missing values (consider dropping)

💡 Recommended models
   1. XGBoost / LightGBM   → tabular binary baseline
                             expected AUC: 0.82 – 0.87
   2. Logistic Regression  → interpretable baseline
                             expected AUC: 0.76 – 0.80
   3. FT-Transformer       → if GPU budget allows
                             expected AUC: 0.83 – 0.86

🔧 Suggested feature engineering
   • signup_date → derive days_since_signup, month, dayofweek
   • income (3 outliers >3σ) → winsorize at 99th percentile
   • country (47 categories) → target encoding or top-N

⚠️  Class imbalance (12% positive)
   • Don't optimize accuracy — use AUC, F1, or recall@k
   • Consider class_weight='balanced' or focal loss

Generate a baseline notebook? [y/N]
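
The feature-engineering suggestions in the sample output can be tried by hand before committing to them. The following is a minimal sketch using only the standard library; the helper names `date_features` and `winsorize_upper` are hypothetical and not part of mlcompass, and the nearest-rank percentile is one assumed definition among several.

```python
import math
from datetime import date


def date_features(signup: date, today: date) -> dict:
    """Derive the three signup_date features named in the sample output."""
    return {
        "days_since_signup": (today - signup).days,
        "month": signup.month,
        "dayofweek": signup.weekday(),  # Monday = 0, Sunday = 6
    }


def winsorize_upper(values: list[float], pct: float = 99.0) -> list[float]:
    """Cap every value above the pct-th percentile (nearest-rank definition)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))  # 1-based rank
    cap = ordered[rank - 1]
    return [min(v, cap) for v in values]


# Illustrative data only: 99 ordinary incomes plus one extreme outlier.
features = date_features(date(2024, 1, 15), today=date(2024, 3, 1))
incomes = [float(i) for i in range(1, 100)] + [10_000.0]
capped = winsorize_upper(incomes)
```

Capping rather than dropping keeps the row count unchanged, which matters when only a handful of rows are outliers.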

Quick example — `watch` mode (Phase 2)

mlcompass watch train.py

After 8 epochs:

⚠️  Epoch 8 — overfitting detected
   Train loss: 0.118  |  Val loss: 0.387  (gap 0.27, normal <0.1)

   Likely cause: regularization is too weak for the model capacity.

   Suggested fix: increase dropout 0.1 → 0.3
   Apply and restart training? [y/N]
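
The kind of check behind a warning like this can be approximated with a few lines of Python. This is a sketch under stated assumptions, not mlcompass's actual detector: the function name `diagnose`, the `patience` and `min_delta` parameters, and the plateau rule are all hypothetical; only the 0.1 gap limit comes from the sample output above.

```python
import math


def diagnose(train_losses, val_losses, gap_limit=0.1, patience=3, min_delta=1e-3):
    """Return warning labels for the latest epoch (illustrative heuristics)."""
    train, val = train_losses[-1], val_losses[-1]
    if math.isnan(train) or math.isnan(val):
        return ["nan"]  # nothing else is meaningful once the loss is NaN

    warnings = []
    # Overfit: validation loss sits well above training loss.
    if val - train > gap_limit:
        warnings.append("overfit")

    # Plateau: the best validation loss of the last `patience` epochs is not
    # at least `min_delta` better than the best loss seen before them.
    if len(val_losses) > patience:
        best_before = min(val_losses[:-patience])
        best_recent = min(val_losses[-patience:])
        if best_before - best_recent < min_delta:
            warnings.append("plateau")
    return warnings


train = [0.9, 0.5, 0.3, 0.2, 0.15, 0.13, 0.12, 0.118]
val = [0.95, 0.6, 0.45, 0.40, 0.39, 0.388, 0.3875, 0.387]
result = diagnose(train, val)
```

With the epoch 8 losses from the example, the gap of about 0.27 exceeds the 0.1 limit, so only the overfit warning fires.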

Why mlcompass

The ML ecosystem already has great tools — but each owns one slice of the pipeline, and none of them advise:

| Capability | pandas-profiling | W&B / TensorBoard | Cursor / Devin | mlcompass |
| --- | --- | --- | --- | --- |
| Analyzes raw data | ✅ | ❌ | ❌ | ✅ |
| Recommends models + features | ❌ | ❌ | partial | ✅ |
| Audits training scripts | ❌ | ❌ | reactive | ✅ |
| Watches training in real time | ❌ | dashboard | ❌ | ✅ |
| Diagnoses problems proactively | ❌ | ❌ | reactive | ✅ |
| Post-training evaluation advice | ❌ | basic | ❌ | ✅ |
| Deployment readiness check | ❌ | ❌ | ❌ | ✅ |
| Persistent project memory | ❌ | per-run | ❌ | ✅ |
| Permission-gated actions | ❌ | ❌ | partial | first-class |

mlcompass is the advisor that sits next to all of these tools — not a replacement for any.

Install

pip install mlcompass
export ANTHROPIC_API_KEY="sk-ant-..."

Usage

# Start a project
mlcompass init my-project

# Pre-training
mlcompass advise data.csv --target label

# Training-time          (Phase 2)
mlcompass audit train.py
mlcompass watch train.py
mlcompass compare run-3 run-7

# Post-training          (Phase 3)
mlcompass evaluate results.csv

# Deployment             (Phase 4)
mlcompass deploy --target sagemaker

How it works

Built on agentlite — a small Claude agent library — mlcompass uses one orchestrator agent per command, plus focused sub-agents for sub-tasks:

       cli.py
         │
   ┌─────┴─────┐
   ▼           ▼
 advise      watch                ... deploy
 agent       agent
   │           │
   ▼           ▼
 ModelAdvisor  MetricsWatcher (Haiku, polls)
  (Opus)       Diagnostician  (Opus, called on anomaly)
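
The split in the diagram, a cheap agent that polls on every step and an expensive one that is called only on an anomaly, can be sketched as plain Python. This does not use agentlite's API, which is not shown here; `watch`, `cheap_check`, and `costly_explain` are hypothetical stand-ins for the orchestrator, MetricsWatcher, and Diagnostician.

```python
from typing import Callable, Iterable


def watch(
    metrics: Iterable[dict],
    looks_anomalous: Callable[[dict], bool],  # stand-in for the cheap poller
    explain: Callable[[dict], str],           # stand-in for the costly diagnostician
) -> list[tuple[int, str]]:
    """Run the cheap check every epoch; escalate only when it flags something."""
    reports = []
    for epoch, m in enumerate(metrics, start=1):
        if looks_anomalous(m):
            reports.append((epoch, explain(m)))
    return reports


calls = {"explain": 0}  # counts how often the expensive path runs


def cheap_check(m: dict) -> bool:
    return m["val_loss"] - m["train_loss"] > 0.1


def costly_explain(m: dict) -> str:
    calls["explain"] += 1
    return f"gap {m['val_loss'] - m['train_loss']:.2f}"


history = [
    {"train_loss": 0.40, "val_loss": 0.45},
    {"train_loss": 0.20, "val_loss": 0.28},
    {"train_loss": 0.118, "val_loss": 0.387},
]
reports = watch(history, cheap_check, costly_explain)
```

The point of the design is cost: the expensive call happens once for three epochs here, not three times.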

Every action that would modify your code, config, or run a training process asks permission first — agentlite's permission system is first-class, not an afterthought.
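
A permission gate of this kind can be pictured as a small wrapper that runs an action only after an explicit yes. The sketch below is an assumption about the general pattern, not agentlite's implementation; `gated` and its `ask` parameter are hypothetical names, and `ask` defaults to the built-in `input` so the prompt can be replaced in tests.

```python
from typing import Callable


def gated(
    description: str,
    action: Callable[[], object],
    ask: Callable[[str], str] = input,
) -> tuple[bool, object]:
    """Run `action` only if the user answers y or yes; anything else means No."""
    answer = ask(f"{description} [y/N] ").strip().lower()
    if answer in ("y", "yes"):
        return True, action()
    return False, None


config = {"dropout": 0.1}


def raise_dropout() -> float:
    config["dropout"] = 0.3
    return config["dropout"]


# Simulated answers keep the example non-interactive.
declined = gated("Increase dropout 0.1 -> 0.3?", raise_dropout, ask=lambda _: "")
after_decline = config["dropout"]
accepted = gated("Increase dropout 0.1 -> 0.3?", raise_dropout, ask=lambda _: "y")
```

Defaulting to No on an empty answer matches the `[y/N]` prompts in the examples above.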

See ARCHITECTURE.md for the full design.

Project context

Each mlcompass project keeps a small folder, similar in spirit to .git/:

.mlcompass/
├── project.yaml        # metadata
├── context.json        # decisions, recommendations, active state
├── datasets/           # registered datasets
└── runs/               # training run history

This is what makes mlcompass more than a chat tool: by the time you run deploy, every earlier decision is still on record in .mlcompass/ and available to the agent.
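
How a shared context folder lets a later command see earlier decisions can be sketched with `pathlib` and `json`. The folder and file names follow the layout above, but the JSON schema (a single `decisions` list) and the helpers `init_project`, `record`, and `history` are assumptions for illustration; `project.yaml` is left out because the standard library has no YAML writer.

```python
import json
import tempfile
from pathlib import Path


def init_project(root: Path) -> Path:
    """Create the .mlcompass/ layout with an empty context file."""
    ctx = root / ".mlcompass"
    (ctx / "datasets").mkdir(parents=True, exist_ok=True)
    (ctx / "runs").mkdir(exist_ok=True)
    context_file = ctx / "context.json"
    if not context_file.exists():
        context_file.write_text(json.dumps({"decisions": []}, indent=2))
    return ctx


def record(ctx: Path, command: str, note: str) -> None:
    """Append one decision so later commands can read it back."""
    path = ctx / "context.json"
    state = json.loads(path.read_text())
    state["decisions"].append({"command": command, "note": note})
    path.write_text(json.dumps(state, indent=2))


def history(ctx: Path) -> list[dict]:
    return json.loads((ctx / "context.json").read_text())["decisions"]


with tempfile.TemporaryDirectory() as tmp:
    ctx = init_project(Path(tmp))
    record(ctx, "advise", "baseline: gradient-boosted trees")
    record(ctx, "evaluate", "threshold tuned for recall")
    seen_by_deploy = history(ctx)  # what a later command would load
    layout = sorted(p.name for p in ctx.iterdir())
```

Because the state lives in files rather than in a chat session, it survives between separate CLI invocations.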

Roadmap

| Phase | Commands | Status |
| --- | --- | --- |
| Phase 1 (v0.1) | `init`, `advise` | 🚧 In progress |
| Phase 2 (v0.2) | `audit`, `watch`, `compare` | 📅 Planned |
| Phase 3 (v0.3) | `evaluate` | 📅 Planned |
| Phase 4 (v0.4) | `deploy` | 📅 Planned |

See CHANGELOG.md for detailed plans and ARCHITECTURE.md for the design.

Non-goals

To stay focused, mlcompass will not try to be:

- An AutoML system (use AutoGluon, AutoSklearn)
- An experiment tracker (use MLflow, W&B)
- A code assistant (use Cursor, Copilot, aider)
- A monitoring dashboard (use Grafana, Streamlit)

mlcompass advises; you decide.

Contributing

Pre-alpha — issues and discussions welcome, PRs after v0.1.

License
