mlcompass
An LLM agent that sits next to you through your whole ML pipeline: from data, through training, all the way to deployment.
🚧 Pre-alpha (v0.0.1): under active development. APIs will change before v0.1.
What it does
mlcompass is a single CLI that follows your ML project from start to finish, keeping context across every step.
    data.csv        train.py          results.csv     production
       │               │                  │               │
       ▼               ▼                  ▼               ▼
     advise ────► audit + watch ────► evaluate ────► deploy
                    compare
Each command writes to and reads from a shared project context
(.mlcompass/), so by the time you reach deploy, the tool already
knows your dataset, your model choice, your training history, and
your evaluation results.
Seven commands, one tool
| Command | When you run it | What you get |
|---|---|---|
| init | Starting a new project | A .mlcompass/ folder that tracks decisions |
| advise | You have a CSV, what now? | Models to try, features to derive, pitfalls to avoid |
| audit | Before you press train | Static analysis of training script (seed, val, etc.) |
| watch | While training runs | Live plateau / overfit / NaN detection |
| compare | After several runs | Hypothesis-driven diff between two runs |
| evaluate | Training done | Threshold tuning, confusion matrix, hard examples |
| deploy | Going to production | Latency estimate, dependency check, ONNX advice |
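To make the audit row concrete, here is a minimal sketch of one static check such a command could run: scanning a training script for seed-setting calls with Python's standard `ast` module. This is not mlcompass's actual implementation; the `SEED_CALLS` rule set and the `find_seed_calls` helper are assumptions made for illustration.

```python
import ast

# Hypothetical rule set: call names that usually indicate a fixed random seed.
SEED_CALLS = {
    "random.seed",
    "np.random.seed",
    "numpy.random.seed",
    "torch.manual_seed",
    "tf.random.set_seed",
}


def dotted_name(node: ast.AST) -> str:
    """Return 'a.b.c' for Name/Attribute chains, or '' for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""


def find_seed_calls(source: str) -> list[str]:
    """List the seed-setting calls found in a script's source code."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in SEED_CALLS:
                found.append(name)
    return found


# The script is only parsed, never executed, so undefined names are fine.
script = """
import torch
torch.manual_seed(42)
model = build_model()
"""
print(find_seed_calls(script))  # ['torch.manual_seed']
```

An empty result would be the signal to warn that the run may not be reproducible. Because the check works on the syntax tree, it never imports or runs the training code.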
Quick example: advise mode
mlcompass init churn-project
mlcompass advise data/customers.csv --target churn
Output:
📊 Dataset analysis (data/customers.csv)
• 10,000 rows × 23 columns
• Target: churn (binary, 12% positive)
• 4 categorical, 18 numerical, 1 datetime
• 3 columns with >50% missing values (consider dropping)
💡 Recommended models
1. XGBoost / LightGBM: tabular binary baseline
expected AUC: 0.82–0.87
2. Logistic Regression: interpretable baseline
expected AUC: 0.76–0.80
3. FT-Transformer: if GPU budget allows
expected AUC: 0.83–0.86
🔧 Suggested feature engineering
• signup_date → derive days_since_signup, month, dayofweek
• income (3 outliers >3σ) → winsorize at 99th percentile
• country (47 categories) → target encoding or top-N
⚠️ Class imbalance (12% positive)
• Don't optimize accuracy; use AUC, F1, or recall@k
• Consider class_weight='balanced' or focal loss
Generate a baseline notebook? [y/N]
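The dataset summary above rests on simple statistics. As a rough sketch of how the positive-class rate and the "mostly missing" columns could be computed, the following uses only the standard library. The `profile_rows` function, its parameters, and the inline sample data are invented for illustration and are not part of mlcompass.

```python
import csv
import io


def profile_rows(rows: list[dict], target: str, positive: str = "1") -> dict:
    """Compute the positive-class rate and list columns that are >50% empty."""
    n = len(rows)
    if n == 0:
        raise ValueError("no rows to profile")
    positives = sum(1 for r in rows if r[target] == positive)
    # Fraction of empty cells per column; None and whitespace count as missing.
    missing = {
        col: sum(1 for r in rows if (r[col] or "").strip() == "") / n
        for col in rows[0]
    }
    return {
        "rows": n,
        "positive_rate": positives / n,
        "mostly_missing": sorted(c for c, frac in missing.items() if frac > 0.5),
    }


# Tiny inline sample standing in for data/customers.csv (made-up values).
sample = io.StringIO(
    "age,coupon,churn\n"
    "34,,0\n"
    "51,,0\n"
    "29,SAVE10,1\n"
    "42,,0\n"
)
report = profile_rows(list(csv.DictReader(sample)), target="churn")
print(report)
```

A low `positive_rate` is what would trigger the class-imbalance warning, and `mostly_missing` corresponds to the "consider dropping" hint.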
Quick example: watch mode (Phase 2)
mlcompass watch train.py
After 8 epochs:
⚠️ Epoch 8: overfitting detected
Train loss: 0.118 | Val loss: 0.387 (gap 0.27, normal <0.1)
Likely cause: regularization is too weak for the model capacity.
Suggested fix: increase dropout 0.1 → 0.3
Apply and restart training? [y/N]
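The checks behind a message like this can be quite small. Below is a minimal sketch of plateau, overfit, and NaN detection over per-epoch loss histories. The 0.1 gap limit comes from the example output above; the `diagnose` function, `patience`, and `min_delta` are illustrative assumptions, not the tool's real logic.

```python
import math


def diagnose(train_loss: list[float], val_loss: list[float],
             gap_limit: float = 0.1, patience: int = 3,
             min_delta: float = 1e-3) -> str:
    """Classify the latest epoch as 'nan', 'overfit', 'plateau', or 'ok'."""
    latest_train, latest_val = train_loss[-1], val_loss[-1]
    # NaN or infinite losses take priority over every other diagnosis.
    if not (math.isfinite(latest_train) and math.isfinite(latest_val)):
        return "nan"
    # Overfit: validation loss sits well above training loss.
    if latest_val - latest_train > gap_limit:
        return "overfit"
    # Plateau: the last `patience` epochs did not beat the earlier best.
    if len(val_loss) > patience:
        best_before = min(val_loss[:-patience])
        best_recent = min(val_loss[-patience:])
        if best_before - best_recent < min_delta:
            return "plateau"
    return "ok"


# Final values mirror the epoch 8 example: gap = 0.387 - 0.118 = 0.269.
print(diagnose([0.50, 0.30, 0.118], [0.52, 0.40, 0.387]))  # overfit
```

A fixed gap threshold is the simplest possible rule; a real watcher would likely scale it to the loss magnitude or compare against earlier runs.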
Why mlcompass
The ML ecosystem already has great tools, but each owns one slice of the pipeline, and none of them advise:
| | pandas-profiling | W&B / TensorBoard | Cursor / Devin | mlcompass |
|---|---|---|---|---|
| Analyzes raw data | ✅ | ❌ | ❌ | ✅ |
| Recommends models + features | ❌ | ❌ | partial | ✅ |
| Audits training scripts | ❌ | ❌ | reactive | ✅ |
| Watches training in real time | ❌ | dashboard | ❌ | ✅ |
| Diagnoses problems proactively | ❌ | ❌ | reactive | ✅ |
| Post-training evaluation advice | ❌ | basic | ❌ | ✅ |
| Deployment readiness check | ❌ | ❌ | ❌ | ✅ |
| Persistent project memory | ❌ | per-run | ❌ | ✅ |
| Permission-gated actions | ❌ | ❌ | partial | first-class |
mlcompass is the advisor that sits next to all of these tools, not a replacement for any.
Install
pip install mlcompass
export ANTHROPIC_API_KEY="sk-ant-..."
Usage
# Start a project
mlcompass init my-project
# Pre-training
mlcompass advise data.csv --target label
# Training-time (Phase 2)
mlcompass audit train.py
mlcompass watch train.py
mlcompass compare run-3 run-7
# Post-training (Phase 3)
mlcompass evaluate results.csv
# Deployment (Phase 4)
mlcompass deploy --target sagemaker
How it works
Built on agentlite, a small Claude agent library, mlcompass uses one orchestrator agent per command, plus focused sub-agents for sub-tasks:
                cli.py
                  │
            ┌─────┴─────┐
            ▼           ▼
         advise       watch     ...  deploy
         agent        agent
            │           │
            ▼           ▼
      ModelAdvisor   MetricsWatcher (Haiku, polls)
         (Opus)      Diagnostician  (Opus, called on anomaly)
Every action that would modify your code or config, or start a training process, asks permission first; agentlite's permission system is first-class, not an afterthought.
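The permission-gated pattern can be sketched in a few lines. This is not agentlite's actual API, which is not shown here; `run_with_permission` and its `ask` parameter are hypothetical names used only to illustrate a default-deny prompt like the `[y/N]` questions in the examples above.

```python
from typing import Callable


def run_with_permission(description: str, action: Callable[[], str],
                        ask: Callable[[str], str] = input) -> str:
    """Run `action` only if the user answers 'y' or 'yes'; default is no."""
    answer = ask(f"{description} [y/N] ").strip().lower()
    if answer not in {"y", "yes"}:
        return "skipped"
    return action()


# Demo with a scripted answer instead of real keyboard input.
result = run_with_permission(
    "Increase dropout 0.1 -> 0.3 and restart training?",
    action=lambda: "applied",
    ask=lambda prompt: "n",
)
print(result)  # skipped
```

Passing the prompt function in as `ask` keeps the gate testable and lets a non-interactive caller plug in its own policy. Anything other than an explicit yes, including an empty answer, leaves the project untouched.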
See ARCHITECTURE.md for the full design.
Project context
Each mlcompass project keeps a small folder, similar in spirit to
.git/:
.mlcompass/
├── project.yaml   # metadata
├── context.json   # decisions, recommendations, active state
├── datasets/      # registered datasets
└── runs/          # training run history
This is what makes mlcompass more than a chat tool: by the time you
run deploy, every earlier decision is still in memory.
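As a rough illustration of how commands could share state through this folder, the sketch below merges each step's decision into `context.json`. The file name comes from the layout above, but the schema (one top-level key per command) and the `record_decision` helper are assumptions, not the project's documented format.

```python
import json
import tempfile
from pathlib import Path


def record_decision(project_dir: Path, step: str, decision: dict) -> dict:
    """Merge one step's decision into .mlcompass/context.json and return it."""
    context_path = project_dir / ".mlcompass" / "context.json"
    context_path.parent.mkdir(parents=True, exist_ok=True)
    # Start from the existing context so earlier steps are preserved.
    context = json.loads(context_path.read_text()) if context_path.exists() else {}
    context[step] = decision
    context_path.write_text(json.dumps(context, indent=2))
    return context


# Use a temporary directory so the demo leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    record_decision(root, "advise", {"model": "lightgbm", "metric": "auc"})
    final = record_decision(root, "evaluate", {"threshold": 0.35})

print(sorted(final))  # ['advise', 'evaluate']
```

The second call still sees the `advise` entry because every write starts from what is already on disk. That is the property a later command such as deploy would rely on.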
Roadmap
| Phase | Commands | Status |
|---|---|---|
| Phase 1 (v0.1) | init, advise | 🚧 In progress |
| Phase 2 (v0.2) | audit, watch, compare | 📋 Planned |
| Phase 3 (v0.3) | evaluate | 📋 Planned |
| Phase 4 (v0.4) | deploy | 📋 Planned |
See CHANGELOG.md for detailed plans and ARCHITECTURE.md for the design.
Non-goals
To stay focused, mlcompass will not try to be:
- AutoML (use AutoGluon, AutoSklearn)
- Experiment tracker (use MLflow, W&B)
- Code assistant (use Cursor, Copilot, aider)
- Monitoring dashboard (use Grafana, Streamlit)
mlcompass advises; you decide.
Contributing
Pre-alpha: issues and discussions are welcome; PRs will be accepted after v0.1.
License
MIT © 2026 Hakan Sabunis