Frictionless hierarchical text classification — an MCP server that guides you from CSV to trained classifier
PrometheusNN
Build a hierarchical text classifier without writing code. Point Claude (or any MCP client) at your data and it walks you from "I have a CSV" to a trained model — profiling your columns, detecting whether the problem fits, surfacing concerns before training, and recommending the right setup.
PrometheusNN is the ML expert sitting next to you. You bring the data and the goal; it handles the rest.
What this looks like
You: I have a CSV of product descriptions and I want to categorize them.
Can you help?
Claude: [calls start_here on the file]
I looked at your data — 11,400 product descriptions across three
nested category levels (Chapter → Heading → Subheading). One thing
worth flagging before we train: the deepest level has ~2 examples
per class on average, which usually means poor accuracy at that
depth. Want to do all three levels, or simplify to the top two?
You: Top two.
Claude: [fires analyze_training_data → train_model with the simpler setup]
Done. Chapter accuracy 92%, Heading accuracy 78%. Want to try
classifying something?
No column-name guessing, no parameter cargo-culting, no premature training runs.
Install
pip install prometheusnn
Then add to your MCP client config (Claude Desktop, Cursor, Zed, etc.):
{
  "mcpServers": {
    "prometheus": {
      "command": "prometheusnn"
    }
  }
}
That's it. Open your MCP client and start a conversation about classifying something.
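If you would rather script the config change than edit JSON by hand, here is a minimal sketch. It assumes your client reads a JSON file with a top-level "mcpServers" key, as in the snippet above; the function name add_prometheus_server is made up for this example, and the file location differs per client, so you pass the path yourself.

```python
import json
from pathlib import Path


def add_prometheus_server(config_path: Path) -> dict:
    """Merge the PrometheusNN entry into an MCP client config file.

    Hypothetical helper: config_path is whatever file your client reads.
    """
    # Load the existing config if there is one; otherwise start empty.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    # Keep any servers that are already configured.
    servers = config.setdefault("mcpServers", {})
    servers["prometheus"] = {"command": "prometheusnn"}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Merging into the existing "mcpServers" object, instead of overwriting the file, leaves any other servers you have configured untouched.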
How it works under the hood
Two entry-point tools handle every new conversation:
- start_here(file_path): profiles your CSV, detects whether it's a hierarchical, flat, or multi-label classification problem, surfaces concerns before training, and returns pre-filled next steps so the LLM never has to guess column names.
- scope_problem(goal_description): for when you don't have data yet. Tells you whether Prometheus fits your goal, what data you'll need, and what to watch for. Politely redirects non-fits (regression, clustering, NER) to the right tools.
From there, the LLM routes through the existing tool surface (training, evaluation, threshold tuning, prediction) based on what start_here recommended.
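To give a feel for the kind of check that sits behind a concern like "~2 examples per class on average", here is a rough sketch of a per-level sparsity profile. It is an illustration only, not the start_here implementation; the function name, the column names, and the sample rows are all made up.

```python
import csv
import io
from collections import Counter


def examples_per_class(rows, level_columns):
    """Return the mean number of rows per distinct label for each level."""
    stats = {}
    for column in level_columns:
        # Count rows per label, skipping empty cells.
        counts = Counter(row[column] for row in rows if row[column])
        stats[column] = sum(counts.values()) / len(counts) if counts else 0.0
    return stats


# Tiny inline CSV standing in for a real file; column names are invented.
SAMPLE = """description,chapter,heading
steel bolts,Metals,Fasteners
brass screws,Metals,Fasteners
copper wire,Metals,Wire
cotton shirt,Textiles,Apparel
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
stats = examples_per_class(rows, ["chapter", "heading"])
print(stats)
```

A low average at a deep level is the signal that would prompt the "simplify to fewer levels?" question in the conversation above.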
What's actually being built
Under the conversational surface, PrometheusNN trains a cascade of neural network classifiers — one per level of your taxonomy — with parent-noise injection so deeper levels stay robust when upper levels are uncertain. Beam search explores alternative paths when confidence drops. Temperature calibration produces probabilities you can trust for routing. Dual-signal novelty detection catches items that don't belong.
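As a rough illustration of how temperature calibration and threshold routing fit together, here is a small sketch in plain Python. It is not PrometheusNN's code: the function names, the threshold values (0.85 and 0.50), and the temperature of 2.0 are assumptions chosen for the example. In practice the temperature is usually fitted on held-out data, not picked by hand.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature; temperature > 1 flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def route(confidence, accept_at=0.85, review_at=0.50):
    """Map a calibrated confidence to accept / review / reject (thresholds are hypothetical)."""
    if confidence >= accept_at:
        return "accept"
    if confidence >= review_at:
        return "review"
    return "reject"


logits = [4.0, 1.0, 0.5]
raw = softmax_with_temperature(logits, temperature=1.0)
calibrated = softmax_with_temperature(logits, temperature=2.0)
print(route(max(raw)), route(max(calibrated)))
```

With these toy numbers the uncalibrated top probability clears the accept threshold, while the temperature-scaled one lands in the review band, which is the effect calibration is meant to have on an overconfident model.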
Text descriptions + taxonomy labels
│
▼
┌─────────────┐
│ Embedding │ sentence-transformers (384d or 768d)
└──────┬──────┘
▼
┌─────────────┐
│ Cascade │ N classifiers, one per level
│ Classifier │ parent noise injection for robustness
└──────┬──────┘
▼
┌─────────────┐
│ Beam Search │ adaptive widening when confidence drops
└──────┬──────┘
▼
┌─────────────┐
│ Router │ calibrated thresholds → accept / review / reject
│ + Novelty │ centroid z-score + kNN distance
└─────────────┘
You don't need to know any of this to use it — but it's there if you want to dig in.
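If you do want to dig in, here is a toy sketch of beam search over a cascade with adaptive widening. It only illustrates the general technique: the function signature, the width settings, and the 0.6 confidence cutoff are assumptions, and the per-level "classifiers" are hard-coded lookup tables in place of neural networks.

```python
import math


def beam_search(level_probs, base_width=1, wide_width=3, min_confidence=0.6):
    """Return label paths ranked by joint probability.

    level_probs is a list of functions, one per level; each takes the path
    so far (a tuple of labels) and returns {label: probability}.
    """
    beams = [((), 0.0)]  # (path, cumulative log-probability)
    for probs_for in level_probs:
        candidates = []
        uncertain = False
        for path, score in beams:
            probs = probs_for(path)
            # Widen the beam if any surviving path has a weak top prediction.
            if max(probs.values()) < min_confidence:
                uncertain = True
            for label, p in probs.items():
                if p > 0:
                    candidates.append((path + (label,), score + math.log(p)))
        candidates.sort(key=lambda item: item[1], reverse=True)
        width = wide_width if uncertain else base_width
        beams = candidates[:width]
    return [(path, math.exp(score)) for path, score in beams]


def chapter(path):
    # Top level is nearly a coin flip, so a greedy pick is risky.
    return {"Metals": 0.55, "Textiles": 0.45}


def heading(path):
    table = {
        ("Metals",): {"Fasteners": 0.5, "Wire": 0.5},
        ("Textiles",): {"Apparel": 0.9, "Yarn": 0.1},
    }
    return table[path]


results = beam_search([chapter, heading])
greedy = beam_search([chapter, heading], wide_width=1)
print(results[0], greedy[0])
```

In this toy case the greedy search commits to "Metals" at the first level and ends on a 0.275 path, while the widened beam keeps "Textiles" alive and finds the higher-probability path through "Apparel".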
Tool reference
20 MCP tools in total. You only need to know the two entry points; the LLM picks the rest for you.
Entry points (start here)
- start_here: assess a file, detect problem type, return pre-filled next steps
- scope_problem: assess a goal description (no data yet), check fit and data requirements
Training & data
analyze_training_data, train_model, resume_training
Inference
predict, predict_batch, classify_with_context
Model management
list_models, describe_model, delete_model, export_model
Evaluation & tuning
evaluate_model, explain_prediction, get_confusion_matrix, get_threshold_report, set_thresholds
Other
submit_feedback, list_embedding_models, build_code_mapping
Environment variables
| Variable | Default | Description |
|---|---|---|
| PROMETHEUS_HOME | ~/.prometheus | Base directory for models, logs, feedback |
| PROMETHEUS_EMBEDDING_MODEL | all-MiniLM-L6-v2 | Default sentence-transformer model |
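For reference, this is roughly how the two variables and their documented defaults could be resolved in Python. It is a sketch, not the package's actual loader; load_settings and the returned keys are invented names.

```python
import os
from pathlib import Path


def load_settings(env=None):
    """Resolve the two documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    # Expand "~" so the default points at the current user's home directory.
    home = Path(env.get("PROMETHEUS_HOME", "~/.prometheus")).expanduser()
    model = env.get("PROMETHEUS_EMBEDDING_MODEL", "all-MiniLM-L6-v2")
    return {"home": home, "embedding_model": model}


print(load_settings({"PROMETHEUS_HOME": "/tmp/prometheus"}))
```

Passing a plain dict instead of os.environ, as in the last line, makes it easy to try overrides without touching your shell.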
Development
git clone https://github.com/mfbaig35r/prometheusnn
cd prometheusnn
uv sync --extra dev
uv run python -m pytest tests/ -v --tb=short # 179 tests
uv run ruff check src/ tests/ # lint
uv run ruff format --check src/ tests/ # format
License
MIT