Frictionless hierarchical text classification — an MCP server that guides you from CSV to trained classifier

PrometheusNN

Build a hierarchical text classifier without writing code. Point Claude (or any MCP client) at your data and it walks you from "I have a CSV" to a trained model — profiling your columns, detecting whether the problem fits, surfacing concerns before training, and recommending the right setup.

PrometheusNN is the ML expert sitting next to you. You bring the data and the goal; it handles the rest.

What this looks like

You:    I have a CSV of product descriptions and I want to categorize them.
        Can you help?

Claude: [calls start_here on the file]

        I looked at your data — 11,400 product descriptions across three
        nested category levels (Chapter → Heading → Subheading). One thing
        worth flagging before we train: the deepest level has ~2 examples
        per class on average, which usually means poor accuracy at that
        depth. Want to do all three levels, or simplify to the top two?

You:    Top two.

Claude: [fires analyze_training_data → train_model with the simpler setup]

        Done. Chapter accuracy 92%, Heading accuracy 78%. Want to try
        classifying something?

No column-name guessing, no parameter cargo-culting, no premature training runs.
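
The concern raised in the exchange above comes down to simple arithmetic: rows divided by distinct classes at each level. With 11,400 rows and about 2 examples per class, the deepest level would have on the order of 5,700 classes. The following is a minimal sketch of that check, not PrometheusNN's own profiling code; the function names and the `min_avg` cutoff are hypothetical.

```python
from collections import Counter

def avg_examples_per_class(rows, level_columns):
    """Return {column: mean number of rows per distinct label} for each level."""
    if not rows:
        raise ValueError("need at least one row")
    report = {}
    for col in level_columns:
        counts = Counter(row[col] for row in rows)
        report[col] = len(rows) / len(counts)
    return report

def sparse_levels(report, min_avg=5.0):
    """List the levels whose average support is below min_avg (hypothetical cutoff)."""
    return [col for col, avg in report.items() if avg < min_avg]

# Toy data: 12 rows, 2 chapters, 4 headings, 12 subheadings.
rows = [
    {"chapter": c, "heading": c + h, "subheading": c + h + s}
    for c in "AB" for h in "12" for s in "abc"
]
report = avg_examples_per_class(rows, ["chapter", "heading", "subheading"])
print(report)                              # {'chapter': 6.0, 'heading': 3.0, 'subheading': 1.0}
print(sparse_levels(report, min_avg=2.0))  # ['subheading']
```

A level that shows up in `sparse_levels` is a candidate for dropping, which is the choice offered in the conversation ("all three levels, or simplify to the top two").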

Install

pip install prometheusnn

Then add the server to your MCP client config (Claude Desktop, Cursor, Zed, etc.):

{
  "mcpServers": {
    "prometheus": {
      "command": "prometheusnn"
    }
  }
}

That's it. Open your MCP client and start a conversation about classifying something.
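
If your client config already lists other servers, the entry has to be merged into the existing "mcpServers" object rather than pasted over it. Here is a small sketch of that merge. It works on the JSON text only, because the config file location differs per client and is not specified here; the helper name is hypothetical.

```python
import json

def add_prometheus_server(config_text: str) -> str:
    """Merge the PrometheusNN entry into existing MCP client config JSON."""
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    # Same entry as in the config snippet; other servers are left untouched.
    servers["prometheus"] = {"command": "prometheusnn"}
    return json.dumps(config, indent=2)

existing = '{"mcpServers": {"other": {"command": "other-server"}}}'
merged = json.loads(add_prometheus_server(existing))
print(sorted(merged["mcpServers"]))  # ['other', 'prometheus']
```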

How it works under the hood

Two entry-point tools handle every new conversation:

  • start_here(file_path) — profiles your CSV, detects whether it's a hierarchical, flat, or multi-label classification problem, surfaces concerns before training, and returns pre-filled next steps so the LLM never has to guess column names.

  • scope_problem(goal_description) — for when you don't have data yet. Tells you whether Prometheus fits your goal, what data you'll need, and what to watch for. Politely redirects non-fits (regression, clustering, NER) to the right tools.

From there, the LLM routes through the rest of the tool surface (training, evaluation, threshold tuning, prediction) based on what start_here recommended.
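
To picture what "detecting the problem type" can mean, here is one illustrative heuristic: label columns form a hierarchy when every child label appears under exactly one parent. This is a sketch under stated assumptions, not the logic start_here actually uses. The function names and the ";" delimiter convention for multi-label cells are hypothetical.

```python
from collections import defaultdict

def is_nested(rows, parent_col, child_col):
    """True if every child label appears under exactly one parent label."""
    parents_of = defaultdict(set)
    for row in rows:
        parents_of[row[child_col]].add(row[parent_col])
    return all(len(parents) == 1 for parents in parents_of.values())

def detect_problem_type(rows, label_cols, delimiter=";"):
    """Classify a labelled table as 'multi-label', 'hierarchical', or 'flat'."""
    # Multi-label: some cell packs several labels (hypothetical convention).
    if any(delimiter in row[col] for row in rows for col in label_cols):
        return "multi-label"
    # Hierarchical: two or more label columns, each nested inside the previous.
    if len(label_cols) >= 2 and all(
        is_nested(rows, parent, child)
        for parent, child in zip(label_cols, label_cols[1:])
    ):
        return "hierarchical"
    return "flat"

nested = [
    {"chapter": "84", "heading": "8471"},
    {"chapter": "84", "heading": "8473"},
    {"chapter": "85", "heading": "8501"},
]
crossed = [{"color": "red", "size": "S"}, {"color": "blue", "size": "S"}]
tagged = [{"tags": "toys;outdoor"}, {"tags": "toys"}]

print(detect_problem_type(nested, ["chapter", "heading"]))  # hierarchical
print(detect_problem_type(crossed, ["color", "size"]))      # flat
print(detect_problem_type(tagged, ["tags"]))                # multi-label
```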

What's actually being built

Under the conversational surface, PrometheusNN trains a cascade of neural network classifiers — one per level of your taxonomy — with parent-noise injection so deeper levels stay robust when upper levels are uncertain. Beam search explores alternative paths when confidence drops. Temperature calibration produces probabilities you can trust for routing. Dual-signal novelty detection catches items that don't belong.
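
Parent-noise injection is named above but not specified in detail. One plausible reading: while training the classifier for a deeper level, the parent label fed in as a feature is occasionally replaced with a wrong one, so the child classifier learns not to trust the parent blindly. The sketch below illustrates that idea only. The one-hot encoding, the `noise_rate` value, and the function names are assumptions, not PrometheusNN's implementation.

```python
import random

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def noisy_parent(parent_index, num_parents, noise_rate=0.1, rng=random):
    """One-hot parent vector; with probability noise_rate a wrong parent is used."""
    if num_parents > 1 and rng.random() < noise_rate:
        wrong = [i for i in range(num_parents) if i != parent_index]
        parent_index = rng.choice(wrong)
    return one_hot(parent_index, num_parents)

def child_input(embedding, parent_index, num_parents, noise_rate=0.1, rng=random):
    """Concatenate the text embedding with the (possibly corrupted) parent vector."""
    return list(embedding) + noisy_parent(parent_index, num_parents, noise_rate, rng)

# Training-time feature for a level-2 classifier: 3-d toy embedding + 4 parents.
features = child_input([0.1, 0.2, 0.3], parent_index=1, num_parents=4,
                       rng=random.Random(0))
print(len(features))  # 7
```

At inference time the same concatenation would use the upper level's prediction with `noise_rate=0.0`.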

Text descriptions + taxonomy labels
        │
        ▼
  ┌─────────────┐
  │  Embedding  │  sentence-transformers (384d or 768d)
  └──────┬──────┘
         ▼
  ┌─────────────┐
  │   Cascade   │  N classifiers, one per level
  │  Classifier │  parent noise injection for robustness
  └──────┬──────┘
         ▼
  ┌─────────────┐
  │ Beam Search │  adaptive widening when confidence drops
  └──────┬──────┘
         ▼
  ┌─────────────┐
  │   Router    │  calibrated thresholds → accept / review / reject
  │  + Novelty  │  centroid z-score + kNN distance
  └─────────────┘
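
The beam search stage in the diagram can be sketched as follows. This is an illustration, not the library's code: each level is modelled as a callable that takes the path so far and returns label probabilities, and the widening rule (double the beam, up to `max_width`, whenever a level's top probability falls below `widen_below`) is an assumed stand-in for "adaptive widening".

```python
import math

def beam_search(level_fns, beam_width=1, max_width=4, widen_below=0.6):
    """Return (path, log-probability) pairs sorted best first.

    level_fns: one callable per taxonomy level; each takes the path so far
    (a tuple of labels) and returns {label: probability} for the next level.
    """
    beams = [((), 0.0)]
    width = beam_width
    for predict in level_fns:
        candidates = []
        uncertain = False
        for path, logp in beams:
            probs = predict(path)
            if max(probs.values()) < widen_below:
                uncertain = True
            for label, p in probs.items():
                if p > 0:
                    candidates.append((path + (label,), logp + math.log(p)))
        # Adaptive widening: an unsure level lets more paths survive.
        if uncertain:
            width = min(max_width, width * 2)
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:width]
    return beams

def level1(path):
    return {"A": 0.55, "B": 0.45}  # top level is unsure

def level2(path):
    table = {"A": {"A1": 0.5, "A2": 0.5}, "B": {"B1": 0.95, "B2": 0.05}}
    return table[path[-1]]

greedy = beam_search([level1, level2], widen_below=0.0)  # never widens
widened = beam_search([level1, level2])
print(greedy[0][0])   # ('A', 'A1')
print(widened[0][0])  # ('B', 'B1')
```

The greedy run commits to "A" at the top level and ends with a joint probability of 0.275. The widened run keeps "B" alive and finds the path with joint probability 0.4275.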

You don't need to know any of this to use it — but it's there if you want to dig in.
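
For readers who do want to dig in, the two novelty signals named in the diagram (centroid z-score and kNN distance) can be sketched in a few lines. The thresholds, the value of `k`, and the rule that both signals must agree are assumptions made for illustration; the source does not say how PrometheusNN combines them.

```python
import math
import statistics

def centroid(vectors):
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]

def centroid_zscore(x, train):
    """Distance of x from the training centroid, in standard deviations of
    the training points' own distances to that centroid."""
    c = centroid(train)
    dists = [math.dist(v, c) for v in train]
    mu, sigma = statistics.mean(dists), statistics.pstdev(dists)
    return (math.dist(x, c) - mu) / sigma if sigma > 0 else 0.0

def knn_distance(x, train, k=3):
    """Mean distance from x to its k nearest training vectors."""
    nearest = sorted(math.dist(x, v) for v in train)[:k]
    return sum(nearest) / len(nearest)

def is_novel(x, train, z_max=3.0, knn_max=1.0, k=3):
    """Flag x only when both signals exceed their (hypothetical) thresholds."""
    return centroid_zscore(x, train) > z_max and knn_distance(x, train, k) > knn_max

train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
print(is_novel((10.0, 10.0), train))  # True
print(is_novel((0.4, 0.6), train))    # False
```

In practice the vectors would be sentence embeddings (384 or 768 dimensions) rather than 2-d points, and the statistics would typically be computed per class.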

Tool reference

20 MCP tools in total. You only need to know the two entry points; the LLM picks the rest for you.

Entry points (start here)

  • start_here — assess a file, detect problem type, return pre-filled next steps
  • scope_problem — assess a goal description (no data yet), check fit and data requirements
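
Your MCP client normally builds tool calls for you, but it can help to see what one looks like on the wire. MCP tools are invoked with a JSON-RPC 2.0 "tools/call" request. The sketch below builds such a request for start_here, using the file_path argument named earlier; the file name "products.csv" and the helper function are placeholders.

```python
import json

def tool_call(request_id, name, arguments):
    """Build an MCP tools/call JSON-RPC request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

request = tool_call(1, "start_here", {"file_path": "products.csv"})
print(json.dumps(request, indent=2))
```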

Training & data

  • analyze_training_data, train_model, resume_training

Inference

  • predict, predict_batch, classify_with_context

Model management

  • list_models, describe_model, delete_model, export_model

Evaluation & tuning

  • evaluate_model, explain_prediction, get_confusion_matrix
  • get_threshold_report, set_thresholds
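
The threshold tools relate to the calibrated routing stage described earlier (accept / review / reject). As a rough picture of the idea, the sketch below fits a single temperature on held-out logits by grid search and then routes on the calibrated confidence. It is not the library's implementation: the grid, the cutoffs `accept_at` and `review_at`, and all function names are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def fit_temperature(logit_rows, labels, grid=None):
    """Pick the temperature that minimises negative log-likelihood on held-out data."""
    grid = grid or [0.5 + 0.1 * i for i in range(46)]  # 0.5 .. 5.0

    def nll(t):
        return -sum(math.log(softmax(row, t)[y]) for row, y in zip(logit_rows, labels))

    return min(grid, key=nll)

def route(confidence, accept_at=0.9, review_at=0.6):
    """Map a calibrated confidence to accept / review / reject."""
    if confidence >= accept_at:
        return "accept"
    if confidence >= review_at:
        return "review"
    return "reject"

# An overconfident toy model: always says class 0 with logit 5, right 3 times in 4.
held_out_logits = [[5.0, 0.0]] * 4
held_out_labels = [0, 0, 0, 1]
t = fit_temperature(held_out_logits, held_out_labels)

print(route(softmax([5.0, 0.0])[0]))     # accept (raw confidence is about 0.99)
print(route(softmax([5.0, 0.0], t)[0]))  # review (calibrated confidence is about 0.75)
```

After calibration the reported confidence matches the observed 75% hit rate, so the same prediction is sent to review instead of being accepted automatically.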

Other

  • submit_feedback, list_embedding_models, build_code_mapping

Environment variables

Variable                    Default            Description
PROMETHEUS_HOME             ~/.prometheus      Base directory for models, logs, feedback
PROMETHEUS_EMBEDDING_MODEL  all-MiniLM-L6-v2   Default sentence-transformer model
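
If you script around the server, you can resolve the same settings with the documented defaults. This is a sketch of the lookup only; how PrometheusNN reads these variables internally, and how it lays out the base directory, is not shown here. The function name is hypothetical.

```python
import os
from pathlib import Path

def resolve_settings(env=None):
    """Return the effective settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    home = Path(env.get("PROMETHEUS_HOME", "~/.prometheus")).expanduser()
    model = env.get("PROMETHEUS_EMBEDDING_MODEL", "all-MiniLM-L6-v2")
    return {"home": home, "embedding_model": model}

defaults = resolve_settings({})
custom = resolve_settings({"PROMETHEUS_HOME": "/data/prometheus"})
print(defaults["embedding_model"])  # all-MiniLM-L6-v2
print(custom["home"])
```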

Development

git clone https://github.com/mfbaig35r/prometheusnn
cd prometheusnn
uv sync --extra dev
uv run python -m pytest tests/ -v --tb=short    # 179 tests
uv run ruff check src/ tests/                   # lint
uv run ruff format --check src/ tests/          # format

License

MIT

Download files

Source Distribution

prometheusnn-0.2.0.tar.gz (235.5 kB view details)

Uploaded Source

Built Distribution

prometheusnn-0.2.0-py3-none-any.whl (74.6 kB view details)

Uploaded Python 3

File details

Details for the file prometheusnn-0.2.0.tar.gz.

File metadata

  • Download URL: prometheusnn-0.2.0.tar.gz
  • Upload date:
  • Size: 235.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for prometheusnn-0.2.0.tar.gz
Algorithm Hash digest
SHA256 a73500c5901fa588ba5a0e6127ba0b74bda6e7441c83da5a2b62ce0572a7b9ff
MD5 90b609511a14dbf50f9b67d7059b309a
BLAKE2b-256 b531ae3b4f7d5b55056d981edbc9519119b9c8d9d650d2ce3e5a1f49bc1aebe5

Provenance

The following attestation bundles were made for prometheusnn-0.2.0.tar.gz:

Publisher: publish.yml on mfbaig35r/prometheusnn

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file prometheusnn-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: prometheusnn-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 74.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for prometheusnn-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 1704ac112dd40f1495a5f955a33b35684366e0ae44f005588f81dc5f8254044c
MD5 72e173cc12a18862a58a479764013610
BLAKE2b-256 ed11dba89319d39aece52dc5eb28fc81080de1dae8d776566d67e60725f125c1

Provenance

The following attestation bundles were made for prometheusnn-0.2.0-py3-none-any.whl:

Publisher: publish.yml on mfbaig35r/prometheusnn

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
