Agnostic HPC AI Agent framework — multi-backend LLM, SLURM tools, terminal UI

██╗  ██╗██████╗  ██████╗     █████╗  ██████╗ ███████╗███╗   ██╗████████╗
██║  ██║██╔══██╗██╔════╝    ██╔══██╗██╔════╝ ██╔════╝████╗  ██║╚══██╔══╝
███████║██████╔╝██║         ███████║██║  ███╗█████╗  ██╔██╗ ██║   ██║   
██╔══██║██╔═══╝ ██║         ██╔══██║██║   ██║██╔══╝  ██║╚██╗██║   ██║   
██║  ██║██║     ╚██████╗    ██║  ██║╚██████╔╝███████╗██║ ╚████║   ██║   
╚═╝  ╚═╝╚═╝      ╚═════╝    ╚═╝  ╚═╝ ╚═════╝ ╚══════╝╚═╝  ╚═══╝   ╚═╝   

Talk to your SLURM cluster in plain English.

A colorful terminal agent that checks nodes, diagnoses jobs, predicts wait times, fixes permissions, and reads your cluster's own docs — no bash incantations required.


See it

[Demo GIF: hpcpilot diagnoses a pending job, mirrors the cluster's docs by URL, then answers from them]

Diagnose a pending job, mirror the cluster's docs straight from their URL, then get an answer cited from them. The agent streams markdown, shows each tool as it runs, and asks before anything that changes state. Type / any time for commands.

Why hpcpilot

  • 🗣️ Plain-English ops — "show my pending jobs", "who's hogging node midway3-0042", "extend job 1837465 by 2 hours".
  • 📚 Knows your cluster — point it at your docs folder or a docs-site URL and it mirrors the whole user guide locally, then cites the right page.
  • 🛟 Safe by default — mutating actions (chmod, chgrp, scontrol update) show a preview and need confirmation; secrets are stored 0600, session-only unless you opt in.
  • 🔌 Bring any model — it auto-detects an installed claude/codex, or use any API provider; switch and search models live with /model.
  • 🎨 A real TUI — gradient banner, streaming answers, live tool status, searchable menus, persistent history.
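
The "safe by default" behavior in the list above can be pictured with a small sketch. This is not hpcpilot's actual code: `write_secret` and `confirm_and_run` are hypothetical names, and the prompt wording is an assumption. It only illustrates the two ideas named there, a secret file created with mode 0600 and a preview-then-confirm gate in front of a mutating action.

```python
import os
from pathlib import Path


def write_secret(path: Path, value: str) -> None:
    """Write a secret so only the owning user can read or write it (mode 0600)."""
    # Passing 0o600 at creation time means a new file is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(value)
    # If the file already existed with looser permissions, tighten them.
    os.chmod(path, 0o600)


def confirm_and_run(preview: str, action, ask=input) -> bool:
    """Show a preview of a mutating action and run it only on an explicit 'y'."""
    answer = ask(f"About to run: {preview}\nProceed? [y/N] ")
    if answer.strip().lower() != "y":
        return False  # anything other than 'y' leaves state untouched
    action()
    return True
```

Defaulting to "no" means an accidental Enter never changes state, which is the conservative choice for commands such as chmod or scontrol update.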

Quickstart

pip install "hpcpilot[full]"

hpcpilot          # first run walks you through setup

That's it. On first launch a short wizard helps you pick a model, point at your docs, and start asking questions. If you already have the claude or codex CLI installed, hpcpilot uses it automatically — zero config, no API key needed.
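
For intuition, detecting an installed CLI can be as simple as a PATH lookup. The sketch below is an assumption about how such a check might work, not hpcpilot's implementation; `detect_cli_backend` is a hypothetical name, and trying claude before codex is an arbitrary choice made for the example.

```python
import shutil
from typing import Callable, Optional


def detect_cli_backend(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first agent CLI found on PATH, or None if neither is installed."""
    # The preference order here is illustrative only.
    for name in ("claude", "codex"):
        if which(name) is not None:
            return name
    return None
```

Passing `which` as a parameter keeps the function easy to test without touching the real PATH.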

Living in the TUI

Type / to open the command menu (arrow keys to choose, Enter to run):

| Command | What it does |
| --- | --- |
| /model | Browse, search, and switch models (type to filter a long list) |
| /effort | Set reasoning effort (low / medium / high) |
| /docs | Show, add, or /docs sync your cluster documentation |
| /tools | List the HPC tools the agent can call |
| /config | Re-run the setup wizard |
| /keys | See where your API key comes from (or clear it) |
| /copy · /save | Copy the last answer · save the transcript |
| /clear · /retry · /help · /exit | …and the usual essentials |

Don't know a model's exact name? Just open /model and type fast, reasoning, gemini, or part of a name — the picker filters as you type.
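
A filter-as-you-type picker of this kind can be sketched as a case-insensitive substring match over names and tags. Everything below is illustrative: `ModelEntry`, `filter_models`, and the idea that each model carries tags such as "fast" are assumptions, and the model names used in the usage note are placeholders rather than real models.

```python
from dataclasses import dataclass, field


@dataclass
class ModelEntry:
    name: str
    tags: list[str] = field(default_factory=list)  # e.g. "fast", "reasoning"


def filter_models(entries: list[ModelEntry], query: str) -> list[ModelEntry]:
    """Keep entries whose name or any tag contains the query, ignoring case."""
    needle = query.strip().lower()
    if not needle:
        return list(entries)  # an empty query shows the full list
    return [
        entry
        for entry in entries
        if needle in entry.name.lower()
        or any(needle in tag.lower() for tag in entry.tags)
    ]
```

With two placeholder entries, `ModelEntry("example-small", ["fast"])` and `ModelEntry("example-large", ["reasoning"])`, the query "fast" keeps only the first and the query "example" keeps both.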

Teach it your cluster's docs

Point hpcpilot at a local folder of Markdown files or at a docs-site URL (a whole user guide with many pages). A URL is crawled once, starting from the sitemap and then following the site's own nav links, into a local cache, so the agent reads it instantly and offline:

# A local folder…
hpcpilot --docs-base-path /path/to/cluster-docs

# …or mirror an online user guide (the whole site, not one page)
hpcpilot --docs-url https://docs.rcc.uchicago.edu/

You can also paste a path or URL in the wizard's Docs step. Inside the TUI, /docs shows what's loaded, /docs add adds a source, and /docs sync re-crawls when the docs change. Then ask "how do I request a GPU node?" and the agent reads the relevant page and answers with specifics. URL mirroring depends on the full extra, which the Quickstart's pip install "hpcpilot[full]" provides.
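
The sitemap step of a crawl like this can be sketched with the standard library alone. The code below is a hedged illustration, not hpcpilot's crawler: `page_urls_from_sitemap` is a hypothetical helper, it handles only a plain `<urlset>` sitemap (not a sitemap index), and it leaves out fetching, the nav-link fallback, and caching.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def page_urls_from_sitemap(xml_text: str, base_url: str) -> list[str]:
    """Return the page URLs listed in a sitemap that live on base_url's host."""
    root = ET.fromstring(xml_text)
    host = urlparse(base_url).netloc
    seen: set[str] = set()
    urls: list[str] = []
    for loc in root.iter(f"{SITEMAP_NS}loc"):
        url = (loc.text or "").strip()
        # Skip blanks, duplicates, and links that point off-site.
        if url and url not in seen and urlparse(url).netloc == host:
            seen.add(url)
            urls.append(url)
    return urls
```

Restricting results to the starting host keeps a mirror from wandering onto unrelated sites that a sitemap happens to mention.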

🎬 Regenerate the demo

The GIF above is built with vhs from assets/demo.tape, which runs scripts/demo.py — a self-animating session that uses hpcpilot's real banner, streaming renderer, and tool-status output, so it looks exactly like the app while staying fully reproducible (no live LLM or cluster).

conda install -c conda-forge vhs ttyd ffmpeg   # one-time
vhs assets/demo.tape                            # re-renders assets/demo.gif

Edit the questions or timing in scripts/demo.py and re-run to taste.

Models & providers

hpcpilot talks to any OpenAI-compatible API, plus the Claude and Codex CLIs. It does not hard-code model names — use /model (or --list-models <provider>) to see what each provider currently offers and pick the latest.

| Provider | --backend | Auth |
| --- | --- | --- |
| OpenAI | openai | OPENAI_API_KEY |
| Google Gemini | gemini | GEMINI_API_KEY |
| Groq | groq | GROQ_API_KEY |
| OpenRouter | openrouter | OPENROUTER_API_KEY |
| DeepSeek | deepseek | DEEPSEEK_API_KEY |
| Mistral | mistral | MISTRAL_API_KEY |
| xAI | xai | XAI_API_KEY |
| Together | together | TOGETHER_API_KEY |
| OpenCode | opencode | OPENCODE_API_KEY |
| Any OpenAI-compatible API | custom | --api-base-url |
| Claude CLI | claude | uses the claude binary |
| Codex CLI | codex | uses the codex binary |
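
Read as data, the provider table above is a mapping from backend name to the environment variable that holds its key. The sketch below encodes only what the table states. The function names are hypothetical, and treating custom, claude, and codex as needing no named key variable is an assumption drawn from the Auth column, not a description of hpcpilot's internals.

```python
import os
from typing import Mapping, Optional

# Backend name -> environment variable, copied from the provider table.
API_KEY_ENV = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "groq": "GROQ_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "xai": "XAI_API_KEY",
    "together": "TOGETHER_API_KEY",
    "opencode": "OPENCODE_API_KEY",
}
# Backends whose Auth column names no key variable (assumed to need none here).
NO_KEY_VAR = {"custom", "claude", "codex"}


def api_key_env_var(backend: str) -> Optional[str]:
    """Return the key variable for a backend, or None if the table lists none."""
    if backend in API_KEY_ENV:
        return API_KEY_ENV[backend]
    if backend in NO_KEY_VAR:
        return None
    raise ValueError(f"unknown backend: {backend!r}")


def resolve_api_key(
    backend: str, environ: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Look up the backend's key in the environment; None if unset or not needed."""
    name = api_key_env_var(backend)
    return environ.get(name) if name else None
```

For example, `resolve_api_key("groq", {"GROQ_API_KEY": "k"})` returns "k", while `resolve_api_key("claude", {})` returns None.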

API vs. CLI backends: API backends use hpcpilot's curated HPC tools (job wait prediction, permission checks, quota lookups, doc reading). CLI backends drive the external agent binary directly with its own toolset.

Useful flags: --model, --effort {low,medium,high}, --docs-url, --docs-base-path, --list-models, --list-tools, --dangerous-bypass, --config-path.

Install from source

git clone https://github.com/PursuitOfDataScience/hpcpilot.git
cd hpcpilot
pip install -e ".[full]"
hpcpilot

Status

Beta. The portable SLURM tools (squeue/sinfo/scontrol) work on any SLURM cluster; site-specific account/quota commands are configurable. Issues and PRs welcome.
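
To give a sense of what a portable SLURM tool involves, here is a minimal sketch that lists the current user's pending jobs by calling squeue and parsing its output. It is not taken from hpcpilot: `parse_squeue`, `pending_jobs`, and the pipe-delimited layout are assumptions, while the squeue options used (`--me`, `--states`, `--noheader`, `--format` with `%i`, `%j`, `%T`, `%r`) are standard ones. Older SLURM releases may lack `--me`.

```python
import subprocess

# Job ID | job name | state | reason, one job per line.
SQUEUE_FORMAT = "%i|%j|%T|%r"


def parse_squeue(output: str) -> list[dict]:
    """Turn pipe-delimited squeue output into a list of job dictionaries."""
    jobs = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split the ID off the front and state/reason off the back, so a
        # job name that itself contains '|' stays in one piece.
        job_id, rest = line.split("|", 1)
        name, state, reason = rest.rsplit("|", 2)
        jobs.append(
            {"job_id": job_id, "name": name, "state": state, "reason": reason}
        )
    return jobs


def pending_jobs() -> list[dict]:
    """Run squeue for the current user's pending jobs (needs a SLURM cluster)."""
    result = subprocess.run(
        ["squeue", "--me", "--states=PENDING", "--noheader",
         f"--format={SQUEUE_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return parse_squeue(result.stdout)
```

Keeping the parser separate from the subprocess call lets it be exercised on captured output without access to a cluster.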

License

MIT — see LICENSE.
