Skip to main content

Interactive terminal CLI for chatting with your Kubernetes cluster via an AI backend.

Project description

kube-q

Chat with your Kubernetes cluster from the terminal.

kube-q is an interactive CLI (kq) that connects to an AI-powered backend and lets you query, debug, and manage your cluster in plain English — with streaming responses, persistent session history, full-text search, conversation branching, token cost tracking, human-in-the-loop approval flows, and rich terminal rendering.


Features

  • Interactive REPL — persistent conversation history, slash commands, Tab completion
  • Streaming responses — tokens render in real-time via Server-Sent Events
  • Session persistence — every conversation is saved to a local SQLite database; resume any past session with --session-id
  • Full-text searchkq --search "pod crash" or /search inside the REPL; FTS5-powered with highlighted match snippets and boolean syntax
  • Conversation branching/branch forks the current conversation at any point; the original is untouched; /branches lists all forks
  • Token & cost tracking — every response shows tokens used; /tokens shows session totals and estimated dollar cost; rates configurable per model
  • Human-in-the-Loop (HITL) — review and approve or deny destructive actions before they run
  • Namespace context — set an active namespace with /ns <name>; it's injected into every message automatically
  • File attachments — embed YAML, JSON, logs, and more with @filename anywhere in a message
  • Conversation save — dump the full session to a Markdown file with /save
  • Single-query mode — pipe-friendly with kq --query "…" and --output plain
  • TLS & auth--api-key / KUBE_Q_API_KEY env var, custom CA cert via --ca-cert
  • Rich output — syntax-highlighted code blocks, elapsed response time, typo suggestions for slash commands
  • Python SDK — use KubeQClient directly in your own scripts and tools

Installation

pip install kube-q

Or via Homebrew:

brew tap MSKazemi/kube-q
brew install kube-q

Or install from source:

git clone https://github.com/MSKazemi/kube_q
cd kube_q
pip install -e .

Requires Python 3.12+.


Quick start

# Start the interactive REPL (connects to localhost:8000 by default)
kq

# Point at a remote API
kq --url https://kube-q.example.com

# Single query and exit
kq --query "show me all pods in the default namespace"

# Pipe-friendly plain text output
kq --query "list failing deployments" --output plain

# List recent sessions
kq --list

# Search across all past conversations
kq --search "pod crash"

# Resume a previous session
kq --session-id <id>

In-REPL commands

Conversation

Command Description
/new Start a new conversation (clears history, generates new ID)
/id Show the current conversation ID
/state Show full session state — ID, user, messages, tokens, namespace, HITL flag
/save [file] Save conversation to a Markdown file
/clear Clear the terminal screen
/help Show full in-REPL help
/quit / /exit / /q Exit kube-q

Namespace

Command Description
/ns <name> Set active namespace — prepended to every query automatically
/ns Clear the active namespace

Session history

Command Description
/sessions List recent sessions (same as kq --list)
/forget Delete the current session from local history (server data untouched)

History & branching

Command Description
/search <query> Full-text search across all past sessions with highlighted snippets
/branch Fork this conversation at the current point into a new independent session
/branches List all forks of (and siblings of) this session
/title <text> Rename the current session

FTS5 boolean syntax is supported: /search pods AND NOT staging

Token usage

Command Description
/tokens Show token counts and estimated cost for this session
/cost Alias for /tokens

Human-in-the-Loop

Command Description
/approve Approve a pending HITL action — the AI executes it
/deny Deny a pending HITL action — nothing is applied

Keyboard shortcuts:

Key Action
Enter Send message
Alt+Enter or EscEnter Insert newline (multi-line input)
Tab Auto-complete slash commands
/ Scroll through input history
Ctrl+C Cancel current input
Ctrl+D Exit the session

File attachments

Embed a file's contents directly in your message using @:

what is wrong with this deployment? @deployment.yaml
compare these two configs: @old.yaml @new.yaml
what is wrong here? @pod.yaml @service.yaml

Supports: yaml, json, py, sh, go, tf, toml, js, ts, rs, java, xml, html, md, txt, log, and more. Limit: 100 KB per file. Quote paths with spaces: @"my file.yaml".


CLI reference

kq [options]

Flags

Flag Default Description
--url URL http://localhost:8000 kube-q API base URL (env: KUBE_Q_URL)
--query / -q TEXT Run a single query and exit
--no-stream off Disable streaming — wait for full response
--session-id ID Resume a previous session by ID
--list List recent sessions and exit
--search QUERY Full-text search across session history and exit
--user-id ID auto Persistent user ID (saved to ~/.kube-q/user-id)
--api-key KEY Bearer token for auth-enabled servers (env: KUBE_Q_API_KEY)
--ca-cert PATH Custom CA certificate bundle for TLS
--output {rich,plain} rich rich for markdown rendering, plain for raw text
--model NAME kubeintellect-v2 Model name sent in requests (env: KUBE_Q_MODEL)
--user-name NAME You Your display name in the prompt (env: KUBE_Q_USER_NAME)
--agent-name NAME kube-q Assistant name in saved conversations (env: KUBE_Q_AGENT_NAME)
--no-banner off Suppress logo (useful for screen recordings)
--debug off Log raw HTTP requests/responses to stderr and ~/.kube-q/kube-q.log
--version Print version and exit

Session history

kube-q saves every conversation to a local SQLite database at ~/.kube-q/history.db. Nothing is sent to or read from the server — this is a local-only mirror.

# See recent sessions
kq --list

# Resume from where you left off
kq --session-id <id>

# Search across everything you've ever discussed
kq --search "deployment rollback"
kq --search "pods AND crash"

Inside the REPL, /sessions, /forget, /search, /branch, /branches, and /title give you full control over history.

Branching forks a conversation at the current message count. The original session is never modified — you get a new independent session you can take in a different direction. Branches show up in kq --list as regular sessions.


Token & cost tracking

After every response kube-q shows the token count in the footer:

kube-q  (1.2s · 460 tokens)

Use /tokens or /cost for a session summary:

┌─ Token Usage ─────────────────────────┐
│ This session:                         │
│   Prompt:     1,240 tokens            │
│   Completion: 3,890 tokens            │
│   Total:      5,130 tokens            │
│   Requests:   8                       │
│   Est. cost:  $0.0312                 │
│                                       │
│ Last response:                        │
│   120 in → 340 out ($0.0024)          │
└───────────────────────────────────────┘

Cost estimates are labeled "Est." — not exact. Built-in rates for kubeintellect-v2, gpt-4o, gpt-4o-mini, and claude-sonnet-4-6. Override for custom backends:

KUBE_Q_COST_PER_1K_PROMPT=0.002
KUBE_Q_COST_PER_1K_COMPLETION=0.008

If the server doesn't emit a usage block, the footer omits the token count — no errors, no noise.


Configuration

kube-q loads configuration from .env files and environment variables. Priority order (highest wins):

CLI flag  >  shell env var  >  ./.env  >  ~/.kube-q/.env  >  default

.env files

Location Priority Use case
~/.kube-q/.env lower Persistent user-level defaults
./.env (current directory) higher Project-local or per-cluster overrides

Shell-exported variables always win over .env files.

All supported variables

KUBE_Q_URL=http://localhost:8000
KUBE_Q_API_KEY=your-key-here
KUBE_Q_MODEL=kubeintellect-v2
KUBE_Q_TIMEOUT=120
KUBE_Q_HEALTH_TIMEOUT=5
KUBE_Q_NAMESPACE_TIMEOUT=3
KUBE_Q_STARTUP_RETRY_TIMEOUT=300
KUBE_Q_STARTUP_RETRY_INTERVAL=5
KUBE_Q_STREAM=true
KUBE_Q_OUTPUT=rich                  # rich | plain
KUBE_Q_LOG_LEVEL=INFO               # DEBUG | INFO | WARNING | ERROR
KUBE_Q_USER_NAME=You
KUBE_Q_AGENT_NAME=kube-q
KUBE_Q_COST_PER_1K_PROMPT=0.003    # override cost rate for /tokens
KUBE_Q_COST_PER_1K_COMPLETION=0.006

Example — per-cluster setup

# .env in your cluster's working directory
KUBE_Q_URL=https://kube-q.prod.example.com
KUBE_Q_API_KEY=prod-secret-key
KUBE_Q_USER_NAME=alice

Run kq from that directory and it picks up the settings automatically.

Quick one-time setup (pip users)

mkdir -p ~/.kube-q
cat >> ~/.kube-q/.env <<'EOF'
KUBE_Q_URL=https://kube-q.example.com
KUBE_Q_API_KEY=your-key-here
EOF

Authentication

When the server has API key authentication enabled, requests without a valid key are rejected with HTTP 401. kube-q shows a clear message:

Authentication required. Set KUBE_Q_API_KEY or pass --api-key with a valid key.
Ask your administrator for an API key.

When auth is disabled on the server, no key is needed.


Human-in-the-Loop (HITL)

When the AI backend requests approval before executing a potentially destructive action, kube-q pauses:

╭─ Action requires approval ──────────────────╮
│ Action requires approval.                   │
│ Type /approve to proceed or /deny to cancel.│
╰─────────────────────────────────────────────╯
HITL> /approve

The prompt changes to HITL> while an action is pending. Type /approve to execute it or /deny to cancel.


Python SDK

kube_q.core exposes a typed SDK you can use directly in scripts, notebooks, or other tools — no CLI required.

from kube_q.core.client import KubeQClient
from kube_q.core.events import TokenEvent, FinalEvent

client = KubeQClient(url="http://localhost:8000", api_key="...")

# Non-streaming query
result = client.query("why are my pods failing?")
print(result["text"])

# Streaming — typed event objects
for event in client.stream("list all deployments in default namespace"):
    match event:
        case TokenEvent(data=d):
            print(d.content, end="", flush=True)
        case FinalEvent():
            break

All backend events are modelled as a typed Pydantic discriminated union in kube_q.core.events:

Event type Data fields
token content, role
status phase, message
tool_call tool_name, args, call_id, dry_run
tool_result call_id, ok, summary, truncated
hitl_request action, risk, diff, approval_id
usage prompt_tokens, completion_tokens, total_tokens, model
final content, usage, elapsed_ms
error code, message, retryable

Web frontend

The web/ directory contains a Next.js web UI for kube-q.

Browser chat

Three-pane desktop layout (resizable panels):

  • Chat panel — streaming markdown responses with react-markdown + syntax highlighting
  • Reasoning timeline — live status, tool calls, and tool results as they happen
  • Terminal panel — xterm.js view of tool execution output

Tabbed mobile layout, dark mode, and bearer-token auth gate included.

PTY terminal (full CLI in the browser)

The /pty route spawns kq in a pseudo-terminal via WebSocket. It's a pure byte relay — the Python CLI handles all logic; xterm.js renders it.

cd web
npm install
npm run dev:pty     # starts Next.js + pty-server on separate ports

Open http://localhost:3000/pty to get a full terminal running your local kq binary in the browser.


Data & privacy

  • Session history is stored locally only at ~/.kube-q/history.db (SQLite). Nothing is sent to the kube-q server.
  • Conversations may contain sensitive cluster data. Use /save with care — saved files go wherever you point them.
  • The user ID (~/.kube-q/user-id) is stored with 0600 permissions.
  • Logs are written to ~/.kube-q/kube-q.log (rotating, 5 MB × 3 files).

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kube_q-1.4.0.tar.gz (129.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

kube_q-1.4.0-py3-none-any.whl (46.6 kB view details)

Uploaded Python 3

File details

Details for the file kube_q-1.4.0.tar.gz.

File metadata

  • Download URL: kube_q-1.4.0.tar.gz
  • Upload date:
  • Size: 129.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kube_q-1.4.0.tar.gz
Algorithm Hash digest
SHA256 a3dd2bd6776575a565de8f1a8263c9cf16096c8027f171fdaedf87b9f28286d3
MD5 00b738ae65800e075dddb78462e98a17
BLAKE2b-256 5fad61c085ad0e41869330c1b8805a354296db6247e3e1089d595d173bf2aa54

See more details on using hashes here.

Provenance

The following attestation bundles were made for kube_q-1.4.0.tar.gz:

Publisher: publish.yml on MSKazemi/kube_q

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kube_q-1.4.0-py3-none-any.whl.

File metadata

  • Download URL: kube_q-1.4.0-py3-none-any.whl
  • Upload date:
  • Size: 46.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kube_q-1.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 bfcca02f88c35017e6f5d5f0a67935f60f3b2b271a5b10f3d1634be4cb12951c
MD5 ec8f2eb7069c7d88f3f44a97a71fc468
BLAKE2b-256 4bfaa157b4a6e31e8c1b757edb8937713598d59ec4210c50511db3103208356d

See more details on using hashes here.

Provenance

The following attestation bundles were made for kube_q-1.4.0-py3-none-any.whl:

Publisher: publish.yml on MSKazemi/kube_q

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page