jragmunch-cli

Maximal token-efficient RAG for headless Claude. Uses your existing claude CLI; auth-agnostic; slice-level retrieval powered by jcodemunch-mcp.

Billing: subscription by default, API on opt-in

By default, jragmunch never bills your Anthropic API account. It strips ANTHROPIC_API_KEY and ANTHROPIC_AUTH_TOKEN from the subprocess environment before spawning claude, so the CLI uses your Max / Pro Claude OAuth login while respecting Anthropic's TOS. You pay $0 in dollars; the work counts against your subscription's session limits.

If you want to bill via the API instead, pass --use-api:

jragmunch --use-api ask "..."
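
The environment handling described above can be sketched as follows. This is a minimal illustration, not jragmunch's actual code: the helper name build_child_env and its parameters are hypothetical, and only the two variable names come from this README.

```python
import os
from typing import Dict, Mapping, Optional

# The two credentials this README says are stripped in subscription mode.
API_CREDENTIAL_VARS = ("ANTHROPIC_API_KEY", "ANTHROPIC_AUTH_TOKEN")


def build_child_env(
    use_api: bool, parent_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment for the spawned claude process.

    Hypothetical helper: with use_api=False the API credentials are removed
    so the child falls back to the OAuth login; with use_api=True the
    environment is passed through unchanged.
    """
    env = dict(os.environ if parent_env is None else parent_env)
    if not use_api:
        for name in API_CREDENTIAL_VARS:
            env.pop(name, None)  # a missing variable is not an error
    return env
```

The returned dict could be passed as the env argument of subprocess.run. Because the function copies rather than mutates, the parent shell's variables stay intact for later --use-api calls.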

Every verb prints the cost split:

[tokens in=24 out=1273  cost actual=$0.0000 (notional=$0.5334, auth=subscription)  time=27549ms]
  • actual: what you were really billed (always $0 in subscription mode).
  • notional: what the work would have cost via the API. claude -p computes this regardless of auth mode; we surface it as a "what it might have cost" yardstick.
  • auth: subscription or api. Run jragmunch doctor to see your resolved mode.
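
If you want to collect the cost split from scripts, the summary line can be parsed with a regular expression. The sketch below assumes the line always has the shape of the single sample shown above; the pattern and the names CostSplit and parse_cost_split are illustrative and may not match every jragmunch version.

```python
import re
from typing import NamedTuple


class CostSplit(NamedTuple):
    tokens_in: int
    tokens_out: int
    actual_usd: float
    notional_usd: float
    auth: str
    time_ms: int


# Pattern inferred from the one sample line in this README (an assumption).
_SUMMARY_RE = re.compile(
    r"\[tokens in=(\d+) out=(\d+)\s+"
    r"cost actual=\$([\d.]+) \(notional=\$([\d.]+), auth=(\w+)\)\s+"
    r"time=(\d+)ms\]"
)


def parse_cost_split(line: str) -> CostSplit:
    """Extract token counts, costs, auth mode, and wall time from a summary line."""
    match = _SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a cost-split line: {line!r}")
    t_in, t_out, actual, notional, auth, ms = match.groups()
    return CostSplit(
        int(t_in), int(t_out), float(actual), float(notional), auth, int(ms)
    )
```

Where it is available, the --json output is likely a sturdier source for these numbers than scraping the human-readable line.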

When to use which mode

Anthropic's Claude Code Legal and Compliance docs distinguish individual ordinary use from business / always-on / multi-contributor use. jragmunch's defaults are tuned to that line.

| You are… | Recommended mode | Why |
|---|---|---|
| A solo developer running verbs interactively on your own machine | subscription (default) | Anthropic explicitly permits "ordinary, individual usage of Claude Code." |
| A solo developer wiring jragmunch review into your own personal repo's CI with CLAUDE_CODE_OAUTH_TOKEN | subscription (default) | Permitted as long as you're the only contributor whose work it acts on. |
| A team running CI bots on a shared/commercial repo | --use-api | Anthropic requires API keys for "business or always-on deployments." |
| Multi-developer or commercial automation | --use-api | Subscriptions are not the right billing surface for shared use. |
| Heavy parallel fan-out (refactor --parallel 16, tests --max 100) | --use-api | High-throughput patterns aren't what subscription session limits are designed for. |

When in doubt, pass --use-api and bring your own ANTHROPIC_API_KEY.

Multi-profile users (CLAUDE_CONFIG_DIR)

If you swap between Claude profiles (work and personal, for example) by exporting CLAUDE_CONFIG_DIR, jragmunch propagates that variable to the spawned claude -p subprocess automatically. You can also set it explicitly per-invocation with --config-dir, which overrides any inherited value:

# Inherit from shell env
CLAUDE_CONFIG_DIR=~/.claude.work jragmunch ask "..."

# Or set explicitly per-call
jragmunch --config-dir ~/.claude.personal ask "..."
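
The precedence described above (an explicit --config-dir beats an inherited CLAUDE_CONFIG_DIR) can be sketched like this. The function names are hypothetical and the real implementation may differ; only the variable name and the override rule come from this README.

```python
import os
from typing import Dict, Mapping, Optional


def resolve_config_dir(
    cli_value: Optional[str], env: Mapping[str, str]
) -> Optional[str]:
    """Pick the config dir: the --config-dir value wins, else the inherited one."""
    chosen = cli_value if cli_value else env.get("CLAUDE_CONFIG_DIR")
    # expanduser guards against a literal "~" that the shell did not expand.
    return os.path.expanduser(chosen) if chosen else None


def apply_config_dir(
    cli_value: Optional[str], env: Mapping[str, str]
) -> Dict[str, str]:
    """Return a copy of env with CLAUDE_CONFIG_DIR set for the child process."""
    child = dict(env)
    resolved = resolve_config_dir(cli_value, env)
    if resolved is not None:
        child["CLAUDE_CONFIG_DIR"] = resolved
    return child
```

When neither source is set, the variable is left out entirely, so the claude binary would fall back to its own default location.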

What jragmunch is not

jragmunch is not an "agent harness" or a re-implementation of Claude Code. It shells out to the official claude CLI binary you installed via npm install -g @anthropic-ai/claude-code and parses its --output-format stream-json output. It does not replace, wrap, or proxy Anthropic's models — it just gives claude -p better retrieval via MCP. Anthropic's TOS permits this category of usage; the policy nuance above is about where you run it, not what tool you run.
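
To illustrate what parsing that output can look like, here is a minimal sketch. It assumes the stream is one JSON object per line and that the final summary event carries "type": "result"; both the event shape and the helper name final_result_event are assumptions made for illustration, not taken from jragmunch's source.

```python
import json
from typing import Iterable, Optional


def final_result_event(lines: Iterable[str]) -> Optional[dict]:
    """Return the last event whose "type" is "result", or None if there is none."""
    result = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON lines in the stream
        if isinstance(event, dict) and event.get("type") == "result":
            result = event
    return result
```

The function accepts any iterable of strings, so it can be fed a subprocess's stdout line by line without buffering the whole stream.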

Why

Headless Claude (claude -p) is the right substrate for retrieval-driven workflows — code Q&A, diff-aware review, batch refactors, "chat with your repo" use cases. The default pattern is "stuff the relevant files into the prompt and pray," which burns tokens on code the model never needed.

jragmunch wraps claude -p with jcodemunch pre-wired so the model retrieves slices on demand instead of receiving giant context dumps. (For team or always-on CI usage, see the auth-mode table above — pass --use-api and bring your own API key.)

Install

pip install jragmunch
jragmunch doctor

Requires the claude CLI on PATH (npm install -g @anthropic-ai/claude-code) and jcodemunch-mcp registered as an MCP server.

Usage

jragmunch ask "how does auth work in this repo"
jragmunch ask "what does AuthMiddleware.verify do" --json
jragmunch index --repo .
jragmunch run "Refactor the rate-limiter to use a token bucket"

Try the side-by-side demo: AskClaude.py

AskClaude.py (in the repo root) is an interactive script that asks one question and shows you, in plain English, what jragmunch saved you.

git clone https://github.com/jgravelle/jragmunch-cli
cd jragmunch-cli
pip install -e ".[dev]"
pip install tiktoken              # optional, for accurate token estimates
python AskClaude.py

It prompts for a local repo path and a question, then prints the answer followed by a comparison block:

In its raw form, your request may have used as many as 799,037 tokens,
at a cost of $11.99.

Using jRagMunch, our call to Opus 4.7 only used 24,771 tokens.

By using your subscription WITHIN THE TERMS OF ANTHROPIC'S TOS, you paid
$0.00 and used a nearly imperceptible fractional percentage of your quota.

The "raw" number is a local projection of what pasting the entire repo into the prompt would have cost (capped at the model's input window). The jragmunch number is the actual marginal tokens this call consumed (input + cache creation + output — cache reads excluded since those are already-paid context being re-presented). Cost figures price the naive projection at Opus 4.7's uncached input rate; subscription mode pays $0 either way.
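
The accounting in the paragraph above can be sketched as two small functions. The usage key names follow the Anthropic API's usage object; the function names, the input-window size, and the per-million rate are parameters you supply, not values taken from AskClaude.py.

```python
from typing import Mapping, Tuple


def marginal_tokens(usage: Mapping[str, int]) -> int:
    """Input + cache creation + output; cache reads are deliberately excluded."""
    return (
        usage.get("input_tokens", 0)
        + usage.get("cache_creation_input_tokens", 0)
        + usage.get("output_tokens", 0)
    )


def naive_projection(
    repo_tokens: int, input_window: int, usd_per_million_input: float
) -> Tuple[int, float]:
    """Project the paste-the-whole-repo cost, capped at the model's input window."""
    capped = min(repo_tokens, input_window)
    return capped, capped * usd_per_million_input / 1_000_000
```

As a sanity check, the sample figures above (799,037 tokens, $11.99) are consistent with a rate of about $15 per million input tokens. That rate is inferred here from the arithmetic rather than stated by the source.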

Use it as a one-shot demo, a sanity check on your own repos, or a template for embedding jragmunch in other tools.

Verbs (v0.1)

| Verb | Status | Purpose |
|---|---|---|
| doctor | shipped | Verify claude + MCP wiring |
| ask | shipped | Retrieval-augmented Q&A |
| index | shipped | Index a repo via jcodemunch |
| run | shipped | Power-user prompt passthrough |
| review | shipped | Diff-aware PR review |
| changelog | shipped | Summarize changes since tag |
| refactor | shipped | Fan-out batch refactor |
| tests | shipped | Generate tests for untested symbols |
| sweep | shipped | Pattern-driven cleanup |

Principles

  • Auth-agnostic. Whatever auth the local claude binary uses, jragmunch uses.
  • Slice, don't dump. Default behavior is jcodemunch retrieval.
  • Structured output. Every verb returns JSON with citations and _meta (tokens, cost, wall time).
  • Composable. --print-command shows the exact claude -p invocation that would run.
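
As a sketch of how the structured output might be consumed, the helper below splits a verb's JSON into its citations and _meta parts. Only the top-level keys citations and _meta come from this README; the helper name split_verb_output and the sample payload (including its inner keys) are hypothetical.

```python
import json
from typing import Any, Dict, List, Tuple


def split_verb_output(stdout_text: str) -> Tuple[List[Any], Dict[str, Any]]:
    """Parse a verb's JSON output and return (citations, _meta)."""
    payload = json.loads(stdout_text)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")
    # Missing keys fall back to empty containers rather than raising.
    return payload.get("citations", []), payload.get("_meta", {})


# Hypothetical payload for illustration only; the real inner keys may differ.
SAMPLE = '{"answer": "...", "citations": ["auth.py:10-42"], "_meta": {"wall_ms": 27549}}'
citations, meta = split_verb_output(SAMPLE)
```

Keeping the inner structure of _meta opaque avoids coupling callers to key names this README does not spell out.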

License

Apache 2.0
