jragmunch-cli

Maximal token-efficient RAG for headless Claude. Uses your existing claude CLI; auth-agnostic; slice-level retrieval powered by jcodemunch-mcp.

Billing: subscription by default, API on opt-in

By default, jragmunch never bills your Anthropic API account. It strips ANTHROPIC_API_KEY and ANTHROPIC_AUTH_TOKEN from the subprocess environment before spawning claude, so the CLI falls back to your Claude Max or Pro OAuth login, within Anthropic's TOS. You pay $0 in API charges; the work counts against your subscription's session limits instead.
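The environment scrubbing described above can be sketched in a few lines. This is an illustration of the idea, not jragmunch's actual source: the function name `build_child_env` and the `use_api` parameter are hypothetical, and only the two variable names come from this README.

```python
# Variables this README says are removed in subscription mode.
API_CREDENTIAL_VARS = ("ANTHROPIC_API_KEY", "ANTHROPIC_AUTH_TOKEN")


def build_child_env(parent_env, use_api=False):
    """Return a copy of parent_env to hand to the spawned `claude` process.

    Hypothetical helper: with use_api=False the API credentials are dropped,
    so the child process cannot bill the API account. The parent mapping is
    never mutated.
    """
    env = dict(parent_env)
    if not use_api:
        for name in API_CREDENTIAL_VARS:
            env.pop(name, None)  # absent keys are simply ignored
    return env
```

A caller would pass the result as `env=` to `subprocess.run`, for example `subprocess.run(cmd, env=build_child_env(os.environ))`.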

If you want to bill via the API instead, pass --use-api:

jragmunch --use-api ask "..."

Every verb prints the cost split:

[tokens in=24 out=1273  cost actual=$0.0000 (notional=$0.5334, auth=subscription)  time=27549ms]
  • actual: what you were really billed (always $0 in subscription mode).
  • notional: what the work would have cost via the API. claude -p computes this regardless of auth mode; we surface it as a "what it might have cost" yardstick.
  • auth: subscription or api. Run jragmunch doctor to see your resolved mode.
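If you want to log or aggregate the cost split, the status line can be parsed with a regular expression. The sketch below assumes the line always has the shape of the sample shown above; the `CostSplit` class and `parse_status_line` function are hypothetical names, and the real format may vary between versions.

```python
import re
from dataclasses import dataclass

# Pattern modelled on the single sample line in this README (an assumption).
_STATUS = re.compile(
    r"\[tokens in=(?P<tokens_in>\d+) out=(?P<tokens_out>\d+)\s+"
    r"cost actual=\$(?P<actual>[\d.]+) "
    r"\(notional=\$(?P<notional>[\d.]+), auth=(?P<auth>\w+)\)\s+"
    r"time=(?P<ms>\d+)ms\]"
)


@dataclass
class CostSplit:
    tokens_in: int
    tokens_out: int
    actual_usd: float
    notional_usd: float
    auth: str
    wall_ms: int


def parse_status_line(line):
    """Return a CostSplit for a matching status line, or None otherwise."""
    m = _STATUS.search(line)
    if m is None:
        return None
    return CostSplit(
        tokens_in=int(m["tokens_in"]),
        tokens_out=int(m["tokens_out"]),
        actual_usd=float(m["actual"]),
        notional_usd=float(m["notional"]),
        auth=m["auth"],
        wall_ms=int(m["ms"]),
    )
```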

When to use which mode

Anthropic's Claude Code Legal and Compliance docs distinguish individual ordinary use from business / always-on / multi-contributor use. jragmunch's defaults are tuned to that line.

You are… | Recommended mode | Why
A solo developer running verbs interactively on your own machine | subscription (default) | Anthropic explicitly permits "ordinary, individual usage of Claude Code."
A solo developer wiring jragmunch review into your own personal repo's CI with CLAUDE_CODE_OAUTH_TOKEN | subscription (default) | Permitted as long as you're the only contributor whose work it acts on.
A team running CI bots on a shared/commercial repo | --use-api | Anthropic requires API keys for "business or always-on deployments."
Multi-developer or commercial automation | --use-api | Subscriptions are not the right billing surface for shared use.
Heavy parallel fan-out (refactor --parallel 16, tests --max 100) | --use-api | High-throughput patterns aren't what subscription session limits are designed for.

When in doubt, pass --use-api and bring your own ANTHROPIC_API_KEY.

Multi-profile users (CLAUDE_CONFIG_DIR)

If you swap between Claude profiles (work and personal, for example) by exporting CLAUDE_CONFIG_DIR, jragmunch propagates that variable to the spawned claude -p subprocess automatically. You can also set it explicitly per-invocation with --config-dir, which overrides any inherited value:

# Inherit from shell env
CLAUDE_CONFIG_DIR=~/.claude.work jragmunch ask "..."

# Or set explicitly per-call
jragmunch --config-dir ~/.claude.personal ask "..."
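The precedence rule above (an explicit --config-dir beats an inherited CLAUDE_CONFIG_DIR) could look roughly like this. Both function names are hypothetical, and the sketch assumes nothing about how jragmunch implements it internally.

```python
def resolve_config_dir(flag_value, env):
    """Pick the config dir: the explicit flag wins, else the inherited variable.

    Returns None when neither is set, leaving `claude` on its own default.
    """
    if flag_value:
        return flag_value
    return env.get("CLAUDE_CONFIG_DIR")


def child_env_with_config_dir(flag_value, env):
    """Return a copy of env with CLAUDE_CONFIG_DIR set to the resolved value."""
    child = dict(env)
    resolved = resolve_config_dir(flag_value, env)
    if resolved is not None:
        child["CLAUDE_CONFIG_DIR"] = resolved
    return child
```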

What jragmunch is not

jragmunch is not an "agent harness" or a re-implementation of Claude Code. It shells out to the official claude CLI binary you installed via npm install -g @anthropic-ai/claude-code and parses its --output-format stream-json output. It does not replace, wrap, or proxy Anthropic's models — it just gives claude -p better retrieval via MCP. Anthropic's TOS permits this category of usage; the policy nuance above is about where you run it, not what tool you run.
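To make "parses its --output-format stream-json output" concrete, here is a minimal sketch of reading newline-delimited JSON and keeping the final result event. The event shape (a top-level "type" field with a closing "result" event) is an assumption for illustration, and `final_result` is a hypothetical helper, not part of jragmunch.

```python
import json


def final_result(lines):
    """Return the last JSON object whose "type" is "result", or None.

    `lines` is any iterable of strings, e.g. a subprocess's stdout.
    Blank lines and lines that are not valid JSON are skipped.
    """
    result = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON output
        if isinstance(event, dict) and event.get("type") == "result":
            result = event
    return result
```

Streaming line by line keeps memory flat for long runs, since only the most recent result event is retained.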

Why

Headless Claude (claude -p) is the right substrate for retrieval-driven workflows — code Q&A, diff-aware review, batch refactors, "chat with your repo" use cases. The default pattern is "stuff the relevant files into the prompt and pray," which burns tokens on code the model never needed.

jragmunch wraps claude -p with jcodemunch pre-wired so the model retrieves slices on demand instead of receiving giant context dumps. (For team or always-on CI usage, see the auth-mode table above — pass --use-api and bring your own API key.)

Install

pip install jragmunch
jragmunch doctor

Requires the claude CLI on PATH (npm install -g @anthropic-ai/claude-code) and jcodemunch-mcp registered as an MCP server.
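A quick preflight for the PATH requirement can be written with the standard library. This only checks that a binary is discoverable; it does not verify that jcodemunch-mcp is registered, which is what jragmunch doctor is for. `missing_tools` is a hypothetical helper name.

```python
import shutil


def missing_tools(names=("claude",), which=shutil.which):
    """Return the subset of `names` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without the
    real binaries installed.
    """
    return [name for name in names if which(name) is None]
```

An empty list means every named tool was found; anything else is a list of names to install before running jragmunch.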

Usage

jragmunch ask "how does auth work in this repo"
jragmunch ask "what does AuthMiddleware.verify do" --json
jragmunch index --repo .
jragmunch run "Refactor the rate-limiter to use a token bucket"
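When driving these verbs from a script, it helps to build the argument list explicitly rather than formatting a shell string. The sketch below follows the flag placement seen in this README's examples (global options such as --use-api and --config-dir before the verb, verb options after it); `build_command` and `run_verb` are hypothetical helpers, not an API the package exposes.

```python
import subprocess


def build_command(verb, *verb_args, use_api=False, config_dir=None):
    """Assemble an argv list for one jragmunch invocation."""
    cmd = ["jragmunch"]
    if use_api:
        cmd.append("--use-api")
    if config_dir:
        cmd += ["--config-dir", config_dir]
    cmd.append(verb)
    cmd += list(verb_args)
    return cmd


def run_verb(verb, *verb_args, run=subprocess.run, **options):
    """Run a verb and return its stdout as text.

    `run` is injectable for testing; by default it is subprocess.run,
    which raises CalledProcessError on a non-zero exit (check=True).
    """
    proc = run(
        build_command(verb, *verb_args, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout
```

Passing a list avoids shell quoting problems when the question itself contains quotes or spaces.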

Try the side-by-side demo: AskClaude.py

AskClaude.py (in the repo root) is an interactive script that asks one question and shows you, in plain English, what jragmunch saved you.

git clone https://github.com/jgravelle/jragmunch-cli
cd jragmunch-cli
pip install -e ".[dev]"
pip install tiktoken              # optional, for accurate token estimates
python AskClaude.py

It prompts for a local repo path and a question, then prints the answer followed by a comparison block:

In its raw form, your request may have used as many as 799,037 tokens,
at a cost of $11.99.

Using jRagMunch, our call to Opus 4.7 only used 24,771 tokens.

By using your subscription WITHIN THE TERMS OF ANTHROPIC'S TOS, you paid
$0.00 and used a nearly imperceptible fractional percentage of your quota.

The "raw" number is a local projection of what pasting the entire repo into the prompt would have cost (capped at the model's input window). The jragmunch number is the actual marginal tokens this call consumed (input + cache creation + output — cache reads excluded since those are already-paid context being re-presented). Cost figures price the naive projection at Opus 4.7's uncached input rate; subscription mode pays $0 either way.
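The accounting in the paragraph above reduces to two small formulas, sketched here. The usage keys follow the naming of the Anthropic API's usage object; the $15 per million input tokens rate is inferred from the sample figures (799,037 tokens at $11.99) rather than stated by this README, so treat it as an assumption. Both function names are hypothetical.

```python
def marginal_tokens(usage):
    """Input + cache creation + output; cache reads are deliberately excluded."""
    return (
        usage.get("input_tokens", 0)
        + usage.get("cache_creation_input_tokens", 0)
        + usage.get("output_tokens", 0)
    )


def naive_cost_usd(raw_tokens, usd_per_million_input, input_window=None):
    """Price the 'paste the whole repo' projection at an uncached input rate.

    If input_window is given, the projection is capped at that many tokens.
    """
    tokens = raw_tokens if input_window is None else min(raw_tokens, input_window)
    return tokens * usd_per_million_input / 1_000_000


# Illustrative numbers only: the cache_creation figure is made up so the
# total matches the 24,771 tokens in the sample output.
example_usage = {
    "input_tokens": 24,
    "cache_creation_input_tokens": 23_474,
    "cache_read_input_tokens": 500_000,  # ignored by marginal_tokens
    "output_tokens": 1_273,
}
```

With these inputs, `marginal_tokens(example_usage)` gives 24771, and `round(naive_cost_usd(799_037, 15.0), 2)` gives 11.99.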

Use it as a one-shot demo, a sanity check on your own repos, or a template for embedding jragmunch in other tools.

Verbs (v0.1)

Verb | Status | Purpose
doctor | shipped | Verify claude + MCP wiring
ask | shipped | Retrieval-augmented Q&A
index | shipped | Index a repo via jcodemunch
run | shipped | Power-user prompt passthrough
review | shipped | Diff-aware PR review
changelog | shipped | Summarize changes since tag
refactor | shipped | Fan-out batch refactor
tests | shipped | Generate tests for untested symbols
sweep | shipped | Pattern-driven cleanup

Principles

  • Auth-agnostic. jragmunch uses whatever login the local claude binary already has; API-key billing applies only when you pass --use-api.
  • Slice, don't dump. Default behavior is jcodemunch retrieval.
  • Structured output. Every verb returns JSON with citations and _meta (tokens, cost, wall time).
  • Composable. --print-command shows the exact claude -p invocation that would run.
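For the structured-output principle, a consumer might separate the answer payload from its bookkeeping like this. The top-level key names "citations" and "_meta" are taken from the wording above, but their exact shape and the keys inside _meta are not documented here, so the sketch treats both as opaque. `split_payload` is a hypothetical helper.

```python
import json


def split_payload(stdout_text):
    """Parse a verb's JSON output into (body, citations, meta).

    `body` is everything except the two bookkeeping keys. Missing keys
    fall back to empty containers rather than raising.
    """
    payload = json.loads(stdout_text)
    citations = payload.get("citations", [])
    meta = payload.get("_meta", {})
    body = {k: v for k, v in payload.items() if k not in ("citations", "_meta")}
    return body, citations, meta
```

Keeping _meta separate makes it straightforward to log token and cost figures without touching the answer itself.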

License

Apache 2.0
