Cut AI coding token costs by 96-98% — curated context, MCP server, and tray app for AI coding tools

cram-ai

Slash AI coding token costs by injecting only what the model needs — nothing more.

Many AI coding tools auto-index your entire repo at session start. That indexing generates cache writes, the most expensive input token type (on Anthropic's published pricing, a 5-minute cache write costs 1.25× the base input rate, while a cache read costs 0.1×). cram-ai replaces auto-indexing with a small set of curated files that give the model exactly what it needs: repo structure, key decisions, and focused excerpts of only the files relevant to your current task.

Benchmarks

cram-ai itself (31 source files, Python CLI tool)

Scenario	Tokens	Sonnet cost/session	Opus cost/session
Without cram — full repo auto-indexed	49,683	$0.186	$0.932
Without cram — orientation set only¹	19,687	$0.074	$0.369
With cram — ARCHITECTURE + SYMBOLS + task context	1,898	$0.007	$0.036

96% token reduction. $0.18 saved per session. $18 over 100 sessions (Sonnet).

pallets/flask (118 source files, Python web framework)

Scenario	Tokens	Sonnet cost/session	Opus cost/session
Without cram — full repo auto-indexed	171,641	$0.644	$3.22
Without cram — orientation set only¹	60,863	$0.228	$1.14
With cram — ARCHITECTURE + SYMBOLS + task context	5,929	$0.022	$0.111

96.5% token reduction. $0.62 saved per session. $62 over 100 sessions (Sonnet).

hoppscotch/hoppscotch (2,151 source files, TypeScript monorepo)

Scenario	Tokens	Sonnet cost/session
Without cram — full repo auto-indexed	418,697	$1.57
With cram	7,239	$0.027

98.3% token reduction. $1.54 saved per session. $154 over 100 sessions (Sonnet).

¹ Orientation set = file tree + README + pyproject.toml/package.json + 5 largest source files. A realistic estimate for tools that don't index everything.
Pricing: cache writes at $3.75 per million tokens (Claude Sonnet 4.6) and $18.75 per million tokens (Opus 4.8). Every token in the tables is costed as a cache write. Savings scale with team size and session frequency.
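The figures in the tables above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not code from cram-ai: the function names are made up for illustration, and it assumes every token is billed once at the cache-write rates quoted in the pricing note.

```python
# Hypothetical helpers that reproduce the benchmark arithmetic.
# Rates are USD per million tokens, taken from the pricing note.
SONNET_CACHE_WRITE_PER_M = 3.75
OPUS_CACHE_WRITE_PER_M = 18.75


def session_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD of writing `tokens` tokens to the cache once."""
    return tokens * usd_per_million / 1_000_000


def reduction_pct(before: int, after: int) -> float:
    """Percentage of tokens removed when going from `before` to `after`."""
    return (1 - after / before) * 100


# pallets/flask row: full auto-index vs. cram context
flask_full, flask_cram = 171_641, 5_929
print(f"{session_cost(flask_full, SONNET_CACHE_WRITE_PER_M):.3f}")  # → 0.644
print(f"{session_cost(flask_cram, SONNET_CACHE_WRITE_PER_M):.3f}")  # → 0.022
print(f"{reduction_pct(flask_full, flask_cram):.1f}")               # → 96.5
```

Swapping in the Opus rate or another repo's token counts gives the remaining cells.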

How it works

AI agents spend most tokens on orientation — finding relevant files, understanding structure, reading configs. cram-ai replaces that with a curated map the model reads instead of building itself.

your-repo/
└── .cram-ai-context/
    ├── ARCHITECTURE.md   ← repo structure, tech stack, key files (auto-generated by Haiku)
    ├── DECISIONS.md      ← architectural decisions you want the AI to respect
    ├── SYMBOLS.md        ← public function/class index across all source files (auto-generated)
    └── CURRENT_TASK.md   ← per-session: task + focused excerpts of relevant files

SYMBOLS.md is the key accuracy improvement. Rather than asking a model to guess which files matter based on filenames alone, cram maps every source file to its public identifiers (api/routes.py: handle_rate_limit, check_throttle, apply_backoff). The model uses that map to select files and identify the exact functions to excerpt — so "fix the rate limiter" finds check_throttle even if the words don't match.
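To make the idea of a symbol index concrete, here is a rough sketch of how one entry could be built for a Python file using the standard `ast` module. It is an illustration only: cram's own indexer is not shown in this README and also covers non-Python sources, and `public_symbols` and `index_line` are hypothetical names.

```python
import ast


def public_symbols(source: str) -> list[str]:
    """Return top-level public function and class names in Python source."""
    tree = ast.parse(source)
    names = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Treat a leading underscore as "private" (an assumption).
            if not node.name.startswith("_"):
                names.append(node.name)
    return names


def index_line(path: str, source: str) -> str:
    """Format one index entry as 'path: name, name, ...'."""
    return f"{path}: {', '.join(public_symbols(source))}"


example = '''
def handle_rate_limit(request): ...
def check_throttle(user): ...
def _internal_helper(): ...
def apply_backoff(attempt): ...
'''
print(index_line("api/routes.py", example))
# → api/routes.py: handle_rate_limit, check_throttle, apply_backoff
```

Because the index lists identifiers rather than file names, a task phrased as "fix the rate limiter" can be matched to `check_throttle` by the model even though the file is called `routes.py`.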

Run cram task "..." before every session. It works in four stages:

[1/4] Loads SYMBOLS.md — 455 identifiers across 65 files, zero LLM calls
[2/4] Sends architecture + symbol index to Haiku → model returns path | RelevantFunc, OtherClass
[3/4] Extracts identifier-focused excerpts: only the lines that contain those functions, plus a few surrounding lines of context
[4/4] Writes to your tool's instruction file and warns if the result is below the minimum cacheable prompt length for your model

All stages stream live to the popup so you see exactly what's happening.
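Stages 2 and 3 can be sketched as two small functions: one that parses a `path | RelevantFunc, OtherClass` reply line, and one that keeps only the lines mentioning the chosen identifiers plus a little surrounding context. This is an assumption-laden illustration, not cram's implementation; the function names, the substring matching, and the `context` default are all hypothetical, while the `max_lines` default mirrors the documented `AICONTEXT_MAX_EXCERPT_LINES` value of 80.

```python
def parse_selection(line: str) -> tuple[str, list[str]]:
    """Split 'path | NameA, NameB' into (path, [names])."""
    path, _, names = line.partition("|")
    return path.strip(), [n.strip() for n in names.split(",") if n.strip()]


def excerpt(
    lines: list[str],
    identifiers: list[str],
    context: int = 2,
    max_lines: int = 80,
) -> list[tuple[int, str]]:
    """Keep lines that mention any identifier, plus `context` lines on each side.

    Returns (1-based line number, text) pairs, capped at `max_lines`.
    """
    keep: set[int] = set()
    for i, text in enumerate(lines):
        if any(name in text for name in identifiers):
            start = max(0, i - context)
            stop = min(len(lines), i + context + 1)
            keep.update(range(start, stop))
    return [(i + 1, lines[i]) for i in sorted(keep)][:max_lines]


source = [
    "import time",
    "",
    "def check_throttle(user):",
    "    return user.hits > 10",
    "",
    "",
    "def other():",
    "    pass",
]
path, names = parse_selection("api/routes.py | check_throttle, apply_backoff")
for number, text in excerpt(source, names, context=1):
    print(f"{path}:{number}: {text}")
```

A real extractor would more likely cut on function boundaries than on a fixed number of lines; the fixed window is used here only to keep the sketch short.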

Quick start

pip install cram-ai

cd your-repo
cram init                              # one-time setup — scans repo, generates docs, indexes symbols
cram task "add login validation"       # run before every session
# → context pre-loaded into your AI tool
cram sync                              # run after every commit (or fires automatically via git hook)

First command to context ready: under 60 seconds.

CLI commands

Command	When to run	What it does
`cram init`	Once per repo	Scans structure, generates `ARCHITECTURE.md` + `SYMBOLS.md` via Haiku
`cram task "..."`	Before every session	Identifies relevant files by symbol, inlines focused excerpts
`cram continue`	Mid-session before committing	Extends grace period — prevents context reset on mid-task commits
`cram sync`	After every commit	Updates `ARCHITECTURE.md` + `SYMBOLS.md` from git diff
`cram decide "..."`	When making arch choices	Appends a dated decision entry to `DECISIONS.md`
`cram status`	Anytime	Shows `.cram-ai-context/` files and freshness
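As an illustration of what "appends a dated decision entry" could look like, the sketch below writes one line per decision. The entry format (`- YYYY-MM-DD: text`) and the function name are assumptions made for this example; the README does not specify the layout cram actually uses in `DECISIONS.md`.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_decision(decisions_file: Path, text: str, today: Optional[date] = None) -> str:
    """Append one dated entry to the decisions file and return the line written."""
    stamp = (today or date.today()).isoformat()
    entry = f"- {stamp}: {text}\n"  # hypothetical entry format
    decisions_file.parent.mkdir(parents=True, exist_ok=True)
    with decisions_file.open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return entry
```

Calling `append_decision(Path(".cram-ai-context/DECISIONS.md"), "use Redis for sessions")` would add a single line and leave earlier entries untouched, which is the behavior the table describes for `cram decide`.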

Provider support

The tool is model-agnostic. Set AICONTEXT_MODEL to a provider/model string to choose the provider:

# Claude CLI (default — works inside Claude Code with no API key)
cram init

# Anthropic SDK
export ANTHROPIC_API_KEY=sk-...
export AICONTEXT_MODEL=anthropic/claude-haiku-4-5-20251001
cram init

# OpenAI
export OPENAI_API_KEY=sk-...
export AICONTEXT_MODEL=openai/gpt-4o-mini
cram init

# Google Gemini
export GEMINI_API_KEY=...
export AICONTEXT_MODEL=gemini/gemini-2.0-flash
cram init

# Local (Ollama — free, no key needed)
export AICONTEXT_MODEL=ollama/mistral
cram init

Also supports: AWS Bedrock, GCP Vertex AI, Azure OpenAI, custom LiteLLM proxies — auto-discovered from env/credentials.
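The `provider/model` convention used in the examples above is easy to take apart. The helper below is a hypothetical sketch of that split, assuming the provider is everything before the first slash and that a value without a slash is a bare alias; how cram resolves aliases and auto-detects a default is not covered here.

```python
from typing import Optional, Tuple


def split_model(spec: str) -> Tuple[Optional[str], str]:
    """Split 'provider/model' into (provider, model).

    A value with no slash is treated as a bare alias: (None, alias).
    """
    provider, sep, model = spec.partition("/")
    if not sep:
        return None, spec
    return provider, model


for spec in ("anthropic/claude-haiku-4-5-20251001", "ollama/mistral", "haiku"):
    print(spec, "->", split_model(spec))
```

Splitting on the first slash only means model names that themselves contain a slash stay intact in the second element.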

Session discipline

The context files handle orientation. These rules handle the rest:

Run cram task "..." before every session — never let the model hunt for files itself.
Hard session boundary — end the session the moment a feature works. New code = growing context = rising cost.
Mid-task commit? Run cram continue first to extend the grace period.
Run cram sync after every commit — keeps ARCHITECTURE.md and SYMBOLS.md accurate.
Architectural decision? Run cram decide "use Redis for sessions" — keeps DECISIONS.md current without opening the file.

Environment variables

Variable	Default	Description
`AICONTEXT_MODEL`	auto-detected	Model for context tasks — bare alias or `provider/model`
`ANTHROPIC_API_KEY`	—	Anthropic API key (optional inside Claude Code)
`AICONTEXT_MAX_FILES`	`5`	Max files inlined per task
`AICONTEXT_MAX_LINES`	`300`	Max lines in `ARCHITECTURE.md`
`AICONTEXT_MAX_EXCERPT_LINES`	`80`	Max lines excerpted per file in `CURRENT_TASK.md`
`CRAM_TASK_GRACE_SECONDS`	`600`	Seconds after `cram task` before a commit resets context
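The numeric settings in the table could be read as follows. This is a sketch under stated assumptions: the `Settings` class, `load_settings`, and `commit_resets_context` are illustrative names, and the grace-period check encodes one plausible reading of the table (a commit made within the grace period after `cram task` leaves the context alone, a later one resets it).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    # Defaults copied from the environment variable table.
    max_files: int = 5
    max_lines: int = 300
    max_excerpt_lines: int = 80
    grace_seconds: int = 600


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, falling back to the defaults."""
    def as_int(name: str, default: int) -> int:
        return int(env.get(name, default))

    return Settings(
        max_files=as_int("AICONTEXT_MAX_FILES", 5),
        max_lines=as_int("AICONTEXT_MAX_LINES", 300),
        max_excerpt_lines=as_int("AICONTEXT_MAX_EXCERPT_LINES", 80),
        grace_seconds=as_int("CRAM_TASK_GRACE_SECONDS", 600),
    )


def commit_resets_context(task_started_at: float, commit_at: float, grace_seconds: int) -> bool:
    """True when the commit lands after the grace period has elapsed (assumed rule)."""
    return commit_at - task_started_at > grace_seconds
```

Under this reading, `cram continue` would amount to moving `task_started_at` forward so that a mid-task commit still falls inside the window.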

Works with any AI coding tool

Tool	How context loads
Claude Code	Reads `.cram-ai-context/` recursively — all files auto-loaded
Cursor	Writes to `.cursor/rules/cram-task.md` — auto-loaded by Cursor
Windsurf	Writes to `.windsurf/rules/cram-task.md` — auto-loaded
Codex	Writes to `.cram-ai-context/AGENTS.md` — auto-loaded
GitHub Copilot	Writes to `.github/cram-task.md` — include once in `copilot-instructions.md`

For non-Claude tools, cram automatically prepends a compact architecture summary so the model has repo orientation even without recursive file loading.
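The routing described in the table can be pictured as a lookup from tool to instruction file, plus the rule about prepending an architecture summary for non-Claude tools. The sketch below is illustrative only: the dictionary keys, function names, and the choice of `CURRENT_TASK.md` as the Claude Code target are assumptions, while the other paths are copied from the table.

```python
from pathlib import Path

# Paths copied from the table above; the tool keys are made-up labels.
INSTRUCTION_FILES = {
    "claude-code": ".cram-ai-context/CURRENT_TASK.md",  # assumed target
    "cursor": ".cursor/rules/cram-task.md",
    "windsurf": ".windsurf/rules/cram-task.md",
    "codex": ".cram-ai-context/AGENTS.md",
    "copilot": ".github/cram-task.md",
}


def instruction_path(repo: Path, tool: str) -> Path:
    """Return where the task context for `tool` would be written inside `repo`."""
    return repo / INSTRUCTION_FILES[tool]


def render_context(tool: str, task_context: str, architecture_summary: str) -> str:
    """Prepend the architecture summary for tools that do not load the whole folder."""
    if tool == "claude-code":
        return task_context
    return f"{architecture_summary}\n\n{task_context}"


print(instruction_path(Path("your-repo"), "cursor"))
```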

Running tests

pip install pytest
pytest

57 passing tests, no API key required. All model calls are mocked.
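For readers unfamiliar with mocked model calls, a test in this style might look like the sketch below. It is not taken from cram's test suite: `select_files` is a hypothetical stand-in for code that asks a model which files matter, and the model is replaced with a `unittest.mock.Mock` so no API key or network access is needed.

```python
from unittest.mock import Mock


def select_files(ask_model, task: str) -> list[str]:
    """Hypothetical function under test: ask the model, return the chosen paths."""
    reply = ask_model(f"Task: {task}")
    return [line.split("|")[0].strip() for line in reply.splitlines() if line.strip()]


def test_select_files_uses_mocked_model():
    # The fake model returns a canned 'path | identifiers' reply.
    fake_model = Mock(return_value="api/routes.py | check_throttle\n")
    assert select_files(fake_model, "fix the rate limiter") == ["api/routes.py"]
    fake_model.assert_called_once()
```

Passing the model in as a callable keeps the test deterministic and makes the fake trivial to swap in.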

License

MIT


Release history

0.3.0: Jun 10, 2026
0.2.1: Jun 8, 2026
0.2.0: Jun 8, 2026
0.1.0: Jun 8, 2026 (this version)

Download files

Source Distribution

cram_ai-0.1.0.tar.gz (62.6 kB)

Uploaded Jun 8, 2026 Source

Built Distribution

cram_ai-0.1.0-py3-none-any.whl (65.4 kB)

Uploaded Jun 8, 2026 Python 3

File details

Details for the file cram_ai-0.1.0.tar.gz.

File metadata

Download URL: cram_ai-0.1.0.tar.gz
Upload date: Jun 8, 2026
Size: 62.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for cram_ai-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`f4b761e618c9ca5f173b75cab659ef173f6ba38125364c3e768c080341d0c628`
MD5	`d4d53002cf8bbbd8c3e6fd312e4844cc`
BLAKE2b-256	`d716da6faacd5fed219309295d201d07e833d9126d954bccc58020e83c5d6fdc`
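A downloaded archive can be checked against the SHA256 digest in the table with the standard `hashlib` module. This is a generic sketch: the function name is illustrative, and the path in the commented usage line is an assumption about where the file was saved.

```python
import hashlib

# SHA256 digest for cram_ai-0.1.0.tar.gz, copied from the table above.
EXPECTED_SDIST_SHA256 = "f4b761e618c9ca5f173b75cab659ef173f6ba38125364c3e768c080341d0c628"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage, assuming the sdist was saved to the current directory:
#   sha256_of("cram_ai-0.1.0.tar.gz") == EXPECTED_SDIST_SHA256
```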

File details

Details for the file cram_ai-0.1.0-py3-none-any.whl.

File metadata

Download URL: cram_ai-0.1.0-py3-none-any.whl
Upload date: Jun 8, 2026
Size: 65.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for cram_ai-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7f917c9f0cec6fbb31cb61c9872eb0b4925a97141f9459da86e0e0bfd8cbaaa6`
MD5	`b8b0e6ec0766f89c3c75144d35555119`
BLAKE2b-256	`e94aa917f178cb59b7cedfc8b653cf0f735116fb6d8fafd6fb613ea20b01a67b`
