Extract the essence of your codebase. Auto-generate AGENTS.md, CLAUDE.md, .cursorrules and more.
What is saar?
saar is a CLI that analyzes your codebase and writes an AGENTS.md: a precise context file that every AI coding tool reads automatically.
One command. Claude Code, Cursor, claude.ai, Copilot, Gemini CLI. They all stop guessing and start knowing.
The problem
I asked Claude to install a package. It said npm install. My project uses bun. The build broke. I spent 20 minutes confused.
This happens to every developer using AI tools. Every week.
- AI writes `npm install` in your `bun` repo
- AI invents a new exception class when you already have 218
- AI uses the wrong auth decorator from 10 available options
- AI uses `import logging` when your team standardized on `structlog`
- Every session starts from zero. No memory of how your project actually works.
The fix exists: a context file that tells the AI exactly how your codebase works. But writing one well is hard, it goes stale fast, and nobody maintains it.
saar automates the hard part.
Quick start
pip install saar
cd your-project
saar extract .
Done. AGENTS.md is in your project root. Every AI tool picks it up automatically.
What you see when it runs:
saar analyzing your-project...
Backend FastAPI Python (47 files)
Frontend React TypeScript Vite bun
Auth get_current_active_superuser (from app.api.deps)
Logging structlog
Exceptions APIError, AuthenticationError, LimitCheckError (+6 more)
Scale 694 functions 276 files 96% typed
wrote AGENTS.md (72 lines)
Claude knows your project.
saar found your auth pattern, your logging library, your exception classes, and your package manager. You didn't tell it any of that.
How it works
your repo
|
v
saar extract .
|
+-- static analysis ------- detects stack, auth, logging, naming, exceptions
|
+-- guided interview ------- 5 questions for tribal knowledge:
| off-limits files, domain terms, team gotchas
|
+-- AGENTS.md -------------- ~100 lines, picked up automatically by:
Claude Code, Cursor, claude.ai, Copilot, Gemini CLI
saar generates short, precise files. Not 300-line dumps. ETH Zurich (Feb 2026, arxiv:2602.11988) showed that long LLM-generated context files reduce task success and increase costs 20%+. saar's default is 100 lines. Focused. Nothing wasted.
Before / After
Without saar (claude.ai, no context):
Q: Add debug logging to the Python endpoint.
import logging
logger = logging.getLogger(__name__)
Wrong. This codebase uses structlog.
With saar (same question, AGENTS.md loaded):
Q: Add debug logging to the Python endpoint.
import structlog
logger = structlog.get_logger(__name__)
# structlog: structured JSON output, standard for this project
Right. First try. No back-and-forth.
This is a real test result from a controlled eval on the PostHog codebase. 174 Python files use import logging. Claude follows the majority without context. AGENTS.md overrides it.
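The majority effect is easy to measure on your own repo. Below is a minimal sketch, not saar's actual implementation, that uses the standard `ast` module to count how many Python files import each logging library. The names `logging_modules_in`, `count_logging_imports`, and the `LOGGING_MODULES` candidate set are assumptions made for illustration.

```python
import ast
from collections import Counter
from pathlib import Path

# Hypothetical candidate set; a real tool would likely track more libraries.
LOGGING_MODULES = {"logging", "structlog", "loguru"}


def logging_modules_in(source: str) -> set[str]:
    """Return the logging-related top-level modules imported by one file."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            if top in LOGGING_MODULES:
                found.add(top)
    return found


def count_logging_imports(root: str) -> Counter:
    """Count, per logging module, how many .py files under root import it."""
    counts = Counter()
    for path in Path(root).rglob("*.py"):
        try:
            counts.update(logging_modules_in(path.read_text(encoding="utf-8")))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that do not parse as Python
    return counts
```

Calling `count_logging_imports(".").most_common()` lists the libraries by file count. If the majority is not the library your team standardized on, that is exactly the case where an explicit rule in AGENTS.md matters.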
When your codebase changes, saar tells you
saar diff .
saar checking your-project for changes...
AGENTS.md last generated: 14 days ago
Changed since last extract:
~ Package manager changed: npm -> bun
+ New exception class: RateLimitError
+ New auth pattern detected
Run saar extract . to update.
Your AGENTS.md was telling Claude to use npm. saar caught it before you committed broken code.
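The idea behind staleness detection can be sketched as a comparison of two snapshots of detected facts. This is an illustration under stated assumptions, not how saar stores or compares its results: the snapshot shape (a plain dict with scalar or list values) and the function name `diff_snapshots` are hypothetical.

```python
def diff_snapshots(old: dict, new: dict) -> list[str]:
    """Describe what changed between two snapshots of detected facts.

    Scalar values that differ are reported with "~"; items added to a
    list-valued key are reported with "+". Removed items are ignored here.
    """
    changes = []
    for key in sorted(new):
        before, after = old.get(key), new[key]
        if isinstance(after, list):
            previous = before if isinstance(before, list) else []
            for item in after:
                if item not in previous:
                    changes.append(f"+ New {key}: {item}")
        elif before != after:
            changes.append(f"~ {key} changed: {before} -> {after}")
    return changes


# Hypothetical snapshots: one from the last extract, one from today.
old = {"package manager": "npm", "exception class": ["APIError"]}
new = {"package manager": "bun", "exception class": ["APIError", "RateLimitError"]}
for line in diff_snapshots(old, new):
    print(line)
```

Keeping the comparison on extracted facts rather than on raw file contents means unrelated code churn does not register as a change.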
Keep corrections over time
AI gets something wrong? Add it once. Never see that mistake again.
saar add "Never use npm, this project uses bun"
saar add --off-limits "billing/ -- legacy Stripe integration, frozen until Q3"
saar add --domain "Workspace = tenant, not a directory"
saar add --verify "source venv/bin/activate && pytest tests/ -v"
No re-analysis. Each correction appends to .saar/config.json and gets included next time you run saar extract.
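Appending a correction without re-running the analysis amounts to a small read-modify-write on a JSON file. The sketch below shows that pattern only. The schema is an assumption (every key maps to a list of strings), as are the key names and the helper names `add_correction` and `append_to_file`; the real layout of .saar/config.json may differ.

```python
import json
from pathlib import Path


def add_correction(config: dict, kind: str, text: str) -> dict:
    """Return a copy of config with text appended under kind, skipping duplicates.

    Assumes a hypothetical schema where every value is a list of strings.
    """
    updated = {key: list(values) for key, values in config.items()}
    entries = updated.setdefault(kind, [])
    if text not in entries:
        entries.append(text)
    return updated


def append_to_file(path: Path, kind: str, text: str) -> None:
    """Load the JSON file if it exists, append one correction, write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(add_correction(config, kind, text), indent=2),
        encoding="utf-8",
    )
```

Because the corrections live in their own file, they survive every regeneration of AGENTS.md instead of being overwritten by it.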
saar vs everything else
| Feature | saar | /init (Claude Code) | manual |
|---|---|---|---|
| Detects package manager | ✅ | basic | you write it |
| Detects logging library | ✅ | ✗ | you write it |
| Detects auth patterns | ✅ | basic | you write it |
| Detects exception classes | ✅ | ✗ | you write it |
| Tribal knowledge interview | ✅ | ✗ | you know it |
| Output size | ~100 lines | 300+ lines | up to you |
| Staleness detection (saar diff) | ✅ | ✗ | ✗ |
| Quality linting (saar lint) | ✅ | ✗ | ✗ |
| Works with all AI tools | ✅ | Claude only | ✅ |
| Free + fully local | ✅ | ✅ | ✅ |
Claude Code's /init is useful. But it generates bloated files that ETH Zurich showed hurt performance. saar generates focused files and keeps them honest over time.
All commands
# Generate
saar extract . # AGENTS.md (default, ~100 lines)
saar extract . --format claude # CLAUDE.md
saar extract . --format cursorrules # .cursorrules
saar extract . --format all # all formats at once
saar extract . --no-interview # skip questions, use cached answers
saar extract . --verbose # remove 100-line cap, full output
saar extract . --include packages/api # monorepo subset
# Maintain
saar diff . # detect what changed since last extract
saar add "rule" # add correction without re-running
saar add --off-limits "path/" # mark file/dir as off-limits for AI
saar add --domain "term = definition" # add domain vocabulary
saar add --verify "command" # set the verification workflow
# Quality
saar lint . # check AGENTS.md for SA001-SA005 violations
saar stats . # score your AGENTS.md (0-100)
saar check . # CI primitive: exits 1 if stale or incomplete
# AI enrichment (requires ANTHROPIC_API_KEY)
saar enrich # use Claude to sharpen raw interview answers
# OCI integration
saar extract . --index # generate AGENTS.md + index into OCI
saar lint
saar lint .
AGENTS.md:5:1: SA004 Generic filler: 'Write clean code' -- AI already knows this
AGENTS.md:12:1: SA001 Duplicate rule: already appears on line 3
Found 2 violations. Run saar stats . for a full quality score.
Like ruff, but for your context file. Catches:
- SA001 duplicate rules
- SA002 orphaned section headers
- SA003 vague rules under 6 words
- SA004 generic filler (write clean code, follow best practices)
- SA005 emojis that waste instruction budget
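To make three of these rules concrete, here is a rough sketch of checks in the spirit of SA001, SA003, and SA004. It is not saar's linter: the function name `lint_rules`, the decision to treat only bullet items as rules, the `FILLER` phrase list, and the exact message wording are all assumptions. Only the rule codes and the `file:line:col:` prefix come from the output shown above.

```python
import re

# Hypothetical filler phrases; only these two examples are named in the README.
FILLER = ("write clean code", "follow best practices")


def lint_rules(lines: list[str], filename: str = "AGENTS.md") -> list[str]:
    """Check bullet rules for duplicates (SA001), vagueness (SA003), filler (SA004)."""
    violations = []
    first_seen: dict[str, int] = {}
    for number, raw in enumerate(lines, start=1):
        match = re.match(r"\s*[-*]\s+(.*\S)", raw)
        if not match:
            continue  # only bullet items are treated as rules in this sketch
        rule = match.group(1)
        key = rule.lower()
        if key in first_seen:
            violations.append(
                f"{filename}:{number}:1: SA001 Duplicate rule: "
                f"already appears on line {first_seen[key]}"
            )
            continue
        first_seen[key] = number
        if any(phrase in key for phrase in FILLER):
            violations.append(f"{filename}:{number}:1: SA004 Generic filler: '{rule}'")
        elif len(rule.split()) < 6:
            violations.append(f"{filename}:{number}:1: SA003 Vague rule: fewer than 6 words")
    return violations
```

Run it with `lint_rules(Path("AGENTS.md").read_text().splitlines())` after importing `Path` from `pathlib`. Filler is checked before word count so that a short generic rule is reported once, under the more specific code.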
saar check (CI)
# .github/workflows/ci.yml
- run: saar check .
Exits 0 if AGENTS.md is fresh and complete. Exits 1 with a specific message if not. Never let a stale context file slip into production.
OCI — semantic search via MCP
saar generates your AGENTS.md. OpenCodeIntel (OCI) indexes your codebase for per-task context via MCP.
saar extract . --index
Once indexed, Claude Desktop and Claude Code get a new tool:
codeintel:get_context_for_task("add rate limiting to the settings endpoints")
Returns:
- backend/routes/settings.py (94% relevance)
- backend/middleware/auth.py (81% relevance)
- Rule: use LimitCheckError, not a new exception
- Rule: require_auth on all user endpoints
Instead of exploring 30k tokens of files, Claude gets the exact 3 files and 2 rules for the task.
What saar detects
Python — FastAPI / Flask / Django, auth middleware and decorators, logging library, exception class hierarchy, ORM patterns, naming conventions
TypeScript/JS — React / Next.js / Express, package manager (bun / pnpm / npm / yarn), TanStack Query / SWR patterns, component library, custom hooks, common imports
Both — critical files (most depended-on), circular dependencies, canonical examples per category, existing team rules (reads CLAUDE.md, .cursorrules, CONVENTIONS.md)
Installation
# Recommended
pipx install saar
# Standard
pip install saar
# With AI enrichment
pip install "saar[enrich]"
export ANTHROPIC_API_KEY=sk-ant-...
Requires Python 3.10+. No account. No API key for core features. Runs entirely on your machine.
Contributing
saar is MIT licensed. Everything is public: commits, decisions, benchmarks.
git clone https://github.com/OpenCodeIntel/saar.git
cd saar
python -m venv venv && source venv/bin/activate
pip install -e ".[dev]"
pytest tests/ -v # 548 tests
ruff check saar/ tests/ # lint
# verify saar on itself
saar extract . --no-interview
saar lint .
saar stats .
Good first issues: look for the "good first issue" label at github.com/OpenCodeIntel/saar/issues.
If you're building a feature, open an issue first. Saves everyone time.
Why I built this
I'm Devanshu, MS Software Engineering at Northeastern, solo founder building this in the open.
I got tired of AI tools that sounded smart but didn't know my project. Every session: wrong package manager, wrong exception class, wrong import. The fix was obvious: give the AI a context file. The problem was nobody maintained those files, they went stale, and most were full of generic filler the AI already knew.
So I built saar. It generates the file, keeps it short, tells you when it's stale, and lints it for quality. Runs locally. Costs nothing. Works with every AI tool you already use.
The code is all here. The benchmarks are all here. Nothing hidden.
Community
- Issues: github.com/OpenCodeIntel/saar/issues
- Discussions: github.com/OpenCodeIntel/saar/discussions
- Website: getsaar.com
- OCI: opencodeintel.com
License
MIT. Free forever. Do whatever you want with it.
If saar saved you time, a star on GitHub helps others find it.
Download files
File details
Details for the file saar-0.5.14.tar.gz.
File metadata
- Download URL: saar-0.5.14.tar.gz
- Upload date:
- Size: 456.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 04b3d669abf7e25771fff93aa7fa772da93dfedee21d4d2510afd4b8792cb3b6 |
| MD5 | c3b03a47a57c441fa293446c6e63cede |
| BLAKE2b-256 | b5ed99a86bbfaa0dc8964e19499b2ae2cbd0e72ac0198b13b46edce0600c320f |
File details
Details for the file saar-0.5.14-py3-none-any.whl.
File metadata
- Download URL: saar-0.5.14-py3-none-any.whl
- Upload date:
- Size: 106.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 58584d70efbb2597df912ec4f450786b7a3c3bf91cae3bb300e214ada30e369b |
| MD5 | 4e3ea8907c27a4fcbfe4f18e1b96db28 |
| BLAKE2b-256 | 4a218216da4038beb42037d9549b7aaa9668702da35e6e51342e5c0560cd12d3 |