
Extract the essence of your codebase. Auto-generate AGENTS.md, CLAUDE.md, .cursorrules and more.



getsaar.com  ·  Docs  ·  PyPI  ·  Issues  ·  OCI



What is saar?

saar is a CLI that analyzes your codebase and writes an AGENTS.md: a precise context file that every AI coding tool reads automatically.

One command. Claude Code, Cursor, claude.ai, Copilot, Gemini CLI. They all stop guessing and start knowing.


The problem

I asked Claude to install a package. It said npm install. My project uses bun. The build broke. I spent 20 minutes confused.

This happens to every developer using AI tools. Every week.

  • AI writes npm install in your bun repo
  • AI invents a new exception class when you already have 218
  • AI uses the wrong auth decorator from 10 available options
  • AI uses import logging when your team standardized on structlog
  • Every session starts from zero. No memory of how your project actually works.

The fix exists: a context file that tells the AI exactly how your codebase works. But writing one well is hard, it goes stale fast, and nobody maintains it.

saar automates the hard part.


Quick start

pip install saar
cd your-project
saar extract .

Done. AGENTS.md is in your project root. Every AI tool picks it up automatically.

What you see when it runs:

saar analyzing your-project...

  Backend     FastAPI  Python (47 files)
  Frontend    React  TypeScript  Vite  bun
  Auth        get_current_active_superuser  (from app.api.deps)
  Logging     structlog
  Exceptions  APIError, AuthenticationError, LimitCheckError (+6 more)
  Scale       694 functions  276 files  96% typed

  wrote AGENTS.md  (72 lines)
  Claude knows your project.

saar found your auth pattern, your logging library, your exception classes, and your package manager. You didn't tell it any of that.
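
To give a feel for what this kind of detection involves, here is a minimal sketch of one signal: inferring the package manager from lockfiles. It is not saar's actual implementation. The function name detect_package_manager and the priority order are assumptions made for illustration; the lockfile names are the ones each package manager writes by default.

```python
from pathlib import Path
from typing import Optional

# (lockfile name, package manager). The order is an assumption for this
# sketch: the first lockfile found wins.
LOCKFILES = [
    ("bun.lockb", "bun"),
    ("bun.lock", "bun"),
    ("pnpm-lock.yaml", "pnpm"),
    ("yarn.lock", "yarn"),
    ("package-lock.json", "npm"),
]


def detect_package_manager(repo_root: str) -> Optional[str]:
    """Return the package manager implied by the first lockfile found, or None."""
    root = Path(repo_root)
    for filename, manager in LOCKFILES:
        if (root / filename).is_file():
            return manager
    return None
```

A real detector would likely also look at the packageManager field in package.json and at lockfiles in subdirectories. This sketch only checks the repo root.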


How it works

your repo
    |
    v
saar extract .
    |
    +-- static analysis ------- detects stack, auth, logging, naming, exceptions
    |
    +-- guided interview ------- 5 questions for tribal knowledge:
    |                             off-limits files, domain terms, team gotchas
    |
    +-- AGENTS.md -------------- ~100 lines, picked up automatically by:
                                  Claude Code, Cursor, claude.ai, Copilot, Gemini CLI

saar generates short, precise files. Not 300-line dumps. ETH Zurich (Feb 2026, arxiv:2602.11988) showed that long LLM-generated context files reduce task success and increase costs 20%+. saar's default is 100 lines. Focused. Nothing wasted.


Before / After

Without saar (claude.ai, no context):

Q: Add debug logging to the Python endpoint.

import logging
logger = logging.getLogger(__name__)

Wrong. This codebase uses structlog.

With saar (same question, AGENTS.md loaded):

Q: Add debug logging to the Python endpoint.

import structlog
logger = structlog.get_logger(__name__)
# structlog: structured JSON output, standard for this project

Right. First try. No back-and-forth.

This is a real test result from a controlled eval on the PostHog codebase. 174 Python files use import logging. Claude follows the majority without context. AGENTS.md overrides it.
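
The import count above is the kind of number you can measure yourself with the standard library. The sketch below counts how many files import each logging library. It is an illustration only: count_logging_imports and the CANDIDATES set are hypothetical names, and how saar decides which library is the team standard is not described here.

```python
import ast
from collections import Counter
from typing import Dict

# Libraries to look for. This set is an assumption for the sketch.
CANDIDATES = {"logging", "structlog", "loguru"}


def count_logging_imports(sources: Dict[str, str]) -> Counter:
    """Count, per library, how many files import it at least once.

    `sources` maps a file path to that file's source code.
    """
    counts: Counter = Counter()
    for _path, code in sources.items():
        try:
            tree = ast.parse(code)
        except SyntaxError:
            continue  # skip files that do not parse
        found = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                for alias in node.names:
                    found.add(alias.name.split(".")[0])
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                found.add(node.module.split(".")[0])
        for lib in found & CANDIDATES:
            counts[lib] += 1
    return counts
```

A raw count like this is exactly what misleads an AI tool without context: the majority import is not always the convention the team wants.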


When your codebase changes, saar tells you

saar diff .
saar checking your-project for changes...

  AGENTS.md last generated: 14 days ago

  Changed since last extract:
  ~ Package manager changed: npm -> bun
  + New exception class: RateLimitError
  + New auth pattern detected

  Run saar extract . to update.

Your AGENTS.md was telling Claude to use npm. saar caught it before you committed broken code.
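
Conceptually, staleness detection is a comparison between what was detected at the last extract and what is detected now. The sketch below shows that idea under stated assumptions: the snapshot dictionaries, their keys, and the function diff_snapshots are all hypothetical and are not saar's internal format.

```python
from typing import Dict, List


def diff_snapshots(old: Dict[str, object], new: Dict[str, object]) -> List[str]:
    """Describe what changed between two detection snapshots.

    Scalar values that differ are reported with "~".
    For list values, added items get "+" and removed items get "-".
    """
    changes: List[str] = []
    for key in sorted(old.keys() | new.keys()):
        before, after = old.get(key), new.get(key)
        if before == after:
            continue
        if isinstance(before, list) and isinstance(after, list):
            for item in sorted(set(after) - set(before)):
                changes.append(f"+ {key}: {item}")
            for item in sorted(set(before) - set(after)):
                changes.append(f"- {key}: {item}")
        else:
            changes.append(f"~ {key}: {before} -> {after}")
    return changes


if __name__ == "__main__":
    old = {"package_manager": "npm", "exceptions": ["APIError"]}
    new = {"package_manager": "bun", "exceptions": ["APIError", "RateLimitError"]}
    for line in diff_snapshots(old, new):
        print(line)
```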


Keep corrections over time

AI gets something wrong? Add it once. Never see that mistake again.

saar add "Never use npm, this project uses bun"
saar add --off-limits "billing/ -- legacy Stripe integration, frozen until Q3"
saar add --domain "Workspace = tenant, not a directory"
saar add --verify "source venv/bin/activate && pytest tests/ -v"

No re-analysis. Each correction is appended to .saar/config.json and is included the next time you run saar extract.
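
The append-only idea is simple enough to sketch. The code below shows one way a correction could be added to a JSON file without touching anything else in it. The schema (one list per category) and the function add_correction are assumptions made for illustration; the real layout of .saar/config.json may differ.

```python
import json
from pathlib import Path
from typing import Dict, List


def add_correction(config_path: str, category: str, text: str) -> Dict[str, List[str]]:
    """Append `text` under `category` in a JSON config, creating the file if needed.

    Hypothetical schema: {"rules": [...], "off_limits": [...], "domain": [...]}.
    """
    path = Path(config_path)
    if path.exists():
        config = json.loads(path.read_text(encoding="utf-8"))
    else:
        config = {}
    entries = config.setdefault(category, [])
    if text not in entries:  # adding the same correction twice is a no-op
        entries.append(text)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Because the file is plain JSON, corrections can be reviewed in a pull request like any other change.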


saar vs everything else

Feature                          saar         /init (Claude Code)   manual
Detects package manager          yes          basic                 you write it
Detects logging library          yes                                you write it
Detects auth patterns            yes          basic                 you write it
Detects exception classes        yes                                you write it
Tribal knowledge interview       yes                                you know it
Output size                      ~100 lines   300+ lines            up to you
Staleness detection (saar diff)  yes
Quality linting (saar lint)      yes
Works with all AI tools          yes          Claude only
Free + fully local               yes

Claude Code's /init is useful. But it generates bloated files that ETH Zurich showed hurt performance. saar generates focused files and keeps them honest over time.


All commands

# Generate
saar extract .                          # AGENTS.md (default, ~100 lines)
saar extract . --format claude          # CLAUDE.md
saar extract . --format cursorrules     # .cursorrules
saar extract . --format all             # all formats at once
saar extract . --no-interview           # skip questions, use cached answers
saar extract . --verbose                # remove 100-line cap, full output
saar extract . --include packages/api   # monorepo subset

# Maintain
saar diff .                             # detect what changed since last extract
saar add "rule"                         # add correction without re-running
saar add --off-limits "path/"           # mark file/dir as off-limits for AI
saar add --domain "term = definition"   # add domain vocabulary
saar add --verify "command"             # set the verification workflow

# Quality
saar lint .                             # check AGENTS.md for SA001-SA005 violations
saar stats .                            # score your AGENTS.md (0-100)
saar check .                            # CI primitive: exits 1 if stale or incomplete

# AI enrichment (requires ANTHROPIC_API_KEY)
saar enrich                             # use Claude to sharpen raw interview answers

# OCI integration
saar extract . --index                  # generate AGENTS.md + index into OCI

saar lint

saar lint .

  AGENTS.md:5:1:  SA004  Generic filler: 'Write clean code' -- AI already knows this
  AGENTS.md:12:1: SA001  Duplicate rule: already appears on line 3

  Found 2 violations.  Run saar stats . for a full quality score.

Like ruff, but for your context file. Catches:

  • SA001 duplicate rules
  • SA002 orphaned section headers
  • SA003 vague rules under 6 words
  • SA004 generic filler (write clean code, follow best practices)
  • SA005 emojis that waste instruction budget
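
Two of these checks are easy to picture in code. The sketch below approximates SA001 (duplicate rules) and SA003 (rules under 6 words) for lines that start with "- ". It is not saar's linter: the function lint_rules, the choice to treat only "- " lines as rules, and the case-insensitive duplicate match are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def lint_rules(lines: List[str]) -> List[Tuple[int, str, str]]:
    """Return (line number, code, message) for duplicate and too-short rules."""
    violations: List[Tuple[int, str, str]] = []
    first_seen: Dict[str, int] = {}
    for lineno, raw in enumerate(lines, start=1):
        stripped = raw.strip()
        if not stripped.startswith("- "):
            continue  # only bullet lines count as rules in this sketch
        rule = stripped[2:].strip()
        key = " ".join(rule.lower().split())  # normalize case and spacing
        if key in first_seen:
            violations.append(
                (lineno, "SA001", f"Duplicate rule: already appears on line {first_seen[key]}")
            )
            continue
        first_seen[key] = lineno
        if len(rule.split()) < 6:
            violations.append((lineno, "SA003", "Vague rule: under 6 words"))
    return violations
```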

saar check (CI)

# .github/workflows/ci.yml
- run: saar check .

Exits 0 if AGENTS.md is fresh and complete. Exits 1 with a specific message if not. Never let a stale context file slip into production.


OCI — semantic search via MCP

saar generates your AGENTS.md. OpenCodeIntel (OCI) indexes your codebase for per-task context via MCP.

saar extract . --index

Once indexed, Claude Desktop and Claude Code get a new tool:

codeintel:get_context_for_task("add rate limiting to the settings endpoints")

Returns:
  - backend/routes/settings.py (94% relevance)
  - backend/middleware/auth.py (81% relevance)
  - Rule: use LimitCheckError, not a new exception
  - Rule: require_auth on all user endpoints

Instead of exploring 30k tokens of files, Claude gets only the files and rules that matter for the task: here, 2 files and 2 rules.

opencodeintel.com · MCP setup


What saar detects

Python — FastAPI / Flask / Django, auth middleware and decorators, logging library, exception class hierarchy, ORM patterns, naming conventions

TypeScript/JS — React / Next.js / Express, package manager (bun / pnpm / npm / yarn), TanStack Query / SWR patterns, component library, custom hooks, common imports

Both — critical files (most depended-on), circular dependencies, canonical examples per category, existing team rules (reads CLAUDE.md, .cursorrules, CONVENTIONS.md)
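
Two of the cross-language signals, critical files and circular dependencies, are graph problems over the import graph. The sketch below shows the general technique: count incoming edges to rank the most depended-on files, and run a depth-first search to find a cycle. It is an illustration, not saar's code; the graph format (file mapped to the files it imports) and both function names are assumptions.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def most_depended_on(graph: Dict[str, List[str]], top: int = 3) -> List[Tuple[str, int]]:
    """Rank files by how many other files import them."""
    counts = Counter(dep for deps in graph.values() for dep in set(deps))
    return counts.most_common(top)


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one import cycle as a path (first node repeated at the end), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:  # back edge: dep is already on the current path
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

The search is recursive, so a very deep import chain could hit Python's recursion limit. An iterative version with an explicit stack avoids that.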


Installation

# Recommended
pipx install saar

# Standard
pip install saar

# With AI enrichment
pip install "saar[enrich]"
export ANTHROPIC_API_KEY=sk-ant-...

Requires Python 3.10+. No account. No API key for core features. Runs entirely on your machine.


Contributing

saar is MIT licensed. Everything is public: commits, decisions, benchmarks.

git clone https://github.com/OpenCodeIntel/saar.git
cd saar
python -m venv venv && source venv/bin/activate
pip install -e ".[dev]"

pytest tests/ -v        # 548 tests
ruff check saar/ tests/ # lint

# verify saar on itself
saar extract . --no-interview
saar lint .
saar stats .

Good first issues are tagged with the "good first issue" label on the issue tracker.

If you're building a feature, open an issue first. Saves everyone time.


Why I built this

I'm Devanshu, MS Software Engineering at Northeastern, solo founder building this in the open.

I got tired of AI tools that sounded smart but didn't know my project. Every session: wrong package manager, wrong exception class, wrong import. The fix was obvious: give the AI a context file. The problem was nobody maintained those files, they went stale, and most were full of generic filler the AI already knew.

So I built saar. It generates the file, keeps it short, tells you when it's stale, and lints it for quality. Runs locally. Costs nothing. Works with every AI tool you already use.

The code is all here. The benchmarks are all here. Nothing hidden.



License

MIT. Free forever. Do whatever you want with it.


getsaar.com  ·  PyPI  ·  MIT License

If saar saved you time, a star on GitHub helps others find it.
