Skip to main content

Your AI agent writes code — but does it know WHY your code exists? Decision provenance, quality contracts, and AI anti-pattern detection for coding agents.

Project description

codebase-intel

Your AI agent writes code. But does it know why your code exists?

Stars MIT Python 3.11+ MCP 19 Languages


AI coding agents can autocomplete, refactor, and generate code. But they still fail at the things that matter most in production:

  • They don't know your team decided to use token bucket over sliding window — and why
  • They don't know the compliance team requires rate limit headers on every response
  • They don't know that changing config.py will break billing and analytics
  • They generate code that looks right but violates your project's architectural patterns

codebase-intel fixes this. It's the context layer that sits between your codebase and any AI agent — providing not just what code exists, but why it exists, what rules it must follow, and what breaks if you change it.


Before vs After

WITHOUT codebase-intel                    WITH codebase-intel
─────────────────────                     ───────────────────

Agent reads: every file in the dir        Agent reads: only what matters
Tokens used: 16,063                       Tokens used: 5,955
Knows why code exists: No                 Knows why: Yes (13 decisions)
Quality guardrails: None                  Guardrails: 4 contracts enforced
Drift awareness: None                     Drift: stale context detected
Impact analysis: None                     Impact: knows what else breaks

Result: faster but fragile                Result: faster AND correct

Real benchmarks on production codebases

Project Files Naive Tokens codebase-intel Reduction Decisions Contracts
FastAPI monolith (auth + frontend) 359 16,063 5,955 63% 13 4
Microservice A (AI processing) 358 14,611 5,955 59% 0 7
Microservice B (document generation) 87 2,461 1,275 48% 0 6
Microservice C (user management) 153 5,904 1,476 75% 0 4

Numbers from codebase-intel benchmark on real production repos. The token reduction comes from targeted graph traversal. The decisions and contracts are what no other tool provides.


What makes this different

There are great tools for code graphs (code-review-graph is excellent — 6K+ stars). We don't compete with them. We solve what they don't:

Capability Graph-only tools codebase-intel
Code graph + dependencies Yes Yes (19 languages)
Decision Journalwhy code is the way it is No Yes
Quality Contracts — rules AI must follow No Yes
AI Anti-pattern Detection — catches hallucinated imports, over-abstraction No Yes
Drift Detection — stale context, context rot alerts No Yes
Token Budgeting — fits context to any agent's window No Yes
Live Analytics — prove efficiency over time No Yes

The missing layer

What exists today:          What codebase-intel adds:

Code → Graph → Agent        Code → Graph ──────────────────→ Agent
                                     ↓                         ↑
                              Decision Journal ──→ WHY ────────┤
                              Quality Contracts → RULES ───────┤
                              Drift Detector ──→ WARNINGS ─────┘
                              
                              "Here are the 3 files that matter,
                               the decision your team made 6 months ago,
                               and the 2 rules you must not violate."

Quick Start

pip install codebase-intel

# Initialize on your project
cd your-project
codebase-intel init

# See what it found
codebase-intel status

# Mine decisions from git history
codebase-intel mine --save

# Run benchmarks (see the before/after)
codebase-intel benchmark

# View live dashboard
codebase-intel dashboard

Connect to Claude Code

{
  "mcpServers": {
    "codebase-intel": {
      "command": "codebase-intel",
      "args": ["serve", "/path/to/your/project"]
    }
  }
}

Now your agent automatically gets relevant context, decisions, and contracts before writing code.


The Three Pillars

1. Decision Journal — "Why does this code exist?"

Every team makes hundreds of decisions that never get documented. Why did you choose Postgres over Mongo? Why is auth middleware structured that way? Why was the sliding window approach rejected?

codebase-intel captures these from git history automatically and links them to code:

# .codebase-intel/decisions/DEC-042.yaml
id: DEC-042
title: "Use token bucket for rate limiting"
status: active
context: "Payment endpoint was getting hammered during flash sales"
decision: "Token bucket algorithm with per-user buckets, 100 req/min"
alternatives:
  - name: sliding_window
    rejection_reason: "Memory overhead too high at scale"
constraints:
  - description: "Must not add >2ms p99 latency"
    source: sla
    is_hard: true
code_anchors:
  - "src/middleware/rate_limiter.py:15-82"

Without this: your agent proposes sliding window (the exact approach you rejected 6 months ago). With this: your agent sees the decision, follows it, and respects the SLA constraint.

2. Quality Contracts — "What rules must AI follow?"

Linters check syntax. Contracts enforce your project's patterns:

# .codebase-intel/contracts/api-rules.yaml
rules:
  - id: no-raw-sql
    name: No raw SQL in API layer
    severity: error
    pattern: "execute\\(.*SELECT|INSERT|UPDATE"
    fix_suggestion: "Use the repository pattern"

  - id: async-everywhere
    name: All I/O must be async
    severity: error
    pattern: "requests\\.(get|post)"
    fix_suggestion: "Use httpx.AsyncClient"

Built-in AI guardrails catch the patterns AI agents mess up most:

  • Hallucinated imports (modules that don't exist)
  • Over-abstraction (base classes with one subclass)
  • Unnecessary error handling for impossible conditions
  • Comments that restate code instead of explaining why
  • Features that weren't requested (YAGNI violations)

3. Drift Detection — "Is our context still valid?"

Context rots. Decisions get outdated. Code anchors point to deleted files. codebase-intel detects this:

$ codebase-intel drift

╭──────────────── Drift Report ────────────────╮
│ Overall: MEDIUM                               │
│ 3 items need attention                        │
╰───────────────────────────────────────────────╯

- [MEDIUM] Decision DEC-012 anchored to deleted file
- [MEDIUM] Decision DEC-008 is past its review date
- [LOW] 2 files changed since last graph index

19 Languages

Full tree-sitter parsing via tree-sitter-language-pack:

Category Languages
Web JavaScript, TypeScript, TSX
Backend Python, Java, Go, Ruby, PHP, Elixir
Systems Rust, C, C++
Mobile Swift, Kotlin, Dart
Other C#, Scala, Lua, Haskell

CLI Commands

codebase-intel init [path]              # Initialize — build graph, create configs
codebase-intel analyze [--incremental]  # Rebuild or update the code graph
codebase-intel mine [--save]            # Mine git history for decision candidates
codebase-intel drift                    # Run drift detection
codebase-intel benchmark                # Measure token efficiency (before/after)
codebase-intel dashboard                # Live efficiency tracking over time
codebase-intel serve                    # Start MCP server
codebase-intel status                   # Component health check

MCP Tools (7 tools)

Tool What it does
get_context The main tool. Assembles relevant files + decisions + contracts within a token budget.
query_graph Query dependencies, dependents, or run impact analysis.
get_decisions Get architectural decisions relevant to specific files.
get_contracts Get quality contracts for files you're editing.
check_drift Verify context freshness before trusting old decisions.
impact_analysis "What breaks if I change this file?"
get_status Health check — graph stats, decision count, contract count.

Community Contract Packs

Pre-built quality rules for popular frameworks:

Pack Rules Covers
fastapi.yaml 10 Layered architecture, Pydantic schemas, async, Depends(), secrets
react-typescript.yaml 11 Functional components, no any, custom hooks, lazy loading
nodejs-express.yaml 12 Error handling, helmet, rate limiting, structured logging
cp community-contracts/fastapi.yaml .codebase-intel/contracts/

Architecture

AI Agent (any) ──→ MCP Server (7 tools)
                        │
                  Context Orchestrator
                  (token budgeting, priority, conflicts)
                   ╱          │          ╲
          Code Graph    Decision     Quality
          (19 langs)    Journal      Contracts
          SQLite+WAL    YAML files   YAML+builtins
                   ╲          │          ╱
                    Drift Detector
                    (staleness, rot, orphans)
                          │
                    Analytics Tracker
                    (live efficiency metrics)

The Philosophy

We don't make AI agents smarter. We make them informed.

An agent with 1M tokens of context is like a developer with access to every file in the company — overwhelming and unfocused. An agent with codebase-intel is like a developer who just had a 5-minute conversation with the senior engineer: "Here's what you need to know, here's why we did it this way, and here are the three things you absolutely cannot break."

We don't compete with graph tools. We complete them.

Code graphs answer "what depends on what." That's necessary but not sufficient. codebase-intel answers the harder questions: Why was this decision made? What constraints apply? What will the compliance team flag? What did we already try and reject?

We don't hide the truth. We prove it.

Run codebase-intel benchmark on your project. See the numbers. Run codebase-intel dashboard over time. Watch the improvement. Every claim is backed by reproducible, project-specific data.


Contributing

Areas with the most impact:

  1. Contract packs — Share quality rules for your framework
  2. Language extraction — Improve parsing for specific languages
  3. Decision mining — Better git history analysis
  4. Benchmarks — Run against your repos, share results
git clone https://github.com/MutharasuArchunan13/codebase-intel.git
cd codebase-intel
pip install -e ".[dev]"
pytest

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

codebase_intel-0.2.0.tar.gz (342.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

codebase_intel-0.2.0-py3-none-any.whl (150.2 kB view details)

Uploaded Python 3

File details

Details for the file codebase_intel-0.2.0.tar.gz.

File metadata

  • Download URL: codebase_intel-0.2.0.tar.gz
  • Upload date:
  • Size: 342.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.3 {"installer":{"name":"uv","version":"0.11.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for codebase_intel-0.2.0.tar.gz
Algorithm Hash digest
SHA256 3415486f2c8b1098cf600fa2f9bd4bc6dcfd3c3d46de7bd1d026ea60c5ab3a48
MD5 2d5feba8284337d4aafdc1a822271af9
BLAKE2b-256 d4c014674a24afc827d94d8866ed253c906b241df07f67b73ba1eae154c956ab

See more details on using hashes here.

File details

Details for the file codebase_intel-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: codebase_intel-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 150.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.3 {"installer":{"name":"uv","version":"0.11.3","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for codebase_intel-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 3d17b64887424e8444b1770770651b17a612a01c66036b87645d25e9015e3d5f
MD5 89570c7398e11702809257585ee23c05
BLAKE2b-256 486cf62d466a59ea13ac830d539bb08e24a04880489de4eae2cc22272a6f8fff

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page