The universal project brain — scan any codebase, generate context files for every AI coding tool.
codebase-md
The universal project brain that works with every AI coding tool.
One command scans your codebase and generates context files for Claude Code, Cursor, Codex, Windsurf, and more — auto-detected conventions, dependency health, architecture maps, and smart context routing. Stays fresh via git hooks.
Why?
Every AI coding tool needs project context to work well. But each tool has its own format:
- Claude Code wants CLAUDE.md
- Cursor wants .cursorrules
- Codex wants codex.md
- Windsurf wants .windsurfrules
Writing and maintaining these manually is tedious. codebase-md scans your project once and generates all of them from a single source of truth.
Features
- Universal output — generates 6 formats from one scan (CLAUDE.md, .cursorrules, AGENTS.md, codex.md, .windsurfrules, PROJECT_CONTEXT.md)
- Auto-detected conventions — naming style, import patterns, file organization, design patterns (powered by tree-sitter AST)
- Dependency intelligence — health scores, version diffs, breaking change detection, migration plans with code impact
- Architecture mapping — detects monolith/monorepo/microservice/library/CLI patterns, entry points, modules
- Smart context routing — query-based context retrieval with TF-IDF relevance scoring
- Git integration — hooks for auto-regeneration on commit, contributor analysis, file hotspots
- Multi-language — Python, JavaScript, TypeScript (50+ file extensions recognized)
v0.1.0 — Current Status
Alpha release — this is the first public release of codebase-md. Core functionality is working and tested, but APIs and output formats may change between minor versions. Please pin your version (pip install codebase-md==0.1.0) and report issues.
What Works Well
- Single-command scan — codebase scan . analyzes your entire project in seconds
- 5 output formats — CLAUDE.md, AGENTS.md, .cursorrules, codex.md, .windsurfrules (+ generic PROJECT_CONTEXT.md)
- Language detection — Python, TypeScript, JavaScript, Go, Rust and 50+ file extensions
- Dependency parsing — requirements.txt, pyproject.toml, package.json, go.mod, Cargo.toml, Gemfile
- Convention inference — naming style, import patterns, file organization, design patterns (via tree-sitter AST)
- Architecture detection — monolith, monorepo, microservice, library, CLI tool
- Git insights — commit history, contributor analysis, file change hotspots
- Dependency health — live registry queries (PyPI, npm) with health scoring and breaking change detection
- Smart context routing — TF-IDF relevance scoring for query-based context retrieval
Known Limitations
- AST grammars — tree-sitter support is limited to Python, JavaScript, and TypeScript; Go and Rust are parsed via heuristics
- No incremental mode — every scan re-analyzes the full project (no watch/diff mode yet)
- Large monorepos — projects with >10,000 files may experience slower scan times
- Network dependency — DepShift registry queries (PyPI/npm health checks) require network access; use --offline to skip
- No Windows CI — tested on Linux and macOS; Windows should work but is not yet part of CI
Tested Against
The test suite (354 tests) validates against these project archetypes:
| Fixture | Type | Languages |
|---|---|---|
| Python CLI | CLI tool | Python |
| FastAPI App | Web API | Python |
| Next.js App | Full-stack | TypeScript, JavaScript |
| Go CLI | CLI tool | Go |
| Rust CLI | CLI tool | Rust |
| Mixed Language | Multi-lang | Python, JS, Go |
| Monorepo | Monorepo | Multiple |
| Empty Repo | Edge case | — |
Integration tests also run against real-world repositories (see test_real_repos.py).
Installation
From PyPI
pip install codebase-md
With AST support (recommended)
pip install "codebase-md[ast]"
From GitHub (latest dev)
pip install git+https://github.com/sauravanand542/codebase-md.git
For development
git clone https://github.com/sauravanand542/codebase-md.git
cd codebase-md
pip install -e ".[dev,ast]"
Quick Start
# Initialize config in your project
cd your-project/
codebase init
# Scan your codebase (builds internal project model)
codebase scan .
# Generate context files for all AI tools
codebase generate .
That's it. You now have CLAUDE.md, .cursorrules, AGENTS.md, codex.md, .windsurfrules, and PROJECT_CONTEXT.md in your project root.
Commands
codebase scan
Scans your project and builds a complete model: languages, architecture, dependencies, conventions, modules, git history.
codebase scan . # Scan current directory
codebase scan /path/to/project # Scan a specific project
codebase generate
Generates context files from the last scan.
codebase generate . # Generate all formats
codebase generate . --format claude # Generate only CLAUDE.md
codebase deps
Dependency health dashboard — checks versions against registries, computes health scores.
codebase deps . # Health dashboard (queries PyPI/npm)
codebase deps . --offline # Offline mode (no network)
codebase deps . --upgrade typer # Migration plan for a specific package
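To make the idea of a version diff concrete, here is a minimal sketch of classifying an upgrade by comparing version components. It is not DepShift's actual logic: the names parse_version and classify_upgrade are hypothetical, and the sketch assumes plain MAJOR.MINOR.PATCH version strings.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into integers; missing parts default to 0."""
    numbers = []
    for part in version.strip().lstrip("v").split(".")[:3]:
        match = re.match(r"\d+", part)  # keep only the leading digits of each part
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return (numbers[0], numbers[1], numbers[2])


def classify_upgrade(current: str, latest: str) -> str:
    """Return 'major', 'minor', 'patch', or 'none' for an upgrade (hypothetical helper)."""
    old, new = parse_version(current), parse_version(latest)
    if new <= old:
        return "none"
    if new[0] != old[0]:
        return "major"  # under semantic versioning, a major bump may be breaking
    if new[1] != old[1]:
        return "minor"
    return "patch"


print(classify_upgrade("1.4.2", "2.0.0"))
```

A real health check would also need to handle pre-releases and 0.x packages, where minor bumps are often breaking as well.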
codebase context
Query relevant project context with smart ranking.
codebase context "architecture" # Find architecture info
codebase context "dependencies" --max 3 # Top 3 relevant chunks
codebase context "how to test" --compact # Content-only output
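As an illustration of TF-IDF relevance scoring, the sketch below ranks a few named text chunks against a query. It is a simplified single-signal version, not the tool's ranker; the function names, the smoothed IDF formula, and the sample chunks are assumptions made for the example.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_chunks(query: str, chunks: Dict[str, str], max_results: int = 3) -> List[Tuple[str, float]]:
    """Score each chunk by summed TF-IDF of the query terms; best matches first."""
    tokenized = {name: tokenize(body) for name, body in chunks.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many chunks does each term appear?
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    scores = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        total = len(tokens) or 1
        score = 0.0
        for term in tokenize(query):
            if term in counts:
                tf = counts[term] / total
                idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0  # smoothed IDF
                score += tf * idf
        scores[name] = score

    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, score) for name, score in ranked if score > 0][:max_results]


sample = {
    "architecture": "The architecture is a monolith with a CLI entry point",
    "testing": "Run pytest to test the project",
    "dependencies": "Dependencies are listed in pyproject.toml",
}
print(rank_chunks("how to test", sample))
```

The max_results parameter plays the same role as --max in the commands above: it caps how many chunks are returned.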
codebase hooks
Install git hooks for automatic regeneration.
codebase hooks install . # Install post-commit hooks
codebase hooks status . # Show installed hooks
codebase hooks remove . # Remove hooks
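For readers curious what installing a post-commit hook involves, here is a rough sketch using only the standard library. The hook body and the function name install_post_commit_hook are hypothetical; the files the tool actually writes may differ. The sketch assumes a regular repository where .git is a directory.

```python
import stat
from pathlib import Path

# Hypothetical hook body: refresh the context files after every commit.
HOOK_SCRIPT = """#!/bin/sh
codebase scan . && codebase generate .
"""


def install_post_commit_hook(repo_root: Path) -> Path:
    """Write an executable .git/hooks/post-commit script and return its path."""
    git_dir = repo_root / ".git"
    if not git_dir.is_dir():
        raise FileNotFoundError(f"{repo_root} is not a git repository")
    hooks_dir = git_dir / "hooks"
    hooks_dir.mkdir(exist_ok=True)
    hook_path = hooks_dir / "post-commit"
    hook_path.write_text(HOOK_SCRIPT)
    # Git only runs hooks that are marked executable.
    mode = hook_path.stat().st_mode
    hook_path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook_path
```

Note that this sketch overwrites any existing post-commit hook, which a production installer would want to detect first.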
codebase init
Initialize .codebase/ configuration directory.
codebase init # Creates .codebase/config.yaml
Output Formats
| Format | File | AI Tool | Description |
|---|---|---|---|
| claude | CLAUDE.md | Claude Code | Structured markdown with project summary, architecture, conventions |
| cursor | .cursorrules | Cursor | Coding rules, language-specific guidance, tech stack |
| agents | AGENTS.md | Multi-agent | Compact entry points, commands, architecture flow |
| codex | codex.md | Codex CLI | Overview, setup, project structure, conventions |
| windsurf | .windsurfrules | Windsurf | Rules-based format with architecture and file map |
| generic | PROJECT_CONTEXT.md | Any tool | Complete markdown with all sections + metadata |
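The mapping from format name to output file can be pictured as a small registry. The sketch below is illustrative only: the format names and file names come from the table, while the model dictionary, render, and generate are hypothetical, and every format gets the same body here even though the real generators produce format-specific content.

```python
from pathlib import Path
from typing import Iterable, List, Optional

# Format name -> output file, taken from the table above.
OUTPUT_FILES = {
    "claude": "CLAUDE.md",
    "cursor": ".cursorrules",
    "agents": "AGENTS.md",
    "codex": "codex.md",
    "windsurf": ".windsurfrules",
    "generic": "PROJECT_CONTEXT.md",
}


def render(model: dict) -> str:
    """Render a tiny shared body from a hypothetical project model."""
    lines = [
        f"# {model['name']}",
        "",
        f"Architecture: {model['architecture']}",
        "Languages: " + ", ".join(model["languages"]),
    ]
    return "\n".join(lines) + "\n"


def generate(model: dict, root: Path, formats: Optional[Iterable[str]] = None) -> List[Path]:
    """Write one file per requested format (all formats by default)."""
    written = []
    for fmt in formats or OUTPUT_FILES:
        if fmt not in OUTPUT_FILES:
            raise ValueError(f"unknown format: {fmt}")
        path = root / OUTPUT_FILES[fmt]
        path.write_text(render(model))
        written.append(path)
    return written
```

Calling generate(model, Path("."), ["claude"]) would write only CLAUDE.md, which mirrors the --format claude option.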
What Gets Detected
Languages & Frameworks
50+ file extensions recognized. Framework detection for Python (Django, FastAPI, Flask), JavaScript/TypeScript (React, Next.js, Express, Vue).
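Extension-based language detection can be sketched in a few lines. The mapping below is a small, assumed subset chosen for illustration (the tool's own table covers 50+ extensions), and detect_languages is a hypothetical helper.

```python
from collections import Counter
from pathlib import Path
from typing import Dict, Iterable

# Small illustrative subset; not the tool's actual extension table.
EXTENSION_LANGUAGES = {
    ".py": "Python",
    ".js": "JavaScript",
    ".jsx": "JavaScript",
    ".ts": "TypeScript",
    ".tsx": "TypeScript",
    ".go": "Go",
    ".rs": "Rust",
}


def detect_languages(paths: Iterable[str]) -> Dict[str, float]:
    """Return each recognized language's share of files, most common first."""
    counts = Counter()
    for path in paths:
        language = EXTENSION_LANGUAGES.get(Path(path).suffix.lower())
        if language:
            counts[language] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {language: count / total for language, count in counts.most_common()}


print(detect_languages(["cli.py", "engine.py", "app.ts", "README.md"]))
```

Files with unrecognized extensions, such as README.md here, are ignored when computing the shares.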
Architecture Patterns
Monolith, monorepo, microservice, library, CLI tool — detected from folder structure, entry points, and package layout.
Conventions
- Naming: snake_case, camelCase, PascalCase, kebab-case
- Imports: absolute, relative, mixed
- File organization: modular, layer-based, feature-based, flat
- Design patterns: model, view, controller, service, repository, etc.
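The naming styles listed above can be told apart with regular expressions. This is a minimal sketch rather than the tree-sitter based inference the tool uses: the patterns and the helpers classify_name and dominant_style are assumptions, and single lowercase words such as scan are treated as ambiguous.

```python
import re
from collections import Counter
from typing import Iterable, Optional

NAMING_PATTERNS = {
    "snake_case": re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)+$"),
    "camelCase": re.compile(r"^[a-z][a-z0-9]*([A-Z][a-z0-9]*)+$"),
    "PascalCase": re.compile(r"^([A-Z][a-z0-9]+)+$"),
    "kebab-case": re.compile(r"^[a-z][a-z0-9]*(-[a-z0-9]+)+$"),
}


def classify_name(name: str) -> Optional[str]:
    """Return the naming style of one identifier, or None if it is ambiguous."""
    for style, pattern in NAMING_PATTERNS.items():
        if pattern.match(name):
            return style
    return None


def dominant_style(names: Iterable[str]) -> Optional[str]:
    """Return the most frequent style among the classifiable names."""
    counts = Counter(style for style in map(classify_name, names) if style)
    return counts.most_common(1)[0][0] if counts else None


print(dominant_style(["scan_project", "parse_deps", "ProjectModel"]))
```

In practice the identifiers would come from parsed source (function, class, and file names) rather than a hand-written list.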
Dependencies
Parses package.json, requirements.txt, pyproject.toml, go.mod, Cargo.toml, Gemfile. Health scoring via live registry queries (PyPI, npm).
Project Structure
src/codebase_md/
├── cli.py # Typer CLI — all commands
├── model/ # Pydantic v2 data models (frozen, validated)
├── scanner/ # Codebase analysis engine
│ ├── engine.py # Orchestrates all scanners
│ ├── language_detector.py
│ ├── structure_analyzer.py
│ ├── dependency_parser.py
│ ├── convention_inferrer.py # tree-sitter powered
│ ├── ast_analyzer.py # tree-sitter AST
│ └── git_analyzer.py
├── generators/ # Output format generators (plugin-style)
├── depshift/ # Dependency intelligence engine
│ ├── analyzer.py # Health scoring
│ ├── version_differ.py # Breaking change detection
│ ├── usage_mapper.py # Import → source location mapping
│ └── registries/ # PyPI + npm clients
├── context/ # Smart context routing
│ ├── chunker.py # 12 topic-based chunks
│ ├── ranker.py # 6-signal TF-IDF scoring
│ └── router.py # Query pipeline
├── persistence/ # .codebase/ state management
└── integrations/ # Git hooks, GitHub Actions
Configuration
After codebase init, edit .codebase/config.yaml:
version: 1
generators:
- claude
- cursor
- agents
- codex
- windsurf
- generic
scan:
exclude:
- node_modules
- .venv
- dist
- build
hooks:
post_commit: true
pre_push: false
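To show what the scan.exclude list means in practice, here is a sketch of a directory walk that prunes excluded folders. It uses only the standard library; iter_source_files is a hypothetical helper, and matching on directory names alone is an assumption about how the exclude entries are interpreted.

```python
import os
from pathlib import Path
from typing import FrozenSet, Iterator

# Mirrors the scan.exclude entries from the example config above.
DEFAULT_EXCLUDES = frozenset({"node_modules", ".venv", "dist", "build"})


def iter_source_files(root: Path, exclude: FrozenSet[str] = DEFAULT_EXCLUDES) -> Iterator[Path]:
    """Yield every file under root, skipping excluded directory names."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into them.
        dirnames[:] = sorted(d for d in dirnames if d not in exclude)
        for filename in sorted(filenames):
            yield Path(dirpath) / filename
```

Pruning during the walk, instead of filtering afterwards, avoids traversing large directories such as node_modules at all.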
Contributing
See CONTRIBUTING.md for development setup, coding conventions, and PR guidelines.
License