AI-powered documentation autopilot — commit code, docs update themselves. Local-first multi-agent pipeline analyzes diffs, finds related code via FAISS embeddings, and writes section-level doc updates. Works with any LLM, optional Notion sync.

These details have not been verified by PyPI

Project links

Project description

Codebase Cortex

Automatically keep your engineering documentation in sync with code.

Codebase Cortex is a local-first, multi-agent documentation engine that watches your codebase for changes and updates markdown documentation automatically. It uses LangGraph to orchestrate nine pipeline nodes that analyze code, route updates to specific sections, write docs, validate accuracy, generate indexes, create tasks, and produce sprint reports. Docs live as plain markdown files in your repo — no cloud dependency required. Optional sync to Notion is available via the DocBackend protocol.

graph LR
    A[Git Commit] --> B[CodeAnalyzer]
    B --> C[SemanticFinder]
    C --> D[SectionRouter]
    D --> E[DocWriter]
    E --> F[DocValidator]
    F --> G[TOCGenerator]
    G --> H[TaskCreator]
    H --> I[SprintReporter]
    I --> J[OutputRouter]
    J --> K[docs/]
    J --> L[Notion]

Features

Local-first documentation — Docs are plain markdown in your repo's docs/ directory. No cloud dependency required
Section-level updates — Only changed sections are rewritten, preserving human edits
Human-edit preservation — MetaIndex tracks section hashes and detects manual edits, which the pipeline protects
Semantic search — FAISS embeddings with TreeSitter-aware AST chunking find related code across your entire codebase
Incremental indexing — Only re-embeds changed files, with .cortexignore for custom exclusions
Draft banners — New pages are marked as drafts until reviewed; remove with cortex accept
Multi-backend output — DocBackend protocol with LocalMarkdownBackend (default) and NotionBackend
Output modes — apply writes directly, propose stages for review, dry-run previews only
Sprint reports — Weekly summaries generated from commit activity with run metrics
Task tracking — Automatically identifies undocumented areas and creates tasks
Cost tracking — Run metrics aggregate token usage, wall-clock time, and cost per pipeline run
CI/CD integration — cortex ci command outputs structured JSON for GitHub Actions and GitLab CI pipelines (PR impact analysis, post-merge doc updates)

Quick Start

Prerequisites

Python 3.11+
uv package manager
An LLM — cloud API key (Gemini, Anthropic, OpenRouter, OpenAI) or a local model (Ollama, vLLM, LM Studio)

Install

# Install from PyPI
pip install codebase-cortex

# Or with uv
uv tool install codebase-cortex

Both cortex and codebase-cortex commands are available after installation. If cortex conflicts with another package on your system, use codebase-cortex instead.

Install from source

git clone https://github.com/sarupurisailalith/codebase-cortex.git
cd codebase-cortex
uv sync
uv tool install .

Initialize in your project

cd /path/to/your-project

# Interactive setup — configures LLM, creates docs/ directory
cortex init

# Quick setup with defaults
cortex init --quick

# Run the pipeline
cortex run --once

The init wizard will:

Ask for your LLM provider and API key
Create a .cortex/ config directory and docs/ output directory
Optionally connect to Notion via OAuth
Optionally install a post-commit git hook

CLI Commands

Command	Description
`cortex init [--quick]`	Interactive setup wizard
`cortex run --once [--full] [--dry-run]`	Run the full pipeline once
`cortex status`	Show connection and config status
`cortex analyze`	One-shot diff analysis (no doc writes)
`cortex embed`	Rebuild the FAISS embedding index
`cortex config show`	Display current configuration
`cortex config set KEY VALUE`	Update a config value
`cortex diff`	Show proposed documentation changes
`cortex apply`	Apply proposed changes to `docs/`
`cortex discard`	Discard proposed changes
`cortex accept`	Remove draft banners after review
`cortex resolve`	Resolve merge conflicts in `docs/`
`cortex check`	Check documentation freshness
`cortex sync --target notion`	Sync local docs to Notion (OAuth flow)
`cortex ci [--on-pr] [--on-merge]`	CI/CD mode (JSON output for pipelines)
`cortex map`	Generate knowledge map from FAISS clusters

How It Works

Cortex creates a .cortex/ directory (gitignored) in your project repo that stores configuration, OAuth tokens, and the FAISS vector index. Documentation is written as markdown files to docs/. When you run the pipeline, nine nodes work in sequence:

graph TD
    START([Start]) --> CA[CodeAnalyzer]
    CA -->|Has analysis?| SF[SemanticFinder]
    CA -->|No changes| END1([End])
    SF --> SR[SectionRouter]
    SR --> DW[DocWriter]
    DW --> DV[DocValidator]
    DV --> TG[TOCGenerator]
    TG --> TC[TaskCreator]
    TC -->|Has updates?| SPR[SprintReporter]
    TC -->|Nothing to report| END2([End])
    SPR --> OR[OutputRouter]
    OR -->|apply| DOCS[docs/]
    OR -->|propose| STAGED[.cortex/proposed/]
    OR -->|dry-run| END3([End])

    style CA fill:#4A90D9,color:#fff
    style SF fill:#7B68EE,color:#fff
    style SR fill:#9B59B6,color:#fff
    style DW fill:#50C878,color:#fff
    style DV fill:#2ECC71,color:#fff
    style TG fill:#1ABC9C,color:#fff
    style TC fill:#FFB347,color:#fff
    style SPR fill:#FF6B6B,color:#fff
    style OR fill:#E74C3C,color:#fff

CodeAnalyzer — Parses git diffs (or scans the full codebase) and produces a structured analysis of what changed
SemanticFinder — Incrementally embeds changed files using TreeSitter AST chunking and searches the FAISS index for semantically related code
SectionRouter — Reads INDEX.md and .cortex-meta.json to triage which sections in which pages need updating, respecting human-edited sections
DocWriter — Reads only targeted sections by line range, generates updated content, and writes via the DocBackend protocol
DocValidator — Compares generated docs against actual code for factual accuracy, flagging low-confidence sections for human review
TOCGenerator — Regenerates TOC markers in updated files, refreshes INDEX.md and .cortex-meta.json, records run metrics
TaskCreator — Identifies documentation gaps and creates task entries
SprintReporter — Synthesizes all activity into a weekly sprint summary with run metrics
OutputRouter — Applies the configured output mode (apply, propose, or dry-run)

Per-Repo Configuration

your-project/
├── .cortex/                    # Created by cortex init (gitignored)
│   ├── .env                    # LLM model, API keys, doc settings
│   ├── .gitignore              # Ignores everything in .cortex/
│   ├── .cortexignore           # User-defined FAISS indexing exclusions
│   ├── notion_tokens.json      # OAuth tokens (if Notion connected)
│   ├── page_cache.json         # Tracked Notion pages (if connected)
│   ├── proposed/               # Staged changes (propose mode)
│   └── faiss_index/            # Vector embeddings
│       ├── index.faiss
│       ├── chunks.json
│       ├── id_map.json
│       └── file_hashes.json
├── docs/                       # Generated documentation (local backend)
│   ├── INDEX.md                # Auto-generated index with heading tree
│   ├── .cortex-meta.json       # Section hashes, human-edit tracking
│   └── *.md                    # Documentation pages
└── src/

Supported LLM Providers

Cortex uses LiteLLM as a unified LLM interface. Any LiteLLM-compatible model works — cloud APIs, local models, or self-hosted endpoints. LiteLLM supports 100+ providers.

Provider	Example Model	Config
Google Gemini	`gemini/gemini-2.5-flash-lite`	`GOOGLE_API_KEY`
Anthropic	`anthropic/claude-sonnet-4-20250514`	`ANTHROPIC_API_KEY`
OpenRouter	`openrouter/google/gemini-2.5-flash-lite`	`OPENROUTER_API_KEY`
Ollama (local)	`ollama/llama3`	No key needed (runs locally)
vLLM (local)	`hosted_vllm/model-name`	`LLM_API_BASE=http://localhost:8000`
LM Studio	`lm_studio/model-name`	`LLM_API_BASE=http://localhost:1234/v1`
OpenAI	`gpt-4o`	`OPENAI_API_KEY`
Azure OpenAI	`azure/gpt-4o`	`AZURE_API_KEY` + `AZURE_API_BASE`

For local models, set the model name and optionally LLM_API_BASE in .cortex/.env:

LLM_MODEL=ollama/llama3
# or
LLM_MODEL=hosted_vllm/my-model
LLM_API_BASE=http://localhost:8000

Documentation

Document	Description
Architecture	System design, data flow, pipeline nodes
CLI Reference	All commands, options, and examples
Agents	How each pipeline node works
Configuration	Setup, LLM providers, environment variables
Notion Integration	OAuth flow, sync protocol, page management
Embeddings & Search	FAISS index, TreeSitter chunking, semantic search
CI/CD Integration	GitHub Actions, GitLab CI, branch strategies
Contributing	Development setup, testing, project structure

Changelog

0.2.0

Redesign: Local-first multi-backend documentation engine — docs live as markdown in docs/, no cloud dependency required
Pipeline: 9-node LangGraph pipeline with conditional routing (added SectionRouter, DocValidator, TOCGenerator, OutputRouter)
LLM: LiteLLM unified interface replaces all langchain LLM providers
Backends: DocBackend protocol with LocalMarkdownBackend (default) and NotionBackend
Embeddings: TreeSitter AST-aware code chunking with regex fallback; incremental FAISS index rebuild (only re-embeds changed files)
MetaIndex: .cortex-meta.json tracks section hashes and detects human edits for preservation
Exclusions: .cortexignore for user-defined FAISS indexing exclusions
CLI: 11 new commands (config, diff, apply, discard, accept, resolve, check, sync, map, and more)
Output modes: apply (default), propose (staged for review), dry-run (preview only)
Draft banners: New pages marked as drafts until reviewed with cortex accept
Metrics: Run metrics aggregation (token usage, cost, timing) via LangGraph state reducer
Branches: Branch strategy enforcement (main-only or branch-aware)
OAuth: Service connection pattern for Notion OAuth integration

0.1.4

Fix: Resolved duplicate child pages caused by emoji title mismatch between Notion and local cache
Fix: DocWriter now uses normalized title matching for section-level merges (prevents creating duplicates when LLM returns titles with/without emoji)
Fix: Parent page creation now warns user to verify page location in Notion workspace

0.1.3

Fix: API key input is now masked during cortex init
Fix: Sprint Log uses replace_content instead of appending on every run

0.1.2

Fix: Dynamic parent page title (uses repo directory name instead of hardcoded "Codebase Cortex")
Fix: Child page bootstrap only checks local cache, no longer adopts unrelated workspace pages

0.1.1

Initial public release

License

MIT

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.3.0

Mar 16, 2026

This version

0.2.0

Mar 11, 2026

0.1.4

Mar 9, 2026

0.1.3

Mar 7, 2026

0.1.2

Mar 6, 2026

0.1.1

Mar 6, 2026

0.1.0

Mar 6, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

codebase_cortex-0.2.0.tar.gz (119.9 kB view details)

Uploaded Mar 11, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

codebase_cortex-0.2.0-py3-none-any.whl (92.8 kB view details)

Uploaded Mar 11, 2026 Python 3

File details

Details for the file codebase_cortex-0.2.0.tar.gz.

File metadata

Download URL: codebase_cortex-0.2.0.tar.gz
Upload date: Mar 11, 2026
Size: 119.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.5

File hashes

Hashes for codebase_cortex-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`046ed5971c49a9854f08ffc78ec4eca8aed268dd5dfd2e6832c6b5c55891a61c`
MD5	`2dc645a325d4de149e680dcec7f3af94`
BLAKE2b-256	`80f7161b223ff742228f0e2e8b8edad6c036ea95c134f0e036bb71bbb3cebeb8`

See more details on using hashes here.

File details

Details for the file codebase_cortex-0.2.0-py3-none-any.whl.

File metadata

Download URL: codebase_cortex-0.2.0-py3-none-any.whl
Upload date: Mar 11, 2026
Size: 92.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.5

File hashes

Hashes for codebase_cortex-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fc9ca64f377dc7753c7d32bc8c733ec0b9730ad0b5f604ff372c28cd1c29eeff`
MD5	`ecc91ee1be0483abc04dfe5466f577bf`
BLAKE2b-256	`5deae4770c32168cab13f280d69eed52bd22293fe8c1fe83e4a342c479a92d22`

See more details on using hashes here.

codebase-cortex 0.2.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Codebase Cortex

Features

Quick Start

Prerequisites

Install

Initialize in your project

CLI Commands

How It Works

Per-Repo Configuration

Supported LLM Providers

Documentation

Changelog

0.2.0

0.1.4

0.1.3

0.1.2

0.1.1

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes