AI Agent for Codebase Documentation

🦅 CodiLay

The Living Reference for Your Codebase: an AI agent that traces the "wires" of your project to build, update, and chat with your documentation.

License: MIT | Python: 3.11+ | PRs: Welcome


CodiLay is not just a static documentation generator; it is an agentic documentation researcher. It reads your code, maps how modules connect using the Wire Model, and maintains a persistent, searchable knowledge base that you can browse in a Web UI or query through an interactive chat.


🚀 Experience CodiLay

1. Installation

Install from PyPI (Recommended)

# Basic installation
pip install codilay

# Install with all features (Web UI + Watch mode)
pip install "codilay[all]"

# For a global CLI installation (recommended)
pipx install codilay

Install from Source

# Clone the repository
git clone https://github.com/HarmanPreet-Singh-XYT/codilay.git
cd codilay

# Install with Web UI support
pip install -e ".[serve]"

# Install with Watch mode support
pip install -e ".[watch]"

# Install everything (Web UI + Watch mode)
pip install -e ".[all]"

2. First-Time Setup

Forget about exporting API keys every time. Run the setup wizard to securely store your keys.

codilay setup
codilay

Running codilay with no arguments opens the Interactive Control Center, allowing you to manage projects, configure providers, and launch scans without memorizing flags.


🛠 Features

🧠 The Wire Model

CodiLay treats every import, function call, and variable reference as a Wire.

  • Open Wires: Unresolved references that the agent is "hunting" for.
  • Closed Wires: Successfully traced connections that form segments of the dependency graph.

⚡️ Smart Triage

Before burning tokens, CodiLay performs a high-speed Triage Phase. It classifies files into:

  • Core: Full architectural analysis and documentation.
  • Skim: Metadata and signatures only (saves tokens on simple utilities).
  • Skip: Ignores boilerplate, generated code, and platform-specific noise.
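The three buckets can be pictured with a simple rule-based stand-in. CodiLay's triage is AI-powered, so the patterns and the line-count threshold below are assumptions used only to illustrate the Core/Skim/Skip split.

```python
from fnmatch import fnmatch

# Assumed example patterns and threshold; not CodiLay's actual rules.
SKIP_PATTERNS = ["*.lock", "*.min.js", "dist/*", "*_pb2.py"]
SKIM_MAX_LINES = 30


def triage(path: str, line_count: int) -> str:
    """Classify a file as 'skip', 'skim', or 'core' using cheap heuristics."""
    if any(fnmatch(path, pattern) for pattern in SKIP_PATTERNS):
        return "skip"   # boilerplate or generated code
    if line_count <= SKIM_MAX_LINES:
        return "skim"   # small utility: signatures only
    return "core"       # full analysis
```

A cheap pass like this runs before any model call, which is where the token savings come from.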

🔄 Git-Aware Incremental Updates

CodiLay is repo-aware. If you've only changed 2 files in a 500-file project, codilay . will:

  1. Detect the delta via Git.
  2. Invalidate only the affected documentation sections.
  3. Re-open wires related to the changed code.
  4. Re-calculate the local impact to keep your CODEBASE.md current.
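The four steps above can be sketched as follows. The function names, the section-to-files mapping, and the tuple shape for wires are assumptions made for this example; only `git diff --name-only` is a real Git command.

```python
import subprocess


def changed_files(repo: str, since: str) -> list[str]:
    """Paths that differ between commit `since` and the working tree."""
    result = subprocess.run(
        ["git", "-C", repo, "diff", "--name-only", since],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def plan_update(changed, sections, wires):
    """Return (stale section names, wires to re-open) for a set of changed files.

    sections: dict mapping section name -> list of files it documents (assumed shape)
    wires:    list of (source_file, target_file) tuples (assumed shape)
    """
    changed_set = set(changed)
    stale = sorted(name for name, files in sections.items() if changed_set & set(files))
    reopen = [w for w in wires if w[0] in changed_set or w[1] in changed_set]
    return stale, reopen


sections = {
    "Auth": ["auth/handler.py"],
    "CLI": ["cli.py"],
    "API": ["api/routes.py", "auth/handler.py"],
}
wires = [("cli.py", "config.py"), ("api/routes.py", "auth/handler.py")]
stale, reopen = plan_update(["auth/handler.py"], sections, wires)
```

Only the sections and wires that touch a changed file are revisited; everything else in the 500-file project is left alone.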

💬 Interactive Chat & Memory

Ask questions about your codebase with the command codilay chat . (the trailing dot is the project path).

  • RAG + Deep Search: It uses your documentation for fast answers but can "escalate" to reading source code for implementation details.
  • Memory: The agent remembers your preferences and facts about the codebase across sessions.
  • Promote to Doc: Found a great explanation in chat? Use /promote to turn the AI's answer into a permanent section of your documentation.

🌐 Web Documentation Browser

The Web UI isn't just a reader; it's an interactive intelligence layer.

  • Layer 1: The Reader: High-fidelity rendering of your sections and graph.
  • Layer 2: The Chatbot: Quick Q&A from documented context.
  • Layer 3: The Deep Agent: Reaches into source code to verify facts.

codilay serve .

👁 Watch Mode

Run CodiLay in the background and automatically update documentation when files change. Uses filesystem events (via watchdog) with configurable debouncing to avoid redundant re-runs.

# Watch the current directory, auto-update on save
codilay watch .

# Custom debounce delay (5 seconds)
codilay watch . --debounce 5

# Verbose output for debugging
codilay watch . -v
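Debouncing is what turns a burst of save events into a single re-run. The sketch below shows the idea with explicit timestamps so it can be followed without threads; the `Debouncer` class is hypothetical and is not the watcher CodiLay ships.

```python
class Debouncer:
    """Collect changed paths and release them once the file system goes quiet."""

    def __init__(self, delay: float):
        self.delay = delay                      # quiet period in seconds
        self.pending: set[str] = set()
        self.last_event: float | None = None

    def notify(self, path: str, now: float) -> None:
        """Record a change; every new event restarts the quiet period."""
        self.pending.add(path)
        self.last_event = now

    def poll(self, now: float) -> list[str]:
        """Return the batch if `delay` seconds have passed since the last event."""
        if self.last_event is None or now - self.last_event < self.delay:
            return []
        batch = sorted(self.pending)
        self.pending.clear()
        self.last_event = None
        return batch


debouncer = Debouncer(delay=5.0)
debouncer.notify("a.py", now=0.0)
debouncer.notify("b.py", now=2.0)
debouncer.notify("a.py", now=3.0)
early = debouncer.poll(now=6.0)   # only 3 seconds of quiet so far
ready = debouncer.poll(now=8.0)   # 5 seconds of quiet: batch is released
```

Three events produce one batch of two files, so the documentation update runs once instead of three times.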

🧩 IDE Integration (VSCode Extension)

A VSCode extension that surfaces documentation inline alongside the file you're editing. Features include:

  • Sidebar tree view of all documented sections
  • Webview panel showing full documentation for the active file
  • Inline decorations highlighting documented symbols
  • Quick commands for asking questions, viewing the graph, and searching conversations

Install from the vscode-extension/ directory; see the extension README for details.

🤖 AI Context Export

Export your documentation in a compact, token-efficient format designed for feeding into another LLM's context window. Supports markdown, XML, and JSON formats with optional token budgets.

# Export as compact markdown (default)
codilay export .

# Export as XML with a 4000-token budget
codilay export . --format xml --max-tokens 4000

# Export as JSON, exclude the dependency graph
codilay export . -f json --no-graph -o context.json

📊 Documentation Diff

See a section-by-section changelog of what shifted in your documentation between runs. Unlike codilay diff (which shows git-level file changes), diff-doc compares the actual documentation content.

# Show what changed in the docs since the last run
codilay diff-doc .

# Output as JSON for programmatic use
codilay diff-doc . --json-output

Snapshots are saved automatically after every codilay run, so diffs are always available.
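A section-level diff between two snapshots can be sketched with plain dictionary comparisons. The snapshot shape (section title mapped to section text) is an assumption for this example, not the format CodiLay stores.

```python
def diff_sections(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots keyed by section title."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


previous = {"Auth": "Uses sessions.", "CLI": "Click-based.", "Legacy": "Old importer."}
current = {"Auth": "Uses JWT tokens.", "CLI": "Click-based.", "Scheduler": "Cron runs."}
changes = diff_sections(previous, current)
```

Here `changes` reports Scheduler as added, Legacy as removed, and Auth as modified, while the unchanged CLI section is omitted.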

🎯 Triage Tuning

Flag incorrect triage decisions to improve future runs. Corrections are stored per-project and automatically applied during the triage phase of subsequent runs.

# Flag a file that was skimmed but should be core
codilay triage-feedback add . src/auth/handler.py skim core -r "Contains critical auth logic"

# Flag a pattern (glob-based)
codilay triage-feedback add . "tests/**" core skip --pattern -r "Tests should be skipped"

# List all stored feedback
codilay triage-feedback list .

# Set a hint for a project type
codilay triage-feedback hint . react "Treat all hooks/ files as core"

# Remove feedback for a specific file
codilay triage-feedback remove . src/auth/handler.py

# Clear all feedback
codilay triage-feedback clear . --yes
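One way stored corrections could override a predicted category is sketched below. The record fields (`target`, `to`, `pattern`) and the rule that exact paths win over glob patterns are assumptions for illustration; the actual storage format and precedence are not described here.

```python
from fnmatch import fnmatch


def apply_feedback(path: str, predicted: str, corrections: list[dict]) -> str:
    """Return the corrected category for `path`, or `predicted` if nothing matches."""
    # Assumed precedence: exact-path corrections first, then glob patterns.
    for c in corrections:
        if not c.get("pattern") and c["target"] == path:
            return c["to"]
    for c in corrections:
        if c.get("pattern") and fnmatch(path, c["target"]):
            return c["to"]
    return predicted


corrections = [
    {"target": "src/auth/handler.py", "to": "core"},
    {"target": "tests/**", "to": "skip", "pattern": True},
]
```

With these two records, `src/auth/handler.py` is promoted from skim to core and anything under `tests/` is skipped, mirroring the two `add` commands above.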

🔍 Graph Filters

Filter the dependency graph by wire type, file layer, module, or connection count. Essential for reducing noise on large repositories.

# Show only import-type wires
codilay graph . --wire-type import

# Filter to a specific directory layer
codilay graph . --layer src/api

# Show only nodes with 3+ connections, outgoing edges only
codilay graph . --min-connections 3 --direction outgoing

# Combine filters, exclude tests
codilay graph . -w import -l src/core -x "tests/**"

# List available filter values for a project
codilay graph . --list-filters

# Output as JSON
codilay graph . --json-output
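The filters compose naturally as predicates over edges. The sketch below assumes edges are `(source, target, wire_type)` tuples and interprets the connection threshold as "keep edges that touch a node with at least that many connections"; both are assumptions, and `filter_edges` is a hypothetical helper.

```python
from collections import Counter
from fnmatch import fnmatch


def filter_edges(edges, wire_type=None, layer=None, exclude=None, min_connections=0):
    """Filter (source, target, wire_type) tuples by type, directory layer, and degree."""
    kept = [
        e for e in edges
        if (wire_type is None or e[2] == wire_type)
        and (layer is None or e[0].startswith(layer) or e[1].startswith(layer))
        and not (exclude and (fnmatch(e[0], exclude) or fnmatch(e[1], exclude)))
    ]
    # Count connections per node on the already-filtered edges.
    degree = Counter()
    for source, target, _ in kept:
        degree[source] += 1
        degree[target] += 1
    return [
        e for e in kept
        if degree[e[0]] >= min_connections or degree[e[1]] >= min_connections
    ]


edges = [
    ("src/api/routes.py", "src/core/auth.py", "import"),
    ("src/api/routes.py", "src/core/db.py", "import"),
    ("src/api/routes.py", "src/core/auth.py", "call"),
    ("tests/test_api.py", "src/api/routes.py", "import"),
]
imports_no_tests = filter_edges(edges, wire_type="import", exclude="tests/**")
```

Applying the type and exclusion filters first keeps the degree count honest: a node is only "well connected" with respect to the edges still in view.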

🧠 Team Memory

A shared knowledge base for teams working on the same project. Record facts, architectural decisions, coding conventions, and file annotations, all stored per-project and surfaced to the AI during documentation and chat.

# Add a team member
codilay team add-user . alice --display-name "Alice Chen"

# Record a fact
codilay team add-fact . "We use Celery for async tasks" -c architecture -a alice -t backend -t infra

# Vote on a fact
codilay team vote . <fact-id> up

# Record an architectural decision
codilay team add-decision . "Use PostgreSQL over MySQL" "Better JSON support, needed for our schema" -a alice -f src/db/

# Add a coding convention
codilay team add-convention . "Error Handling" "All API endpoints must return structured error responses" -e '{"error": "message", "code": 400}' -a alice

# Annotate a specific file
codilay team annotate . src/api/routes.py "This file is getting too large, plan to split by domain" -a alice -l 1-50

# List everything
codilay team facts .                   # All facts
codilay team facts . -c architecture   # Facts by category
codilay team decisions .               # All decisions
codilay team decisions . -s active     # Active decisions only
codilay team conventions .             # All conventions
codilay team annotations .             # All annotations
codilay team annotations . -f src/api/routes.py  # Per-file
codilay team users .                   # All members

🔎 Conversation Search

Full-text search across all past chat conversations, not just the current session. Uses an inverted index with TF-IDF scoring for fast, relevant results.

# Search all conversations
codilay search . "authentication flow"

# Top 5 results, assistant messages only
codilay search . "error handling" --top 5 --role assistant

# Search within a specific conversation
codilay search . "database migration" -c <conversation-id>

# Rebuild the index (after manual edits to chat files)
codilay search . "query" --rebuild
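For readers unfamiliar with the technique, here is a small inverted index with TF-IDF scoring. It is a generic sketch, not CodiLay's search module: the `SearchIndex` class, the tokenizer, and the smoothed IDF formula `log(1 + N / df)` are choices made for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


class SearchIndex:
    def __init__(self, messages: dict[str, str]):
        self.doc_count = len(messages)
        # Inverted index: term -> {message id -> term frequency}
        self.postings: dict[str, dict[str, int]] = defaultdict(dict)
        for msg_id, text in messages.items():
            for term, count in Counter(tokenize(text)).items():
                self.postings[term][msg_id] = count

    def search(self, query: str, top: int = 5) -> list[str]:
        """Rank message ids by the summed TF-IDF of the query terms."""
        scores: Counter = Counter()
        for term in tokenize(query):
            docs = self.postings.get(term, {})
            if not docs:
                continue
            idf = math.log(1 + self.doc_count / len(docs))  # rarer terms weigh more
            for msg_id, tf in docs.items():
                scores[msg_id] += tf * idf
        return [msg_id for msg_id, _ in scores.most_common(top)]


index = SearchIndex({
    "m1": "The authentication flow starts in the login handler",
    "m2": "Database migration scripts live in db/migrations",
    "m3": "Authentication errors: authentication tokens expire after an hour",
})
hits = index.search("authentication flow")
```

Message `m1` ranks first because it matches both query terms and "flow" is rare; `m3` follows on the strength of its repeated "authentication".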

📅 Scheduled Re-runs

Automatically trigger documentation updates on a cron schedule or when new commits land on a branch. Runs as a background daemon with PID file management.

# Update docs every day at 2am
codilay schedule set . --cron "0 2 * * *"

# Update on every new commit to main
codilay schedule set . --on-commit --branch main

# Combine: cron + commit triggers
codilay schedule set . --cron "0 2 * * *" --on-commit

# Check current schedule
codilay schedule status .

# Start the scheduler (foreground)
codilay schedule start .

# Start with verbose logging
codilay schedule start . -v

# Stop a running scheduler
codilay schedule stop .

# Disable the schedule
codilay schedule disable .

⌨️ CLI Reference

Command                   Action
codilay                   Launch the Interactive Menu
codilay .                 Document the current directory (incremental)
codilay chat .            Start a Chat session about the project
codilay serve .           Launch the Web UI
codilay status .          Show documentation coverage and stale sections
codilay diff .            See what changed in files since the last run
codilay setup             Configure default provider, model, and API keys
codilay keys              Manage stored API keys
codilay clean .           Wipe all generated artifacts
codilay watch .           Watch for file changes, auto-update docs
codilay export .          Export docs in AI-friendly format (markdown/xml/json)
codilay diff-doc .        Show section-level documentation diff between runs
codilay triage-feedback   Manage triage corrections (add/list/hint/clear/remove)
codilay graph .           View and filter the dependency graph
codilay team              Manage shared team knowledge (facts/decisions/conventions)
codilay search . "query"  Full-text search across all past conversations
codilay schedule          Configure and run scheduled doc updates (set/start/stop)

⚙️ Project Configuration

Place a codilay.config.json in your root for project-specific behavior:

{
  "ignore": ["dist/**", "**/tests/**"],
  "notes": "This is a React/Next.js frontend using Tailwind.",
  "instructions": "Focus on data-fetching patterns and state management.",
  "entryHint": "src/main.py",
  "llm": {
    "provider": "anthropic",
    "model": "claude-3-5-sonnet-latest",
    "baseUrl": "https://api.anthropic.com",
    "maxTokensPerCall": 4096
  },
  "triage": {
    "mode": "smart",
    "includeTests": false,
    "forceInclude": ["critical_logic/*.py"],
    "forceSkip": ["legacy_v1/*.js"]
  },
  "chunking": {
    "tokenThreshold": 6000,
    "maxChunkTokens": 4000,
    "overlapRatio": 0.10
  },
  "parallel": {
    "enabled": true,
    "maxWorkers": 4
  }
}

📋 Configuration Fields

Category  Key               Type       Description
General   ignore            List[str]  Glob patterns for files/folders to exclude from scans.
General   notes             str        High-level project context provided to the AI.
General   instructions      str        Specific documentation style or domain instructions.
General   entryHint         str        Path to the main entry file, to help trace wires.
General   skipGenerated     List[str]  Optional override for default generated/lock file ignores.
LLM       provider          str        AI provider (e.g., anthropic, openai, google, ollama).
LLM       model             str        Model identifier (e.g., claude-3-5-sonnet-latest).
LLM       baseUrl           str        Custom API base URL (useful for local models or proxies).
LLM       maxTokensPerCall  int        Maximum output tokens per individual agent call.
Triage    mode              str        Default classification strategy (smart, core, skim, skip).
Triage    includeTests      bool       Whether to process test files (defaults to false).
Triage    forceInclude      List[str]  Patterns to always treat as Core documentation.
Triage    forceSkip         List[str]  Patterns to always ignore.
Chunking  tokenThreshold    int        Files larger than this (in tokens) are split into chunks.
Chunking  maxChunkTokens    int        Target token count for each detail chunk.
Chunking  overlapRatio      float      Contextual overlap between chunks (e.g. 0.10 for 10%).
Parallel  enabled           bool       Enable/disable concurrent processing of files within the same tier.
Parallel  maxWorkers        int        Max number of concurrent LLM calls.
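The three chunking fields interact as follows: files at or below tokenThreshold stay whole, and larger files are cut into windows of maxChunkTokens that overlap by overlapRatio. The function below is a sketch of that arithmetic using the example values from the configuration; `chunk_tokens` is a hypothetical helper and the exact splitting strategy CodiLay uses may differ.

```python
def chunk_tokens(tokens: list, token_threshold: int = 6000,
                 max_chunk_tokens: int = 4000, overlap_ratio: float = 0.10) -> list[list]:
    """Split a token list into overlapping windows when it exceeds the threshold."""
    if len(tokens) <= token_threshold:
        return [tokens]                                  # small file: no chunking
    overlap = int(max_chunk_tokens * overlap_ratio)      # 400 tokens at the defaults
    step = max(1, max_chunk_tokens - overlap)            # each window starts 3600 later
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_chunk_tokens])
        if start + max_chunk_tokens >= len(tokens):
            break                                        # last window reached the end
    return chunks


chunks = chunk_tokens(list(range(10_000)))
```

For a 10,000-token file this yields three chunks starting at tokens 0, 3600, and 7200, with each chunk repeating the final 400 tokens of the previous one for context.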

🌐 Multi-Provider Support

CodiLay is provider-agnostic. Power it with:

  • Cloud: Anthropic (Sonnet/Haiku), OpenAI (GPT-4o), Google Gemini, Groq, Llama Cloud.
  • Local: Ollama.
  • Specialty: DeepSeek, Mistral.
  • Custom: Any OpenAI-compatible endpoint.

📂 Project Structure

src/codilay/
├── cli.py              # Command parsing & Interactive Menu
├── scanner.py          # Git-aware file walking
├── triage.py           # AI-powered file categorization
├── processor.py        # The Agent Loop & Large file chunking
├── wire_manager.py     # Linkage & Dependency resolution
├── docstore.py         # Living CODEBASE.md management
├── chatstore.py        # Persistent memory & Chat history
├── server.py           # FastAPI Intelligence Server (Web UI + API)
├── watcher.py          # File system watcher (watch mode)
├── exporter.py         # AI-friendly doc export (markdown/xml/json)
├── doc_differ.py       # Section-level doc diffing & version snapshots
├── triage_feedback.py  # Triage correction store & feedback loop
├── graph_filter.py     # Dependency graph filtering engine
├── team_memory.py      # Shared team knowledge base
├── search.py           # Full-text conversation search (inverted index)
├── scheduler.py        # Cron & commit-based auto re-runs
└── web/                # Premium Glassmorphic Frontend

vscode-extension/       # VSCode extension for inline doc surfacing
├── package.json
├── tsconfig.json
└── src/extension.ts

🤝 Contributing

We love contributors! Trace your own wires into the project by checking out CONTRIBUTING.md.

  1. Fork the repo.
  2. Install dev deps: pip install -e ".[all,dev]"
  3. Test: pytest
  4. Submit a PR.

📜 License

Distributed under the MIT License. See LICENSE for details.


Generated by CodiLay: documenting the future, one wire at a time.
