AI-native code indexing tool for large codebases

codeindex

Requires Python 3.10+. License: MIT.

Universal Code Parser: a multi-language AST parser for AI-assisted development.

codeindex extracts symbols, inheritance relationships, call graphs, and imports from Python, PHP, and Java using tree-sitter. Perfect for feeding structured code data to AI tools, knowledge graphs, and code intelligence platforms.


For LoomGraph Developers: FOR_LOOMGRAPH.md (quick start) | docs/guides/loomgraph-integration.md (full guide)


Features

  • Multi-language AST parsing — Python, PHP, Java via tree-sitter (TypeScript, Go, Rust, C# planned)
  • AI-powered documentation — Generate README files using Claude, GPT, or any AI CLI
  • Single file parse — codeindex parse <file> with JSON output for tool integration
  • Structured JSON output — --output json for CI/CD, knowledge graphs, and downstream tools
  • Call relationship extraction — Function/method call graphs across Python, Java, PHP
  • Inheritance extraction — Class hierarchy and interface relationships
  • Framework route extraction — ThinkPHP and Spring Boot route tables (more planned)
  • Technical debt analysis — Detect large files, god classes, symbol overload
  • Smart indexing — Tiered documentation (overview → navigation → detailed) optimized for AI agents
  • Adaptive symbol extraction — Dynamic 5–150 symbols per file based on size
  • CLAUDE.md injection — codeindex init auto-configures Claude Code integration (v0.17.0)
  • Template-based test generation — YAML + Jinja2 for rapid language support (88–91% time savings)
  • Parallel scanning — Concurrent directory processing with configurable workers
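
The adaptive symbol extraction bullet above gives only the bounds (5 to 150 symbols per file). As a rough illustration of how a size-based limit could be clamped to that range, here is a minimal sketch. The function name `symbol_limit` and the `lines_per_symbol` ratio are hypothetical and are not taken from codeindex; the real heuristic may differ.

```python
import math

MIN_SYMBOLS = 5     # lower bound quoted in the feature list
MAX_SYMBOLS = 150   # upper bound quoted in the feature list


def symbol_limit(line_count: int, lines_per_symbol: int = 20) -> int:
    """Return how many symbols to keep for a file of `line_count` lines.

    `lines_per_symbol` is a hypothetical tuning knob, not a codeindex setting.
    """
    wanted = math.ceil(max(line_count, 0) / lines_per_symbol)
    # Clamp the size-based estimate into the documented 5-150 range.
    return max(MIN_SYMBOLS, min(MAX_SYMBOLS, wanted))


for size in (40, 800, 10_000):
    print(size, symbol_limit(size))
```

Small files fall back to the minimum and very large files are capped at the maximum, so the output size stays bounded whatever the input.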

Installation

codeindex uses lazy loading — language parsers are only imported when needed.
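
To illustrate what lazy loading means here, the sketch below defers each import until a language is first requested and then caches the result. It is an assumption-laden illustration, not codeindex's actual loader: the `PARSER_MODULES` registry and `load_parser` function are hypothetical, and standard-library modules stand in for the real grammar packages so the example runs anywhere.

```python
import importlib
from functools import lru_cache

# Hypothetical registry: language name -> module that would provide its parser.
# Stdlib modules stand in here so the sketch runs without extra packages.
PARSER_MODULES = {
    "python": "json",
    "php": "csv",
}


@lru_cache(maxsize=None)
def load_parser(language: str):
    """Import the module for `language` on first use and cache it."""
    try:
        module_name = PARSER_MODULES[language]
    except KeyError:
        raise ValueError(f"unsupported language: {language}") from None
    return importlib.import_module(module_name)


parser = load_parser("python")   # the import happens here, not at startup
print(parser.__name__)
```

With this pattern, installing only one language extra is enough as long as the other languages are never requested.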

Quick Install

# All languages (recommended)
# Quotes keep shells such as zsh from expanding the square brackets
pip install "ai-codeindex[all]"

# Or specific languages only
pip install "ai-codeindex[python]"
pip install "ai-codeindex[php]"
pip install "ai-codeindex[java]"
pip install "ai-codeindex[python,php]"

Using pipx (Recommended for CLI use)

pipx install "ai-codeindex[all]"

From Source

git clone https://github.com/dreamlx/codeindex.git
cd codeindex
pip install -e ".[all]"

Quick Start

1. Initialize Your Project

cd /your/project
codeindex init

This creates:

  • .codeindex.yaml — scan configuration (languages, include/exclude patterns)
  • CLAUDE.md — injects codeindex instructions so Claude Code uses README_AI.md automatically
  • CODEINDEX.md — project-level documentation reference

2. Scan Your Codebase

# Scan all directories (structural documentation, no AI needed)
codeindex scan-all

# Scan a single directory
codeindex scan ./src/auth

# AI-enhanced documentation (requires ai_command in config)
codeindex scan-all --ai

# Preview AI prompt without executing
codeindex scan ./src/auth --ai --dry-run

3. Check Status

codeindex status
Indexing Status
───────────────────────────────
✅ src/auth/
✅ src/utils/
⚠️  src/api/ (no README_AI.md)
Indexed: 2/3 (67%)
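
The status report above can be read as "which directories already contain a README_AI.md". The following sketch reproduces that check and the coverage line for a throwaway directory tree. The function names are hypothetical, and this is only an approximation of what `codeindex status` reports, not its implementation.

```python
from pathlib import Path
import tempfile


def index_status(root: Path, directories: list[str]) -> tuple[list[str], list[str]]:
    """Split `directories` into (indexed, missing) by presence of README_AI.md."""
    indexed, missing = [], []
    for rel in directories:
        target = indexed if (root / rel / "README_AI.md").is_file() else missing
        target.append(rel)
    return indexed, missing


def coverage_line(indexed: list[str], missing: list[str]) -> str:
    """Format the summary line in the same shape as the sample output."""
    total = len(indexed) + len(missing)
    percent = round(100 * len(indexed) / total) if total else 0
    return f"Indexed: {len(indexed)}/{total} ({percent}%)"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("src/auth", "src/utils", "src/api"):
        (root / rel).mkdir(parents=True)
    for rel in ("src/auth", "src/utils"):
        (root / rel / "README_AI.md").write_text("# index\n")
    done, todo = index_status(root, ["src/auth", "src/utils", "src/api"])
    print(coverage_line(done, todo))
```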

4. Generate Indexes

# Global symbol index (PROJECT_SYMBOLS.md)
codeindex symbols

# Module overview (PROJECT_INDEX.md)
codeindex index

# Git change impact analysis
codeindex affected --since HEAD~5
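
As a rough picture of what a change impact analysis starts from, the sketch below lists files changed since a given revision with plain `git diff --name-only` and groups them by parent directory. This is an assumption about the general approach, not a description of how `codeindex affected` is implemented; both function names are hypothetical.

```python
import subprocess
from pathlib import PurePosixPath


def changed_files(since: str = "HEAD~5") -> list[str]:
    """List files changed between `since` and HEAD using plain git."""
    result = subprocess.run(
        ["git", "diff", "--name-only", since, "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def affected_directories(paths: list[str]) -> list[str]:
    """Return the sorted, de-duplicated parent directories of `paths`."""
    return sorted({str(PurePosixPath(p).parent) for p in paths})


# Example with a fixed list, so the sketch runs outside a git repository.
print(affected_directories(["src/auth/login.py", "src/auth/token.py", "README.md"]))
```

Inside a repository, `affected_directories(changed_files("HEAD~5"))` would give the directories whose indexes are candidates for a rescan.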

More Commands

| Command | Description | Guide |
| --- | --- | --- |
| codeindex scan --output json | JSON output for tools | JSON Output Guide |
| codeindex parse <file> | Parse single file to JSON | LoomGraph Integration |
| codeindex tech-debt ./src | Technical debt analysis | Advanced Usage |
| codeindex hooks install | Git hooks for auto-update | Git Hooks Guide |
| codeindex config explain <param> | Parameter help | Configuration Guide |
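
For tool integration, a downstream script might call `codeindex parse <file>` and decode the result. The sketch below assumes the JSON is written to standard output and makes no assumption about its schema, which is documented in the guides listed above; `run_parse` and `top_level_keys` are hypothetical helper names, and the sample payload is a placeholder rather than real codeindex output.

```python
import json
import subprocess


def run_parse(path: str) -> str:
    """Run `codeindex parse <file>` and return whatever it writes to stdout.

    Assumption: the JSON document goes to stdout.
    """
    result = subprocess.run(
        ["codeindex", "parse", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def top_level_keys(raw_json: str) -> list[str]:
    """Decode the output and list its top-level keys without assuming a schema."""
    data = json.loads(raw_json)
    return sorted(data) if isinstance(data, dict) else []


# Placeholder payload, not the real schema; replace with run_parse("some_file.py").
print(top_level_keys('{"placeholder": []}'))
```

Inspecting the keys first is a cautious way to explore the output before relying on specific fields.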

Claude Code Integration

v0.17.0: codeindex init automatically injects instructions into your project's CLAUDE.md, so Claude Code reads README_AI.md files first — no manual setup required.

# One command sets everything up
codeindex init

# Claude Code will now:
# ✅ Read README_AI.md before searching source files
# ✅ Use structured indexes for architecture understanding
# ✅ Navigate code via Serena MCP tools (find_symbol, etc.)

For manual setup, MCP skills (/mo:arch, /mo:index), and Git hooks integration, see the Claude Code Integration Guide.


Language Support

| Language | Status | Since | Key Features |
| --- | --- | --- | --- |
| Python | ✅ Supported | v0.1.0 | Classes, functions, methods, imports, docstrings, inheritance, calls |
| PHP | ✅ Supported | v0.5.0 | Classes (extends/implements), methods, properties, PHPDoc, inheritance, calls |
| Java | ✅ Supported | v0.7.0 | Classes, interfaces, enums, records, annotations, Spring routes, Lombok, calls |
| TypeScript/JS | 🧪 Tests Ready | v0.14.0 | Parser implementation in progress (Epic 15) |
| Go | 📋 Planned | | Packages, interfaces, struct methods |
| Rust | 📋 Planned | | Structs, traits, modules |
| C# | 📋 Planned | | Classes, interfaces, .NET projects |
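
To make the "Key Features" column concrete for Python, the sketch below pulls imports, classes with their base classes, functions, and call targets out of a small source string. It deliberately uses the standard-library `ast` module as a stand-in; codeindex itself parses with tree-sitter, and neither the `extract` function nor its result layout reflects codeindex's real output.

```python
import ast

SOURCE = '''
import os

class Base:
    pass

class Service(Base):
    def run(self):
        return os.getcwd()
'''


def extract(source: str) -> dict[str, list]:
    """Collect imports, classes with their bases, functions, and called names."""
    found = {"imports": [], "classes": [], "functions": [], "calls": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found["imports"] += [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            found["imports"].append(node.module or ".")
        elif isinstance(node, ast.ClassDef):
            # Base classes give the inheritance edges.
            bases = [ast.unparse(base) for base in node.bases]
            found["classes"].append((node.name, bases))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found["functions"].append(node.name)
        elif isinstance(node, ast.Call):
            # The callee expression gives a call-graph edge.
            found["calls"].append(ast.unparse(node.func))
    return found


for kind, items in extract(SOURCE).items():
    print(kind, items)
```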

Want to add a language? The template-based test system lets you contribute by writing YAML specs — no Python knowledge required. See CONTRIBUTING.md for details.

Framework Route Extraction

| Framework | Language | Status |
| --- | --- | --- |
| ThinkPHP | PHP | ✅ Stable (v0.5.0) |
| Spring Boot | Java | ✅ Stable (v0.8.0) |
| Laravel | PHP | 📋 Planned |
| FastAPI | Python | 📋 Planned |
| Django | Python | 📋 Planned |
| Express.js | JS/TS | 📋 Planned |

How It Works

Directory → Scanner → Parser (tree-sitter) → Smart Writer → README_AI.md
  1. Scanner — walks directories, filters by config patterns
  2. Parser — extracts symbols (classes, functions, imports, calls, inheritance) via tree-sitter
  3. Smart Writer — generates tiered documentation with size limits (≤50KB)
  4. Output — README_AI.md optimized for AI consumption, or JSON for tool integration
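
The Scanner step (walk directories, filter by config patterns) can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: the `scan` function and its `include`/`exclude` parameters are hypothetical, patterns are matched against file and directory names only, and the real scanner's pattern semantics may differ.

```python
import fnmatch
import os
import tempfile
from pathlib import Path


def scan(root: str, include: list[str], exclude: list[str]) -> list[str]:
    """Return relative POSIX paths under `root` matching `include` and no `exclude` pattern."""
    matches = []
    for current, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames
                       if not any(fnmatch.fnmatch(d, p) for p in exclude)]
        for name in filenames:
            if any(fnmatch.fnmatch(name, p) for p in exclude):
                continue
            if any(fnmatch.fnmatch(name, p) for p in include):
                rel = Path(current, name).relative_to(root)
                matches.append(rel.as_posix())
    return sorted(matches)


with tempfile.TemporaryDirectory() as tmp:
    for rel in ("src/auth/login.py", "src/auth/notes.txt", "vendor/lib.py"):
        path = Path(tmp, rel)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("")
    print(scan(tmp, include=["*.py"], exclude=["vendor"]))
```

Pruning `dirnames` in place is what keeps excluded trees such as vendored code from being walked at all, which matters on large codebases.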

Documentation

User Guides

| Guide | Description |
| --- | --- |
| Getting Started | Installation and first scan |
| Configuration Guide | All config options explained |
| Configuration Changelog | Version-by-version config changes |
| Advanced Usage | Parallel scanning, custom prompts |
| Git Hooks Integration | Automated quality checks and doc updates |
| Claude Code Integration | AI agent setup and MCP skills |
| JSON Output Integration | Machine-readable output for tools |
| LoomGraph Integration | Knowledge graph data pipeline |

Developer Guides

| Guide | Description |
| --- | --- |
| CONTRIBUTING.md | Development setup, TDD workflow, code style |
| CLAUDE.md | Quick reference for Claude Code and contributors |
| Design Philosophy | Core design principles and architecture |
| Release Automation | 5-minute automated release workflow |
| Multi-Language Support | Adding new language parsers |
| Language Support Contribution | Template-based test generation for new languages |

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

git clone https://github.com/dreamlx/codeindex.git
cd codeindex
pip install -e ".[dev,all]"
make install-hooks
make test

Release Process (Maintainers)

make release VERSION=0.17.0
# GitHub Actions: tests → PyPI publish → GitHub Release

See Release Automation Guide for details.


Roadmap

Current version: v0.17.1

Recent milestones:

  • v0.17.0 — CLAUDE.md injection via codeindex init
  • v0.16.0 — CLI UX restructuring (structural mode default, --ai opt-in)
  • v0.15.0 — Template-based test architecture migration
  • v0.14.0 — Interactive setup wizard, single file parse, parser modularization

Next:

  • Framework routes expansion: Express, Laravel, FastAPI, Django (Epic 17)
  • TypeScript parser implementation (Epic 15)
  • Go, Rust, C# language support

Moved to LoomGraph:

  • Code similarity search, refactoring suggestions, team collaboration, IDE integration

See Strategic Roadmap for detailed plans.


License

MIT License — see LICENSE file for details.

Acknowledgments

  • tree-sitter — fast, incremental parsing
  • Claude CLI — AI integration inspiration
  • All contributors and users

Made with ❤️ by the codeindex team
