
AI-native code indexing tool for large codebases

codeindex

Python 3.10+ | License: MIT

Universal Code Parser — a multi-language AST parser for AI-assisted development.

codeindex extracts symbols, inheritance relationships, call graphs, and imports from Python, PHP, and Java using tree-sitter. Perfect for feeding structured code data to AI tools, knowledge graphs, and code intelligence platforms.


For LoomGraph Developers: FOR_LOOMGRAPH.md (quick start) | docs/guides/loomgraph-integration.md (full guide)


Features

  • Multi-language AST parsing — Python, PHP, Java via tree-sitter (TypeScript, Go, Rust, C# planned)
  • AI-powered documentation — Generate README files using Claude, GPT, or any AI CLI
  • Single file parse — codeindex parse <file> with JSON output for tool integration
  • Structured JSON output — --output json for CI/CD, knowledge graphs, and downstream tools
  • Call relationship extraction — Function/method call graphs across Python, Java, PHP
  • Inheritance extraction — Class hierarchy and interface relationships
  • Framework route extraction — ThinkPHP and Spring Boot route tables (more planned)
  • Technical debt analysis — Detect large files, god classes, symbol overload
  • Smart indexing — Tiered documentation (overview → navigation → detailed) optimized for AI agents
  • Adaptive symbol extraction — Dynamic 5–150 symbols per file based on size
  • CLAUDE.md injection — codeindex init auto-configures Claude Code integration (v0.17.0)
  • Template-based test generation — YAML + Jinja2 for rapid language support (88–91% time savings)
  • Parallel scanning — Concurrent directory processing with configurable workers
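
As an illustration of the parallel scanning item above, here is a minimal sketch of concurrent directory processing with a configurable worker count. It is not codeindex's implementation: the function names count_source_files and scan_parallel are hypothetical, and counting .py files stands in for the real per-directory work.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def count_source_files(directory: Path) -> int:
    """Stand-in for per-directory work: count the .py files directly inside."""
    return sum(1 for p in directory.iterdir() if p.is_file() and p.suffix == ".py")


def scan_parallel(directories: list[Path], workers: int = 4) -> dict[Path, int]:
    """Process directories concurrently; `workers` caps the thread pool size."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with `directories`.
        results = pool.map(count_source_files, directories)
        return dict(zip(directories, results))
```

A thread pool suits this kind of work when most of the time goes to file I/O; CPU-heavy parsing might be better served by a process pool.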

Installation

codeindex uses lazy loading — language parsers are only imported when needed.

Quick Install

# All languages (recommended)
pip install "ai-codeindex[all]"

# Or specific languages only
pip install "ai-codeindex[python]"
pip install "ai-codeindex[php]"
pip install "ai-codeindex[java]"
pip install "ai-codeindex[python,php]"

Using pipx (Recommended for CLI use)

pipx install "ai-codeindex[all]"

From Source

git clone https://github.com/dreamlx/codeindex.git
cd codeindex
pip install -e ".[all]"

Quick Start

1. Initialize Your Project

cd /your/project
codeindex init

This creates:

  • .codeindex.yaml — scan configuration (languages, include/exclude patterns)
  • CLAUDE.md — injects codeindex instructions so Claude Code uses README_AI.md automatically
  • CODEINDEX.md — project-level documentation reference

2. Scan Your Codebase

# Scan all directories (structural documentation, no AI needed)
codeindex scan-all

# Scan a single directory
codeindex scan ./src/auth

# AI-enhanced documentation (requires ai_command in config)
codeindex scan-all --ai

# Preview AI prompt without executing
codeindex scan ./src/auth --ai --dry-run

3. Check Status

codeindex status
Indexing Status
───────────────────────────────
✅ src/auth/
✅ src/utils/
⚠️  src/api/ (no README_AI.md)
Indexed: 2/3 (67%)
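
The coverage figure in the status output can be reproduced with a short sketch. It assumes a directory counts as indexed when it contains a README_AI.md, which matches the warning shown above; the function index_status is hypothetical and not part of codeindex.

```python
from pathlib import Path


def index_status(root: Path, directories: list[str]) -> tuple[list[tuple[str, bool]], str]:
    """Report which directories contain a README_AI.md, plus a summary line."""
    rows = [(d, (root / d / "README_AI.md").is_file()) for d in directories]
    indexed = sum(1 for _, ok in rows if ok)
    percent = round(100 * indexed / len(rows)) if rows else 0
    return rows, f"Indexed: {indexed}/{len(rows)} ({percent}%)"
```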

4. Generate Indexes

# Global symbol index (PROJECT_SYMBOLS.md)
codeindex symbols

# Module overview (PROJECT_INDEX.md)
codeindex index

# Git change impact analysis
codeindex affected --since HEAD~5
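
One plausible reading of change impact analysis is mapping changed files to the directories whose indexes may be stale. The sketch below works under that assumption; changed_files and affected_directories are hypothetical helpers, and the only external command used is git diff --name-only.

```python
import subprocess
from pathlib import PurePosixPath


def changed_files(since: str) -> list[str]:
    """List files changed between `since` and the working tree, via git."""
    out = subprocess.run(
        ["git", "diff", "--name-only", since],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def affected_directories(paths: list[str]) -> list[str]:
    """Collapse changed file paths to the sorted set of directories holding them."""
    return sorted({str(PurePosixPath(p).parent) for p in paths})
```

Inside a Git repository, affected_directories(changed_files("HEAD~5")) would list the directories touched by the last five commits.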

More Commands

| Command | Description | Guide |
| --- | --- | --- |
| codeindex scan --output json | JSON output for tools | JSON Output Guide |
| codeindex parse <file> | Parse single file to JSON | LoomGraph Integration |
| codeindex tech-debt ./src | Technical debt analysis | Advanced Usage |
| codeindex hooks install | Git hooks for auto-update | Git Hooks Guide |
| codeindex config explain <param> | Parameter help | Configuration Guide |
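
For tool integration, the single-file command can be driven from Python. This sketch assumes codeindex parse writes its JSON to standard output; the JSON Output Guide is the place to confirm that and the schema, which is deliberately left uninterpreted here. The wrapper parse_file and its run parameter are hypothetical.

```python
import json
import subprocess
from typing import Any, Callable


def parse_file(path: str, run: Callable[..., Any] = subprocess.run) -> Any:
    """Run `codeindex parse <path>` and decode its stdout as JSON."""
    proc = run(
        ["codeindex", "parse", path],
        check=True, capture_output=True, text=True,
    )
    return json.loads(proc.stdout)
```

The run parameter exists so the wrapper can be tested without codeindex installed.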

Claude Code Integration

Since v0.17.0, codeindex init automatically injects instructions into your project's CLAUDE.md so that Claude Code reads README_AI.md files first. No manual setup is required.

# One command sets everything up
codeindex init

# Claude Code will now:
# ✅ Read README_AI.md before searching source files
# ✅ Use structured indexes for architecture understanding
# ✅ Navigate code via Serena MCP tools (find_symbol, etc.)

For manual setup, MCP skills (/mo:arch, /mo:index), and Git hooks integration, see the Claude Code Integration Guide.
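
To make the injection step concrete, here is a minimal sketch of adding a managed section to a file so that repeated runs do not duplicate it. The marker strings and the function inject_block are hypothetical; how codeindex init actually delimits its section in CLAUDE.md is not described here.

```python
from pathlib import Path

# Hypothetical markers; the real tool may delimit its section differently.
BEGIN = "<!-- codeindex:begin -->"
END = "<!-- codeindex:end -->"


def inject_block(path: Path, body: str) -> bool:
    """Add or refresh a marked block in `path`; return True if the file changed."""
    block = f"{BEGIN}\n{body.strip()}\n{END}"
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    start, end = text.find(BEGIN), text.find(END)
    if start != -1 and end > start:
        # Replace the existing block in place so reruns do not duplicate it.
        new_text = text[:start] + block + text[end + len(END):]
    else:
        sep = "" if not text or text.endswith("\n") else "\n"
        new_text = text + sep + block + "\n"
    if new_text == text:
        return False
    path.write_text(new_text, encoding="utf-8")
    return True
```

Content outside the markers is left untouched, so hand-written notes in the same file survive a rerun.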


Language Support

| Language | Status | Since | Key Features |
| --- | --- | --- | --- |
| Python | ✅ Supported | v0.1.0 | Classes, functions, methods, imports, docstrings, inheritance, calls |
| PHP | ✅ Supported | v0.5.0 | Classes (extends/implements), methods, properties, PHPDoc, inheritance, calls |
| Java | ✅ Supported | v0.7.0 | Classes, interfaces, enums, records, annotations, Spring routes, Lombok, calls |
| TypeScript/JS | 🧪 Tests Ready | v0.14.0 | Parser implementation in progress (Epic 15) |
| Go | 📋 Planned | - | Packages, interfaces, struct methods |
| Rust | 📋 Planned | - | Structs, traits, modules |
| C# | 📋 Planned | - | Classes, interfaces, .NET projects |

Want to add a language? The template-based test system lets you contribute by writing YAML specs — no Python knowledge required. See CONTRIBUTING.md for details.

Framework Route Extraction

| Framework | Language | Status |
| --- | --- | --- |
| ThinkPHP | PHP | ✅ Stable (v0.5.0) |
| Spring Boot | Java | ✅ Stable (v0.8.0) |
| Laravel | PHP | 📋 Planned |
| FastAPI | Python | 📋 Planned |
| Django | Python | 📋 Planned |
| Express.js | JS/TS | 📋 Planned |
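
To show what a route table contains, the sketch below pulls HTTP method and path pairs out of Spring Boot shortcut annotations. It uses a regular expression as a simplified stand-in; codeindex itself parses with tree-sitter, and the function extract_routes is hypothetical. Class-level @RequestMapping prefixes are not handled.

```python
import re

# Matches e.g. @GetMapping("/users") or @PostMapping(value = "/users").
_MAPPING = re.compile(
    r'@(Get|Post|Put|Delete|Patch)Mapping\(\s*(?:(?:value|path)\s*=\s*)?"([^"]*)"'
)


def extract_routes(java_source: str) -> list[tuple[str, str]]:
    """Return (HTTP method, path) pairs for shortcut mapping annotations."""
    return [(m.group(1).upper(), m.group(2)) for m in _MAPPING.finditer(java_source)]
```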

How It Works

Directory → Scanner → Parser (tree-sitter) → Smart Writer → README_AI.md
  1. Scanner — walks directories, filters by config patterns
  2. Parser — extracts symbols (classes, functions, imports, calls, inheritance) via tree-sitter
  3. Smart Writer — generates tiered documentation with size limits (≤50KB)
  4. Output — README_AI.md optimized for AI consumption, or JSON for tool integration
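
The stages above can be sketched end to end for Python sources. This is an illustration only: Python's built-in ast module stands in for tree-sitter, the function names scan, parse, and write are hypothetical, and the output layout is invented for the example. Only the 50KB cap comes from the description above.

```python
import ast
from pathlib import Path

MAX_BYTES = 50 * 1024  # mirrors the 50KB limit mentioned above


def scan(root: Path, exclude: tuple[str, ...] = ("__pycache__",)) -> list[Path]:
    """Scanner: collect .py files under `root`, skipping excluded directory names."""
    return sorted(
        p for p in root.rglob("*.py")
        if not any(part in exclude for part in p.relative_to(root).parts)
    )


def parse(source: str) -> dict[str, list]:
    """Parser: pull classes (with bases), functions, imports, and calls from source."""
    info: dict[str, list] = {"classes": [], "functions": [], "imports": [], "calls": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            bases = [ast.unparse(b) for b in node.bases]
            info["classes"].append((node.name, bases))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            info["functions"].append(node.name)
        elif isinstance(node, ast.Import):
            info["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            info["imports"].append(node.module or ".")
        elif isinstance(node, ast.Call):
            info["calls"].append(ast.unparse(node.func))
    return info


def write(name: str, info: dict[str, list]) -> str:
    """Writer: render a short summary, truncated to the size limit."""
    lines = [f"## {name}"]
    for cls, bases in info["classes"]:
        suffix = f" ({', '.join(bases)})" if bases else ""
        lines.append(f"- class {cls}{suffix}")
    lines += [f"- def {fn}" for fn in info["functions"]]
    lines += [f"- import {mod}" for mod in info["imports"]]
    text = "\n".join(lines) + "\n"
    return text.encode("utf-8")[:MAX_BYTES].decode("utf-8", errors="ignore")
```

Chaining the three functions over scan(root) gives one summary per file; a real writer would also tier the output by level of detail.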

Documentation

User Guides

| Guide | Description |
| --- | --- |
| Getting Started | Installation and first scan |
| Configuration Guide | All config options explained |
| Advanced Usage | Parallel scanning, custom prompts |
| Git Hooks Integration | Automated quality checks and doc updates |
| Claude Code Integration | AI agent setup and MCP skills |
| JSON Output Integration | Machine-readable output for tools |
| LoomGraph Integration | Knowledge graph data pipeline |

Developer Guides

| Guide | Description |
| --- | --- |
| CONTRIBUTING.md | Development setup, TDD workflow, code style |
| CLAUDE.md | Quick reference for Claude Code and contributors |
| Design Philosophy | Core design principles and architecture |
| Release Automation | 5-minute automated release workflow |
| Multi-Language Support | Adding new language parsers |
| Language Support Contribution | Template-based test generation for new languages |


Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

git clone https://github.com/dreamlx/codeindex.git
cd codeindex
pip install -e ".[dev,all]"
make install-hooks
make test

Release Process (Maintainers)

make release VERSION=0.17.0
# GitHub Actions: tests → PyPI publish → GitHub Release

See Release Automation Guide for details.


Roadmap

Current version: v0.17.2

Recent milestones:

  • v0.17.0 — CLAUDE.md injection via codeindex init
  • v0.16.0 — CLI UX restructuring (structural mode default, --ai opt-in)
  • v0.15.0 — Template-based test architecture migration
  • v0.14.0 — Interactive setup wizard, single file parse, parser modularization

Next:

  • Framework routes expansion: Express, Laravel, FastAPI, Django (Epic 17)
  • TypeScript parser implementation (Epic 15)
  • Go, Rust, C# language support

Moved to LoomGraph:

  • Code similarity search, refactoring suggestions, team collaboration, IDE integration

See Strategic Roadmap for detailed plans.


License

MIT License — see LICENSE file for details.

Acknowledgments

  • tree-sitter — fast, incremental parsing
  • Claude CLI — AI integration inspiration
  • All contributors and users


Made with ❤️ by the codeindex team
