AI-native code indexing tool for large codebases
codeindex
Universal Code Parser: a multi-language AST parser for AI-assisted development.
codeindex extracts symbols, inheritance relationships, call graphs, and imports from Python, PHP, and Java using tree-sitter. It is designed for feeding structured code data to AI tools, knowledge graphs, and code intelligence platforms.
For LoomGraph Developers:
FOR_LOOMGRAPH.md (quick start) | docs/guides/loomgraph-integration.md (full guide)
Features
- Multi-language AST parsing — Python, PHP, Java via tree-sitter (TypeScript, Go, Rust, C# planned)
- AI-powered documentation — Generate README files using Claude, GPT, or any AI CLI
- Single file parse — codeindex parse <file> with JSON output for tool integration
- Structured JSON output — --output json for CI/CD, knowledge graphs, and downstream tools
- Call relationship extraction — Function/method call graphs across Python, Java, PHP
- Inheritance extraction — Class hierarchy and interface relationships
- Framework route extraction — ThinkPHP and Spring Boot route tables (more planned)
- Technical debt analysis — Detect large files, god classes, symbol overload
- Smart indexing — Tiered documentation (overview → navigation → detailed) optimized for AI agents
- Adaptive symbol extraction — Dynamic 5–150 symbols per file based on size
- CLAUDE.md injection — codeindex init auto-configures Claude Code integration (v0.17.0)
- Template-based test generation — YAML + Jinja2 for rapid language support (88–91% time savings)
- Parallel scanning — Concurrent directory processing with configurable workers
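The adaptive symbol extraction feature above keeps between 5 and 150 symbols per file depending on size. The rule codeindex actually applies is not described here, so the following is only a minimal sketch that assumes a linear scale on line count. The function name `adaptive_symbol_limit` and the `lines_per_symbol` value are hypothetical.

```python
def adaptive_symbol_limit(
    line_count: int,
    min_symbols: int = 5,
    max_symbols: int = 150,
    lines_per_symbol: int = 20,
) -> int:
    """Return how many symbols to keep for a file with `line_count` lines.

    Assumption: one symbol per `lines_per_symbol` lines, clamped to the
    5-150 range mentioned in the feature list. The real heuristic may differ.
    """
    if line_count < 0:
        raise ValueError("line_count must be non-negative")
    scaled = line_count // lines_per_symbol
    return max(min_symbols, min(max_symbols, scaled))


if __name__ == "__main__":
    for lines in (40, 1_000, 10_000):
        print(lines, adaptive_symbol_limit(lines))
```

Small files fall back to the lower bound and very large files are capped at the upper bound, which keeps the generated documentation bounded in size.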
Installation
codeindex uses lazy loading — language parsers are only imported when needed.
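As an illustration of the lazy-loading idea, the sketch below defers importing a parser module until a language is first requested. It is not codeindex's implementation: the class `LazyParserRegistry` is hypothetical, and standard-library modules stand in for real grammar packages so the example runs without extra installs.

```python
import importlib
from types import ModuleType


class LazyParserRegistry:
    """Import a language's parser module only on first use."""

    def __init__(self, module_names: dict[str, str]) -> None:
        # Maps a language name to the module that would provide its parser.
        self._module_names = dict(module_names)
        self._loaded: dict[str, ModuleType] = {}

    def get(self, language: str) -> ModuleType:
        if language not in self._module_names:
            raise KeyError(f"unsupported language: {language}")
        if language not in self._loaded:
            # The import cost is paid here, not at program start.
            self._loaded[language] = importlib.import_module(
                self._module_names[language]
            )
        return self._loaded[language]

    def loaded_languages(self) -> list[str]:
        return sorted(self._loaded)


if __name__ == "__main__":
    # Stand-in modules: "json" and "csv" play the role of grammar packages.
    registry = LazyParserRegistry({"python": "json", "php": "csv"})
    print(registry.loaded_languages())
    registry.get("python")
    print(registry.loaded_languages())
```

With this pattern, installing only `ai-codeindex[python]` would not cause import errors for PHP or Java parsers unless those languages are actually requested.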
Quick Install
# All languages (recommended)
pip install ai-codeindex[all]
# Or specific languages only
pip install ai-codeindex[python]
pip install ai-codeindex[php]
pip install ai-codeindex[java]
pip install ai-codeindex[python,php]
Using pipx (Recommended for CLI use)
pipx install ai-codeindex[all]
From Source
git clone https://github.com/dreamlx/codeindex.git
cd codeindex
pip install -e ".[all]"
Quick Start
1. Initialize Your Project
cd /your/project
codeindex init
This creates:
- .codeindex.yaml — scan configuration (languages, include/exclude patterns)
- CLAUDE.md — injects codeindex instructions so Claude Code uses README_AI.md automatically
- CODEINDEX.md — project-level documentation reference
2. Scan Your Codebase
# Scan all directories (structural documentation, no AI needed)
codeindex scan-all
# Scan a single directory
codeindex scan ./src/auth
# AI-enhanced documentation (requires ai_command in config)
codeindex scan-all --ai
# Preview AI prompt without executing
codeindex scan ./src/auth --ai --dry-run
3. Check Status
codeindex status
Indexing Status
───────────────────────────────
✅ src/auth/
✅ src/utils/
⚠️ src/api/ (no README_AI.md)
Indexed: 2/3 (67%)
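The status report above can be approximated by checking each directory for a README_AI.md file. This is a sketch under that assumption only; the helper names `indexing_status` and `format_summary` are hypothetical, and the real command may apply additional rules from the configuration.

```python
from pathlib import Path


def indexing_status(root: Path, directories: list[str]) -> tuple[list[str], list[str]]:
    """Split directories into (indexed, missing) by presence of README_AI.md."""
    indexed: list[str] = []
    missing: list[str] = []
    for name in directories:
        if (root / name / "README_AI.md").is_file():
            indexed.append(name)
        else:
            missing.append(name)
    return indexed, missing


def format_summary(indexed: list[str], missing: list[str]) -> str:
    """Render the summary line in the same shape as the status output."""
    total = len(indexed) + len(missing)
    percent = round(100 * len(indexed) / total) if total else 0
    return f"Indexed: {len(indexed)}/{total} ({percent}%)"


if __name__ == "__main__":
    done, todo = indexing_status(Path("."), ["src/auth", "src/utils", "src/api"])
    print(format_summary(done, todo))
```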
4. Generate Indexes
# Global symbol index (PROJECT_SYMBOLS.md)
codeindex symbols
# Module overview (PROJECT_INDEX.md)
codeindex index
# Git change impact analysis
codeindex affected --since HEAD~5
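To show the idea behind change impact analysis, the sketch below lists files changed since a Git revision and collapses them to their parent directories, which are the units that would need re-scanning. It assumes `git diff --name-only` as the source of changes; how `codeindex affected` computes its result is not specified here, and both function names are hypothetical.

```python
import subprocess
from pathlib import PurePosixPath


def changed_files(since: str) -> list[str]:
    """Return paths that differ between `since` and the working tree."""
    result = subprocess.run(
        ["git", "diff", "--name-only", since],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def affected_directories(paths: list[str]) -> list[str]:
    """Collapse changed file paths to the sorted set of parent directories."""
    # Git prints forward-slash paths on every platform, hence PurePosixPath.
    return sorted({str(PurePosixPath(p).parent) for p in paths})


if __name__ == "__main__":
    sample = ["src/auth/login.py", "src/auth/token.py", "src/api/routes.py"]
    print(affected_directories(sample))
```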
More Commands
| Command | Description | Guide |
|---|---|---|
| codeindex scan --output json | JSON output for tools | JSON Output Guide |
| codeindex parse <file> | Parse single file to JSON | LoomGraph Integration |
| codeindex tech-debt ./src | Technical debt analysis | Advanced Usage |
| codeindex hooks install | Git hooks for auto-update | Git Hooks Guide |
| codeindex config explain <param> | Parameter help | Configuration Guide |
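For tool integration, the JSON-producing commands in the table can be driven from a script. The sketch below assumes the JSON is written to standard output and makes no assumption about its schema (see the JSON Output Guide for that). The helper `run_json` is hypothetical, and the demo uses a stand-in command so it runs without codeindex installed.

```python
import json
import subprocess
import sys


def run_json(command: list[str]) -> object:
    """Run a command and decode its standard output as JSON."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Stand-in command. With codeindex installed, the list could instead be
    # ["codeindex", "parse", "path/to/file.py"] (the path is a placeholder).
    demo = [sys.executable, "-c", "import json; print(json.dumps({'ok': True}))"]
    print(run_json(demo))
```

Because `check=True` is set, a non-zero exit status raises `subprocess.CalledProcessError` instead of silently returning partial output.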
Claude Code Integration
v0.17.0: codeindex init automatically injects instructions into your project's CLAUDE.md, so Claude Code reads README_AI.md files first — no manual setup required.
# One command sets everything up
codeindex init
# Claude Code will now:
# ✅ Read README_AI.md before searching source files
# ✅ Use structured indexes for architecture understanding
# ✅ Navigate code via Serena MCP tools (find_symbol, etc.)
For manual setup, MCP skills (/mo:arch, /mo:index), and Git hooks integration, see the Claude Code Integration Guide.
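Injecting instructions into an existing CLAUDE.md is safest when it is idempotent, so that running the command twice does not duplicate the block. The sketch below shows one common way to do that with marker comments. The markers and the function `inject_block` are hypothetical and are not taken from codeindex.

```python
START = "<!-- codeindex:start -->"  # hypothetical marker
END = "<!-- codeindex:end -->"      # hypothetical marker


def inject_block(existing: str, instructions: str) -> str:
    """Insert or replace a marked block, leaving other content untouched."""
    block = f"{START}\n{instructions.strip()}\n{END}"
    start = existing.find(START)
    end = existing.find(END)
    if start != -1 and end > start:
        # Replace the previous block so repeated runs do not duplicate it.
        return existing[:start] + block + existing[end + len(END):]
    if not existing.strip():
        return block + "\n"
    # Append after the existing content, separated by one blank line.
    return existing.rstrip("\n") + "\n\n" + block + "\n"


if __name__ == "__main__":
    once = inject_block("# Project notes\n", "Read README_AI.md first.")
    twice = inject_block(once, "Read README_AI.md first.")
    print(once == twice)
```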
Language Support
| Language | Status | Since | Key Features |
|---|---|---|---|
| Python | ✅ Supported | v0.1.0 | Classes, functions, methods, imports, docstrings, inheritance, calls |
| PHP | ✅ Supported | v0.5.0 | Classes (extends/implements), methods, properties, PHPDoc, inheritance, calls |
| Java | ✅ Supported | v0.7.0 | Classes, interfaces, enums, records, annotations, Spring routes, Lombok, calls |
| TypeScript/JS | 🧪 Tests Ready | v0.14.0 | Parser implementation in progress (Epic 15) |
| Go | 📋 Planned | — | Packages, interfaces, struct methods |
| Rust | 📋 Planned | — | Structs, traits, modules |
| C# | 📋 Planned | — | Classes, interfaces, .NET projects |
Want to add a language? The template-based test system lets you contribute by writing YAML specs — no Python knowledge required. See CONTRIBUTING.md for details.
Framework Route Extraction
| Framework | Language | Status |
|---|---|---|
| ThinkPHP | PHP | ✅ Stable (v0.5.0) |
| Spring Boot | Java | ✅ Stable (v0.8.0) |
| Laravel | PHP | 📋 Planned |
| FastAPI | Python | 📋 Planned |
| Django | Python | 📋 Planned |
| Express.js | JS/TS | 📋 Planned |
How It Works
Directory → Scanner → Parser (tree-sitter) → Smart Writer → README_AI.md
- Scanner — walks directories, filters by config patterns
- Parser — extracts symbols (classes, functions, imports, calls, inheritance) via tree-sitter
- Smart Writer — generates tiered documentation with size limits (≤50KB)
- Output — README_AI.md optimized for AI consumption, or JSON for tool integration
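The stages above can be sketched end to end in a few functions. This is an illustration only: it uses Python's standard-library `ast` module as a stand-in for tree-sitter, handles Python files only, ignores include/exclude patterns, and writes a flat summary rather than tiered documentation. All function names are hypothetical.

```python
import ast
from pathlib import Path

MAX_BYTES = 50 * 1024  # the 50KB limit mentioned for the Smart Writer


def scan_directory(root: Path) -> list[Path]:
    """Scanner: collect Python files under root."""
    return sorted(root.rglob("*.py"))


def parse_symbols(source: str) -> list[str]:
    """Parser: list top-level classes and functions (ast, not tree-sitter)."""
    symbols = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            symbols.append(f"class {node.name}")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols.append(f"def {node.name}")
    return symbols


def write_summary(symbols_by_file: dict[str, list[str]], max_bytes: int = MAX_BYTES) -> str:
    """Writer: render Markdown, stopping before the size limit is exceeded."""
    lines = ["# README_AI", ""]
    # Each line costs its UTF-8 length plus one newline byte.
    size = sum(len(line.encode("utf-8")) + 1 for line in lines)
    for name, symbols in symbols_by_file.items():
        section = [f"## {name}"] + [f"- {s}" for s in symbols] + [""]
        section_size = sum(len(line.encode("utf-8")) + 1 for line in section)
        if size + section_size > max_bytes:
            break
        lines.extend(section)
        size += section_size
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "class Auth:\n    pass\n\ndef login():\n    pass\n"
    print(write_summary({"auth.py": parse_symbols(sample)}))
```

To run it over a real directory, read each path returned by `scan_directory` and pass the text to `parse_symbols` before calling `write_summary`.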
Documentation
User Guides
| Guide | Description |
|---|---|
| Getting Started | Installation and first scan |
| Configuration Guide | All config options explained |
| Configuration Changelog | Version-by-version config changes |
| Advanced Usage | Parallel scanning, custom prompts |
| Git Hooks Integration | Automated quality checks and doc updates |
| Claude Code Integration | AI agent setup and MCP skills |
| JSON Output Integration | Machine-readable output for tools |
| LoomGraph Integration | Knowledge graph data pipeline |
Developer Guides
| Guide | Description |
|---|---|
| CONTRIBUTING.md | Development setup, TDD workflow, code style |
| CLAUDE.md | Quick reference for Claude Code and contributors |
| Design Philosophy | Core design principles and architecture |
| Release Automation | 5-minute automated release workflow |
| Multi-Language Support | Adding new language parsers |
| Language Support Contribution | Template-based test generation for new languages |
Planning
- Strategic Roadmap — long-term vision and priorities
- Changelog — version history and breaking changes
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
git clone https://github.com/dreamlx/codeindex.git
cd codeindex
pip install -e ".[dev,all]"
make install-hooks
make test
Release Process (Maintainers)
make release VERSION=0.17.0
# GitHub Actions: tests → PyPI publish → GitHub Release
See Release Automation Guide for details.
Roadmap
Current version: v0.17.1
Recent milestones:
- v0.17.0 — CLAUDE.md injection via codeindex init
- v0.16.0 — CLI UX restructuring (structural mode default, --ai opt-in)
- v0.15.0 — Template-based test architecture migration
- v0.14.0 — Interactive setup wizard, single file parse, parser modularization
Next:
- Framework routes expansion: Express, Laravel, FastAPI, Django (Epic 17)
- TypeScript parser implementation (Epic 15)
- Go, Rust, C# language support
Moved to LoomGraph:
- Code similarity search, refactoring suggestions, team collaboration, IDE integration
See Strategic Roadmap for detailed plans.
License
MIT License — see LICENSE file for details.
Acknowledgments
- tree-sitter — fast, incremental parsing
- Claude CLI — AI integration inspiration
- All contributors and users
Support
- Questions: GitHub Discussions
- Bugs: GitHub Issues
- Feature Requests: GitHub Issues
Made with ❤️ by the codeindex team