AI-native code indexing tool for large codebases

codeindex

Python 3.10+ | License: MIT

Universal Code Parser: a multi-language AST parser for AI-assisted development.

codeindex extracts symbols, inheritance relationships, call graphs, and imports from Python, PHP, Java, TypeScript, and JavaScript using tree-sitter. Perfect for feeding structured code data to AI tools, knowledge graphs, and code intelligence platforms.


For LoomGraph Developers: FOR_LOOMGRAPH.md (quick start) | docs/guides/loomgraph-integration.md (full guide)


Features

  • Multi-language AST parsing — Python, PHP, Java, TypeScript, JavaScript via tree-sitter (Go, Rust, C# planned)
  • AI-powered documentation — Generate README files using Claude, GPT, or any AI CLI
  • Single file parse: codeindex parse <file> with JSON output for tool integration
  • Structured JSON output: --output json for CI/CD, knowledge graphs, and downstream tools
  • Call relationship extraction — Function/method call graphs across Python, Java, PHP, TypeScript, JavaScript
  • Inheritance extraction — Class hierarchy and interface relationships
  • Framework route extraction — ThinkPHP and Spring Boot route tables (more planned)
  • Technical debt analysis — Detect large files, god classes, symbol overload
  • Smart indexing — Tiered documentation (overview → navigation → detailed) optimized for AI agents
  • Adaptive symbol extraction — Dynamic 5–150 symbols per file based on size
  • CLAUDE.md injection: codeindex init auto-configures Claude Code integration (v0.17.0)
  • Template-based test generation — YAML + Jinja2 for rapid language support (88–91% time savings)
  • Parallel scanning — Concurrent directory processing with configurable workers
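
The parallel scanning feature above can be illustrated with a minimal sketch built on Python's standard concurrent.futures module. This is not codeindex's implementation: scan_directory, scan_all, and the workers parameter are hypothetical names, and the per-directory job here only counts Python files.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import tempfile


def scan_directory(directory: Path) -> tuple[str, int]:
    """Hypothetical per-directory job: count the Python files directly inside."""
    return directory.name, sum(1 for p in directory.iterdir() if p.suffix == ".py")


def scan_all(root: Path, workers: int = 4) -> dict[str, int]:
    """Scan every immediate subdirectory of root with a pool of worker threads."""
    directories = sorted(p for p in root.iterdir() if p.is_dir())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with directories.
        return dict(pool.map(scan_directory, directories))


def demo_parallel() -> dict[str, int]:
    """Build a throwaway tree with two directories and scan it with two workers."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name, files in {"auth": ["login.py", "token.py"], "utils": ["io.py"]}.items():
            (root / name).mkdir()
            for filename in files:
                (root / name / filename).write_text("pass\n")
        return scan_all(root, workers=2)
```

Threads suit this kind of job when most of the time is spent reading files; a CPU-heavy parser might be better served by a process pool.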

Installation

codeindex uses lazy loading — language parsers are only imported when needed.

Quick Install

# All languages (recommended)
pip install "ai-codeindex[all]"

# Or specific languages only (quotes keep shells such as zsh from expanding the brackets)
pip install "ai-codeindex[python]"
pip install "ai-codeindex[php]"
pip install "ai-codeindex[java]"
pip install "ai-codeindex[typescript]"
pip install "ai-codeindex[python,php]"

Using pipx (Recommended for CLI use)

pipx install "ai-codeindex[all]"

From Source

git clone https://github.com/dreamlx/codeindex.git
cd codeindex
pip install -e ".[all]"

Quick Start

1. Initialize Your Project

cd /your/project
codeindex init

This creates:

  • .codeindex.yaml — scan configuration (languages, include/exclude patterns)
  • CLAUDE.md — injects codeindex instructions so Claude Code uses README_AI.md automatically
  • CODEINDEX.md — project-level documentation reference

2. Scan Your Codebase

# Scan all directories (structural documentation, no AI needed)
codeindex scan-all

# Scan a single directory
codeindex scan ./src/auth

# AI-enhanced documentation (requires ai_command in config)
codeindex scan-all --ai

# Preview AI prompt without executing
codeindex scan ./src/auth --ai --dry-run

3. Check Status

codeindex status
Indexing Status
───────────────────────────────
✅ src/auth/
✅ src/utils/
⚠️  src/api/ (no README_AI.md)
Indexed: 2/3 (67%)
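
The status report above treats a directory as indexed when it contains a README_AI.md file. A minimal sketch of that check follows, assuming nothing about how codeindex computes it internally; indexing_status and demo_status are hypothetical names.

```python
from pathlib import Path
import tempfile


def indexing_status(root: Path, directories: list[str]) -> tuple[list[tuple[str, bool]], int]:
    """Return (directory, has README_AI.md) pairs and the rounded percentage indexed."""
    rows = [(d, (root / d / "README_AI.md").is_file()) for d in directories]
    indexed = sum(1 for _, ok in rows if ok)
    percent = round(100 * indexed / len(rows)) if rows else 0
    return rows, percent


def demo_status() -> tuple[list[tuple[str, bool]], int]:
    """Recreate the three-directory layout from the sample output in a temp folder."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("src/auth", "src/utils", "src/api"):
            (root / name).mkdir(parents=True)
        for name in ("src/auth", "src/utils"):
            (root / name / "README_AI.md").write_text("# index\n")
        return indexing_status(root, ["src/auth", "src/utils", "src/api"])
```

With two of three directories indexed, the rounded percentage is 67, matching the sample output.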

4. Generate Indexes

# Global symbol index (PROJECT_SYMBOLS.md)
codeindex symbols

# Module overview (PROJECT_INDEX.md)
codeindex index

# Git change impact analysis
codeindex affected --since HEAD~5
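
To make the idea of change impact concrete, here is a rough sketch that maps a list of changed files (such as the output of git diff --name-only HEAD~5) to the directories whose documentation may be stale. It is an illustration only, not the algorithm behind codeindex affected; affected_directories and its suffixes default are hypothetical.

```python
from pathlib import PurePosixPath


def affected_directories(
    changed_files: list[str],
    suffixes: tuple[str, ...] = (".py", ".php", ".java", ".ts", ".js"),
) -> list[str]:
    """Return the sorted set of directories that contain a changed source file."""
    directories = {
        str(PurePosixPath(path).parent)
        for path in changed_files
        # Ignore files that are not source code, such as Markdown docs.
        if PurePosixPath(path).suffix in suffixes
    }
    return sorted(directories)
```

For example, changes to src/auth/login.py, src/auth/token.py, docs/guide.md, and src/api/routes.ts reduce to the two directories src/api and src/auth.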

More Commands

Command | Description | Guide
codeindex scan --output json | JSON output for tools | JSON Output Guide
codeindex parse <file> | Parse single file to JSON | LoomGraph Integration
codeindex tech-debt ./src | Technical debt analysis | Advanced Usage
codeindex hooks install | Git hooks for auto-update | Git Hooks Guide
codeindex config explain <param> | Parameter help | Configuration Guide
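
A downstream tool might call the single-file parser from Python roughly as follows. This sketch assumes that codeindex parse <file> writes its JSON to standard output and exits non-zero on failure; it assumes nothing about the JSON schema. parse_file and its run parameter are hypothetical helpers, with run made injectable so the function can be exercised without codeindex installed.

```python
import json
import subprocess
from typing import Any, Callable


def parse_file(path: str, run: Callable[..., Any] = subprocess.run) -> Any:
    """Run `codeindex parse <file>` and decode its standard output as JSON."""
    completed = run(
        ["codeindex", "parse", path],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return json.loads(completed.stdout)
```

Consult the JSON Output Guide for the actual structure of the returned data before relying on specific keys.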

Claude Code Integration

Since v0.17.0, codeindex init automatically injects instructions into your project's CLAUDE.md so that Claude Code reads README_AI.md files first, with no manual setup required.

# One command sets everything up
codeindex init

# Claude Code will now:
# ✅ Read README_AI.md before searching source files
# ✅ Use structured indexes for architecture understanding
# ✅ Navigate code via Serena MCP tools (find_symbol, etc.)

For manual setup, MCP skills (/mo:arch, /mo:index), and Git hooks integration, see the Claude Code Integration Guide.


Language Support

Language | Status | Since | Key Features
Python | ✅ Supported | v0.1.0 | Classes, functions, methods, imports, docstrings, inheritance, calls
PHP | ✅ Supported | v0.5.0 | Classes (extends/implements), methods, properties, PHPDoc, inheritance, calls
Java | ✅ Supported | v0.7.0 | Classes, interfaces, enums, records, annotations, Spring routes, Lombok, calls
TypeScript/JS | ✅ Supported | v0.19.0 | Classes, interfaces, enums, type aliases, arrow functions, JSX/TSX, imports/exports, calls
Go | 📋 Planned | - | Packages, interfaces, struct methods
Rust | 📋 Planned | - | Structs, traits, modules
C# | 📋 Planned | - | Classes, interfaces, .NET projects

Want to add a language? The template-based test system lets you contribute by writing YAML specs — no Python knowledge required. See CONTRIBUTING.md for details.

Framework Route Extraction

Framework | Language | Status
ThinkPHP | PHP | ✅ Stable (v0.5.0)
Spring Boot | Java | ✅ Stable (v0.8.0)
Laravel | PHP | 📋 Planned
FastAPI | Python | 📋 Planned
Django | Python | 📋 Planned
Express.js | JS/TS | 📋 Planned

How It Works

Directory → Scanner → Parser (tree-sitter) → Smart Writer → README_AI.md
  1. Scanner — walks directories, filters by config patterns
  2. Parser — extracts symbols (classes, functions, imports, calls, inheritance) via tree-sitter
  3. Smart Writer — generates tiered documentation with size limits (≤50KB)
  4. Output: README_AI.md optimized for AI consumption, or JSON for tool integration
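
The pipeline stages above can be sketched end to end in a few dozen lines. Two loud assumptions: Python's standard ast module stands in for tree-sitter (so this handles Python sources only), and the writer emits plain summary lines, not codeindex's tiered README_AI.md format. All function names here are hypothetical.

```python
import ast
import tempfile
from pathlib import Path

MAX_BYTES = 50 * 1024  # mirrors the 50KB size limit mentioned above


def scan(root: Path) -> list[Path]:
    """Scanner: collect Python files under root in a stable order."""
    return sorted(root.rglob("*.py"))


def parse(source: str) -> dict[str, list[str]]:
    """Parser stand-in: extract top-level symbols and imports with the ast module."""
    symbols: dict[str, list[str]] = {"classes": [], "functions": [], "imports": []}
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            bases = ", ".join(ast.unparse(base) for base in node.bases)
            symbols["classes"].append(f"{node.name}({bases})" if bases else node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols["functions"].append(node.name)
        elif isinstance(node, ast.Import):
            symbols["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            symbols["imports"].append(node.module or ".")
    return symbols


def write(root: Path, parsed: dict[Path, dict[str, list[str]]]) -> str:
    """Writer: render one line per file and stop before the byte budget is exceeded."""
    lines: list[str] = []
    used = 0
    for path, symbols in parsed.items():
        line = f"{path.relative_to(root).as_posix()}: " + "; ".join(
            f"{kind}={', '.join(names)}" for kind, names in symbols.items() if names
        )
        size = len(line.encode("utf-8")) + 1  # +1 for the newline separator
        if used + size > MAX_BYTES:
            break
        lines.append(line)
        used += size
    return "\n".join(lines)


def demo_pipeline() -> str:
    """Run scan -> parse -> write over a one-file temporary project."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "auth").mkdir()
        (root / "auth" / "user.py").write_text(
            "import os\n\nclass Base:\n    pass\n\n"
            "class User(Base):\n    pass\n\ndef login():\n    pass\n"
        )
        parsed = {path: parse(path.read_text()) for path in scan(root)}
        return write(root, parsed)
```

Keeping the three stages as separate functions mirrors the description above: the scanner and writer stay language-agnostic, and only the parser would need to change to support another language.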

Documentation

User Guides

Guide | Description
Getting Started | Installation and first scan
Configuration Guide | All config options explained
Advanced Usage | Parallel scanning, custom prompts
Git Hooks Integration | Automated quality checks and doc updates
Claude Code Integration | AI agent setup and MCP skills
JSON Output Integration | Machine-readable output for tools
LoomGraph Integration | Knowledge graph data pipeline

Developer Guides

Guide | Description
CONTRIBUTING.md | Development setup, TDD workflow, code style
CLAUDE.md | Quick reference for Claude Code and contributors
Design Philosophy | Core design principles and architecture
Release Automation | 5-minute automated release workflow
Multi-Language Support | Adding new language parsers
Language Support Contribution | Template-based test generation for new languages


Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

git clone https://github.com/dreamlx/codeindex.git
cd codeindex
pip install -e ".[dev,all]"
make install-hooks
make test

Release Process (Maintainers)

make release VERSION=0.17.0
# GitHub Actions: tests → PyPI publish → GitHub Release

See Release Automation Guide for details.


Roadmap

Current version: v0.20.0

Recent milestones:

  • v0.17.0 — CLAUDE.md injection via codeindex init
  • v0.16.0 — CLI UX restructuring (structural mode default, --ai opt-in)
  • v0.15.0 — Template-based test architecture migration
  • v0.14.0 — Interactive setup wizard, single file parse, parser modularization

Next:

  • Framework routes expansion: Express, Laravel, FastAPI, Django (Epic 17)
  • Go, Rust, C# language support

Moved to LoomGraph:

  • Code similarity search, refactoring suggestions, team collaboration, IDE integration

See Strategic Roadmap for detailed plans.


License

MIT License — see LICENSE file for details.

Acknowledgments

  • tree-sitter — fast, incremental parsing
  • Claude CLI — AI integration inspiration
  • All contributors and users


Made with ❤️ by the codeindex team
