MCP server that tracks file descriptions across codebases, enabling AI agents to efficiently navigate and understand code through searchable summaries and token-aware overviews.

MCP Code Indexer 🚀

A production-ready Model Context Protocol (MCP) server that helps AI agents navigate and understand codebases. Built for high-concurrency environments with database resilience features, the server provides fast access to stored file descriptions, full-text search, and size-aware navigation recommendations while sustaining 800+ writes/sec.

🎯 What It Does

The MCP Code Indexer solves a critical problem for AI agents working with large codebases: understanding code structure without repeatedly scanning files. Instead of reading every file, agents can:

Query file purposes instantly with natural language descriptions
Search across codebases using full-text search
Get intelligent recommendations based on codebase size (overview vs search)
Merge branch descriptions with conflict resolution
Inherit descriptions from upstream repositories automatically

Perfect for AI-powered code review, refactoring tools, documentation generation, and codebase analysis workflows.

⚡ Quick Start

👨‍💻 For Developers

Get started integrating MCP Code Indexer into your AI agent workflow:

# Install the package
pip install mcp-code-indexer

# Start the MCP server
mcp-code-indexer

# Connect your MCP client and start using tools
# See API Reference for complete tool documentation

🔧 For System Administrators

Deploy and configure the server for your team:

# Production deployment with custom settings
mcp-code-indexer \
  --token-limit 64000 \
  --db-path /data/mcp-index.db \
  --cache-dir /var/cache/mcp \
  --log-level INFO

# Check installation
mcp-code-indexer --version

🎯 For Everyone

New to MCP Code Indexer? Start here:

Install: pip install mcp-code-indexer
Run: mcp-code-indexer --token-limit 32000
Connect: Use your favorite MCP client
Explore: Try the check_codebase_size tool first

Development Setup:

# Clone and setup for contributing
git clone https://github.com/fluffypony/mcp-code-indexer.git
cd mcp-code-indexer

# Install in development mode (required)
pip install -e .

# Run the server
mcp-code-indexer --token-limit 32000

🔗 Git Hook Integration

🚀 NEW Feature: Automated code indexing with AI-powered analysis! Keep your file descriptions synchronized automatically as your codebase evolves.

👤 For Users: Quick Setup

# Set your OpenRouter API key
export OPENROUTER_API_KEY="sk-or-v1-your-api-key-here"

# Test git hook functionality
mcp-code-indexer --githook

# Install post-commit hook
cp examples/git-hooks/post-commit .git/hooks/
chmod +x .git/hooks/post-commit

👨‍💻 For Developers: How It Works

The git hook integration provides intelligent automation:

📊 Git Analysis: Automatically analyzes git diffs after commits/merges
🤖 AI Processing: Uses OpenRouter API with Anthropic's Claude Sonnet 4
⚡ Smart Updates: Only processes files that actually changed
🔄 Overview Maintenance: Updates project overview when structure changes
🛡️ Error Isolation: Git operations continue even if indexing fails
⏱️ Rate Limiting: Built-in retry logic with exponential backoff
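
The retry behaviour in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `call_with_backoff` is a hypothetical helper, and the attempt count and delays are assumed values. The project lists tenacity among its dependencies, so its real retry policy is likely configured there.

```python
import asyncio
import random


async def call_with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=30.0):
    """Retry an async operation, doubling the wait after each failure.

    Hypothetical helper for illustration; the parameter names and
    defaults are assumptions, not the project's real settings.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay grows as base_delay * 2^(attempt - 1), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries from concurrent callers.
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))
```

Capping the delay and adding jitter keeps a burst of rate-limited requests from retrying in lockstep.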

🎯 Key Benefits

💡 Zero Manual Work: Descriptions stay current without any effort
⚡ Performance: Only analyzes changed files, not entire codebase
🔒 Reliability: Indexing errors are isolated, so they never block git operations
🎛️ Configurable: Support for custom models and timeout settings

Learn More: See Git Hook Setup Guide for complete configuration options and troubleshooting.

🔧 Development Setup

👨‍💻 For Contributors

Contributing to MCP Code Indexer? Follow these steps for a proper development environment:

# Setup development environment
git clone https://github.com/fluffypony/mcp-code-indexer.git
cd mcp-code-indexer

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install package in editable mode (REQUIRED for development)
pip install -e .

# Install development dependencies
pip install -e ".[dev]"  # quotes keep shells such as zsh from expanding the brackets

# Verify installation
python main.py --help
mcp-code-indexer --version

⚠️ Important: The editable install (pip install -e .) is required for development. The project uses proper PyPI package structure with absolute imports like from mcp_code_indexer.database.database import DatabaseManager. Without editable installation, you'll get ModuleNotFoundError exceptions.
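
A quick way to confirm that the package is importable before running the server or the tests is sketched below. `is_importable` is a hypothetical helper built on the standard library; only the package name `mcp_code_indexer` comes from the import shown above.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    name = "mcp_code_indexer"
    if is_importable(name):
        print(f"{name} is importable")
    else:
        print(f"{name} not found; run `pip install -e .` from the repository root")
```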

🎯 Development Workflow

# Activate virtual environment
source venv/bin/activate

# Run the server directly
python main.py --token-limit 32000

# Or use the installed CLI command
mcp-code-indexer --token-limit 32000

# Run tests
python -m pytest tests/ -v

# Run with coverage
python -m pytest tests/ --cov=src --cov-report=html

# Format code
black src/ tests/
isort src/ tests/

# Type checking
mypy src/

🛠️ MCP Tools Available

The server provides 12 powerful MCP tools for intelligent codebase management. Whether you're an AI agent or human developer, these tools make navigating code effortless.

🎯 For Everyone: Start Here

check_codebase_size - Get instant recommendations for how to navigate your codebase
search_descriptions - Find files by what they do, not just their names
get_codebase_overview - Get a high-level understanding of any project

👨‍💻 For Developers: Core Operations

get_file_description - Retrieve stored file descriptions instantly
update_file_description - Store detailed file summaries and metadata
find_missing_descriptions - Scan projects for files without descriptions
update_missing_descriptions - Bulk update multiple file descriptions

🔍 For Advanced Users: Search & Discovery

get_all_descriptions - Complete hierarchical project structure
get_word_frequency - Technical vocabulary analysis with stop-word filtering
merge_branch_descriptions - Two-phase merge with conflict resolution
update_codebase_overview - Create comprehensive codebase documentation
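
As a rough mental model of the two-phase merge, the sketch below first detects conflicting descriptions and then applies caller-supplied resolutions. The function names, the dictionary layout (file path to description), and the exact conflict rule are assumptions made for illustration; see the API Reference for the real tool parameters.

```python
def find_conflicts(source, target):
    """Phase 1: list paths whose descriptions differ between the two branches."""
    return sorted(
        path for path in source
        if path in target and source[path] != target[path]
    )


def apply_merge(source, target, resolutions):
    """Phase 2: merge source into target, using one resolution per conflict."""
    conflicts = find_conflicts(source, target)
    missing = [path for path in conflicts if path not in resolutions]
    if missing:
        raise ValueError(f"unresolved conflicts: {missing}")
    merged = dict(target)
    merged.update(source)  # new and unchanged entries come from source
    for path in conflicts:
        merged[path] = resolutions[path]  # resolved text wins for conflicting paths
    return merged
```

Splitting detection from application lets an agent review every conflict before anything is written.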

🏥 For System Monitoring: Health & Performance

check_database_health - Real-time database health monitoring and diagnostics

💡 Pro Tip: Always start with check_codebase_size to get personalized recommendations for navigating your specific codebase.
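
The overview-versus-search decision can be pictured with the sketch below. It is an illustration under stated assumptions: `estimate_tokens` uses a crude four-characters-per-token rule (the project lists tiktoken for real tokenization), `recommend_navigation` is a hypothetical name, and only the `isLarge` field and the 32000 default token limit appear elsewhere in this README.

```python
def estimate_tokens(text):
    """Rough token estimate (about 4 characters per token); an assumption."""
    return max(1, len(text) // 4)


def recommend_navigation(descriptions, token_limit=32000):
    """Suggest 'overview' or 'search' from the total size of stored descriptions."""
    total = sum(
        estimate_tokens(f"{path} {text}") for path, text in descriptions.items()
    )
    is_large = total > token_limit
    return {
        "totalTokens": total,  # hypothetical field name
        "isLarge": is_large,
        "recommendation": "search" if is_large else "overview",  # hypothetical field name
    }
```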

🔗 Git Hook Integration

Keep your codebase documentation synchronized with automated analysis on every commit, rebase, or merge:

# Analyze current staged changes
mcp-code-indexer --githook

# Analyze a specific commit
mcp-code-indexer --githook abc123def

# Analyze a commit range (perfect for rebases)
mcp-code-indexer --githook abc123 def456

🎯 Perfect for:

Automated documentation that never goes stale
Rebase-aware analysis that handles complex git operations
Zero-effort maintenance with background processing

See the Git Hook Setup Guide for complete installation instructions including post-commit, post-merge, and post-rewrite hooks.
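
To illustrate how a hook can limit work to files that actually changed, the sketch below parses the output of `git diff --name-status` into paths to re-describe and paths to drop. `parse_name_status` is a hypothetical helper; how the indexer itself reads git diffs is not documented here.

```python
def parse_name_status(diff_output):
    """Split `git diff --name-status` output into (changed, removed) path lists."""
    changed, removed = [], []
    for line in diff_output.splitlines():
        status, *paths = line.split("\t")
        if not paths:
            continue  # blank or malformed line
        if status.startswith("D"):
            removed.append(paths[0])
        elif status.startswith("R") and len(paths) == 2:
            # Renames list the old path first, then the new one.
            removed.append(paths[0])
            changed.append(paths[1])
        elif status.startswith("C") and len(paths) == 2:
            changed.append(paths[1])  # copies keep the original file
        else:
            changed.append(paths[0])  # A (added), M (modified), T (type change)
    return changed, removed
```

For a commit range, the input would come from something like `git diff --name-status abc123 def456`.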

🏗️ Architecture Highlights

🚀 Performance Optimized

SQLite with WAL mode for high-concurrency access (800+ writes/sec)
Smart connection pooling with optimized pool size (3 connections default)
FTS5 full-text search with prefix indexing for sub-100ms queries
Token-aware caching to minimize expensive operations
Write operation serialization to eliminate database lock conflicts
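
The WAL and FTS5 points above can be tried out with the standard library alone. The sketch below is not the project's schema: the table name `file_descriptions`, its columns, and the prefix sizes are assumptions, and it requires a Python build whose SQLite library includes FTS5.

```python
import sqlite3


def open_index(db_path):
    """Open a database in WAL mode and ensure an FTS5 table with prefix indexes."""
    conn = sqlite3.connect(db_path)
    # WAL lets readers proceed while a single writer appends to the log.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS file_descriptions "
        "USING fts5(file_path, description, prefix='2 3')"
    )
    return conn, mode


def search(conn, query, limit=10):
    """Return matching file paths, best match first."""
    rows = conn.execute(
        "SELECT file_path FROM file_descriptions "
        "WHERE file_descriptions MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in rows]
```

A query such as `auth*` matches any token starting with "auth", which is the case the prefix indexes speed up.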

🛡️ Production Ready

Database resilience features with <2% error rate under high load
Exponential backoff retry logic with intelligent failure recovery
Comprehensive health monitoring with automatic pool refresh
Structured JSON logging with performance metrics tracking
Async-first design with proper resource cleanup
MCP protocol compliant with clean stdio streams
Upstream inheritance for fork workflows
Git integration with .gitignore support

👨‍💻 Developer Friendly

95%+ test coverage with async support and concurrent access tests
Integration tests for complete workflows including database stress testing
Performance benchmarks for large codebases with resilience validation
Clear error messages with MCP protocol compliance
Comprehensive configuration options for production tuning

📖 Documentation

👤 For Users

Git Hook Setup Guide - Automated code indexing setup
Configuration Guide - Production deployment and tuning

👨‍💻 For Developers

API Reference - Complete MCP tool documentation with examples
Architecture Overview - Technical deep dive into system design
Database Resilience Guide - Advanced database optimization and monitoring

🔧 For System Administrators

Performance Tuning Guide - High-concurrency deployment optimization
Monitoring & Diagnostics - Production monitoring setup and troubleshooting

🤝 For Contributors

Contributing Guide - Development setup and workflow guidelines

🚦 System Requirements

Python 3.8+ with asyncio support
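SQLite 3.35+ (Python's built-in sqlite3 module uses the SQLite library it was built against; check the version with python -c "import sqlite3; print(sqlite3.sqlite_version)")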
4GB+ RAM for large codebases (1000+ files)
SSD storage recommended for optimal performance

📊 Performance

Tested with codebases up to 10,000 files:

File description retrieval: < 10ms
Full-text search: < 100ms
Codebase overview generation: < 2s
Merge conflict detection: < 5s

🔧 Advanced Configuration

👨‍💻 For Developers: Basic Configuration

# Production setup with custom limits
mcp-code-indexer \
  --token-limit 50000 \
  --db-path /data/mcp-index.db \
  --cache-dir /tmp/mcp-cache \
  --log-level INFO

# Enable structured logging
export MCP_LOG_FORMAT=json
mcp-code-indexer

🔧 For System Administrators: Database Resilience Tuning

Configure advanced database resilience features for high-concurrency environments:

# High-performance production deployment
mcp-code-indexer \
  --token-limit 64000 \
  --db-path /data/mcp-index.db \
  --cache-dir /var/cache/mcp \
  --log-level INFO \
  --db-pool-size 5 \
  --db-retry-count 7 \
  --db-timeout 15.0 \
  --enable-wal-mode \
  --health-check-interval 20.0

# Environment variable configuration
export DB_POOL_SIZE=5
export DB_RETRY_COUNT=7
export DB_TIMEOUT=15.0
export DB_WAL_MODE=true
export DB_HEALTH_CHECK_INTERVAL=20.0
mcp-code-indexer --token-limit 64000

Configuration Options

Parameter	Default	Description	Use Case
`--db-pool-size`	3	Database connection pool size	Higher for more concurrent clients
`--db-retry-count`	5	Max retry attempts for failed operations	Increase for unstable environments
`--db-timeout`	10.0	Transaction timeout (seconds)	Increase for large operations
`--enable-wal-mode`	true	Enable WAL mode for concurrency	Always enable for production
`--health-check-interval`	30.0	Health monitoring interval (seconds)	Lower for faster issue detection

💡 Performance Tip: For environments with 10+ concurrent clients, use --db-pool-size 5 and --health-check-interval 15.0 for optimal throughput.
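
For readers wiring these settings into their own tooling, the sketch below loads the documented environment variables into a typed object, falling back to the defaults from the table. The `DatabaseSettings` class and the accepted boolean spellings are assumptions; how the server itself resolves flags versus environment variables is not specified here.

```python
import os
from dataclasses import dataclass


def _as_bool(value):
    """Interpret common truthy spellings; the accepted set is an assumption."""
    return str(value).strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class DatabaseSettings:
    pool_size: int = 3
    retry_count: int = 5
    timeout: float = 10.0
    wal_mode: bool = True
    health_check_interval: float = 30.0

    @classmethod
    def from_env(cls, env=None):
        """Build settings from DB_* variables, using table defaults when unset."""
        env = os.environ if env is None else env
        d = cls()
        return cls(
            pool_size=int(env.get("DB_POOL_SIZE", d.pool_size)),
            retry_count=int(env.get("DB_RETRY_COUNT", d.retry_count)),
            timeout=float(env.get("DB_TIMEOUT", d.timeout)),
            wal_mode=_as_bool(env.get("DB_WAL_MODE", d.wal_mode)),
            health_check_interval=float(
                env.get("DB_HEALTH_CHECK_INTERVAL", d.health_check_interval)
            ),
        )
```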

🤝 Integration Examples

With AI Agents

# Example: AI agent using MCP tools
# `mcp_client` is assumed to be an already connected MCP client with an async call_tool()
async def analyze_codebase(mcp_client, project_path):
    # Arguments shared by every tool call below
    base_args = {
        "projectName": "my-project",
        "folderPath": project_path,
        "branch": "main",
    }

    # Check if codebase is large
    size_info = await mcp_client.call_tool("check_codebase_size", base_args)

    if size_info["isLarge"]:
        # Use search for large codebases
        return await mcp_client.call_tool("search_descriptions", {
            **base_args,
            "query": "authentication logic",
        })

    # Get full overview for smaller projects
    return await mcp_client.call_tool("get_codebase_overview", base_args)

With CI/CD Pipelines

# Example: GitHub Actions integration
- name: Update Code Descriptions
  run: |
    python -c "
    import asyncio
    from mcp_client import MCPClient
    
    async def update_descriptions():
        client = MCPClient('mcp-code-indexer')
        
        # Find files without descriptions
        missing = await client.call_tool('find_missing_descriptions', {
            'projectName': '${{ github.repository }}',
            'folderPath': '.',
            'branch': '${{ github.ref_name }}'
        })
        
        # Process with AI and update...
    
    asyncio.run(update_descriptions())
    "

🧪 Testing

# Install with test dependencies
pip install "mcp-code-indexer[test]"

# Run full test suite
python -m pytest tests/ -v

# Run with coverage
python -m pytest tests/ --cov=src --cov-report=html

# Run performance tests
python -m pytest tests/ -m performance

# Run integration tests only
python -m pytest tests/integration/ -v

📈 Monitoring

The server provides structured JSON logs for monitoring:

{
  "timestamp": "2024-01-15T10:30:00Z",
  "level": "INFO",
  "message": "Tool search_descriptions completed",
  "tool_usage": {
    "tool_name": "search_descriptions",
    "success": true,
    "duration_seconds": 0.045,
    "result_size": 1247
  }
}
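
A log consumer might aggregate these records per tool, as in the sketch below. It assumes the server writes one JSON object per line (the record above is pretty-printed for readability) and relies only on the `tool_usage` fields shown; `summarize_tool_usage` is a hypothetical helper.

```python
import json


def summarize_tool_usage(log_lines):
    """Aggregate call counts, failures, and mean duration for each tool."""
    stats = {}
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        usage = record.get("tool_usage") if isinstance(record, dict) else None
        if not usage:
            continue  # not a tool-usage record
        entry = stats.setdefault(
            usage["tool_name"], {"calls": 0, "failures": 0, "total_seconds": 0.0}
        )
        entry["calls"] += 1
        entry["failures"] += 0 if usage.get("success") else 1
        entry["total_seconds"] += usage.get("duration_seconds", 0.0)
    return {
        name: {
            "calls": e["calls"],
            "failures": e["failures"],
            "mean_seconds": e["total_seconds"] / e["calls"],
        }
        for name, e in stats.items()
    }
```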

📋 Command Line Options

Server Mode (Default)

mcp-code-indexer [OPTIONS]

Options:
  --token-limit INT     Maximum tokens before recommending search (default: 32000)
  --db-path PATH        SQLite database path (default: ~/.mcp-code-index/tracker.db)
  --cache-dir PATH      Cache directory path (default: ~/.mcp-code-index/cache)
  --log-level LEVEL     Logging level: DEBUG|INFO|WARNING|ERROR|CRITICAL (default: INFO)

Git Hook Mode

mcp-code-indexer --githook [OPTIONS]

# Automated analysis of git changes using OpenRouter API
# Requires: OPENROUTER_API_KEY environment variable

Utility Commands

# List all projects and branches
mcp-code-indexer --getprojects

# Execute MCP tool directly
mcp-code-indexer --runcommand '{"method": "tools/call", "params": {...}}'

# Export descriptions for a project
mcp-code-indexer --dumpdescriptions PROJECT_ID [BRANCH]

🛡️ Security Features

Input validation on all MCP tool parameters
SQL injection protection via parameterized queries
File system sandboxing with .gitignore respect
Error sanitization to prevent information leakage
Async resource cleanup to prevent memory leaks
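
The parameterized-query point is worth seeing concretely. In the sketch below, user-supplied values are bound with `?` placeholders instead of being formatted into the SQL string; the `descriptions` table and its columns are assumptions for illustration, not the project's schema.

```python
import sqlite3


def get_description(conn, project, file_path):
    """Look up one description; values are bound, never formatted into the SQL text."""
    row = conn.execute(
        "SELECT description FROM descriptions WHERE project = ? AND file_path = ?",
        (project, file_path),
    ).fetchone()
    return row[0] if row else None
```

An input such as `x' OR '1'='1` is treated as a literal file path and simply matches nothing.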

🚀 Next Steps

Ready to supercharge your AI agents with intelligent codebase navigation?

👤 Getting Started

Install and run your first server - Get up and running in 2 minutes
Set up git hooks - Automate your workflow
Configure for production - Deploy for your team

👨‍💻 For Developers

Explore the API tools - Master all 12 MCP tools
Understand the architecture - Deep dive into the technical design

🤝 Join the Community

Contribute to the project - Help make it even better
Report issues on GitHub - Share feedback and suggestions

🤝 Contributing

We welcome contributions! See our Contributing Guide for:

Development setup
Code style guidelines
Testing requirements
Pull request process

📄 License

MIT License - see LICENSE for details.

🙏 Built With

Model Context Protocol - The foundation for tool integration
tiktoken - Fast BPE tokenization
aiosqlite - Async SQLite operations
aiohttp - Async HTTP client for OpenRouter API
tenacity - Robust retry logic and rate limiting
Pydantic - Data validation and settings

Transform how your AI agents understand code! 🚀

🎯 New User? Get started in 2 minutes
👨‍💻 Developer? Explore the complete API
🔧 Production? Deploy with confidence

mcp-code-indexer 3.0.2
