Fast code map generator for AI coding assistants - Save 99%+ tokens

Code Compass 🧭

Fast code map generator for AI coding assistants

Code Compass helps AI understand your codebase by generating concise, high-signal repository maps. It saves 99%+ tokens while preserving the most important context for AI-powered coding tasks.

License: MIT | Python 3.11+ | Tests: 44 passing


Why Code Compass?

When working with AI coding assistants (Claude, GPT, etc.), you face a fundamental problem: context limits. Sending your entire codebase is:

  • Expensive - Costs $0.47+ per query for medium projects
  • Slow - Takes 47+ seconds to process
  • Ineffective - AI gets overwhelmed by irrelevant details

Code Compass solves this by:

  • Saving 99%+ tokens - Only sends function signatures, not implementations
  • Identifying important files - Uses PageRank to rank files by importance
  • Fast indexing - Processes 800+ files/second
  • Smart caching - Only re-parses changed files

Installation

# Clone the repository
git clone https://github.com/Xiangyu-Li97/Code-Compass-v0.1.0-MVP.git
cd Code-Compass-v0.1.0-MVP

# Install the package and its dependencies (editable mode)
pip install -e .

Quick Start

# 1. Index your project
code-compass index /path/to/your/project

# 2. Generate a code map
code-compass map

# 3. Find a symbol
code-compass find ClassName --fuzzy

# 4. View statistics
code-compass stats

Usage

Index a Project

# Index current directory
code-compass index .

# Index a specific directory
code-compass index /path/to/project

# Force re-index all files
code-compass index . --force
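
Without --force, indexing re-parses only files that have changed. The cache schema is not documented here, so the following is only a minimal sketch of how such change detection could work, assuming a content hash stored per path in SQLite. The HashCache class, its table layout, and needs_reparse are hypothetical names for illustration, not the tool's actual API.

```python
import hashlib
import sqlite3


class HashCache:
    """Hypothetical change-detection cache: maps file path -> content hash."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS files "
            "(path TEXT PRIMARY KEY, digest TEXT NOT NULL)"
        )

    def needs_reparse(self, path: str, content: bytes, force: bool = False) -> bool:
        """Return True if the file is new, changed, or force is set."""
        digest = hashlib.sha256(content).hexdigest()
        row = self.conn.execute(
            "SELECT digest FROM files WHERE path = ?", (path,)
        ).fetchone()
        if not force and row is not None and row[0] == digest:
            return False  # unchanged since the last index run
        # Record the new hash so the next run can skip this file.
        self.conn.execute(
            "INSERT OR REPLACE INTO files (path, digest) VALUES (?, ?)",
            (path, digest),
        )
        self.conn.commit()
        return True


cache = HashCache()
print(cache.needs_reparse("api.py", b"x = 1"))  # first sighting: parse it
print(cache.needs_reparse("api.py", b"x = 1"))  # same content: skip it
```

Hashing file contents rather than comparing modification times is a design choice of this sketch; it avoids re-parsing files that were touched but not edited.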

Generate a Code Map

# Generate text format (default)
code-compass map

# Generate JSON format
code-compass map --format json

# Include top 30% of files
code-compass map --top 0.3

# Limit symbols per file
code-compass map --max-symbols 20

# Save to file
code-compass map -o repo_map.txt
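
The --top option takes a fraction of files to keep, ranked by importance. As a rough sketch of that selection step, assuming the count is rounded up and at least one file is always kept (both are assumptions, and select_top is a hypothetical helper), it might look like this. The utils.py and cli.py scores below are made up for the example.

```python
import math


def select_top(importance: dict[str, float], top: float) -> list[str]:
    """Return the highest-ranked files, keeping a `top` fraction of them."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    # Assumption: round up, and never return an empty map.
    keep = max(1, math.ceil(top * len(ranked)))
    return ranked[:keep]


scores = {"api.py": 1.138, "models.py": 0.856, "utils.py": 0.2, "cli.py": 0.1}
print(select_top(scores, 0.5))
```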

Find Symbols

# Exact match
code-compass find ClassName

# Fuzzy search
code-compass find Parser --fuzzy

# Show full signatures
code-compass find process_data -s

View Statistics

code-compass stats

Clear Cache

code-compass clear

Example Output

Text Format (for AI)

# Repository Map (Top 3 files, 15%)

## api.py (importance: 1.138)
⋮...
│def request(method, url, **kwargs):
│def get(url, params = None, **kwargs):
│def post(url, data = None, json = None, **kwargs):
⋮...

## models.py (importance: 0.856)
⋮...
│class User:
│  def __init__(self, name: str, email: str):
│  def save(self) -> bool:
⋮...

JSON Format (for tools)

{
  "total_files": 20,
  "included_files": 3,
  "files": [
    {
      "path": "api.py",
      "importance": 1.138,
      "symbols": [
        {
          "name": "request",
          "type": "function",
          "signature": "def request(method, url, **kwargs):",
          "line_start": 10
        }
      ]
    }
  ]
}
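
A tool consuming the JSON output only needs the fields shown above. The following sketch, which assumes the schema is exactly as in this example, flattens a map into one line per symbol. list_signatures is a hypothetical helper name.

```python
import json


def list_signatures(map_json: str) -> list[str]:
    """Flatten a JSON code map into 'path:line  signature' strings."""
    data = json.loads(map_json)
    lines = []
    for file_entry in data["files"]:
        for symbol in file_entry["symbols"]:
            lines.append(
                f'{file_entry["path"]}:{symbol["line_start"]}  {symbol["signature"]}'
            )
    return lines


sample = """
{
  "total_files": 20,
  "included_files": 3,
  "files": [
    {
      "path": "api.py",
      "importance": 1.138,
      "symbols": [
        {
          "name": "request",
          "type": "function",
          "signature": "def request(method, url, **kwargs):",
          "line_start": 10
        }
      ]
    }
  ]
}
"""
for line in list_signatures(sample):
    print(line)
```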

Performance Benchmarks

Tested on real-world open-source projects:

Project    Files   Symbols   Index Time   Speed     Token Savings
requests      18       277   0.04s        497 f/s   99.0%
flask         24       407   0.05s        542 f/s   99.6%
django       901    11,072   1.55s        863 f/s   83.0%

AI Workflow Validation (requests library):

  • Traditional method: ~46,923 tokens, $0.47, 47s
  • Code Compass: ~209 tokens, $0.002, 0.2s
  • Savings: 99.6% tokens, 99.6% cost, 99.6% time
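
The savings figures follow directly from the two measurements above. A short check of the arithmetic, using only the numbers listed here:

```python
def savings_percent(baseline: float, reduced: float) -> float:
    """Percentage saved when going from `baseline` to `reduced`."""
    return (1 - reduced / baseline) * 100


tokens = savings_percent(46_923, 209)
cost = savings_percent(0.47, 0.002)
seconds = savings_percent(47, 0.2)
print(f"tokens: {tokens:.1f}%  cost: {cost:.1f}%  time: {seconds:.1f}%")
# prints "tokens: 99.6%  cost: 99.6%  time: 99.6%"
```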

How It Works

  1. Parse - Extracts function/class signatures using Python AST
  2. Index - Caches results in SQLite for fast retrieval
  3. Analyze - Builds dependency graph and computes PageRank
  4. Generate - Selects top N% files and formats for AI
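
The Parse step can be illustrated with the standard library ast module. This is a simplified sketch rather than the project's parser: extract_signatures is a hypothetical function, and it ignores decorators, nesting depth, and line numbers.

```python
import ast


def extract_signatures(source: str) -> list[str]:
    """Collect class and function signatures, dropping all bodies."""
    tree = ast.parse(source)
    signatures = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            signatures.append(
                f"{prefix} {node.name}({ast.unparse(node.args)}){returns}:"
            )
        elif isinstance(node, ast.ClassDef):
            signatures.append(f"class {node.name}:")
    return signatures


code = '''
class User:
    def __init__(self, name: str, email: str):
        self.name = name
        self.email = email

    def save(self) -> bool:
        return True
'''
for signature in extract_signatures(code):
    print(signature)
```

Keeping only signatures is what drives the token savings: the bodies, which make up most of a file, never reach the map.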

Why PageRank?

PageRank identifies the most "important" files in your codebase by analyzing the dependency graph. Files that are imported by many other files get higher scores.

Example from Django:

  • db/models/functions/datetime.py (score: 41.1) - Core database functions
  • utils/copy.py (score: 17.1) - Widely-used utilities
  • utils/inspect.py (score: 16.9) - Reflection tools
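
To make the ranking idea concrete, here is a minimal power-iteration PageRank over an import graph. It is a textbook sketch, not the code in graph.py: the damping factor, iteration count, and normalization (scores sum to 1) are assumptions, so the absolute values will not match the scores shown above. The file names in the example graph are illustrative.

```python
def pagerank(
    imports: dict[str, list[str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> dict[str, float]:
    """Rank files; an edge A -> B means file A imports file B."""
    nodes = set(imports) | {t for targets in imports.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = imports.get(node, [])
            if targets:
                # A file passes its rank on to the files it imports.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A file with no imports spreads its rank evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


graph = {
    "cli.py": ["models.py", "cache.py"],
    "cache.py": ["models.py"],
    "models.py": [],
}
ranks = pagerank(graph)
for path in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{path}: {ranks[path]:.3f}")
```

In this graph models.py ranks first because both other files import it, which matches the intuition that widely imported files matter most.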

Architecture

code_compass/
├── models.py          # Data structures (Symbol, FileInfo, RepoMap)
├── parsers/
│   └── python_parser.py  # AST-based Python parser
├── cache.py           # SQLite cache manager
├── graph.py           # Dependency graph & PageRank
├── map_generator.py   # Core map generation logic
├── formatter.py       # Output formatters (text/JSON)
└── cli.py             # Command-line interface

Supported Languages

  • Python - Full support
  • 🚧 JavaScript/TypeScript - Coming soon
  • 🚧 Java - Planned
  • 🚧 Go - Planned

Limitations

  • Syntax errors: Files with syntax errors are skipped (not parsed)
  • Dynamic imports: importlib, __import__() not tracked
  • Reflection: getattr(), eval() not analyzed
  • Monorepos: Best used on single projects, not multi-project repos
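
The first two limitations follow from static analysis with ast. The sketch below (static_imports is a hypothetical helper, not part of the tool) shows why: a file that fails to parse yields nothing, and an importlib call is an ordinary function call rather than an import statement, so its target never appears in the result.

```python
import ast
from typing import Optional


def static_imports(source: str) -> Optional[list[str]]:
    """Return statically visible imports, or None if the file cannot be parsed."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return None  # the file is skipped entirely
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Leading dots mark a relative import such as "from .models import User".
            found.append("." * node.level + (node.module or ""))
    return found


source = (
    "import os\n"
    "from .models import User\n"
    "import importlib\n"
    "plugin = importlib.import_module('plugins.extra')\n"
)
print(static_imports(source))           # 'plugins.extra' is not listed
print(static_imports("def broken(:\n"))  # None
```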

Use Cases

1. AI-Powered Code Review

code-compass map > context.txt
# Send context.txt to AI: "Review this codebase for security issues"

2. Onboarding New Developers

code-compass map --top 0.1 > overview.txt
# New dev reads overview.txt to understand core modules

3. Refactoring Planning

code-compass find OldClassName --fuzzy
# Find all occurrences before renaming

4. Documentation Generation

code-compass map --format json | your-doc-generator
# Generate API docs from signatures

FAQ

Q: Why not use ctags or LSP?
A: ctags is too simple (it does not capture type annotations), and LSP is too heavy (it is designed for IDEs). Code Compass is optimized for AI context generation.

Q: Why AST instead of Tree-sitter?
A: The ast module is built into Python, needs no extra dependencies, and is 100% accurate for valid Python code. Tree-sitter is better for real-time editing, which isn't our use case.

Q: How is this different from Aider's repomap?
A: Code Compass is a standalone tool with caching, making it 10x+ faster for repeated queries. It can be integrated into any AI workflow.

Q: What about incomplete code?
A: Code Compass is designed for indexing stable codebases (e.g., git commits), not real-time editing. Files with syntax errors are skipped rather than causing the run to fail.


Testing

The test suite contains 44 test cases, all passing:

# Run all tests
./run_all_tests.sh

Test suites:

  • test_python_parser.py - AST parsing
  • test_cache.py - SQLite caching
  • test_formatter.py - Output formatting
  • test_relative_imports.py - Import resolution
  • test_type_annotations.py - Type annotation handling
  • test_cache_performance.py - Performance benchmarks
  • test_real_project.py - Integration tests


Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Priority areas:

  • JavaScript/TypeScript parser
  • Automatic file watching
  • Additional language support
  • Performance optimizations

License

MIT License - see LICENSE for details.


Acknowledgments

  • Inspired by Aider's repomap
  • PageRank algorithm by Larry Page and Sergey Brin
  • Built with ❤️ for the AI coding community
  • Special thanks to Gemini for rigorous code review

Project Status

Version: 0.1.0 MVP

Completed:

  • ✅ Python parser with full type annotation support
  • ✅ SQLite caching with WAL mode optimization
  • ✅ Dependency graph with PageRank
  • ✅ Relative import resolution
  • ✅ Map generator with text/JSON output
  • ✅ Complete CLI tool
  • ✅ Comprehensive test suite (44 tests)
  • ✅ Empirical validation on real projects

Next Steps:

  • 🔄 JavaScript/TypeScript support
  • 🔄 Automatic file watching
  • 🔄 Token budget optimization
  • 🔄 VSCode extension

Made with 🧭 by Xiangyu Li
