Semantic file search for AI workstations using HNSW indexing

File Compass

Semantic file search for AI workstations using HNSW vector indexing

Find files by describing what you're looking for, not just by name

Why File Compass?

| Problem | Solution |
|---------|----------|
| "Where's that database connection file?" | file-compass search "database connection handling" |
| Keyword search misses semantic matches | Vector embeddings understand meaning |
| Slow search across large codebases | HNSW index: <100ms for 10K+ files |
| Need to integrate with AI assistants | MCP server for Claude Code |

Quick Start

# Install
git clone https://github.com/mcp-tool-shop-org/file-compass.git
cd file-compass && pip install -e .

# Pull embedding model
ollama pull nomic-embed-text

# Index your code
file-compass index -d "C:/Projects"

# Search semantically
file-compass search "authentication middleware"

Features

  • Semantic Search - Find files by describing what you're looking for
  • Quick Search - Instant filename/symbol search (no embedding required)
  • Multi-Language AST - Tree-sitter support for Python, JS, TS, Rust, Go
  • Result Explanations - Understand why each result matched
  • Local Embeddings - Uses Ollama (no API keys needed)
  • Fast Search - HNSW indexing for sub-second queries
  • Git-Aware - Optionally filter to git-tracked files only
  • MCP Server - Integrates with Claude Code and other MCP clients
  • Security Hardened - Input validation, path traversal protection

Installation

# Clone the repository
git clone https://github.com/mcp-tool-shop-org/file-compass.git
cd file-compass

# Create virtual environment
python -m venv venv
venv\Scripts\activate  # Windows
# or: source venv/bin/activate  # Linux/Mac

# Install dependencies
pip install -e .

# Pull the embedding model
ollama pull nomic-embed-text

Requirements

  • Python 3.10+
  • Ollama with nomic-embed-text model
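
Both requirements can be checked before indexing. The following is a minimal sketch, not part of File Compass: the helper names (`python_ok`, `has_model`, `fetch_tags`) are hypothetical, and it assumes a local Ollama server that answers its model-listing endpoint (`/api/tags`) at the default URL.

```python
import json
import sys
import urllib.request


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the interpreter is Python 3.10 or newer."""
    return tuple(version_info[:2]) >= (3, 10)


def has_model(tags_payload: dict, model: str = "nomic-embed-text") -> bool:
    """Return True if the model appears in an Ollama /api/tags payload.

    Ollama reports names with a tag suffix (for example
    "nomic-embed-text:latest"), so the part before ":" is compared too.
    """
    for entry in tags_payload.get("models", []):
        name = entry.get("name", "")
        if name == model or name.split(":", 1)[0] == model:
            return True
    return False


def fetch_tags(base_url: str = "http://localhost:11434", timeout: float = 5.0) -> dict:
    """Fetch the list of locally available models from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)
```

With Ollama running, `has_model(fetch_tags())` should return True once `ollama pull nomic-embed-text` has completed.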

Usage

Build the Index

# Index a directory
file-compass index -d "C:/Projects"

# Index multiple directories
file-compass index -d "C:/Projects" "D:/Code"

Search Files

# Semantic search
file-compass search "database connection handling"

# Filter by file type
file-compass search "training loop" --types python

# Git-tracked files only
file-compass search "API endpoints" --git-only

Quick Search (No Embeddings)

# Build the quick index for filename/symbol search (no embeddings needed)
file-compass scan -d "C:/Projects"

Check Status

file-compass status

MCP Server

File Compass includes an MCP server for integration with Claude Code and other AI assistants.

Available Tools

| Tool | Description |
|------|-------------|
| file_search | Semantic search with explanations |
| file_preview | Code preview with syntax highlighting |
| file_quick_search | Fast filename/symbol search |
| file_quick_index_build | Build the quick search index |
| file_actions | Context, usages, related, history, symbols |
| file_index_status | Check index statistics |
| file_index_scan | Build or rebuild the full index |

Claude Code Integration

Add the server to your MCP client configuration (for example, claude_desktop_config.json):

{
  "mcpServers": {
    "file-compass": {
      "command": "python",
      "args": ["-m", "file_compass.gateway"],
      "cwd": "C:/path/to/file-compass"
    }
  }
}
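
If the configuration file already lists other servers, the entry has to be merged in rather than pasted over them. The sketch below shows one way to do that in Python; the function names are hypothetical, and the `cwd` value is the same placeholder path used above.

```python
import json


def with_file_compass(config: dict, cwd: str) -> dict:
    """Return a copy of an MCP client config with the file-compass server added.

    Existing entries under "mcpServers" are preserved; the input dict is
    not modified.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["file-compass"] = {
        "command": "python",
        "args": ["-m", "file_compass.gateway"],
        "cwd": cwd,
    }
    merged["mcpServers"] = servers
    return merged


def render(config: dict) -> str:
    """Serialize the config as indented JSON, ready to write back to disk."""
    return json.dumps(config, indent=2)
```

Load the existing file with `json.load`, pass the result through `with_file_compass`, and write `render(...)` back to the same path.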

Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| FILE_COMPASS_DIRECTORIES | F:/AI | Comma-separated directories |
| FILE_COMPASS_OLLAMA_URL | http://localhost:11434 | Ollama server URL |
| FILE_COMPASS_EMBEDDING_MODEL | nomic-embed-text | Embedding model |
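
To illustrate how these variables and defaults fit together, here is a minimal sketch of reading them from the environment. It is not the code in config.py; the `CompassSettings` class and `load_settings` function are hypothetical names, and only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass
class CompassSettings:
    directories: list[str]
    ollama_url: str
    embedding_model: str


def load_settings(env=None) -> CompassSettings:
    """Read the FILE_COMPASS_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    raw_dirs = env.get("FILE_COMPASS_DIRECTORIES", "F:/AI")
    # Comma-separated list; surrounding whitespace and empty entries are dropped.
    directories = [d.strip() for d in raw_dirs.split(",") if d.strip()]
    return CompassSettings(
        directories=directories,
        ollama_url=env.get("FILE_COMPASS_OLLAMA_URL", "http://localhost:11434"),
        embedding_model=env.get("FILE_COMPASS_EMBEDDING_MODEL", "nomic-embed-text"),
    )
```

Passing a plain dict as `env` makes the function easy to exercise without touching the real environment.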

How It Works

  1. Scanning - Discovers files matching configured extensions, respects .gitignore
  2. Chunking - Splits files into semantic pieces:
    • Python/JS/TS/Rust/Go: AST-aware via tree-sitter (functions, classes)
    • Markdown: Heading-based sections
    • JSON/YAML: Top-level keys
    • Other: Sliding window with overlap
  3. Embedding - Generates 768-dim vectors via Ollama
  4. Indexing - Stores vectors in HNSW index, metadata in SQLite
  5. Search - Embeds query, finds nearest neighbors, returns ranked results
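
Two of the steps above can be sketched in a few lines of plain Python: the sliding-window fallback from step 2 and the nearest-neighbor ranking from step 5. This is an illustration under stated assumptions, not the project's implementation. The function names are hypothetical, the window sizes are arbitrary, and the ranking uses exact brute-force cosine similarity as a stand-in for HNSW, which finds approximate neighbors without comparing against every vector.

```python
import math


def sliding_window_chunks(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters that share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_nearest(query_vec: list[float],
                 chunk_vecs: dict[str, list[float]],
                 k: int = 3) -> list[tuple[str, float]]:
    """Exact top-k search: score every chunk vector, highest similarity first."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In the real pipeline the vectors would be the 768-dimensional embeddings from Ollama; the brute-force loop scales linearly with the number of chunks, which is the cost an HNSW index is meant to avoid.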

Performance

| Metric | Value |
|--------|-------|
| Index Size | ~1KB per chunk |
| Search Latency | <100ms for 10K+ chunks |
| Quick Search | <10ms for filename/symbol |
| Embedding Speed | ~3-4s per chunk (local) |
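
Taken at face value, these figures give a rough idea of what an initial index costs. The helper below is a back-of-the-envelope sketch with a hypothetical name; it assumes chunks are embedded one at a time with no batching or parallelism, which may not match how the indexer actually runs.

```python
def estimate_index(num_chunks: int,
                   kb_per_chunk: float = 1.0,
                   seconds_per_chunk: tuple[float, float] = (3.0, 4.0)) -> dict[str, float]:
    """Estimate index size and sequential embedding time from the table's figures."""
    return {
        "index_size_mb": num_chunks * kb_per_chunk / 1024,
        "embed_hours_low": num_chunks * seconds_per_chunk[0] / 3600,
        "embed_hours_high": num_chunks * seconds_per_chunk[1] / 3600,
    }
```

For 10,000 chunks this works out to just under 10 MB of index and roughly 8 to 11 hours of embedding time, so the first full index of a large tree is best treated as a long-running job.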

Architecture

file-compass/
├── file_compass/
│   ├── __init__.py      # Package init
│   ├── config.py        # Configuration
│   ├── embedder.py      # Ollama client with retry
│   ├── scanner.py       # File discovery
│   ├── chunker.py       # Multi-language AST chunking
│   ├── indexer.py       # HNSW + SQLite index
│   ├── quick_index.py   # Fast filename/symbol search
│   ├── explainer.py     # Result explanations
│   ├── merkle.py        # Incremental updates
│   ├── gateway.py       # MCP server
│   └── cli.py           # CLI
├── tests/               # 298 tests, 91% coverage
├── pyproject.toml
└── LICENSE

Security

  • Input Validation - All MCP inputs are validated
  • Path Traversal Protection - Files outside allowed directories blocked
  • SQL Injection Prevention - Parameterized queries only
  • Error Sanitization - Internal errors not exposed

Development

# Run tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=file_compass --cov-report=term-missing

# Type checking
mypy file_compass/

Related Projects

Part of MCP Tool Shop, the Compass Suite for AI-powered development.

License

MIT License - see LICENSE for details.
