File Compass

Semantic file search for AI workstations using HNSW vector indexing

Find files by describing what you're looking for, not just by name


Why File Compass?

Problem | Solution
"Where's that database connection file?" | file-compass search "database connection handling"
Keyword search misses semantic matches | Vector embeddings understand meaning
Slow search across large codebases | HNSW index: <100ms for 10K+ files
Need to integrate with AI assistants | MCP server for Claude Code

Quick Start

# Install
git clone https://github.com/mcp-tool-shop-org/file-compass.git
cd file-compass && pip install -e .

# Pull embedding model
ollama pull nomic-embed-text

# Index your code
file-compass index -d "C:/Projects"

# Search semantically
file-compass search "authentication middleware"

Features

  • Semantic Search - Find files by describing what you're looking for
  • Quick Search - Instant filename/symbol search (no embedding required)
  • Multi-Language AST - Tree-sitter support for Python, JS, TS, Rust, Go
  • Result Explanations - Understand why each result matched
  • Local Embeddings - Uses Ollama (no API keys needed)
  • Fast Search - HNSW indexing for sub-second queries
  • Git-Aware - Optionally filter to git-tracked files only
  • MCP Server - Integrates with Claude Code and other MCP clients
  • Security Hardened - Input validation, path traversal protection
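
To make the "Multi-Language AST" feature concrete, here is a minimal sketch of AST-aware chunking for Python source. It uses the standard library `ast` module as a stand-in; File Compass itself uses tree-sitter, and the function name `chunk_python_source` and the chunk dictionary layout are assumptions made for this illustration only.

```python
import ast


def chunk_python_source(source: str) -> list:
    """Return one chunk per top-level function or class in `source`.

    Stand-in for tree-sitter chunking: uses the stdlib ast module.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno is 1-based and end_lineno is inclusive (Python 3.8+).
            text = "\n".join(lines[node.lineno - 1:node.end_lineno])
            chunks.append({
                "name": node.name,
                "start": node.lineno,
                "end": node.end_lineno,
                "text": text,
            })
    return chunks


SAMPLE = '''import os

def connect(url):
    return url

class Pool:
    def get(self):
        return None
'''

for chunk in chunk_python_source(SAMPLE):
    print(chunk["name"], chunk["start"], chunk["end"])
```

Each chunk keeps its line range, so a search hit can point back to the exact function or class rather than the whole file.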

Installation

# Clone the repository
git clone https://github.com/mcp-tool-shop-org/file-compass.git
cd file-compass

# Create virtual environment
python -m venv venv
venv\Scripts\activate  # Windows
# or: source venv/bin/activate  # Linux/Mac

# Install dependencies
pip install -e .

# Pull the embedding model
ollama pull nomic-embed-text

Requirements

  • Python 3.10+
  • Ollama with nomic-embed-text model
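
A quick way to check the Ollama requirement is to request one embedding directly. The sketch below assumes Ollama's `/api/embeddings` endpoint on the default URL; the helper names `build_request` and `embed`, the retry count, and the backoff are illustrative and are not taken from File Compass's own `embedder.py`.

```python
import json
import time
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default Ollama server URL
MODEL = "nomic-embed-text"


def build_request(text, base_url=OLLAMA_URL, model=MODEL):
    """Build a POST request for Ollama's /api/embeddings endpoint."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text, retries=3, delay=1.0):
    """Return the embedding vector for `text`, retrying on connection errors."""
    last_error = None
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(build_request(text), timeout=30) as resp:
                return json.load(resp)["embedding"]
        except (urllib.error.URLError, TimeoutError) as exc:
            last_error = exc
            time.sleep(delay * (attempt + 1))  # simple linear backoff
    raise RuntimeError(f"Ollama not reachable after {retries} attempts") from last_error
```

With Ollama running and the model pulled, `embed("database connection handling")` should return a list of floats. If it raises `RuntimeError`, the server is probably not running or is listening on a different URL.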

Usage

Build the Index

# Index a directory
file-compass index -d "C:/Projects"

# Index multiple directories
file-compass index -d "C:/Projects" "D:/Code"

Search Files

# Semantic search
file-compass search "database connection handling"

# Filter by file type
file-compass search "training loop" --types python

# Git-tracked files only
file-compass search "API endpoints" --git-only
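
If you want to drive these searches from a script, a thin wrapper around the CLI is enough. This is a sketch that uses only the flags shown above (`--types`, `--git-only`); the helper names are hypothetical, and because the CLI's output format is not documented here, the wrapper returns the raw text.

```python
import subprocess
from typing import List, Optional


def build_search_command(query: str, types: Optional[str] = None,
                         git_only: bool = False) -> List[str]:
    """Assemble the argument list for `file-compass search`."""
    cmd = ["file-compass", "search", query]
    if types:
        cmd += ["--types", types]
    if git_only:
        cmd.append("--git-only")
    return cmd


def run_search(query: str, **options) -> str:
    """Run the search and return whatever the CLI prints to stdout."""
    result = subprocess.run(
        build_search_command(query, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the query as a separate list element (rather than building a shell string) avoids quoting problems when the description contains spaces or quotes.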

Quick Search (No Embeddings)

# Build the quick index used for filename and symbol search
file-compass scan -d "C:/Projects"
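
The idea behind a quick index can be sketched in a few lines: walk the directory once, remember every filename, and answer lookups from memory without touching the embedding model. How File Compass actually stores and matches entries is not described here, so the case-insensitive substring match and the function names below are assumptions.

```python
import os
import tempfile
from pathlib import Path


def build_quick_index(root):
    """Map each lowercase filename to the full paths where it occurs."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            index.setdefault(name.lower(), []).append(os.path.join(dirpath, name))
    return index


def quick_search(index, term):
    """Return paths whose filename contains `term` (case-insensitive)."""
    term = term.lower()
    hits = []
    for name, paths in index.items():
        if term in name:
            hits.extend(paths)
    return sorted(hits)


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    for rel in ("src/db_pool.py", "src/views.py", "docs/DB_SETUP.md"):
        target = Path(root, rel)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("")
    index = build_quick_index(root)
    hits = [Path(p).name for p in quick_search(index, "db")]

print(hits)
```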

Check Status

file-compass status

MCP Server

File Compass includes an MCP server for integration with Claude Code and other AI assistants.

Available Tools

Tool | Description
file_search | Semantic search with explanations
file_preview | Code preview with syntax highlighting
file_quick_search | Fast filename/symbol search
file_quick_index_build | Build the quick search index
file_actions | Context, usages, related, history, symbols
file_index_status | Check index statistics
file_index_scan | Build or rebuild the full index
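
MCP clients invoke these tools with a JSON-RPC `tools/call` request. The sketch below builds such a message for `file_search`; the envelope follows the MCP specification, but the argument name `query` is a guess, since the tool's input schema is not listed here. In practice the MCP client (for example Claude Code) constructs this message for you.

```python
import json


def make_tool_call(tool, arguments, request_id=1):
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# "query" is a hypothetical argument name; check the tool schema reported
# by the server (via `tools/list`) for the real one.
message = make_tool_call("file_search", {"query": "database connection handling"})
print(message)
```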

Claude Code Integration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "file-compass": {
      "command": "python",
      "args": ["-m", "file_compass.gateway"],
      "cwd": "C:/path/to/file-compass"
    }
  }
}
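
If the config file already lists other MCP servers, it is safer to merge the entry than to overwrite the file. The following sketch does that with the standard library; the function name `add_file_compass_server` is hypothetical, and the demo writes to a temporary directory rather than to your real config.

```python
import json
import tempfile
from pathlib import Path


def add_file_compass_server(config_path, repo_dir):
    """Add or replace the "file-compass" entry, keeping other servers intact."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["file-compass"] = {
        "command": "python",
        "args": ["-m", "file_compass.gateway"],
        "cwd": repo_dir,
    }
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo: an existing config with one other server keeps that server.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "claude_desktop_config.json"
    cfg.write_text(json.dumps({"mcpServers": {"other": {"command": "node"}}}),
                   encoding="utf-8")
    merged = add_file_compass_server(cfg, "C:/path/to/file-compass")

print(sorted(merged["mcpServers"]))
```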

Configuration

Variable | Default | Description
FILE_COMPASS_DIRECTORIES | F:/AI | Comma-separated directories
FILE_COMPASS_OLLAMA_URL | http://localhost:11434 | Ollama server URL
FILE_COMPASS_EMBEDDING_MODEL | nomic-embed-text | Embedding model
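
As an illustration of how these variables might be consumed, the sketch below reads them with the documented defaults and splits the comma-separated directory list. The `Settings` class and `load_settings` function are hypothetical names, not the contents of `config.py`.

```python
import os
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Settings:
    directories: List[str]
    ollama_url: str
    embedding_model: str


def load_settings(env=None):
    """Read File Compass settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    raw_dirs = env.get("FILE_COMPASS_DIRECTORIES", "F:/AI")
    return Settings(
        # Split on commas and drop empty entries and surrounding spaces.
        directories=[d.strip() for d in raw_dirs.split(",") if d.strip()],
        ollama_url=env.get("FILE_COMPASS_OLLAMA_URL", "http://localhost:11434"),
        embedding_model=env.get("FILE_COMPASS_EMBEDDING_MODEL", "nomic-embed-text"),
    )


settings = load_settings({"FILE_COMPASS_DIRECTORIES": "C:/Projects, D:/Code"})
print(settings.directories)
```

Accepting the environment as a parameter keeps the function easy to test without modifying the real process environment.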

How It Works

  1. Scanning - Discovers files matching configured extensions, respects .gitignore
  2. Chunking - Splits files into semantic pieces:
    • Python/JS/TS/Rust/Go: AST-aware via tree-sitter (functions, classes)
    • Markdown: Heading-based sections
    • JSON/YAML: Top-level keys
    • Other: Sliding window with overlap
  3. Embedding - Generates 768-dim vectors via Ollama
  4. Indexing - Stores vectors in HNSW index, metadata in SQLite
  5. Search - Embeds query, finds nearest neighbors, returns ranked results
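
The steps above can be sketched end to end in plain Python. To stay runnable without Ollama or an HNSW library, this toy version swaps in two stand-ins, both named plainly: a bag-of-words count vector instead of the 768-dim embedding, and a brute-force cosine scan instead of the HNSW index. All function names, the window size, and the overlap are assumptions for illustration.

```python
import math
import re
from collections import Counter


def sliding_window(lines, size=4, overlap=1):
    """Yield (first_line_number, text) for windows sharing `overlap` lines."""
    if not lines:
        return
    step = size - overlap
    for start in range(0, max(len(lines) - overlap, 1), step):
        yield start + 1, "\n".join(lines[start:start + size])


def toy_embed(text):
    """Stand-in for the embedding model: token counts, not a real embedding."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a, b):
    dot = sum(count * b[token] for token, count in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_index(files):
    """files maps path -> source text; returns a list of (vector, metadata)."""
    index = []
    for path, text in files.items():
        for line_no, chunk in sliding_window(text.splitlines()):
            index.append((toy_embed(chunk), {"path": path, "line": line_no}))
    return index


def search(index, query, k=3):
    """Brute-force nearest neighbours by cosine similarity (HNSW stand-in)."""
    q = toy_embed(query)
    scored = [(cosine(q, vec), meta) for vec, meta in index]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


FILES = {
    "db.py": ("import sqlite3\n\ndef connect(path):\n"
              "    # open the database connection\n"
              "    return sqlite3.connect(path)\n"),
    "views.py": ("def render(page):\n    # build the html template\n"
                 "    return page.upper()\n"),
}
index = build_index(FILES)
for score, meta in search(index, "database connection"):
    print(f"{score:.2f} {meta['path']}:{meta['line']}")
```

A real embedding model would also match chunks that describe the same idea in different words, which is what the bag-of-words stand-in cannot do. An HNSW index replaces the linear scan with an approximate graph search so that query time grows slowly with the number of chunks.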

Performance

Metric | Value
Index Size | ~1KB per chunk
Search Latency | <100ms for 10K+ chunks
Quick Search | <10ms for filename/symbol
Embedding Speed | ~3-4s per chunk (local)
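
Taking the figures in this table at face value, a short calculation shows what they imply for a 10,000-chunk index. It assumes chunks are embedded one after another with no batching or parallelism, which may not match how File Compass schedules the work.

```python
CHUNKS = 10_000
KB_PER_CHUNK = 1            # "~1KB per chunk"
SECONDS_PER_CHUNK = (3, 4)  # "~3-4s per chunk (local)"

index_mb = CHUNKS * KB_PER_CHUNK / 1024
hours = tuple(CHUNKS * s / 3600 for s in SECONDS_PER_CHUNK)

print(f"index size: ~{index_mb:.1f} MB")                                  # ~9.8 MB
print(f"initial embedding: ~{hours[0]:.1f} to {hours[1]:.1f} hours")      # ~8.3 to 11.1 hours
```

Under these assumptions the index itself stays small, while the first full indexing run is dominated by embedding time.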

Architecture

file-compass/
├── file_compass/
│   ├── __init__.py      # Package init
│   ├── config.py        # Configuration
│   ├── embedder.py      # Ollama client with retry
│   ├── scanner.py       # File discovery
│   ├── chunker.py       # Multi-language AST chunking
│   ├── indexer.py       # HNSW + SQLite index
│   ├── quick_index.py   # Fast filename/symbol search
│   ├── explainer.py     # Result explanations
│   ├── merkle.py        # Incremental updates
│   ├── gateway.py       # MCP server
│   └── cli.py           # CLI
├── tests/               # 298 tests, 91% coverage
├── pyproject.toml
└── LICENSE
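
The tree lists `merkle.py` for incremental updates. The general idea can be sketched with a two-level hash tree, the simplest Merkle-style layout: hash every file, combine the file hashes into one root, and re-embed only the files whose hash changed when the root differs. This is not the module's actual code; the function names and the flat (non-nested) tree are assumptions.

```python
import hashlib


def file_hashes(files):
    """files maps path -> bytes; returns path -> SHA-256 hex digest."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def root_hash(hashes):
    """Combine per-file digests, in sorted path order, into one root digest."""
    digest = hashlib.sha256()
    for path in sorted(hashes):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(hashes[path].encode("ascii"))
    return digest.hexdigest()


def changed_paths(old, new):
    """Return paths that were added, modified, or removed between snapshots."""
    if root_hash(old) == root_hash(new):
        return set()  # identical roots: nothing to re-index
    added_or_modified = {p for p, h in new.items() if old.get(p) != h}
    removed = set(old) - set(new)
    return added_or_modified | removed


before = file_hashes({"a.py": b"x = 1", "b.py": b"y = 2"})
after = file_hashes({"a.py": b"x = 1", "b.py": b"y = 3", "c.py": b"z = 4"})
print(sorted(changed_paths(before, after)))
```

Given the per-chunk embedding cost in the Performance table, skipping unchanged files is what keeps re-indexing practical.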

Security

  • Input Validation - All MCP inputs are validated
  • Path Traversal Protection - Files outside allowed directories blocked
  • SQL Injection Prevention - Parameterized queries only
  • Error Sanitization - Internal errors not exposed
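
Two of these protections are easy to illustrate with the standard library. The sketch below shows one common way to implement a path traversal check and a parameterized query; the helper names and the `chunks` table schema are hypothetical and are not taken from the File Compass source.

```python
import sqlite3
import tempfile
from pathlib import Path


def is_allowed(path, allowed_dirs):
    """True if `path`, with '..' and symlinks resolved, is inside an allowed directory."""
    resolved = Path(path).resolve()
    return any(resolved.is_relative_to(Path(root).resolve()) for root in allowed_dirs)


def find_chunks(conn, path_prefix):
    """Look up chunks by path prefix using a bound parameter, not string formatting."""
    return conn.execute(
        "SELECT path, start_line FROM chunks WHERE path LIKE ? ORDER BY path",
        (path_prefix + "%",),
    ).fetchall()


# Path check: "../" escapes are rejected after resolution.
with tempfile.TemporaryDirectory() as root:
    inside = is_allowed(f"{root}/src/app.py", [root])
    escaped = is_allowed(f"{root}/../outside.txt", [root])

# Parameterized query: hostile input is treated as data, not as SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (path TEXT, start_line INTEGER)")  # hypothetical schema
conn.execute("INSERT INTO chunks VALUES (?, ?)", ("src/db.py", 10))
normal = find_chunks(conn, "src/")
hostile = find_chunks(conn, "x' OR '1'='1")

print(inside, escaped, normal, hostile)
```

Resolving the path before comparing it is the important step: a plain string prefix check would accept `allowed/../secret`.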

Development

# Run tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=file_compass --cov-report=term-missing

# Type checking
mypy file_compass/

Related Projects

Part of MCP Tool Shop, the Compass Suite for AI-powered development.

License

MIT License - see LICENSE for details.

Download files

Source Distribution

file_compass-0.1.3.tar.gz (215.9 kB)

Uploaded Source

Built Distribution

file_compass-0.1.3-py3-none-any.whl (46.5 kB)

Uploaded Python 3

File details

Details for the file file_compass-0.1.3.tar.gz.

File metadata

  • Download URL: file_compass-0.1.3.tar.gz
  • Upload date:
  • Size: 215.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for file_compass-0.1.3.tar.gz

Algorithm | Hash digest
SHA256 | 6e11b1925db44a95c1eb4e828868dd296dc20da1873568161f05c1ec97766196
MD5 | 772921d25a583659c348cf255bc55652
BLAKE2b-256 | 090920a925a5c7af70436b3eb807fe75665372e067562190e011ae4714640241
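
To check a downloaded archive against the SHA256 digest listed above, something like the following is sufficient. The helper names are illustrative, and the path you pass in is wherever you saved the file.

```python
import hashlib

# SHA256 digest published for file_compass-0.1.3.tar.gz
EXPECTED_SHA256 = "6e11b1925db44a95c1eb4e828868dd296dc20da1873568161f05c1ec97766196"


def sha256_of(path, chunk_size=65536):
    """Hash a file in fixed-size blocks so large files need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches `expected`."""
    return sha256_of(path) == expected.lower()
```

Calling `verify("file_compass-0.1.3.tar.gz")` returns `True` only when the local file is byte-for-byte identical to the published one.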

Provenance

The following attestation bundles were made for file_compass-0.1.3.tar.gz:

Publisher: publish.yml on mcp-tool-shop-org/file-compass

File details

Details for the file file_compass-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: file_compass-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 46.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for file_compass-0.1.3-py3-none-any.whl

Algorithm | Hash digest
SHA256 | a92d1a501b882390a5355798a7b27a6bed751ad38c80cc8e4a95d6ab12c54372
MD5 | 7388815046f6f22bd3e1d8d69cc5da76
BLAKE2b-256 | 52c58b335358a5ef1c1683793523ed5d77f45278d25aa7b99e5b7e7640e2c79d

Provenance

The following attestation bundles were made for file_compass-0.1.3-py3-none-any.whl:

Publisher: publish.yml on mcp-tool-shop-org/file-compass
