
Semantic file search for AI workstations using HNSW indexing


File Compass


Semantic file search for AI workstations using HNSW vector indexing

Find files by describing what you're looking for, not just by name


Why File Compass?

Problem | Solution
"Where's that database connection file?" | file-compass search "database connection handling"
Keyword search misses semantic matches | Vector embeddings understand meaning
Slow search across large codebases | HNSW index: <100ms for 10K+ chunks
Need to integrate with AI assistants | MCP server for Claude Code

Quick Start

# Install
git clone https://github.com/mcp-tool-shop-org/file-compass.git
cd file-compass && pip install -e .

# Pull embedding model
ollama pull nomic-embed-text

# Index your code
file-compass index -d "C:/Projects"

# Search semantically
file-compass search "authentication middleware"

Features

  • Semantic Search - Find files by describing what you're looking for
  • Quick Search - Instant filename/symbol search (no embedding required)
  • Multi-Language AST - Tree-sitter support for Python, JS, TS, Rust, Go
  • Result Explanations - Understand why each result matched
  • Local Embeddings - Uses Ollama (no API keys needed)
  • Fast Search - HNSW indexing for sub-second queries
  • Git-Aware - Optionally filter to git-tracked files only
  • MCP Server - Integrates with Claude Code and other MCP clients
  • Security Hardened - Input validation, path traversal protection

Installation

# Clone the repository
git clone https://github.com/mcp-tool-shop-org/file-compass.git
cd file-compass

# Create virtual environment
python -m venv venv
venv\Scripts\activate  # Windows
# or: source venv/bin/activate  # Linux/Mac

# Install dependencies
pip install -e .

# Pull the embedding model
ollama pull nomic-embed-text

Requirements

  • Python 3.10+
  • Ollama with nomic-embed-text model

Usage

Build the Index

# Index a directory
file-compass index -d "C:/Projects"

# Index multiple directories
file-compass index -d "C:/Projects" "D:/Code"

Search Files

# Semantic search
file-compass search "database connection handling"

# Filter by file type
file-compass search "training loop" --types python

# Git-tracked files only
file-compass search "API endpoints" --git-only

Quick Search (No Embeddings)

# Build the quick index used for filename and symbol-name search
file-compass scan -d "C:/Projects"

Check Status

file-compass status

MCP Server

File Compass includes an MCP server for integration with Claude Code and other AI assistants.

Available Tools

Tool | Description
file_search | Semantic search with explanations
file_preview | Code preview with syntax highlighting
file_quick_search | Fast filename/symbol search
file_quick_index_build | Build the quick search index
file_actions | Context, usages, related, history, symbols
file_index_status | Check index statistics
file_index_scan | Build or rebuild the full index

Claude Code Integration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "file-compass": {
      "command": "python",
      "args": ["-m", "file_compass.gateway"],
      "cwd": "C:/path/to/file-compass"
    }
  }
}

Configuration

Variable | Default | Description
FILE_COMPASS_DIRECTORIES | F:/AI | Comma-separated directories
FILE_COMPASS_OLLAMA_URL | http://localhost:11434 | Ollama server URL
FILE_COMPASS_EMBEDDING_MODEL | nomic-embed-text | Embedding model
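
As an illustration of how these variables could be consumed, here is a minimal sketch that reads them with the defaults from the table. The function name load_settings and the returned dictionary keys are hypothetical and are not taken from file_compass/config.py.

```python
import os

# Defaults copied from the configuration table above.
DEFAULTS = {
    "FILE_COMPASS_DIRECTORIES": "F:/AI",
    "FILE_COMPASS_OLLAMA_URL": "http://localhost:11434",
    "FILE_COMPASS_EMBEDDING_MODEL": "nomic-embed-text",
}


def load_settings(env=None):
    """Read the three variables, falling back to the documented defaults.

    Hypothetical helper for illustration; the real config module may differ.
    """
    env = os.environ if env is None else env

    def get(name):
        return env.get(name, DEFAULTS[name])

    # FILE_COMPASS_DIRECTORIES is comma-separated; drop empty entries.
    directories = [
        part.strip()
        for part in get("FILE_COMPASS_DIRECTORIES").split(",")
        if part.strip()
    ]
    return {
        "directories": directories,
        "ollama_url": get("FILE_COMPASS_OLLAMA_URL").rstrip("/"),
        "embedding_model": get("FILE_COMPASS_EMBEDDING_MODEL"),
    }
```

Passing a plain dictionary instead of os.environ, for example load_settings({"FILE_COMPASS_DIRECTORIES": "C:/Projects, D:/Code"}), makes the parsing easy to check without touching the real environment.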

How It Works

  1. Scanning - Discovers files matching configured extensions, respects .gitignore
  2. Chunking - Splits files into semantic pieces:
    • Python/JS/TS/Rust/Go: AST-aware via tree-sitter (functions, classes)
    • Markdown: Heading-based sections
    • JSON/YAML: Top-level keys
    • Other: Sliding window with overlap
  3. Embedding - Generates 768-dim vectors via Ollama
  4. Indexing - Stores vectors in HNSW index, metadata in SQLite
  5. Search - Embeds query, finds nearest neighbors, returns ranked results
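
Two of the steps above can be sketched in a few lines of plain Python: the sliding-window fallback from step 2 and the nearest-neighbor ranking from step 5. This is an illustration under stated assumptions, not the project's code. The window and overlap sizes are made up, and the ranking is an exact brute-force cosine scan, which an HNSW index approximates without comparing the query against every vector.

```python
import math


def sliding_window_chunks(lines, window=40, overlap=10):
    """Split a list of lines into overlapping windows.

    window and overlap are illustrative values, not the project's settings.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must be >= 0 and smaller than window")
    step = window - overlap
    chunks = []
    for start in range(0, len(lines), step):
        chunks.append(lines[start:start + window])
        if start + window >= len(lines):
            break  # the last window already reaches the end of the file
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, chunk_vecs, k=3):
    """Return the k most similar (score, chunk_id) pairs, best first.

    Exact scan over every vector; stands in for the HNSW lookup.
    """
    scored = [(cosine(query_vec, vec), cid) for cid, vec in chunk_vecs.items()]
    scored.sort(reverse=True)
    return scored[:k]
```

In the real pipeline the vectors would be the 768-dimensional embeddings returned by Ollama; tiny hand-written vectors are enough to see the ranking behave as expected.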

Performance

Metric | Value
Index Size | ~1KB per chunk
Search Latency | <100ms for 10K+ chunks
Quick Search | <10ms for filename/symbol
Embedding Speed | ~3-4s per chunk (local)
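
To get a feel for what these figures imply at scale, the following back-of-the-envelope sketch multiplies them out. It assumes chunks are embedded one after another and that the per-chunk figures hold across the whole index; the function name estimate is hypothetical.

```python
def estimate(chunks, kb_per_chunk=1.0, secs_per_chunk=(3.0, 4.0)):
    """Rough index size (MB) and sequential embedding time range (hours).

    Inputs default to the approximate figures from the table above.
    """
    size_mb = chunks * kb_per_chunk / 1000
    low_hours = chunks * secs_per_chunk[0] / 3600
    high_hours = chunks * secs_per_chunk[1] / 3600
    return size_mb, low_hours, high_hours


size_mb, low_hours, high_hours = estimate(10_000)
print(f"{size_mb:.0f} MB index, {low_hours:.1f}-{high_hours:.1f} h to embed")
```

For 10,000 chunks this works out to roughly 10 MB of index and about 8.3 to 11.1 hours of initial embedding time, so by these numbers the first index build dominates the cost while individual searches stay fast.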

Architecture

file-compass/
├── file_compass/
│   ├── __init__.py      # Package init
│   ├── config.py        # Configuration
│   ├── embedder.py      # Ollama client with retry
│   ├── scanner.py       # File discovery
│   ├── chunker.py       # Multi-language AST chunking
│   ├── indexer.py       # HNSW + SQLite index
│   ├── quick_index.py   # Fast filename/symbol search
│   ├── explainer.py     # Result explanations
│   ├── merkle.py        # Incremental updates
│   ├── gateway.py       # MCP server
│   └── cli.py           # CLI
├── tests/               # 298 tests, 91% coverage
├── pyproject.toml
└── LICENSE
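
The tree lists merkle.py for incremental updates but does not describe how it works. As a rough illustration of the general idea, the sketch below uses flat per-file content hashes (a simplification, not a real Merkle tree and not the project's implementation) to decide which files need re-indexing. The function names snapshot and diff are hypothetical.

```python
import hashlib


def snapshot(files):
    """Map each path to the SHA-256 digest of its contents.

    files is a dict of path -> bytes, kept in memory for illustration.
    """
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def diff(old, new):
    """Compare two snapshots and report what changed between them."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, changed, removed
```

Only the paths reported as added or changed would need to be chunked and embedded again, which matters when embedding is the slowest step.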

Security

  • Input Validation - All MCP inputs are validated
  • Path Traversal Protection - Files outside allowed directories blocked
  • SQL Injection Prevention - Parameterized queries only
  • Error Sanitization - Internal errors not exposed
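
As an example of what path traversal protection can look like, here is a minimal sketch that resolves a requested path and checks that it stays inside one of the allowed directories. The helper name is_allowed is hypothetical, and this is not necessarily how File Compass implements the check.

```python
from pathlib import Path


def is_allowed(candidate, allowed_roots):
    """Return True only if candidate resolves to a location inside an allowed root.

    resolve() collapses ".." segments and follows symlinks, so a path such as
    root/../secrets.txt is compared by its real location.
    """
    resolved = Path(candidate).resolve()
    return any(
        resolved.is_relative_to(Path(root).resolve())  # Python 3.9+
        for root in allowed_roots
    )
```

Because is_relative_to compares whole path components, a sibling directory whose name merely starts with an allowed root's name (for example C:/Projects_backup next to C:/Projects) is rejected, which a plain string prefix check would get wrong.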

Development

# Run tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=file_compass --cov-report=term-missing

# Type checking
mypy file_compass/

Related Projects

Part of MCP Tool Shop, the Compass Suite for AI-powered development.

Security & Data Scope

File Compass is a local-first semantic file search tool and MCP server.

  • Data accessed: Local file contents for indexing, HNSW vector index, SQLite metadata, Ollama embeddings (local inference)
  • Data NOT accessed: No cloud sync. No telemetry. No analytics. No external API calls beyond local Ollama
  • Permissions: File system read for indexing, write for index storage. Path traversal protection enforced

Full policy: SECURITY.md


Scorecard

Category | Score
A. Security | 10/10
B. Error Handling | 10/10
C. Operator Docs | 10/10
D. Shipping Hygiene | 10/10
E. Identity (soft) | 10/10
Overall | 50/50

License

MIT License - see LICENSE for details.

Built by MCP Tool Shop
