Semantic file search for AI workstations using HNSW indexing
Find files by describing what you're looking for, not just by name
Why File Compass?
| Problem | Solution |
|---|---|
| "Where's that database connection file?" | file-compass search "database connection handling" |
| Keyword search misses semantic matches | Vector embeddings understand meaning |
| Slow search across large codebases | HNSW index: <100ms for 10K+ files |
| Need to integrate with AI assistants | MCP server for Claude Code |
Quick Start
# Install
git clone https://github.com/mcp-tool-shop-org/file-compass.git
cd file-compass && pip install -e .
# Pull embedding model
ollama pull nomic-embed-text
# Index your code
file-compass index -d "C:/Projects"
# Search semantically
file-compass search "authentication middleware"
Features
- Semantic Search - Find files by describing what you're looking for
- Quick Search - Instant filename/symbol search (no embedding required)
- Multi-Language AST - Tree-sitter support for Python, JS, TS, Rust, Go
- Result Explanations - Understand why each result matched
- Local Embeddings - Uses Ollama (no API keys needed)
- Fast Search - HNSW indexing for sub-second queries
- Git-Aware - Optionally filter to git-tracked files only
- MCP Server - Integrates with Claude Code and other MCP clients
- Security Hardened - Input validation, path traversal protection
Installation
# Clone the repository
git clone https://github.com/mcp-tool-shop-org/file-compass.git
cd file-compass
# Create virtual environment
python -m venv venv
venv\Scripts\activate # Windows
# or: source venv/bin/activate # Linux/Mac
# Install dependencies
pip install -e .
# Pull the embedding model
ollama pull nomic-embed-text
Requirements
- Python 3.10+
- Ollama with the nomic-embed-text model
Usage
Build the Index
# Index a directory
file-compass index -d "C:/Projects"
# Index multiple directories
file-compass index -d "C:/Projects" "D:/Code"
Search Files
# Semantic search
file-compass search "database connection handling"
# Filter by file type
file-compass search "training loop" --types python
# Git-tracked files only
file-compass search "API endpoints" --git-only
Quick Search (No Embeddings)
# Build the quick index used for filename and symbol search
file-compass scan -d "C:/Projects"
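To illustrate why a filename lookup needs no embedding model, here is a minimal sketch of a name-based index. It is not File Compass's implementation: `build_name_index` and `quick_search` are hypothetical helpers, symbol lookup is left out, and matching is assumed to be a plain case-insensitive substring test.

```python
import os
from pathlib import Path


def build_name_index(root: str) -> dict[str, list[str]]:
    """Map each lowercase filename to the paths where it occurs (hypothetical helper)."""
    index: dict[str, list[str]] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            index.setdefault(name.lower(), []).append(str(Path(dirpath) / name))
    return index


def quick_search(index: dict[str, list[str]], query: str) -> list[str]:
    """Return sorted paths whose filename contains the query, ignoring case."""
    needle = query.lower()
    hits = [p for name, paths in index.items() if needle in name for p in paths]
    return sorted(hits)
```

Because the lookup only touches an in-memory dictionary of names, it stays fast even when the semantic index has not been built yet.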
Check Status
file-compass status
MCP Server
File Compass includes an MCP server for integration with Claude Code and other AI assistants.
Available Tools
| Tool | Description |
|---|---|
| file_search | Semantic search with explanations |
| file_preview | Code preview with syntax highlighting |
| file_quick_search | Fast filename/symbol search |
| file_quick_index_build | Build the quick search index |
| file_actions | Context, usages, related, history, symbols |
| file_index_status | Check index statistics |
| file_index_scan | Build or rebuild the full index |
Claude Code Integration
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "file-compass": {
      "command": "python",
      "args": ["-m", "file_compass.gateway"],
      "cwd": "C:/path/to/file-compass"
    }
  }
}
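If the config file already lists other MCP servers, hand-editing risks overwriting them. The sketch below merges the entry shown above into an existing file. `add_file_compass_server` is a hypothetical helper, not part of File Compass, and the location of the config file is left to the caller because it varies by platform.

```python
import json
from pathlib import Path


def add_file_compass_server(config_path: str, repo_dir: str) -> dict:
    """Merge the file-compass entry into an MCP client config file and return the result."""
    path = Path(config_path)
    # Start from the existing config if there is one, so other servers are preserved.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["file-compass"] = {
        "command": "python",
        "args": ["-m", "file_compass.gateway"],
        "cwd": repo_dir,
    }
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Back up the original file before running something like this, since the helper rewrites it in place.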
Configuration
| Variable | Default | Description |
|---|---|---|
| FILE_COMPASS_DIRECTORIES | F:/AI | Comma-separated directories |
| FILE_COMPASS_OLLAMA_URL | http://localhost:11434 | Ollama server URL |
| FILE_COMPASS_EMBEDDING_MODEL | nomic-embed-text | Embedding model |
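As a rough illustration of how these variables could be resolved, the sketch below reads the three names from the table and falls back to the listed defaults. `load_settings` and the returned keys are hypothetical; how config.py actually parses the values is not documented here, so treat the whitespace handling as an assumption.

```python
import os
from collections.abc import Mapping

# Variable names and defaults copied from the configuration table.
DEFAULTS = {
    "FILE_COMPASS_DIRECTORIES": "F:/AI",
    "FILE_COMPASS_OLLAMA_URL": "http://localhost:11434",
    "FILE_COMPASS_EMBEDDING_MODEL": "nomic-embed-text",
}


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve the three documented variables, falling back to the documented defaults."""
    raw = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    return {
        # Comma-separated list; surrounding whitespace and empty entries are dropped.
        "directories": [d.strip() for d in raw["FILE_COMPASS_DIRECTORIES"].split(",") if d.strip()],
        "ollama_url": raw["FILE_COMPASS_OLLAMA_URL"],
        "embedding_model": raw["FILE_COMPASS_EMBEDDING_MODEL"],
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests.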
How It Works
- Scanning - Discovers files matching configured extensions, respects .gitignore
- Chunking - Splits files into semantic pieces:
  - Python/JS/TS/Rust/Go: AST-aware via tree-sitter (functions, classes)
  - Markdown: Heading-based sections
  - JSON/YAML: Top-level keys
  - Other: Sliding window with overlap
- Embedding - Generates 768-dim vectors via Ollama
- Indexing - Stores vectors in HNSW index, metadata in SQLite
- Search - Embeds query, finds nearest neighbors, returns ranked results
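Two of the steps above can be sketched with the standard library alone: the sliding-window fallback chunker and the nearest-neighbor lookup. This is an illustration under stated assumptions, not the project's code. The window size, the overlap, and the use of cosine similarity are assumptions, the function names are hypothetical, and `nearest` does an exact brute-force scan where File Compass uses an approximate HNSW index.

```python
import math


def sliding_window_chunks(lines: list[str], size: int = 40, overlap: int = 10) -> list[tuple[int, int, str]]:
    """Split lines into windows of `size` lines; consecutive windows share `overlap` lines.

    Returns (start_line, end_line, text) tuples with 1-based inclusive line numbers.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(lines), step):
        window = lines[start:start + size]
        chunks.append((start + 1, start + len(window), "\n".join(window)))
        if start + size >= len(lines):
            break  # this window already reaches the end of the file
    return chunks


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int = 5) -> list[tuple[str, float]]:
    """Exact top-k by cosine similarity (a stand-in for an approximate HNSW query)."""
    scored = [(chunk_id, cosine(query_vec, vec)) for chunk_id, vec in chunk_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]
```

The brute-force scan compares the query against every stored vector, which is the cost an HNSW graph is designed to avoid on large indexes.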
Performance
| Metric | Value |
|---|---|
| Index Size | ~1KB per chunk |
| Search Latency | <100ms for 10K+ chunks |
| Quick Search | <10ms for filename/symbol |
| Embedding Speed | ~3-4s per chunk (local) |
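The per-chunk figures in the table make it possible to budget an initial indexing run. The helper below is a back-of-the-envelope sketch: `estimate_index_cost` is a hypothetical name, and it assumes chunks are embedded strictly one at a time with no batching or parallelism, which may overstate the real wall-clock time.

```python
def estimate_index_cost(
    num_chunks: int,
    kb_per_chunk: float = 1.0,
    seconds_per_chunk: tuple[float, float] = (3.0, 4.0),
) -> dict[str, float]:
    """Estimate index size and sequential embedding time from the table's per-chunk figures."""
    low, high = seconds_per_chunk
    return {
        "index_size_mb": num_chunks * kb_per_chunk / 1024,
        "embed_hours_low": num_chunks * low / 3600,
        "embed_hours_high": num_chunks * high / 3600,
    }
```

For 10,000 chunks this gives roughly 9.8 MB of index and between about 8.3 and 11.1 hours of embedding time under the sequential assumption.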
Architecture
file-compass/
├── file_compass/
│ ├── __init__.py # Package init
│ ├── config.py # Configuration
│ ├── embedder.py # Ollama client with retry
│ ├── scanner.py # File discovery
│ ├── chunker.py # Multi-language AST chunking
│ ├── indexer.py # HNSW + SQLite index
│ ├── quick_index.py # Fast filename/symbol search
│ ├── explainer.py # Result explanations
│ ├── merkle.py # Incremental updates
│ ├── gateway.py # MCP server
│ └── cli.py # CLI
├── tests/ # 298 tests, 91% coverage
├── pyproject.toml
└── LICENSE
Security
- Input Validation - All MCP inputs are validated
- Path Traversal Protection - Files outside allowed directories blocked
- SQL Injection Prevention - Parameterized queries only
- Error Sanitization - Internal errors not exposed
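Two of these protections follow well-known patterns, sketched below with the standard library. This is an illustration, not File Compass's code: `is_within_allowed` and `find_chunk_ids` are hypothetical helpers, and the `chunks` table with `id` and `path` columns is an assumed schema.

```python
import sqlite3
from pathlib import Path


def is_within_allowed(candidate: str, allowed_dirs: list[str]) -> bool:
    """Return True only if the resolved path lies inside one of the allowed directories."""
    # resolve() collapses ".." segments and follows symlinks before the comparison.
    resolved = Path(candidate).resolve()
    return any(resolved.is_relative_to(Path(d).resolve()) for d in allowed_dirs)


def find_chunk_ids(conn: sqlite3.Connection, path: str) -> list[int]:
    """Look up chunk ids for a path using a bound parameter, never string formatting."""
    rows = conn.execute("SELECT id FROM chunks WHERE path = ?", (path,)).fetchall()
    return [row[0] for row in rows]
```

Resolving the path before comparing is what defeats inputs such as `allowed/../secret.txt`, and the `?` placeholder means a quote character in the input is treated as data instead of SQL.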
Development
# Run tests
pytest tests/ -v
# Run with coverage
pytest tests/ --cov=file_compass --cov-report=term-missing
# Type checking
mypy file_compass/
Related Projects
Part of MCP Tool Shop — the Compass Suite for AI-powered development:
- Tool Compass - Semantic MCP tool discovery
- Integradio - Vector-embedded Gradio components
- Backpropagate - Headless LLM fine-tuning
- Comfy Headless - ComfyUI without the complexity
Security & Data Scope
File Compass is a local-first semantic file search tool and MCP server.
- Data accessed: Local file contents for indexing, HNSW vector index, SQLite metadata, Ollama embeddings (local inference)
- Data NOT accessed: No cloud sync. No telemetry. No analytics. No external API calls beyond local Ollama
- Permissions: File system read for indexing, write for index storage. Path traversal protection enforced
Full policy: SECURITY.md
Scorecard
| Category | Score |
|---|---|
| A. Security | 10/10 |
| B. Error Handling | 10/10 |
| C. Operator Docs | 10/10 |
| D. Shipping Hygiene | 10/10 |
| E. Identity (soft) | 10/10 |
| Overall | 50/50 |
Support
- Questions / help: Discussions
- Bug reports: Issues
License
MIT License - see LICENSE for details.
Acknowledgments
- Ollama for local LLM inference
- hnswlib for fast vector search
- nomic-embed-text for embeddings
- tree-sitter for multi-language AST parsing
Built by MCP Tool Shop