Model Context Protocol server for IDE-like code navigation and semantic search
Sigil MCP Server
A Model Context Protocol (MCP) server that provides IDE-like code navigation and search for local repositories. It gives AI assistants such as ChatGPT code exploration capabilities, including symbol search, trigram indexing, and semantic navigation.
Features
Hybrid Code Search
- Fast text search using trigram indexing (inspired by GitHub's Blackbird)
- Symbol-based search for functions, classes, methods, and variables
- Semantic code search with vector embeddings (optional)
- File structure view showing code outlines
- Automatic index updates with file watching (optional)
Production Ready
- Thread-safe concurrent access (SQLite WAL mode + RLock serialization)
- File watcher, HTTP handlers, and vector indexing run safely in parallel
- No "database is locked" errors from concurrent operations
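The combination of WAL mode and lock serialization can be sketched as follows. This is a minimal illustration, not Sigil's actual storage layer: the `IndexStore` class, its table layout, and its method names are hypothetical. It assumes a single shared `sqlite3` connection guarded by a `threading.RLock`.

```python
import sqlite3
import threading


class IndexStore:
    """Hypothetical wrapper: one shared connection guarded by an RLock."""

    def __init__(self, path: str) -> None:
        # check_same_thread=False allows several threads to share the
        # connection; the lock below is what makes that sharing safe.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.RLock()
        with self._lock:
            # WAL lets readers proceed while a writer is active.
            self._conn.execute("PRAGMA journal_mode=WAL")
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS documents "
                "(path TEXT PRIMARY KEY, body TEXT)"
            )
            self._conn.commit()

    def upsert(self, path: str, body: str) -> None:
        # Every statement runs under the lock, so writes never interleave.
        with self._lock:
            self._conn.execute(
                "INSERT OR REPLACE INTO documents (path, body) VALUES (?, ?)",
                (path, body),
            )
            self._conn.commit()

    def count(self) -> int:
        with self._lock:
            row = self._conn.execute("SELECT COUNT(*) FROM documents").fetchone()
            return row[0]
```

The reentrant lock matters when one locked method calls another on the same thread; a plain `Lock` would deadlock in that case.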
Enterprise Security
- OAuth 2.0 authentication with PKCE support for remote access
- Local connection bypass (no auth needed for localhost)
- API key fallback and IP whitelisting
Available Tools
- index_repository - Build searchable index with symbol extraction
- search_code - Fast substring search across repositories
- goto_definition - Find symbol definitions
- list_symbols - View file/repo structure
- build_vector_index - Generate semantic embeddings for code (optional)
- semantic_search - Natural language code search using embeddings
- list_repos, read_repo_file, list_repo_files, search_repo - Basic operations
- get_index_stats, ping - Server info and health checks
Quick Start
Installation
Clone and install dependencies:
git clone https://github.com/Superuser666-Sigil/SigilDERG-Custom-MCP.git
cd SigilDERG-Custom-MCP
pip install -e .
# Extras are quoted so that shells such as zsh do not treat the brackets as a glob
# Optional: Install file watching support
pip install -e ".[watch]"
# Optional: Install vector embeddings - choose based on your hardware:
# For sentence-transformers (NVIDIA GPUs, or CPU)
pip install -e ".[embeddings-sentencetransformers]"
# For OpenAI API (cloud-based)
pip install -e ".[embeddings-openai]"
# For llama.cpp - choose your acceleration:
pip install -e ".[embeddings-llamacpp-cpu]" # CPU only
pip install -e ".[embeddings-llamacpp-cuda]" # NVIDIA GPU (CUDA)
pip install -e ".[embeddings-llamacpp-rocm]" # AMD GPU (ROCm)
pip install -e ".[embeddings-llamacpp-metal]" # Apple Silicon (Metal)
# Or install all embedding providers (not recommended)
pip install -e ".[embeddings-all]"
Install Universal Ctags for symbol extraction (optional but recommended):
macOS: brew install universal-ctags
Ubuntu/Debian: sudo apt install universal-ctags
Arch Linux: sudo pacman -S ctags
Configuration
Copy the example config and edit with your repository paths:
cp config.example.json config.json
# Edit config.json
Example configuration:
{
"repositories": {
"my_project": "/absolute/path/to/your/project",
"another_repo": "/path/to/another/repo"
}
}
Alternatively, use environment variables:
export SIGIL_REPO_MAP="my_project:/path/to/project;another:/path/to/another"
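To make the `SIGIL_REPO_MAP` format concrete, here is a rough sketch of how a `name:/path;name:/path` string could be parsed. The function name `parse_repo_map` is hypothetical, and the server's own parser may handle edge cases differently.

```python
import os
from pathlib import Path


def parse_repo_map(value: str) -> dict[str, Path]:
    """Parse 'name:/path;other:/path2' into {name: Path}."""
    repos: dict[str, Path] = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty input and trailing semicolons
        # Split on the first colon only, so the path part is kept whole.
        name, sep, path = entry.partition(":")
        if not sep or not name.strip() or not path.strip():
            raise ValueError(f"expected 'name:/path', got {entry!r}")
        repos[name.strip()] = Path(path.strip()).expanduser()
    return repos


if __name__ == "__main__":
    print(parse_repo_map(os.environ.get("SIGIL_REPO_MAP", "")))
```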
Running the Server
python server.py
On first run, OAuth credentials will be generated. Save the Client ID and Client Secret for connecting from ChatGPT.
Connecting to ChatGPT
Important: If you use Cloudflare Tunnel, you must disable Bot Fight Mode, or ChatGPT's OAuth flow will fail.
See Cloudflare OAuth Issue & Solution for details.
- Expose via ngrok: ngrok http 8000 (or use Cloudflare Tunnel)
- In ChatGPT, add MCP connector with OAuth authentication
- Use the OAuth credentials from server startup
- Start using: "Search my code for async functions"
Important: The server is configured for ChatGPT compatibility:
- DNS rebinding protection is disabled (ChatGPT sends ngrok Host headers)
- MCP endpoint mounted at root / (not /mcp)
- OAuth authentication remains active and required
See docs/CHATGPT_SETUP.md for detailed instructions.
Configuration
Using config.json
{
"server": {
"name": "sigil_repos",
"host": "127.0.0.1",
"port": 8000,
"log_level": "INFO"
},
"authentication": {
"enabled": true,
"oauth_enabled": true,
"allow_local_bypass": true,
"allowed_ips": []
},
"repositories": {
"repo_name": "/absolute/path/to/repo"
},
"watch": {
"enabled": true,
"debounce_seconds": 2.0,
"ignore_dirs": [".git", "__pycache__", "node_modules", "build"],
"ignore_extensions": [".pyc", ".so", ".pdf", ".png"]
},
"index": {
"path": "~/.sigil_index"
}
}
Using Environment Variables
export SIGIL_MCP_HOST=127.0.0.1
export SIGIL_MCP_PORT=8000
export SIGIL_MCP_AUTH_ENABLED=true
export SIGIL_MCP_OAUTH_ENABLED=true
export SIGIL_MCP_ALLOW_LOCAL_BYPASS=true
export SIGIL_MCP_WATCH_ENABLED=true
export SIGIL_MCP_WATCH_DEBOUNCE=2.0
export SIGIL_REPO_MAP="name1:/path/to/repo1;name2:/path/to/repo2"
export SIGIL_INDEX_PATH=~/.sigil_index
File Watching (Optional)
Enable automatic index updates when files change:
# Install watchdog
pip install ".[watch]"
# Enable in config.json or via environment
export SIGIL_MCP_WATCH_ENABLED=true
The server will:
- Granularly re-index individual files as they change (modified/created)
- Batch updates with configurable debounce (default 2 seconds)
- Smart filtering using configurable ignore patterns
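The debounce behavior described above can be illustrated with a small, dependency-free sketch. It does not use watchdog and is not the server's implementation. The `DebouncedBatch` class and its methods are hypothetical names, and the clock is injectable only so the example can be exercised without sleeping.

```python
import time
from typing import Callable


class DebouncedBatch:
    """Collect changed paths and release them once changes go quiet."""

    def __init__(
        self,
        debounce_seconds: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._debounce = debounce_seconds
        self._clock = clock
        self._pending: set[str] = set()
        self._last_event = 0.0

    def record(self, path: str) -> None:
        # Repeated saves of the same file collapse into one entry,
        # and each new event restarts the quiet period.
        self._pending.add(path)
        self._last_event = self._clock()

    def drain_if_quiet(self) -> list[str]:
        """Return the batch if no event arrived within the debounce window."""
        if not self._pending:
            return []
        if self._clock() - self._last_event < self._debounce:
            return []
        batch = sorted(self._pending)
        self._pending.clear()
        return batch
```

A watcher callback would call `record()` for each modified or created file, and a periodic task would call `drain_if_quiet()` and re-index whatever it returns. A real implementation would also need a lock if those two run on different threads.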
Configure what to ignore in config.json:
{
"watch": {
"enabled": true,
"debounce_seconds": 2.0,
"ignore_dirs": [".git", "__pycache__", "coverage", "htmlcov"],
"ignore_extensions": [".pyc", ".so", ".pdf", ".png", ".jpg"]
}
}
Environment variables:
export SIGIL_MCP_WATCH_ENABLED=true
export SIGIL_MCP_WATCH_DEBOUNCE=2.0
Authentication
OAuth 2.0 (Recommended for Remote Access)
OAuth credentials are generated on first run. Supports PKCE for enhanced security and token-based authentication with refresh capabilities. See docs/OAUTH_SETUP.md for details.
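For readers unfamiliar with PKCE, the following sketch shows the S256 challenge derivation defined in RFC 7636. The function names are illustrative and are not taken from the Sigil codebase; only the derivation itself (SHA-256 of the verifier, base64url-encoded without padding) comes from the RFC.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    # base64url without '=' padding, as RFC 7636 requires.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_code_verifier() -> str:
    # 32 random bytes encode to 43 characters, the minimum allowed length.
    return _b64url(secrets.token_bytes(32))


def make_code_challenge(verifier: str) -> str:
    # S256 method: the challenge is the hash of the verifier.
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


def verify(verifier: str, challenge: str) -> bool:
    # Constant-time comparison of the recomputed and stored challenge.
    return secrets.compare_digest(make_code_challenge(verifier), challenge)
```

The client sends the challenge with the authorization request and the verifier with the token request. An attacker who intercepts the authorization code cannot redeem it without the verifier.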
Local Development
Localhost connections bypass authentication when allow_local_bypass is true, as it is in the example configuration. No credentials are needed when connecting from 127.0.0.1.
API Key Fallback
export SIGIL_MCP_API_KEY=your_secure_key_here
See docs/SECURITY.md for security best practices.
Usage Examples
Once connected to ChatGPT as an MCP server:
You: "Index my project repository"
ChatGPT: Indexed 342 files, found 1,847 symbols in 3.2 seconds
You: "Find where the HttpClient class is defined"
ChatGPT: Found in project::src/http/client.py at line 45
You: "Search for async functions"
ChatGPT: Found 23 matches across 8 files
You: "Build vector index for semantic search"
ChatGPT: Indexed 856 chunks from 342 documents
You: "Find code that handles user authentication"
ChatGPT: Found 5 relevant code sections (semantic search):
- auth/handlers.py:45-145 (score: 0.89)
- middleware/auth.py:12-112 (score: 0.84)
...
Architecture
Indexing Process
- File scanning (skips build artifacts)
- Content storage with SHA-256 deduplication
- Symbol extraction via universal-ctags
- Trigram inverted index generation
- Compression using zlib
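The content storage and compression steps can be sketched as a content-addressed blob store. This is an assumption about the general approach, not the project's code: `store_blob` and `load_blob` are hypothetical helpers, and the on-disk naming (one file per hex digest) is chosen for illustration.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def store_blob(blob_dir: Path, content: bytes) -> str:
    """Write content once, keyed by its SHA-256 digest; return the digest."""
    digest = hashlib.sha256(content).hexdigest()
    target = blob_dir / digest
    if not target.exists():
        # Identical content hashes to the same name, so it is stored once.
        blob_dir.mkdir(parents=True, exist_ok=True)
        target.write_bytes(zlib.compress(content))
    return digest


def load_blob(blob_dir: Path, digest: str) -> bytes:
    return zlib.decompress((blob_dir / digest).read_bytes())


if __name__ == "__main__":
    blobs = Path(tempfile.mkdtemp()) / "blobs"
    key = store_blob(blobs, b"print('hello')\n")
    print(key, load_blob(blobs, key))
```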
Storage
~/.sigil_index/
├── repos.db # SQLite: repos, documents, symbols, embeddings
├── trigrams.db # SQLite: trigram inverted index
└── blobs/ # Compressed content
Performance
- Symbol lookup: O(log n) via SQLite indexes
- Text search: O(k), where k is the number of trigrams in the query multiplied by the number of documents per trigram
- Typical query latency: 10-100ms
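To show where the text-search cost comes from, here is a simplified in-memory trigram index. It is a sketch of the general technique only: the real index lives in SQLite, and the `TrigramIndex` class and its methods are hypothetical.

```python
from collections import defaultdict


def trigrams(text: str) -> set[str]:
    """Return every 3-character window of the text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    def __init__(self) -> None:
        self._postings: dict[str, set[int]] = defaultdict(set)
        self._docs: dict[int, str] = {}

    def add(self, doc_id: int, text: str) -> None:
        self._docs[doc_id] = text
        for gram in trigrams(text):
            self._postings[gram].add(doc_id)

    def search(self, query: str) -> list[int]:
        grams = trigrams(query)
        if not grams:
            # Queries shorter than three characters cannot use the index.
            candidates = set(self._docs)
        else:
            # Intersect posting sets, smallest first, to narrow candidates.
            postings = sorted(
                (self._postings.get(g, set()) for g in grams), key=len
            )
            candidates = set.intersection(*postings)
        # Sharing all trigrams is necessary but not sufficient, so each
        # candidate is confirmed with a real substring check.
        return sorted(d for d in candidates if query in self._docs[d])
```

The work is dominated by reading one posting set per query trigram, which is why the cost scales with trigrams times documents per trigram rather than with total corpus size.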
Security
Path Traversal Protection: All paths validated to prevent escaping repository roots
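One common way to implement this kind of check is sketched below. It illustrates the idea rather than the server's actual validation code, and `resolve_inside` is a hypothetical helper name.

```python
from pathlib import Path


def resolve_inside(repo_root: Path, requested: str) -> Path:
    """Resolve a requested path and reject it if it leaves repo_root."""
    root = repo_root.resolve()
    # resolve() collapses '..' segments and follows symlinks before the
    # check. An absolute 'requested' replaces root in the join and is
    # therefore rejected below as well.
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes repository root: {requested!r}")
    return candidate
```

Comparing resolved paths, rather than scanning the input string for `..`, also covers symlinks that point outside the repository.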
Authentication Layers: OAuth 2.0 (primary), Local bypass (localhost), API keys (fallback), IP whitelist (optional)
Protection: Source code requires authentication for remote access, OAuth credentials stored with 0600 permissions, tokens expire after 1 hour with refresh support, PKCE prevents authorization code interception
ChatGPT Compatibility: For ChatGPT MCP connector compatibility, DNS rebinding protection is disabled. This means:
- [NO] Host header validation: Disabled (accepts ngrok domains)
- [NO] Content-Type validation: Disabled (accepts application/octet-stream)
- [YES] OAuth 2.0 authentication: Active and required
- [YES] Bearer token validation: Active
- [YES] Token expiration: Enforced
See docs/SECURITY.md for detailed security documentation.
Troubleshooting
For detailed troubleshooting, see docs/TROUBLESHOOTING.md and docs/RUNBOOK.md.
Quick fixes:
"ctags not available": Install universal-ctags (see Quick Start). Text search works without it.
"No repositories configured": Set repositories in config.json or SIGIL_REPO_MAP environment variable.
"Authentication failed": For localhost, verify allow_local_bypass is true. For remote, verify OAuth credentials.
"watchdog not available": Install with pip install sigil-mcp-server[watch] to enable file watching.
More help: See comprehensive Troubleshooting Guide and Operations Runbook.
Documentation
Setup Guides
- ChatGPT Setup Guide
- OAuth Configuration
- Cloudflare Tunnel Deployment
- Security Best Practices
- Operations Runbook
- Troubleshooting Guide
- Llama.cpp Local Embeddings
Architecture Decision Records (ADRs)
- ADR-001: OAuth 2.0 Authentication
- ADR-002: Trigram-Based Indexing
- ADR-003: Symbol Extraction with Ctags
- ADR-004: JSON Configuration System
- ADR-005: FastMCP Custom Routes
- ADR-006: Vector Embeddings for Semantic Search
- ADR-007: File Watching
- ADR-008: Granular Re-indexing and Configurable Patterns
- ADR-009: ChatGPT MCP Connector Compatibility
Feature Documentation
Contributing
Contributions welcome! Please see CONTRIBUTING.md for guidelines including:
- Contributor License Agreement (CLA) - Required for all contributors
- Developer Certificate of Origin (DCO) requirements
- Code standards and testing requirements
- Pull request process
- Code of Conduct
Licensing
Sigil is dual-licensed:
- Open Source: Available under AGPLv3 for open-source projects and private use where source sharing requirements are met.
- Commercial: A commercial license is required for organizations who wish to run Sigil internally without open-sourcing their own applications or who need indemnification and support.
Contact me for commercial licensing options.
See LICENSE file for full AGPLv3 text.
Licensing FAQ
Q: Can I run this inside my company under AGPLv3?
A: Yes, as long as you're comfortable with AGPLv3 and its requirements. If you expose the server to users over a network (like running it as an internal service), AGPLv3 requires making the source code available to those users, including any modifications you've made.
Q: We have a "no AGPL" policy. Can we still use Sigil?
A: Yes, via a commercial license. Email davetmire85@gmail.com to discuss your needs.
Q: Why do I have to sign a CLA to contribute?
A: The Contributor License Agreement keeps the licensing story clean—AGPLv3 for the open-source community, commercial licenses for organizations that need them—without legal ambiguity about who owns what. Your contribution remains open-source under AGPLv3; the CLA just clarifies the rights.
Q: What's included in a commercial license?
A: Commercial licenses provide freedom to use Sigil internally without open-source requirements, ability to keep modifications proprietary, indemnification and support options, and clear legal status for enterprise compliance. Contact me for details and pricing.
Q: Can I use this for my personal projects?
A: Absolutely! AGPLv3 is perfect for personal projects, hobbyist use, and small teams. You only need a commercial license if you have organizational requirements that conflict with AGPL.
For more details on contributing, see CONTRIBUTING.md.
Acknowledgments
- Trigram indexing inspired by GitHub's Blackbird search engine
- Symbol extraction powered by Universal Ctags
- Built on the Model Context Protocol (MCP) specification
Support
- Issues: GitHub Issues
- Documentation: docs/
- Security: docs/SECURITY.md