Turn internal code libraries into AI-accessible knowledge sources via MCP server with semantic search
SimpleCodeMCP
Turn your internal code libraries into AI-accessible knowledge sources.
SimpleCodeMCP is an open-source tool that indexes internal code libraries and exposes them through a Model Context Protocol (MCP) server. This enables AI coding agents (Claude, GitHub Copilot, etc.) to provide precise, context-aware assistance for company-internal libraries—even when documentation is sparse or outdated.
The Problem
In many organizations:
- Team A builds internal libraries (e.g., pythonpackage1)
- Team B uses these libraries to implement new software
- Documentation is often incomplete, outdated, or missing
- This leads to frequent misuse, implementation errors, and repeated questions
The Solution
SimpleCodeMCP scans and indexes your internal library's:
- Public and internal APIs
- Function/class signatures and type hints
- Docstrings and comments
- Tests (as usage examples)
- Code structure and relationships
It then exposes this knowledge through an MCP server that AI agents can query to:
- List available functions and classes
- Inspect signatures and behavior
- Retrieve real usage examples from tests
- Search relevant parts of the codebase semantically
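The extraction step described above can be illustrated with Python's standard `ast` module. This is a minimal sketch, not SimpleCodeMCP's actual indexer: the function name `index_source`, the shape of the returned entries, and the sample source are all assumptions made for illustration.

```python
import ast

# Sample library source, invented for this sketch.
SOURCE = '''
def calculate_total(items, tax_rate: float = 0.19):
    """Calculate the total price including tax."""
    return sum(items) * (1 + tax_rate)

class Order:
    """A customer order."""
    def _recalculate(self):
        pass
'''


def index_source(source: str, include_private: bool = True) -> list:
    """Collect name, kind, signature, docstring and line for each def/class."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
            # ast.unparse (Python 3.9+) renders the argument list as source text.
            signature = f"{node.name}({ast.unparse(node.args)})"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
            signature = node.name
        else:
            continue
        # Names with a leading underscore are treated as private.
        if node.name.startswith("_") and not include_private:
            continue
        entries.append({
            "name": node.name,
            "kind": kind,
            "signature": signature,
            "docstring": ast.get_docstring(node),
            "line": node.lineno,
        })
    return entries


entries = index_source(SOURCE)
for entry in entries:
    print(entry["kind"], entry["signature"], "line", entry["line"])
```

Because the source is parsed rather than imported, a sketch like this never executes library code, which is usually what you want when indexing an unfamiliar repository.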
Architecture
┌─────────────────────────────────────────────────────────┐
│ Your Library Repository │
│ ├── src/ (source code) │
│ ├── tests/ (usage examples) │
│ └── examples/ (additional examples) │
└────────────────────┬────────────────────────────────────┘
│
▼
┌──────────────────────┐
│ Indexer Component │
│ ───────────────── │
│ • AST Parser │ Extract structure
│ • Docstring Parser │ Extract documentation
│ • Test Parser │ Find usage patterns
│ • Static Analyzer │ Infer types & relationships
└──────────┬───────────┘
│
▼
┌──────────────────────┐
│ Storage Layer │
│ ───────────────── │
│ • ChromaDB │ Semantic search (embeddings)
│ • Metadata Store │ Signatures, paths, etc.
└──────────┬───────────┘
│
▼
┌──────────────────────┐
│ MCP Server │
│ (FastAPI + MCP SDK) │
│ ───────────────── │
│ Available Tools: │
│ • list_api │
│ • get_signature │
│ • search_code │
│ • get_examples │
│ • get_tests │
└──────────┬───────────┘
│
▼
┌──────────────────────┐
│ AI Agent (Client) │
│ Claude, Copilot, │
│ or any MCP client │
└──────────────────────┘
Features
Multi-Language Support
- Python (MVP with full AST parsing, type inference)
- C++ (planned)
- JavaScript/TypeScript (planned)
- Java (planned)
- Go (planned)
- Extensible architecture for additional languages
Multiple Embedding Providers
- Local (sentence-transformers) - Free, offline, privacy-friendly
- OpenAI - High quality, fast, cloud-based
- Azure OpenAI - Enterprise support, data residency control
See EMBEDDING_PROVIDERS.md for a detailed comparison and setup instructions.
MCP Tools
The server exposes the following tools to AI agents:
list_api
Lists all available functions, classes, and modules in the library.
Parameters:
module (optional): Filter by a specific module/namespace
Returns:
{
"functions": ["calculate_total", "validate_email"],
"classes": ["User", "Order"],
"modules": ["core", "utils", "api"]
}
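As a rough illustration of how a response of this shape could be assembled, the sketch below filters a flat list of index entries by module. The `INDEX` records and their `kind`/`module` fields are hypothetical and only stand in for whatever the metadata store actually holds.

```python
from typing import Optional

# Hypothetical index entries; the real metadata store may look different.
INDEX = [
    {"name": "calculate_total", "kind": "function", "module": "core"},
    {"name": "validate_email", "kind": "function", "module": "utils"},
    {"name": "User", "kind": "class", "module": "core"},
    {"name": "Order", "kind": "class", "module": "core"},
]


def list_api(index: list, module: Optional[str] = None) -> dict:
    """Group entry names by kind, optionally restricted to one module."""
    selected = [e for e in index if module is None or e["module"] == module]
    return {
        "functions": [e["name"] for e in selected if e["kind"] == "function"],
        "classes": [e["name"] for e in selected if e["kind"] == "class"],
        "modules": sorted({e["module"] for e in selected}),
    }


print(list_api(INDEX))
print(list_api(INDEX, module="utils"))
```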
get_signature
Retrieves detailed signature information for a function or class.
Parameters:
name: Function or class name
Returns:
{
"name": "calculate_total",
"signature": "calculate_total(items: List[Item], tax_rate: float = 0.19) -> Decimal",
"docstring": "Calculate the total price including tax...",
"file": "src/billing.py",
"line": 45,
"parameters": [
{"name": "items", "type": "List[Item]", "required": true},
{"name": "tax_rate", "type": "float", "default": "0.19"}
],
"return_type": "Decimal"
}
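A record like the one above can be approximated with the standard `inspect` module when the function is importable. This is only a sketch under that assumption: the architecture describes an AST-based indexer, and the helper name `describe` is invented here. Type names come out fully qualified (for example `decimal.Decimal`), so the strings differ slightly from the sample response.

```python
import inspect
from decimal import Decimal
from typing import List


def calculate_total(items: List[Decimal], tax_rate: float = 0.19) -> Decimal:
    """Calculate the total price including tax."""
    return sum(items, Decimal(0)) * (1 + Decimal(str(tax_rate)))


def describe(func) -> dict:
    """Build a get_signature-style record for a live function object."""
    sig = inspect.signature(func)
    parameters = []
    for param in sig.parameters.values():
        entry = {"name": param.name}
        if param.annotation is not inspect.Parameter.empty:
            entry["type"] = inspect.formatannotation(param.annotation)
        # Parameters without a default are treated as required
        # (this sketch does not special-case *args or **kwargs).
        if param.default is inspect.Parameter.empty:
            entry["required"] = True
        else:
            entry["default"] = repr(param.default)
        parameters.append(entry)
    return {
        "name": func.__name__,
        "signature": f"{func.__name__}{sig}",
        "docstring": inspect.getdoc(func),
        "line": func.__code__.co_firstlineno,
        "parameters": parameters,
    }


record = describe(calculate_total)
print(record["signature"])
```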
search_code
Semantic search across the codebase using natural language.
Parameters:
query: Natural language query (e.g., "How do I validate an email?")
limit (optional): Maximum results (default: 10)
Returns:
{
"results": [
{
"name": "validate_email",
"relevance_score": 0.92,
"signature": "validate_email(email: str) -> bool",
"docstring": "Validates email format using regex...",
"file": "src/utils/validation.py"
}
]
}
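To show where a relevance score can come from, here is a toy ranking that uses bag-of-words cosine similarity. It is a deliberately simple stand-in for real embeddings (the project lists sentence-transformers and OpenAI for that), and the `DOCS` dictionary is made-up sample data.

```python
import math
import re
from collections import Counter

# Made-up sample of indexed names and docstrings.
DOCS = {
    "validate_email": "Validates email format using regex",
    "calculate_total": "Calculate the total price including tax",
}


def vectorize(text: str) -> Counter:
    """Turn text into a word-count vector (a crude substitute for an embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search_code(query: str, limit: int = 10) -> dict:
    """Score every entry against the query and return the best matches first."""
    q = vectorize(query)
    results = []
    for name, docstring in DOCS.items():
        text = name.replace("_", " ") + " " + docstring
        results.append({
            "name": name,
            "relevance_score": round(cosine(q, vectorize(text)), 2),
            "docstring": docstring,
        })
    results.sort(key=lambda r: r["relevance_score"], reverse=True)
    return {"results": results[:limit]}


print(search_code("How do I validate an email?", limit=1))
```

Word overlap misses synonyms and paraphrases, which is the gap embedding models are meant to close.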
get_examples
Retrieves usage examples from tests and example files.
Parameters:
name: Function or class name
Returns:
{
"examples": [
{
"source": "tests/test_billing.py",
"code": "result = calculate_total(items=[item1, item2], tax_rate=0.19)\nassert result == Decimal('119.00')",
"description": "Basic usage with two items"
}
]
}
get_tests
Retrieves all tests related to a function or class.
Parameters:
name: Function or class name
Returns:
{
"tests": [
{
"test_name": "test_calculate_total_with_default_tax",
"file": "tests/test_billing.py",
"line": 12,
"code": "..."
}
]
}
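One plausible way to relate tests to a function is to parse the test file and keep the test functions that call it by name. The sketch below makes that assumption and handles only top-level `test_*` functions and direct calls. `find_tests`, the sample test source, and the default file path are illustrative, not the project's implementation.

```python
import ast

# Sample test module, invented for this sketch.
TEST_SOURCE = '''
def test_calculate_total_with_default_tax():
    result = calculate_total(items=[item1, item2])
    assert result == expected

def test_validate_email_rejects_missing_at():
    assert not validate_email("no-at-sign")
'''


def find_tests(source: str, name: str, file: str = "tests/test_billing.py") -> dict:
    """Return top-level test functions that directly call `name`."""
    tests = []
    for node in ast.parse(source).body:
        if not (isinstance(node, ast.FunctionDef) and node.name.startswith("test_")):
            continue
        # Collect the names of plain function calls inside this test.
        called = {
            n.func.id
            for n in ast.walk(node)
            if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
        }
        if name in called:
            tests.append({
                "test_name": node.name,
                "file": file,
                "line": node.lineno,
                "code": ast.get_source_segment(source, node),
            })
    return {"tests": tests}


found = find_tests(TEST_SOURCE, "calculate_total")
print(found["tests"][0]["test_name"], "at line", found["tests"][0]["line"])
```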
Installation
For Users
pip install simplecode-mcp
For Development
This project uses uv for fast Python package management:
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone the repository
git clone https://github.com/yourusername/simplecode-mcp.git
cd simplecode-mcp
# Install dependencies and create virtual environment
uv sync
# Activate the virtual environment
source .venv/bin/activate # On Unix/macOS
# or
.venv\Scripts\activate # On Windows
Quick Start
1. Index Your Library
Create a configuration file simplecode_mcp.yaml:
library:
  name: "my-internal-lib"
  path: "/path/to/my-internal-lib"
  language: "python"        # python, c++, javascript, typescript, java, go
  include_private: true     # Index _internal functions too
indexing:
  trigger: "manual"         # manual | on_commit | watch
  embedding_model: "local"  # local (sentence-transformers) | openai
server:
  host: "localhost"
  port: 8000
  auth: null                # Optional: bearer_token for authentication
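Once the YAML above has been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`), its values can be checked before indexing starts. The sketch below works on a plain dict so it runs without third-party packages. The `Settings` class, the `load_settings` helper, and the fallback defaults are assumptions, not the tool's documented behavior.

```python
from dataclasses import dataclass

ALLOWED_TRIGGERS = {"manual", "on_commit", "watch"}
ALLOWED_EMBEDDINGS = {"local", "openai"}


@dataclass
class Settings:
    name: str
    path: str
    language: str
    include_private: bool
    trigger: str
    embedding_model: str
    host: str
    port: int


def load_settings(raw: dict) -> Settings:
    """Validate a parsed config dict; defaults here are assumed, not documented."""
    library = raw["library"]
    indexing = raw.get("indexing", {})
    server = raw.get("server", {})
    settings = Settings(
        name=library["name"],
        path=library["path"],
        language=library.get("language", "python"),
        include_private=bool(library.get("include_private", False)),
        trigger=indexing.get("trigger", "manual"),
        embedding_model=indexing.get("embedding_model", "local"),
        host=server.get("host", "localhost"),
        port=int(server.get("port", 8000)),
    )
    if settings.trigger not in ALLOWED_TRIGGERS:
        raise ValueError(f"unknown indexing trigger: {settings.trigger!r}")
    if settings.embedding_model not in ALLOWED_EMBEDDINGS:
        raise ValueError(f"unknown embedding model: {settings.embedding_model!r}")
    return settings


# The same values as the YAML example, already parsed into a dict.
RAW = {
    "library": {"name": "my-internal-lib", "path": "/path/to/my-internal-lib",
                "language": "python", "include_private": True},
    "indexing": {"trigger": "manual", "embedding_model": "local"},
    "server": {"host": "localhost", "port": 8000, "auth": None},
}
settings = load_settings(RAW)
print(settings)
```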
Index your library:
simplecode-mcp reindex
2. Start the MCP Server
simplecode-mcp serve
The server will start on http://localhost:8000.
3. Connect Your AI Agent
Add the MCP server to your agent's configuration:
For Claude Desktop (claude_desktop_config.json):
{
"mcpServers": {
"my-internal-lib": {
"command": "simplecode-mcp",
"args": ["serve"],
"cwd": "/path/to/my-internal-lib"
}
}
}
For GitHub Copilot (mcp.json):
{
"servers": {
"my-internal-lib": {
"url": "http://localhost:8000"
}
}
}
4. Use in Your IDE
Your AI agent can now answer questions like:
- "How do I use the calculate_total function?"
- "Show me examples of email validation"
- "What parameters does the User class constructor take?"
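Behind a question like these, the agent issues an MCP tool call. MCP messages are JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The sketch below only builds such a request body; the helper name `tool_call` is invented here, and the transport and initialization handshake are left out.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def tool_call(name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request that invokes one MCP tool."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


message = tool_call(
    "search_code",
    {"query": "Show me examples of email validation", "limit": 5},
)
print(message)
```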
Configuration Options
Indexing Triggers
- manual: Run simplecode-mcp reindex manually
- on_commit: Automatically reindex on git commits (via git hook)
- watch: Watch for file changes and reindex automatically
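For intuition about the watch trigger, change detection can be approximated by comparing modification-time snapshots of the source tree. This is a polling sketch using only the standard library. How SimpleCodeMCP actually watches files is not stated in this document, and `snapshot` and `changed_files` are hypothetical helpers.

```python
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict:
    """Map every Python file under root to its last-modified time in nanoseconds."""
    return {path: path.stat().st_mtime_ns for path in root.rglob("*.py")}


def changed_files(before: dict, after: dict) -> set:
    """Files that are new, modified, or deleted between two snapshots."""
    modified = {path for path, mtime in after.items() if before.get(path) != mtime}
    deleted = before.keys() - after.keys()
    return modified | deleted


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.py").write_text("x = 1\n")
    before = snapshot(root)
    (root / "b.py").write_text("y = 2\n")  # a file added after the first snapshot
    after = snapshot(root)
    changed = {path.name for path in changed_files(before, after)}

print(changed)
```

A watcher would take a fresh snapshot on a timer and reindex only the files this comparison reports.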
Embedding Models
- local: Use sentence-transformers (e.g., all-MiniLM-L6-v2)
  - Pros: No external API calls, works offline
  - Cons: Slower, lower quality for complex queries
- openai: Use OpenAI's embedding API
  - Pros: Fast, high quality
  - Cons: Requires API key, not fully offline
Authentication
For internal company use, you can enable bearer token authentication:
server:
  auth:
    type: "bearer"
    token: "your-secret-token"
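On the server side, checking a bearer token comes down to parsing the `Authorization` header and comparing the token in constant time. The sketch below shows that check in isolation. `is_authorized` is a hypothetical helper, not part of SimpleCodeMCP's API.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Accept only 'Bearer <token>' headers whose token matches the configured one."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())


print(is_authorized("Bearer your-secret-token", "your-secret-token"))
print(is_authorized("Bearer wrong-token", "your-secret-token"))
```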
Use Cases
1. Library Owner Perspective
You maintain an internal Python package used by 10 teams. Instead of answering the same questions repeatedly:
- Run SimpleCodeMCP once on your library
- Share the MCP server endpoint with consumer teams
- Their AI agents can now answer questions about your library autonomously
2. Library Consumer Perspective
You're implementing a new feature using an unfamiliar internal library:
- Connect your AI agent to the library's MCP server
- Ask: "How do I authenticate with the internal API?"
- Get instant, accurate examples from the library's tests
3. Onboarding New Developers
New team members can explore internal libraries through their AI assistant without digging through outdated wikis or bothering senior developers.
Roadmap
MVP (v0.1)
- Python support (AST parsing, docstrings, tests)
- Manual indexing trigger
- Local embedding model (sentence-transformers)
- Basic MCP tools (list_api, get_signature, search_code, get_examples)
- YAML configuration
Future Versions
- Incremental indexing (only changed files)
- C++ support
- JavaScript/TypeScript support
- Git hook for automatic reindexing
- File watcher mode
- OpenAI embedding support
- Advanced relevance scoring
- Multi-version support (index v1.x and v2.x simultaneously)
- Web UI for browsing indexed libraries
- Integration with internal documentation systems
Contributing
Contributions are welcome! This project uses uv for dependency management.
Setup Development Environment
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone and setup
git clone https://github.com/yourusername/simplecode-mcp.git
cd simplecode-mcp
uv sync
# Run tests
uv run pytest
# Run the CLI in development mode
uv run simplecode-mcp --help
Areas We'd Love Help With
- Support for additional languages (JS/TS, Java, Go, Rust)
- Better test example extraction
- Performance optimizations for large codebases
- Alternative embedding models
License
MIT License - see LICENSE for details.
Why "SimpleCodeMCP"?
Because complex internal libraries deserve simple, accessible knowledge interfaces. No more outdated docs, no more digging through source code—just ask your AI agent.
Built for teams that move fast and break things (but want to break fewer things).