MCP Server Code Extractor
A Model Context Protocol (MCP) server that provides precise code extraction tools using tree-sitter parsing. Extract functions, classes, and code snippets from 30+ programming languages without manual parsing.
Why MCP Server Code Extractor?
When working with AI coding assistants like Claude, you often need to:
- Extract specific functions or classes from large codebases
- Get an overview of what's in a file without reading the entire thing
- Retrieve precise code snippets with accurate line numbers
- Avoid manual parsing and grep/sed/awk gymnastics
MCP Server Code Extractor solves these problems by providing structured, tree-sitter-powered code extraction tools directly within your AI assistant.
Features
- 🎯 Precise Extraction: Uses tree-sitter parsing for accurate code boundary detection
- 🌍 30+ Languages: Supports Python, JavaScript, TypeScript, Go, Rust, Java, C/C++, and many more
- 📍 Line Numbers: Every extraction includes precise line number information
- 🔍 Code Discovery: List all functions and classes in a file before extracting
- 🌐 URL Support: Fetch and extract code from GitHub, GitLab, and direct file URLs
- 🔄 Git Integration: Extract code from any git revision, branch, or tag
- ⚡ Fast & Lightweight: Efficient caching and minimal dependencies
- 🤖 AI-Optimized: Designed specifically for use with AI coding assistants
Installation
Quick Start with UV (Recommended)
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone this repository
git clone https://github.com/ctoth/mcp_server_code_extractor
cd mcp_server_code_extractor
# Run directly with UV (no installation needed!)
uv run mcp_server_code_extractor.py
Traditional Installation
# Quote the extras specifier so shells such as zsh do not treat [cli] as a glob
pip install "mcp[cli]" tree-sitter-languages tree-sitter==0.21.3
Configure with Claude Desktop
Add to your Claude Desktop configuration:
{
"mcpServers": {
"mcp-server-code-extractor": {
"command": "uv",
"args": ["run", "/path/to/mcp_server_code_extractor.py"]
}
}
}
Or with traditional Python:
{
"mcpServers": {
"mcp-server-code-extractor": {
"command": "python",
"args": ["/path/to/mcp_server_code_extractor.py"]
}
}
}
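If the configuration file already lists other servers, the entry can be merged in rather than pasted by hand. The sketch below shows one way to do that on an in-memory dictionary. `add_server_entry` is a hypothetical helper, not part of this project, and the script path is the same placeholder used above.

```python
import json


def add_server_entry(config: dict, script_path: str, use_uv: bool = True) -> dict:
    """Return a copy of config with the code-extractor entry added (hypothetical helper)."""
    # Copy the existing server map so other entries are preserved untouched.
    servers = dict(config.get("mcpServers", {}))
    if use_uv:
        entry = {"command": "uv", "args": ["run", script_path]}
    else:
        entry = {"command": "python", "args": [script_path]}
    servers["mcp-server-code-extractor"] = entry
    return {**config, "mcpServers": servers}


# An existing configuration with one unrelated (made-up) server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_server_entry(existing, "/path/to/mcp_server_code_extractor.py")
print(json.dumps(updated, indent=2))
```

Pass `use_uv=False` to produce the traditional Python variant instead.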
Available Tools
1. get_symbols - Discover Code Structure
List all functions, classes, and other symbols in a file.
Returns:
- name: Symbol name
- type: function/class/method/etc
- start_line/end_line: Line numbers
- preview: First line of the symbol
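To make the return shape concrete, here is a minimal sketch that produces records with the same four fields for Python source. It uses the standard-library `ast` module as a stand-in for tree-sitter, so it illustrates the output only and is not the server's implementation. `list_symbols` and `SAMPLE` are hypothetical names.

```python
import ast


def list_symbols(source: str) -> list[dict]:
    """List functions and classes with 1-based line ranges (Python source only)."""
    lines = source.splitlines()
    symbols = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            kind = "class"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Simplification: methods are also reported as "function" here.
            kind = "function"
        else:
            continue
        symbols.append({
            "name": node.name,
            "type": kind,
            "start_line": node.lineno,
            "end_line": node.end_lineno,
            "preview": lines[node.lineno - 1].strip(),
        })
    # ast.walk is breadth-first, so sort to get file order.
    return sorted(symbols, key=lambda s: s["start_line"])


SAMPLE = '''class User:
    def greet(self):
        return "hi"

def process_data(path):
    return path
'''

for sym in list_symbols(SAMPLE):
    print(sym)
```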
2. get_function - Extract Complete Functions
Extract a complete function with all its code.
Parameters:
- file_path: Path to the source file
- function_name: Name of the function to extract
Returns:
- code: Complete function code
- start_line/end_line: Precise boundaries
- language: Detected language
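The idea of extracting a function by its syntactic boundaries, rather than by text matching, can be sketched for Python with the standard-library `ast` module. This is an illustration under assumptions and not the server's tree-sitter code: `extract_function` is a hypothetical name, and starting the extract at the first decorator is a choice made for this sketch.

```python
import ast


def extract_function(source: str, function_name: str) -> dict:
    """Return the code and 1-based line range of the first matching function."""
    lines = source.splitlines()
    for node in ast.walk(ast.parse(source)):
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name == function_name:
            # Assumption of this sketch: begin at the first decorator, if any.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            end = node.end_lineno
            return {
                "code": "\n".join(lines[start - 1:end]),
                "start_line": start,
                "end_line": end,
                "language": "python",
            }
    raise LookupError(f"function {function_name!r} not found")


SAMPLE = '''from functools import cache

@cache
def helper(x):
    return x + 1

def main():
    return helper(1)
'''

result = extract_function(SAMPLE, "helper")
print(result["start_line"], result["end_line"])
print(result["code"])
```

Because the boundaries come from the parse tree, the blank line and the following `main` function are never included in the extract.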
3. get_class - Extract Complete Classes
Extract an entire class definition including all methods.
Parameters:
- file_path: Path to the source file
- class_name: Name of the class to extract
Returns:
- code: Complete class code
- start_line/end_line: Precise boundaries
- language: Detected language
4. get_lines - Extract Specific Line Ranges
Get exact line ranges when you know the line numbers.
Parameters:
- file_path: Path to the source file
- start_line: Starting line (1-based)
- end_line: Ending line (inclusive)
Returns:
- code: Extracted lines
- line numbers and metadata
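The 1-based, inclusive convention is easy to get wrong by one, so here is a small sketch of the index arithmetic. `extract_lines` is a hypothetical name, and clamping `end_line` to the file length is an assumption of this sketch, not documented behavior of the tool.

```python
def extract_lines(source: str, start_line: int, end_line: int) -> dict:
    """Return lines start_line..end_line, 1-based and inclusive on both ends."""
    lines = source.splitlines()
    if start_line < 1 or end_line < start_line or start_line > len(lines):
        raise ValueError("expected 1 <= start_line <= end_line within the file")
    # Assumption: an end_line past the end of the file is clamped, not an error.
    end = min(end_line, len(lines))
    # A 1-based inclusive range [start, end] is the 0-based slice [start - 1:end].
    selected = lines[start_line - 1:end]
    return {
        "code": "\n".join(selected),
        "start_line": start_line,
        "end_line": end,
        "total_lines": len(lines),
    }


TEXT = "a\nb\nc\nd\ne"
result = extract_lines(TEXT, 2, 4)
print(result["code"])
```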
5. get_signature - Get Function Signatures
Quickly get just the function signature without the body.
Parameters:
- file_path: Path to the source file
- function_name: Name of the function
Returns:
- signature: Function signature only
- start_line: Where the function starts
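As an illustration of separating a signature from its body, the sketch below rebuilds a Python signature from the parse tree using the standard-library `ast` module (Python 3.9+ for `ast.unparse`). It is a stand-in for the tree-sitter approach, and `extract_signature` is a hypothetical name. Note that `ast.unparse` normalizes formatting, so the result may differ in whitespace from the original source.

```python
import ast


def extract_signature(source: str, function_name: str) -> dict:
    """Return a function's signature (no body) and its 1-based start line."""
    for node in ast.walk(ast.parse(source)):
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name == function_name:
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            # Regenerate the parameter list from the AST instead of slicing text.
            signature = f"{prefix} {node.name}({ast.unparse(node.args)})"
            if node.returns is not None:
                signature += f" -> {ast.unparse(node.returns)}"
            return {"signature": signature + ":", "start_line": node.lineno}
    raise LookupError(f"function {function_name!r} not found")


SAMPLE = '''def process_data(input_file: str, output_dir: Path) -> Dict[str, Any]:
    """Docstring that should not appear in the signature."""
    return {}
'''

print(extract_signature(SAMPLE, "process_data"))
```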
Usage Examples
Example 1: Exploring Local Files
# First, see what's in the file
symbols = get_symbols("src/main.py")
# Returns: List of all functions and classes with line numbers
# Extract a specific function
result = get_function("src/main.py", "process_data")
# Returns: Complete function code with line numbers
# Get just a function signature
sig = get_signature("src/main.py", "process_data")
# Returns: "def process_data(input_file: str, output_dir: Path) -> Dict[str, Any]:"
Example 2: Working with URLs
# Explore a GitHub file
symbols = get_symbols("https://raw.githubusercontent.com/user/repo/main/src/api.py")
# Extract function from GitLab
result = get_function("https://gitlab.com/user/project/-/raw/main/utils.py", "helper_func")
# Get lines from any URL
lines = get_lines("https://example.com/code/script.py", 10, 25)
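Since the same `file_path` argument accepts both local paths and URLs, a tool like this has to tell the two apart somehow. How the server actually does so is not described here. The sketch below shows one plausible check with the standard-library `urllib.parse`, and `is_remote` is a hypothetical name.

```python
from urllib.parse import urlparse


def is_remote(file_path: str) -> bool:
    """True for http(s) URLs, False for anything treated as a local path."""
    parsed = urlparse(file_path)
    # Requiring a host avoids misreading Windows drive letters ("C:") as schemes.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


for candidate in (
    "https://raw.githubusercontent.com/user/repo/main/src/api.py",
    "src/main.py",
    "C:\\code\\main.py",
):
    print(candidate, is_remote(candidate))
```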
Example 3: Working with Classes
# Extract an entire class (local file)
result = get_class("models/user.py", "User")
# Returns: Complete User class with all methods
# Extract class from URL
result = get_class("https://raw.githubusercontent.com/user/repo/main/models.py", "DatabaseModel")
# Get specific lines (e.g., just the __init__ method)
lines = get_lines("models/user.py", 10, 25)
# Returns: Lines 10-25 of the file
Example 4: Multi-Language Support
// Works with JavaScript/TypeScript
symbols = get_symbols("app.ts")
func = get_function("app.ts", "handleRequest")
// Works with Go
symbols = get_symbols("main.go")
method = get_function("main.go", "ServeHTTP")
Supported Languages
- Python, JavaScript, TypeScript, JSX/TSX
- Go, Rust, C, C++, C#, Java
- Ruby, PHP, Swift, Kotlin, Scala
- Bash, PowerShell, SQL
- Haskell, OCaml, Elixir, Clojure
- And many more...
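Each extraction result reports a detected language. The detection method is not documented in this README, so the following is only a sketch of the common extension-based approach. The table, the language labels, and `detect_language` are all hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional

# Illustrative subset only; the labels are assumptions, not the server's identifiers.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".ts": "typescript",
    ".go": "go",
    ".rs": "rust",
    ".java": "java",
    ".rb": "ruby",
}


def detect_language(file_path: str) -> Optional[str]:
    """Guess a language from the file extension, or None if it is unknown."""
    suffix = PurePosixPath(file_path).suffix.lower()
    return EXTENSION_TO_LANGUAGE.get(suffix)


for name in ("app.ts", "main.go", "README"):
    print(name, detect_language(name))
```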
Best Practices
- Always use get_symbols first when exploring a new file
- Use get_function/get_class instead of reading entire files
- Use get_lines when you know exact line numbers
- Use get_signature for quick API exploration
Why Not Just Use Read?
Traditional file reading tools require you to:
- Read entire files (inefficient for large files)
- Manually parse code to find functions/classes
- Count lines manually for extraction
- Deal with complex syntax and edge cases
MCP Server Code Extractor:
- ✅ Extracts exactly what you need
- ✅ Provides structured data with metadata
- ✅ Handles complex syntax automatically
- ✅ Works across 30+ languages consistently
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT License - see LICENSE file for details.
Acknowledgments
- Built on tree-sitter for robust parsing
- Uses tree-sitter-languages for language support
- Implements the Model Context Protocol specification