MCP Server Code Extractor

A Model Context Protocol (MCP) server that provides precise code extraction tools using tree-sitter parsing. Extract functions, classes, and code snippets from 30+ programming languages without manual parsing.

Why MCP Server Code Extractor?

When working with AI coding assistants like Claude, you often need to:

  • Extract specific functions or classes from large codebases
  • Get an overview of a file's contents without reading the whole file
  • Retrieve precise code snippets with accurate line numbers
  • Avoid manual parsing and grep/sed/awk gymnastics

MCP Server Code Extractor solves these problems by providing structured, tree-sitter-powered code extraction tools directly within your AI assistant.

Features

  • 🎯 Precise Extraction: Uses tree-sitter parsing for accurate code boundary detection
  • 🌍 30+ Languages: Supports Python, JavaScript, TypeScript, Go, Rust, Java, C/C++, and many more
  • 📍 Line Numbers: Every extraction includes precise line number information
  • 🔍 Code Discovery: List all functions and classes in a file before extracting
  • 🌐 URL Support: Fetch and extract code from GitHub, GitLab, and direct file URLs
  • 🔄 Git Integration: Extract code from any git revision, branch, or tag
  • ⚡ Fast & Lightweight: Efficient caching and minimal dependencies
  • 🤖 AI-Optimized: Designed specifically for use with AI coding assistants

Installation

Quick Start with UV (Recommended)

# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone this repository
git clone https://github.com/ctoth/mcp_server_code_extractor
cd mcp_server_code_extractor

# Run directly with UV (no installation needed!)
uv run mcp_server_code_extractor.py

Traditional Installation

# Quote the extras specifier so shells such as zsh do not treat [cli] as a glob pattern
pip install "mcp[cli]" tree-sitter-languages tree-sitter==0.21.3

Configure with Claude Desktop

Add to your Claude Desktop configuration:

{
  "mcpServers": {
    "mcp-server-code-extractor": {
      "command": "uv",
      "args": ["run", "/path/to/mcp_server_code_extractor.py"]
    }
  }
}

Or with traditional Python:

{
  "mcpServers": {
    "mcp-server-code-extractor": {
      "command": "python",
      "args": ["/path/to/mcp_server_code_extractor.py"]
    }
  }
}
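
The two configurations above differ only in their command and args values. As a rough sketch, the entry could be generated from a script path so that the path is always absolute. The helper name build_config is hypothetical and is not part of this project; where Claude Desktop stores its configuration file is not covered here.

```python
import json
from pathlib import Path


def build_config(script_path: str, use_uv: bool = True) -> dict:
    """Build the mcpServers entry shown above (hypothetical helper)."""
    # Resolve to an absolute path so the entry does not depend on the working directory.
    script = str(Path(script_path).expanduser().resolve())
    if use_uv:
        entry = {"command": "uv", "args": ["run", script]}
    else:
        entry = {"command": "python", "args": [script]}
    return {"mcpServers": {"mcp-server-code-extractor": entry}}


# Print the UV variant; pass use_uv=False for the plain Python variant.
print(json.dumps(build_config("mcp_server_code_extractor.py"), indent=2))
```

Paste the printed JSON into your configuration, merging it with any existing mcpServers entries by hand.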

Available Tools

1. get_symbols - Discover Code Structure

List all functions, classes, and other symbols in a file.

Returns:
- name: Symbol name
- type: function/class/method/etc
- start_line/end_line: Line numbers
- preview: First line of the symbol
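
To make the result shape concrete, here is a minimal sketch that produces entries with the four fields listed above. It uses Python's built-in ast module as a stand-in for tree-sitter, so it only handles Python source, and it labels methods as plain functions. The helper name list_symbols is hypothetical and is not this server's implementation.

```python
import ast


def list_symbols(source: str) -> list[dict]:
    """Return one entry per function or class, using the documented field names."""
    lines = source.splitlines()
    symbols = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"  # this sketch does not distinguish methods
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue
        symbols.append({
            "name": node.name,
            "type": kind,
            "start_line": node.lineno,      # 1-based
            "end_line": node.end_lineno,    # inclusive
            "preview": lines[node.lineno - 1].strip(),
        })
    # ast.walk is breadth-first, so sort to get file order.
    return sorted(symbols, key=lambda s: s["start_line"])


SAMPLE = '''class User:
    def greet(self):
        return "hi"


def main():
    print(User().greet())
'''

for symbol in list_symbols(SAMPLE):
    print(symbol)
```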

2. get_function - Extract Complete Functions

Extract a complete function with all its code.

Parameters:
- file_path: Path to the source file
- function_name: Name of the function to extract

Returns:
- code: Complete function code
- start_line/end_line: Precise boundaries
- language: Detected language
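
The idea of parser-derived boundaries can be sketched as follows. This is an illustration only: it relies on Python's standard ast module rather than tree-sitter, works on Python source alone, and the function name extract_function is an assumption made for the example.

```python
import ast


def extract_function(source: str, function_name: str):
    """Return code plus 1-based inclusive boundaries for the first matching function."""
    lines = source.splitlines()
    for node in ast.walk(ast.parse(source)):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name == function_name:
            return {
                # Whole lines are kept so the original indentation survives.
                # Decorator lines above the def are not included in this sketch.
                "code": "\n".join(lines[node.lineno - 1:node.end_lineno]),
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                "language": "python",
            }
    return None  # no function with that name


SAMPLE = '''import os

def process_data(path):
    name = os.path.basename(path)
    return name.upper()

def unused():
    pass
'''

result = extract_function(SAMPLE, "process_data")
print(result["start_line"], result["end_line"])
print(result["code"])
```

Because the boundaries come from the syntax tree, the blank line and the second function that follow process_data are left out without any line counting by hand.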

3. get_class - Extract Complete Classes

Extract an entire class definition including all methods.

Parameters:
- file_path: Path to the source file
- class_name: Name of the class to extract

Returns:
- code: Complete class code
- start_line/end_line: Precise boundaries
- language: Detected language

4. get_lines - Extract Specific Line Ranges

Get exact line ranges when you know the line numbers.

Parameters:
- file_path: Path to the source file
- start_line: Starting line (1-based)
- end_line: Ending line (inclusive)

Returns:
- code: Extracted lines
- line numbers and metadata
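
The 1-based, inclusive convention is easy to get wrong by one line. A minimal sketch of the arithmetic, with a hypothetical helper name (slice_lines) that is not part of this server:

```python
def slice_lines(text: str, start_line: int, end_line: int) -> str:
    """Return lines start_line..end_line of text (1-based, inclusive)."""
    if start_line < 1 or end_line < start_line:
        raise ValueError("expected 1 <= start_line <= end_line")
    lines = text.splitlines()
    # A 1-based inclusive range [start, end] is the 0-based half-open slice [start - 1, end).
    # Ranges that run past the end of the text are simply cut short.
    return "\n".join(lines[start_line - 1:end_line])


print(slice_lines("a\nb\nc\nd", 2, 3))
```

Here lines 2 through 3 of a four-line text are the second and third lines; line 3 itself is included.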

5. get_signature - Get Function Signatures

Quickly get just the function signature without the body.

Parameters:
- file_path: Path to the source file
- function_name: Name of the function

Returns:
- signature: Function signature only
- start_line: Where the function starts
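
A signature can be read from a syntax tree without touching the function body. The sketch below assumes Python 3.9+ (for ast.unparse) and uses the standard ast module in place of tree-sitter; function_signature is a hypothetical name. The signature is regenerated from the tree, so its spacing may differ slightly from the source text.

```python
import ast


def function_signature(source: str, function_name: str):
    """Return the signature line and 1-based start line of a Python function."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == function_name:
            # Rebuild the header from the parsed arguments; the body is never read.
            signature = f"def {node.name}({ast.unparse(node.args)})"
            if node.returns is not None:
                signature += f" -> {ast.unparse(node.returns)}"
            return {"signature": signature + ":", "start_line": node.lineno}
    return None  # no function with that name


SAMPLE = '''# helpers
def add(a: int, b: int) -> int:
    return a + b
'''

print(function_signature(SAMPLE, "add"))
```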

Usage Examples

Example 1: Exploring Local Files

# First, see what's in the file
symbols = get_symbols("src/main.py")
# Returns: List of all functions and classes with line numbers

# Extract a specific function
result = get_function("src/main.py", "process_data")
# Returns: Complete function code with line numbers

# Get just a function signature
sig = get_signature("src/main.py", "process_data")
# Returns: "def process_data(input_file: str, output_dir: Path) -> Dict[str, Any]:"

Example 2: Working with URLs

# Explore a GitHub file
symbols = get_symbols("https://raw.githubusercontent.com/user/repo/main/src/api.py")

# Extract function from GitLab
result = get_function("https://gitlab.com/user/project/-/raw/main/utils.py", "helper_func")

# Get lines from any URL
lines = get_lines("https://example.com/code/script.py", 10, 25)
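
The examples above pass raw-file URLs. If you only have a GitHub page URL of the form github.com/<user>/<repo>/blob/<ref>/<path>, it can be rewritten to the raw form first. Whether the server performs this conversion itself is not stated here, so treat the helper below (github_blob_to_raw, a hypothetical name) as a client-side sketch.

```python
from urllib.parse import urlsplit


def github_blob_to_raw(url: str) -> str:
    """Rewrite a github.com blob URL to its raw.githubusercontent.com form."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected layout: <user>/<repo>/blob/<ref>/<path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        return url  # not a GitHub blob URL; leave it untouched
    user, repo, _blob, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([user, repo, *rest])


print(github_blob_to_raw("https://github.com/user/repo/blob/main/src/api.py"))
```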

Example 3: Working with Classes

# Extract an entire class (local file)
result = get_class("models/user.py", "User")
# Returns: Complete User class with all methods

# Extract class from URL
result = get_class("https://raw.githubusercontent.com/user/repo/main/models.py", "DatabaseModel")

# Get specific lines (e.g., just the __init__ method)
lines = get_lines("models/user.py", 10, 25)
# Returns: Lines 10-25 of the file

Example 4: Multi-Language Support

# Works with JavaScript/TypeScript
symbols = get_symbols("app.ts")
func = get_function("app.ts", "handleRequest")

# Works with Go
symbols = get_symbols("main.go")
method = get_function("main.go", "ServeHTTP")

Supported Languages

  • Python, JavaScript, TypeScript, JSX/TSX
  • Go, Rust, C, C++, C#, Java
  • Ruby, PHP, Swift, Kotlin, Scala
  • Bash, PowerShell, SQL
  • Haskell, OCaml, Elixir, Clojure
  • And many more...

Best Practices

  1. Always use get_symbols first when exploring a new file
  2. Use get_function/get_class instead of reading entire files
  3. Use get_lines when you know exact line numbers
  4. Use get_signature for quick API exploration

Why Not Just Use Read?

Traditional file reading tools require you to:

  • Read entire files (inefficient for large files)
  • Manually parse code to find functions/classes
  • Count lines manually for extraction
  • Deal with complex syntax and edge cases

MCP Server Code Extractor:

  • ✅ Extracts exactly what you need
  • ✅ Provides structured data with metadata
  • ✅ Handles complex syntax automatically
  • ✅ Works across 30+ languages consistently

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see LICENSE file for details.
