MCP Server Code Extractor

A Model Context Protocol (MCP) server that provides precise code extraction tools using tree-sitter parsing. Extract functions, classes, and code snippets from 30+ programming languages without manual parsing.

Why MCP Server Code Extractor?

When working with AI coding assistants like Claude, you often need to:

  • Extract specific functions or classes from large codebases
  • Get an overview of what's in a file without reading the whole file
  • Retrieve precise code snippets with accurate line numbers
  • Avoid manual parsing and grep/sed/awk gymnastics

MCP Server Code Extractor solves these problems by providing structured, tree-sitter-powered code extraction tools directly within your AI assistant.

Features

  • 🎯 Precise Extraction: Uses tree-sitter parsing for accurate code boundary detection
  • 🌍 30+ Languages: Supports Python, JavaScript, TypeScript, Go, Rust, Java, C/C++, and many more
  • 📍 Line Numbers: Every extraction includes precise line number information
  • 🔍 Code Discovery: List all functions and classes in a file before extracting
  • 🌐 URL Support: Fetch and extract code from GitHub, GitLab, and direct file URLs
  • 🔄 Git Integration: Extract code from any git revision, branch, or tag
  • ⚡ Fast & Lightweight: Efficient caching and minimal dependencies
  • 🤖 AI-Optimized: Designed specifically for use with AI coding assistants

Installation

Quick Start with UV (Recommended)

# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone this repository
git clone https://github.com/ctoth/mcp_server_code_extractor
cd mcp_server_code_extractor

# Run directly with UV (no installation needed!)
uv run mcp_server_code_extractor.py

Traditional Installation

# Quote the extra so shells such as zsh do not treat the brackets as a glob
pip install "mcp[cli]" tree-sitter-languages tree-sitter==0.21.3

Configure with Claude Desktop

Add to your Claude Desktop configuration:

{
  "mcpServers": {
    "mcp-server-code-extractor": {
      "command": "uv",
      "args": ["run", "/path/to/mcp_server_code_extractor.py"]
    }
  }
}

Or with traditional Python:

{
  "mcpServers": {
    "mcp-server-code-extractor": {
      "command": "python",
      "args": ["/path/to/mcp_server_code_extractor.py"]
    }
  }
}
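
If the configuration file already lists other servers, the new entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, not part of this project: the helper name `with_code_extractor` and the `other-server` entry are hypothetical, and the script path is the same placeholder used above.

```python
import copy
import json


def with_code_extractor(config: dict, script_path: str) -> dict:
    """Return a copy of a Claude Desktop config with the server entry added."""
    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    # setdefault keeps any servers that are already configured.
    servers = merged.setdefault("mcpServers", {})
    servers["mcp-server-code-extractor"] = {
        "command": "uv",
        "args": ["run", script_path],
    }
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = with_code_extractor(existing, "/path/to/mcp_server_code_extractor.py")
print(json.dumps(updated, indent=2))
```

Load the real file with `json.load`, pass the result through the helper, and write it back with `json.dump`; where the file lives depends on your platform and Claude Desktop version.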

Available Tools

1. get_symbols - Discover Code Structure

List all functions, classes, and other symbols in a file.

Returns:
- name: Symbol name
- type: function/class/method/etc
- start_line/end_line: Line numbers
- preview: First line of the symbol
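
To make the shape of these records concrete, here is a rough sketch that builds the same five fields for Python source. It is an illustration only: it uses Python's standard `ast` module as a stand-in for tree-sitter, handles only Python, and `list_symbols` and `SAMPLE` are hypothetical names, not this server's code.

```python
import ast

SAMPLE = '''\
class User:
    def __init__(self, name):
        self.name = name

def process_data(path):
    return path
'''


def list_symbols(source: str) -> list[dict]:
    """Return name/type/start_line/end_line/preview for each def and class."""
    lines = source.splitlines()
    symbols = []

    def visit(node: ast.AST, in_class: bool) -> None:
        # Only direct children are inspected, so definitions nested inside
        # if/for/try blocks are skipped in this simplified sketch.
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                kind = "class"
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "method" if in_class else "function"
            else:
                continue
            symbols.append({
                "name": child.name,
                "type": kind,
                "start_line": child.lineno,      # 1-based
                "end_line": child.end_lineno,    # inclusive
                "preview": lines[child.lineno - 1].strip(),
            })
            visit(child, in_class=isinstance(child, ast.ClassDef))

    visit(ast.parse(source), in_class=False)
    return symbols


for symbol in list_symbols(SAMPLE):
    print(symbol)
```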

2. get_function - Extract Complete Functions

Extract a complete function with all its code.

Parameters:
- file_path: Path or URL of the source file
- function_name: Name of the function to extract

Returns:
- code: Complete function code
- start_line/end_line: Precise boundaries
- language: Detected language
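
The idea behind boundary-accurate extraction is that the parser, not a regex, decides where a function starts and ends. The sketch below shows that idea for Python only, again with the standard `ast` module standing in for tree-sitter; `extract_function` and `SOURCE` are hypothetical, and the `language` value is hard-coded because `ast` parses nothing but Python.

```python
import ast
from typing import Optional

SOURCE = '''\
import os

def process_data(path):
    """Return the base name of a path."""
    return os.path.basename(path)

def unused():
    pass
'''


def extract_function(source: str, name: str) -> Optional[dict]:
    """Return code and 1-based line boundaries of the first function called `name`."""
    lines = source.splitlines()
    for node in ast.walk(ast.parse(source)):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name == name:
            return {
                # lineno/end_lineno come from the parser, so the slice covers
                # the whole body regardless of blank lines or nesting.
                "code": "\n".join(lines[node.lineno - 1:node.end_lineno]),
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                "language": "python",  # fixed: this sketch only parses Python
            }
    return None


result = extract_function(SOURCE, "process_data")
print(result["start_line"], result["end_line"])
print(result["code"])
```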

3. get_class - Extract Complete Classes

Extract an entire class definition including all methods.

Parameters:
- file_path: Path or URL of the source file
- class_name: Name of the class to extract

Returns:
- code: Complete class code
- start_line/end_line: Precise boundaries
- language: Detected language

4. get_lines - Extract Specific Line Ranges

Get exact line ranges when you know the line numbers.

Parameters:
- file_path: Path or URL of the source file
- start_line: Starting line (1-based)
- end_line: Ending line (inclusive)

Returns:
- code: Extracted lines
- line numbers and metadata
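
The 1-based, inclusive convention is easy to get off by one when mapped onto Python slices. A small sketch of the conversion follows; `slice_lines` and the `total_lines` field are hypothetical, and clamping a range that runs past the end of the file is an assumption about reasonable behavior, not a statement about what the server does.

```python
def slice_lines(text: str, start_line: int, end_line: int) -> dict:
    """Return lines start_line..end_line (1-based, inclusive) of `text`."""
    lines = text.splitlines()
    if not 1 <= start_line <= end_line:
        raise ValueError("expected 1 <= start_line <= end_line")
    if start_line > len(lines):
        raise ValueError("start_line is past the end of the text")

    # 1-based inclusive [start, end] maps to the 0-based slice [start-1, end).
    selected = lines[start_line - 1:end_line]
    return {
        "code": "\n".join(selected),
        "start_line": start_line,
        # Report the last line actually returned if the range ran past EOF.
        "end_line": min(end_line, len(lines)),
        "total_lines": len(lines),
    }


print(slice_lines("a\nb\nc\nd\n", 2, 3))
```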

5. get_signature - Get Function Signatures

Quickly get just the function signature without the body.

Parameters:
- file_path: Path or URL of the source file
- function_name: Name of the function

Returns:
- signature: Function signature only
- start_line: Where the function starts
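
As an illustration of separating a signature from its body, the sketch below rebuilds a Python signature from the parsed arguments and return annotation. It relies on the standard `ast` module (Python 3.9+ for `ast.unparse`) rather than tree-sitter, and `signature_of` is a hypothetical helper. Because the signature is regenerated from the syntax tree, a definition spread over several lines comes back on one line, which may differ from how the server formats its result.

```python
import ast
from typing import Optional

SOURCE = '''\
def process_data(
    input_file: str,
    output_dir: Path,
) -> Dict[str, Any]:
    return {}
'''


def signature_of(source: str, name: str) -> Optional[dict]:
    """Return the one-line signature and start line of the function `name`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            keyword = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            # The return annotation is optional, so only render it when present.
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            return {
                "signature": f"{keyword} {node.name}({ast.unparse(node.args)}){returns}:",
                "start_line": node.lineno,
            }
    return None


print(signature_of(SOURCE, "process_data"))
```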

Usage Examples

Example 1: Exploring Local Files

# First, see what's in the file
symbols = get_symbols("src/main.py")
# Returns: List of all functions and classes with line numbers

# Extract a specific function
result = get_function("src/main.py", "process_data")
# Returns: Complete function code with line numbers

# Get just a function signature
sig = get_signature("src/main.py", "process_data")
# Returns: "def process_data(input_file: str, output_dir: Path) -> Dict[str, Any]:"

Example 2: Working with URLs

# Explore a GitHub file
symbols = get_symbols("https://raw.githubusercontent.com/user/repo/main/src/api.py")

# Extract function from GitLab
result = get_function("https://gitlab.com/user/project/-/raw/main/utils.py", "helper_func")

# Get lines from any URL
lines = get_lines("https://example.com/code/script.py", 10, 25)
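
The examples above pass raw-file URLs. If you start from a GitHub page URL of the form `github.com/<user>/<repo>/blob/<ref>/<path>`, it can be rewritten to the raw form first. The helper below is a hypothetical sketch of that rewrite; whether the server performs a similar conversion itself is not stated here, so treat it as a client-side convenience.

```python
from urllib.parse import urlsplit


def to_raw_url(url: str) -> str:
    """Rewrite a GitHub 'blob' page URL to its raw.githubusercontent.com form."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected shape: /<user>/<repo>/blob/<ref>/<path...>
    if parts.netloc == "github.com" and len(segments) >= 5 and segments[2] == "blob":
        user, repo, _blob, *rest = segments
        return f"https://raw.githubusercontent.com/{user}/{repo}/{'/'.join(rest)}"
    return url  # already raw, or a host this sketch does not handle


print(to_raw_url("https://github.com/user/repo/blob/main/src/api.py"))
```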

Example 3: Working with Classes

# Extract an entire class (local file)
result = get_class("models/user.py", "User")
# Returns: Complete User class with all methods

# Extract class from URL
result = get_class("https://raw.githubusercontent.com/user/repo/main/models.py", "DatabaseModel")

# Get specific lines (e.g., just the __init__ method)
lines = get_lines("models/user.py", 10, 25)
# Returns: Lines 10-25 of the file

Example 4: Multi-Language Support

# Works with JavaScript/TypeScript
symbols = get_symbols("app.ts")
func = get_function("app.ts", "handleRequest")

# Works with Go
symbols = get_symbols("main.go")
method = get_function("main.go", "ServeHTTP")

Supported Languages

  • Python, JavaScript, TypeScript, JSX/TSX
  • Go, Rust, C, C++, C#, Java
  • Ruby, PHP, Swift, Kotlin, Scala
  • Bash, PowerShell, SQL
  • Haskell, OCaml, Elixir, Clojure
  • And many more...
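
One plausible way to pick a grammar is by file extension. The sketch below assumes that approach; the mapping table, its language identifiers, and `detect_language` are hypothetical and cover only a handful of the languages listed above.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical, deliberately incomplete mapping for illustration.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".ts": "typescript",
    ".tsx": "tsx",
    ".go": "go",
    ".rs": "rust",
    ".java": "java",
    ".c": "c",
    ".cpp": "cpp",
    ".rb": "ruby",
}


def detect_language(path_or_url: str) -> Optional[str]:
    """Guess a language from the extension of a local path or a URL."""
    # urlsplit drops any query string or fragment; plain paths pass through.
    path = urlsplit(path_or_url).path or path_or_url
    return EXTENSION_TO_LANGUAGE.get(PurePosixPath(path).suffix.lower())


print(detect_language("app.ts"))
print(detect_language("https://raw.githubusercontent.com/user/repo/main/src/api.py"))
```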

Best Practices

  1. Always use get_symbols first when exploring a new file
  2. Use get_function/get_class instead of reading entire files
  3. Use get_lines when you know exact line numbers
  4. Use get_signature for quick API exploration

Why Not Just Use Read?

Traditional file reading tools require you to:

  • Read entire files (inefficient for large files)
  • Manually parse code to find functions/classes
  • Count lines manually for extraction
  • Deal with complex syntax and edge cases

MCP Server Code Extractor:

  • ✅ Extracts exactly what you need
  • ✅ Provides structured data with metadata
  • ✅ Handles complex syntax automatically
  • ✅ Works across 30+ languages consistently

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see LICENSE file for details.
