MCP Server Code Extractor

A Model Context Protocol (MCP) server that provides precise code extraction tools using tree-sitter parsing. Extract functions, classes, and code snippets from 30+ programming languages without manual parsing.

Why MCP Server Code Extractor?

When working with AI coding assistants like Claude, you often need to:

  • Extract specific functions or classes from large codebases
  • Get an overview of what's in a file without reading the entire thing
  • Retrieve precise code snippets with accurate line numbers
  • Avoid manual parsing and grep/sed/awk gymnastics

MCP Server Code Extractor solves these problems by providing structured, tree-sitter-powered code extraction tools directly within your AI assistant.

Features

  • 🎯 Precise Extraction: Uses tree-sitter parsing for accurate code boundary detection
  • 🌍 30+ Languages: Supports Python, JavaScript, TypeScript, Go, Rust, Java, C/C++, and many more
  • 📍 Line Numbers: Every extraction includes precise line number information
  • 🔍 Code Discovery: List all functions and classes in a file before extracting
  • 🌐 URL Support: Fetch and extract code from GitHub, GitLab, and direct file URLs
  • 🔄 Git Integration: Extract code from any git revision, branch, or tag
  • ⚡ Fast & Lightweight: Efficient caching and minimal dependencies
  • 🤖 AI-Optimized: Designed specifically for use with AI coding assistants

Installation

Quick Start with UV (Recommended)

# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone this repository
git clone https://github.com/ctoth/mcp_server_code_extractor
cd mcp_server_code_extractor

# Run directly with UV (no installation needed!)
uv run mcp_server_code_extractor.py

Traditional Installation

# Quote the extras and the version pin so shells such as zsh do not expand the brackets
pip install "mcp[cli]" tree-sitter-languages "tree-sitter==0.21.3"

Configure with Claude Desktop

Add to your Claude Desktop configuration:

{
  "mcpServers": {
    "mcp-server-code-extractor": {
      "command": "uv",
      "args": ["run", "/path/to/mcp_server_code_extractor.py"]
    }
  }
}

Or with traditional Python:

{
  "mcpServers": {
    "mcp-server-code-extractor": {
      "command": "python",
      "args": ["/path/to/mcp_server_code_extractor.py"]
    }
  }
}
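
If the configuration file already lists other servers, the entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, assuming the config is a plain JSON file; `add_server_entry` is a hypothetical helper, and the config file location (which varies by platform) is left to the caller.

```python
import json
from pathlib import Path


def add_server_entry(
    config_path: Path,
    script_path: str,
    name: str = "mcp-server-code-extractor",
) -> dict:
    """Merge an mcpServers entry into a JSON config file, keeping other entries."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    # Create the "mcpServers" object if it is missing, then add or replace our entry.
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": "uv", "args": ["run", script_path]}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `add_server_entry(Path("claude_desktop_config.json"), "/path/to/mcp_server_code_extractor.py")` would write the same entry as the first JSON example above while leaving any other configured servers untouched.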

Available Tools

1. get_symbols - Discover Code Structure

List all functions, classes, and other symbols in a file.

Returns:
- name: Symbol name
- type: function/class/method/etc
- start_line/end_line: Line numbers
- preview: First line of the symbol
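
The shape of such a listing can be sketched with Python's standard-library `ast` module standing in for tree-sitter. This is an illustration under that assumption, not the server's implementation: `list_symbols` is a hypothetical name, only Python source is handled, and methods are reported as plain functions.

```python
import ast


def list_symbols(source: str) -> list[dict]:
    """List functions and classes in Python source (stdlib `ast` as a stand-in for tree-sitter)."""
    lines = source.splitlines()
    kinds = {
        ast.FunctionDef: "function",
        ast.AsyncFunctionDef: "function",
        ast.ClassDef: "class",
    }
    symbols = []
    for node in ast.walk(ast.parse(source)):
        kind = kinds.get(type(node))
        if kind is None:
            continue
        symbols.append({
            "name": node.name,
            "type": kind,
            "start_line": node.lineno,    # 1-based line of the def/class keyword
            "end_line": node.end_lineno,  # inclusive
            "preview": lines[node.lineno - 1].strip(),
        })
    return sorted(symbols, key=lambda s: s["start_line"])


example = "class User:\n    def greet(self):\n        return 'hi'\n"
for symbol in list_symbols(example):
    print(symbol)
```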

2. get_function - Extract Complete Functions

Extract a complete function with all its code.

Parameters:
- file_path: Path to the source file
- function_name: Name of the function to extract

Returns:
- code: Complete function code
- start_line/end_line: Precise boundaries
- language: Detected language

3. get_class - Extract Complete Classes

Extract an entire class definition including all methods.

Parameters:
- file_path: Path to the source file
- class_name: Name of the class to extract

Returns:
- code: Complete class code
- start_line/end_line: Precise boundaries
- language: Detected language

4. get_lines - Extract Specific Line Ranges

Get exact line ranges when you know the line numbers.

Parameters:
- file_path: Path to the source file
- start_line: Starting line (1-based)
- end_line: Ending line (inclusive)

Returns:
- code: Extracted lines
- line numbers and metadata
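
The 1-based, inclusive convention is easy to get wrong by one, so here is a minimal sketch of the indexing. `slice_lines` is a hypothetical helper that works on a string already in memory, and the returned keys are assumptions for illustration rather than the tool's documented output.

```python
def slice_lines(source: str, start_line: int, end_line: int) -> dict:
    """Return lines start_line..end_line (1-based, inclusive) of source."""
    if start_line < 1 or end_line < start_line:
        raise ValueError("expected 1 <= start_line <= end_line")
    lines = source.splitlines()
    # Line N lives at index N - 1; the slice end is exclusive, so end_line needs no adjustment.
    selected = lines[start_line - 1:end_line]
    return {
        "code": "\n".join(selected),
        "start_line": start_line,
        # Slicing clamps at the end of the file, so report the last line actually returned.
        "end_line": start_line + len(selected) - 1,
    }


text = "a = 1\nb = 2\nc = 3\nd = 4\n"
print(slice_lines(text, 2, 3)["code"])
```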

5. get_signature - Get Function Signatures

Quickly get just the function signature without the body.

Parameters:
- file_path: Path to the source file
- function_name: Name of the function

Returns:
- signature: Function signature only
- start_line: Where the function starts
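
To make "signature without the body" concrete, the sketch below again uses Python's standard-library `ast` module as a stand-in for tree-sitter. `signature_of` is a hypothetical helper, not the server's code, and it takes a shortcut: the signature is assumed to end on the line before the first body statement, so a one-line definition such as `def f(): pass` comes back whole.

```python
import ast


def signature_of(source: str, function_name: str):
    """Return the signature text and start line of a Python function, or None if absent."""
    lines = source.splitlines()
    for node in ast.walk(ast.parse(source)):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name == function_name:
            # Take everything from the `def` line up to the line before the body starts.
            first_body_line = node.body[0].lineno
            last_line = max(node.lineno, first_body_line - 1)
            text = " ".join(line.strip() for line in lines[node.lineno - 1:last_line])
            return {"signature": text, "start_line": node.lineno}
    return None


src = "def process_data(input_file: str, output_dir: str) -> dict:\n    return {}\n"
print(signature_of(src, "process_data"))
```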

Usage Examples

Example 1: Exploring Local Files

# First, see what's in the file
symbols = get_symbols("src/main.py")
# Returns: List of all functions and classes with line numbers

# Extract a specific function
result = get_function("src/main.py", "process_data")
# Returns: Complete function code with line numbers

# Get just a function signature
sig = get_signature("src/main.py", "process_data")
# Returns a signature such as: "def process_data(input_file: str, output_dir: Path) -> Dict[str, Any]:"

Example 2: Working with URLs

# Explore a GitHub file
symbols = get_symbols("https://raw.githubusercontent.com/user/repo/main/src/api.py")

# Extract function from GitLab
result = get_function("https://gitlab.com/user/project/-/raw/main/utils.py", "helper_func")

# Get lines from any URL
lines = get_lines("https://example.com/code/script.py", 10, 25)
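
Accepting either a local path or a URL in the same argument implies some dispatch on the input. One plausible way to do it, sketched with the standard library only, is shown here. `is_url` and `read_source` are hypothetical names, and the server's actual fetching, caching, and error handling are not described in this document.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def is_url(path_or_url: str) -> bool:
    """True for http(s) URLs; anything else is treated as a local path."""
    return urlparse(path_or_url).scheme in ("http", "https")


def read_source(path_or_url: str, timeout: float = 10.0) -> str:
    """Read source text from a URL or from a local file."""
    if is_url(path_or_url):
        with urlopen(path_or_url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    return Path(path_or_url).read_text(encoding="utf-8")
```

Checking the scheme explicitly, rather than looking for a colon, keeps Windows paths such as `C:\code\main.py` on the local-file branch.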

Example 3: Working with Classes

# Extract an entire class (local file)
result = get_class("models/user.py", "User")
# Returns: Complete User class with all methods

# Extract class from URL
result = get_class("https://raw.githubusercontent.com/user/repo/main/models.py", "DatabaseModel")

# Get specific lines (e.g., just the __init__ method)
lines = get_lines("models/user.py", 10, 25)
# Returns: Lines 10-25 of the file

Example 4: Multi-Language Support

# Works with JavaScript/TypeScript
symbols = get_symbols("app.ts")
func = get_function("app.ts", "handleRequest")

# Works with Go
symbols = get_symbols("main.go")
method = get_function("main.go", "ServeHTTP")

Supported Languages

  • Python, JavaScript, TypeScript, JSX/TSX
  • Go, Rust, C, C++, C#, Java
  • Ruby, PHP, Swift, Kotlin, Scala
  • Bash, PowerShell, SQL
  • Haskell, OCaml, Elixir, Clojure
  • And many more...
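
The tools report a detected language with each result. A common way to detect it is by file extension, which the sketch below illustrates. The mapping is a hypothetical, partial table for illustration and `detect_language` is an assumed name; the server's actual detection logic and language identifiers may differ.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical, partial mapping for illustration only.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".jsx": "javascript",
    ".ts": "typescript",
    ".tsx": "tsx",
    ".go": "go",
    ".rs": "rust",
    ".java": "java",
    ".c": "c",
    ".cpp": "cpp",
    ".rb": "ruby",
}


def detect_language(path_or_url: str):
    """Guess a language name from the file extension of a path or URL, or None if unknown."""
    path = urlparse(path_or_url).path  # drops any query string or fragment from URLs
    suffix = PurePosixPath(path).suffix.lower()
    return EXTENSION_TO_LANGUAGE.get(suffix)


print(detect_language("src/main.py"))
print(detect_language("https://gitlab.com/user/project/-/raw/main/utils.py"))
```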

Best Practices

  1. Always use get_symbols first when exploring a new file
  2. Use get_function/get_class instead of reading entire files
  3. Use get_lines when you know exact line numbers
  4. Use get_signature for quick API exploration

Why Not Just Use Read?

Traditional file reading tools require you to:

  • Read entire files (inefficient for large files)
  • Manually parse code to find functions/classes
  • Count lines manually for extraction
  • Deal with complex syntax and edge cases

MCP Server Code Extractor:

  • ✅ Extracts exactly what you need
  • ✅ Provides structured data with metadata
  • ✅ Handles complex syntax automatically
  • ✅ Works across 30+ languages consistently

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License. See the LICENSE file for details.
