MCP server that helps AI assistants work with large files that exceed context limits

Largefile MCP Server

An MCP server that enables AI assistants to work with large files that exceed context limits.

Navigate, search, and edit files of any size without loading their entire content into the model's context. Largefile provides targeted access to specific lines, patterns, and sections, and it preserves file integrity by using research-backed search/replace editing instead of error-prone line-based operations.

Perfect for working with large codebases, generated files, logs, and datasets that would otherwise be inaccessible due to context window limitations.

MCP Tools

Five tools that work together for progressive file exploration:

Tool	Purpose
`get_overview`	File structure with Tree-sitter semantic analysis, line counts, and search hints
`search_content`	Pattern search with fuzzy matching, context lines, and semantic information
`read_content`	Targeted reading by line number, pattern, or tail mode for log files
`edit_content`	Search/replace editing with batch support, automatic backups, and preview mode
`revert_edit`	Recover from bad edits by reverting to previous backup states

Quick Start

Prerequisite: Install uv, a fast Python package manager, which provides the uvx command.

Add to your MCP configuration:

{
  "mcpServers": {
    "largefile": {
      "command": "uvx",
      "args": ["--from", "largefile", "largefile-mcp"]
    }
  }
}
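
If an MCP configuration file already exists, the entry above has to be merged into it rather than pasted over it. A minimal Python sketch, assuming the client keeps its servers under the `mcpServers` key as shown; the function name `add_largefile_server` is hypothetical and not part of Largefile:

```python
import copy
import json


def add_largefile_server(config: dict) -> dict:
    """Return a copy of an MCP config with the largefile entry added."""
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["largefile"] = {
        "command": "uvx",
        "args": ["--from", "largefile", "largefile-mcp"],
    }
    return merged


# Hypothetical existing configuration with one other server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
print(json.dumps(add_largefile_server(existing), indent=2))
```

Working on a copy keeps the original dictionary untouched, so the merged result can be inspected before it is written back to disk.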

Usage

Your AI assistant can now work with files that exceed its context limits. Here are some common workflows:

Analyzing Large Code Files

AI Question: "Can you analyze this large Django models file and tell me about the class structure and any potential issues? It's a large file so use largefile."

AI Assistant workflow:

Get a file overview to understand structure
Search for classes and their methods
Look for code issues like TODOs or long functions

# AI gets file structure
overview = get_overview("/path/to/django-models.py")
# Returns: 2,847 lines, 15 classes, semantic outline with Tree-sitter

# AI searches for all class definitions
classes = search_content("/path/to/django-models.py", "class ", max_results=20)
# Returns: Model classes with line numbers and context

# AI examines specific class implementation
model_code = read_content("/path/to/django-models.py", "class User", mode="semantic")
# Returns: Complete class definition with all methods
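
The kind of outline an overview step produces can be approximated for Python sources with the standard-library `ast` module. This is only an illustrative stand-in: Largefile uses Tree-sitter, and the `outline` function and its output shape are assumptions made for this sketch.

```python
import ast


def outline(source: str) -> list[tuple[str, str, int]]:
    """List (kind, name, line) for top-level classes, their methods, and functions."""
    entries = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            entries.append(("class", node.name, node.lineno))
            for child in node.body:
                if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    entries.append(("method", f"{node.name}.{child.name}", child.lineno))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entries.append(("function", node.name, node.lineno))
    return entries


sample = "class User:\n    def save(self):\n        pass\n\ndef helper():\n    pass\n"
for kind, name, line in outline(sample):
    print(f"{line:>4}  {kind:<8} {name}")
```

An outline like this is what lets an assistant jump straight to a class by line number instead of reading the whole file.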

Working with Documentation

AI Question: "Find all the installation methods mentioned in this README file and update the pip install to use uv instead."

AI Assistant workflow:

Search for installation patterns
Read the installation section
Replace pip commands with uv equivalents

# AI finds installation instructions
install_sections = search_content("/path/to/readme.md", "install", fuzzy=True, context_lines=3)

# AI reads the installation section
install_content = read_content("/path/to/readme.md", "## Installation", mode="semantic")

# AI previews replacing pip with uv (set preview=False to apply)
edit_result = edit_content(
    "/path/to/readme.md",
    search_text="pip install anthropic",
    replace_text="uv add anthropic",
    preview=True
)
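
To make the `context_lines` idea concrete, here is a rough Python sketch of a search that returns each match with its neighbouring lines. It is not Largefile's implementation: the function name and result keys are hypothetical, and a case-insensitive substring test stands in for real fuzzy matching.

```python
def search_with_context(lines, pattern, context_lines=2, max_results=20):
    """Return matches as dicts with 1-based line numbers and surrounding lines."""
    needle = pattern.lower()
    results = []
    for index, line in enumerate(lines):
        if needle in line.lower():
            start = max(0, index - context_lines)
            results.append({
                "line_number": index + 1,
                "match": line,
                "before": lines[start:index],
                "after": lines[index + 1:index + 1 + context_lines],
            })
            if len(results) >= max_results:
                break
    return results


readme = ["# Title", "", "## Installation", "pip install anthropic", "", "## Usage"]
for hit in search_with_context(readme, "install", context_lines=1):
    print(hit["line_number"], hit["match"])
```

Returning line numbers alongside the context lets a later read or edit step target the exact location without rescanning the file.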

Debugging Large Log Files

AI Question: "Check this production log file for any critical errors in the last few thousand lines and show me the context around them. Use largefile mcp."

AI Assistant workflow:

Get log file overview
Read the last N lines efficiently with tail mode
Search for error patterns in recent entries

# AI gets log file overview
overview = get_overview("/path/to/production.log")
# Returns: 150,000 lines, 2.1GB file size

# AI reads the last 1000 lines efficiently (no need to know total line count)
recent = read_content("/path/to/production.log", 1000, mode="tail")
# Returns: Last 1000 lines without loading entire file

# AI searches for critical errors
errors = search_content("/path/to/production.log", "CRITICAL|ERROR", fuzzy=True, max_results=10)

# AI examines context around each error
for error in errors:
    context = read_content("/path/to/production.log", error.line_number, mode="lines")
    # Shows surrounding log entries for debugging
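
Reading the end of a file without scanning it from the start is usually done by seeking to the end and reading blocks backwards. The sketch below shows that general technique in Python; it is an assumption about how a tail mode could work, not Largefile's actual code, and it assumes UTF-8 text. The name `tail_lines` is hypothetical.

```python
import os


def tail_lines(path: str, count: int, block_size: int = 64 * 1024) -> list[str]:
    """Return the last `count` lines by reading blocks backwards from the end."""
    if count <= 0:
        return []
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        position = handle.tell()
        data = b""
        # Keep prepending blocks until more than `count` newlines are buffered
        # (so the first kept line is complete) or the file start is reached.
        while position > 0 and data.count(b"\n") <= count:
            step = min(block_size, position)
            position -= step
            handle.seek(position)
            data = handle.read(step) + data
    lines = data.decode("utf-8", errors="replace").splitlines()
    return lines[-count:]
```

Only the final blocks of the file are read, so the cost depends on the number of lines requested rather than on the total file size.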

Refactoring Code

AI Question: "I need to rename the function process_data to transform_data throughout this large codebase file. Can you help me do this safely?"

AI Assistant workflow:

Find all occurrences of the function
Preview changes to ensure accuracy
Apply changes with automatic backup

# AI finds all usages
usages = search_content("/path/to/codebase.py", "process_data", fuzzy=False, max_results=50)

# AI previews the changes
preview = edit_content(
    "/path/to/codebase.py",
    search_text="process_data",
    replace_text="transform_data",
    preview=True
)

# AI applies changes after confirmation
result = edit_content(
    "/path/to/codebase.py",
    search_text="process_data",
    replace_text="transform_data",
    preview=False
)
# Creates automatic backup before changes
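
The preview-then-apply pattern can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Largefile's implementation: the function name `replace_in_file` and the `.bak` backup naming are hypothetical (Largefile keeps its backups in a configurable backup directory).

```python
import difflib
import shutil
from pathlib import Path


def replace_in_file(path, search_text, replace_text, preview=True):
    """Replace every occurrence; return a unified diff of the change."""
    target = Path(path)
    original = target.read_text(encoding="utf-8")
    if search_text not in original:
        raise ValueError(f"{search_text!r} not found in {target}")
    updated = original.replace(search_text, replace_text)
    diff = "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        updated.splitlines(keepends=True),
        fromfile=f"{target} (before)",
        tofile=f"{target} (after)",
    ))
    if not preview:
        # Copy the untouched file aside before overwriting it.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
        target.write_text(updated, encoding="utf-8")
    return diff
```

Note that a plain substring replacement like this one would also rewrite longer names that contain the search text, such as `preprocess_data`, which is one reason to inspect the preview diff before applying.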

Batch Editing Multiple Patterns

AI Question: "Update all the deprecated API calls in this file - there are several different ones to change."

AI Assistant workflow:

Identify all deprecated patterns
Apply multiple changes atomically in one call

# AI previews multiple changes in a single batch call (set preview=False to apply)
result = edit_content(
    "/path/to/api_client.py",
    changes=[
        {"search": "client.get_user(", "replace": "client.fetch_user("},
        {"search": "client.post_data(", "replace": "client.send_data("},
        {"search": "client.delete_item(", "replace": "client.remove_item("},
    ],
    preview=True
)
# Returns per-change results with success/failure status
# All changes applied atomically - partial success is reported
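
One possible reading of "atomic with per-change results" is all-or-nothing: every change is tried against a working copy, each outcome is recorded, and the result is committed only if all changes matched. The Python sketch below illustrates that reading; whether Largefile rolls back on a partial failure is not stated here, so treat the behaviour and the name `apply_batch` as assumptions.

```python
def apply_batch(text, changes):
    """Apply search/replace pairs in order; commit only if every one matches."""
    working = text
    results = []
    for change in changes:
        found = change["search"] in working
        results.append({"search": change["search"], "success": found})
        if found:
            working = working.replace(change["search"], change["replace"])
    all_ok = all(result["success"] for result in results)
    # All-or-nothing: hand back the original text if any change failed.
    return (working if all_ok else text), results


source = "client.get_user(1)\nclient.post_data(x)\n"
new_text, report = apply_batch(source, [
    {"search": "client.get_user(", "replace": "client.fetch_user("},
    {"search": "client.post_data(", "replace": "client.send_data("},
])
print(new_text)
```

Because the per-change report is built even when the batch is rejected, the caller can see exactly which pattern failed to match.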

Recovering from Bad Edits

AI Question: "That last edit broke something. Can you undo it?"

AI Assistant workflow:

List available backups
Revert to previous state (current state is preserved as new backup)

# AI reverts to the most recent backup
result = revert_edit("/path/to/broken_file.py")
# Current state saved as backup, file restored to previous version

# Or revert to a specific backup by ID
result = revert_edit("/path/to/broken_file.py", backup_id="20240115_143022")
# Returns: available_backups list for reference
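
The example backup ID `20240115_143022` looks like a `YYYYMMDD_HHMMSS` timestamp. Assuming that format (it is inferred from the example, not confirmed), IDs can be generated and ordered with the standard library; the helper names below are hypothetical.

```python
from datetime import datetime
from typing import Optional

BACKUP_ID_FORMAT = "%Y%m%d_%H%M%S"  # assumed from the example ID 20240115_143022


def new_backup_id(moment: datetime) -> str:
    """Format a timestamp as a backup ID."""
    return moment.strftime(BACKUP_ID_FORMAT)


def latest_backup_id(backup_ids: list[str]) -> Optional[str]:
    """Fixed-width timestamps sort chronologically as plain strings."""
    return max(backup_ids) if backup_ids else None


print(new_backup_id(datetime(2024, 1, 15, 14, 30, 22)))
print(latest_backup_id(["20240114_090000", "20240115_143022"]))
```

Fixed-width, most-significant-first timestamps are convenient because choosing the most recent backup needs no date parsing.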

Exploring API Documentation

AI Question: "What are all the available methods in this large API documentation file and can you show me examples of authentication?"

AI Assistant workflow:

Get document structure overview
Search for method definitions and auth patterns
Extract relevant code examples

# AI analyzes document structure
overview = get_overview("/path/to/api-docs.md")
# Returns: Section outline, headings, suggested search patterns

# AI finds API methods
methods = search_content("/path/to/api-docs.md", "###", max_results=30)
# Returns: All method headings with context

# AI searches for authentication examples
auth_examples = search_content("/path/to/api-docs.md", "auth", fuzzy=True, context_lines=5)

# AI reads complete authentication section
auth_section = read_content("/path/to/api-docs.md", "## Authentication", mode="semantic")
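
For Markdown files, reading a section "semantically" roughly means returning everything from a heading up to the next heading of the same or a higher level. A minimal Python sketch of that idea follows; it is an assumption about the behaviour, and the name `read_section` is hypothetical.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+\S")


def read_section(lines, heading):
    """Return the lines from `heading` up to the next heading of equal or higher level."""
    level = None
    section = []
    for line in lines:
        match = HEADING.match(line)
        if level is None:
            # Still looking for the requested heading.
            if line.strip() == heading:
                level = len(match.group(1)) if match else 0
                section.append(line)
            continue
        if match and len(match.group(1)) <= level:
            break
        section.append(line)
    return section


docs = ["# API", "## Authentication", "Use a token.", "### Example", "curl ...", "## Errors", "..."]
print(read_section(docs, "## Authentication"))
```

This sketch does not track fenced code blocks, so a line such as `# comment` inside an example would be mistaken for a heading; a real implementation would need to skip fenced regions.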

File Size Handling

Small files (<50MB): Memory loading with Tree-sitter AST caching
Medium files (50-500MB): Memory-mapped access
Large files (>500MB): Streaming processing
Long lines (>1000 chars): Automatic truncation for display
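
The three size tiers above amount to a simple threshold check. The Python sketch below shows one way to express it, assuming 1 MB means 1024 × 1024 bytes and that the boundaries fall as listed; the function name and the returned labels are hypothetical.

```python
def choose_strategy(size_bytes, memory_threshold_mb=50, mmap_threshold_mb=500):
    """Map a file size to one of the three access tiers."""
    size_mb = size_bytes / (1024 * 1024)
    if size_mb < memory_threshold_mb:
        return "memory"      # small: load fully, cache the syntax tree
    if size_mb <= mmap_threshold_mb:
        return "mmap"        # medium: memory-mapped access
    return "streaming"       # large: process in a streaming fashion


MB = 1024 * 1024
for size in (10 * MB, 200 * MB, 2048 * MB):
    print(size // MB, "MB ->", choose_strategy(size))
```

Keeping the thresholds as parameters mirrors the fact that they are configurable through environment variables.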

Supported Languages

Tree-sitter semantic analysis for:

Python (.py)
JavaScript/JSX (.js, .jsx)
TypeScript/TSX (.ts, .tsx)
Rust (.rs)
Go (.go)

Files without Tree-sitter support use text-based analysis with graceful degradation.

Configuration

Configure via environment variables:

# File processing thresholds
LARGEFILE_MEMORY_THRESHOLD_MB=50        # Memory loading limit
LARGEFILE_MMAP_THRESHOLD_MB=500         # Memory mapping limit

# Search settings
LARGEFILE_FUZZY_THRESHOLD=0.8           # Fuzzy match sensitivity (0.0-1.0)
LARGEFILE_MAX_SEARCH_RESULTS=20         # Result limit per search
LARGEFILE_CONTEXT_LINES=2               # Context lines around matches

# Error recovery
LARGEFILE_SIMILAR_MATCH_LIMIT=3         # Similar matches shown on edit failure
LARGEFILE_SIMILAR_MATCH_THRESHOLD=0.6   # Min similarity for suggestions

# Backup management
LARGEFILE_BACKUP_DIR="~/.largefile/backups"  # Backup location
LARGEFILE_MAX_BACKUPS=10                # Backups retained per file

# Batch editing
LARGEFILE_MAX_BATCH_CHANGES=50          # Max changes per batch call

# Performance
LARGEFILE_ENABLE_TREE_SITTER=true       # Semantic features
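
As an illustration of how such variables are typically consumed, the Python sketch below reads a subset of them with fallbacks. It assumes the values shown above are the defaults; the `Settings` class and `load_settings` function are hypothetical and not Largefile's own configuration code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    memory_threshold_mb: int
    mmap_threshold_mb: int
    fuzzy_threshold: float
    max_search_results: int
    enable_tree_sitter: bool


def load_settings(env=os.environ) -> Settings:
    """Read a subset of the variables above, falling back to the listed values."""
    return Settings(
        memory_threshold_mb=int(env.get("LARGEFILE_MEMORY_THRESHOLD_MB", "50")),
        mmap_threshold_mb=int(env.get("LARGEFILE_MMAP_THRESHOLD_MB", "500")),
        fuzzy_threshold=float(env.get("LARGEFILE_FUZZY_THRESHOLD", "0.8")),
        max_search_results=int(env.get("LARGEFILE_MAX_SEARCH_RESULTS", "20")),
        enable_tree_sitter=env.get("LARGEFILE_ENABLE_TREE_SITTER", "true").lower() == "true",
    )


print(load_settings({"LARGEFILE_FUZZY_THRESHOLD": "0.9"}))
```

Passing the environment in as a mapping makes the loader easy to exercise without touching the real process environment.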

Key Features

Search/replace editing - Eliminates LLM line number errors with fuzzy matching
Batch operations - Apply multiple changes atomically in one call
Smart error recovery - Failed edits show similar matches with suggestions
Backup & revert - Automatic backups with full revert capability
Tail mode - Read log file endings without knowing total line count
Semantic awareness - Tree-sitter integration for code structure
Memory efficient - Handles files of any size via tiered access strategy
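
The "similar matches" idea behind smart error recovery can be illustrated with Python's `difflib`. This is a sketch under assumptions: the scoring method Largefile actually uses is not specified here, `similar_lines` is a hypothetical name, and the `limit` and `threshold` defaults simply echo the `LARGEFILE_SIMILAR_MATCH_LIMIT` and `LARGEFILE_SIMILAR_MATCH_THRESHOLD` values.

```python
import difflib


def similar_lines(search_text, lines, limit=3, threshold=0.6):
    """Rank lines by similarity to `search_text`, best first."""
    scored = []
    for number, line in enumerate(lines, start=1):
        ratio = difflib.SequenceMatcher(None, search_text, line.strip()).ratio()
        if ratio >= threshold:
            scored.append((ratio, number, line))
    # Highest similarity first; ties resolved by line order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(number, line, round(ratio, 2)) for ratio, number, line in scored[:limit]]


code = ["def process_data(items):", "def process_date(items):", "import os"]
for number, line, score in similar_lines("def process_data(item):", code):
    print(number, score, line)
```

When an exact search string is slightly off, suggestions like these give the assistant a corrected target to retry with instead of failing outright.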

Documentation

API Reference - Detailed tool documentation
Configuration Guide - Environment variables and tuning
Examples - Real-world usage examples and workflows
Design Document - Architecture and implementation details
Contributing - Development setup and guidelines

Release history

Version	Release date
0.3.0	Apr 9, 2026
0.2.0	Mar 15, 2026
0.1.8	Mar 6, 2026
0.1.7	Mar 6, 2026
0.1.6	Mar 6, 2026
0.1.5	Jan 19, 2026
0.1.4	Jan 18, 2026
0.1.3	Jan 17, 2026
0.1.2 (this version)	Jan 9, 2026
0.1.1	Jul 20, 2025
0.1.0	Jul 20, 2025
0.0.1	Jul 9, 2025

Download files

Source Distribution

largefile-0.1.2.tar.gz (2.6 MB)

Uploaded Jan 9, 2026 Source

Built Distribution

largefile-0.1.2-py3-none-any.whl (28.4 kB)

Uploaded Jan 9, 2026 Python 3

File details

Details for the file largefile-0.1.2.tar.gz.

File metadata

Download URL: largefile-0.1.2.tar.gz
Upload date: Jan 9, 2026
Size: 2.6 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.24

File hashes

Hashes for largefile-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`d9a598a38cd405cd3007ec58621dbdfc1c7d1628edd643a09fcd46dbabae1d6c`
MD5	`2582b53cd8ca6c96c77486081c007ff8`
BLAKE2b-256	`947bde52271685d6c213f0d3dd3f9623c892947ea1a1e15e271b04c9128e30bc`

File details

Details for the file largefile-0.1.2-py3-none-any.whl.

File metadata

Download URL: largefile-0.1.2-py3-none-any.whl
Upload date: Jan 9, 2026
Size: 28.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.24

File hashes

Hashes for largefile-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e33d8b6d271c59658fe1085ce90e283fc344042b5f015b58402c1919821e985f`
MD5	`b287c08b06370a0f60588e6b9f0a4927`
BLAKE2b-256	`cd7c0cdf728380c924a60f4dfdcaa37b5782d1d19217dd27e506ffb14d35ad8f`
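
A downloaded file can be checked against the SHA256 digests listed above with Python's `hashlib`. The sketch below hashes in chunks so the file never has to fit in memory; the function name `sha256_of` is hypothetical.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks and return the hexadecimal SHA256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the returned string with the SHA256 value listed for the file that was downloaded; any difference means the file should not be installed.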