MCP server for AI assistants to navigate, search, and edit large codebases, logs, and data files with semantic code analysis
Largefile MCP Server
Navigate, search, and edit large codebases, logs, and data files that exceed AI context limits.
Why Largefile?
- Go beyond context limits - Read, search, and edit files too large to fit in AI context windows
- Semantic code navigation - Tree-sitter extracts functions/classes for Python, JS/TS, Rust, Go
- Fewer LLM errors - Search/replace editing eliminates line number mistakes common with line-based edits
- Smart search - Fuzzy matching, regex, case-insensitive, inverted, and count-only modes
- No size limits - Handles multi-GB files via tiered memory strategy (RAM → mmap → streaming)
Quick Start
Prerequisite: Install uv for the uvx command.
{
  "mcpServers": {
    "largefile": {
      "command": "uvx",
      "args": ["--from", "largefile", "largefile-mcp"]
    }
  }
}
Tools
| Tool | Use For |
|---|---|
| get_overview | File structure and semantic outline before diving in |
| search_content | Finding patterns, counting occurrences, regex matching |
| read_content | Reading specific sections; tail/head modes for logs |
| edit_content | Safe search/replace with automatic backups |
| revert_edit | Recovering from bad edits |
When to Use Largefile
Use when:
- File exceeds ~1000 lines or 100KB (supports multi-GB files)
- Navigating large codebases with semantic structure
- Analyzing log files (especially recent entries with tail mode)
- Making search/replace edits across large files
- Counting occurrences without loading full content
Don't use for:
- Small files that fit in context (AI doesn't need help with those)
- Binary files (images, executables, compressed)
Usage Examples
Large Codebase Navigation
# Get semantic structure of a large Python file
overview = get_overview("/path/to/large_module.py")
# Returns: 2,847 lines, 15 classes, function outline via Tree-sitter
# Find all class definitions
classes = search_content("/path/to/large_module.py", "class ", fuzzy=False)
# Read complete class with semantic chunking
code = read_content("/path/to/large_module.py", pattern="class UserModel", mode="semantic")
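To make the idea of a semantic outline concrete, here is a minimal sketch of extracting class and function spans from Python source. It uses the standard-library `ast` module as a stand-in for Tree-sitter so that it runs without extra dependencies; the `outline` function and `SAMPLE` text are hypothetical and are not part of Largefile's API.

```python
import ast

def outline(source: str) -> list[tuple[str, str, int, int]]:
    """Return (kind, name, start_line, end_line) for each class and function."""
    items = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            kind = "class"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        else:
            continue
        items.append((kind, node.name, node.lineno, node.end_lineno))
    # ast.walk is breadth-first, so sort by starting line for a readable outline.
    return sorted(items, key=lambda item: item[2])

# Hypothetical sample input, standing in for a large module.
SAMPLE = '''class UserModel:
    def save(self):
        return True

def helper():
    pass
'''

for kind, name, start, end in outline(SAMPLE):
    print(f"{kind} {name}: lines {start}-{end}")
```

Knowing the start and end line of each definition is what lets a reader fetch one complete class or function instead of an arbitrary window of lines.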
Batch Refactoring
# Preview rename across file
preview = edit_content("/path/to/api.py", changes=[
    {"search": "process_data", "replace": "transform_data"},
    {"search": "old_endpoint", "replace": "new_endpoint"}
], preview=True)
# Apply changes (creates automatic backup)
result = edit_content("/path/to/api.py", changes=[...], preview=False)
# Undo if needed
revert_edit("/path/to/api.py")
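As a rough illustration of why search/replace edits avoid line-number mistakes, the sketch below applies a list of search/replace pairs to a string and produces a unified diff for review. It is an assumption about how a preview could work, not Largefile's implementation; `preview_changes` is a hypothetical helper built on the standard-library `difflib`.

```python
import difflib

def preview_changes(text: str, changes: list[dict[str, str]]) -> tuple[str, str]:
    """Apply search/replace pairs in order; return (new_text, unified_diff)."""
    new_text = text
    for change in changes:
        # Fail loudly instead of silently skipping a pattern that is absent.
        if change["search"] not in new_text:
            raise ValueError(f"search text not found: {change['search']!r}")
        new_text = new_text.replace(change["search"], change["replace"])
    diff = "".join(difflib.unified_diff(
        text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="before",
        tofile="after",
    ))
    return new_text, diff

original = "def process_data(x):\n    return old_endpoint(x)\n"
updated, diff = preview_changes(original, [
    {"search": "process_data", "replace": "transform_data"},
    {"search": "old_endpoint", "replace": "new_endpoint"},
])
print(diff)
```

Because each change is located by its text rather than by position, earlier edits that shift line numbers do not invalidate later ones.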
Log Analysis
# Get log file overview
overview = get_overview("/var/log/app.log")
# Returns: 150,000 lines, 2.1GB
# Read last 500 lines efficiently
recent = read_content("/var/log/app.log", limit=500, mode="tail")
# Count errors without loading content
error_count = search_content("/var/log/app.log", "ERROR", count_only=True, fuzzy=False)
# Find errors with regex
errors = search_content("/var/log/app.log", r"ERROR.*timeout", regex=True)
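One common way to read the end of a very large log without loading it all is to seek to the end and read fixed-size blocks backwards until enough lines are buffered. The sketch below shows that technique; `tail_lines` and its `block_size` parameter are hypothetical, and Largefile's tail mode may work differently.

```python
import os
import tempfile

def tail_lines(path: str, limit: int, block_size: int = 64 * 1024) -> list[str]:
    """Return the last `limit` lines by reading fixed-size blocks from the end."""
    if limit <= 0:
        return []
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        position = handle.tell()
        data = b""
        # Prepend blocks until more than `limit` newlines are buffered
        # (so the oldest wanted line is complete) or the file start is reached.
        while position > 0 and data.count(b"\n") <= limit:
            read_size = min(block_size, position)
            position -= read_size
            handle.seek(position)
            data = handle.read(read_size) + data
    # Decode only after slicing whole lines, so multi-byte characters stay intact.
    return [line.decode("utf-8", errors="replace") for line in data.splitlines()[-limit:]]

# Demonstration on a small temporary file with a deliberately tiny block size.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "app.log")
    with open(log_path, "w", encoding="utf-8") as f:
        f.writelines(f"line {i}\n" for i in range(1000))
    print(tail_lines(log_path, 3, block_size=16))
```

The amount of data read depends on the number of lines requested, not on the total file size, which is why tail-style access suits multi-GB logs.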
Supported Languages
Tree-sitter semantic analysis for: Python, JavaScript/JSX, TypeScript/TSX, Rust, Go
Other file types use text-based analysis with graceful fallback.
File Size Handling
| Size | Strategy |
|---|---|
| < 50MB | Full memory loading with AST caching |
| 50-500MB | Memory-mapped access |
| > 500MB | Streaming (tail/head modes recommended) |
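The table above can be read as a simple size-based dispatch. The following sketch encodes it under a few assumptions: the function name `choose_strategy`, the returned labels, treating 1 MB as 1024 × 1024 bytes, and assigning a file of exactly 500 MB to the memory-mapped tier are all illustrative choices, not documented behavior.

```python
MB = 1024 * 1024  # assumption: thresholds are measured in binary megabytes

def choose_strategy(size_bytes: int,
                    memory_threshold_mb: int = 50,
                    mmap_threshold_mb: int = 500) -> str:
    """Map a file size to one of three hypothetical access-strategy labels."""
    if size_bytes < memory_threshold_mb * MB:
        return "memory"      # small enough to load fully into RAM
    if size_bytes <= mmap_threshold_mb * MB:
        return "mmap"        # let the OS page the file in on demand
    return "streaming"       # read sequentially without holding the file

for size_mb in (10, 200, 2048):
    print(f"{size_mb} MB -> {choose_strategy(size_mb * MB)}")
```

The default thresholds mirror the 50 MB and 500 MB boundaries in the table; passing different values models the effect of retuning them.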
Configuration
Environment variables for tuning:
LARGEFILE_MEMORY_THRESHOLD_MB=50 # RAM loading limit
LARGEFILE_MMAP_THRESHOLD_MB=500 # Memory mapping limit
LARGEFILE_FUZZY_THRESHOLD=0.8 # Match sensitivity (0.0-1.0)
LARGEFILE_MAX_SEARCH_RESULTS=20 # Results per search
LARGEFILE_BACKUP_DIR=~/.largefile/backups # Directory for edit backups
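To give a feel for what a fuzzy threshold of 0.8 means, the sketch below scores each line against a pattern with the standard-library `difflib.SequenceMatcher` and keeps lines at or above the threshold. This is only an illustration: the similarity measure Largefile actually uses is not stated here, comparing against whole lines is an assumption, and `fuzzy_find` is a hypothetical name.

```python
import difflib

def fuzzy_find(lines: list[str], pattern: str,
               threshold: float = 0.8) -> list[tuple[int, str, float]]:
    """Return (line_number, line, score) for lines whose similarity meets the threshold."""
    results = []
    for number, line in enumerate(lines, start=1):
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they diverge.
        score = difflib.SequenceMatcher(None, pattern, line.strip()).ratio()
        if score >= threshold:
            results.append((number, line, round(score, 2)))
    return results

lines = ["def process_data(x):", "def proces_data(x):", "def render(x):"]
for hit in fuzzy_find(lines, "def process_data(x):"):
    print(hit)
```

Raising the threshold toward 1.0 admits only near-exact matches, while lowering it tolerates more typos at the cost of more false positives.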
Documentation
- API Reference - Detailed tool documentation
- Configuration Guide - All environment variables
- Examples - More workflow examples
- Design Document - Architecture details
- Contributing - Development setup