Skip to main content

MCP server for AI assistants to navigate, search, and edit large codebases, logs, and data files with semantic code analysis

Project description

Largefile MCP Server

Navigate, search, and edit large codebases, logs, and data files that exceed AI context limits.

CI codecov PyPI version Python 3.10+ License: MIT

Why Largefile?

  • Go beyond context limits - Read, search, and edit files too large to fit in AI context windows
  • Semantic code navigation - Tree-sitter extracts functions/classes for Python, JS/TS, Rust, Go
  • Fewer LLM errors - Search/replace editing eliminates line number mistakes common with line-based edits
  • Smart search - Fuzzy matching, regex, case-insensitive, inverted, and count-only modes
  • No size limits - Handles multi-GB files via tiered memory strategy (RAM → mmap → streaming)

Quick Start

Prerequisite: Install uv for the uvx command.

{
  "mcpServers": {
    "largefile": {
      "command": "uvx",
      "args": ["--from", "largefile", "largefile-mcp"]
    }
  }
}

Tools

Tool Use For
get_overview File structure and semantic outline before diving in
search_content Finding patterns, counting occurrences, regex matching
read_content Reading specific sections; tail/head modes for logs
edit_content Safe search/replace with automatic backups
revert_edit Recovering from bad edits
list_directory Browse directory trees with recursive depth control
search_directory Search patterns across all files in a directory

When to Use Largefile

Use when:

  • File exceeds ~1000 lines or 100KB (supports multi-GB files)
  • Navigating large codebases with semantic structure
  • Analyzing log files (especially recent entries with tail mode)
  • Making search/replace edits across large files
  • Counting occurrences without loading full content

Don't use for:

  • Small files that fit in context (AI doesn't need help with those)
  • Binary files (images, executables, compressed)

Usage Examples

Large Codebase Navigation

# Get semantic structure of a large Python file
overview = get_overview("/path/to/large_module.py")
# Returns: 2,847 lines, 15 classes, function outline via Tree-sitter

# Find all class definitions
classes = search_content("/path/to/large_module.py", "class ", fuzzy=False)

# Read complete class with semantic chunking
code = read_content("/path/to/large_module.py", pattern="class UserModel", mode="semantic")

Batch Refactoring

# Preview rename across file
preview = edit_content("/path/to/api.py", changes=[
    {"search": "process_data", "replace": "transform_data"},
    {"search": "old_endpoint", "replace": "new_endpoint"}
], preview=True)

# Apply changes (creates automatic backup)
result = edit_content("/path/to/api.py", changes=[...], preview=False)

# Undo if needed
revert_edit("/path/to/api.py")

Log Analysis

# Get log file overview
overview = get_overview("/var/log/app.log")
# Returns: 150,000 lines, 2.1GB

# Read last 500 lines efficiently
recent = read_content("/var/log/app.log", limit=500, mode="tail")

# Count errors without loading content
error_count = search_content("/var/log/app.log", "ERROR", count_only=True, fuzzy=False)

# Find errors with regex
errors = search_content("/var/log/app.log", r"ERROR.*timeout", regex=True)

Supported Languages

Tree-sitter semantic analysis for: Python, JavaScript/JSX, TypeScript/TSX, Rust, Go, Java

Other file types use text-based analysis with graceful fallback.

File Size Handling

Size Strategy
< 50MB Full memory loading with AST caching
50-500MB Memory-mapped access
> 500MB Streaming (tail/head modes recommended)

Configuration

Environment variables for tuning:

LARGEFILE_MEMORY_THRESHOLD_MB=50      # RAM loading limit
LARGEFILE_MMAP_THRESHOLD_MB=500       # Memory mapping limit
LARGEFILE_FUZZY_THRESHOLD=0.8         # Match sensitivity (0.0-1.0)
LARGEFILE_MAX_SEARCH_RESULTS=20       # Results per search
LARGEFILE_BACKUP_DIR=~/.largefile/backups

Documentation

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

largefile-0.1.6.tar.gz (2.6 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

largefile-0.1.6-py3-none-any.whl (35.2 kB view details)

Uploaded Python 3

File details

Details for the file largefile-0.1.6.tar.gz.

File metadata

  • Download URL: largefile-0.1.6.tar.gz
  • Upload date:
  • Size: 2.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.8 {"installer":{"name":"uv","version":"0.10.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for largefile-0.1.6.tar.gz
Algorithm Hash digest
SHA256 e60a0d586b02f1973bff25e210cd2986cdc9f5dae6374b745f6ec21bd33e7738
MD5 a2bacac3361db63fe47949965432b365
BLAKE2b-256 bffd6f06c22a6e97a4eb58af9fd41bdf86a79189c6af9b6c54702c0401d71540

See more details on using hashes here.

File details

Details for the file largefile-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: largefile-0.1.6-py3-none-any.whl
  • Upload date:
  • Size: 35.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.8 {"installer":{"name":"uv","version":"0.10.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for largefile-0.1.6-py3-none-any.whl
Algorithm Hash digest
SHA256 b67980c85914f12a04540e74ac73a1851c6dbf2a0d78fadea7b9c5586fc465a3
MD5 5933cf20cd380393b1bfb514681fbde4
BLAKE2b-256 5d21091f20131cd12a4364db8f5966442a41afcbee4325372ee9fdaf0e6fee9c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page