MCP server for AI assistants to navigate, search, and edit large codebases, logs, and data files with semantic code analysis

Largefile MCP Server

Navigate, search, and edit large codebases, logs, and data files that exceed AI context limits.

Requires Python 3.10+. License: MIT.

Why Largefile?

  • Go beyond context limits - Read, search, and edit files too large to fit in AI context windows
  • Semantic code navigation - Tree-sitter extracts functions/classes for Python, JS/TS, Rust, Go
  • Fewer LLM errors - Search/replace editing avoids the line-number mistakes common with line-based edits
  • Smart search - Fuzzy matching, regex, case-insensitive, inverted, and count-only modes
  • No size limits - Handles multi-GB files via tiered memory strategy (RAM → mmap → streaming)
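To give a feel for what fuzzy matching with a similarity threshold means, here is a minimal sketch. It uses Python's standard `difflib.SequenceMatcher` as a stand-in; the README does not say which algorithm Largefile uses, and `fuzzy_find` is a hypothetical helper, not part of the server's API. The 0.8 default mirrors the `LARGEFILE_FUZZY_THRESHOLD` value shown under Configuration.

```python
from difflib import SequenceMatcher


def fuzzy_find(lines, query, threshold=0.8):
    """Return (line_number, line, score) for lines similar to `query`.

    Hypothetical helper: difflib is an assumption, not Largefile's matcher.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they diverge.
        score = SequenceMatcher(None, query.lower(), line.strip().lower()).ratio()
        if score >= threshold:
            hits.append((number, line, round(score, 2)))
    return hits


lines = ["def process_data(x):", "def proces_data(x):", "import os"]
matches = fuzzy_find(lines, "def process_data(x):")
for number, line, score in matches:
    print(number, score, line)
```

A higher threshold keeps only near-exact lines; a lower one tolerates more typos at the cost of more false positives.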

Quick Start

Prerequisite: Install uv, which provides the uvx command. Then add the server to your MCP client's configuration:

{
  "mcpServers": {
    "largefile": {
      "command": "uvx",
      "args": ["--from", "largefile", "largefile-mcp"]
    }
  }
}
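If your client already has other MCP servers configured, the entry above needs to be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the config is a JSON object with a top-level `mcpServers` key as shown; `add_largefile_server` and the `other-server` entry are hypothetical names used only for illustration.

```python
import json


def add_largefile_server(config):
    """Return a copy of an MCP client config with the largefile entry added."""
    merged = dict(config)
    # Copy the existing server map so the caller's dict is left untouched.
    servers = dict(merged.get("mcpServers", {}))
    servers["largefile"] = {
        "command": "uvx",
        "args": ["--from", "largefile", "largefile-mcp"],
    }
    merged["mcpServers"] = servers
    return merged


# "other-server" is a placeholder for whatever is already configured.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
merged_config = add_largefile_server(existing)
print(json.dumps(merged_config, indent=2))
```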

Tools

Tool             Use For
get_overview     File structure and semantic outline before diving in
search_content   Finding patterns, counting occurrences, regex matching
read_content     Reading specific sections; tail/head modes for logs
edit_content     Safe search/replace with automatic backups
revert_edit      Recovering from bad edits

When to Use Largefile

Use when:

  • File exceeds ~1000 lines or 100KB (supports multi-GB files)
  • Navigating large codebases with semantic structure
  • Analyzing log files (especially recent entries with tail mode)
  • Making search/replace edits across large files
  • Counting occurrences without loading full content

Don't use for:

  • Small files that fit in context (AI doesn't need help with those)
  • Binary files (images, executables, compressed)

Usage Examples

Large Codebase Navigation

# Get semantic structure of a large Python file
overview = get_overview("/path/to/large_module.py")
# Returns: 2,847 lines, 15 classes, function outline via Tree-sitter

# Find all class definitions
classes = search_content("/path/to/large_module.py", "class ", fuzzy=False)

# Read complete class with semantic chunking
code = read_content("/path/to/large_module.py", pattern="class UserModel", mode="semantic")
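For intuition about what a semantic outline contains, the sketch below builds one for Python source using the standard-library `ast` module. This is a swapped-in technique: Largefile uses Tree-sitter, which also covers the other supported languages, and `outline` is a hypothetical function rather than the server's implementation.

```python
import ast


def outline(source):
    """List (kind, name, start_line, end_line) for classes and functions."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            kind = "class"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        else:
            continue
        # end_lineno is available on AST nodes from Python 3.8 onward.
        entries.append((kind, node.name, node.lineno, node.end_lineno))
    return sorted(entries, key=lambda entry: entry[2])


sample = '''class UserModel:
    def save(self):
        return True

def helper():
    pass
'''

for kind, name, start, end in outline(sample):
    print(f"{kind:8} {name:10} lines {start}-{end}")
```

Knowing each definition's start and end line is what lets a reader fetch a whole class or function as one chunk instead of an arbitrary line window.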

Batch Refactoring

# Preview rename across file
preview = edit_content("/path/to/api.py", changes=[
    {"search": "process_data", "replace": "transform_data"},
    {"search": "old_endpoint", "replace": "new_endpoint"}
], preview=True)

# Apply changes (creates automatic backup)
result = edit_content("/path/to/api.py", changes=[...], preview=False)

# Undo if needed
revert_edit("/path/to/api.py")
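The preview-then-apply flow can be pictured with a small in-memory sketch. It assumes each change is a literal search/replace applied to every occurrence, which the README does not confirm; `apply_changes` is a hypothetical function, and backups and file I/O are left out.

```python
def apply_changes(text, changes, preview=True):
    """Apply literal search/replace pairs and report matches per change."""
    report = []
    updated = text
    for change in changes:
        # Count against the running text so later changes see earlier ones.
        count = updated.count(change["search"])
        report.append({"search": change["search"], "matches": count})
        updated = updated.replace(change["search"], change["replace"])
    # In preview mode the original text is returned untouched.
    return (text if preview else updated), report


source = "process_data(old_endpoint)\nprocess_data(x)\n"
changes = [
    {"search": "process_data", "replace": "transform_data"},
    {"search": "old_endpoint", "replace": "new_endpoint"},
]

previewed, report = apply_changes(source, changes, preview=True)
applied, _ = apply_changes(source, changes, preview=False)
print(report)
print(applied)
```

Because edits are addressed by content rather than by line number, they stay valid even when earlier edits shift the lines below them.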

Log Analysis

# Get log file overview
overview = get_overview("/var/log/app.log")
# Returns: 150,000 lines, 2.1GB

# Read last 500 lines efficiently
recent = read_content("/var/log/app.log", limit=500, mode="tail")

# Count errors without loading content
error_count = search_content("/var/log/app.log", "ERROR", count_only=True, fuzzy=False)

# Find errors with regex
errors = search_content("/var/log/app.log", r"ERROR.*timeout", regex=True)
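Tail reads and count-only searches are cheap because neither needs the whole file in memory. The sketch below shows one common way to do each with the standard library: reading blocks backwards from the end of the file, and counting matches one line at a time. `tail_lines` and `count_matches` are hypothetical helpers under those assumptions, not Largefile's code.

```python
import os
import re
import tempfile


def tail_lines(path, limit, block_size=64 * 1024):
    """Return the last `limit` lines by reading blocks backwards from the end."""
    if limit <= 0:
        return []
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        position = handle.tell()
        data = b""
        # Read until there are more than `limit` newlines, so the first
        # line that is kept is guaranteed to be complete.
        while position > 0 and data.count(b"\n") <= limit:
            step = min(block_size, position)
            position -= step
            handle.seek(position)
            data = handle.read(step) + data
    return [line.decode("utf-8", "replace") for line in data.splitlines()[-limit:]]


def count_matches(path, pattern):
    """Count matching lines while holding only one line in memory at a time."""
    compiled = re.compile(pattern)
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if compiled.search(line))


with tempfile.TemporaryDirectory() as folder:
    log_path = os.path.join(folder, "app.log")
    with open(log_path, "w", encoding="utf-8") as handle:
        handle.write("INFO start\nERROR db timeout\nINFO ok\nERROR disk full\n")
    # A tiny block size forces several backward reads for demonstration.
    recent = tail_lines(log_path, 2, block_size=8)
    error_count = count_matches(log_path, r"ERROR")
    timeouts = count_matches(log_path, r"ERROR.*timeout")

print(recent, error_count, timeouts)
```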

Supported Languages

Tree-sitter semantic analysis for: Python, JavaScript/JSX, TypeScript/TSX, Rust, Go

Other file types use text-based analysis with graceful fallback.
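The fallback can be thought of as a lookup with a default. A minimal sketch, assuming the language is chosen from the file extension; the extension map and `analysis_mode` are illustrative guesses, since the README does not describe how Largefile detects languages.

```python
from pathlib import Path

# Assumed extension map covering the languages listed above.
SEMANTIC_LANGUAGES = {
    ".py": "python",
    ".js": "javascript",
    ".jsx": "javascript",
    ".ts": "typescript",
    ".tsx": "typescript",
    ".rs": "rust",
    ".go": "go",
}


def analysis_mode(path):
    """Return ("semantic", language) when supported, else ("text", None)."""
    language = SEMANTIC_LANGUAGES.get(Path(path).suffix.lower())
    return ("semantic", language) if language else ("text", None)


print(analysis_mode("/path/to/large_module.py"))
print(analysis_mode("/var/log/app.log"))
```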

File Size Handling

Size        Strategy
< 50MB      Full memory loading with AST caching
50-500MB    Memory-mapped access
> 500MB     Streaming (tail/head modes recommended)
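The tiers above amount to a simple size check. A minimal sketch, assuming 1 MB means 1024 × 1024 bytes and that a file of exactly 500MB still counts as memory-mapped; `choose_strategy` is a hypothetical function, and the real boundary handling may differ.

```python
MB = 1024 * 1024  # assumption: binary megabytes


def choose_strategy(size_bytes, memory_threshold_mb=50, mmap_threshold_mb=500):
    """Pick an access strategy from the file size, following the table above."""
    if size_bytes < memory_threshold_mb * MB:
        return "memory"
    if size_bytes <= mmap_threshold_mb * MB:
        return "mmap"
    return "streaming"


for size in (10 * MB, 50 * MB, 500 * MB, 2048 * MB):
    print(size // MB, "MB ->", choose_strategy(size))
```

In practice the size would come from something like `os.path.getsize(path)` before the file is opened.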

Configuration

Environment variables for tuning:

LARGEFILE_MEMORY_THRESHOLD_MB=50      # RAM loading limit
LARGEFILE_MMAP_THRESHOLD_MB=500       # Memory mapping limit
LARGEFILE_FUZZY_THRESHOLD=0.8         # Match sensitivity (0.0-1.0)
LARGEFILE_MAX_SEARCH_RESULTS=20       # Results per search
LARGEFILE_BACKUP_DIR=~/.largefile/backups
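As a rough picture of how such variables are typically consumed, the sketch below reads them into a typed settings object. It assumes the values listed above are also the defaults, which the README implies but does not state; `Settings` and `load_settings` are hypothetical names, not Largefile's internals.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Settings:
    memory_threshold_mb: int
    mmap_threshold_mb: int
    fuzzy_threshold: float
    max_search_results: int
    backup_dir: Path


def load_settings(env=None):
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    return Settings(
        memory_threshold_mb=int(env.get("LARGEFILE_MEMORY_THRESHOLD_MB", "50")),
        mmap_threshold_mb=int(env.get("LARGEFILE_MMAP_THRESHOLD_MB", "500")),
        fuzzy_threshold=float(env.get("LARGEFILE_FUZZY_THRESHOLD", "0.8")),
        max_search_results=int(env.get("LARGEFILE_MAX_SEARCH_RESULTS", "20")),
        # os.path.expanduser leaves the path unchanged if no home is known.
        backup_dir=Path(
            os.path.expanduser(env.get("LARGEFILE_BACKUP_DIR", "~/.largefile/backups"))
        ),
    )


defaults = load_settings({})
tuned = load_settings({"LARGEFILE_FUZZY_THRESHOLD": "0.9", "LARGEFILE_BACKUP_DIR": "/tmp/backups"})
print(defaults)
print(tuned)
```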
