Largefile MCP Server

Navigate, search, and edit large codebases, logs, and data files that exceed AI context limits.

Requires Python 3.10+. License: MIT.

Why Largefile?

  • Go beyond context limits - Read, search, and edit files too large to fit in AI context windows
  • Semantic code navigation - Tree-sitter extracts functions/classes for Python, JS/TS, Rust, Go
  • Fewer LLM errors - Search/replace editing eliminates line number mistakes common with line-based edits
  • Smart search - Fuzzy matching, regex, case-insensitive, inverted, and count-only modes
  • No size limits - Handles multi-GB files via tiered memory strategy (RAM → mmap → streaming)
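To make the fuzzy matching idea concrete, here is a minimal sketch of threshold-based approximate search. It uses the standard library's difflib as a stand-in; the README does not say which algorithm Largefile uses, and the helper names best_window_score and fuzzy_search are hypothetical. The 0.8 threshold mirrors LARGEFILE_FUZZY_THRESHOLD from the Configuration section.

```python
from difflib import SequenceMatcher


def best_window_score(line: str, pattern: str) -> float:
    """Best similarity between `pattern` and any same-length slice of `line`."""
    width = len(pattern)
    if len(line) <= width:
        return SequenceMatcher(None, pattern, line).ratio()
    return max(
        SequenceMatcher(None, pattern, line[i:i + width]).ratio()
        for i in range(len(line) - width + 1)
    )


def fuzzy_search(lines, pattern, threshold=0.8):
    """Return (line_number, score) for lines that approximately contain `pattern`."""
    hits = []
    for number, line in enumerate(lines, start=1):
        score = best_window_score(line, pattern)
        if score >= threshold:
            hits.append((number, round(score, 3)))
    return hits


# The misspelled identifier on line 1 still matches; line 2 does not.
source = ["def proces_data(x):", "    return x"]
hits = fuzzy_search(source, "process_data")
```

The sliding window is quadratic in line length, which is fine for a sketch but would need a faster matcher on multi-GB inputs.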

Quick Start

Prerequisite: Install uv, which provides the uvx command. Then add the server to your MCP client configuration:

{
  "mcpServers": {
    "largefile": {
      "command": "uvx",
      "args": ["--from", "largefile", "largefile-mcp"]
    }
  }
}

Tools

Tool            Use For
get_overview    File structure and semantic outline before diving in
search_content  Finding patterns, counting occurrences, regex matching
read_content    Reading specific sections; tail/head modes for logs
edit_content    Safe search/replace with automatic backups
revert_edit     Recovering from bad edits

When to Use Largefile

Use when:

  • File exceeds ~1000 lines or 100KB (supports multi-GB files)
  • Navigating large codebases with semantic structure
  • Analyzing log files (especially recent entries with tail mode)
  • Making search/replace edits across large files
  • Counting occurrences without loading full content

Don't use for:

  • Small files that fit in context (AI doesn't need help with those)
  • Binary files (images, executables, compressed)
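The guidance above can be sketched as a small pre-check. This is an illustration only, not part of Largefile: should_use_largefile and looks_binary are hypothetical helpers, "100KB" is read as 100 * 1024 bytes, and the NUL-byte test for binary content is an assumption of this sketch.

```python
import os
import tempfile

LINE_LIMIT = 1000         # "~1000 lines" from the list above
SIZE_LIMIT = 100 * 1024   # "100KB", interpreted here as 100 * 1024 bytes


def looks_binary(sample: bytes) -> bool:
    """Heuristic (assumption): a NUL byte in the first bytes suggests binary data."""
    return b"\x00" in sample


def should_use_largefile(path: str) -> bool:
    """Return True when the file is text and exceeds the size or line guidance."""
    with open(path, "rb") as handle:
        if looks_binary(handle.read(8192)):
            return False
        if os.path.getsize(path) > SIZE_LIMIT:
            return True
        handle.seek(0)
        # Count newlines in chunks and stop as soon as the limit is passed.
        lines = 0
        while chunk := handle.read(1 << 16):
            lines += chunk.count(b"\n")
            if lines > LINE_LIMIT:
                return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    small = os.path.join(tmp, "small.txt")
    big = os.path.join(tmp, "big.txt")
    with open(small, "w") as f:
        f.write("hello\n")
    with open(big, "w") as f:
        f.write("line\n" * 2000)
    small_result = should_use_largefile(small)
    big_result = should_use_largefile(big)
```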

Usage Examples

Large Codebase Navigation

# Get semantic structure of a large Python file
overview = get_overview("/path/to/large_module.py")
# Returns: 2,847 lines, 15 classes, function outline via Tree-sitter

# Find all class definitions
classes = search_content("/path/to/large_module.py", "class ", fuzzy=False)

# Read complete class with semantic chunking
code = read_content("/path/to/large_module.py", pattern="class UserModel", mode="semantic")

Batch Refactoring

# Define the renames once so preview and apply use the same list
changes = [
    {"search": "process_data", "replace": "transform_data"},
    {"search": "old_endpoint", "replace": "new_endpoint"},
]

# Preview rename across file
preview = edit_content("/path/to/api.py", changes=changes, preview=True)

# Apply changes (creates automatic backup)
result = edit_content("/path/to/api.py", changes=changes, preview=False)

# Undo if needed
revert_edit("/path/to/api.py")
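For intuition on why search/replace edits avoid line-number mistakes, here is a minimal sketch of a preview-then-apply step on an in-memory string. It is not Largefile's implementation: apply_changes is a hypothetical helper, matching is exact only, and the automatic backup is not modeled.

```python
import difflib


def apply_changes(text, changes, preview=True):
    """Apply exact search/replace pairs; return (text, unified_diff, counts)."""
    new_text = text
    counts = {}
    for change in changes:
        search, replace = change["search"], change["replace"]
        counts[search] = new_text.count(search)
        new_text = new_text.replace(search, replace)
    diff = "".join(difflib.unified_diff(
        text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="before",
        tofile="after",
    ))
    # In preview mode the caller sees the diff but gets the original text back.
    return (text if preview else new_text), diff, counts


original = "x = process_data(y)\ncall(old_endpoint)\n"
renames = [
    {"search": "process_data", "replace": "transform_data"},
    {"search": "old_endpoint", "replace": "new_endpoint"},
]
kept, diff, counts = apply_changes(original, renames, preview=True)
edited, _, _ = apply_changes(original, renames, preview=False)
```

Because each change is located by its content, the edit stays valid even if earlier edits shift line numbers.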

Log Analysis

# Get log file overview
overview = get_overview("/var/log/app.log")
# Returns: 150,000 lines, 2.1GB

# Read last 500 lines efficiently
recent = read_content("/var/log/app.log", limit=500, mode="tail")

# Count errors without loading content
error_count = search_content("/var/log/app.log", "ERROR", count_only=True, fuzzy=False)

# Find errors with regex
errors = search_content("/var/log/app.log", r"ERROR.*timeout", regex=True)
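Tail mode is efficient because only the end of the file needs to be read. The sketch below shows one common way to do that, reading fixed-size blocks backwards from the end; it is an assumption about the general technique, not Largefile's actual code, and tail_lines and block_size are hypothetical names.

```python
import io
import os


def tail_lines(handle, limit, block_size=64 * 1024):
    """Return the last `limit` lines of a binary file object without reading it all."""
    if limit <= 0:
        return []
    handle.seek(0, os.SEEK_END)
    position = handle.tell()
    buffer = b""
    # Read blocks backwards until the buffer holds more than `limit` newlines,
    # which guarantees the first of the requested lines is complete.
    while position > 0 and buffer.count(b"\n") <= limit:
        step = min(block_size, position)
        position -= step
        handle.seek(position)
        buffer = handle.read(step) + buffer
    lines = buffer.splitlines()[-limit:]
    return [line.decode("utf-8", errors="replace") for line in lines]


# In-memory stand-in for a log file with 1000 lines.
log = io.BytesIO(b"".join(b"line %d\n" % i for i in range(1, 1001)))
recent = tail_lines(log, 3, block_size=16)
```

The same function works on a real file opened with open(path, "rb").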

Supported Languages

Tree-sitter semantic analysis for: Python, JavaScript/JSX, TypeScript/TSX, Rust, Go

Other file types use text-based analysis with graceful fallback.
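The parse-then-fall-back pattern can be sketched as follows. Python's built-in ast module stands in for Tree-sitter here, so this example covers Python source only; the outline function and the regex are hypothetical and only illustrate what a graceful fallback to text-based analysis might look like.

```python
import ast
import re

# Text fallback: lines that look like Python-style definitions.
DEFINITION = re.compile(r"^\s*(class|def)\s+(\w+)")


def outline(source: str):
    """Return [(kind, name, line_number)] via a parser, or via regex if parsing fails."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Graceful fallback: scan the text line by line instead of failing.
        found = []
        for number, line in enumerate(source.splitlines(), start=1):
            match = DEFINITION.match(line)
            if match:
                found.append((match.group(1), match.group(2), number))
        return found
    kinds = {
        ast.ClassDef: "class",
        ast.FunctionDef: "def",
        ast.AsyncFunctionDef: "def",
    }
    found = [
        (kinds[type(node)], node.name, node.lineno)
        for node in ast.walk(tree)
        if type(node) in kinds
    ]
    return sorted(found, key=lambda item: item[2])


valid = "class UserModel:\n    def save(self):\n        pass\n"
broken = "class UserModel:\n    def save(self:\n"
parsed_outline = outline(valid)
fallback_outline = outline(broken)
```

Both calls return the same outline, which is the point of the fallback: a file that cannot be parsed still yields a usable structure.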

File Size Handling

Size        Strategy
< 50MB      Full memory loading with AST caching
50-500MB    Memory-mapped access
> 500MB     Streaming (tail/head modes recommended)

Configuration

Environment variables for tuning:

LARGEFILE_MEMORY_THRESHOLD_MB=50      # RAM loading limit
LARGEFILE_MMAP_THRESHOLD_MB=500       # Memory mapping limit
LARGEFILE_FUZZY_THRESHOLD=0.8         # Match sensitivity (0.0-1.0)
LARGEFILE_MAX_SEARCH_RESULTS=20       # Results per search
LARGEFILE_BACKUP_DIR=~/.largefile/backups
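As a rough illustration of how these variables could be read and how the two thresholds might drive the tiers under File Size Handling, consider the sketch below. It assumes the values shown above are the defaults and that a file exactly at a threshold falls into the higher tier for 50MB and the lower tier for 500MB; Settings, load_settings, and choose_strategy are hypothetical names, not Largefile's API.

```python
import os
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the values listed above (assumed to be the defaults).
    memory_threshold_mb: int = 50
    mmap_threshold_mb: int = 500
    fuzzy_threshold: float = 0.8
    max_search_results: int = 20
    backup_dir: str = "~/.largefile/backups"


def load_settings(env=None) -> Settings:
    """Read LARGEFILE_* variables from a mapping, falling back to the defaults."""
    env = os.environ if env is None else env
    d = Settings()
    settings = Settings(
        memory_threshold_mb=int(env.get("LARGEFILE_MEMORY_THRESHOLD_MB", d.memory_threshold_mb)),
        mmap_threshold_mb=int(env.get("LARGEFILE_MMAP_THRESHOLD_MB", d.mmap_threshold_mb)),
        fuzzy_threshold=float(env.get("LARGEFILE_FUZZY_THRESHOLD", d.fuzzy_threshold)),
        max_search_results=int(env.get("LARGEFILE_MAX_SEARCH_RESULTS", d.max_search_results)),
        backup_dir=os.path.expanduser(env.get("LARGEFILE_BACKUP_DIR", d.backup_dir)),
    )
    if not 0.0 <= settings.fuzzy_threshold <= 1.0:
        raise ValueError("LARGEFILE_FUZZY_THRESHOLD must be between 0.0 and 1.0")
    return settings


def choose_strategy(size_bytes: int, settings: Settings) -> str:
    """Pick a tier for a file of `size_bytes` (boundary handling is an assumption)."""
    if size_bytes < settings.memory_threshold_mb * MB:
        return "memory"
    if size_bytes <= settings.mmap_threshold_mb * MB:
        return "mmap"
    return "streaming"


settings = load_settings({"LARGEFILE_MEMORY_THRESHOLD_MB": "100"})
tier = choose_strategy(80 * MB, settings)  # "memory", because the RAM limit was raised
```

Passing a plain dict instead of os.environ keeps the loader easy to test.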
