MCP server for AI assistants to navigate, search, and edit large codebases, logs, and data files with semantic code analysis

Largefile MCP Server

Navigate, search, and edit large codebases, logs, and data files that exceed AI context limits.

Requires Python 3.10+. License: MIT.

Why Largefile?

  • Go beyond context limits - Read, search, and edit files too large to fit in AI context windows
  • Semantic code navigation - Tree-sitter extracts functions and classes for Python, JavaScript/TypeScript, Rust, Go, and Java
  • Fewer LLM errors - Search/replace editing avoids the line-number mistakes common with line-based edits
  • Smart search - Fuzzy matching, regex, case-insensitive, inverted, and count-only modes
  • No size limits - Handles multi-GB files via tiered memory strategy (RAM → mmap → streaming)
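
As a rough illustration of the fuzzy matching mentioned above, the sketch below scores each line against a pattern and keeps those at or above a similarity threshold. It uses the standard library's difflib as a stand-in; the scoring method Largefile actually uses is not stated here, and `fuzzy_find` is a hypothetical helper. The 0.8 default mirrors the LARGEFILE_FUZZY_THRESHOLD value shown under Configuration.

```python
from difflib import SequenceMatcher


def fuzzy_find(lines, pattern, threshold=0.8):
    """Return (line_number, score, text) for lines similar enough to pattern.

    Hypothetical helper: difflib stands in for whatever similarity measure
    the real server uses.
    """
    hits = []
    wanted = pattern.lower()
    for number, text in enumerate(lines, start=1):
        score = SequenceMatcher(None, wanted, text.strip().lower()).ratio()
        if score >= threshold:
            hits.append((number, round(score, 2), text))
    return hits


sample = ["def process_data(x):", "def proces_data(x):", "import os"]
matches = fuzzy_find(sample, "def process_data(x):")
```

The misspelled `proces_data` still matches because only one character differs, while `import os` falls well below the threshold.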

Quick Start

Prerequisite: install uv, which provides the uvx command. Then add the server to your MCP client configuration:

{
  "mcpServers": {
    "largefile": {
      "command": "uvx",
      "args": ["--from", "largefile", "largefile-mcp"]
    }
  }
}

Tools

Tool              Use For
get_overview      File structure and semantic outline before diving in
search_content    Finding patterns, counting occurrences, regex matching
read_content      Reading specific sections; tail/head modes for logs
edit_content      Safe search/replace with automatic backups
revert_edit       Recovering from bad edits
list_directory    Browse directory trees with recursive depth control
search_directory  Search patterns across all files in a directory

When to Use Largefile

Use when:

  • File exceeds ~1000 lines or 100KB (supports multi-GB files)
  • Navigating large codebases with semantic structure
  • Analyzing log files (especially recent entries with tail mode)
  • Making search/replace edits across large files
  • Counting occurrences without loading full content

Don't use for:

  • Small files that fit in context (the assistant can read those directly)
  • Binary files (images, executables, compressed)

Usage Examples

Large Codebase Navigation

# Get semantic structure of a large Python file
overview = get_overview("/path/to/large_module.py")
# Returns: 2,847 lines, 15 classes, function outline via Tree-sitter

# Find all class definitions
classes = search_content("/path/to/large_module.py", "class ", fuzzy=False)

# Read complete class with semantic chunking
code = read_content("/path/to/large_module.py", pattern="class UserModel", mode="semantic")
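
To give a feel for what a semantic outline contains, here is a minimal sketch that lists classes and functions with their line ranges. It deliberately swaps in Python's built-in ast module for Tree-sitter, so it only handles Python source; `outline` is a hypothetical helper, not part of Largefile.

```python
import ast


def outline(source):
    """List (kind, name, start_line, end_line) for every class and function."""
    items = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            items.append(("class", node.name, node.lineno, node.end_lineno))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            items.append(("function", node.name, node.lineno, node.end_lineno))
    # ast.walk is breadth-first, so sort by starting line for a readable outline.
    return sorted(items, key=lambda item: item[2])


example = '''class UserModel:
    def save(self):
        return True

def helper():
    pass
'''
entries = outline(example)
```

With line ranges like these, a reader can fetch a whole class or function as one chunk instead of guessing where it ends.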

Batch Refactoring

# Preview rename across file
preview = edit_content("/path/to/api.py", changes=[
    {"search": "process_data", "replace": "transform_data"},
    {"search": "old_endpoint", "replace": "new_endpoint"}
], preview=True)

# Apply changes (creates automatic backup)
result = edit_content("/path/to/api.py", changes=[...], preview=False)

# Undo if needed
revert_edit("/path/to/api.py")
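
The preview-then-apply flow above can be pictured with a small local sketch. It is an assumption-laden stand-in, not Largefile's implementation: `apply_changes`, the `.bak` backup name, and the explicit `backup_dir` argument are all hypothetical, and the whole file is read into memory for simplicity.

```python
import shutil
from pathlib import Path


def apply_changes(path, changes, backup_dir, preview=True):
    """Apply exact search/replace pairs and return match counts per search.

    With preview=True nothing is written. Otherwise the original file is
    copied into backup_dir (hypothetical ".bak" naming) before being rewritten.
    """
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    counts = {}
    for change in changes:
        counts[change["search"]] = text.count(change["search"])
        text = text.replace(change["search"], change["replace"])
    if not preview:
        backup_dir = Path(backup_dir)
        backup_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, backup_dir / (path.name + ".bak"))
        path.write_text(text, encoding="utf-8")
    return counts
```

Because each change is located by its text rather than by a line number, earlier edits that shift lines do not invalidate later ones.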

Log Analysis

# Get log file overview
overview = get_overview("/var/log/app.log")
# Returns: 150,000 lines, 2.1GB

# Read last 500 lines efficiently
recent = read_content("/var/log/app.log", limit=500, mode="tail")

# Count errors without loading content
error_count = search_content("/var/log/app.log", "ERROR", count_only=True, fuzzy=False)

# Find errors with regex
errors = search_content("/var/log/app.log", r"ERROR.*timeout", regex=True)
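
For intuition on why tail and count-only modes stay cheap on large logs, the sketch below reads blocks backwards from the end of the file and counts matches one line at a time. Both functions are hypothetical illustrations; how Largefile implements these modes internally is not described here.

```python
import os


def tail_lines(path, limit=500, block_size=64 * 1024):
    """Return the last `limit` lines by reading blocks backwards from EOF."""
    if limit <= 0:
        return []
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        position = fh.tell()
        data = b""
        # Keep reading until there are more than `limit` newlines, so the
        # first kept line is guaranteed to be complete.
        while position > 0 and data.count(b"\n") <= limit:
            step = min(block_size, position)
            position -= step
            fh.seek(position)
            data = fh.read(step) + data
    lines = data.splitlines()[-limit:]
    return [line.decode("utf-8", errors="replace") for line in lines]


def count_matches(path, needle):
    """Count lines containing `needle`, holding one line in memory at a time."""
    target = needle.encode("utf-8")
    with open(path, "rb") as fh:
        return sum(1 for line in fh if target in line)
```

Only the final blocks of the file are touched by `tail_lines`, so its cost depends on `limit` rather than on total file size.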

Supported Languages

Tree-sitter semantic analysis for: Python, JavaScript/JSX, TypeScript/TSX, Rust, Go, Java

Other file types use text-based analysis with graceful fallback.
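
The fallback behaviour can be sketched as a lookup by file extension. The extension table below is an assumption built from the language list above, and `analysis_mode` is a hypothetical helper; the real server may detect languages differently.

```python
from pathlib import Path

# Hypothetical mapping derived from the supported-language list.
LANGUAGE_BY_SUFFIX = {
    ".py": "python",
    ".js": "javascript",
    ".jsx": "javascript",
    ".ts": "typescript",
    ".tsx": "typescript",
    ".rs": "rust",
    ".go": "go",
    ".java": "java",
}


def analysis_mode(path):
    """Return ("semantic", language) for known suffixes, else ("text", None)."""
    language = LANGUAGE_BY_SUFFIX.get(Path(path).suffix.lower())
    return ("semantic", language) if language else ("text", None)
```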

File Size Handling

Size       Strategy
< 50MB     Full memory loading with AST caching
50-500MB   Memory-mapped access
> 500MB    Streaming (tail/head modes recommended)
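
The tiers above amount to a size check against two thresholds. The sketch below shows that decision plus a memory-mapped search using the standard mmap module; `pick_strategy` and `find_with_mmap` are hypothetical names, and the boundary handling (exactly 50MB or 500MB counts as memory-mapped) is an assumption read off the table.

```python
import mmap

MB = 1024 * 1024


def pick_strategy(size_bytes, memory_mb=50, mmap_mb=500):
    """Map a file size to one of the three tiers in the table."""
    if size_bytes < memory_mb * MB:
        return "memory"
    if size_bytes <= mmap_mb * MB:
        return "mmap"
    return "streaming"


def find_with_mmap(path, needle):
    """Return the byte offset of the first match of `needle` (bytes), or -1.

    The OS pages data in on demand, so the file is never read in full.
    Note: mmap raises ValueError for an empty file.
    """
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view.find(needle)
```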

Configuration

Environment variables for tuning:

LARGEFILE_MEMORY_THRESHOLD_MB=50      # RAM loading limit
LARGEFILE_MMAP_THRESHOLD_MB=500       # Memory mapping limit
LARGEFILE_FUZZY_THRESHOLD=0.8         # Match sensitivity (0.0-1.0)
LARGEFILE_MAX_SEARCH_RESULTS=20       # Results per search
LARGEFILE_BACKUP_DIR=~/.largefile/backups
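
As a sketch of how such variables are typically consumed, the snippet below reads them into a settings dictionary. It assumes the values listed above are also the defaults, which the text does not confirm; `load_settings` and the dictionary keys are hypothetical.

```python
import os
from pathlib import Path


def load_settings(env=None):
    """Read the LARGEFILE_* variables, falling back to the values shown above."""
    env = os.environ if env is None else env
    return {
        "memory_threshold_mb": int(env.get("LARGEFILE_MEMORY_THRESHOLD_MB", "50")),
        "mmap_threshold_mb": int(env.get("LARGEFILE_MMAP_THRESHOLD_MB", "500")),
        "fuzzy_threshold": float(env.get("LARGEFILE_FUZZY_THRESHOLD", "0.8")),
        "max_search_results": int(env.get("LARGEFILE_MAX_SEARCH_RESULTS", "20")),
        # expanduser() turns the leading "~" into the user's home directory.
        "backup_dir": Path(
            env.get("LARGEFILE_BACKUP_DIR", "~/.largefile/backups")
        ).expanduser(),
    }
```

Passing a plain dictionary as `env` makes the function easy to exercise without touching the real environment.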
