
MCP Mapped Resource Library


A pip-installable Python library providing reusable utilities for MCP servers handling binary blob transfers through shared Docker volumes.

Features

  • Blob Storage: Upload, retrieve, list, and delete binary files with unique identifiers
  • Resource Identifiers: Unique blob IDs in the format blob://TIMESTAMP-HASH.EXT
  • Metadata Tracking: JSON-based metadata storage alongside blobs
  • Lazy Cleanup: Automatic TTL-based expiration with configurable cleanup intervals
  • Security: Path traversal prevention, MIME type validation, size limits
  • Deduplication: Optional content-based deduplication using SHA256 hashing
  • Docker Volume Support: Designed for shared Docker volumes across multiple MCP servers

Installation

From PyPI (Recommended)

pip install mcp-mapped-resource-lib

System Dependencies

The library requires libmagic for MIME type detection:

# Ubuntu/Debian
sudo apt-get install libmagic1

# macOS
brew install libmagic

# Windows (using conda)
conda install -c conda-forge python-magic

From Source (Development)

git clone https://github.com/nickweedon/mcp_mapped_resource_lib.git
cd mcp_mapped_resource_lib
pip install -e ".[dev]"

Quick Start

from mcp_mapped_resource_lib import BlobStorage, maybe_cleanup_expired_blobs

# Initialize storage
storage = BlobStorage(
    storage_root="/mnt/blob-storage",
    max_size_mb=100,
    allowed_mime_types=["image/*", "application/pdf"],
    enable_deduplication=True
)

# Upload a blob
result = storage.upload_blob(
    data=b"Hello, world!",
    filename="hello.txt",
    tags=["example"],
    ttl_hours=24
)

print(f"Blob ID: {result['blob_id']}")
print(f"File path: {result['file_path']}")
print(f"SHA256: {result['sha256']}")

# Retrieve metadata
metadata = storage.get_metadata(result['blob_id'])
print(f"Created: {metadata['created_at']}")

# List blobs with filtering
results = storage.list_blobs(
    mime_type="text/*",
    tags=["example"],
    page=1,
    page_size=20
)

print(f"Found {results['total']} blobs")

# Get filesystem path for direct access
file_path = storage.get_file_path(result['blob_id'])
with open(file_path, 'rb') as f:
    data = f.read()

# Delete a blob
storage.delete_blob(result['blob_id'])

# Lazy cleanup (run periodically)
cleanup_result = maybe_cleanup_expired_blobs(
    storage_root="/mnt/blob-storage",
    ttl_hours=24,
    cleanup_interval_minutes=5
)

if cleanup_result:
    print(f"Deleted {cleanup_result['deleted_count']} expired blobs")
    print(f"Freed {cleanup_result['freed_bytes']} bytes")

Integration with MCP Servers

This library is designed to be imported into MCP servers (built with FastMCP or other frameworks):

from mcp_mapped_resource_lib import BlobStorage, maybe_cleanup_expired_blobs
from fastmcp import FastMCP
import base64

mcp = FastMCP("my-mcp-server")
storage = BlobStorage(storage_root="/mnt/blob-storage")

@mcp.tool()
def upload_blob(
    data: str,  # base64-encoded
    filename: str,
    mime_type: str | None = None,
    tags: list[str] | None = None
) -> dict:
    """Upload a binary blob and receive a resource identifier."""
    binary_data = base64.b64decode(data)

    result = storage.upload_blob(
        data=binary_data,
        filename=filename,
        mime_type=mime_type,
        tags=tags
    )

    # Trigger lazy cleanup
    maybe_cleanup_expired_blobs(
        storage_root="/mnt/blob-storage",
        ttl_hours=24
    )

    return result

@mcp.resource("blob://{blob_id}")
def get_blob_content(blob_id: str) -> str:
    """Retrieve blob content as base64."""
    file_path = storage.get_file_path(blob_id)
    with open(file_path, 'rb') as f:
        return base64.b64encode(f.read()).decode()

Core Modules

BlobStorage

Main class for blob storage operations:

  • upload_blob() - Upload binary data and receive resource identifier
  • get_metadata() - Retrieve blob metadata
  • list_blobs() - List blobs with filtering and pagination
  • delete_blob() - Delete a blob
  • get_file_path() - Get filesystem path for direct access

Blob ID Utilities

Functions for working with blob identifiers:

  • create_blob_id() - Generate unique blob identifier
  • validate_blob_id() - Validate blob ID format and security
  • parse_blob_id() - Parse blob ID into components
  • strip_blob_protocol() - Remove "blob://" prefix
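Since the library's exact signatures are not documented here, the sketch below mirrors what these helpers do using only the stdlib, based on the documented `blob://TIMESTAMP-HASH.EXT` format. The regex and function bodies are illustrative, not the library's actual implementation:

```python
import re

# Illustrative pattern for blob://TIMESTAMP-HASH.EXT identifiers; the
# strict character classes are also what rules out traversal sequences.
BLOB_ID_RE = re.compile(r"^blob://(\d+)-([0-9a-f]+)\.([A-Za-z0-9]+)$")

def strip_protocol(blob_id: str) -> str:
    """Remove the leading blob:// prefix if present."""
    return blob_id.removeprefix("blob://")

def parse(blob_id: str) -> dict:
    """Split a blob ID into timestamp, hash, and extension parts."""
    m = BLOB_ID_RE.match(blob_id)
    if m is None:
        raise ValueError(f"invalid blob ID: {blob_id!r}")
    return {"timestamp": int(m.group(1)), "hash": m.group(2), "ext": m.group(3)}

parts = parse("blob://1733437200-a3f9d8c2b1e4f6a7.png")
print(parts["timestamp"], parts["ext"])  # 1733437200 png
```

Note that a malformed ID such as `blob://../../etc/passwd` fails the match and raises, which is the same validation-as-security idea the library's `validate_blob_id()` provides.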

Cleanup Utilities

Functions for managing blob lifecycle:

  • maybe_cleanup_expired_blobs() - Lazy cleanup with interval checking
  • cleanup_expired_blobs() - Force cleanup of expired blobs
  • should_run_cleanup() - Check if cleanup interval has elapsed
  • scan_for_expired_blobs() - Find expired blobs
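The lazy-cleanup pattern (and the `.last_cleanup` tracking file shown in the directory layout below) can be sketched with the stdlib. This is an illustrative reimplementation of the interval check, not the library's own code:

```python
import time
from pathlib import Path

def should_run_cleanup(storage_root: str, interval_minutes: int) -> bool:
    """Return True if the cleanup interval has elapsed since the last run.

    A marker file records the last run time, so calling this on every
    upload is a cheap no-op most of the time. (Illustrative sketch.)
    """
    marker = Path(storage_root) / ".last_cleanup"
    if not marker.exists():
        return True
    elapsed = time.time() - marker.stat().st_mtime
    return elapsed >= interval_minutes * 60

def record_cleanup(storage_root: str) -> None:
    """Touch the marker so the next interval starts now."""
    (Path(storage_root) / ".last_cleanup").touch()
```

`maybe_cleanup_expired_blobs()` composes this check with the actual scan-and-delete pass, which is why it is safe to call it from a hot path such as an upload tool.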

Path Utilities

Functions for path resolution and security:

  • blob_id_to_path() - Translate blob ID to filesystem path
  • get_metadata_path() - Get metadata file path
  • sanitize_filename() - Sanitize user-provided filenames
  • validate_path_safety() - Prevent path traversal attacks
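The filename-sanitization idea can be shown in a few lines of stdlib code. This is a minimal sketch of the technique, not the library's actual `sanitize_filename()`:

```python
import re
from pathlib import PurePosixPath

def sanitize_filename(name: str) -> str:
    """Keep only the basename and replace unsafe characters.

    Dropping directory components removes any '../' traversal attempt;
    the whitelist then reduces the rest to a conservative character set.
    """
    base = PurePosixPath(name.replace("\\", "/")).name
    base = re.sub(r"[^A-Za-z0-9._-]", "_", base)
    return base or "unnamed"

print(sanitize_filename("../../etc/passwd"))       # passwd
print(sanitize_filename("my report (final).pdf"))  # my_report__final_.pdf
```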

MIME & Hash Utilities

Functions for content handling:

  • detect_mime_type() - Detect MIME type from data and filename
  • validate_mime_type() - Validate MIME type against allowed list
  • calculate_sha256() - Calculate SHA256 hash

Configuration Options

BlobStorage Configuration

storage = BlobStorage(
    storage_root="/mnt/blob-storage",      # Storage directory
    max_size_mb=100,                       # Max blob size in MB
    allowed_mime_types=["image/*"],        # Allowed MIME types (None = all)
    enable_deduplication=True,             # Enable SHA256 deduplication
    default_ttl_hours=24                   # Default TTL for blobs
)

Cleanup Configuration

result = maybe_cleanup_expired_blobs(
    storage_root="/mnt/blob-storage",
    ttl_hours=24,                          # Time-to-live in hours
    cleanup_interval_minutes=5             # Min interval between cleanups
)

Directory Structure

The library uses two-level directory sharding for performance:

/mnt/blob-storage/
├── 17/                                   # First 2 digits of timestamp
│   ├── 33/                              # Digits 3-4 of timestamp
│   │   ├── 1733437200-a3f9d8c2b1e4f6a7.png
│   │   └── 1733437200-a3f9d8c2b1e4f6a7.png.meta.json
│   └── 34/
├── 18/
└── .last_cleanup                         # Cleanup tracking file
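The sharding scheme above is simple to compute: the first two digits of the Unix timestamp pick the top-level directory and digits 3–4 pick the subdirectory, so no single directory grows unboundedly. A minimal sketch of the mapping (illustrative, not the library's `blob_id_to_path()`):

```python
from pathlib import PurePosixPath

def blob_path(storage_root: str, blob_id: str) -> PurePosixPath:
    """Map a blob ID to its sharded location under the storage root."""
    name = blob_id.removeprefix("blob://")   # 1733437200-a3f9d8c2b1e4f6a7.png
    ts = name.split("-", 1)[0]               # timestamp component
    return PurePosixPath(storage_root) / ts[:2] / ts[2:4] / name

print(blob_path("/mnt/blob-storage", "blob://1733437200-a3f9d8c2b1e4f6a7.png"))
# /mnt/blob-storage/17/33/1733437200-a3f9d8c2b1e4f6a7.png
```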

Security Features

  • Path Traversal Prevention: Strict validation regex prevents directory traversal
  • MIME Type Filtering: Configurable whitelist of allowed MIME types
  • Size Limits: Configurable maximum blob size
  • Input Sanitization: Filenames are sanitized before storage
  • Path Safety Validation: Ensures resolved paths stay within storage root
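The containment check behind path safety validation can be sketched with `Path.resolve()`: collapse symlinks and `..` segments first, then require the result to stay under the storage root. This is an illustrative version of the idea, not the library's `validate_path_safety()`:

```python
from pathlib import Path

def is_path_safe(storage_root: str, candidate: str) -> bool:
    """Reject any path whose resolved form escapes the storage root."""
    root = Path(storage_root).resolve()
    target = (root / candidate).resolve()
    return target.is_relative_to(root)  # Python 3.9+

print(is_path_safe("/tmp", "17/33/blob.png"))  # True
print(is_path_safe("/tmp", "../etc/passwd"))   # False
```

Resolving before comparing matters: a naive string-prefix check on the unresolved path would accept `17/../../etc/passwd`.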

Docker Volume Configuration

Example Docker Compose configuration for shared volumes:

services:
  mcp-server-1:
    build: .
    volumes:
      - blob-storage:/mnt/blob-storage    # Read-write access
    environment:
      - BLOB_STORAGE_ROOT=/mnt/blob-storage

  mcp-server-2:
    build: .
    volumes:
      - blob-storage:/mnt/blob-storage:ro # Read-only access
    environment:
      - BLOB_STORAGE_ROOT=/mnt/blob-storage

volumes:
  blob-storage:
    driver: local


Development

Using DevContainer (Recommended)

This project includes a complete DevContainer setup for VS Code:

  1. Install Docker and VS Code
  2. Install the Dev Containers extension
  3. Open the project in VS Code
  4. Click "Reopen in Container" when prompted (or run "Dev Containers: Reopen in Container" from the command palette)

The DevContainer includes:

  • Python 3.12 with uv package manager
  • All development dependencies pre-installed
  • Docker CLI for testing containerized MCP servers
  • Claude Code CLI for AI-assisted development
  • Pre-configured extensions (Python, Pylance, Ruff, Claude Code)

Local Development

# Install development dependencies
make install
# or: pip install -e ".[dev]"

# Run all checks (recommended - runs lint, typecheck, and test)
make all

# Run individual checks
make lint       # Run ruff linting
make typecheck  # Run mypy type checking
make test       # Run pytest with coverage

# Clean build artifacts
make clean

# Show available commands
make help

Direct Commands (without Makefile)

# Lint code
ruff check src/ tests/

# Type check
mypy src/

# Run tests with coverage
pytest tests/ --cov=mcp_mapped_resource_lib --cov-report=html

License

MIT License - see LICENSE file for details

Changelog

See CHANGELOG.md for version history and release notes.

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
