

MCP Mapped Resource Library


A pip-installable Python library that provides reusable utilities for MCP servers transferring binary blobs through shared Docker volumes.

Features

  • Blob Storage: Upload, retrieve, list, and delete binary files with unique identifiers
  • Resource Identifiers: Unique blob IDs in the format blob://TIMESTAMP-HASH.EXT
  • Metadata Tracking: JSON-based metadata storage alongside blobs
  • Lazy Cleanup: Automatic TTL-based expiration with configurable cleanup intervals
  • Security: Path traversal prevention, MIME type validation, size limits
  • Deduplication: Optional content-based deduplication using SHA256 hashing
  • Docker Volume Support: Designed for shared Docker volumes across multiple MCP servers

Installation

From PyPI (Recommended)

pip install mcp-mapped-resource-lib

System Dependencies

The library requires libmagic for MIME type detection:

# Ubuntu/Debian
sudo apt-get install libmagic1

# macOS
brew install libmagic

# Windows (using conda)
conda install -c conda-forge python-magic

From Source (Development)

git clone https://github.com/nickweedon/mcp_mapped_resource_lib.git
cd mcp_mapped_resource_lib
pip install -e ".[dev]"

Quick Start

from mcp_mapped_resource_lib import BlobStorage, maybe_cleanup_expired_blobs

# Initialize storage
storage = BlobStorage(
    storage_root="/mnt/blob-storage",
    max_size_mb=100,
    allowed_mime_types=["image/*", "application/pdf"],
    enable_deduplication=True
)

# Upload a blob
result = storage.upload_blob(
    data=b"Hello, world!",
    filename="hello.txt",
    tags=["example"],
    ttl_hours=24
)

print(f"Blob ID: {result['blob_id']}")
print(f"File path: {result['file_path']}")
print(f"SHA256: {result['sha256']}")

# Retrieve metadata
metadata = storage.get_metadata(result['blob_id'])
print(f"Created: {metadata['created_at']}")

# List blobs with filtering
results = storage.list_blobs(
    mime_type="text/*",
    tags=["example"],
    page=1,
    page_size=20
)

print(f"Found {results['total']} blobs")

# Get filesystem path for direct access
file_path = storage.get_file_path(result['blob_id'])
with open(file_path, 'rb') as f:
    data = f.read()

# Delete a blob
storage.delete_blob(result['blob_id'])

# Lazy cleanup (run periodically)
cleanup_result = maybe_cleanup_expired_blobs(
    storage_root="/mnt/blob-storage",
    ttl_hours=24,
    cleanup_interval_minutes=5
)

if cleanup_result:
    print(f"Deleted {cleanup_result['deleted_count']} expired blobs")
    print(f"Freed {cleanup_result['freed_bytes']} bytes")

Integration with MCP Servers

This library is designed to be imported into MCP servers (built with FastMCP or other frameworks):

from mcp_mapped_resource_lib import BlobStorage, maybe_cleanup_expired_blobs
from fastmcp import FastMCP
import base64

mcp = FastMCP("my-mcp-server")
storage = BlobStorage(storage_root="/mnt/blob-storage")

@mcp.tool()
def upload_blob(
    data: str,  # base64-encoded
    filename: str,
    mime_type: str | None = None,
    tags: list[str] | None = None
) -> dict:
    """Upload a binary blob and receive a resource identifier."""
    binary_data = base64.b64decode(data)

    result = storage.upload_blob(
        data=binary_data,
        filename=filename,
        mime_type=mime_type,
        tags=tags
    )

    # Trigger lazy cleanup
    maybe_cleanup_expired_blobs(
        storage_root="/mnt/blob-storage",
        ttl_hours=24
    )

    return result

@mcp.resource("blob://{blob_id}")
def get_blob_content(blob_id: str) -> str:
    """Retrieve blob content as base64."""
    file_path = storage.get_file_path(blob_id)
    with open(file_path, 'rb') as f:
        return base64.b64encode(f.read()).decode()

Core Modules

BlobStorage

Main class for blob storage operations:

  • upload_blob() - Upload binary data and receive resource identifier
  • get_metadata() - Retrieve blob metadata
  • list_blobs() - List blobs with filtering and pagination
  • delete_blob() - Delete a blob
  • get_file_path() - Get filesystem path for direct access

Blob ID Utilities

Functions for working with blob identifiers:

  • create_blob_id() - Generate unique blob identifier
  • validate_blob_id() - Validate blob ID format and security
  • parse_blob_id() - Parse blob ID into components
  • strip_blob_protocol() - Remove "blob://" prefix
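
A minimal usage sketch of these helpers. The imports mirror the top-level imports used in Quick Start, but the exact parameters and return shapes shown here are assumptions based on the blob://TIMESTAMP-HASH.EXT format, not the library's documented signatures:

from mcp_mapped_resource_lib import (
    create_blob_id,
    validate_blob_id,
    parse_blob_id,
    strip_blob_protocol,
)

# Hypothetical call: create_blob_id() is assumed to accept the original
# filename so the extension can be carried into the identifier.
blob_id = create_blob_id("report.pdf")
print(blob_id)  # e.g. blob://1733437200-a3f9d8c2b1e4f6a7.pdf

# Validate identifiers received from clients before touching the filesystem.
if validate_blob_id(blob_id):
    parts = parse_blob_id(blob_id)            # assumed to expose timestamp, hash, extension
    raw_name = strip_blob_protocol(blob_id)   # "1733437200-a3f9d8c2b1e4f6a7.pdf"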

Cleanup Utilities

Functions for managing blob lifecycle:

  • maybe_cleanup_expired_blobs() - Lazy cleanup with interval checking
  • cleanup_expired_blobs() - Force cleanup of expired blobs
  • should_run_cleanup() - Check if cleanup interval has elapsed
  • scan_for_expired_blobs() - Find expired blobs
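
A hedged sketch of a forced cleanup pass, for example from a scheduled job. The keyword arguments mirror the Cleanup Configuration section below, and the result fields are assumed to match those returned by maybe_cleanup_expired_blobs() in Quick Start:

from mcp_mapped_resource_lib import cleanup_expired_blobs, should_run_cleanup

# Skip the interval check and clean up immediately (assumed keyword arguments).
result = cleanup_expired_blobs(
    storage_root="/mnt/blob-storage",
    ttl_hours=24
)
print(f"Deleted {result['deleted_count']} blobs, freed {result['freed_bytes']} bytes")

# Or check the interval yourself before deciding to run a cleanup.
if should_run_cleanup("/mnt/blob-storage", cleanup_interval_minutes=5):
    cleanup_expired_blobs(storage_root="/mnt/blob-storage", ttl_hours=24)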

Path Utilities

Functions for path resolution and security:

  • blob_id_to_path() - Translate blob ID to filesystem path
  • get_metadata_path() - Get metadata file path
  • sanitize_filename() - Sanitize user-provided filenames
  • validate_path_safety() - Prevent path traversal attacks
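
An illustrative sketch of the path helpers. Argument order and return types are assumptions, so treat this as a guide to intent rather than the exact API:

from mcp_mapped_resource_lib import (
    blob_id_to_path,
    get_metadata_path,
    sanitize_filename,
    validate_path_safety,
)

storage_root = "/mnt/blob-storage"

# Clean an untrusted, user-provided filename before recording it in metadata.
safe_name = sanitize_filename("../../etc/passwd")  # assumed to strip traversal components

# Resolve a blob ID to its data and metadata files under the storage root.
blob_id = "blob://1733437200-a3f9d8c2b1e4f6a7.png"
data_path = blob_id_to_path(blob_id, storage_root)    # assumed argument order
meta_path = get_metadata_path(blob_id, storage_root)  # assumed argument order

# Reject any resolved path that escapes the storage root.
validate_path_safety(data_path, storage_root)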

MIME & Hash Utilities

Functions for content handling:

  • detect_mime_type() - Detect MIME type from data and filename
  • validate_mime_type() - Validate MIME type against allowed list
  • calculate_sha256() - Calculate SHA256 hash
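
A short sketch of the content helpers. The allowed-types argument and return values are assumptions modeled on the BlobStorage configuration shown below:

from mcp_mapped_resource_lib import detect_mime_type, validate_mime_type, calculate_sha256

data = b"%PDF-1.7 ..."

mime = detect_mime_type(data, "report.pdf")                # libmagic plus filename hint
validate_mime_type(mime, ["image/*", "application/pdf"])   # assumed wildcard matching
digest = calculate_sha256(data)                            # hex digest used for deduplication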

Configuration Options

BlobStorage Configuration

storage = BlobStorage(
    storage_root="/mnt/blob-storage",      # Storage directory
    max_size_mb=100,                       # Max blob size in MB
    allowed_mime_types=["image/*"],        # Allowed MIME types (None = all)
    enable_deduplication=True,             # Enable SHA256 deduplication
    default_ttl_hours=24                   # Default TTL for blobs
)

Cleanup Configuration

result = maybe_cleanup_expired_blobs(
    storage_root="/mnt/blob-storage",
    ttl_hours=24,                          # Time-to-live in hours
    cleanup_interval_minutes=5             # Min interval between cleanups
)

Directory Structure

The library uses two-level directory sharding for performance:

/mnt/blob-storage/
├── 17/                                   # First 2 digits of timestamp
│   ├── 33/                              # Digits 3-4 of timestamp
│   │   ├── 1733437200-a3f9d8c2b1e4f6a7.png
│   │   └── 1733437200-a3f9d8c2b1e4f6a7.png.meta.json
│   └── 34/
├── 18/
└── .last_cleanup                         # Cleanup tracking file
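
For illustration only, the following standalone snippet reproduces the sharding implied by the tree above: the first two digits of the Unix timestamp select the top-level directory and digits three and four select the second level. It mirrors the layout, not the library's internal implementation:

from pathlib import Path

def illustrative_shard_path(storage_root: str, blob_id: str) -> Path:
    # "blob://1733437200-a3f9d8c2b1e4f6a7.png" -> ".../17/33/1733437200-a3f9d8c2b1e4f6a7.png"
    name = blob_id.removeprefix("blob://")
    timestamp = name.split("-", 1)[0]
    return Path(storage_root) / timestamp[:2] / timestamp[2:4] / name

print(illustrative_shard_path("/mnt/blob-storage", "blob://1733437200-a3f9d8c2b1e4f6a7.png"))
# /mnt/blob-storage/17/33/1733437200-a3f9d8c2b1e4f6a7.png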

Security Features

  • Path Traversal Prevention: Strict validation regex prevents directory traversal
  • MIME Type Filtering: Configurable whitelist of allowed MIME types
  • Size Limits: Configurable maximum blob size
  • Input Sanitization: Filenames are sanitized before storage
  • Path Safety Validation: Ensures resolved paths stay within storage root
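
As an illustration of the strict validation described above, a pattern of the following shape (hypothetical, not the library's actual regex) would reject identifiers that try to smuggle in path separators or parent references:

import re

# Hypothetical pattern for blob://TIMESTAMP-HASH.EXT identifiers.
BLOB_ID_PATTERN = re.compile(r"^blob://\d+-[0-9a-f]+\.[A-Za-z0-9]+$")

print(bool(BLOB_ID_PATTERN.match("blob://1733437200-a3f9d8c2b1e4f6a7.png")))  # True
print(bool(BLOB_ID_PATTERN.match("blob://../../../etc/passwd")))              # False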

Docker Volume Configuration

Example Docker Compose configuration for shared volumes:

services:
  mcp-server-1:
    build: .
    volumes:
      - blob-storage:/mnt/blob-storage    # Read-write access
    environment:
      - BLOB_STORAGE_ROOT=/mnt/blob-storage

  mcp-server-2:
    build: .
    volumes:
      - blob-storage:/mnt/blob-storage:ro # Read-only access
    environment:
      - BLOB_STORAGE_ROOT=/mnt/blob-storage

volumes:
  blob-storage:
    driver: local

Documentation

Development

Using DevContainer (Recommended)

This project includes a complete DevContainer setup for VS Code:

  1. Install Docker and VS Code
  2. Install the Dev Containers extension
  3. Open the project in VS Code
  4. Click "Reopen in Container" when prompted (or run "Dev Containers: Reopen in Container" from the command palette)

The DevContainer includes:

  • Python 3.12 with uv package manager
  • All development dependencies pre-installed
  • Docker CLI for testing containerized MCP servers
  • Claude Code CLI for AI-assisted development
  • Pre-configured extensions (Python, Pylance, Ruff, Claude Code)

Local Development

# Install development dependencies
make install
# or: pip install -e ".[dev]"

# Run all checks (recommended - runs lint, typecheck, and test)
make all

# Run individual checks
make lint       # Run ruff linting
make typecheck  # Run mypy type checking
make test       # Run pytest with coverage

# Clean build artifacts
make clean

# Show available commands
make help

Direct Commands (without Makefile)

# Lint code
ruff check src/ tests/

# Type check
mypy src/

# Run tests with coverage
pytest tests/ --cov=mcp_mapped_resource_lib --cov-report=html

Releasing New Versions

This project includes a simple release automation workflow for creating new versions and publishing to PyPI.

Prerequisites

  1. Install GitHub CLI (gh):

    # macOS
    brew install gh
    
    # Ubuntu/Debian
    sudo apt install gh
    
    # Windows
    winget install GitHub.cli
    
  2. Authenticate with GitHub:

    gh auth login
    

Creating a Release

Use the Makefile target to create a new release:

# Auto-increment patch version (e.g., 0.1.0 → 0.1.1)
make release

# Or specify a version explicitly for minor/major releases
make release VERSION=0.2.0

Or use the script directly:

# Auto-increment patch version
./release.sh

# Or specify version explicitly
./release.sh 0.2.0

What Happens During Release

The release script performs the following steps automatically:

  1. Detects the latest version from git tags (not pyproject.toml)
  2. Validates version format (must be semver: X.Y.Z)
  3. Checks that GitHub CLI is installed and authenticated
  4. Verifies working directory is clean (no uncommitted changes)
  5. Updates version in pyproject.toml
  6. Runs make all (lint + typecheck + test) to ensure quality
  7. Commits the version bump
  8. Creates a git tag (e.g., v0.2.0)
  9. Pushes commit and tag to GitHub
  10. Creates GitHub release with auto-generated release notes
  11. Triggers automatic PyPI publishing via GitHub Actions

Important Notes

  • ⚠️ The release target is not part of the regular build process (make all)
  • ⚠️ Releases must be created explicitly - they never happen automatically
  • ✅ If no version is specified, the patch version is auto-incremented from the latest git tag (not pyproject.toml)
  • ✅ Run git fetch --tags before releasing to ensure you have the latest tags from GitHub
  • ✅ For minor/major releases, specify the version explicitly (e.g., VERSION=0.2.0 or VERSION=1.0.0)
  • ✅ All tests must pass before a release can be created
  • ✅ PyPI publishing happens automatically via GitHub Actions (using Trusted Publishers)
  • ✅ You can monitor the publishing workflow at: https://github.com/nickweedon/mcp_mapped_resource_lib/actions

Versioning

This project follows Semantic Versioning:

  • PATCH (0.0.X): Bug fixes, backwards compatible - make release (auto-increment)
  • MINOR (0.X.0): New features, backwards compatible - make release VERSION=0.2.0
  • MAJOR (X.0.0): Breaking changes - make release VERSION=1.0.0

Troubleshooting

Error: GitHub CLI is not installed

  • Install gh using the instructions above

Error: GitHub CLI is not authenticated

  • Run gh auth login and follow the prompts

Error: Working directory has uncommitted changes

  • Check git status, then commit or stash your changes before releasing

Error: Tests failed

  • Fix the failing tests before creating a release
  • The version change in pyproject.toml will be automatically reverted

License

MIT License - see LICENSE file for details

Changelog

See CHANGELOG.md for version history and release notes.

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Support

