MCP server for secure local file system access

DataSage MCP Server

DataSage is a Model Context Protocol (MCP) server that provides AI assistants with secure access to local file systems. It enables generative AI tools like Amazon Q, Claude Desktop, and other MCP-compatible clients to search, read, and navigate local files and directories through a standardized interface.

Features

  • Secure File Access: Configurable path restrictions prevent access outside specified directories
  • Full-Text Search: Search file contents and filenames with fuzzy matching, regex, and exact matching
  • Directory Traversal: Navigate directory structures with configurable depth limits
  • Text File Support: Automatic detection and handling of text-based files with encoding support
  • MCP Compliant: Follows Model Context Protocol specification for seamless AI integration
  • FastMCP v2: Built on the FastMCP v2 framework

Installation

Install DataSage using uvx (recommended):

uvx p6plab-datasage

Or install with pip:

pip install p6plab-datasage

Quick Start

  1. Create a configuration file (datasage.yaml):
server:
  name: "My DataSage"
  description: "Local file server for AI assistants"

paths:
  - path: "~/Documents"
    description: "Personal documents"
  - path: "~/Code"
    description: "Source code files"

settings:
  max_depth: 10
  max_file_size: 10485760  # 10MB
  2. Start the server:
# STDIO transport (for Claude Desktop, etc.)
uvx p6plab-datasage

# HTTP transport (for web-based clients)
uvx p6plab-datasage --transport http --port 8000

# Custom configuration
uvx p6plab-datasage --config my-config.yaml

Configuration

Configuration File Format

DataSage uses YAML configuration files with the following structure:

server:
  name: "DataSage"                    # Server name
  description: "File server for AI"   # Server description

paths:                                # Allowed file paths
  - path: "~/Documents"
    description: "Documents folder"
  - path: "/Users/shared/projects"
    description: "Shared projects"

settings:
  max_depth: 10                       # Maximum directory depth
  max_file_size: 10485760            # Maximum file size (10MB)
  text_detection: "auto"             # Text file detection method
  excluded_extensions:               # Binary file extensions to skip
    - ".exe"
    - ".jpg"
    - ".pdf"

tools:
  search:
    description: "Search files"       # Tool descriptions
    max_results: 50
  get_page:
    description: "Get file content"
  get_page_children:
    description: "List directory contents"

search:
  fuzzy_threshold: 0.8               # Fuzzy matching threshold
  enable_regex: true                 # Enable regex search
  index_content: true                # Index file contents

Environment Variables

Environment variables take priority over values in the configuration file:

export DATASAGE_NAME="Custom DataSage"
export DATASAGE_DESCRIPTION="Custom description"
export DATASAGE_PATHS="/path1,/path2"
export DATASAGE_MAX_DEPTH=5
export DATASAGE_TOOL_SEARCH_DESC="Custom search description"
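
The precedence rule can be pictured as a merge step that runs after the YAML file is loaded. The sketch below is an assumption about how such a merge might look, not DataSage's actual code: the function `apply_env_overrides` is hypothetical, and the mapping from each variable to a configuration key is inferred from the variable names.

```python
import os
from copy import deepcopy


def apply_env_overrides(config: dict, environ=os.environ) -> dict:
    """Return a copy of `config` with DATASAGE_* variables applied on top."""
    merged = deepcopy(config)
    server = merged.setdefault("server", {})
    settings = merged.setdefault("settings", {})

    if "DATASAGE_NAME" in environ:
        server["name"] = environ["DATASAGE_NAME"]
    if "DATASAGE_DESCRIPTION" in environ:
        server["description"] = environ["DATASAGE_DESCRIPTION"]
    if "DATASAGE_PATHS" in environ:
        # Assumption: the comma-separated list replaces the configured paths.
        merged["paths"] = [
            {"path": p.strip()}
            for p in environ["DATASAGE_PATHS"].split(",")
            if p.strip()
        ]
    if "DATASAGE_MAX_DEPTH" in environ:
        settings["max_depth"] = int(environ["DATASAGE_MAX_DEPTH"])
    if "DATASAGE_TOOL_SEARCH_DESC" in environ:
        # Assumption: this maps to tools.search.description.
        tools = merged.setdefault("tools", {})
        tools.setdefault("search", {})["description"] = environ[
            "DATASAGE_TOOL_SEARCH_DESC"
        ]
    return merged


file_config = {"server": {"name": "DataSage"}, "settings": {"max_depth": 10}}
env = {
    "DATASAGE_NAME": "Custom DataSage",
    "DATASAGE_PATHS": "/path1,/path2",
    "DATASAGE_MAX_DEPTH": "5",
}
result = apply_env_overrides(file_config, env)
print(result)
```

Passing the environment as a plain mapping keeps the merge easy to test without touching the real process environment.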

Available Tools

DataSage provides three MCP tools:

1. search

Search files by content or filename with multiple matching algorithms.

Parameters:

  • query (required): Search query string
  • file_type (optional): File extension filter (e.g., ".py", ".md")
  • search_type (optional): "content", "filename", or "both" (default: "both")
  • match_type (optional): "exact", "fuzzy", or "regex" (default: "fuzzy")
  • max_results (optional): Maximum results to return (default: 20)
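
To make the three `match_type` values concrete, the sketch below shows one way they could behave. It is illustrative only: DataSage's actual fuzzy algorithm is not documented here, so this version assumes a word-by-word comparison using the standard library's `difflib.SequenceMatcher`, with the `0.8` threshold borrowed from the `fuzzy_threshold` setting. The function name `matches` is hypothetical.

```python
import re
from difflib import SequenceMatcher


def matches(query: str, text: str, match_type: str = "fuzzy",
            fuzzy_threshold: float = 0.8) -> bool:
    """Illustrative matcher; not DataSage's actual implementation."""
    if match_type == "exact":
        # Plain substring test.
        return query in text
    if match_type == "regex":
        return re.search(query, text) is not None
    if match_type == "fuzzy":
        # Assumption: compare the query with each word and accept any
        # word whose similarity ratio reaches the threshold.
        q = query.lower()
        return any(
            SequenceMatcher(None, q, word.lower()).ratio() >= fuzzy_threshold
            for word in text.split()
        )
    raise ValueError(f"unknown match_type: {match_type!r}")


print(matches("config", "load the config file", "exact"))
print(matches(r"def \w+\(", "def main():", "regex"))
print(matches("functon", "a function call", "fuzzy"))  # tolerates the typo
print(matches("banana", "a function call", "fuzzy"))
```

A lower threshold accepts looser matches at the cost of more false positives.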

2. get_page

Retrieve the content of a specific file.

Parameters:

  • path (required): File path to read
  • encoding (optional): Text encoding (default: "utf-8")
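
The `encoding` parameter matters whenever a file is not UTF-8. The following sketch shows the general idea with a hypothetical `read_page` helper; it assumes the server checks the size limit before decoding, which the parameter list above does not confirm.

```python
import tempfile
from pathlib import Path


def read_page(path, encoding="utf-8", max_file_size=10 * 1024 * 1024) -> str:
    """Hypothetical reader: enforce a size limit, then decode as text."""
    file_path = Path(path)
    size = file_path.stat().st_size
    if size > max_file_size:
        raise ValueError(
            f"{file_path.name} is {size} bytes; limit is {max_file_size}"
        )
    return file_path.read_text(encoding=encoding)


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "menu.txt"
    sample.write_bytes("café".encode("latin-1"))

    # Decoding succeeds when the caller names the right encoding.
    text = read_page(sample, encoding="latin-1")

    # The same file is rejected when it exceeds the configured limit.
    try:
        read_page(sample, encoding="latin-1", max_file_size=1)
        too_large = False
    except ValueError:
        too_large = True

print(text, too_large)
```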

3. get_page_children

List the contents of a directory with optional recursion.

Parameters:

  • path (required): Directory path to list
  • max_depth (optional): Maximum recursion depth (default: 1)
  • include_files (optional): Include files in results (default: true)
  • include_dirs (optional): Include directories in results (default: true)
  • file_filter (optional): File extension filter
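
The parameters above combine roughly as in the sketch below. The function `list_children` is a hypothetical stand-in, not the server's implementation, and it assumes that `max_depth: 1` means "direct children only" and that `file_filter` is compared against the file suffix.

```python
import tempfile
from pathlib import Path


def list_children(path, max_depth=1, include_files=True,
                  include_dirs=True, file_filter=None):
    """Hypothetical listing: walk `path` down to `max_depth` levels."""
    root = Path(path)
    results = []

    def walk(directory: Path, depth: int) -> None:
        for entry in sorted(directory.iterdir()):
            if entry.is_dir():
                if include_dirs:
                    results.append(entry.relative_to(root).as_posix() + "/")
                # Descend only while the depth limit allows it.
                if depth < max_depth:
                    walk(entry, depth + 1)
            elif include_files:
                if file_filter is None or entry.suffix == file_filter:
                    results.append(entry.relative_to(root).as_posix())

    walk(root, 1)
    return results


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()
    for name in ("a.py", "b.md", "sub/c.py"):
        (base / name).write_text("")

    shallow = list_children(base)
    deep_py = list_children(base, max_depth=2, include_dirs=False,
                            file_filter=".py")

print(shallow)
print(deep_py)
```

With the default depth only the top level is listed; raising `max_depth` to 2 also reaches `sub/c.py`.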

Usage Examples

With Claude Desktop

Add to your Claude Desktop MCP configuration:

{
  "mcpServers": {
    "datasage": {
      "command": "uvx",
      "args": ["p6plab-datasage", "--config", "/path/to/datasage.yaml"]
    }
  }
}
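
If the Claude Desktop configuration already lists other servers, the DataSage entry has to be merged in rather than pasted over them. A small sketch of that merge follows; `add_datasage_entry` and the `other-server` entry are made up for illustration.

```python
import json


def add_datasage_entry(config: dict, yaml_path: str) -> dict:
    """Return a copy of `config` with a "datasage" server entry added."""
    servers = dict(config.get("mcpServers", {}))
    servers["datasage"] = {
        "command": "uvx",
        "args": ["p6plab-datasage", "--config", yaml_path],
    }
    return {**config, "mcpServers": servers}


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_datasage_entry(existing, "/path/to/datasage.yaml")
print(json.dumps(updated, indent=2))
```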

With FastMCP Client

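import asyncio
from fastmcp import Client
from fastmcp.client.transports import StdioTransport

async def main():
    # Launch the server as a subprocess and talk to it over STDIO.
    # A bare command string is not a valid Client target, so the
    # command and its arguments are passed through an explicit transport.
    transport = StdioTransport(command="uvx", args=["p6plab-datasage"])
    async with Client(transport) as client:
        # Search the contents of Python files
        result = await client.call_tool("search", {
            "query": "function",
            "file_type": ".py",
            "search_type": "content"
        })
        print(result.content[0].text)

asyncio.run(main())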

Command Line Options

# Basic usage
uvx p6plab-datasage

# HTTP server
uvx p6plab-datasage --transport http --port 8000

# Custom configuration
uvx p6plab-datasage --config /path/to/config.yaml

# Bind to all interfaces (makes the server reachable from other hosts on the network)
uvx p6plab-datasage --transport http --host 0.0.0.0 --port 8000

# Show help
uvx p6plab-datasage --help

Security

DataSage implements multiple security measures:

  • Path Validation: Only allows access to explicitly configured paths
  • Directory Traversal Protection: Prevents ../ attacks
  • File Type Filtering: Automatically excludes binary files
  • Size Limits: Configurable maximum file sizes
  • Permission Checking: Respects file system permissions
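
Path validation and traversal protection usually come down to the same check: resolve the requested path, then confirm it still sits under an allowed root. The sketch below shows that common technique using `pathlib`; it is an assumption about the general approach, and `is_path_allowed` is a hypothetical name rather than a DataSage function.

```python
import tempfile
from pathlib import Path


def is_path_allowed(requested: str, allowed_roots: list[str]) -> bool:
    """Return True if `requested` resolves inside one of `allowed_roots`."""
    # resolve() collapses ".." segments and follows symlinks, so the check
    # runs against the real location rather than the literal string.
    target = Path(requested).expanduser().resolve()
    for root in allowed_roots:
        base = Path(root).expanduser().resolve()
        if target == base or base in target.parents:
            return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    roots = [tmp]
    inside = is_path_allowed(f"{tmp}/notes/todo.md", roots)
    escaped = is_path_allowed(f"{tmp}/../outside.txt", roots)

print(inside, escaped)
```

Comparing resolved paths instead of string prefixes avoids mistakes such as treating `/data-private` as a child of `/data`.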

Development

Running from Source

git clone <repository>
cd datasage
pip install -e .
python -m p6plab_datasage.server --config examples/datasage.yaml

Running Tests

pip install -e ".[dev]"
pytest

Using FastMCP CLI

fastmcp run src/p6plab_datasage/server.py
fastmcp run src/p6plab_datasage/server.py --transport http --port 8000

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.
