
MCP server for read-only sensitive data: tools return metadata and aggregates only


Glovebox

Glovebox is a Model Context Protocol (MCP) server that exposes read-only access to sensitive files mounted at a fixed path inside a container. Tools return metadata and aggregates only (directory listings, file stats, regex match counts, line numbers, CSV dimensions, line counts)—not raw file contents.

The design follows the “glovebox / cleanroom” idea: data stays in a bounded environment; the model receives structured summaries through tool results, not a dump of secrets.

Full documentation: browse the markdown on GitLab (index), or build the MkDocs site locally with pip install -e '.[docs]' && mkdocs serve.

Quick start (pip)

pip install mcp-glovebox

Set the root and start the MCP server on stdio:

GLOVEBOX_ROOT=/path/to/sensitive glovebox

Pre-flight check before wiring a client:

GLOVEBOX_ROOT=/path/to/sensitive glovebox --doctor
glovebox --print-config    # resolved JSON snapshot for automation

Configure your MCP client to run glovebox with GLOVEBOX_ROOT set. Copy-paste JSON for Claude Desktop and Cursor is on the MCP client examples page.
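As a rough illustration of what such a client entry contains, the sketch below builds one in Python. It assumes the `mcpServers` layout commonly used by Claude Desktop and Cursor; the server name `glovebox` and the example path are placeholders, and the MCP client examples page remains the authoritative source.

```python
import json


def client_config(root: str, server_name: str = "glovebox") -> dict:
    """Build a client entry that launches `glovebox` over stdio.

    The "mcpServers" key and the entry shape are assumptions based on the
    layout commonly used by MCP clients, not taken from the Glovebox docs.
    """
    return {
        "mcpServers": {
            server_name: {
                "command": "glovebox",
                "env": {"GLOVEBOX_ROOT": root},
            }
        }
    }


# Print JSON that can be compared against the client's own config file.
print(json.dumps(client_config("/path/to/sensitive"), indent=2))
```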

Quick start (Docker)

Architecture note: the published image targets linux/arm64 (Apple Silicon, AWS Graviton). It runs natively on Mac M-series and arm64 Linux. For x86-64 hosts, build locally (see "Build it yourself" under Releases and versioning).

Pull the published image:

docker pull touchthesun/glovebox:0.1.1

Or build locally:

docker build -t glovebox:local .

Run the MCP server on stdio (required for most MCP clients). Mount your sensitive directory read-only at /glovebox/data. Use the hardened form for any deployment against real sensitive data:

docker run --rm -i \
  --read-only \
  --tmpfs /tmp \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  -v /path/to/sensitive:/glovebox/data:ro \
  touchthesun/glovebox:0.1.1

-i keeps stdin open so the client can speak MCP over stdio. See Security defaults for a full audit of what the flags do and what a naive user gets without them.

Configure your MCP client to launch this command. Copy-paste JSON templates are in MCP client examples; broader integration notes live in Integration.
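Many MCP clients expect the launch command split into a command plus an argument list. The sketch below assembles the hardened docker run invocation shown above in that form. The helper names and the `command`/`args` entry shape are illustrative assumptions; the flags themselves are copied from the command above.

```python
def hardened_docker_argv(
    host_dir: str, image: str = "touchthesun/glovebox:0.1.1"
) -> list[str]:
    """Return the hardened `docker run` command as an argument vector."""
    return [
        "docker", "run", "--rm", "-i",          # -i keeps stdin open for MCP over stdio
        "--read-only",                          # read-only container filesystem
        "--tmpfs", "/tmp",                      # writable scratch space only in /tmp
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",
        "-v", f"{host_dir}:/glovebox/data:ro",  # sensitive data mounted read-only
        image,
    ]


def client_entry(host_dir: str) -> dict:
    """Split the argv into the command/args pair (hypothetical entry shape)."""
    argv = hardened_docker_argv(host_dir)
    return {"command": argv[0], "args": argv[1:]}
```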

Before attaching a client locally, sanity-check your mount directory:

GLOVEBOX_ROOT=/path/to/sensitive glovebox --doctor

Environment variables

| Variable | Default | Meaning |
| --- | --- | --- |
| GLOVEBOX_ROOT | /glovebox/data | Directory all tool paths are relative to |
| GLOVEBOX_MAX_OUTPUT_BYTES | 256000 | Upper bound on JSON size for a tool result |
| GLOVEBOX_MAX_SEARCH_FILE_BYTES | 1048576 | Files larger than this are rejected for search/aggregate |
| GLOVEBOX_SEARCH_BUDGET | 100 | Per-session ceiling on search calls (<=0 disables); bounds oracle-style reconstruction |
| GLOVEBOX_MIN_CELL | 5 | Small-cell suppression threshold; counts below this are returned as "<k" to reduce re-identification risk |
| GLOVEBOX_MIN_FILE_ROWS | 0 (off) | Refuse search/aggregate on files below N rows/lines |
| GLOVEBOX_REDACT_FILENAMES | 0 (off) | Hash name fields in glovebox_list responses instead of returning real filenames. Enable when filenames in the mount are themselves sensitive (e.g. patient_HIV_positive.pdf). Trade-off: directory-listing navigation is disabled; directed-analysis workflows (explicit paths) are unaffected. |
| GLOVEBOX_AUDIT_LOG | (stderr only) | Append JSONL audit records to this file path in addition to stderr |
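To make the defaults and the small-cell rule concrete, here is a minimal sketch of how these variables could be resolved and how a count below GLOVEBOX_MIN_CELL might be masked. The function names are hypothetical and the parsing is an assumption; `glovebox --print-config` reports what the server actually resolves. The sketch also assumes "<k" is rendered with the numeric threshold (for example "<5").

```python
import os

# Defaults copied from the table above; None means "stderr only".
DEFAULTS = {
    "GLOVEBOX_ROOT": "/glovebox/data",
    "GLOVEBOX_MAX_OUTPUT_BYTES": 256000,
    "GLOVEBOX_MAX_SEARCH_FILE_BYTES": 1048576,
    "GLOVEBOX_SEARCH_BUDGET": 100,
    "GLOVEBOX_MIN_CELL": 5,
    "GLOVEBOX_MIN_FILE_ROWS": 0,
    "GLOVEBOX_REDACT_FILENAMES": 0,
    "GLOVEBOX_AUDIT_LOG": None,
}


def resolve_settings(env=None) -> dict:
    """Overlay environment values on the documented defaults (assumed parsing)."""
    env = os.environ if env is None else env
    resolved = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name)
        if raw is None:
            resolved[name] = default
        elif isinstance(default, int):
            resolved[name] = int(raw)  # numeric settings arrive as strings
        else:
            resolved[name] = raw
    return resolved


def suppress_small_cell(count: int, k: int):
    """Return "<k" for counts below the threshold, else the count itself."""
    return f"<{k}" if count < k else count
```

With the default threshold of 5, a count of 3 would come back as "<5" under this assumption, while a count of 5 or more passes through unchanged.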

Built-in tools

  • glovebox_list: List directory entries (name, type, size, mtime). No file contents.
  • glovebox_stat: Metadata for one path. No file contents.
  • glovebox_search: Regex search in one of two modes, count_matches or line_numbers_only. Never returns matching line text.
  • glovebox_aggregate: For CSV files, row and column counts; for text files, line count only. Never returns cell or line contents.
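The two search modes can be pictured with a small stand-alone function. This is a sketch of the idea only: the result field names (`count`, `line_numbers`) are hypothetical, and the real glovebox_search response schema is not reproduced here.

```python
import re


def search_text(text: str, pattern: str, mode: str = "count_matches") -> dict:
    """Summarize regex matches without ever returning the matched text."""
    regex = re.compile(pattern)
    lines = text.splitlines()
    if mode == "count_matches":
        # Count every match across all lines.
        total = sum(1 for line in lines for _ in regex.finditer(line))
        return {"mode": mode, "count": total}
    if mode == "line_numbers_only":
        # 1-based numbers of lines containing at least one match.
        numbers = [i for i, line in enumerate(lines, start=1) if regex.search(line)]
        return {"mode": mode, "line_numbers": numbers}
    raise ValueError(f"unsupported mode: {mode}")
```

Either mode tells the caller where and how often a pattern occurs, while the matching text itself never appears in the result.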

Paths are relative to GLOVEBOX_ROOT; absolute paths and escapes outside the mount are rejected.
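One common way to enforce this kind of containment is to resolve the requested path against the root and check that the result is still inside it. The sketch below shows that technique with pathlib (Python 3.9+ for `is_relative_to`); it is an illustration under that assumption, not Glovebox's actual implementation, and `resolve_inside_root` is a hypothetical name.

```python
from pathlib import Path


def resolve_inside_root(root: str, relative: str) -> Path:
    """Resolve `relative` under `root`, rejecting absolute paths and escapes."""
    if Path(relative).is_absolute():
        raise ValueError("absolute paths are rejected")
    root_path = Path(root).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root_path / relative).resolve()
    if not candidate.is_relative_to(root_path):
        raise ValueError("path escapes the mount")
    return candidate
```

Resolving before comparing matters: a plain string-prefix check would accept inputs such as sub/../../etc that only leave the root after normalization.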

Threat model (summary)

Primary defence — keep secrets out of filenames. Tool responses return metadata verbatim (filenames, directory structure, sizes, mtimes). Do not encode sensitive information in file or directory names; Glovebox protects file contents, not metadata. If you cannot control filenames, set GLOVEBOX_REDACT_FILENAMES=1 — see the env vars table above for the trade-off.
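For intuition about what filename redaction does to a listing, the sketch below replaces each name with a truncated hash while leaving the other metadata fields alone. The hash algorithm (SHA-256), the 16-character truncation, the optional salt, and the function names are all assumptions made for illustration; the source does not specify how Glovebox derives the hashed names.

```python
import hashlib


def redact_name(name: str, salt: bytes = b"") -> str:
    """Return a stable, opaque token for a filename (assumed scheme)."""
    digest = hashlib.sha256(salt + name.encode("utf-8")).hexdigest()
    return digest[:16]


def redact_listing(entries: list[dict]) -> list[dict]:
    """Hash the name field of each entry; other fields pass through unchanged."""
    return [{**entry, "name": redact_name(entry["name"])} for entry in entries]


listing = [{"name": "patient_HIV_positive.pdf", "type": "file", "size": 48213}]
print(redact_listing(listing))
```

An unsalted hash of a guessable filename can be confirmed by anyone who hashes the same guess, so a scheme like this reduces incidental exposure rather than providing secrecy.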

Glovebox reliably answers count and frequency questions (how many rows match this pattern? which lines contain this credential?) without field values entering model context. Segmented analysis over known categories is possible with multiple search calls. Statistical aggregates, value discovery, and open-ended exploration require the constrained-computation roadmap tier. See the use-case boundary analysis for a full ✓/≈/✗ breakdown across PII and code-audit scenarios.
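Segmented analysis over known categories amounts to one count query per category, and each query draws on the per-session search budget. The sketch below simulates that locally. `SearchBudget` and `segmented_counts` are hypothetical names, and treating a limit of zero or less as "no ceiling" is an assumed reading of "<=0 disables".

```python
import re


class SearchBudget:
    """Per-session ceiling on search calls; limit <= 0 means no ceiling (assumed)."""

    def __init__(self, limit: int = 100):
        self.limit = limit
        self.used = 0

    def spend(self) -> None:
        if self.limit > 0 and self.used >= self.limit:
            raise RuntimeError("search budget exhausted")
        self.used += 1


def segmented_counts(lines: list[str], categories: dict, budget: SearchBudget) -> dict:
    """Count matching lines per known category, one search call each."""
    counts = {}
    for label, pattern in categories.items():
        budget.spend()  # every category costs one call against the budget
        regex = re.compile(pattern)
        counts[label] = sum(1 for line in lines if regex.search(line))
    return counts
```

The categories must be known in advance: this approach can count rows per state given a list of states, but it cannot discover which states occur, which is the value-discovery case the paragraph above places out of scope.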

Glovebox is one control in a larger compliance story: it minimizes what crosses into the model context but does not govern LLM providers, compromised hosts, or malicious MCP clients. See the full threat model and harness non-goals.

Development

python -m venv .venv
source .venv/bin/activate
pip install -e '.[dev]'
pytest                                          # CI also runs scripts/export_tool_manifest.py --check

Run the MCP server locally (stdio):

GLOVEBOX_ROOT=/path/to/data python -m glovebox

Pre-flight diagnostics:

GLOVEBOX_ROOT=/path/to/data python -m glovebox --doctor
python -m glovebox --print-config               # resolved JSON snapshot for automation

Documentation site locally:

pip install -e '.[docs]'
mkdocs serve

Releases and versioning

Releases follow Semantic Versioning. See CHANGELOG.md for the full history.

Tagging convention: a git tag v0.1.0 produces Docker images tagged 0.1.0 and latest. Users should pin to the versioned tag, not latest.

docker pull touchthesun/glovebox:0.1.1     # pinned — recommended
docker pull touchthesun/glovebox:latest     # floating — only for local dev
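The tagging convention is a simple mapping from a git tag to image tags, sketched below. The helper name and the strict three-part version pattern are assumptions; the CI configuration is the source of truth for how tags are actually derived.

```python
import re

# Matches tags such as v0.1.0; pre-release suffixes are not handled here.
SEMVER_TAG = re.compile(r"^v(\d+\.\d+\.\d+)$")


def docker_tags_for(git_tag: str) -> list[str]:
    """Map a git tag like v0.1.0 to the image tags it produces."""
    match = SEMVER_TAG.match(git_tag)
    if match is None:
        raise ValueError(f"not a release tag: {git_tag}")
    return [match.group(1), "latest"]
```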

Architecture: published images target linux/arm64 (Apple Silicon, AWS Graviton). x86-64 users should build locally.

To cut a release:

  1. Update version in pyproject.toml and src/glovebox/_version.py to match.
  2. Add a release entry to CHANGELOG.md.
  3. Commit, then push a semver tag:
    git tag v0.1.0 && git push origin v0.1.0
    
  4. In CI, manually trigger docker_push_hub (and docker_push for the GitLab registry). Both jobs require DOCKERHUB_USER / DOCKERHUB_TOKEN CI variables to be set.

Build it yourself (required for x86-64; always available as a fallback):

docker build -t glovebox:local .
docker tag glovebox:local touchthesun/glovebox:0.1.0
docker push touchthesun/glovebox:0.1.0

Adding tools

Use the glovebox-tool Cursor skill and the templates/tool template. New tools must preserve the no-leak contract and add contract tests. Run python scripts/validate_tools.py before committing, then python scripts/export_tool_manifest.py --write when tool schemas change.

Contributing and security

See CONTRIBUTING.md for development setup, the no-leak contract, and the PR checklist. To report a security vulnerability, follow the process in SECURITY.md — do not open a public issue.

Evaluation harness

The harness directory runs four layers of scenarios (tool surface, LLM behavior, inference, evidence). See Harness overview, harness roadmap, and CI semantics.

Optional Falco-sidecar notes: Hardening.
