A lightweight Model Context Protocol (MCP) server for Stata. Execute commands, inspect data, retrieve stored results (`r()`/`e()`), and view graphs in your chat interface. Built for economists who want to integrate LLM assistance into their Stata workflow.

Stata MCP Server

A Model Context Protocol (MCP) server that connects AI agents to a local Stata installation.

If you'd like a fully integrated VS Code extension to run Stata code without leaving your IDE, and also allow AI agent interaction, check out my other project: Stata Workbench.

Built by Thomas Monk, London School of Economics.

This server enables LLMs to:

  • Execute Stata code: run any Stata command (e.g. sysuse auto, regress price mpg).
  • Inspect data: retrieve dataset summaries and variable codebooks.
  • Export graphics: generate and view Stata graphs (histograms, scatterplots).
  • Streaming graph caching: automatically cache graphs during command execution for instant exports.
  • Verify results: programmatically check stored results (r(), e()) for accurate validation.

Prerequisites

  • Stata 17+ (Stata MP, SE, or BE). Must be licensed and installed locally.
  • Python 3.11+
  • uv (recommended)

Note on pystata: This server uses the proprietary pystata module that is included with your Stata installation. There is a third-party package named pystata on PyPI that is not the official Stata package and should not be installed. MCP-Stata handles finding and loading the official module from your Stata directory automatically.

Installation

Run as a published tool with uvx

uvx --refresh --refresh-package mcp-stata --from mcp-stata@latest mcp-stata

uvx is an alias for uv tool run and runs the tool in an isolated, cached environment.

Configuration

This server attempts to automatically discover your Stata installation (supporting standard paths and StataNow).

If auto-discovery fails, set the STATA_PATH environment variable to your Stata executable:

# macOS example
export STATA_PATH="/Applications/StataNow/StataMP.app/Contents/MacOS/stata-mp"

# Windows example (cmd.exe)
set "STATA_PATH=C:\Program Files\Stata18\StataMP-64.exe"

If you encounter write permission issues with temporary files (common on Windows), you can override the temporary directory location by setting MCP_STATA_TEMP:

# Example
export MCP_STATA_TEMP="/path/to/writable/temp"

The server will automatically try the following locations in order of preference:

  1. MCP_STATA_TEMP environment variable
  2. System temporary directory
  3. ~/.mcp-stata/temp
  4. Current working directory subdirectory (.tmp/)
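
The fallback order above can be sketched in a few lines of Python. This is an illustration of the documented order only, not the server's actual code; the helper names (`is_writable_dir`, `candidate_temp_dirs`, `pick_temp_dir`) are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def is_writable_dir(path: Path) -> bool:
    """Return True if `path` exists (or can be created) and accepts a new file.

    Note: this creates the directory as a side effect when it is missing.
    """
    try:
        path.mkdir(parents=True, exist_ok=True)
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        return True
    except OSError:
        return False


def candidate_temp_dirs(env=None, cwd=None):
    """List candidate directories in the documented order of preference."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    candidates = []
    override = env.get("MCP_STATA_TEMP")
    if override:
        candidates.append(Path(override))                   # 1. explicit override
    candidates.append(Path(tempfile.gettempdir()))          # 2. system temp dir
    candidates.append(Path.home() / ".mcp-stata" / "temp")  # 3. per-user fallback
    candidates.append(cwd / ".tmp")                         # 4. working-dir fallback
    return candidates


def pick_temp_dir(env=None, cwd=None):
    """Return the first writable candidate, or None if none are writable."""
    for candidate in candidate_temp_dirs(env, cwd):
        if is_writable_dir(candidate):
            return candidate
    return None
```

Running `pick_temp_dir()` on your own machine is a quick way to see which location would likely be chosen before you decide whether to set MCP_STATA_TEMP.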

Startup Do Files

When a session starts, MCP-Stata loads startup do files in the same order as native Stata:

  1. MCP_STATA_STARTUP_DO_FILE (env var) — one or more custom do files, separated by : (Unix) or ; (Windows).
  2. sysprofile.do — the first one found along the Stata search path.
  3. profile.do — the first one found along the Stata search path.

The search path mirrors native Stata: Stata install directory, current working directory, then the ado-path (PERSONAL, SITE, PLUS, OLDPLACE, ...). Only the first sysprofile.do and first profile.do found are executed, matching native Stata behavior. All paths are deduplicated so the same file is never run twice.
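
As a rough sketch of the resolution rules described above (custom files first, then the first sysprofile.do, then the first profile.do, with duplicates dropped), assuming the caller supplies the search directories in the right order. The function names are hypothetical and this is not the server's implementation.

```python
import os
from pathlib import Path


def first_existing(name, search_dirs):
    """Return the first file called `name` found along `search_dirs`, or None."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


def startup_do_files(search_dirs, env=None, sep=os.pathsep):
    """Build the ordered, deduplicated list of startup do files."""
    env = os.environ if env is None else env
    ordered = []
    # 1. Custom files from the env var; os.pathsep is ':' on Unix, ';' on Windows.
    for raw in env.get("MCP_STATA_STARTUP_DO_FILE", "").split(sep):
        if raw.strip():
            ordered.append(Path(raw.strip()))
    # 2. and 3. Only the first sysprofile.do and the first profile.do are used.
    for name in ("sysprofile.do", "profile.do"):
        found = first_existing(name, search_dirs)
        if found is not None:
            ordered.append(found)
    # Deduplicate by resolved path while preserving order.
    seen, result = set(), []
    for path in ordered:
        key = path.resolve()
        if key not in seen:
            seen.add(key)
            result.append(path)
    return result
```

Deduplicating on the resolved path means a profile.do that is also listed in MCP_STATA_STARTUP_DO_FILE appears once, at its earliest position.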

If a command clears programs (clear all, clear programs, or program drop _all), MCP-Stata automatically re-executes the startup files so that any programs they defined remain available. To disable this and let clear all behave exactly as in native Stata (programs are lost), set:

MCP_STATA_NO_RELOAD_ON_CLEAR=1

You can also set these variables in the env block of your MCP config for any IDE shown below. All of them are optional; STATA_PATH is only needed when auto-discovery cannot find Stata.

Optional env example (add inside your MCP server entry):

"env": {
  "STATA_PATH": "/Applications/StataNow/StataMP.app/Contents/MacOS/stata-mp",
  "MCP_STATA_STARTUP_DO_FILE": "/path/to/my/startup.do",
  "MCP_STATA_NO_RELOAD_ON_CLEAR": "1"
}

IDE Setup (MCP)

This MCP server uses the stdio transport (the IDE launches the process and communicates over stdin/stdout).


Claude Desktop

Open Claude Desktop → Settings → Developer → Edit Config. Config file locations include:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

Published tool (uvx)

{
  "mcpServers": {
    "mcp-stata": {
      "command": "uvx",
      "args": [
        "--refresh",
        "--refresh-package",
        "mcp-stata",
        "--from",
        "mcp-stata@latest",
        "mcp-stata"
      ]
    }
  }
}

After editing, fully quit and restart Claude Desktop to reload MCP servers.
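
If the config file already lists other servers, hand-editing it risks overwriting them. The sketch below merges the entry shown above into an existing JSON file; `add_mcp_stata` is a hypothetical helper, and it assumes the file, if present, contains valid JSON.

```python
import json
from pathlib import Path

# The same entry as in the JSON example above.
SERVER_ENTRY = {
    "command": "uvx",
    "args": ["--refresh", "--refresh-package", "mcp-stata",
             "--from", "mcp-stata@latest", "mcp-stata"],
}


def add_mcp_stata(config_path, top_level_key="mcpServers"):
    """Insert or update the mcp-stata entry, leaving other servers intact."""
    path = Path(config_path)
    # Start from the existing config when there is one.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config.setdefault(top_level_key, {})["mcp-stata"] = dict(SERVER_ENTRY)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Back up the file before running something like this, since it rewrites the whole config with fresh indentation.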


Cursor

Cursor supports MCP config at:

  • Global: ~/.cursor/mcp.json
  • Project: .cursor/mcp.json

Published tool (uvx)

{
  "mcpServers": {
    "mcp-stata": {
      "command": "uvx",
      "args": [
        "--refresh",
        "--refresh-package",
        "mcp-stata",
        "--from",
        "mcp-stata@latest",
        "mcp-stata"
      ]
    }
  }
}

Windsurf

Windsurf supports MCP plugins and also allows manual editing of mcp_config.json. After adding/editing a server, use the UI’s refresh so it re-reads the config.

A common location is ~/.codeium/windsurf/mcp_config.json.

Published tool (uvx)

{
  "mcpServers": {
    "mcp-stata": {
      "command": "uvx",
      "args": [
        "--refresh",
        "--refresh-package",
        "mcp-stata",
        "--from",
        "mcp-stata@latest",
        "mcp-stata"
      ]
    }
  }
}

Google Antigravity

In Antigravity, MCP servers are managed from the MCP store/menu; you can open Manage MCP Servers and then View raw config to edit mcp_config.json.

Published tool (uvx)

{
  "mcpServers": {
    "mcp-stata": {
      "command": "uvx",
      "args": [
        "--refresh",
        "--refresh-package",
        "mcp-stata",
        "--from",
        "mcp-stata@latest",
        "mcp-stata"
      ]
    }
  }
}

Visual Studio Code

VS Code supports MCP servers via a .vscode/mcp.json file. The top-level key is servers (not mcpServers).

Create .vscode/mcp.json:

Published tool (uvx)

{
  "servers": {
    "mcp-stata": {
      "type": "stdio",
      "command": "uvx",
      "args": [
        "--refresh",
        "--refresh-package",
        "mcp-stata",
        "--from",
        "mcp-stata@latest",
        "mcp-stata"
      ]
    }
  }
}

VS Code documents .vscode/mcp.json and the servers schema, including type and command/args.


Skills

Tools Available (from server.py)

  • run_command(code, echo=True, as_json=True, trace=False, raw=False, max_output_lines=None, session_id="default"): Execute Stata syntax in the specified session.
    • Always writes output to a temporary log file and emits a single notifications/logMessage containing {"event":"log_path","path":"..."} so the client can tail it locally.
    • May emit notifications/progress when the client provides a progress token/callback.
  • read_log(path, offset=0, max_bytes=65536): Read a slice of a previously-provided log file (JSON: path, offset, next_offset, data).
  • find_in_log(path, query, start_offset=0, max_bytes=5_000_000, before=2, after=2, case_sensitive=False, regex=False, max_matches=50): Search a log file for text and return context windows.
    • Returns JSON with matches (context lines, line indices), next_offset, and truncated if max_matches is hit.
    • Supports literal or regex search with bounded read window for large logs.
  • load_data(source, clear=True, as_json=True, raw=False, max_output_lines=None, session_id="default"): Heuristic loader (sysuse/webuse/use/path/URL) for the specified session.
  • get_ui_channel(session_id="default"): Return a short-lived localhost HTTP endpoint + bearer token for the UI-only data browser, targeting the specified session.
  • describe(session_id="default"): View dataset structure via Stata describe.
  • list_graphs(session_id="default"): See available graphs in memory (JSON list with an active flag).
  • export_graph(graph_name=None, format="pdf", session_id="default"): Export a graph to a file path.
  • export_graphs_all(session_id="default"): Export all in-memory graphs. Returns file paths.
  • get_help(topic, plain_text=False, session_id="default"): Markdown-rendered Stata help.
  • codebook(variable, as_json=True, trace=False, raw=False, max_output_lines=None, session_id="default"): Variable-level metadata.
  • run_do_file(path, echo=True, as_json=True, trace=False, raw=False, max_output_lines=None, session_id="default"): Execute a .do file in the specified session.
  • get_stored_results(session_id="default"): Get r() and e() scalars/macros as JSON.
  • get_variable_list(session_id="default"): JSON list of variables and labels.
  • create_session(session_id): Manually create a new Stata session.
  • list_sessions(): List all active sessions and their status.
  • stop_session(session_id): Terminate a specific session.
  • break_session(session_id="default"): Interrupt/Break the currently running command in a specific session. Use this if a command is taking too long and you want to stop it without closing the session and losing your data.
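
To illustrate how the offset/next_offset fields of read_log fit together, here is a minimal client-side sketch. The `read_log` argument is a stand-in for however your MCP client invokes the tool, and `fake_read_log` only mimics the documented reply shape (path, offset, next_offset, data) so the example runs without a server. Whether offsets count bytes or characters is not modelled here.

```python
def read_whole_log(read_log, path, chunk_bytes=65536):
    """Collect a log by following next_offset until no new data arrives."""
    offset, parts = 0, []
    while True:
        reply = read_log(path=path, offset=offset, max_bytes=chunk_bytes)
        if not reply["data"]:
            break  # nothing new was written since the last read
        parts.append(reply["data"])
        if reply["next_offset"] <= offset:
            break  # defensive: never loop forever on a non-advancing offset
        offset = reply["next_offset"]
    return "".join(parts)


def fake_read_log(path, offset=0, max_bytes=65536, _content="line 1\nline 2\n"):
    """Stand-in for the real tool call; slices a fixed string instead of a file."""
    data = _content[offset:offset + max_bytes]
    return {"path": path, "offset": offset,
            "next_offset": offset + len(data), "data": data}


text = read_whole_log(fake_read_log, "example.log", chunk_bytes=4)
```

For a command that is still running, a client would keep the last offset and poll again later rather than stopping at the first empty slice.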

Cancellation

  • Clients may cancel an in-flight request by sending the MCP notification notifications/cancelled with params.requestId set to the original tool call ID.
  • Client guidance:
    1. Pass a _meta.progressToken when invoking the tool if you want progress updates (optional).
    2. If you need to cancel, send notifications/cancelled with the same requestId. You may also stop tailing the log file path once you receive cancellation confirmation (the tool call will return an error indicating cancellation).
    3. Be prepared for partial output in the log file; cancellation is best-effort and depends on Stata surfacing BreakError.
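
For clients that build JSON-RPC messages by hand, the two messages involved might look like the sketch below. The field names follow the MCP specification (tools/call, _meta.progressToken, notifications/cancelled with params.requestId); the helper functions themselves are hypothetical, and most MCP client libraries will construct these messages for you.

```python
import json


def tool_call_request(request_id, name, arguments, progress_token=None):
    """Build a tools/call request, optionally asking for progress updates."""
    params = {"name": name, "arguments": arguments}
    if progress_token is not None:
        params["_meta"] = {"progressToken": progress_token}
    return {"jsonrpc": "2.0", "id": request_id,
            "method": "tools/call", "params": params}


def cancelled_notification(request_id, reason=None):
    """Build the notification that cancels an in-flight request."""
    params = {"requestId": request_id}
    if reason is not None:
        params["reason"] = reason
    # Notifications carry no "id": the server does not reply to them directly.
    return {"jsonrpc": "2.0", "method": "notifications/cancelled", "params": params}


call = tool_call_request(7, "run_command", {"code": "sysuse auto"}, progress_token="p-7")
cancel = cancelled_notification(7, reason="user pressed stop")
wire = json.dumps(cancel)
```

The requestId in the cancellation must match the id of the original tools/call request, which is 7 in this example.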

Resources exposed for MCP clients:

  • stata://data/summary → summarize
  • stata://data/metadata → describe
  • stata://graphs/list → graph list (resource handler delegates to list_graphs tool)
  • stata://variables/list → variable list (resource wrapper)
  • stata://results/stored → stored r()/e() results

UI-only Data Browser (Local HTTP API)

This server also hosts a localhost-only HTTP API intended for a VS Code extension UI to browse data at high volume (paging, filtering) without sending large payloads over MCP.

Important properties:

  • Loopback only: binds to 127.0.0.1.
  • Bearer auth: every request requires an Authorization: Bearer <token> header.
  • Short-lived tokens: clients should call get_ui_channel() to obtain a fresh token as needed.
  • Session isolation: caches (views, sorting) are isolated per sessionId.
  • No Stata dataset mutation for browsing/filtering:
    • No generated variables.
    • Paging uses sfi.Data.get.
    • Filtering is evaluated in Python over chunked reads.

Discovery via MCP (get_ui_channel)

Call the MCP tool get_ui_channel() and parse the JSON:

{
  "baseUrl": "http://127.0.0.1:53741",
  "token": "...",
  "expiresAt": 1730000000,
  "capabilities": {
    "dataBrowser": true,
    "filtering": true,
    "sorting": true,
    "arrowStream": true
  }
}
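
A client could turn that payload into a base URL and request headers along these lines. This is a sketch only: `parse_ui_channel` is a hypothetical helper, and it assumes expiresAt is a Unix timestamp in seconds, which the example value suggests but the text does not state.

```python
import json
import time


def parse_ui_channel(payload, now=None):
    """Return (base_url, headers), or raise if the token has already expired."""
    info = json.loads(payload)
    now = time.time() if now is None else now
    # Assumption: expiresAt is seconds since the Unix epoch.
    if info["expiresAt"] <= now:
        raise ValueError("token expired; call get_ui_channel() again")
    headers = {"Authorization": f"Bearer {info['token']}"}
    return info["baseUrl"].rstrip("/"), headers


example = json.dumps({
    "baseUrl": "http://127.0.0.1:53741",
    "token": "abc",  # placeholder token for illustration
    "expiresAt": 1730000000,
    "capabilities": {"dataBrowser": True},
})
base_url, headers = parse_ui_channel(example, now=1729999999)
```

Checking the expiry before each batch of requests lets the client fetch a fresh token instead of waiting for an authentication failure.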

Server-enforced limits (current defaults):

  • maxLimit: 500
  • maxVars: 32,767
  • maxChars: 500
  • maxRequestBytes: 1,000,000
  • maxArrowLimit: 1,000,000 (specific to /v1/arrow)

Endpoints

All endpoints are under baseUrl and require the bearer token.

  • GET /v1/dataset?sessionId=default
    • Returns dataset identity and basic state (id, frame, n, k) for the given session.
  • GET /v1/vars?sessionId=default
    • Returns full variable list with labels, types, and formats.
  • POST /v1/page
    • Paged data retrieval. Supports sortBy, filterExpr (ephemeral), and sessionId.
  • POST /v1/arrow
    • Returns a binary Arrow IPC stream (same input as /v1/page).
  • POST /v1/views
    • Create a long-lived filtered view. Returns a viewId. Requires sessionId.
  • POST /v1/views/:viewId/page
    • Paged retrieval from a previously created view. Supports sortBy and sessionId.
  • POST /v1/views/:viewId/arrow
    • Returns a binary Arrow IPC stream from a filtered view.
  • DELETE /v1/views/:viewId
    • Deletes a view handle.
  • POST /v1/filters/validate
    • Validates a filter expression.

Paging request example

curl -sS \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"datasetId":"...","frame":"default","offset":0,"limit":50,"vars":["price","mpg"],"includeObsNo":true,"maxChars":200}' \
  "$BASE_URL/v1/page"

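The same paging request can be issued from Python with only the standard library. The sketch below mirrors the curl call above; `build_page_request` and `fetch_page` are hypothetical helper names, and `fetch_page` assumes the endpoint replies with a JSON body, whose exact shape is not documented here.

```python
import json
import urllib.request


def build_page_request(base_url, token, dataset_id, variables, offset=0, limit=50):
    """Build (but do not send) a POST to /v1/page with bearer auth."""
    body = {
        "datasetId": dataset_id,
        "frame": "default",
        "offset": offset,
        "limit": limit,
        "vars": list(variables),
        "includeObsNo": True,
        "maxChars": 200,
    }
    return urllib.request.Request(
        f"{base_url}/v1/page",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def fetch_page(request, timeout=10):
    """Send the request and decode the reply (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder values; take the real ones from get_ui_channel() and /v1/dataset.
request = build_page_request("http://127.0.0.1:53741", "TOKEN", "DATASET_ID",
                             ["price", "mpg"])
```

Separating construction from sending makes it easy to inspect the request, or to add sortBy and filterExpr fields, before anything goes over the wire.
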
Sorting

The /v1/page and /v1/views/:viewId/page endpoints support sorting via the optional sortBy parameter:

# Sort by price ascending
curl -sS \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"datasetId":"...","offset":0,"limit":50,"vars":["price","mpg"],"sortBy":["price"]}' \
  "$BASE_URL/v1/page"

# Sort by price descending
curl -sS \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"datasetId":"...","offset":0,"limit":50,"vars":["price","mpg"],"sortBy":["-price"]}' \
  "$BASE_URL/v1/page"

# Multi-variable sort: foreign ascending, then price descending
curl -sS \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"datasetId":"...","offset":0,"limit":50,"vars":["foreign","price","mpg"],"sortBy":["foreign","-price"]}' \
  "$BASE_URL/v1/page"

Sort specification format:

  • sortBy is an array of strings (variable names with optional prefix)
  • No prefix or + prefix = ascending order (e.g., "price" or "+price")
  • - prefix = descending order (e.g., "-price")
  • Multiple variables are supported for multi-level sorting
  • Uses the native Rust sorter when available, with a Polars fallback
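
To make the prefix rules concrete, here is a small pure-Python sketch that parses a sortBy array and computes sorted row indices. It only illustrates the specification format; the server itself uses the Rust sorter or Polars as noted above, and this sketch ignores missing values.

```python
def parse_sort_spec(sort_by):
    """Turn ["foreign", "-price"] into [("foreign", True), ("price", False)]."""
    parsed = []
    for item in sort_by:
        if item.startswith("-"):
            parsed.append((item[1:], False))  # descending
        elif item.startswith("+"):
            parsed.append((item[1:], True))   # explicit ascending
        else:
            parsed.append((item, True))       # no prefix means ascending
    return parsed


def sorted_indices(rows, sort_by):
    """Return row indices in sorted order without reordering `rows` itself."""
    order = list(range(len(rows)))
    # Sort by the last key first; list.sort is stable, so earlier keys win ties.
    for name, ascending in reversed(parse_sort_spec(sort_by)):
        order.sort(key=lambda i: rows[i][name], reverse=not ascending)
    return order


rows = [
    {"foreign": 1, "price": 10},
    {"foreign": 0, "price": 5},
    {"foreign": 0, "price": 7},
]
order = sorted_indices(rows, ["foreign", "-price"])  # [2, 1, 0]
```

Returning indices rather than reordered rows matches the idea that sorting never changes the dataset order in Stata.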

Sorting with filtered views:

  • Sorting is fully supported with filtered views
  • The sort is computed in-memory over the sort columns, then filtered indices are re-applied
  • Example: filter for price < 5000, then sort descending by price:

# Create a filtered view
curl -sS \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"datasetId":"...","frame":"default","filterExpr":"price < 5000"}' \
  "$BASE_URL/v1/views"
# Returns: {"view": {"id": "view_abc123", "filteredN": 37}}

# Get sorted page from filtered view
curl -sS \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"offset":0,"limit":50,"vars":["price","mpg"],"sortBy":["-price"]}' \
  "$BASE_URL/v1/views/view_abc123/page"

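The "sort first, then re-apply filtered indices" idea can be shown with plain lists. This is a conceptual sketch with made-up prices, not the server's code; `page_from_view` is a hypothetical name.

```python
def page_from_view(sorted_order, filtered_indices, offset, limit):
    """Keep the global sort order, restricted to rows that passed the filter."""
    keep = set(filtered_indices)
    view_order = [i for i in sorted_order if i in keep]
    return view_order[offset:offset + limit]


# Illustrative values only.
prices = [4099, 4749, 3799, 15906, 5189]

# Sort all rows by price, descending (the "-price" spec).
sorted_order = sorted(range(len(prices)), key=lambda i: prices[i], reverse=True)

# Rows matching the filter expression "price < 5000".
filtered = [i for i, p in enumerate(prices) if p < 5000]

page = page_from_view(sorted_order, filtered, offset=0, limit=50)
page_prices = [prices[i] for i in page]  # [4749, 4099, 3799]
```

Because the sort order is computed once over the whole column, it can in principle be reused across views that filter the same dataset differently.
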
Notes:

  • datasetId is used for cache invalidation. If the dataset changes due to running Stata commands, the server will report a new dataset id and view handles become invalid.
  • Filter expressions are evaluated in Python using values read from Stata via sfi.Data.get. Use boolean operators like ==, !=, <, >, and and/or (Stata-style &/| are also accepted).
  • Sorting does not mutate the dataset order in Stata; it computes sorted indices for the response and caches them for subsequent requests.
  • The Rust sorter is the primary implementation; Polars is used only as a fallback when the native extension is unavailable.
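
On the client side, the first note implies some bookkeeping: once the dataset id changes, any stored view ids should be discarded. A minimal sketch, assuming the client learns the current id from GET /v1/dataset; the `ViewTracker` class is hypothetical.

```python
class ViewTracker:
    """Client-side bookkeeping: forget view ids once the dataset id changes."""

    def __init__(self):
        self.dataset_id = None
        self.view_ids = set()

    def observe_dataset(self, dataset_id):
        """Record the latest dataset id; return True if stale views were dropped."""
        changed = self.dataset_id is not None and dataset_id != self.dataset_id
        if changed:
            self.view_ids.clear()  # handles created for the old dataset are invalid
        self.dataset_id = dataset_id
        return changed

    def add_view(self, view_id):
        """Remember a view id returned by POST /v1/views."""
        self.view_ids.add(view_id)


tracker = ViewTracker()
tracker.observe_dataset("ds-1")
tracker.add_view("view_abc123")
dropped = tracker.observe_dataset("ds-2")  # dataset changed, views are forgotten
```

After a drop, the client would recreate any views it still needs against the new dataset id.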

License

This project is licensed under the GNU Affero General Public License v3.0 or later. See the LICENSE file for the full text.

Error reporting

  • All tools that execute Stata commands support JSON envelopes (as_json=true) carrying:
    • rc (from r()/c(rc)), stdout, stderr, message, optional line (when Stata reports it), command, optional log_path (for log-file streaming), and a snippet excerpt of error output.
  • Stata-specific cues are preserved:
    • r(XXX) codes are parsed when present in output.
    • “Red text” is captured via stderr where available.
    • trace=true adds set trace on around the command/do-file to surface program-defined errors; the trace is turned off afterward.
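
When working with raw output rather than the JSON envelope, the r(XXX) code can be recovered with a regular expression. The sketch below assumes the usual Stata convention of printing the code on its own line as `r(111);`. It is not the server's parser, and `last_return_code` is a hypothetical name.

```python
import re

# Matches a line consisting only of "r(<digits>);".
RC_PATTERN = re.compile(r"^r\((\d+)\);\s*$", re.MULTILINE)


def last_return_code(output):
    """Return the last r(XXX) code found in `output` as an int, or None."""
    matches = RC_PATTERN.findall(output)
    return int(matches[-1]) if matches else None


sample = "variable foo not found\nr(111);\n"
code = last_return_code(sample)  # 111
```

Taking the last match rather than the first is a design choice: in a long do-file log, the final code is usually the one that stopped execution.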

Logging

Set MCP_STATA_LOGLEVEL (e.g., DEBUG, INFO) to control server logging. Logs include discovery details (edition/path) and command-init traces for easier troubleshooting.

Development & Contributing

For detailed information on building, testing, and contributing to this project, see CONTRIBUTING.md.

Quick setup:

# Install dependencies
uv sync --extra dev --no-install-project

# Run tests (requires Stata)
pytest

# Run tests without Stata
pytest -v -m "not requires_stata"

# Build the package
python -m build
