MCP server for IDA SDK API workflow retrieval

IDA Script Helper: Writing IDA scripts easily with MCP tools

This is an MCP server that helps agents write correct IDA Pro scripts by retrieving API call sequences from real IDA SDK source code and IDAPython examples.

Current Status: Tested with the IDA Pro 8.4 SDK and the matching IDAPython version. The parser backend uses tree-sitter for C++ and Python's built-in ast module for IDAPython, so parsing SDK examples from other versions may require minor adjustments.

Example Use Case

When enabled as an MCP server for an agent like Claude Code, you can ask:

Under the examples/ dir, write an IDAPython script that lists all functions
within the .text section, and then print it out in the console. Name the script "list_all_text_funcs.py".

Then the agent can query this toolset to find a proper workflow (API call sequence) that accomplishes this task, and retrieve API docs to understand how to use each function correctly. In the end, the agent can generate a complete script that follows the correct sequence of API calls, with proper arguments and data flow.
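For illustration, a script produced for the prompt above might look like the sketch below. It is not output captured from the tool. It assumes the IDAPython calls ida_segment.get_segm_by_name, idautils.Functions, and ida_funcs.get_func_name behave as in IDA 8.4; the helper names list_text_funcs and print_text_funcs are hypothetical. Imports sit inside the function so the file can also be loaded outside IDA.

```python
"""list_all_text_funcs.py: print every function inside the .text segment."""


def list_text_funcs(segment_name=".text"):
    """Return (start_ea, name) pairs for functions in the named segment."""
    # IDAPython modules only exist inside IDA, so import them lazily.
    import ida_funcs
    import ida_segment
    import idautils

    seg = ida_segment.get_segm_by_name(segment_name)
    if seg is None:
        # The database has no segment with that name.
        return []

    results = []
    # idautils.Functions yields the start address of each function in range.
    for ea in idautils.Functions(seg.start_ea, seg.end_ea):
        results.append((ea, ida_funcs.get_func_name(ea)))
    return results


def print_text_funcs():
    """Print the collected functions to the IDA output window."""
    for ea, name in list_text_funcs():
        print(f"{ea:#x}: {name}")
```

Inside IDA, add a call to print_text_funcs() at the end of the file and run it via File > Script file.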

Problem Statement

LLMs frequently get IDA SDK API call sequences wrong. For example, listing cross-references isn't a single API call: it requires calling get_screen_ea(), obtaining a func_t* with get_func(), iterating with xrefblk_t::first_to() / xrefblk_t::next_to(), and formatting output with get_name() and msg(). Miss any step and the script can fail silently.

This tool automatically extracts these workflow patterns from IDA's own SDK examples (C++ plugins, processor modules, loaders) and IDAPython example scripts, indexes them, and serves them via MCP so any LLM can query them.
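The cross-reference sequence described above has a close IDAPython counterpart. The sketch below is a hand-written illustration, not a workflow returned by the tool. It assumes the IDA 8.4 IDAPython bindings, where xrefblk_t exposes first_to() / next_to() and the referencing address is available as the frm attribute. The function name xrefs_to_current_function is hypothetical.

```python
def xrefs_to_current_function():
    """Return (from_ea, name) pairs for references to the function at the cursor."""
    # Lazy imports: these modules are only importable inside IDA.
    import ida_funcs
    import ida_kernwin
    import ida_name
    import ida_xref

    ea = ida_kernwin.get_screen_ea()   # step 1: address under the cursor
    pfn = ida_funcs.get_func(ea)       # step 2: enclosing function, or None
    if pfn is None:
        return []                      # the cursor is not inside a function

    refs = []
    xb = ida_xref.xrefblk_t()
    # step 3: first_to()/next_to() walk every reference to the function start
    ok = xb.first_to(pfn.start_ea, ida_xref.XREF_ALL)
    while ok:
        # step 4: xb.frm is the referencing address ("from" is a Python keyword)
        refs.append((xb.frm, ida_name.get_name(xb.frm)))
        ok = xb.next_to()
    return refs
```

Skipping the None check after get_func() or forgetting next_to() are the kinds of mistakes that retrieved workflows are meant to prevent.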

Features

  • Dual-language support: Extracts workflows from both C++ SDK examples and IDAPython scripts
  • SWIG stub parsing: Harvests rich docstrings from ida_*.py stubs (@param, @return, signatures)
  • Data-flow tracking: Follows variable assignments across API calls to show how outputs feed into inputs
  • Trust-ranked results: Official SDK examples surface first, modules and loaders second
  • Semantic search: Natural-language queries matched against workflow descriptions and API briefs
  • Multi-version support: Index multiple SDK versions side-by-side, switch at query time
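To make the data-flow tracking feature concrete, here is a minimal sketch of the idea using Python's ast module. It is a simplified illustration, not the project's extractor: it only handles top-level statements of the form name = call(...) and positional arguments, and the function name data_flow_edges is hypothetical.

```python
import ast


def data_flow_edges(source):
    """Return (producer_call, variable, consumer_call) triples found in source."""
    edges = []
    produced_by = {}  # variable name -> dotted name of the call that assigned it

    def call_name(call):
        # ast.unparse turns the callee expression back into a dotted string.
        return ast.unparse(call.func)

    for stmt in ast.parse(source).body:
        # Any call that consumes a previously assigned variable creates an edge.
        for node in ast.walk(stmt):
            if not isinstance(node, ast.Call):
                continue
            for arg in node.args:
                if isinstance(arg, ast.Name) and arg.id in produced_by:
                    edges.append((produced_by[arg.id], arg.id, call_name(node)))
        # Record assignments of the form `name = some_call(...)`.
        if (isinstance(stmt, ast.Assign)
                and isinstance(stmt.value, ast.Call)
                and isinstance(stmt.targets[0], ast.Name)):
            produced_by[stmt.targets[0].id] = call_name(stmt.value)
    return edges
```

Given the three lines ea = ida_kernwin.get_screen_ea(), func = ida_funcs.get_func(ea), and fc = ida_gdl.FlowChart(func), the function returns two edges: get_screen_ea feeds get_func through ea, and get_func feeds FlowChart through func.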

MCP Tools

Tool               Purpose                                                  Input
get_workflows      Find API call sequences for a task                       Natural-language task description
get_api_doc        Look up a function, struct, or class (fuzzy match)       Function/type name or keyword
list_related_apis  Find co-occurring APIs                                   Function or type name
get_index_info     Show indexed version metadata and record counts          (none)
clear_index        Delete index data for one version                        Optional version string
initialize_index   Build index from SDK path (and optional IDAPython path)  sdk_path, version, optional python_path
get_versions       List all indexed SDK versions                            (none)
select_version     Switch active SDK version                                Version string (e.g., "84")
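MCP clients invoke these tools with a JSON-RPC 2.0 tools/call request. The sketch below builds such a request by hand to show what an agent sends on the wire. The argument key "query" is an assumption made for illustration; the real parameter names come from each tool's input schema, which a client obtains via tools/list.

```python
import json


def tool_call_request(tool, arguments, request_id=1):
    """Build an MCP `tools/call` JSON-RPC 2.0 request as a JSON string."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# "query" is a hypothetical argument name; check the tool's input schema.
request = tool_call_request(
    "get_workflows",
    {"query": "get the function at an address and print its name"},
)
print(request)
```

In practice an MCP client library or the agent host assembles this message, so the sketch is mainly useful for debugging the stdio transport.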

Example

get_workflows("get the function at an address and print its name")

Returns:

=== Result 1 (trust: highest) ===
Workflow: run
Source: plugins/vcsample/vcsample.cpp

1. get_screen_ea(...)
2. get_func(...)        [uses ea from step 1]
3. get_func_name(...)   [uses pfn from step 2]
4. msg(...)

Source code:
  ea_t ea = get_screen_ea();
  func_t *pfn = get_func(ea);
  if ( pfn != nullptr )
  {
    qstring name;
    get_func_name(&name, pfn->start_ea);
    msg("Function: %s\n", name.c_str());
  }

When the index was built with --python-path, Python results appear alongside C++:

=== Result 2 (trust: highest) ===
Workflow: <module>
Source: python/examples/core/dump_flowchart.py

1. ida_kernwin.get_screen_ea(...)
2. ida_funcs.get_func(...)    [uses ea from step 1]
3. ida_gdl.FlowChart(...)     [uses func from step 2]
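Results are returned as plain text, so a client that wants structured access to the numbered steps has to parse them. The sketch below does that for the step lines shown above. It assumes the layout stays exactly as in these examples (step number, call name followed by (...), optional [uses X from step N] annotation); the name parse_steps is hypothetical.

```python
import re

# One numbered step, with an optional data-flow annotation at the end.
STEP_RE = re.compile(
    r"^(?P<num>\d+)\.\s+(?P<call>[\w.:]+)\(\.\.\.\)"
    r"(?:\s+\[uses (?P<var>\w+) from step (?P<src>\d+)\])?\s*$"
)


def parse_steps(result_text):
    """Extract numbered steps and their data-flow annotations from a result."""
    steps = []
    for line in result_text.splitlines():
        match = STEP_RE.match(line.strip())
        if match:
            steps.append({
                "step": int(match["num"]),
                "call": match["call"],
                "uses": match["var"],  # None when the step has no annotation
                "from_step": int(match["src"]) if match["src"] else None,
            })
    return steps
```

Lines that are not numbered steps (headers, source paths, code) are skipped, so the whole result block can be passed in unchanged.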

Setup

# Clone and install
git clone https://github.com/ruotoy/IDA-Sdk-Workflow-MCP.git
cd IDA-Sdk-Workflow-MCP
python3.10 -m venv .venv
.venv/bin/pip install -e ".[dev]"

Note: Python 3.10 is required — tree-sitter-languages does not ship wheels for 3.13+.

Usage

1. Build the index

C++ only (IDA SDK)

ida-api-mcp-admin build-index \
  --sdk-path /path/to/idasdk_pro84 \
  --version 84

C++ + Python (IDA SDK + IDAPython)

ida-api-mcp-admin build-index \
  --sdk-path /path/to/idasdk_pro84 \
  --python-path /path/to/idapro-8.4/python \
  --version 84

The --python-path should point to the python/ directory inside your IDA Pro installation. It expects:

  • 3/ida_*.py — SWIG-generated API stubs
  • 3/idautils.py, 3/idc.py — higher-level utility modules
  • examples/ — official IDAPython example scripts
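Before running a long indexing job, it can help to confirm that the directory matches the layout listed above. The following sketch is a standalone pre-flight check, not part of ida-api-mcp-admin; the function name check_python_path is hypothetical.

```python
from pathlib import Path


def check_python_path(python_path):
    """Return a list of layout problems; an empty list means the layout looks right."""
    root = Path(python_path)
    stubs = root / "3"
    problems = []
    # SWIG-generated stubs: 3/ida_*.py
    if not stubs.is_dir() or not any(stubs.glob("ida_*.py")):
        problems.append("no SWIG stubs matching 3/ida_*.py")
    # Higher-level utility modules
    for name in ("idautils.py", "idc.py"):
        if not (stubs / name).is_file():
            problems.append(f"missing 3/{name}")
    # Official example scripts
    if not (root / "examples").is_dir():
        problems.append("missing examples/ directory")
    return problems
```

Calling check_python_path("/path/to/idapro-8.4/python") and printing any returned problems gives a quick answer before build-index is started.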

Options

Option                 Description
--sdk-path (required)  Path to IDA SDK directory (e.g., idasdk_pro84/)
--python-path          Path to IDAPython directory (e.g., idapro-8.4/python/)
--version (required)   SDK version string (e.g., 84 for IDA 8.4)
--db-path              Base path for ChromaDB storage (default: data/chroma_db)
--max-files            Limit number of source files to process (for testing)

2. Test queries

ida-api-mcp-admin inspect workflows "decompile a function"
ida-api-mcp-admin inspect workflows "cross references to an address"
ida-api-mcp-admin inspect workflows "list all functions in a segment"
ida-api-mcp-admin inspect workflows "enumerate file imports"

# Inspect metadata and API docs from CLI:
ida-api-mcp-admin inspect info --version 84
ida-api-mcp-admin inspect api-doc get_func --version 84
ida-api-mcp-admin inspect related get_func --version 84

# Clear one indexed version:
ida-api-mcp-admin clear-index --version 84

3. Add as MCP server

Claude Code

claude mcp add ida-api-mcp /path/to/IDA-Sdk-Workflow-MCP/.venv/bin/ida-api-mcp

Or create a .mcp.json file in the project root:

{
  "mcpServers": {
    "ida-api-mcp": {
      "command": "/path/to/IDA-Sdk-Workflow-MCP/.venv/bin/ida-api-mcp",
      "args": []
    }
  }
}

Claude Desktop

Add to ~/.config/Claude/claude_desktop_config.json (Linux), ~/Library/Application Support/Claude/claude_desktop_config.json (macOS), or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "ida-api-mcp": {
      "command": "/path/to/IDA-Sdk-Workflow-MCP/.venv/bin/ida-api-mcp",
      "args": []
    }
  }
}
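If the config file already lists other MCP servers, the entry should be merged in rather than pasted over the existing content. A small helper along these lines could do that. It is a sketch that assumes the file holds the mcpServers object shown above; the function name add_mcp_server is hypothetical.

```python
import json
from pathlib import Path


def add_mcp_server(config_path, command, name="ida-api-mcp"):
    """Merge one server entry into an MCP config file, keeping other entries."""
    path = Path(config_path)
    # Start from the existing config when there is one.
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("mcpServers", {})[name] = {"command": command, "args": []}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

The same helper works for a project-level .mcp.json and for claude_desktop_config.json, since both use the same mcpServers structure here.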

From PyPI (via uvx)

{
  "mcpServers": {
    "ida-api-mcp": {
      "command": "uvx",
      "args": ["ida-api-mcp"]
    }
  }
}

How It Works

[1. Collect]  Enumerate source files from IDA SDK (C++) and IDAPython (Python)
      |         C++: plugins/, module/, ldr/, dbg/
      |         Python: python/examples/core/, hexrays/, analysis/, ...
      ↓
[2. Parse]    C++: tree-sitter C++ → AST
              Python: ast module → AST
              Stubs: ast module → API docs from ida_*.py SWIG docstrings
      ↓
[3. Extract]  Identify IDA API calls per function
              C++: idaman/ida_export patterns, member calls, qualified calls
              Python: module-qualified calls (ida_funcs.get_func),
                      direct imports (from ida_funcs import get_func)
              Track variable assignments to build data-flow edges
      ↓
[4. Index]    Store call chains + source snippets in ChromaDB
              Embed with semantic vectors for natural-language search
              Tag with language metadata (cpp / python)
      ↓
[5. Serve]    MCP server retrieves relevant workflows at query time
              Results ranked by trust level, then similarity
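Step 3 for Python sources can be approximated in a few lines with the ast module. The sketch below recognises the two call patterns named in the diagram, module-qualified calls and names brought in by direct imports. It is an illustration rather than the project's extractor: it only treats modules whose name starts with ida_ as IDA modules (so idautils and idc are ignored), and the function name ida_api_calls is hypothetical.

```python
import ast


def ida_api_calls(source):
    """List IDA API calls in source as 'module.function' strings, in source order."""
    tree = ast.parse(source)

    # Direct imports: `from ida_funcs import get_func [as alias]`.
    imported = {}  # local name -> 'ida_module.original_name'
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and (node.module or "").startswith("ida_"):
            for alias in node.names:
                imported[alias.asname or alias.name] = f"{node.module}.{alias.name}"

    calls = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and func.value.id.startswith("ida_")):
            # Module-qualified call such as ida_funcs.get_func(...)
            calls.append((node.lineno, node.col_offset, f"{func.value.id}.{func.attr}"))
        elif isinstance(func, ast.Name) and func.id in imported:
            # Call through a directly imported name such as get_func(...)
            calls.append((node.lineno, node.col_offset, imported[func.id]))
    # ast.walk is breadth-first, so sort by position to recover source order.
    return [name for _, _, name in sorted(calls)]
```

Non-IDA calls such as print() are ignored, which is what keeps boilerplate out of the extracted call chain.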

Data sources by trust level

Trust    C++ Sources                                                  Python Sources
Highest  plugins/ (official SDK example plugins)                      python/examples/ (official IDAPython examples)
High     module/, ldr/, dbg/ (processor modules, loaders, debuggers)  (none)
Medium   include/ (header declarations)                               (none)
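Ranking by trust level first and similarity second amounts to a two-part sort key. The sketch below shows one way to express that. The result dictionaries and their trust and similarity fields are assumptions made for illustration, not the server's actual record format.

```python
# Lower number means more trusted; the levels mirror the table above.
TRUST_ORDER = {"highest": 0, "high": 1, "medium": 2}


def rank_results(results):
    """Sort results by trust level first, then by descending similarity."""
    return sorted(
        results,
        key=lambda r: (TRUST_ORDER[r["trust"]], -r["similarity"]),
    )
```

With this ordering, a loader-derived workflow with similarity 0.95 still ranks below an official plugin example with similarity 0.60, which matches the trust-first behaviour described above.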

Build statistics (SDK 8.4)

Metric C++ Python Combined
Source files 356 89 examples 445
Extracted workflows 921 112 1,033
API calls captured 505
Data-flow edges 143
API doc entries 1,443 (from headers) 8,060 (from stubs) 8,491 (merged)

Python extraction coverage

63 of 89 IDAPython example scripts (71%) produce at least one workflow. The remaining 26 break down as:

Category                                      Count  Notes
No IDA API calls at all                       10     Hook skeletons, config files, pure boilerplate
Single API call (below min-2 threshold)       8      Trivial one-liners, no meaningful workflow
2 calls spread across separate class methods  6      Each method has only 1 call; no single function reaches threshold
Non-standard import pattern                   2      e.g., from Choose import Choose (not an ida_* module)

Project Structure

src/ida_api_mcp/
├── server.py                       # MCP server (FastMCP, stdio transport)
├── cli.py                          # CLI: build-index, inspect, list-versions, serve
├── config.py                       # Configuration dataclass
├── version_manager.py              # Multi-version index management
├── collector/
│   ├── sdk_source.py               # Enumerate C++ files, build API names from headers
│   ├── python_source.py            # Enumerate Python examples, collect stub docs
│   └── doc_source.py               # Collect API docs from C++ headers
├── parser/
│   ├── cpp_parser.py               # tree-sitter C++ parsing
│   ├── python_parser.py            # ast-based Python parsing (imports, functions, metadata)
│   ├── html_parser.py              # Doxygen comment extraction from C++ headers
│   └── stub_parser.py              # SWIG stub parsing (ida_*.py → HeaderApiDoc)
├── extractor/
│   ├── models.py                   # Core data models (Workflow, ApiCall, DataFlowEdge, etc.)
│   ├── call_chain.py               # C++ workflow extraction (tree-sitter AST)
│   └── python_call_chain.py        # Python workflow extraction (ast module)
└── indexer/
    ├── store.py                    # ChromaDB ingestion (workflows + API docs)
    └── search.py                   # Semantic search interface

Development

# Run tests (65 tests)
.venv/bin/pytest -v

# Lint
.venv/bin/ruff check src/ tests/

Publish to PyPI (Twine)

# Build distributions
.venv/bin/python -m build

# Validate package metadata
.venv/bin/python -m twine check dist/*

# Upload (token-based auth)
TWINE_USERNAME=__token__ TWINE_PASSWORD=<pypi-token> .venv/bin/python -m twine upload dist/*

GitHub Actions also publishes on release via .github/workflows/pypi.yml.

License

MIT
