
IPython Data Analysis MCP Server - a lightweight data analysis MCP tool based on the IPython kernel


IPython Data Analysis MCP Server




A lightweight data analysis MCP (Model Context Protocol) tool built on a real IPython kernel. It provides a complete interactive Python data analysis environment, with session management, data loading, real-time data viewing, and other core functions.

🚀 Core Features

  • Real IPython Environment: Built on IPython's InteractiveShell, so IPython features are available in every session
  • Multi-Session Management: Independent session spaces with variable isolation and persistent state
  • Intelligent Data Loading: Loads CSV, Excel, and JSON files with automatic encoding detection and smart variable naming
  • Real-time Monitoring: Memory usage monitoring, variable management, and execution history tracking
  • Complete Feature Support: Runs Python code, IPython magic commands, and system commands
  • Smart Sampling: Shows column data from large datasets through sampling, which keeps the output from overflowing the model's context
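To make the multi-session idea concrete, here is a minimal sketch of per-session variable isolation. It is not the project's implementation: it uses plain `exec` with one namespace dictionary per session instead of IPython's InteractiveShell, and the `SessionRegistry` class and its method names are hypothetical. Only the `session_` prefix plus eight hex characters follows the ID format shown in the usage guide.

```python
import uuid


class SessionRegistry:
    """Toy stand-in for a multi-session manager (hypothetical, not the project's code)."""

    def __init__(self):
        self._sessions = {}  # session_id -> namespace dict

    def create(self):
        # ID format mirrors the "session_a1b2c3d4" example in the usage guide.
        session_id = "session_" + uuid.uuid4().hex[:8]
        self._sessions[session_id] = {}
        return session_id

    def execute(self, session_id, code):
        # Each session runs code against its own namespace, so variables
        # persist between calls but never leak into other sessions.
        exec(code, self._sessions[session_id])

    def get(self, session_id, name):
        return self._sessions[session_id].get(name)

    def delete(self, session_id):
        # Returns True if the session existed and was removed.
        return self._sessions.pop(session_id, None) is not None

    def list_ids(self):
        return sorted(self._sessions)


registry = SessionRegistry()
first = registry.create()
second = registry.create()
registry.execute(first, "x = 1")
registry.execute(first, "x += 41")      # state persists within a session
registry.execute(second, "y = 'other'")
print(registry.get(first, "x"))         # 42
print(registry.get(second, "x"))        # None: x is not visible in the other session
```

A real server would also need to capture output and errors from each execution; the sketch only shows why separate namespaces give both isolation and persistence.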

📋 Feature List

17 Core Tool Functions

  1. Session Management

    • create_ipython_session - Create new IPython session
    • list_ipython_sessions - List all active sessions
    • get_session_status - Get detailed session status
    • delete_ipython_session - Delete specified session
  2. Code Execution

    • execute_code - Execute Python code, magic commands, system commands
    • get_execution_history - Get execution history
  3. Data Loading

    • load_csv_file - Load CSV files (automatic encoding detection)
    • load_excel_file - Load Excel files (supports .xlsx/.xls)
    • load_json_file - Load JSON files
  4. Data Operations & Viewing

    • list_dataframes - List all DataFrames in session
    • get_dataframe_info - Get detailed DataFrame information
    • preview_dataframe - Preview DataFrame data
    • get_dataframe_summary - Get statistical summary
    • sample_column_data - Smart sampling for column data viewing
  5. Memory & Variable Management

    • check_memory_usage - Check memory usage
    • get_variable_info - Get detailed variable information
    • clear_variables - Clear variables to free memory
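The "automatic encoding detection" mentioned for `load_csv_file` can be approximated by trying a short list of encodings in order. The sketch below is an assumption about one way to do this, not the project's actual logic: the `read_csv_rows` helper and the candidate list are hypothetical, and it uses the standard `csv` module rather than pandas so that it runs without extra dependencies.

```python
import csv
import io

# Assumed candidate order. latin-1 can decode any byte sequence, so it acts
# as a last-resort fallback and must come last.
CANDIDATE_ENCODINGS = ("utf-8-sig", "gb18030", "latin-1")


def read_csv_rows(raw, encodings=CANDIDATE_ENCODINGS):
    """Decode raw CSV bytes with the first encoding that works, then parse rows."""
    for encoding in encodings:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate
        rows = list(csv.reader(io.StringIO(text, newline="")))
        return encoding, rows
    raise ValueError("none of the candidate encodings could decode the data")


sample = "名称,值\n苹果,1\n".encode("gb18030")
encoding, rows = read_csv_rows(sample)
print(encoding, rows)
```

Trial decoding only proves that the bytes are valid in an encoding, not that the encoding is the right one, so a statistical detector may be a better fit for ambiguous files.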

🛠️ Installation & Configuration

Method 1: Direct Run with uvx (Recommended)

There is no need to clone the project; uvx can run the server directly from GitHub:

# Install uv, which provides the uvx command (if not already installed)
pip install uv

# Run MCP server directly
uvx --from git+https://github.com/Hillyess/dataHill.git DATA_MCP.py

Method 2: Local Installation for Development

# 1. Clone project
git clone git@github.com:Hillyess/dataHill.git
cd dataHill

# 2. Create virtual environment
conda create -n data-analyzer python=3.10
conda activate data-analyzer

# 3. Install dependencies
pip install -r requirements.txt

# 4. Test installation
python DATA_MCP.py

Configure MCP Client

Claude Desktop Configuration

Edit the Claude Desktop configuration file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

Recommended Configuration (using uvx):

{
  "mcpServers": {
    "dataHill": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/Hillyess/dataHill.git",
        "DATA_MCP.py"
      ]
    }
  }
}

Local Development Configuration (if using Method 2):

{
  "mcpServers": {
    "dataHill": {
      "command": "python",
      "args": ["/path/to/your/DATA_MCP.py"],
      "env": {
        "PYTHONPATH": "/path/to/your/project"
      }
    }
  }
}
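Editing the JSON by hand risks overwriting servers that are already configured. As a hedged convenience, the sketch below merges the uvx-based entry shown above into an existing config file while leaving other keys alone. The `add_datahill_server` helper is hypothetical, and the demo writes to a temporary directory rather than to the real Claude Desktop path.

```python
import json
import tempfile
from pathlib import Path


def add_datahill_server(config_path):
    """Merge the uvx-based dataHill entry into an MCP client config file."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Keep any servers that are already configured.
    servers = config.setdefault("mcpServers", {})
    servers["dataHill"] = {
        "command": "uvx",
        "args": [
            "--from",
            "git+https://github.com/Hillyess/dataHill.git",
            "DATA_MCP.py",
        ],
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Demo against a throwaway file; point config_path at the real file to use it.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    print(json.dumps(add_datahill_server(demo_path), indent=2))
```

Back up the real configuration file before running a script like this against it, and restart the client afterwards so it picks up the change.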

📖 Usage Guide

Basic Workflow

# 1. Create session
create_ipython_session()
# Returns: {"success": true, "session_id": "session_a1b2c3d4", ...}

# 2. Load data
load_csv_file("data.csv", "session_a1b2c3d4", "df")

# 3. View data information
get_dataframe_info("df", "session_a1b2c3d4")

# 4. Smart sampling for data viewing
sample_column_data("df", "column_name", "session_a1b2c3d4", method="mixed", sample_size=20)

# 5. Execute analysis
execute_code("df.describe()", "session_a1b2c3d4")

# 6. Memory monitoring
check_memory_usage("session_a1b2c3d4")

# 7. Clean up session
delete_ipython_session("session_a1b2c3d4")
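Step 4 passes `method="mixed"` and `sample_size=20`, but this page does not define what "mixed" means. One plausible reading, offered purely as an assumption, combines rows from the head and tail of a column with a random draw from the middle. The `sample_mixed` function below is a hypothetical illustration of that reading and is not the behavior of `sample_column_data`.

```python
import random


def sample_mixed(values, sample_size=20, seed=0):
    """Return up to sample_size items: head, random middle picks, then tail."""
    values = list(values)
    if len(values) <= sample_size:
        return values  # small columns are returned whole
    edge = sample_size // 4                       # items taken from each end
    head = values[:edge]
    tail = values[len(values) - edge:]
    middle = values[edge:len(values) - edge]
    rng = random.Random(seed)                     # seeded for repeatable output
    picked = sorted(rng.sample(range(len(middle)), sample_size - 2 * edge))
    return head + [middle[i] for i in picked] + tail


column = list(range(1000))
result = sample_mixed(column, sample_size=20)
print(len(result), result[:5], result[-5:])
```

Capping the result at `sample_size` items is what keeps the tool output bounded no matter how large the DataFrame is.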

🔧 System Requirements

  • Python: 3.8+
  • Memory: 4 GB or more recommended (depends on data size)
  • Operating System: Windows/macOS/Linux
  • MCP Client: Claude Desktop or any other MCP client that supports the stdio transport

📦 Dependencies

Core Dependencies

  • fastmcp>=0.5.0 - MCP server framework
  • ipython>=8.0.0 - IPython interactive environment
  • pandas>=2.0.0 - Data processing and analysis
  • numpy>=1.24.0 - Core numerical computing library

Data Support

  • openpyxl>=3.1.0 - Excel .xlsx file support
  • xlrd>=2.0.0 - Excel .xls file support

System Monitoring

  • psutil>=5.9.0 - Memory and system monitoring
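For a local install, the minimum versions listed above can be checked with a few lines of standard-library Python. This is a rough sketch: the `check_requirements` and `parse_version` helpers are hypothetical, and the version comparison only looks at leading numeric components, so it is not a full PEP 440 implementation.

```python
import re
from importlib import metadata

# Minimum versions copied from the dependency lists above.
MINIMUM_VERSIONS = {
    "fastmcp": "0.5.0",
    "ipython": "8.0.0",
    "pandas": "2.0.0",
    "numpy": "1.24.0",
    "openpyxl": "3.1.0",
    "xlrd": "2.0.0",
    "psutil": "5.9.0",
}


def parse_version(text):
    """Keep leading numeric components only, e.g. '2.1.0rc1' -> (2, 1, 0)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Map each package name to 'ok', 'missing', or a 'too old' message."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = parse_version(installed) >= parse_version(minimum)
        report[name] = "ok" if ok else f"too old ({installed} < {minimum})"
    return report


for package, status in check_requirements().items():
    print(f"{package}: {status}")
```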

🤝 Contributing

  1. Fork this project
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙋‍♂️ Support & Feedback


⭐ If this project helps you, please give us a Star!


File details

Details for the file iflow_mcp_hillyess_datahill-0.1.0.tar.gz.

File metadata

  • Download URL: iflow_mcp_hillyess_datahill-0.1.0.tar.gz
  • Size: 24.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.0

File hashes

Hashes for iflow_mcp_hillyess_datahill-0.1.0.tar.gz

Algorithm    Hash digest
SHA256       4dc81504cded6af4bc2d0f2d147509420cf74bbb71247750a1f0616f78af0da1
MD5          b80ad7863fbb821b229c28f35c237baa
BLAKE2b-256  cd36c453147fd06e640da7232ece6408742d307250ee32f61e92828210eefb8a

File details

Details for the file iflow_mcp_hillyess_datahill-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: iflow_mcp_hillyess_datahill-0.1.0-py3-none-any.whl
  • Size: 36.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.0

File hashes

Hashes for iflow_mcp_hillyess_datahill-0.1.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       4eec4a4d12800b8b1507ae63a03ac8e3c62a3c052b0717ce14335d2c16482a04
MD5          c81c7d844738d27292bf11673d85e221
BLAKE2b-256  88437577cb6b232781264d03cbfb6f074f544f33fc106932b7531bf1219722c8
