Skip to main content

VLM Image Understanding MCP Server - Support OpenAI protocol compatible VLMs

Project description

VLM MCP Server

支持 OpenAI 协议兼容的 VLM 图片理解 MCP Server。

功能

  • extract_text_from_image: 从图片中提取文字 (OCR)
  • ui_to_artifact: 将 UI 截图转换为代码、提示词、设计规格或描述
  • extract_text_from_screenshot: 从截图提取文本,支持代码识别
  • diagnose_error_screenshot: 分析错误截图,诊断问题原因
  • understand_technical_diagram: 分析技术图表(架构图、流程图、UML等)
  • analyze_data_visualization: 分析数据可视化图表
  • ui_diff_check: UI 对比检测,找出视觉差异
  • analyze_image: 通用图片分析

环境变量

变量名 必填 说明
VLM_API_KEY API 密钥
VLM_BASE_URL 自定义 API 地址(默认: https://api.openai.com/v1)
VLM_MODEL 使用的模型(默认: gpt-4o)

快速开始

使用 uvx 运行(推荐)

# 设置环境变量
export VLM_API_KEY=your-api-key
export VLM_MODEL=gpt-4o

# 直接运行
uvx vlm-mcp

使用 pip 安装

# 安装
pip install vlm-mcp

# 或开发模式安装
pip install -e .

配置环境变量

# OpenAI
export VLM_API_KEY=sk-xxx
export VLM_MODEL=gpt-4o

# 自定义 API (如 Ollama)
export VLM_API_KEY=your-api-key
export VLM_BASE_URL=http://localhost:11434/v1
export VLM_MODEL=qwen2.5-vl

运行服务

# 直接运行
python -m vlm_mcp

# 或使用安装的命令
vlm-mcp

支持的模型

任何兼容 OpenAI Chat Completions API 的 VLM 模型:

  • gpt-4o
  • gpt-4o-mini
  • gpt-4-turbo
  • qwen2.5-vl 系列
  • 及其他兼容 OpenAI API 的模型

Claude Code 配置

在 Claude Code 中配置 MCP 服务器:

{
  "mcpServers": {
    "vlm-mcp": {
      "command": "uvx",
      "args": ["vlm-mcp"],
      "env": {
        "VLM_API_KEY": "your-api-key",
        "VLM_MODEL": "gpt-4o"
      }
    }
  }
}

使用示例

在 Claude Code 中使用:

请用 extract_text_from_image 工具分析这张图片 /path/to/image.jpg,提取其中的文字。
请用 ui_to_artifact 工具将这个UI截图转换为 React 代码。

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vlm_mcp-0.1.0.tar.gz (7.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

vlm_mcp-0.1.0-py3-none-any.whl (8.2 kB view details)

Uploaded Python 3

File details

Details for the file vlm_mcp-0.1.0.tar.gz.

File metadata

  • Download URL: vlm_mcp-0.1.0.tar.gz
  • Upload date:
  • Size: 7.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.7 {"installer":{"name":"uv","version":"0.10.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for vlm_mcp-0.1.0.tar.gz
Algorithm Hash digest
SHA256 c821b7f888ef33e52f34db2e5a0fef5c5b033191beb02af97dbae38831ad5104
MD5 c11735095bfc5716ad7c7c736fa0c826
BLAKE2b-256 5f0db4602bf0b2b1cd0ce22d590ea395baab6e736e900c3b87a2440ffa2e4e6c

See more details on using hashes here.

File details

Details for the file vlm_mcp-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: vlm_mcp-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 8.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.7 {"installer":{"name":"uv","version":"0.10.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for vlm_mcp-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 ec8479e2e166247c5a6ac766ea226ffb376751c1b31dd46e26af2156eeb46f92
MD5 7100b4427296eb683f20a57b482885b6
BLAKE2b-256 94d09e39da7d55c52d1a1c221eae3ff3ab05c8026e00538eac11bd7ef379a49d

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page