Medical RAG with Asset-Aware MCP - Precise PDF asset retrieval (tables, figures, sections) for AI Agents
asset-aware-mcp
Medical RAG with Asset-Aware MCP - Lets AI agents precisely access the tables, sections, and knowledge graphs in PDF literature
Features
- Asset-Aware ETL - PDF → Markdown, using PyMuPDF to automatically identify tables, sections, and images
- Async Job Pipeline - Supports asynchronous task processing, tracking progress for large documents
- Document Manifest - Structured list, allowing Agents to "see the map" before precisely accessing data
- LightRAG Integration - Knowledge Graph + Vector Index, supporting cross-document comparison and reasoning
- MCP Server - Exposes tools and resources to Copilot/Claude via FastMCP
- Medical Research Focus - Optimized for medical literature, supporting Base64 image transmission for Vision AI analysis
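The Base64 image transmission mentioned in the last feature can be pictured roughly as follows. This is a minimal standard-library sketch, not the project's actual code: the function names `encode_image_asset` and `decode_image_asset` and the payload keys are assumptions made for illustration.

```python
import base64
from pathlib import Path


def encode_image_asset(image_path: Path, mime_type: str = "image/png") -> dict:
    """Read an image file and wrap it in a Base64 payload (hypothetical shape)."""
    raw = image_path.read_bytes()
    return {
        "mime_type": mime_type,
        "size_bytes": len(raw),
        # Base64 turns arbitrary bytes into ASCII text that is safe to embed in JSON.
        "data": base64.b64encode(raw).decode("ascii"),
    }


def decode_image_asset(payload: dict) -> bytes:
    """Recover the original image bytes on the receiving side."""
    return base64.b64decode(payload["data"])
```

Base64 adds roughly one third to the payload size, so a sketch like this suits individual figures rather than bulk transfer.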
Architecture
┌─────────────────────────────────────────────────────────┐
│                   AI Agent (Copilot)                    │
└─────────────────────┬───────────────────────────────────┘
                      │ MCP Protocol (Tools & Resources)
┌─────────────────────▼───────────────────────────────────┐
│                 MCP Server (server.py)                  │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────────┐    │
│  │   ingest    │ │   inspect   │ │      fetch      │    │
│  │  documents  │ │  manifest   │ │      asset      │    │
│  └─────────────┘ └─────────────┘ └─────────────────┘    │
│  ┌─────────────────────────────────────────────────┐    │
│  │             consult_knowledge_graph             │    │
│  └─────────────────────────────────────────────────┘    │
└─────────────────────┬───────────────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────────────┐
│                   ETL Pipeline (DDD)                    │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐             │
│  │ PyMuPDF  │   │  Asset   │   │ LightRAG │             │
│  │ Adapter  │ → │  Parser  │ → │  Index   │             │
│  └──────────┘   └──────────┘   └──────────┘             │
└─────────────────────┬───────────────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────────────┐
│                      Local Storage                      │
│  ./data/                                                │
│  ├── doc_{id}/                                          │
│  │   ├── full.md          # Markdown Content            │
│  │   ├── manifest.json    # Asset Map                   │
│  │   └── images/          # Extracted Figures           │
│  └── lightrag/            # Knowledge Graph             │
└─────────────────────────────────────────────────────────┘
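The storage layout at the bottom of the diagram suggests how a manifest lookup might work. The sketch below follows the `./data/doc_{id}/manifest.json` path from the diagram, but the manifest's internal schema (top-level `tables`, `figures`, and `sections` lists whose entries carry an `id`) is an assumption, as are the helper names.

```python
import json
from pathlib import Path
from typing import Optional

DATA_ROOT = Path("./data")  # storage root shown in the diagram


def doc_dir(doc_id: str, root: Path = DATA_ROOT) -> Path:
    """Directory for one document, following the doc_{id}/ layout."""
    return root / f"doc_{doc_id}"


def load_manifest(doc_id: str, root: Path = DATA_ROOT) -> dict:
    """Load manifest.json for a document; the schema inside is assumed."""
    path = doc_dir(doc_id, root) / "manifest.json"
    return json.loads(path.read_text(encoding="utf-8"))


def find_asset(manifest: dict, kind: str, asset_id: str) -> Optional[dict]:
    """Return the entry of the given kind ("tables", "figures", "sections") and id."""
    for entry in manifest.get(kind, []):
        if entry.get("id") == asset_id:
            return entry
    return None
```

Reading the manifest first and then fetching a single entry mirrors the "see the map" idea: the agent avoids loading `full.md` in its entirety when it only needs one table.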
Project Structure (DDD)
asset-aware-mcp/
├── src/
│   ├── domain/          # Domain: Entities, Value Objects, Interfaces
│   ├── application/     # Application: Doc Service, Job Service, Asset Service
│   ├── infrastructure/  # Infrastructure: PyMuPDF, LightRAG, File Storage
│   └── presentation/    # Presentation: MCP Server (FastMCP)
├── data/                # Document and Asset Storage
├── docs/
│   └── spec.md          # Technical Specification
├── tests/               # Unit and Integration Tests
├── vscode-extension/    # VS Code Management Extension
└── pyproject.toml       # uv Project Config
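To make the layering concrete, here is a small sketch of how the domain, infrastructure, and application layers could relate. All class names (`Asset`, `AssetRepository`, `InMemoryAssetRepository`, `AssetService`) are hypothetical and chosen for illustration; the real modules under `src/` may be organized differently.

```python
from dataclasses import dataclass
from typing import Protocol


# --- domain: entities and interfaces, no I/O ---
@dataclass(frozen=True)
class Asset:
    asset_id: str
    kind: str  # "table", "figure", or "section"
    content: str


class AssetRepository(Protocol):
    def get(self, doc_id: str, asset_id: str) -> Asset: ...


# --- infrastructure: one concrete implementation of the interface ---
class InMemoryAssetRepository:
    def __init__(self) -> None:
        self._items: dict[tuple[str, str], Asset] = {}

    def add(self, doc_id: str, asset: Asset) -> None:
        self._items[(doc_id, asset.asset_id)] = asset

    def get(self, doc_id: str, asset_id: str) -> Asset:
        return self._items[(doc_id, asset_id)]


# --- application: use case that depends only on the domain interface ---
class AssetService:
    def __init__(self, repo: AssetRepository) -> None:
        self._repo = repo

    def fetch(self, doc_id: str, asset_id: str) -> str:
        return self._repo.get(doc_id, asset_id).content
```

Because `AssetService` only knows the `AssetRepository` protocol, a file-backed repository could replace the in-memory one without touching the application layer, which is the usual motivation for this kind of split.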
Quick Start
# Install dependencies (using uv)
uv sync
# Run MCP Server
uv run python -m src.presentation.server
# Or use the VS Code extension for graphical management
MCP Tools
| Tool | Purpose |
|---|---|
| ingest_documents | Import PDFs and trigger the ETL pipeline (async supported) |
| get_job_status | Check the progress of an ETL job |
| list_documents | List all processed documents |
| inspect_document_manifest | View the document's structure map (tables/figures/sections) |
| fetch_document_asset | Precisely retrieve a table (Markdown), figure (Base64), or section |
| consult_knowledge_graph | Query the knowledge graph and compare across documents |
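Since ingestion can run asynchronously, a client would typically call `ingest_documents`, then poll `get_job_status` before inspecting the manifest. The polling step might look like the sketch below. The status dictionary shape and the state names (`running`, `completed`, `failed`) are assumptions, and `make_fake_status` is only a stand-in for a real MCP client call.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[str], dict],
    job_id: str,
    poll_seconds: float = 1.0,
    timeout_seconds: float = 300.0,
) -> dict:
    """Poll a status callable until the job finishes or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status(job_id)
        if status["state"] in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout_seconds}s")
        time.sleep(poll_seconds)


def make_fake_status() -> Callable[[str], dict]:
    """Stand-in for get_job_status: reports "running" twice, then "completed"."""
    replies = iter(["running", "running", "completed"])

    def get_status(job_id: str) -> dict:
        return {"job_id": job_id, "state": next(replies)}

    return get_status
```

In real use, `get_status` would wrap whatever MCP client invokes the `get_job_status` tool, and a successful result would be followed by `inspect_document_manifest` and `fetch_document_asset`.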
Tech Stack
| Category | Technology |
|---|---|
| Language | Python 3.10+ |
| ETL | PyMuPDF (fitz) |
| RAG | LightRAG (lightrag-hku) |
| MCP | FastMCP |
| Storage | Local filesystem (JSON/Markdown/PNG) |
Documentation
- Technical Spec - Detailed technical specification
- Architecture - System architecture
- Constitution - Project principles
License