MCP server for architecture-specific entity extraction using GLiNER with TOGAF ADM phase awareness
🏗️ Architecture-Focused MCP GLiNER Server
Enhanced GLiNER-based entity extraction for software architecture documents with TOGAF ADM phase awareness and role-based contextual processing, implemented as an MCP (Model Context Protocol) server.
🎯 Overview
This project wraps the GLiNER NER model in a specialized MCP server for analyzing software architecture documents. Entity extraction takes TOGAF ADM phases, architect roles, and contextual relationships into account, so the extracted entities can be used to retrieve relevant guidelines and support architecture decisions.
The server implements the MCP protocol, making it compatible with Claude Desktop, Cody, and other MCP-enabled clients.
Key Differentiators
- Architecture-Specific Intelligence: Pre-configured with 200+ architecture-specific entity labels
- TOGAF ADM Integration: Phase-aware processing for all 10 TOGAF ADM phases
- Role-Based Filtering: Customized extraction for different architect roles
- Contextual Scoring: Relevance-weighted entity scoring based on project context
- Document Classification: Automatic analysis of document type and complexity
🚀 Quick Start
🎯 Easiest Way (Using uvx)
# Install uvx if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh
# Run the MCP server directly (no installation needed!)
uvx mcp-architecture-gliner
Then configure your MCP client (like Claude Desktop) to use:
{
  "mcpServers": {
    "architecture-gliner": {
      "command": "uvx",
      "args": ["mcp-architecture-gliner"]
    }
  }
}
🔧 Development Setup
Prerequisites
- Python 3.9+
- macOS (optimized for Apple Silicon M1/M2/M3)
- 8GB+ RAM (16GB+ recommended for large models)
Installation
# Clone the repository
git clone <repository-url>
cd mcp-gliner
# Install uv (if not already installed)
brew install uv
# Create and activate environment
uv venv
source .venv/bin/activate
# Install dependencies
uv pip install -r requirements.txt
Start the MCP Server
Option 1: Using uvx (Recommended - after publishing)
# Run directly with uvx (no installation needed)
uvx mcp-architecture-gliner
# The server will run in STDIO mode for MCP communication
Option 2: Local Development
# Clone and set up locally
git clone <repository-url>
cd mcp-gliner
# Install and run
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
python mcp_server.py
🧪 Usage Examples
1. MCP Client Integration
Claude Desktop Configuration
Option 1: Using uvx (Recommended) Add to your Claude Desktop MCP settings:
{
  "mcpServers": {
    "architecture-gliner": {
      "command": "uvx",
      "args": ["mcp-architecture-gliner"]
    }
  }
}
Option 2: Local Development
{
  "mcpServers": {
    "architecture-gliner": {
      "command": "python",
      "args": ["/absolute/path/to/mcp-gliner/mcp_server.py"],
      "env": {
        "HF_HUB_ENABLE_HF_TRANSFER": "0"
      }
    }
  }
}
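If other MCP servers are already configured, the entry has to be merged into the existing `mcpServers` object rather than replacing it. The sketch below shows one way to do that on an already loaded config dict. `add_server` is a hypothetical helper, not part of this package, and the location of the config file depends on the client, so reading and writing the file is left out.

```python
import json
from copy import deepcopy
from typing import Dict, List, Optional


def add_server(config: dict, name: str, command: str, args: List[str],
               env: Optional[Dict[str, str]] = None) -> dict:
    """Return a copy of an MCP client config with one server entry added.

    Hypothetical helper: existing entries under "mcpServers" are preserved.
    """
    merged = deepcopy(config)
    entry = {"command": command, "args": list(args)}
    if env:
        entry["env"] = dict(env)
    merged.setdefault("mcpServers", {})[name] = entry
    return merged


# Example config with an unrelated, made-up server already registered.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_server(existing, "architecture-gliner", "uvx", ["mcp-architecture-gliner"])
print(json.dumps(updated, indent=2))
```

Working on a copy keeps the original config untouched until the merged result has been inspected.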
Available MCP Tools
The server provides 6 MCP tools:
- extract_architecture_entities: Extract entities with phase/role context
- analyze_architecture_document: Analyze document type and complexity
- get_architecture_labels: Get available labels by category
- get_phase_specific_labels: Get TOGAF phase-specific labels
- get_role_specific_labels: Get role-specific labels
- get_gliner_model_info: Get model capabilities
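MCP clients invoke tools with JSON-RPC 2.0 `tools/call` requests. As a rough illustration of what such a request looks like on the wire, the sketch below builds one for `extract_architecture_entities`. The argument names (`text`, `phase`, `role`) are an assumption taken from the REST request model in this README; the tool's actual input schema is not shown here and may differ. `build_tool_call` is a hypothetical helper.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


payload = build_tool_call(
    1,
    "extract_architecture_entities",
    # Assumed argument names, mirroring the REST request model.
    {
        "text": "The solution architect designed a microservices architecture.",
        "phase": "architecture_vision",
        "role": "solution_architect",
    },
)
print(payload)
```

In practice an MCP client library performs the initialization handshake and sends these messages for you; building them by hand is mainly useful for debugging.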
2. Direct Usage (for testing)
import asyncio

from tools.architecture_gliner.models import ArchitectureExtractor


async def extract_example():
    extractor = ArchitectureExtractor(model_size='medium-v2.1')
    text = """
    The solution architect designed a microservices architecture
    using Docker containers and Kubernetes orchestration to meet
    scalability and availability requirements.
    """
    entities = extractor.extract_entities(text)
    # Print each extracted span with its label and confidence score
    for entity in entities:
        print(f"• {entity['text']} → {entity['label']} (score: {entity['score']:.3f})")


asyncio.run(extract_example())
2. Phase-Aware Processing
# Extract entities relevant to Architecture Vision phase
entities = extractor.extract_entities(
    text=architecture_document,
    phase="architecture_vision",  # TOGAF ADM Phase A
    include_context=True
)

# Filter for high-relevance entities
relevant_entities = [
    e for e in entities
    if e.get('phase_relevant', False) and e.get('contextual_score', 0) > 0.7
]
3. Role-Based Filtering
# Extract entities from a Solution Architect's perspective
solution_entities = extractor.extract_entities(
    text=technical_spec,
    role="solution_architect",
    categories=["patterns", "quality_attributes", "technical_context"]
)
4. Document Analysis
# Automatically analyze document characteristics
analysis = extractor.analyze_document_type(document_text)
print(f"Document complexity: {analysis['document_complexity']}")
print(f"Suggested model: {analysis['suggested_model']}")
print(f"Likely phases: {analysis['likely_phases'][:3]}")
5. API Usage
# Extract entities via REST API
curl -X POST "http://localhost:8000/extract-entities" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "The enterprise architect defined microservices patterns for scalability.",
    "phase": "architecture_vision",
    "role": "enterprise_architect",
    "threshold": 0.5
  }'

# Get phase-specific labels
curl "http://localhost:8000/labels/phase/business_architecture"

# Analyze document
curl -X POST "http://localhost:8000/analyze-document" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Technical specification document content...",
    "threshold": 0.3
  }'
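The same extraction request can be issued from Python using only the standard library. This is a minimal sketch that assumes the REST server is reachable at the host and port used in the curl examples; `build_extract_request` is a hypothetical helper, and the actual network call is commented out because it needs a running server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same host and port as the curl examples


def build_extract_request(text, phase=None, role=None, threshold=0.5):
    """Build (but do not send) a POST request for /extract-entities."""
    body = {"text": text, "phase": phase, "role": role, "threshold": threshold}
    return urllib.request.Request(
        f"{BASE_URL}/extract-entities",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_extract_request(
    "The enterprise architect defined microservices patterns for scalability.",
    phase="architecture_vision",
    role="enterprise_architect",
)

# Sending requires a running server, so the call is left commented out:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```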
📚 Architecture Categories
The system recognizes entities across multiple architecture domains:
TOGAF ADM Phases
- Preliminary Phase
- Architecture Vision (Phase A)
- Business Architecture (Phase B)
- Information Systems Architecture (Phase C)
- Technology Architecture (Phase D)
- Opportunities & Solutions (Phase E)
- Migration Planning (Phase F)
- Implementation Governance (Phase G)
- Architecture Change Management (Phase H)
- Requirements Management
Architecture Roles
- Enterprise Architect
- Solution Architect
- Business Architect
- Data Architect
- Application Architect
- Technology Architect
- Security Architect
- Infrastructure Architect
Architecture Patterns
- Layered Architecture
- Microservices Architecture
- Service-Oriented Architecture
- Event-Driven Architecture
- Serverless Architecture
- Hexagonal Architecture
- Clean Architecture
Quality Attributes
- Performance
- Scalability
- Availability
- Reliability
- Security
- Maintainability
- Usability
- Portability
🔧 Configuration
Model Selection
# Choose model size based on your needs
extractors = {
    'base': ArchitectureExtractor(model_size='base'),           # ~100M params, fastest
    'medium': ArchitectureExtractor(model_size='medium-v2.1'),  # ~300M params, balanced
    'large': ArchitectureExtractor(model_size='large-v2.1'),    # ~1B params, most accurate
}
Custom Labels
Edit tools/architecture_gliner/config/architecture_labels.yaml to add organization-specific terms:
architecture_labels:
  custom_patterns:
    - "my-company-pattern"
    - "legacy-integration-pattern"
  custom_technologies:
    - "proprietary-platform"
    - "internal-framework"
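Once parsed, custom categories like these have to be combined with the built-in labels before they are passed to the model. How the extractor does this internally is not documented in this README, so the following is only a sketch of one plausible merge. Plain dicts stand in for the parsed YAML, and `merge_labels` and the built-in sample are hypothetical.

```python
from typing import Dict, List


def merge_labels(builtin: Dict[str, List[str]], custom: Dict[str, List[str]]) -> List[str]:
    """Flatten label categories into one list, dropping case-insensitive duplicates."""
    seen = set()
    merged = []
    for categories in (builtin, custom):
        for labels in categories.values():
            for label in labels:
                key = label.lower()
                if key not in seen:
                    seen.add(key)
                    merged.append(label)
    return merged


# Stand-in for the built-in labels (illustrative subset only).
builtin_labels = {"patterns": ["microservices architecture", "layered architecture"]}
# Stand-in for the parsed contents of architecture_labels.yaml.
custom_labels = {
    "custom_patterns": ["my-company-pattern", "legacy-integration-pattern"],
    "custom_technologies": ["proprietary-platform", "internal-framework"],
}
labels = merge_labels(builtin_labels, custom_labels)
print(labels)
```

Built-in labels come first, so a custom label that differs from a built-in one only in letter case is dropped rather than duplicated.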
🔍 API Reference
Core Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /extract-entities | POST | Extract architecture entities with context |
| /analyze-document | POST | Analyze document type and characteristics |
| /labels | GET | Get available architecture labels |
| /labels/phase/{phase} | GET | Get phase-specific labels |
| /labels/role/{role} | GET | Get role-specific labels |
| /model-info | GET | Get model capabilities and information |
| /health | GET | Health check endpoint |
Request/Response Models
Entity Extraction Request
{
  "text": "string",
  "labels": ["string"] | null,
  "phase": "string" | null,
  "role": "string" | null,
  "categories": ["string"] | null,
  "threshold": 0.5,
  "model_size": "medium-v2.1",
  "include_context": true
}
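On the client side, the request shape above can be mirrored with a small dataclass so that defaults are applied consistently. This is a sketch, not the server's own model: the field names and defaults are copied from the request model, while the validation rules in `__post_init__` are assumptions.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class ExtractionRequest:
    """Client-side mirror of the entity extraction request body."""
    text: str
    labels: Optional[List[str]] = None
    phase: Optional[str] = None
    role: Optional[str] = None
    categories: Optional[List[str]] = None
    threshold: float = 0.5
    model_size: str = "medium-v2.1"
    include_context: bool = True

    def __post_init__(self):
        # Assumed constraints; the server may enforce different ones.
        if not self.text.strip():
            raise ValueError("text must not be empty")
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")


request = ExtractionRequest(
    text="The enterprise architect defined microservices patterns.",
    phase="architecture_vision",
)
payload = asdict(request)  # ready to serialize with json.dumps
```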
Entity Response
{
  "entities": [
    {
      "text": "microservices architecture",
      "label": "microservices architecture",
      "start": 45,
      "end": 70,
      "score": 0.92,
      "categories": ["patterns"],
      "phase_relevant": true,
      "contextual_score": 0.98
    }
  ],
  "total_count": 1,
  "processing_info": {
    "model_size": "medium-v2.1",
    "phase_context": "architecture_vision",
    "role_context": "solution_architect"
  }
}
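A response of this shape is easy to post-process. The sketch below groups entity texts by category and keeps only those above a score cutoff, preferring `contextual_score` and falling back to `score` when it is absent. `group_by_category` and the 0.7 default cutoff are illustrative choices, not part of the API.

```python
import json
from collections import defaultdict

# Abbreviated copy of the sample response above.
response = json.loads("""
{
  "entities": [
    {"text": "microservices architecture", "label": "microservices architecture",
     "score": 0.92, "categories": ["patterns"], "phase_relevant": true,
     "contextual_score": 0.98}
  ],
  "total_count": 1
}
""")


def group_by_category(response: dict, min_score: float = 0.7) -> dict:
    """Map each category to the texts of entities scoring at least min_score."""
    grouped = defaultdict(list)
    for entity in response.get("entities", []):
        score = entity.get("contextual_score", entity.get("score", 0.0))
        if score < min_score:
            continue
        for category in entity.get("categories", []):
            grouped[category].append(entity["text"])
    return dict(grouped)


grouped = group_by_category(response)
print(grouped)
```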
🧪 Examples
Run the comprehensive examples:
# Direct Python usage examples
python examples/architecture_example.py
# API client examples
python examples/api_client_example.py
🏗️ Architecture
mcp-gliner/
├── tools/architecture_gliner/ # Core implementation
│ ├── models/ # GLiNER integration
│ │ └── architecture_extractor.py # Main extraction logic
│ ├── config/ # Configuration files
│ │ └── architecture_labels.yaml # Architecture-specific labels
│ └── tool.py # FastMCP tool integration
├── examples/ # Usage examples
├── server.py # FastAPI server
└── requirements.txt # Dependencies
🔬 Research Applications
1. Contextual Architecture Guidance
Extract entities to provide relevant architecture guidelines based on:
- Current project phase (TOGAF ADM)
- Architect role and responsibilities
- Document type and complexity
2. Architecture Knowledge Graph
Build relationships between:
- Architecture patterns and quality attributes
- Business requirements and technical solutions
- Stakeholders and architectural decisions
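As a rough illustration of the first relationship type, the sketch below links each extracted pattern to the quality attributes that appear in the same document. It assumes entities carry the `categories` field shown in the entity response, with the category names `patterns` and `quality_attributes` used in the role-based example. Co-occurrence is a deliberately simple stand-in for real relationship extraction, and the sample data is made up.

```python
from collections import defaultdict
from itertools import product


def link_patterns_to_qualities(documents):
    """Map each pattern to the quality attributes co-occurring in the same document."""
    edges = defaultdict(set)
    for entities in documents:
        patterns = [e["text"] for e in entities if "patterns" in e.get("categories", [])]
        qualities = [e["text"] for e in entities
                     if "quality_attributes" in e.get("categories", [])]
        for pattern, quality in product(patterns, qualities):
            edges[pattern].add(quality)
    return {pattern: sorted(qualities) for pattern, qualities in edges.items()}


# Hypothetical extraction results for two documents.
documents = [
    [
        {"text": "microservices architecture", "categories": ["patterns"]},
        {"text": "scalability", "categories": ["quality_attributes"]},
        {"text": "availability", "categories": ["quality_attributes"]},
    ],
    [
        {"text": "event-driven architecture", "categories": ["patterns"]},
        {"text": "scalability", "categories": ["quality_attributes"]},
    ],
]
graph = link_patterns_to_qualities(documents)
print(graph)
```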
3. Document Quality Assurance
Automatically validate:
- Completeness of architectural artifacts
- Compliance with architecture principles
- Consistency across document sets
4. Architecture Decision Support
Enable intelligent retrieval of:
- Relevant design patterns
- Best practices and guidelines
- Historical decisions and rationale
🚧 Future Roadmap
- Architecture Guidelines Database: Integrate comprehensive knowledge base
- Relationship Extraction: Identify connections between architectural concepts
- Multi-document Analysis: Cross-reference entities across document sets
- Template Generation: Create structured outputs for deliverables
- RAG Integration: Context-aware architecture guidance retrieval
- Compliance Checking: Automated validation against standards
- Export Capabilities: Generate reports in multiple formats
🤝 Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- GLiNER - Zero-shot Named Entity Recognition
- FastAPI - Modern Python web framework
- TOGAF - Enterprise Architecture methodology
- FastMCP - MCP server framework
Built for architects, by architects 🏗️