MCP server for architecture-specific entity extraction using GLiNER with TOGAF ADM phase awareness

🏗️ Architecture-Focused MCP GLiNER Server

Enhanced GLiNER-based entity extraction for software architecture documents with TOGAF ADM phase awareness and role-based contextual processing, implemented as an MCP (Model Context Protocol) server.

🎯 Overview

This project wraps the GLiNER NER model in an MCP server specialized for software architecture document analysis. Its entity extraction takes TOGAF ADM phases, architecture roles, and contextual relationships into account, so that architects can retrieve relevant guidelines and make better-informed decisions.

The server implements the MCP protocol, making it compatible with Claude Desktop, Cody, and other MCP-enabled clients.

Key Differentiators

  • Architecture-Specific Intelligence: Pre-configured with 200+ architecture-specific entity labels
  • TOGAF ADM Integration: Phase-aware processing for all 10 TOGAF ADM phases
  • Role-Based Filtering: Customized extraction for different architect roles
  • Contextual Scoring: Relevance-weighted entity scoring based on project context
  • Document Classification: Automatic analysis of document type and complexity
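
The README does not describe how contextual scoring is computed. As a rough illustration only, one plausible scheme is to add a small boost to the raw model score when an entity's label belongs to the active phase or role label set, capped at 1.0. The function name, the `boost` value, and the additive rule below are all assumptions, not the project's actual implementation.

```python
def contextual_score(base_score, label, phase_labels=(), role_labels=(), boost=0.05):
    """Hypothetical relevance weighting: boost labels that match the context.

    base_score   -- raw confidence returned by the NER model (0.0 to 1.0)
    phase_labels -- labels considered relevant to the active TOGAF phase
    role_labels  -- labels considered relevant to the active architect role
    boost        -- assumed increment applied once per matching context
    """
    score = base_score
    if label in phase_labels:
        score += boost
    if label in role_labels:
        score += boost
    # Keep the result a valid confidence value
    return min(score, 1.0)


if __name__ == "__main__":
    phase = {"microservices architecture", "stakeholder"}
    print(contextual_score(0.80, "microservices architecture", phase_labels=phase))
```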

🚀 Quick Start

🎯 Easiest Way (Using uvx)

# Install uv (which provides the uvx command) if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh

# Run the MCP server directly (no installation needed!)
uvx mcp-architecture-gliner

Then configure your MCP client (like Claude Desktop) to use:

{
  "mcpServers": {
    "architecture-gliner": {
      "command": "uvx",
      "args": ["mcp-architecture-gliner"]
    }
  }
}

🔧 Development Setup

Prerequisites

  • Python 3.9+
  • macOS (optimized for Apple Silicon M1/M2/M3)
  • 8GB+ RAM (16GB+ recommended for large models)

Installation

# Clone the repository
git clone <repository-url>
cd mcp-gliner

# Install uv (if not already installed)
brew install uv

# Create and activate environment
uv venv
source .venv/bin/activate

# Install dependencies
uv pip install -r requirements.txt

Start the MCP Server

Option 1: Using uvx (Recommended - after publishing)

# Run directly with uvx (no installation needed)
uvx mcp-architecture-gliner

# The server will run in STDIO mode for MCP communication

Option 2: Local Development

# Clone and set up locally
git clone <repository-url>
cd mcp-gliner

# Install and run
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
python mcp_server.py

🧪 Usage Examples

1. MCP Client Integration

Claude Desktop Configuration

Option 1: Using uvx (Recommended)

Add to your Claude Desktop MCP settings:

{
  "mcpServers": {
    "architecture-gliner": {
      "command": "uvx",
      "args": ["mcp-architecture-gliner"]
    }
  }
}

Option 2: Local Development

{
  "mcpServers": {
    "architecture-gliner": {
      "command": "python",
      "args": ["/absolute/path/to/mcp-gliner/mcp_server.py"],
      "env": {
        "HF_HUB_ENABLE_HF_TRANSFER": "0"
      }
    }
  }
}
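
If you prefer to script the configuration rather than edit JSON by hand, the following is a minimal sketch that merges the server entry into an existing settings file without touching other entries. The helper name `register_server` is hypothetical, and the location of the settings file depends on your MCP client, so the path is passed in rather than assumed.

```python
import json
from pathlib import Path


def register_server(config_path, name="architecture-gliner",
                    command="uvx", args=("mcp-architecture-gliner",)):
    """Add or update one entry under "mcpServers", preserving the rest."""
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    # Treat a missing or empty file as an empty configuration
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `register_server("/path/to/your/client/config.json")` writes the same entry shown under Option 1; pass `command="python"` and the absolute path to `mcp_server.py` in `args` for the local development variant.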

Available MCP Tools

The server provides 6 MCP tools:

  1. extract_architecture_entities - Extract entities with phase/role context
  2. analyze_architecture_document - Analyze document type and complexity
  3. get_architecture_labels - Get available labels by category
  4. get_phase_specific_labels - Get TOGAF phase-specific labels
  5. get_role_specific_labels - Get role-specific labels
  6. get_gliner_model_info - Get model capabilities

2. Direct Usage (for testing)

from tools.architecture_gliner.models import ArchitectureExtractor

# Create the extractor; model_size selects which GLiNER model is used
extractor = ArchitectureExtractor(model_size='medium-v2.1')

text = """
The solution architect designed a microservices architecture
using Docker containers and Kubernetes orchestration to meet
scalability and availability requirements.
"""

# extract_entities is called synchronously, so no event loop is needed
entities = extractor.extract_entities(text)

# Each entity is a dict with the matched text, its label, and a confidence score
for entity in entities:
    print(f"• {entity['text']} -> {entity['label']} (score: {entity['score']:.3f})")

2. Phase-Aware Processing

# Extract entities relevant to Architecture Vision phase
entities = extractor.extract_entities(
    text=architecture_document,
    phase="architecture_vision",  # TOGAF ADM Phase A
    include_context=True
)

# Filter for high-relevance entities
relevant_entities = [
    e for e in entities 
    if e.get('phase_relevant', False) and e.get('contextual_score', 0) > 0.7
]

3. Role-Based Filtering

# Extract entities from a Solution Architect's perspective
solution_entities = extractor.extract_entities(
    text=technical_spec,
    role="solution_architect",
    categories=["patterns", "quality_attributes", "technical_context"]
)

4. Document Analysis

# Automatically analyze document characteristics
analysis = extractor.analyze_document_type(document_text)

print(f"Document complexity: {analysis['document_complexity']}")
print(f"Suggested model: {analysis['suggested_model']}")
print(f"Likely phases: {analysis['likely_phases'][:3]}")

5. API Usage

# Extract entities via REST API
# (requires the FastAPI server, server.py, to be running on localhost:8000)
curl -X POST "http://localhost:8000/extract-entities" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "The enterprise architect defined microservices patterns for scalability.",
    "phase": "architecture_vision",
    "role": "enterprise_architect",
    "threshold": 0.5
  }'

# Get phase-specific labels
curl "http://localhost:8000/labels/phase/business_architecture"

# Analyze document
curl -X POST "http://localhost:8000/analyze-document" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Technical specification document content...",
    "threshold": 0.3
  }'
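
The same extraction request can be issued from Python using only the standard library. This is a sketch under the same assumption as the curl commands (a server listening on `http://localhost:8000`); the helper names `build_extract_request` and `extract_entities` are illustrative and not part of the package.

```python
import json
import urllib.request


def build_extract_request(text, phase=None, role=None, threshold=0.5,
                          base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /extract-entities."""
    payload = {"text": text, "threshold": threshold}
    # Only include optional context fields when they are set
    if phase is not None:
        payload["phase"] = phase
    if role is not None:
        payload["role"] = role
    return urllib.request.Request(
        url=f"{base_url}/extract-entities",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_entities(text, **kwargs):
    """Send the request and decode the JSON body (needs a running server)."""
    request = build_extract_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Separating request construction from sending keeps the payload easy to inspect or unit-test without a live server.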

📚 Architecture Categories

The system recognizes entities across multiple architecture domains:

TOGAF ADM Phases

  • Preliminary Phase
  • Architecture Vision (Phase A)
  • Business Architecture (Phase B)
  • Information Systems Architecture (Phase C)
  • Technology Architecture (Phase D)
  • Opportunities & Solutions (Phase E)
  • Migration Planning (Phase F)
  • Implementation Governance (Phase G)
  • Architecture Change Management (Phase H)
  • Requirements Management

Architecture Roles

  • Enterprise Architect
  • Solution Architect
  • Business Architect
  • Data Architect
  • Application Architect
  • Technology Architect
  • Security Architect
  • Infrastructure Architect

Architecture Patterns

  • Layered Architecture
  • Microservices Architecture
  • Service-Oriented Architecture
  • Event-Driven Architecture
  • Serverless Architecture
  • Hexagonal Architecture
  • Clean Architecture

Quality Attributes

  • Performance
  • Scalability
  • Availability
  • Reliability
  • Security
  • Maintainability
  • Usability
  • Portability

🔧 Configuration

Model Selection

from tools.architecture_gliner.models import ArchitectureExtractor

# Choose one model size based on your needs; constructing all three
# at once may keep three models in memory.
#   'base'         ~100M params, fastest
#   'medium-v2.1'  ~300M params, balanced
#   'large-v2.1'   ~1B params, most accurate
extractor = ArchitectureExtractor(model_size='medium-v2.1')

Custom Labels

Edit tools/architecture_gliner/config/architecture_labels.yaml to add organization-specific terms:

architecture_labels:
  custom_patterns:
    - "my-company-pattern"
    - "legacy-integration-pattern"
  
  custom_technologies:
    - "proprietary-platform"
    - "internal-framework"

🔍 API Reference

Core Endpoints

Endpoint                 Method   Description
/extract-entities        POST     Extract architecture entities with context
/analyze-document        POST     Analyze document type and characteristics
/labels                  GET      Get available architecture labels
/labels/phase/{phase}    GET      Get phase-specific labels
/labels/role/{role}      GET      Get role-specific labels
/model-info              GET      Get model capabilities and information
/health                  GET      Health check endpoint

Request/Response Models

Entity Extraction Request

{
  "text": "string",
  "labels": ["string"] | null,
  "phase": "string" | null,
  "role": "string" | null,
  "categories": ["string"] | null,
  "threshold": 0.5,
  "model_size": "medium-v2.1",
  "include_context": true
}

Entity Response

{
  "entities": [
    {
      "text": "microservices architecture",
      "label": "microservices architecture",
      "start": 45,
      "end": 71,
      "score": 0.92,
      "categories": ["patterns"],
      "phase_relevant": true,
      "contextual_score": 0.98
    }
  ],
  "total_count": 1,
  "processing_info": {
    "model_size": "medium-v2.1",
    "phase_context": "architecture_vision",
    "role_context": "solution_architect"
  }
}
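
A response in this shape is straightforward to post-process. The sketch below groups phase-relevant entities by category, using the same `contextual_score > 0.7` cut-off as the phase-aware example earlier. The function name `summarize_entities` and the `"uncategorized"` fallback are illustrative choices, not part of the API.

```python
from collections import defaultdict


def summarize_entities(response, min_contextual_score=0.7):
    """Group the text of phase-relevant, high-scoring entities by category."""
    by_category = defaultdict(list)
    for entity in response.get("entities", []):
        if not entity.get("phase_relevant", False):
            continue
        if entity.get("contextual_score", 0.0) <= min_contextual_score:
            continue
        # Fall back to a placeholder bucket when no category is attached
        for category in entity.get("categories") or ["uncategorized"]:
            by_category[category].append(entity["text"])
    return dict(by_category)


sample_response = {
    "entities": [
        {
            "text": "microservices architecture",
            "label": "microservices architecture",
            "score": 0.92,
            "categories": ["patterns"],
            "phase_relevant": True,
            "contextual_score": 0.98,
        }
    ],
    "total_count": 1,
}
print(summarize_entities(sample_response))
```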

🧪 Examples

Run the comprehensive examples:

# Direct Python usage examples
python examples/architecture_example.py

# API client examples
python examples/api_client_example.py

🏗️ Architecture

mcp-gliner/
├── tools/architecture_gliner/           # Core implementation
│   ├── models/                          # GLiNER integration
│   │   └── architecture_extractor.py    # Main extraction logic
│   ├── config/                          # Configuration files
│   │   └── architecture_labels.yaml     # Architecture-specific labels
│   └── tool.py                          # FastMCP tool integration
├── examples/                            # Usage examples
├── server.py                           # FastAPI server
└── requirements.txt                    # Dependencies

🔬 Research Applications

1. Contextual Architecture Guidance

Extract entities to provide relevant architecture guidelines based on:

  • Current project phase (TOGAF ADM)
  • Architect role and responsibilities
  • Document type and complexity

2. Architecture Knowledge Graph

Build relationships between:

  • Architecture patterns and quality attributes
  • Business requirements and technical solutions
  • Stakeholders and architectural decisions
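
As one simple starting point for such a graph, relationships can be approximated by co-occurrence: if a pattern and a quality attribute are extracted from the same document, an edge between them is counted. This is a sketch of that idea only, not a feature of the server; it assumes entities carry the `label` and `categories` fields shown in the Entity Response, and the function name is hypothetical.

```python
from collections import Counter
from itertools import product


def cooccurrence_edges(documents, source_category="patterns",
                       target_category="quality_attributes"):
    """Count (source label, target label) pairs that appear in the same document.

    documents -- iterable of entity lists, one list per document
    """
    edges = Counter()
    for entities in documents:
        sources = {e["label"] for e in entities
                   if source_category in e.get("categories", [])}
        targets = {e["label"] for e in entities
                   if target_category in e.get("categories", [])}
        # Each pair counts at most once per document
        edges.update(product(sorted(sources), sorted(targets)))
    return edges
```

The resulting counter can be loaded into any graph library as weighted edges; true relationship extraction, listed on the roadmap below, would need more than co-occurrence.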

3. Document Quality Assurance

Automatically validate:

  • Completeness of architectural artifacts
  • Compliance with architecture principles
  • Consistency across document sets

4. Architecture Decision Support

Enable intelligent retrieval of:

  • Relevant design patterns
  • Best practices and guidelines
  • Historical decisions and rationale

🚧 Future Roadmap

  • Architecture Guidelines Database: Integrate comprehensive knowledge base
  • Relationship Extraction: Identify connections between architectural concepts
  • Multi-document Analysis: Cross-reference entities across document sets
  • Template Generation: Create structured outputs for deliverables
  • RAG Integration: Context-aware architecture guidance retrieval
  • Compliance Checking: Automated validation against standards
  • Export Capabilities: Generate reports in multiple formats

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • GLiNER - Zero-shot Named Entity Recognition
  • FastAPI - Modern Python web framework
  • TOGAF - Enterprise Architecture methodology
  • FastMCP - MCP server framework

Built for architects, by architects 🏗️
