MCP server for Rhubarb document and video understanding capabilities

Rhubarb MCP Server

A dedicated MCP server package that exposes all of Rhubarb's document and video understanding capabilities through the Model Context Protocol (MCP).

This is a standalone package designed for easy integration with MCP-compatible AI assistants like Cline, Claude Desktop, and other MCP clients.

✨ Features

🔧 Tools (8 total)

  • analyze_document - Multi-modal document Q&A, summarization, structured extraction
  • stream_document_chat - Streaming conversations with chat history
  • extract_entities - Named Entity Recognition with 50+ built-in entities + PII detection
  • generate_extraction_schema - AI-assisted JSON schema generation
  • create_classification_samples - Vector sample creation for document classification
  • classify_document - Document classification using pre-trained samples
  • view_classification_sample - Classification sample inspection
  • analyze_video - Video content analysis with Amazon Nova models
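MCP clients invoke tools like these with JSON-RPC `tools/call` requests. The sketch below builds such a request for `analyze_document` using only the standard library. The argument names (`file_path`, `message`) are hypothetical placeholders, not the server's confirmed input schema; a real client would read the schema from the server's `tools/list` response first.

```python
import json
from itertools import count

# JSON-RPC ids only need to be unique within one client session.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical argument names; check the tool's real input schema before use.
request = build_tool_call(
    "analyze_document",
    {
        "file_path": "s3://my-bucket/report.pdf",
        "message": "Summarize the key findings.",
    },
)
print(json.dumps(request, indent=2))
```

In practice the MCP client (Cline, Claude Desktop, and so on) constructs and sends this message for you; the sketch only shows what travels over the wire.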

📚 Resources (4 available)

  • rhubarb://entities/built-in - List of 50+ built-in entity types
  • rhubarb://models/supported - Supported Bedrock models and capabilities
  • rhubarb://schemas/built-in/{type} - Built-in schemas for common use cases
  • rhubarb://classification-samples/{bucket}/{id} - Classification sample details
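Two of these URIs are templates with placeholders. As a rough illustration, the sketch below fills the templates and wraps the result in an MCP `resources/read` request. The URI strings come from the list above; the sample values (`invoice`, `my-docs`, `sample-123`) and the helper names are made up for the example.

```python
# URI strings copied from the resource list; dictionary keys are this sketch's own.
RESOURCE_TEMPLATES = {
    "entities": "rhubarb://entities/built-in",
    "models": "rhubarb://models/supported",
    "schema": "rhubarb://schemas/built-in/{type}",
    "classification_sample": "rhubarb://classification-samples/{bucket}/{id}",
}


def resource_uri(name: str, **params: str) -> str:
    """Fill a resource URI template; raises KeyError if a placeholder is missing."""
    return RESOURCE_TEMPLATES[name].format(**params)


def build_resource_read(uri: str, request_id: int = 1) -> dict:
    """Build an MCP `resources/read` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }


# "invoice" is a hypothetical schema type used only for illustration.
print(resource_uri("schema", type="invoice"))
print(build_resource_read(resource_uri("models")))
```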

🚀 Quick Start

No Installation Required!

The MCP server automatically installs when first used through uvx or pipx:

# Test the server (optional)
uvx pyrhubarb-mcp@latest --check-deps

# Run with AWS profile (via environment variable)
AWS_PROFILE=my-profile uvx pyrhubarb-mcp@latest

# Run with access keys (via environment variables)
AWS_ACCESS_KEY_ID=AKIA... AWS_SECRET_ACCESS_KEY=secret... uvx pyrhubarb-mcp@latest
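If you prefer to drive the dependency check from a script, the same commands can be wrapped in Python. This is a minimal sketch: the helper names are invented here, and it assumes `uvx` is on your PATH. Nothing runs until `check_deps()` is called.

```python
import os
import shutil
import subprocess
from typing import Optional


def server_command(*extra_args: str) -> list:
    """Return the uvx command line shown above, plus any extra flags."""
    return ["uvx", "pyrhubarb-mcp@latest", *extra_args]


def server_env(profile: Optional[str] = None) -> dict:
    """Copy the current environment, optionally setting AWS_PROFILE."""
    env = dict(os.environ)
    if profile:
        env["AWS_PROFILE"] = profile
    return env


def check_deps(profile: Optional[str] = None) -> int:
    """Run `--check-deps` and return the process exit code."""
    if shutil.which("uvx") is None:
        raise RuntimeError("uvx not found on PATH")
    result = subprocess.run(server_command("--check-deps"), env=server_env(profile))
    return result.returncode
```

Calling `check_deps("my-profile")` is then equivalent to `AWS_PROFILE=my-profile uvx pyrhubarb-mcp@latest --check-deps`.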

🔧 MCP Client Configuration

Cline Integration

{
  "rhubarb": {
    "command": "uvx",
    "args": [
      "pyrhubarb-mcp@latest", 
      "--aws-region", "us-east-1",
      "--default-model", "claude-sonnet"
    ],
    "env": {
      "AWS_PROFILE": "your-aws-profile"
    }
  }
}

Claude Desktop Integration

{
  "mcpServers": {
    "rhubarb": {
      "command": "uvx", 
      "args": [
        "pyrhubarb-mcp@latest",
        "--default-model", "claude-sonnet"
      ],
      "env": {
        "AWS_PROFILE": "your-aws-profile"
      }
    }
  }
}
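When several machines or profiles need the same setup, the configuration can be generated instead of hand-edited. The sketch below reproduces the Claude Desktop structure shown above; the function names are this example's own, and where the resulting JSON should be saved depends on your client.

```python
import json
from typing import Optional


def mcp_server_entry(
    profile: str,
    model: str = "claude-sonnet",
    region: Optional[str] = None,
) -> dict:
    """Build one server entry using the flags documented in this README."""
    args = ["pyrhubarb-mcp@latest"]
    if region:
        args += ["--aws-region", region]
    args += ["--default-model", model]
    return {"command": "uvx", "args": args, "env": {"AWS_PROFILE": profile}}


def claude_desktop_config(profile: str, **options) -> dict:
    """Wrap the entry in the `mcpServers` object used by Claude Desktop."""
    return {"mcpServers": {"rhubarb": mcp_server_entry(profile, **options)}}


print(json.dumps(claude_desktop_config("your-aws-profile"), indent=2))
```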

Using Access Keys (for CI/CD)

{
  "rhubarb": {
    "command": "uvx",
    "args": [
      "pyrhubarb-mcp@latest",
      "--aws-region", "us-west-2",
      "--default-model", "nova-pro"
    ],
    "env": {
      "AWS_ACCESS_KEY_ID": "AKIAIOSFODNN7EXAMPLE",
      "AWS_SECRET_ACCESS_KEY": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    }
  }
}

Alternative: Using pipx

{
  "rhubarb": {
    "command": "pipx",
    "args": [
      "run", "pyrhubarb-mcp",
      "--default-model", "claude-sonnet"
    ],
    "env": {
      "AWS_PROFILE": "your-profile"
    }
  }
}

📋 Configuration Options

Command-Line Arguments

| Argument | Description | Default |
|---|---|---|
| --aws-region | AWS region | us-east-1 |
| --enable-cri | Enable cross-region inference | false |
| --default-model | Default model | claude-sonnet |
| --default-bucket | S3 bucket for classification samples | - |
| --check-deps | Check dependencies and exit | - |
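To make the defaults concrete, here is a sketch of an `argparse` parser that mirrors the table. It is not the package's actual parser: in particular, treating `--enable-cri` and `--check-deps` as value-less boolean switches is an assumption based on the defaults listed.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented flags; an illustration, not the real implementation."""
    parser = argparse.ArgumentParser(prog="pyrhubarb-mcp")
    parser.add_argument("--aws-region", default="us-east-1", help="AWS region")
    # Assumed to be a switch: absent means false, present means true.
    parser.add_argument("--enable-cri", action="store_true",
                        help="Enable cross-region inference")
    parser.add_argument("--default-model", default="claude-sonnet",
                        help="Default model")
    parser.add_argument("--default-bucket", default=None,
                        help="S3 bucket for classification samples")
    parser.add_argument("--check-deps", action="store_true",
                        help="Check dependencies and exit")
    return parser


args = build_parser().parse_args(
    ["--aws-region", "us-west-2", "--default-model", "nova-pro"]
)
print(args.aws_region, args.default_model, args.enable_cri)
# prints: us-west-2 nova-pro False
```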

Environment Variables (AWS Credentials)

| Variable | Description | Example |
|---|---|---|
| AWS_PROFILE | AWS profile name | my-profile |
| AWS_ACCESS_KEY_ID | AWS access key ID | AKIAIOSFODNN7EXAMPLE |
| AWS_SECRET_ACCESS_KEY | AWS secret access key | wJalrXUtnFEMI/K7MDENG... |
| AWS_REGION | AWS region (can also use CLI arg) | us-east-1 |
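A quick preflight check can tell you which of these credential variables are set before you start the server. The sketch below only inspects the variables from the table; the function name and the order in which it checks them are choices made for this example, and it does not verify that the credentials are valid.

```python
import os
from typing import Mapping, Optional


def credential_source(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Report which documented credential variables are set, or None."""
    # Access keys only count when both halves of the pair are present.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "access-keys"
    if env.get("AWS_PROFILE"):
        return "profile"
    return None


source = credential_source()
print("credential source:", source or "none found in environment")
```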

Supported Models

| Model | Documents | Video | Best For |
|---|---|---|---|
| claude-opus | ✅ | ❌ | Complex reasoning, detailed analysis |
| claude-sonnet | ✅ | ❌ | Balanced performance and cost |
| claude-haiku | ✅ | ❌ | Fast, lightweight tasks |
| nova-pro | ✅ | ✅ | High-quality video analysis |
| nova-lite | ✅ | ✅ | Cost-effective video processing |
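The capability matrix translates directly into a lookup that can reject an unsuitable model before a request is sent. This sketch encodes the table as a dictionary; the key names (`documents`, `video`) and helper functions are illustrative and not part of the package.

```python
# Values transcribed from the Supported Models table.
MODEL_CAPABILITIES = {
    "claude-opus": {"documents": True, "video": False},
    "claude-sonnet": {"documents": True, "video": False},
    "claude-haiku": {"documents": True, "video": False},
    "nova-pro": {"documents": True, "video": True},
    "nova-lite": {"documents": True, "video": True},
}


def models_for(task: str) -> list:
    """List the models that support the given task ('documents' or 'video')."""
    return [name for name, caps in MODEL_CAPABILITIES.items() if caps[task]]


def require_support(model: str, task: str) -> None:
    """Raise ValueError if the model is unknown or cannot handle the task."""
    caps = MODEL_CAPABILITIES.get(model)
    if caps is None:
        raise ValueError(f"unknown model: {model}")
    if not caps[task]:
        raise ValueError(f"{model} does not support {task}; try {models_for(task)}")


print(models_for("video"))
# prints: ['nova-pro', 'nova-lite']
```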

💡 Usage Examples

Once configured, ask your AI assistant to use Rhubarb's tools:

Document Analysis

Use the analyze_document tool on s3://my-bucket/report.pdf to extract:
- Key findings
- Recommendations  
- Financial data

Structure the output as JSON with those three categories.

Entity Extraction

Use extract_entities on ./contract.pdf to find all:
- PERSON entities
- ORGANIZATION entities  
- MONEY amounts
- Important dates
- Any PII information

Video Analysis

Use analyze_video on s3://my-bucket/presentation.mp4 to:
- Summarize key points
- Extract any text shown on screen
- Identify main topics discussed

Document Classification

First create classification samples from ./training_manifest.csv in bucket 'my-docs'.
Then classify ./unknown_document.pdf using those samples.

๐Ÿ—๏ธ Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   MCP Client    │────│ pyrhubarb-mcp    │────│   pyrhubarb     │
│  (Cline, etc.)  │    │   (MCP Server)   │    │  (Core Library) │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                                         │
                                               ┌─────────────────┐
                                               │  Amazon Bedrock │
                                               │ (Claude & Nova) │
                                               └─────────────────┘

Key Benefits

  • 🔌 Plug & Play - No pre-installation, auto-installs on first use
  • 🐍 Native Python - Direct access to Rhubarb capabilities
  • 💬 Conversation Memory - Maintains chat history across interactions
  • 🔍 Resource Discovery - Built-in resources for exploration
  • ⚡ High Performance - Optimized for large documents and video processing

๐Ÿ› ๏ธ Development

Local Development

# Clone and install
git clone https://github.com/awslabs/rhubarb.git
cd rhubarb/pyrhubarb-mcp
poetry install

# Run locally  
poetry run pyrhubarb-mcp --check-deps

Dependencies

This package automatically installs:

  • pyrhubarb - Core Rhubarb document/video processing
  • fastmcp - MCP server framework
  • boto3 - AWS SDK for Python
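To confirm these packages are present in a given environment, you can query the installed distribution metadata. The sketch below is one way to do that with the standard library; it is not the implementation behind `--check-deps`, whose internals this README does not describe.

```python
from importlib import metadata

# Distribution names as listed above.
REQUIRED_DISTRIBUTIONS = ("pyrhubarb", "fastmcp", "boto3")


def installed_versions(names=REQUIRED_DISTRIBUTIONS) -> dict:
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


missing = [name for name, version in installed_versions().items() if version is None]
print("missing:", ", ".join(missing) if missing else "none")
```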

๐Ÿ” Security & Requirements

AWS Credentials

Requires AWS credentials with permissions for:

  • Amazon Bedrock - For Claude and Nova model access
  • Amazon S3 - For document/video storage (optional but recommended)

Supported Regions

Works in any AWS region with Amazon Bedrock availability. Commonly used regions:

  • us-east-1 (N. Virginia) - Default, supports all models
  • us-west-2 (Oregon) - All models available
  • eu-west-1 (Ireland) - Most models available

📄 License

Apache 2.0 - See LICENSE for details.

๐Ÿค Contributing

This package is part of the larger Rhubarb project. Please see the main CONTRIBUTING.md for contribution guidelines.
