MCP server for Rhubarb document and video understanding capabilities

Rhubarb MCP Server

A dedicated MCP server package that exposes all of Rhubarb's document and video understanding capabilities through the Model Context Protocol (MCP).

This is a standalone package designed for easy integration with MCP-compatible AI assistants like Cline, Claude Desktop, and other MCP clients.

✨ Features

🔧 Tools (8 total)

  • analyze_document - Multi-modal document Q&A, summarization, structured extraction
  • stream_document_chat - Streaming conversations with chat history
  • extract_entities - Named Entity Recognition with 50+ built-in entities + PII detection
  • generate_extraction_schema - AI-assisted JSON schema generation
  • create_classification_samples - Vector sample creation for document classification
  • classify_document - Document classification using pre-trained samples
  • view_classification_sample - Classification sample inspection
  • analyze_video - Video content analysis with Amazon Nova models

📚 Resources (4 available)

  • rhubarb://entities/built-in - List of 50+ built-in entity types
  • rhubarb://models/supported - Supported Bedrock models and capabilities
  • rhubarb://schemas/built-in/{type} - Built-in schemas for common use cases
  • rhubarb://classification-samples/{bucket}/{id} - Classification sample details
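
Two of the resource URIs above are templates with `{placeholder}` segments. A minimal sketch of filling them in on the client side follows; the helper name `expand_resource_uri` and the example values (`invoice`, `my-docs`, `sample-1`) are illustrative assumptions, not names taken from the package.

```python
import re

# Resource URIs exactly as listed above; placeholder names come from the README.
RESOURCE_TEMPLATES = [
    "rhubarb://entities/built-in",
    "rhubarb://models/supported",
    "rhubarb://schemas/built-in/{type}",
    "rhubarb://classification-samples/{bucket}/{id}",
]


def expand_resource_uri(template: str, **params: str) -> str:
    """Fill every {placeholder} in a template, failing loudly if one is missing."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing value for placeholder {name!r}")
        return params[name]

    return re.sub(r"\{(\w+)\}", substitute, template)


# "invoice", "my-docs" and "sample-1" are made-up values for illustration.
print(expand_resource_uri(RESOURCE_TEMPLATES[2], type="invoice"))
print(expand_resource_uri(RESOURCE_TEMPLATES[3], bucket="my-docs", id="sample-1"))
```

The resulting URI is what an MCP client would pass when reading a resource.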

🚀 Quick Start

No Installation Required!

There is no separate install step: uvx or pipx downloads the package the first time you run it. The commands below use uvx:

# Test the server (optional)
uvx pyrhubarb-mcp@latest --check-deps

# Run with AWS profile (via environment variable)
AWS_PROFILE=my-profile uvx pyrhubarb-mcp@latest

# Run with access keys (via environment variables)
AWS_ACCESS_KEY_ID=AKIA... AWS_SECRET_ACCESS_KEY=secret... uvx pyrhubarb-mcp@latest
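
If you want to start the server from a script rather than a shell, the same command can be assembled in Python. This is only a sketch: `build_launch` is a hypothetical helper, and it uses just the package name, the `--aws-region` flag and the `AWS_PROFILE` variable shown above.

```python
import os
import shutil


def build_launch(profile: str, region: str = "us-east-1") -> tuple[list[str], dict[str, str]]:
    """Return the argv and environment for starting the server through uvx."""
    argv = ["uvx", "pyrhubarb-mcp@latest", "--aws-region", region]
    env = dict(os.environ)        # inherit PATH and the rest of the environment
    env["AWS_PROFILE"] = profile  # credentials travel via environment variables
    return argv, env


argv, env = build_launch("my-profile")
print(" ".join(argv))
if shutil.which("uvx") is None:
    print("uvx was not found on PATH")
# To actually start the server: subprocess.run(argv, env=env, check=True)
```

Building the argv as a list avoids shell quoting issues when the profile or region comes from user input.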

🔧 MCP Client Configuration

Cline Integration

{
  "rhubarb": {
    "command": "uvx",
    "args": [
      "pyrhubarb-mcp@latest", 
      "--aws-region", "us-east-1",
      "--default-model", "claude-sonnet"
    ],
    "env": {
      "AWS_PROFILE": "your-aws-profile"
    }
  }
}

Claude Desktop Integration

{
  "mcpServers": {
    "rhubarb": {
      "command": "uvx", 
      "args": [
        "pyrhubarb-mcp@latest",
        "--default-model", "claude-sonnet"
      ],
      "env": {
        "AWS_PROFILE": "your-aws-profile"
      }
    }
  }
}

Using Access Keys (for CI/CD)

{
  "rhubarb": {
    "command": "uvx",
    "args": [
      "pyrhubarb-mcp@latest",
      "--aws-region", "us-west-2",
      "--default-model", "nova-pro"
    ],
    "env": {
      "AWS_ACCESS_KEY_ID": "AKIAIOSFODNN7EXAMPLE",
      "AWS_SECRET_ACCESS_KEY": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    }
  }
}

Alternative: Using pipx

{
  "rhubarb": {
    "command": "pipx",
    "args": [
      "run", "pyrhubarb-mcp",
      "--default-model", "claude-sonnet"
    ],
    "env": {
      "AWS_PROFILE": "your-profile"
    }
  }
}
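
The configuration snippets above all share one shape, so they can be generated rather than hand-edited. The sketch below assumes a hypothetical helper `server_entry`; the keys and flags it emits are the ones used in the examples above, and the `mcpServers` wrapper follows the Claude Desktop example.

```python
import json
from typing import Optional


def server_entry(profile: str, model: str = "claude-sonnet", region: Optional[str] = None) -> dict:
    """Build one server entry in the shape used by the examples above."""
    args = ["pyrhubarb-mcp@latest"]
    if region:
        args += ["--aws-region", region]
    args += ["--default-model", model]
    return {"command": "uvx", "args": args, "env": {"AWS_PROFILE": profile}}


# Claude Desktop nests entries under "mcpServers"; the Cline example omits that wrapper.
config = {"mcpServers": {"rhubarb": server_entry("your-aws-profile")}}
print(json.dumps(config, indent=2))
```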

📋 Configuration Options

Command-Line Arguments

Argument          Description                           Default
--aws-region      AWS region                            us-east-1
--enable-cri      Enable cross-region inference         false
--default-model   Default model                         claude-sonnet
--default-bucket  S3 bucket for classification samples  -
--check-deps      Check dependencies and exit           -
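
To make the flag semantics concrete, here is an `argparse` sketch that mirrors the table. It is not the package's actual parser; treating `--enable-cri` and `--check-deps` as boolean switches is an assumption based on their descriptions and defaults.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented flags and defaults (illustrative only)."""
    parser = argparse.ArgumentParser(prog="pyrhubarb-mcp")
    parser.add_argument("--aws-region", default="us-east-1", help="AWS region")
    parser.add_argument("--enable-cri", action="store_true",
                        help="Enable cross-region inference")
    parser.add_argument("--default-model", default="claude-sonnet", help="Default model")
    parser.add_argument("--default-bucket", default=None,
                        help="S3 bucket for classification samples")
    parser.add_argument("--check-deps", action="store_true",
                        help="Check dependencies and exit")
    return parser


args = build_parser().parse_args(["--aws-region", "us-west-2", "--enable-cri"])
print(args.aws_region, args.enable_cri, args.default_model)
# prints: us-west-2 True claude-sonnet
```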

Environment Variables (AWS Credentials)

Variable               Description                        Example
AWS_PROFILE            AWS profile name                   my-profile
AWS_ACCESS_KEY_ID      AWS access key ID                  AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY  AWS secret access key              wJalrXUtnFEMI/K7MDENG...
AWS_REGION             AWS region (can also use CLI arg)  us-east-1
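
A quick pre-flight check can catch a half-configured environment before the server starts. The sketch below only reports which of the variables above are set; `configured_sources` is a hypothetical helper, and it deliberately makes no claim about which source takes precedence when several are present.

```python
import os
from typing import Mapping


def configured_sources(env: Mapping[str, str]) -> list[str]:
    """List the credential sources visible in the given environment."""
    sources = []
    if env.get("AWS_PROFILE"):
        sources.append("profile")
    has_id = bool(env.get("AWS_ACCESS_KEY_ID"))
    has_secret = bool(env.get("AWS_SECRET_ACCESS_KEY"))
    if has_id and has_secret:
        sources.append("access-keys")
    elif has_id or has_secret:
        # One half of the key pair is missing, which is usually a mistake.
        sources.append("incomplete-access-keys")
    return sources


print(configured_sources(os.environ) or "no AWS credential variables set")
```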

Supported Models

Model          Documents  Video  Best For
claude-opus    ✅         ❌     Complex reasoning, detailed analysis
claude-sonnet  ✅         ❌     Balanced performance and cost
claude-haiku   ✅         ❌     Fast, lightweight tasks
nova-pro       ✅         ✅     High-quality video analysis
nova-lite      ✅         ✅     Cost-effective video processing
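
The table can be encoded as a small lookup so that a script picks a model that supports the task at hand. The data below is copied from the table; the `MODELS` dictionary and `models_for` helper are illustrative and not part of the package.

```python
# Capabilities copied from the table above.
MODELS = {
    "claude-opus": {"documents": True, "video": False},
    "claude-sonnet": {"documents": True, "video": False},
    "claude-haiku": {"documents": True, "video": False},
    "nova-pro": {"documents": True, "video": True},
    "nova-lite": {"documents": True, "video": True},
}


def models_for(task: str) -> list[str]:
    """Return the model names that support the given task ("documents" or "video")."""
    return [name for name, caps in MODELS.items() if caps[task]]


print(models_for("video"))
# prints: ['nova-pro', 'nova-lite']
```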

💡 Usage Examples

Once configured, ask your AI assistant to use Rhubarb's tools:

Document Analysis

Use the analyze_document tool on s3://my-bucket/report.pdf to extract:
- Key findings
- Recommendations  
- Financial data

Structure the output as JSON with those three categories.

Entity Extraction

Use extract_entities on ./contract.pdf to find all:
- PERSON entities
- ORGANIZATION entities  
- MONEY amounts
- Important dates
- Any PII information

Video Analysis

Use analyze_video on s3://my-bucket/presentation.mp4 to:
- Summarize key points
- Extract any text shown on screen
- Identify main topics discussed

Document Classification

First create classification samples from ./training_manifest.csv in bucket 'my-docs'.
Then classify ./unknown_document.pdf using those samples.
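
Behind prompts like these, the MCP client sends a JSON-RPC `tools/call` request to the server. The sketch below builds such a request by hand to show its shape. The envelope follows the MCP specification, but the argument names `file_path` and `message` are assumptions, since the tools' input schemas are not documented here.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need an id; a simple counter suffices here


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names are hypothetical; inspect the server's tool list for the real schema.
request = tool_call(
    "analyze_document",
    {"file_path": "s3://my-bucket/report.pdf", "message": "Extract key findings as JSON."},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library handles this framing; the sketch is only meant to show what crosses the wire.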

๐Ÿ—๏ธ Architecture

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚   MCP Client    โ”‚โ”€โ”€โ”€โ”€โ”‚ pyrhubarb-mcp    โ”‚โ”€โ”€โ”€โ”€โ”‚   pyrhubarb     โ”‚
โ”‚  (Cline, etc.)  โ”‚    โ”‚   (MCP Server)   โ”‚    โ”‚  (Core Library) โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                                         โ”‚
                                               โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                                               โ”‚  Amazon Bedrock โ”‚
                                               โ”‚ (Claude & Nova) โ”‚
                                               โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Key Benefits

  • 🔌 Plug & Play - No pre-installation, auto-installs on first use
  • 🐍 Native Python - Direct access to Rhubarb capabilities
  • 💬 Conversation Memory - Maintains chat history across interactions
  • 🔍 Resource Discovery - Built-in resources for exploration
  • ⚡ High Performance - Optimized for large documents and video processing

๐Ÿ› ๏ธ Development

Local Development

# Clone and install
git clone https://github.com/awslabs/rhubarb.git
cd rhubarb/pyrhubarb-mcp
poetry install

# Run locally  
poetry run pyrhubarb-mcp --check-deps

Dependencies

This package automatically installs:

  • pyrhubarb - Core Rhubarb document/video processing
  • fastmcp - MCP server framework
  • boto3 - AWS SDK for Python
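
To confirm that these three distributions are present in an environment, you can query the installed package metadata. This is a sketch in the spirit of `--check-deps`, not a reproduction of what that flag does; `installed_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_versions(distributions: list[str]) -> dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    result: dict[str, Optional[str]] = {}
    for name in distributions:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


for name, found in installed_versions(["pyrhubarb", "fastmcp", "boto3"]).items():
    print(f"{name}: {found or 'not installed'}")
```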

๐Ÿ” Security & Requirements

AWS Credentials

Requires AWS credentials with permissions for:

  • Amazon Bedrock - For Claude and Nova model access
  • Amazon S3 - For document/video storage (optional but recommended)

Supported Regions

Works in any AWS region with Amazon Bedrock availability. Commonly used regions:

  • us-east-1 (N. Virginia) - Default, supports all models
  • us-west-2 (Oregon) - All models available
  • eu-west-1 (Ireland) - Most models available

📄 License

Apache 2.0 - See LICENSE for details.

๐Ÿค Contributing

This package is part of the larger Rhubarb project. Please see the main CONTRIBUTING.md for contribution guidelines.
