MCP server for Rhubarb document and video understanding capabilities

Rhubarb MCP Server

A dedicated MCP server package that exposes all of Rhubarb's document and video understanding capabilities through the Model Context Protocol (MCP).

This is a standalone package designed for easy integration with MCP-compatible AI assistants like Cline, Claude Desktop, and other MCP clients.

✨ Features

🔧 Tools (8 total)

  • analyze_document - Multi-modal document Q&A, summarization, structured extraction
  • stream_document_chat - Streaming conversations with chat history
  • extract_entities - Named Entity Recognition with 50+ built-in entities + PII detection
  • generate_extraction_schema - AI-assisted JSON schema generation
  • create_classification_samples - Vector sample creation for document classification
  • classify_document - Document classification using pre-trained samples
  • view_classification_sample - Classification sample inspection
  • analyze_video - Video content analysis with Amazon Nova models

📚 Resources (4 available)

  • rhubarb://entities/built-in - List of 50+ built-in entity types
  • rhubarb://models/supported - Supported Bedrock models and capabilities
  • rhubarb://schemas/built-in/{type} - Built-in schemas for common use cases
  • rhubarb://classification-samples/{bucket}/{id} - Classification sample details
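
The two templated URIs above take path parameters. As a minimal sketch of how a client might fill them in before requesting a resource, the helper below copies the four templates from this list verbatim; the helper name `resource_uri` and the short keys are illustrative only and are not part of the package.

```python
# Resource URI templates, copied from the list in this README.
# The dictionary keys and the helper below are hypothetical names for illustration.
RESOURCE_TEMPLATES = {
    "entities": "rhubarb://entities/built-in",
    "models": "rhubarb://models/supported",
    "schema": "rhubarb://schemas/built-in/{type}",
    "classification_sample": "rhubarb://classification-samples/{bucket}/{id}",
}


def resource_uri(name: str, **params: str) -> str:
    """Fill a documented URI template; raises KeyError if a placeholder is missing."""
    return RESOURCE_TEMPLATES[name].format(**params)


# Example values ("my-docs", "abc123") are placeholders, not real identifiers.
print(resource_uri("models"))
print(resource_uri("classification_sample", bucket="my-docs", id="abc123"))
```

An MCP client would then pass the resulting string to its resource-read call.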

🚀 Quick Start

No Installation Required!

There is nothing to install up front: uvx or pipx downloads the package into an isolated environment the first time you run it:

# Test the server (optional)
uvx pyrhubarb-mcp@latest --check-deps

# Run with AWS profile (via environment variable)
AWS_PROFILE=my-profile uvx pyrhubarb-mcp@latest

# Run with access keys (via environment variables)
AWS_ACCESS_KEY_ID=AKIA... AWS_SECRET_ACCESS_KEY=secret... uvx pyrhubarb-mcp@latest
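
If you start the server from a script rather than a shell, the same commands can be assembled programmatically. The following is a rough sketch under the assumption that you launch with uvx and authenticate with a profile; `build_launch` is a hypothetical helper, not something the package provides.

```python
import os
import shlex


def build_launch(profile=None, region=None, base_env=None):
    """Return (argv, env) equivalent to the uvx commands shown above."""
    argv = ["uvx", "pyrhubarb-mcp@latest"]
    if region:
        # --aws-region is the documented CLI flag for choosing a region.
        argv += ["--aws-region", region]
    # Copy the environment so the caller's os.environ is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    if profile:
        env["AWS_PROFILE"] = profile
    return argv, env


argv, env = build_launch(profile="my-profile", region="us-east-1", base_env={})
print(shlex.join(argv))
```

The returned pair can be handed to `subprocess.Popen(argv, env=env)`; in practice an MCP client usually spawns the process for you from its configuration file.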

🔧 MCP Client Configuration

Cline Integration

{
  "rhubarb": {
    "command": "uvx",
    "args": [
      "pyrhubarb-mcp@latest", 
      "--aws-region", "us-east-1",
      "--default-model", "claude-sonnet"
    ],
    "env": {
      "AWS_PROFILE": "your-aws-profile"
    }
  }
}

Claude Desktop Integration

{
  "mcpServers": {
    "rhubarb": {
      "command": "uvx", 
      "args": [
        "pyrhubarb-mcp@latest",
        "--default-model", "claude-sonnet"
      ],
      "env": {
        "AWS_PROFILE": "your-aws-profile"
      }
    }
  }
}
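
When several machines need the same setup, it can be easier to generate this JSON than to edit it by hand. The sketch below reproduces the Claude Desktop structure shown above; the function names are illustrative, and where the file should be written depends on your client and operating system, so that step is left out.

```python
import json


def server_entry(profile, default_model="claude-sonnet", region=None):
    """Build one server entry using only the flags documented in this README."""
    args = ["pyrhubarb-mcp@latest"]
    if region:
        args += ["--aws-region", region]
    args += ["--default-model", default_model]
    return {"command": "uvx", "args": args, "env": {"AWS_PROFILE": profile}}


def claude_desktop_config(profile, **kwargs):
    """Wrap the entry in the top-level "mcpServers" object."""
    return {"mcpServers": {"rhubarb": server_entry(profile, **kwargs)}}


print(json.dumps(claude_desktop_config("your-aws-profile"), indent=2))
```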

Using Access Keys (for CI/CD)

{
  "rhubarb": {
    "command": "uvx",
    "args": [
      "pyrhubarb-mcp@latest",
      "--aws-region", "us-west-2",
      "--default-model", "nova-pro"
    ],
    "env": {
      "AWS_ACCESS_KEY_ID": "AKIAIOSFODNN7EXAMPLE",
      "AWS_SECRET_ACCESS_KEY": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    }
  }
}

Alternative: Using pipx

{
  "rhubarb": {
    "command": "pipx",
    "args": [
      "run", "pyrhubarb-mcp",
      "--default-model", "claude-sonnet"
    ],
    "env": {
      "AWS_PROFILE": "your-profile"
    }
  }
}

📋 Configuration Options

Command-Line Arguments

Argument           Description                             Default
--aws-region       AWS region                              us-east-1
--enable-cri       Enable cross-region inference           false
--default-model    Default model                           claude-sonnet
--default-bucket   S3 bucket for classification samples    -
--check-deps       Check dependencies and exit             -
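
To make the defaults concrete, here is a small argparse sketch that mirrors the table. It is not the package's actual parser: it assumes `--enable-cri` and `--check-deps` are plain on/off switches, which the table suggests but does not state.

```python
import argparse


def make_parser():
    """Illustrative parser mirroring the documented flags and defaults."""
    p = argparse.ArgumentParser(prog="pyrhubarb-mcp")
    p.add_argument("--aws-region", default="us-east-1", help="AWS region")
    # Assumption: a boolean switch that is off unless the flag is present.
    p.add_argument("--enable-cri", action="store_true",
                   help="Enable cross-region inference")
    p.add_argument("--default-model", default="claude-sonnet", help="Default model")
    p.add_argument("--default-bucket", default=None,
                   help="S3 bucket for classification samples")
    # Assumption: also a boolean switch.
    p.add_argument("--check-deps", action="store_true",
                   help="Check dependencies and exit")
    return p


ns = make_parser().parse_args(["--aws-region", "us-west-2", "--default-model", "nova-pro"])
print(ns.aws_region, ns.default_model, ns.enable_cri)
```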

Environment Variables (AWS Credentials)

Variable                 Description                          Example
AWS_PROFILE              AWS profile name                     my-profile
AWS_ACCESS_KEY_ID        AWS access key ID                    AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY    AWS secret access key                wJalrXUtnFEMI/K7MDENG...
AWS_REGION               AWS region (can also use CLI arg)    us-east-1

Supported Models

Model            Documents   Video   Best For
claude-opus      ✅          ❌      Complex reasoning, detailed analysis
claude-sonnet    ✅          ❌      Balanced performance and cost
claude-haiku     ✅          ❌      Fast, lightweight tasks
nova-pro         ✅          ✅      High-quality video analysis
nova-lite        ✅          ✅      Cost-effective video processing
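
The table can also be read as a capability lookup, which is handy when choosing a `--default-model` value for a given workload. A minimal sketch follows, with the data copied from the table; the `MODELS` dictionary and `models_for` helper are illustrative, not part of the package.

```python
# Capabilities copied from the Supported Models table above.
MODELS = {
    "claude-opus": {"documents": True, "video": False},
    "claude-sonnet": {"documents": True, "video": False},
    "claude-haiku": {"documents": True, "video": False},
    "nova-pro": {"documents": True, "video": True},
    "nova-lite": {"documents": True, "video": True},
}


def models_for(task):
    """Return model aliases that support the given task ("documents" or "video")."""
    return [name for name, caps in MODELS.items() if caps[task]]


print(models_for("video"))
```

Only the two Nova aliases come back for video, so `analyze_video` needs one of them.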

💡 Usage Examples

Once configured, ask your AI assistant to use Rhubarb's tools:

Document Analysis

Use the analyze_document tool on s3://my-bucket/report.pdf to extract:
- Key findings
- Recommendations  
- Financial data

Structure the output as JSON with those three categories.
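
Behind a prompt like this, the assistant sends the server an MCP `tools/call` request, which is a JSON-RPC 2.0 message. The sketch below shows the general shape of that message. The envelope follows the MCP specification, but the argument names `file_path` and `message` are assumptions made for illustration; check the tool's published input schema for the real parameter names.

```python
import json


def tool_call_request(request_id, tool, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


request = tool_call_request(
    1,
    "analyze_document",
    {
        # Hypothetical argument names; the actual schema may differ.
        "file_path": "s3://my-bucket/report.pdf",
        "message": "Extract key findings, recommendations, and financial data as JSON.",
    },
)
print(json.dumps(request, indent=2))
```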

Entity Extraction

Use extract_entities on ./contract.pdf to find all:
- PERSON entities
- ORGANIZATION entities  
- MONEY amounts
- Important dates
- Any PII information

Video Analysis

Use analyze_video on s3://my-bucket/presentation.mp4 to:
- Summarize key points
- Extract any text shown on screen
- Identify main topics discussed

Document Classification

First create classification samples from ./training_manifest.csv in bucket 'my-docs'.
Then classify ./unknown_document.pdf using those samples.

🏗️ Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   MCP Client    │────│ pyrhubarb-mcp    │────│   pyrhubarb     │
│  (Cline, etc.)  │    │   (MCP Server)   │    │  (Core Library) │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                                        │
                                               ┌─────────────────┐
                                               │  Amazon Bedrock │
                                               │ (Claude & Nova) │
                                               └─────────────────┘

Key Benefits

  • 🔌 Plug & Play - No pre-installation, auto-installs on first use
  • 🐍 Native Python - Direct access to Rhubarb capabilities
  • 💬 Conversation Memory - Maintains chat history across interactions
  • 🔍 Resource Discovery - Built-in resources for exploration
  • ⚡ High Performance - Optimized for large documents and video processing

🛠️ Development

Local Development

# Clone and install
git clone https://github.com/awslabs/rhubarb.git
cd rhubarb/pyrhubarb-mcp
poetry install

# Run locally  
poetry run pyrhubarb-mcp --check-deps

Dependencies

This package automatically installs:

  • pyrhubarb - Core Rhubarb document/video processing
  • fastmcp - MCP server framework
  • boto3 - AWS SDK for Python

🔐 Security & Requirements

AWS Credentials

Requires AWS credentials with permissions for:

  • Amazon Bedrock - For Claude and Nova model access
  • Amazon S3 - For document/video storage (optional but recommended)
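
As a starting point for the permissions listed above, the sketch below builds a small IAM policy document. The exact set of actions the server needs is an assumption here: it grants the two Bedrock invoke actions and, optionally, object read/write on one bucket. Treat it as a draft to tighten, for example by replacing the wildcard resource with specific model ARNs.

```python
import json


def minimal_policy(bucket=None):
    """Draft IAM policy; the action list is an assumption, not taken from the package."""
    statements = [
        {
            "Sid": "InvokeBedrockModels",
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream",
            ],
            "Resource": "*",  # assumption: narrow to specific model ARNs in production
        }
    ]
    if bucket:
        # S3 access is optional, matching the "optional but recommended" note above.
        statements.append(
            {
                "Sid": "ReadWriteDocuments",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


print(json.dumps(minimal_policy("my-bucket"), indent=2))
```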

Supported Regions

Works in any AWS region with Amazon Bedrock availability. Commonly used regions:

  • us-east-1 (N. Virginia) - Default, supports all models
  • us-west-2 (Oregon) - All models available
  • eu-west-1 (Ireland) - Most models available

📄 License

Apache 2.0 - See LICENSE for details.

🤝 Contributing

This package is part of the larger Rhubarb project. Please see the main CONTRIBUTING.md for contribution guidelines.
