YouTube video analysis and X feed digest pipeline exposed as MCP tools

mcp-content-pipeline

A content analysis and digest pipeline for YouTube videos and X (Twitter) feeds, exposed as MCP tools. Extract transcripts, fetch posts from curated accounts, and generate key takeaways, TLDRs, social hooks, and comic-book infographics — all callable by any MCP-compatible AI client like Claude Desktop.

Why?

Keeping up with YouTube channels and X accounts means scattered tabs, manual note-taking, and lost insights. This MCP server turns content consumption into structured, chainable tools. Analyse a Bloomberg video, digest your X feed, generate infographics, and sync everything to GitHub — all from a single conversation with Claude.

Quick Start

uvx mcp-content-pipeline

Or install explicitly:

uv tool install mcp-content-pipeline
mcp-content-pipeline

Claude Desktop Configuration

Add the server to your Claude Desktop MCP config (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "content-pipeline": {
      "command": "/usr/local/bin/uvx",
      "args": ["mcp-content-pipeline"],
      "env": {
        "MCP_CP_ANTHROPIC_API_KEY": "sk-ant-...",
        "MCP_CP_SUPADATA_API_KEY": "sd_...",
        "MCP_CP_GITHUB_TOKEN": "ghp_...",
        "MCP_CP_GITHUB_REPO": "your-username/your-repo",
        "MCP_CP_GEMINI_API_KEY": "your-gemini-api-key",
        "MCP_CP_X_BEARER_TOKEN": "your-x-bearer-token",
        "MCP_CP_X_ACCOUNTS": "karpathy,bcherny,atmoio,steipete",
        "MCP_CP_X_TOPICS": "AI,tech,engineering"
      }
    }
  }
}
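
If the config file already lists other MCP servers, the entry can be merged in rather than pasted by hand. The following is a minimal sketch, assuming the file layout shown above; `merge_server_entry` is a hypothetical helper, not part of the package, and the environment values are placeholders.

```python
import json
from copy import deepcopy


def merge_server_entry(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of `config` with `entry` registered under mcpServers[name].

    Other servers are left untouched; an existing entry with the same
    name is replaced.
    """
    merged = deepcopy(config)
    merged.setdefault("mcpServers", {})[name] = entry
    return merged


# Placeholder values: substitute your own keys before use.
entry = {
    "command": "/usr/local/bin/uvx",
    "args": ["mcp-content-pipeline"],
    "env": {
        "MCP_CP_ANTHROPIC_API_KEY": "sk-ant-...",
        "MCP_CP_SUPADATA_API_KEY": "sd_...",
    },
}

# Stand-in for the parsed contents of an existing config file.
existing = {"mcpServers": {"other-server": {"command": "other"}}}
updated = merge_server_entry(existing, "content-pipeline", entry)
print(json.dumps(updated, indent=2))
```

Working on a copy keeps the original dictionary intact, so the result can be inspected before anything is written back to disk.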

Usage

Once configured in Claude Desktop, use the tools in a single conversation.

Tip: Mentioning "content-pipeline" in YouTube prompts, or "X feed" in X (Twitter) prompts, helps Claude Desktop route the request to the right tool.

YouTube Analysis

"Use content-pipeline to analyse this video: https://www.youtube.com/watch?v=..."
"Generate an image for this analysis"
"Sync the analysis and image to GitHub"

Or all in one prompt:

"Use content-pipeline to analyse this video, generate the image, and sync to GitHub: https://www.youtube.com/watch?v=..."

X Feed Digest

"Analyse the X feed"
"Analyse the X feed for karpathy, bcherny, atmoio, and steipete about AI today"
"Analyse the X feed from the last 7 days"

Or with the full pipeline:

"Analyse the X feed, generate the image, and sync to GitHub"

Tools

| Tool | Description | Requires |
|------|-------------|----------|
| analyse_video | Analyse a single YouTube video: transcript, takeaways, TLDR, social hook | ANTHROPIC_API_KEY, SUPADATA_API_KEY |
| batch_analyse | Analyse multiple videos from a URL list or config file | ANTHROPIC_API_KEY, SUPADATA_API_KEY |
| list_channel_videos | Fetch recent videos from a YouTube channel | YOUTUBE_API_KEY |
| sync_to_github | Push analyses as markdown files to a GitHub repo | GITHUB_TOKEN, GITHUB_REPO |
| analyse_x_feed | Analyse recent posts from curated X accounts (daily digest) | X_BEARER_TOKEN |
| generate_image | Generate comic-book infographic from analysis result | GEMINI_API_KEY |
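
Because each tool needs only some of the keys, a partial configuration still enables part of the pipeline. A minimal sketch of that mapping, taken from the table above; `available_tools` is a hypothetical helper, and the real server may validate its configuration differently.

```python
# Required variables per tool, as listed in the Tools table
# (names shown without the MCP_CP_ prefix).
TOOL_REQUIREMENTS = {
    "analyse_video": ["ANTHROPIC_API_KEY", "SUPADATA_API_KEY"],
    "batch_analyse": ["ANTHROPIC_API_KEY", "SUPADATA_API_KEY"],
    "list_channel_videos": ["YOUTUBE_API_KEY"],
    "sync_to_github": ["GITHUB_TOKEN", "GITHUB_REPO"],
    "analyse_x_feed": ["X_BEARER_TOKEN"],
    "generate_image": ["GEMINI_API_KEY"],
}

PREFIX = "MCP_CP_"


def available_tools(env: dict) -> list:
    """Return the tools whose required variables are all set and non-empty."""
    return [
        tool
        for tool, names in TOOL_REQUIREMENTS.items()
        if all(env.get(PREFIX + name) for name in names)
    ]


example_env = {
    "MCP_CP_ANTHROPIC_API_KEY": "sk-ant-...",
    "MCP_CP_SUPADATA_API_KEY": "sd_...",
    "MCP_CP_GEMINI_API_KEY": "your-gemini-api-key",
}
print(available_tools(example_env))
# → ['analyse_video', 'batch_analyse', 'generate_image']
```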

Environment Variables

All prefixed with MCP_CP_:

| Variable | Required | Description |
|----------|----------|-------------|
| MCP_CP_ANTHROPIC_API_KEY | Yes | Anthropic API key for Claude analysis |
| MCP_CP_SUPADATA_API_KEY | Yes, for YouTube | Supadata API key for YouTube transcript extraction |
| MCP_CP_YOUTUBE_API_KEY | No | YouTube Data API v3 key (only for list_channel_videos) |
| MCP_CP_GITHUB_TOKEN | For sync | GitHub personal access token |
| MCP_CP_GITHUB_REPO | For sync | Target repo in owner/repo format |
| MCP_CP_GITHUB_BRANCH | No | Branch to push to (default: main) |
| MCP_CP_GITHUB_OUTPUT_DIR | No | Output directory for YouTube analyses (default: content/youtube) |
| MCP_CP_GITHUB_X_OUTPUT_DIR | No | Output directory for X digests (default: content/x-digest) |
| MCP_CP_IMAGE_OUTPUT_DIR | No | Directory for generated images (default: ~/Downloads) |
| MCP_CP_CLAUDE_MODEL | No | Claude model to use (default: claude-sonnet-4-20250514) |
| MCP_CP_MAX_TRANSCRIPT_TOKENS | No | Max transcript length in tokens (default: 100000) |
| MCP_CP_GEMINI_API_KEY | For image | Google AI Studio API key for image generation |
| MCP_CP_GEMINI_MODEL | No | Gemini model for images (default: gemini-3.1-flash-image-preview) |
| MCP_CP_X_BEARER_TOKEN | For X digest | X API v2 bearer token |
| MCP_CP_X_ACCOUNTS | For X digest | Comma-separated X usernames |
| MCP_CP_X_TOPICS | No | Comma-separated topics (default: AI,tech) |
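
To make the defaults concrete, here is a rough sketch of reading a few of these variables with their documented fallbacks. The `Settings` class and `load_settings` function are hypothetical names used for illustration; how the package itself parses its configuration is not shown here, and trimming whitespace around comma-separated values is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    github_branch: str
    claude_model: str
    max_transcript_tokens: int
    x_accounts: tuple
    x_topics: tuple


def _split_csv(value: str) -> tuple:
    """Split a comma-separated string, dropping blanks and surrounding spaces."""
    return tuple(part.strip() for part in value.split(",") if part.strip())


def load_settings(env=None) -> Settings:
    """Read a few MCP_CP_ variables, falling back to the documented defaults."""
    env = os.environ if env is None else env

    def get(name: str, default: str) -> str:
        return env.get("MCP_CP_" + name, default)

    return Settings(
        github_branch=get("GITHUB_BRANCH", "main"),
        claude_model=get("CLAUDE_MODEL", "claude-sonnet-4-20250514"),
        max_transcript_tokens=int(get("MAX_TRANSCRIPT_TOKENS", "100000")),
        x_accounts=_split_csv(get("X_ACCOUNTS", "")),
        x_topics=_split_csv(get("X_TOPICS", "AI,tech")),
    )


settings = load_settings({"MCP_CP_X_ACCOUNTS": "karpathy, bcherny"})
print(settings.github_branch, settings.x_accounts, settings.x_topics)
# → main ('karpathy', 'bcherny') ('AI', 'tech')
```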

Cost Projections

Estimated monthly costs for two usage patterns:

| Service | Daily (every day) | Weekly X + daily YouTube |
|---------|-------------------|--------------------------|
| YouTube analysis (Claude API) | ~$3–5/mo (1 video/day) | ~$3–5/mo (1 video/day) |
| X feed digest (Claude API) | ~$2–3/mo | ~$0.50/mo |
| Image generation (Gemini API) | ~$2/mo ($0.067/image) | ~$2/mo ($0.067/image) |
| X API reads | ~$4/mo ($0.13/day) | ~$0.60/mo ($0.15/week) |
| Supadata transcript API | ~$0 (free tier: 100/mo) | ~$0 (free tier: 100/mo) |
| Total (excl. Claude API) | ~$6–9/mo | ~$3–5/mo |

Claude API costs depend on your Anthropic billing plan and are not included in the totals above. If you already use Claude Pro ($20/mo), there is no additional Claude cost. The X API spending cap can be configured in the developer console.
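
The per-unit prices quoted in the table can be turned into monthly figures with simple arithmetic. The sketch below assumes a 30-day month and one generated image per day; both are illustrative assumptions, not figures stated by the project.

```python
DAYS_PER_MONTH = 30        # assumption: 30-day month
WEEKS_PER_MONTH = 52 / 12  # about 4.33

IMAGE_COST = 0.067     # USD per generated image (from the table)
X_READ_DAILY = 0.13    # USD per day of X API reads (from the table)
X_READ_WEEKLY = 0.15   # USD per week of X API reads (from the table)

# Assumption: one image is generated per day.
image_monthly = IMAGE_COST * DAYS_PER_MONTH
x_daily_monthly = X_READ_DAILY * DAYS_PER_MONTH
x_weekly_monthly = X_READ_WEEKLY * WEEKS_PER_MONTH

print(f"Images:        ${image_monthly:.2f}/mo")
print(f"X API, daily:  ${x_daily_monthly:.2f}/mo")
print(f"X API, weekly: ${x_weekly_monthly:.2f}/mo")
```

Under these assumptions the results come to roughly $2.01, $3.90, and $0.65 per month, in line with the rounded ~$2, ~$4, and ~$0.60 entries above.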

What this replaces

| Subscription | Monthly cost | What the pipeline covers instead |
|--------------|--------------|----------------------------------|
| Google One AI Premium | ~$20/mo | Image generation via Gemini API (~$2/mo) |
| X Premium | ~$8/mo | X feed reading via API (~$0.60–4/mo) |
| YouTube Premium | ~$14/mo | Transcript extraction via Supadata (free tier) |
| Total saved | ~$42/mo | Pipeline cost: ~$3–9/mo (plus your existing Claude plan) |

Eval Gates

Prompt and model changes are automatically evaluated in CI using mcp-llm-eval. The eval dataset covers both YouTube analysis and X feed digest prompts, benchmarking Claude Sonnet and Gemini 2.5 Flash on the same test cases. PRs that touch system prompts or model config trigger an evaluation run that scores faithfulness and relevance against a reference dataset. The PR is blocked if quality regresses below configured thresholds.

See .eval-gate.yml for threshold configuration and eval/dataset.json for the test dataset.
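
Conceptually, the gate compares each metric's score with a minimum and fails the run if any metric falls short. The sketch below illustrates that idea only: it is not the mcp-llm-eval API, `check_gate` is a hypothetical function, and the threshold and score values are made up, since the schema of .eval-gate.yml is not reproduced here.

```python
def check_gate(scores: dict, thresholds: dict) -> list:
    """Return a message for every metric that is missing or below its threshold."""
    failures = []
    for metric, minimum in thresholds.items():
        score = scores.get(metric)
        if score is None:
            failures.append(f"{metric}: no score reported")
        elif score < minimum:
            failures.append(f"{metric}: {score:.2f} < {minimum:.2f}")
    return failures


# Hypothetical numbers, for illustration only.
thresholds = {"faithfulness": 0.80, "relevance": 0.75}
scores = {"faithfulness": 0.84, "relevance": 0.71}

failures = check_gate(scores, thresholds)
for line in failures:
    print("FAIL", line)
# A CI job would typically exit non-zero when `failures` is non-empty.
```

With these sample values, only relevance is reported as failing, which is the condition under which a PR would be blocked.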

Running benchmarks locally

The benchmark requires API keys for all providers. Create a .env file in the project root:

ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=AIza...

Then run:

make benchmark        # Run eval against all 5 models
make benchmark-copy   # Copy results to llm-benchmarks repo

Results are written to eval/results/ (gitignored). The benchmark output feeds into LLMShot via the llm-benchmarks repo at text-generation/content-pipeline-summary.json and text-generation/content-pipeline-benchmark.json.

The production pipeline uses Claude Sonnet (claude-sonnet-4-6). The model is selected with MCP_CP_CLAUDE_MODEL and authenticated with MCP_CP_ANTHROPIC_API_KEY. The benchmark runs all 5 models against the same prompts to track quality and cost across providers.

Development

git clone https://github.com/your-username/mcp-content-pipeline.git
cd mcp-content-pipeline
uv sync
uv run pytest -v --cov=src/mcp_content_pipeline
uv run ruff check src/ tests/

Security

  • All credentials are configured via local environment variables — never committed to the repo
  • The tool is open source but your API keys, YouTube key, and GitHub token stay on your machine
  • Never create a .env file in the repo — use shell exports or Claude Desktop config instead

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feat/my-feature)
  3. Commit using Conventional Commits (feat: add new feature)
  4. Push and open a Pull Request

License

MIT
