Unified MCP server for OpenAI multimodal APIs (Sora, Whisper, GPT Vision)

Maintainers

rcaputo3tjclp

sanzaru

A stateless, lightweight MCP server that wraps OpenAI's Sora Video API, Whisper, and GPT-4o Audio APIs via the OpenAI Python SDK.

Features

Video Generation (Sora)

- Create videos with `sora-2` or `sora-2-pro` models
- Use reference images to guide generation
- Remix and refine existing videos
- Download variants (video, thumbnail, spritesheet)

Image Generation

- Generate reference images with GPT-5/GPT-4.1
- Iterative refinement and image editing
- Automatic resizing for Sora compatibility

Audio Processing

- Transcription: Whisper and GPT-4o models
- Audio Chat: Interactive analysis with GPT-4o
- Text-to-Speech: Multi-voice TTS generation
- Processing: Format conversion, compression, file management

Note: Content guardrails are enforced by OpenAI. This server does not run local moderation.

Requirements

- Python 3.10+
- `OPENAI_API_KEY` environment variable

Feature-specific paths (set only what you need):

- `VIDEO_PATH` - Enables video generation features
- `IMAGE_PATH` - Enables image generation features
- `AUDIO_PATH` - Enables audio processing features

Quick Start

Clone the repository:

```
git clone https://github.com/TJC-LP/sanzaru.git
cd sanzaru
```

Run the setup script:
```
./setup.sh
```
The script will:

- Prompt for your OpenAI API key
- Create directories and `.env` configuration
- Install dependencies with `uv sync --all-extras --dev`

Start using:
```
claude
```

That's it! Claude Code will connect automatically, and you can start generating videos and images and processing audio.

Installation

Quick Install

```
# All features
uv add "sanzaru[all]"

# Specific features
uv add "sanzaru[audio]"  # With audio support
uv add sanzaru           # Base (video + image only)
```

Alternative Installation Methods

From Source

```
git clone https://github.com/TJC-LP/sanzaru.git
cd sanzaru
uv sync --all-extras
```

Claude Desktop

Add to your `claude_desktop_config.json`:

```
{
  "mcpServers": {
    "sanzaru": {
      "command": "uvx",
      "args": ["sanzaru[all]"],
      "env": {
        "OPENAI_API_KEY": "your-api-key-here",
        "VIDEO_PATH": "/absolute/path/to/videos",
        "IMAGE_PATH": "/absolute/path/to/images",
        "AUDIO_PATH": "/absolute/path/to/audio"
      }
    }
  }
}
```

Or from source:

```
{
  "mcpServers": {
    "sanzaru": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/sanzaru", "sanzaru"]
    }
  }
}
```

Codex MCP

```
# Using uvx (from PyPI)
codex mcp add sanzaru \
  --env OPENAI_API_KEY="sk-..." \
  --env VIDEO_PATH="$HOME/sanzaru-videos" \
  --env IMAGE_PATH="$HOME/sanzaru-images" \
  --env AUDIO_PATH="$HOME/sanzaru-audio" \
  -- uvx "sanzaru[all]"

# Or from source
cd /path/to/sanzaru
set -a; source .env; set +a
codex mcp add sanzaru \
  --env OPENAI_API_KEY="$OPENAI_API_KEY" \
  --env VIDEO_PATH="$VIDEO_PATH" \
  --env IMAGE_PATH="$IMAGE_PATH" \
  --env AUDIO_PATH="$AUDIO_PATH" \
  -- uv run --directory "$(pwd)" sanzaru
```

Manual Setup

```
uv venv
uv sync

# Set required environment variables
export OPENAI_API_KEY=sk-...
export VIDEO_PATH=~/videos
export IMAGE_PATH=~/images
export AUDIO_PATH=~/audio

# Run server
uv run sanzaru
```

Feature Auto-Detection: Features are automatically enabled based on configured paths. Set only the paths you need.
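
To illustrate the idea of path-based feature detection, here is a minimal sketch. It is not the server's actual implementation: the `FEATURE_ENV_VARS` mapping and the `detect_features` helper are hypothetical names, and the only rule assumed is that a feature counts as enabled when its path variable is set to a non-empty value.

```python
import os

# Hypothetical mapping from feature name to the environment variable
# that enables it; the real server may organize this differently.
FEATURE_ENV_VARS = {
    "video": "VIDEO_PATH",
    "image": "IMAGE_PATH",
    "audio": "AUDIO_PATH",
}


def detect_features(env=None):
    """Return the sorted names of features whose path variable is set and non-empty."""
    env = os.environ if env is None else env
    return sorted(
        name
        for name, var in FEATURE_ENV_VARS.items()
        if env.get(var, "").strip()
    )


if __name__ == "__main__":
    # With only VIDEO_PATH exported, this prints ['video'].
    print(detect_features())
```

Passing a plain dictionary instead of relying on `os.environ` makes the rule easy to check without touching the real environment.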

Available Tools

| Category | Tools | Description |
| --- | --- | --- |
| Video | `create_video`, `get_video_status`, `download_video`, `list_videos`, `delete_video`, `remix_video` | Generate and manage Sora videos with optional reference images |
| Image | `create_image`, `get_image_status`, `download_image` | Generate reference images with GPT-5/GPT-4.1 |
| Reference | `list_reference_images`, `prepare_reference_image` | Manage and resize images for Sora compatibility |
| Audio | `transcribe_audio`, `chat_with_audio`, `create_audio`, `convert_audio`, `compress_audio`, `list_audio_files`, `get_latest_audio`, `transcribe_with_enhancement` | Transcription, analysis, TTS, and file management |

Full API documentation: See docs/api-reference.md

Basic Workflows

Generate a Video

```
# Create video from text
video = create_video(
    prompt="A serene mountain landscape at sunrise",
    model="sora-2",
    seconds="8",
    size="1280x720"
)

# Poll for completion
status = get_video_status(video.id)

# Download when ready
download_video(video.id, filename="mountain_sunrise.mp4")
```
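
The "poll for completion" step usually means calling the status tool repeatedly until the job reaches a terminal state. The sketch below shows one way to structure that loop. It is an illustration only: `wait_until_done` is a hypothetical helper, `fetch_status` stands in for whatever wraps `get_video_status`, and the status strings `"completed"` and `"failed"` are assumptions rather than values confirmed by this README.

```python
import time

# Assumed terminal states; check the API reference for the real values.
TERMINAL_STATES = ("completed", "failed")


def wait_until_done(fetch_status, video_id, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_status(video_id) until it returns a terminal state.

    Raises TimeoutError if no terminal state is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(video_id)
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"video {video_id} still {status!r} after {timeout}s")
        # Wait before asking again so the API is not hammered.
        sleep(interval)
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays.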

Generate with Reference Image

```
# 1. Generate reference image
resp = create_image(prompt="futuristic pilot in mech cockpit")
download_image(resp.id, filename="pilot.png")

# 2. Prepare for video
prepare_reference_image("pilot.png", "1280x720", resize_mode="crop")

# 3. Animate
video = create_video(
    prompt="The pilot looks up and smiles",
    size="1280x720",
    input_reference_filename="pilot_1280x720.png"
)
```
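
The README does not spell out what `resize_mode="crop"` does internally. A common interpretation, assumed here, is a center crop to the target aspect ratio followed by scaling to the target size. The sketch below computes only the crop rectangle; `center_crop_box` is a hypothetical helper, not part of sanzaru.

```python
def center_crop_box(src_w, src_h, dst_w, dst_h):
    """Return (left, top, right, bottom) of the largest centered region
    of the source image that has the target aspect ratio."""
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_w = src_h * dst_w // dst_h
        crop_h = src_h
    else:
        # Source is taller (or equal): keep full width, trim top and bottom.
        crop_w = src_w
        crop_h = src_w * dst_h // dst_w
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


# A square 1024x1024 image cropped for a 1280x720 video keeps a 1024x576 band.
print(center_crop_box(1024, 1024, 1280, 720))  # (0, 224, 1024, 800)
```

The resulting box could then be passed to an image library's crop and resize calls before the file is used as a Sora reference.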

Audio Transcription

```
# List available audio files
files = list_audio_files(format="mp3")

# Transcribe
result = transcribe_audio("interview.mp3")

# Or analyze with GPT-4o
analysis = chat_with_audio(
    "meeting.mp3",
    user_prompt="Summarize key decisions and action items"
)
```

Documentation

- API Reference - Complete tool documentation with parameters and examples
- Reference Images Guide - Working with reference images and resizing
- Image Generation Guide - Generating and editing reference images
- Sora Prompting Guide - Crafting effective video prompts
- Audio Features - Audio transcription, chat, and TTS
- Performance & Architecture - Technical details and benchmarks

Performance

Fully asynchronous architecture with proven scalability:

✅ 32+ concurrent operations verified
✅ 8-10x speedup for parallel tasks
✅ Non-blocking I/O with aiofiles + anyio
✅ Python 3.14 free-threading ready

See docs/async-optimizations.md for technical details.

License

MIT

Release history

| Version | Release date |
| --- | --- |
| 0.6.2 | Apr 25, 2026 |
| 0.6.1 | Apr 24, 2026 |
| 0.6.0 | Apr 22, 2026 |
| 0.5.0 | Feb 24, 2026 |
| 0.4.5 | Feb 9, 2026 |
| 0.4.4 | Feb 9, 2026 |
| 0.4.3 | Feb 9, 2026 |
| 0.4.2 | Feb 9, 2026 |
| 0.4.1 | Feb 8, 2026 |
| 0.4.0 | Feb 8, 2026 |
| 0.3.2 | Jan 30, 2026 |
| 0.3.1 | Dec 17, 2025 |
| 0.3.0 (this version) | Dec 17, 2025 |
| 0.2.1 | Nov 5, 2025 |
| 0.2.0 | Nov 5, 2025 |

Download files

Source Distribution

sanzaru-0.3.0.tar.gz (53.1 kB), uploaded Dec 17, 2025 (Source)

Built Distribution

sanzaru-0.3.0-py3-none-any.whl (64.0 kB), uploaded Dec 17, 2025 (Python 3)

File details

Details for the file sanzaru-0.3.0.tar.gz.

File metadata

Download URL: sanzaru-0.3.0.tar.gz
Upload date: Dec 17, 2025
Size: 53.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sanzaru-0.3.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `9384ca49a3790a7cdd97efcdb861e044a5f47a849e7b636fc21ccd0fad3e7d05` |
| MD5 | `3424303b005424acdd1555a6c12ae109` |
| BLAKE2b-256 | `dc6c1c65bd9a94010ccd15fcb8ec1461134c519712b990af5f6eb8f3847c347a` |
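
A downloaded file can be checked against a published SHA256 digest with Python's standard `hashlib` module. The sketch below is a generic example, with `sha256_of` and `matches_published` as hypothetical helper names; the file path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming the sdist was downloaded to the working directory:
# matches_published(
#     "sanzaru-0.3.0.tar.gz",
#     "9384ca49a3790a7cdd97efcdb861e044a5f47a849e7b636fc21ccd0fad3e7d05",
# )
```

Reading in chunks keeps memory use flat regardless of file size.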

Provenance

The following attestation bundles were made for sanzaru-0.3.0.tar.gz:

Publisher: publish-to-pypi.yml on TJC-LP/sanzaru

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sanzaru-0.3.0.tar.gz
- Subject digest: 9384ca49a3790a7cdd97efcdb861e044a5f47a849e7b636fc21ccd0fad3e7d05
- Sigstore transparency entry: 768662969
- Sigstore integration time: Dec 17, 2025
Source repository:
- Permalink: TJC-LP/sanzaru@e32e68dab6effd121f8fac3e0d2c7d31c92143b2
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/TJC-LP
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@e32e68dab6effd121f8fac3e0d2c7d31c92143b2
- Trigger Event: release

File details

Details for the file sanzaru-0.3.0-py3-none-any.whl.

File metadata

Download URL: sanzaru-0.3.0-py3-none-any.whl
Upload date: Dec 17, 2025
Size: 64.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sanzaru-0.3.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `b3f9aca9a50df374e2ddfb947d1fb94b1ea1e005eb2bc9866c43219e13a8eb6a` |
| MD5 | `1c35afde513f5fcba1b94e095baf95de` |
| BLAKE2b-256 | `5fa56709598e18fa8e2c7cbc4a0f604a3af9f359ada4e0cab8d138dac801f250` |

Provenance

The following attestation bundles were made for sanzaru-0.3.0-py3-none-any.whl:

Publisher: publish-to-pypi.yml on TJC-LP/sanzaru

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sanzaru-0.3.0-py3-none-any.whl
- Subject digest: b3f9aca9a50df374e2ddfb947d1fb94b1ea1e005eb2bc9866c43219e13a8eb6a
- Sigstore transparency entry: 768662980
- Sigstore integration time: Dec 17, 2025
Source repository:
- Permalink: TJC-LP/sanzaru@e32e68dab6effd121f8fac3e0d2c7d31c92143b2
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/TJC-LP
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@e32e68dab6effd121f8fac3e0d2c7d31c92143b2
- Trigger Event: release
