Video editing toolkit for AI agents — CLI-first, agent-native, open source

Montaj

A video editing CLIP for AI agents. CLI-first, agent-native, open source.

Montaj is a CLIP — a CLI Program for agents. It clips onto your existing AI agent (Claude Code, OpenClaw, Cursor, or any harness) and gives it the specialized tools to edit video. Built-in steps cover the full editing pipeline. The agent decides what to run, in what order, and with what parameters.

The fundamental dependency is an agent. Montaj doesn't edit on its own. It provides the tools; the agent makes the creative decisions.

Quick install

Send this to your agent:

Install Montaj from https://github.com/theSamPadilla/montaj, then read skills/onboarding/SKILL.md to get us started.

Install

PyPI:

pip install montaj
# install Node.js >=18 separately: https://nodejs.org
montaj install whisper   # whisper-cpp binary + model weights
montaj install ui        # npm deps + UI build

Homebrew (macOS) — installs Node.js and all Python deps including bundled ffmpeg:

brew install theSamPadilla/montaj/montaj

From source:

git clone https://github.com/theSamPadilla/montaj
cd montaj
pip install -e ".[connectors]"
montaj install whisper
montaj install ui

Optional extras:

pip install "montaj[connectors]"  # Kling, Gemini, OpenAI API connectors
pip install "montaj[rvm]"         # background removal (torch + RVM)
pip install "montaj[demucs]"      # audio stem separation
montaj install all                # whisper + ui + rvm

Quick Start

# Headless — agent edits, renders, done
montaj run ./clips --prompt "tight cuts, remove filler, 9:16"

# With UI — watch the agent work live, tweak the result
montaj serve

# AI video generation — agent creates from a text prompt via Kling
montaj serve   # then create an ai_video project in the UI

What's Inside

steps/              Step executables + JSON schemas (probe, trim, transcribe, generate, etc.)
workflows/          Editing plans (clean_cut, overlays, ai_video, lyrics_video, etc.)
skills/             Agent skill contracts (onboarding, edit-session, ai-video-plan, ai-video-generate, etc.)
connectors/         API connectors (Kling, Gemini, OpenAI)

render/             React + Puppeteer + ffmpeg render engine
serve/              Local HTTP + SSE server (montaj serve)
ui/                 Browser UI (Vite + React + Tailwind)
docs/               Architecture, CLI reference, UI design, schemas

How It Works

1. Upload clips + write an editing prompt (or describe an AI video)
2. montaj creates project.json [pending]
3. Agent picks it up, reads the workflow, calls steps as tools
4. Agent writes project.json as it works → UI updates live via SSE
5. Agent marks project [draft]
6. Human reviews in browser (optional) → tweaks → marks [final]
7. Render engine → final MP4
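
The three bracketed states in the steps above behave like a small state machine. A minimal sketch, assuming transitions only move forward (pending, then draft, then final); the source does not say whether Montaj allows moving backwards, and the names `ProjectState` and `advance` are illustrative, not part of Montaj:

```python
from enum import Enum


class ProjectState(str, Enum):
    PENDING = "pending"  # created by montaj serve / montaj run
    DRAFT = "draft"      # written by the agent
    FINAL = "final"      # marked by a human, ready to render


# Assumed forward-only transitions, following the numbered steps.
ALLOWED = {
    ProjectState.PENDING: {ProjectState.DRAFT},
    ProjectState.DRAFT: {ProjectState.FINAL},
    ProjectState.FINAL: set(),
}


def advance(current: ProjectState, target: ProjectState) -> ProjectState:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(
            f"cannot move project from {current.value} to {target.value}"
        )
    return target
```

With this sketch, `advance(ProjectState.PENDING, ProjectState.FINAL)` raises, which mirrors the idea that the agent produces a draft before anything is rendered.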

CLI

Every operation is available from the terminal. montaj is the full command; mtj is the short alias.

montaj run ./clips --prompt "tight cuts, upbeat pacing"
montaj serve
montaj render
montaj fetch https://youtube.com/watch?v=...

See the CLI Reference for the full documentation.
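
If a harness drives Montaj from Python rather than a shell, the `montaj run` invocation shown above can be wrapped with `subprocess`. This is a sketch, not an official Montaj API: the helper names `build_run_command` and `run_headless` are made up for illustration, and only the `run` subcommand and `--prompt` flag shown above are used.

```python
import shutil
import subprocess


def build_run_command(clips_dir: str, prompt: str, executable: str = "montaj") -> list[str]:
    """Build the argv list for `montaj run <clips_dir> --prompt <prompt>`."""
    return [executable, "run", clips_dir, "--prompt", prompt]


def run_headless(clips_dir: str, prompt: str) -> subprocess.CompletedProcess:
    """Run a headless edit; raises if montaj is missing or exits nonzero."""
    if shutil.which("montaj") is None:
        raise FileNotFoundError("montaj is not on PATH; install it first")
    # Passing a list (no shell=True) keeps commas and quotes in the prompt intact.
    return subprocess.run(build_run_command(clips_dir, prompt), check=True)
```

Passing `executable="mtj"` to `build_run_command` uses the short alias instead.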

Steps & Workflows

Steps are individual editing operations — Python executables with JSON schemas, callable as agent tools, CLI commands, or API calls. Native steps ship with Montaj; custom steps are any executable that follows the output convention.

Category   Steps
Inspect    probe, snapshot, analyze_media
Clean      waveform_trim, rm_fillers, rm_nonspeech
Edit       materialize_cut, resize, extract_audio, crop_spec
Enrich     transcribe, caption, normalize, lyrics_sync, lyrics_render
Generate   kling_generate, generate_image, eval_scene
VFX        remove_bg, stem_separation
Acquire    fetch (downloads from any URL via yt-dlp)

Workflows are suggested editing plans — which steps to use and with what default params. The agent reads the plan, reads the prompt, and decides the actual execution.

Workflow        Description
clean_cut       Trim, remove filler, clean audio
overlays        Add animated overlays and titles
ai_video        Generate video from text via Kling + storyboard
lyrics_video    Sync lyrics to audio with animated captions
animations      Custom JSX animation compositions
explainer       Educational/explainer video style
floating_head   Speaker overlay on background footage
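
Because a workflow only suggests steps and default params, the agent's job amounts to resolving a plan: keep, skip, or re-parameterize each suggested step. A sketch of that idea follows. The plan shape, the step list chosen for clean_cut, and the parameter names `threshold_db` and `target_lufs` are all assumptions for illustration; the real workflow schema is in the Montaj docs.

```python
from copy import deepcopy

# Hypothetical plan shape. Step names come from the steps table;
# the composition and the params are invented for this example.
CLEAN_CUT_PLAN = [
    {"step": "waveform_trim", "params": {"threshold_db": -35}},
    {"step": "rm_fillers", "params": {}},
    {"step": "normalize", "params": {"target_lufs": -14}},
]


def resolve_plan(plan, overrides=None, skip=()):
    """Apply per-step param overrides and drop skipped steps.

    The input plan is never mutated, so the suggested defaults stay intact.
    """
    overrides = overrides or {}
    resolved = []
    for entry in plan:
        if entry["step"] in skip:
            continue
        merged = deepcopy(entry)
        merged["params"].update(overrides.get(entry["step"], {}))
        resolved.append(merged)
    return resolved
```

For example, `resolve_plan(CLEAN_CUT_PLAN, overrides={"waveform_trim": {"threshold_db": -40}}, skip=("normalize",))` keeps two steps and tightens the trim threshold.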

Custom steps and workflows are discovered automatically — no registration needed. See the Steps Reference and Core Concepts for details.
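
A custom step could look roughly like the script below. The output convention itself is not spelled out here, so this sketch assumes one: a single JSON object on stdout on success, and a JSON error on stderr with a nonzero exit code on failure. The step name `file_info` and the `--input` flag are hypothetical; check the Steps Reference for the actual convention before relying on any of this.

```python
import argparse
import json
import os
import sys


def describe_file(path: str) -> dict:
    """Collect basic facts about an input file."""
    return {
        "path": path,
        "size_bytes": os.path.getsize(path),
        "extension": os.path.splitext(path)[1].lower(),
    }


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description="Hypothetical custom step: file_info")
    parser.add_argument("--input", required=True, help="path to a media file")
    args = parser.parse_args(argv)
    try:
        result = describe_file(args.input)
    except OSError as exc:
        # Assumed convention: errors go to stderr, exit code is nonzero.
        print(json.dumps({"error": str(exc)}), file=sys.stderr)
        return 1
    # Assumed convention: exactly one JSON object on stdout.
    print(json.dumps(result))
    return 0

# When saved as an executable script, finish with: sys.exit(main())
```

Keeping the work in `describe_file` and the I/O in `main` makes the step easy to call both as a CLI command and as an imported function.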

Skills

Skills are agent-readable contracts that teach the agent how to approach a specific editing task. Each skill describes the goal, the steps to use, the parameter choices, and the quality criteria.

Available skills: onboarding, edit-session, ai-video-plan, ai-video-generate, eval-scenes, overlay, write-overlay, animation-sections, lyrics-video, style-profile, serve, parallel, select-takes, waveform-silence, camera-vocabulary, workflow-builder, mcp.

Connectors

Connectors wrap external service APIs. Install them with pip install "montaj[connectors]".

Connector   Used for
Kling       AI video generation (v3-omni, video-o1)
Gemini      Media analysis, scene evaluation, image generation
OpenAI      Image generation, analysis

Render Engine

React + Puppeteer + ffmpeg. Reads project.json [final], renders captions and overlays frame-by-frame via headless Chrome, composites with source footage via ffmpeg → final MP4.
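
Frame-by-frame rendering means each output frame has to know which caption, if any, is on screen at its timestamp. The sketch below shows only that timing logic in plain Python. It is not Montaj's render code (which uses React and headless Chrome), and the `Caption` fields are assumptions rather than the project.json schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Caption:
    start: float  # seconds, inclusive
    end: float    # seconds, exclusive
    text: str


def caption_at(captions: list[Caption], t: float) -> Optional[str]:
    """Return the text of the first caption covering time t, or None."""
    for c in captions:
        if c.start <= t < c.end:
            return c.text
    return None


def frame_captions(captions: list[Caption], duration: float, fps: int) -> list[Optional[str]]:
    """Map every frame index to the caption visible at that frame's timestamp."""
    total_frames = round(duration * fps)
    return [caption_at(captions, i / fps) for i in range(total_frames)]
```

A renderer built this way would draw one overlay image per frame and hand the sequence to ffmpeg for compositing with the source footage.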

See the Render Engine docs for the full breakdown.

UI

Optional browser interface. Upload → watch the agent work live → review → render.

montaj serve   # http://localhost:3000

  • Editor: timeline, preview player, caption editor, overlay editor
  • Storyboard: AI video scene planning with image/style references
  • Workflows: n8n-style node graph for building editing plans
  • Overlays: live animated preview of custom JSX overlays
  • Profiles: view creator style profiles (pacing, color palette, editorial direction)

The UI is a layer on top of the CLI, not a separate system. Every action maps to a CLI command.

Project JSON

The single format that flows through the entire pipeline. One file, three states:

State     Who writes                  Contents
pending   montaj serve / montaj run   Clip paths, prompt, workflow name
draft     Agent                       Trim points, ordering, captions, overlays (complete edit)
final     Human (via UI)              Reviewed and tweaked, ready to render

For AI video projects, the storyboard (scenes, image refs, style refs) lives inside the same project.json.

See the Core Concepts page and docs/schemas/project.md for the full schema.
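
To make the pending state concrete, here is a sketch of building and saving such a document. The key names (`state`, `clips`, `prompt`, `workflow`) are guesses based on the Contents column above, not the real schema, which is defined in docs/schemas/project.md. The write-then-rename step is a design choice of this example, added so a reader of the file never sees a half-written version.

```python
import json
import os


def make_pending_project(clips, prompt, workflow):
    """Build a minimal pending project document (key names are assumptions)."""
    return {
        "state": "pending",
        "clips": list(clips),
        "prompt": prompt,
        "workflow": workflow,
    }


def save_project(project, path="project.json"):
    """Write the project to a temp file, then rename it into place."""
    tmp = f"{path}.tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(project, fh, indent=2)
    # os.replace is atomic when tmp and path are on the same filesystem.
    os.replace(tmp, path)
```

An agent moving the project to draft would load the file, add its edit decisions, and save it the same way.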

Docs

Full documentation is at docs.montaj.ag. Contributor references (architecture, CLI reference, UI design, schemas) live in docs/ in the repository.

License

MIT — do whatever you want. See LICENSE.
