Agent-driven video creation toolkit: neural TTS voiceover, highlight cutting, timeline composition, and NLE export.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

These details have not been verified by PyPI

Project description

videopilot

Agent-driven video creation toolkit. An MCP server giving LLMs 20 tools to author voiceover, cut highlights, compose timelines, and render finished MP4s — plus a CLI for manual / scripted runs when you don't want an agent in the loop.

videopilot is an MCP (Model Context Protocol) server that lets a calling LLM — driven by GitHub Copilot CLI or any other MCP-aware client — turn raw screen recordings into narrated, edited MP4s. The server exposes 20 tools covering the full pipeline: neural TTS voiceover, faster-whisper transcription, silence detection, clip cutting, timeline composition with slides and audio ducking, and NLE export to EDL / FCPXML.

The MCP server is the primary interface. A standalone videopilot CLI ships alongside it for manual or scripted runs — useful for one-off stages, CI, or workflows without an agent in the loop.

source.mp4  ->  script.json  ->  tts  ->  cut-plan.json  ->  cut  ->  compose-plan.json  ->  compose  ->  final.mp4
                                                                                                          + EDL / FCPXML / replay script

Highlights

Capability	Engine
Neural voiceover, 400+ voices, 100+ locales	Microsoft Edge TTS (free, no key)
Premium neural voices	Azure Speech (optional, with key)
Word-level transcription	faster-whisper (local)
Silence trimming, scene cuts	ffmpeg
Title slides, picture-in-picture, audio ducking, music underlay	ffmpeg filter graph composer
Ken Burns motion on still images (zoom in/out, pan)	ffmpeg `zoompan` over a Lanczos-oversampled source (subpixel-smooth)
MP4 render at any resolution / fps	ffmpeg
Hand-off to Premiere / Resolve / Final Cut	EDL (CMX 3600) + FCPXML export
Replayable render scripts	PowerShell / bash export
Agent integration	MCP server with 20 tools — see MCP tools
Authoring contract	JSON state files documented in `AGENT.md` (incremental authoring, schema introspection, idempotency probes)

Install

From PyPI (recommended)

pip install --user videopilot

Two console scripts are installed:

Script	Purpose
`videopilot-mcp`	The MCP server (stdio transport). Wire this into your MCP client.
`videopilot`	The manual CLI. Useful for one-off stages and CI.

Verify the install:

videopilot doctor

You also need ffmpeg on PATH:

OS	Command
Windows	`winget install --id Gyan.FFmpeg -e`
macOS	`brew install ffmpeg`
Debian / Ubuntu	`sudo apt install ffmpeg`
Fedora	`sudo dnf install ffmpeg`
Arch	`sudo pacman -S ffmpeg`

videopilot doctor exits 0 when ffmpeg, ffprobe, Python deps, and optional Azure keys are all in order; otherwise it prints exactly what's missing. The same check is also exposed as the doctor MCP tool.

From source (development)

git clone https://github.com/mbahgatTech/videopilot.git
cd videopilot
pip install --user -e .

Via the Agency plugin

If you use Copilot or Claude inside Microsoft and have access to the Agency Playground marketplace, install the videopilot plugin and ask:

set up videopilot

The plugin's init skill runs the same installer logic for you.

Connect to an MCP client

videopilot-mcp runs the MCP server over stdio. The verified config for the GitHub Copilot CLI (~/.copilot/mcp-config.json) is:

{
  "mcpServers": {
    "videopilot": {
      "type": "stdio",
      "command": "uvx",
      "args": ["--from", "videopilot", "videopilot-mcp"],
      "tools": ["*"]
    }
  }
}

uvx pulls the latest videopilot from PyPI into an ephemeral environment and runs the videopilot-mcp entry point — no global install required. If you already have videopilot installed globally (pip install --user videopilot) you can instead use "command": "videopilot-mcp" with no args.

Any MCP-aware client that supports stdio servers can run videopilot-mcp the same way — consult your client's docs for the exact config-file location and schema.

After your client restarts, the agent can call any of the 20 videopilot.* tools below.

MCP tools

Tool	Purpose
`doctor`	Verify ffmpeg, ffprobe, Python deps, optional Azure keys.
`voices`	List available neural TTS voices (Edge TTS or Azure).
`list_projects`	List all projects under `projects/`.
`project_status`	Pipeline status for one project: which JSON state files exist, which stages have run.
`init`	Create a new project, optionally with a first source video.
`import_source`	Add another source to an existing project.
`read_state`	Read a JSON state file (`project` / `script` / `cut-plan` / `compose-plan`).
`write_state`	Write a JSON state file with schema validation.
`tts`	Synthesize voiceover MP3s from `script.json` (async, emits progress notifications).
`transcribe`	Run faster-whisper; returns word-level segments and writes SRT.
`silence`	Emit a cut-plan candidate that strips silence.
`cut`	Cut clips per `cut-plan.json`.
`compose`	Render final MP4 per `compose-plan.json`.
`export`	Emit NLE projects (EDL, FCPXML) and replayable ffmpeg script.
`schema`	Return JSON schemas (agent-facing) for every state file.
`add_vo_segment`	Append or upsert a voiceover segment in `script.json`.
`add_slide`	Append a slide entry (with optional body text) to `compose-plan.json`.
`set_compose_output`	Set compose output resolution / fps / codec.
`preview_slide`	Render a single slide as a PNG for fast preview without running `compose`.
`is_up_to_date`	Probe whether a stage's outputs are current for its inputs (idempotency check).

The contract — what each tool reads and writes, the JSON state-file schemas, and the recommended call order — is documented in AGENT.md. Calling agents should read AGENT.md before issuing tool calls.

CLI reference (manual mode)

Each pipeline stage is also exposed as a videopilot CLI subcommand. Use it when you want to run a step by hand, drop the agent, or invoke from CI.

Command	Purpose
`videopilot doctor`	Verify ffmpeg, ffprobe, Python deps, optional Azure keys.
`videopilot voices [--locale en-US]`	List available TTS voices.
`videopilot init <slug> [--source PATH]`	Create a new project with optional first source.
`videopilot import <slug> <path>`	Add another source to an existing project.
`videopilot tts <slug> [--force]`	Synthesize voiceover MP3s from `script.json`.
`videopilot transcribe <slug> <source-id>`	Run faster-whisper; emits word-level JSON + SRT.
`videopilot silence <slug> <source-id>`	Emit a cut-plan candidate that strips silence.
`videopilot cut <slug> [--force] [--reencode]`	Cut clips per `cut-plan.json`.
`videopilot compose <slug>`	Render final MP4 per `compose-plan.json`.
`videopilot export <slug> [--edl] [--fcpxml] [--script]`	Emit NLE projects + replayable ffmpeg script.

Run videopilot <command> --help for per-command flags.

Manual quick start

# 1. Create a project with a source video
videopilot init demo --source "/path/to/raw-recording.mp4"

# 2. Hand-author projects/demo/script.json (one segment per beat of narration),
#    OR have your agent draft it from AGENT.md.

# 3. Synthesize the voiceover
videopilot tts demo

# 4. (Optional) transcribe to help pick highlights
videopilot transcribe demo raw1

# 5. Hand-author projects/demo/cut-plan.json (which spans to keep)

# 6. Cut clips from sources
videopilot cut demo

# 7. Hand-author projects/demo/compose-plan.json (timeline + slides + ducking)

# 8. Render the final video
videopilot compose demo

# 9. Optional: emit NLE projects + replay script
videopilot export demo --edl --fcpxml --script

Final output: projects/demo/out/final.mp4 plus optional final.edl, final.fcpxml, and render.ps1.

Project layout

videopilot/
- AGENT.md           <- contract for calling LLMs (start here if you're driving the tool)
- README.md          <- this file
- LICENSE            <- MIT
- pyproject.toml
- videopilot_mcp.py  <- MCP server (primary entry point; console-script: videopilot-mcp)
- videopilot.py      <- argparse router (CLI implementation)
- videopilot_cli.py  <- console-script shim for the CLI
- lib/               <- shared implementation modules
  - tts.py
  - transcribe.py
  - silence.py
  - cut.py
  - compose.py
  - export.py
  - ffmpeg_wrap.py
  - voices.py
  - init_cmd.py
  - doctor.py
- examples/          <- copyable starter JSON state files
- tests/             <- standalone test scripts (mcp_e2e.py, progress_smoke.py)
- projects/<slug>/   <- per-project workspace (one folder per video)
  - project.json
  - script.json
  - cut-plan.json
  - compose-plan.json
  - sources/
  - voice/
  - transcripts/
  - clips/
  - tmp/
  - out/

Configuration

Environment variable	Purpose
`AZURE_SPEECH_KEY`	Optional. Enables Azure Speech voices (premium neural TTS).
`AZURE_SPEECH_REGION`	Required when `AZURE_SPEECH_KEY` is set (e.g. `eastus`).

Edge TTS is the default and requires no configuration.

Development

git clone https://github.com/mbahgatTech/videopilot.git
cd videopilot
pip install --user -e ".[dev]"

# Build the package
python -m build

# Validate the dist
python -m twine check dist/*

# Local smoke test
videopilot doctor

# MCP server stdio + progress-notification smoke test
python tests/progress_smoke.py

# End-to-end MCP test (real ffmpeg + Edge TTS + full pipeline)
python tests/mcp_e2e.py

Releasing

Releases publish to PyPI automatically when a v* tag is pushed. The version is derived from the tag itself via setuptools_scm — there is no version = line in pyproject.toml and no version-bump commit is required.

The workflow uses PyPI Trusted Publishing (OIDC), so no API tokens are stored in the repo or in GitHub Secrets — PyPI verifies the GitHub OIDC token at publish time.

One-time setup (PyPI side, done once before the first release):

Sign in to https://pypi.org/.
Account settings → Publishing → Add a new pending publisher:
- PyPI project name: videopilot
- Owner: mbahgatTech
- Repository: videopilot
- Workflow filename: release.yml
- Environment name: pypi
On GitHub, repo Settings → Environments → New environment → pypi. Optionally add a required reviewer for an extra approval gate.

Cutting a release:

git tag v0.2.0
git push origin v0.2.0

That's it. The release workflow then:

Builds sdist + wheel (version derived from the tag)
Verifies the tag matches the setuptools_scm-derived version
Runs twine check
Publishes to PyPI via OIDC
Creates a GitHub Release with the sdist + wheel attached

License

MIT. See LICENSE.

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

mazenbahgat

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.1.5

May 16, 2026

0.1.4

May 16, 2026

0.1.3

May 16, 2026

0.1.2

May 16, 2026

0.1.1

May 16, 2026

0.1.0

May 16, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

videopilot-0.1.5.tar.gz (80.0 kB view details)

Uploaded May 16, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

videopilot-0.1.5-py3-none-any.whl (52.2 kB view details)

Uploaded May 16, 2026 Python 3

File details

Details for the file videopilot-0.1.5.tar.gz.

File metadata

Download URL: videopilot-0.1.5.tar.gz
Upload date: May 16, 2026
Size: 80.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for videopilot-0.1.5.tar.gz
Algorithm	Hash digest
SHA256	`950acfeee84ea836934c9310b1fff1392ec4f5cf3d3ad6c70acaf5e9c1a78866`
MD5	`0a09a730f67a1616bac03505a7b25ef9`
BLAKE2b-256	`28c9bd8ee05117abbfe49d4008c39fb6815f7e6b85bf488f72e1f849c5a06de5`

See more details on using hashes here.

Provenance

The following attestation bundles were made for videopilot-0.1.5.tar.gz:

Publisher: release.yml on mbahgatTech/videopilot

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: videopilot-0.1.5.tar.gz
- Subject digest: 950acfeee84ea836934c9310b1fff1392ec4f5cf3d3ad6c70acaf5e9c1a78866
- Sigstore transparency entry: 1554487910
- Sigstore integration time: May 16, 2026
Source repository:
- Permalink: mbahgatTech/videopilot@3d2ee9d3b6efc82409b1148a88d631eb8151e170
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/mbahgatTech
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@3d2ee9d3b6efc82409b1148a88d631eb8151e170
- Trigger Event: push

File details

Details for the file videopilot-0.1.5-py3-none-any.whl.

File metadata

Download URL: videopilot-0.1.5-py3-none-any.whl
Upload date: May 16, 2026
Size: 52.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for videopilot-0.1.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e05f7e94c19ddc5361070f3e9ba4eef1938793e396b72f770cdcd5bb37110d69`
MD5	`a3e7a78f649d86ee6e3aa85753825bd6`
BLAKE2b-256	`f0cfded633e373239d3ed6e0c60ceaa081bf3908fc9156aad17dffe6cddd2e6c`

See more details on using hashes here.

Provenance

The following attestation bundles were made for videopilot-0.1.5-py3-none-any.whl:

Publisher: release.yml on mbahgatTech/videopilot

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: videopilot-0.1.5-py3-none-any.whl
- Subject digest: e05f7e94c19ddc5361070f3e9ba4eef1938793e396b72f770cdcd5bb37110d69
- Sigstore transparency entry: 1554487929
- Sigstore integration time: May 16, 2026
Source repository:
- Permalink: mbahgatTech/videopilot@3d2ee9d3b6efc82409b1148a88d631eb8151e170
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/mbahgatTech
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@3d2ee9d3b6efc82409b1148a88d631eb8151e170
- Trigger Event: push

videopilot 0.1.5

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

videopilot

Highlights

Install

From PyPI (recommended)

From source (development)

Via the Agency plugin

Connect to an MCP client

MCP tools

CLI reference (manual mode)

Manual quick start

Project layout

Configuration

Development

Releasing

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance