Media processing toolkit for presentation localization

Project description

Montaigne

A Python toolkit for presentation animation. Extract slides, translate visuals, generate voiceovers, and create videos—powered by Google Gemini AI, ElevenLabs, and local TTS.

Features

PDF Extraction: Convert PDF presentations to high-quality images. Configurable DPI settings (150-300+) with PNG or JPG output formats.
Script Generation: Generate professional voiceover scripts using a two-pass AI approach. Holistic context analysis, narrative arc awareness, and production notes with pronunciation guides.
Image Translation: Translate text in images to any target language. Context-aware translations powered by Gemini.
Voice Synthesis: Generate natural voiceover audio from scripts. Three providers: Gemini TTS (cloud), ElevenLabs (premium voices), or Coqui XTTS-v2 (local, no API key required).
Video Generation: Combine translated slides and voiceover audio into polished videos. Configurable resolution up to 1920x1080.
PowerPoint Export: Create PPTX presentations from PDF or images. Optionally add voiceover scripts as speaker notes for each slide.
Cloud Deployment: Offload video generation to Google Cloud Run. Upload PDFs, process in the cloud, and download results with secure signed URLs.
Model Configuration: Customize Gemini models for each operation. Use --model flags to switch between flash and pro models based on your needs.
Video Annotation: Frame-accurate video and audio annotation tool. Add timestamps, export to WebVTT/SRT formats for captions. Waveform visualization with click-to-seek.
Web Editor: Streamlit-based slide editor for managing slides and voiceover scripts.

Installation

Using pip

```
pip install montaigne
```

With optional dependencies

```
# Install with web editor support
pip install "montaigne[edit]"

# Install with annotation tool support
pip install "montaigne[annotate]"

# Install all optional dependencies
pip install "montaigne[all]"
```

Using uv

```
uv pip install montaigne
```

Using uvx (no installation required)

```
uvx --from montaigne essai setup
uvx --from montaigne essai script --input presentation.pdf
```

Setup

Get a Gemini API key from Google AI Studio
Create a .env file:
```
GEMINI_API_KEY=your-api-key
```
Verify setup:
```
essai setup
```
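If `essai setup` reports a missing key, it can help to confirm what is actually in the `.env` file. The following is a minimal standard-library sketch, not the check that `essai setup` performs; the helper names `read_env_file` and `has_gemini_key` are hypothetical.

```python
import os
from pathlib import Path


def read_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_gemini_key(path: str = ".env") -> bool:
    """True if GEMINI_API_KEY is set in the environment or in the .env file."""
    if os.environ.get("GEMINI_API_KEY"):
        return True
    if not Path(path).is_file():
        return False
    return bool(read_env_file(path).get("GEMINI_API_KEY"))


if __name__ == "__main__":
    print("GEMINI_API_KEY found" if has_gemini_key() else "GEMINI_API_KEY missing")
```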

Usage

Extract PDF to Images

```
essai pdf presentation.pdf
essai pdf presentation.pdf --dpi 200 --format jpg
```
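To get a feel for what a `--dpi` value means for output size: PDF page dimensions are measured in points (1/72 inch), so a renderer conventionally scales by `dpi / 72`. The sketch below assumes that convention applies here; the `pixel_size` helper and the 960 x 540 point page are illustrative, not taken from montaigne.

```python
def pixel_size(width_pt: float, height_pt: float, dpi: int) -> tuple[int, int]:
    """Convert a page size in PDF points (1/72 inch) to pixels at a given DPI."""
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


if __name__ == "__main__":
    # A hypothetical 16:9 slide of 960 x 540 points.
    for dpi in (150, 200, 300):
        print(dpi, pixel_size(960, 540, dpi))
```

Under this assumption, 150 DPI already exceeds 1920x1080 for such a page, so higher values mainly matter when slides will be cropped or zoomed.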

Generate Voiceover Script from Slides

```
essai script --input presentation.pdf
essai script --input slides_images/ --context "AI workshop"
essai script --input presentation.pdf --output custom_script.md
essai script --input presentation.pdf --model gemini-2.5-flash
```

Options:

--input, -i: PDF file or folder of slide images
--output, -o: Output markdown file path
--context, -c: Additional context to guide script generation
--model, -m: Gemini model to use (default: gemini-3-pro-preview)

Generate Audio from Script

```
essai audio --script voiceover.md
essai audio --script voiceover.md --voice Kore
essai audio --script voiceover.md --model gemini-2.5-flash-preview-tts
```

TTS Providers:

Provider	Description	Installation
`gemini`	Google Gemini TTS API (default)	Included
`elevenlabs`	ElevenLabs TTS API	Included
`coqui`	Local Coqui XTTS-v2 (no API key)	`pip install "montaigne[coqui]"`

Gemini voices: Puck, Charon, Kore, Fenrir, Aoede, Orus

Local TTS with Coqui:

```
# Install Coqui dependencies
pip install "montaigne[coqui]"

# Generate audio locally (no API key required)
essai audio --script voiceover.md --provider coqui
essai audio --script voiceover.md --provider coqui --voice male
essai audio --list-voices --provider coqui
```

Coqui voices: female, male, neutral

Note: The first run downloads the XTTS-v2 model (~1.5 GB) and requires accepting the CPML license.

Options:

--script, -s: Path to voiceover markdown script
--provider, -p: TTS provider (gemini, elevenlabs, coqui)
--voice, -v: TTS voice to use (default: Orus for Gemini, female for Coqui)
--model, -m: Gemini TTS model (default: gemini-2.5-pro-preview-tts)

Translate Images

```
essai translate --input slides/
essai translate --input image.png --lang Spanish
essai translate --input slides/ --model gemini-2.0-flash-exp
```

Options:

--input, -i: Image file or folder of images
--lang, -l: Target language (default: French)
--model, -m: Gemini model (default: gemini-3-pro-image-preview)

Create PowerPoint from PDF or Images

```
essai ppt --input presentation.pdf
essai ppt --input slides/ --script voiceover.md
essai ppt --input presentation.pdf --keep-images
```

This creates a .pptx file with one slide per PDF page or image. If a voiceover script is provided, its narration is added to each slide as speaker notes.

Generate Video from Slides

```
essai video --pdf presentation.pdf
essai video --images slides/ --audio audio/
```
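For readers curious what combining a slide image with its audio track involves, here is a sketch of a plain ffmpeg invocation built from Python. It is an illustration under stated assumptions, not the command montaigne runs internally; the `still_image_video_command` helper and the file names are hypothetical, while the ffmpeg options themselves are standard.

```python
def still_image_video_command(image, audio, output, width=1920, height=1080):
    """Build an ffmpeg argument list that shows one image for the length of one audio file."""
    # Scale to fit inside the frame, then pad to the exact size with centered bars.
    video_filter = (
        f"scale={width}:{height}:force_original_aspect_ratio=decrease,"
        f"pad={width}:{height}:(ow-iw)/2:(oh-ih)/2"
    )
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", image,  # repeat the still image as video frames
        "-i", audio,
        "-vf", video_filter,
        "-c:v", "libx264", "-tune", "stillimage", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-shortest",  # stop encoding when the audio ends
        output,
    ]


if __name__ == "__main__":
    print(" ".join(still_image_video_command("slide_01.png", "slide_01.wav", "slide_01.mp4")))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute it, provided ffmpeg is on the PATH.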

Full Localization Pipeline

```
essai localize --pdf presentation.pdf --script voiceover.md --lang French
```

This will:

Extract PDF pages to images
Translate all images to the target language
Generate audio for all slides
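The three steps listed above roughly correspond to stand-alone commands documented earlier. The sketch below only prints those commands; it assumes the pipeline behaves like running them in sequence, which the documentation does not state explicitly. The `images_dir` argument is hypothetical, since the folder that `essai pdf` writes to is not specified here.

```python
import shlex


def localize_steps(pdf: str, script: str, lang: str, images_dir: str) -> list[list[str]]:
    """Return the stand-alone commands that approximate the localization steps."""
    return [
        ["essai", "pdf", pdf],                                          # extract pages to images
        ["essai", "translate", "--input", images_dir, "--lang", lang],  # translate the images
        ["essai", "audio", "--script", script],                         # generate audio per slide
    ]


if __name__ == "__main__":
    # "slides_images/" is an assumed folder name, not a documented default.
    for step in localize_steps("presentation.pdf", "voiceover.md", "French", "slides_images/"):
        print(shlex.join(step))
```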

Video/Audio Annotation Tool

Launch an interactive web UI for annotating videos or audio files with frame-accurate timestamps:

```
# Install annotation dependencies first
pip install "montaigne[annotate]"

# Launch annotation UI
essai annotate video.mp4
essai annotate audio.wav
essai annotate                        # Auto-detect media in current dir
essai annotate video.mp4 --network    # Make accessible on local network

# Export annotations
essai annotate video.mp4 --export srt   # Export to SRT (Premiere, DaVinci)
essai annotate video.mp4 --export vtt   # Export to WebVTT (browsers)
essai annotate video.mp4 --export json  # Export to JSON
```

Keyboard shortcuts:

Key	Action
Space	Play/Pause
I	Set In point for range
O	Set Out point for range
[ ]	Step frame backward/forward
Ctrl+Enter	Submit annotation
Escape	Clear range / exit input

Features:

Frame-accurate timing using requestVideoFrameCallback API
Waveform visualization with click-to-seek
Light/dark theme toggle
Local-first SQLite storage (zero-latency)
Export to WebVTT, SRT, JSON formats
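The SRT and WebVTT formats differ mainly in their timestamp separator (`,` versus `.`), in SRT's numbered cues, and in WebVTT's `WEBVTT` header line. The following sketch shows that difference; it is not montaigne's exporter, and the `(start_seconds, end_seconds, text)` cue tuple is a hypothetical stand-in for however annotations are stored.

```python
def format_timestamp(seconds: float, fmt: str = "srt") -> str:
    """Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (WebVTT)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    separator = "," if fmt == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"


def to_srt(cues) -> str:
    """Render (start_seconds, end_seconds, text) tuples as numbered SRT cues."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(cues) -> str:
    """Render the same tuples as a WebVTT document."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines += [f"{format_timestamp(start, 'vtt')} --> {format_timestamp(end, 'vtt')}", text, ""]
    return "\n".join(lines)


if __name__ == "__main__":
    cues = [(0.0, 2.5, "Opening title"), (2.5, 7.04, "First point")]
    print(to_srt(cues))
    print(to_vtt(cues))
```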

Web Editor

Launch a Streamlit-based web interface for managing slides and scripts:

```
# Install editor dependencies first
pip install "montaigne[edit]"

# Launch the editor
essai edit
essai edit --pdf presentation.pdf --script voiceover.md
```

Model Configuration

Each AI command supports a --model / -m flag to override the default Gemini model:

Command	Default Model	Purpose
`essai script`	`gemini-3-pro-preview`	Script generation
`essai audio`	`gemini-2.5-pro-preview-tts`	Text-to-speech
`essai translate`	`gemini-3-pro-image-preview`	Image translation

List available models:

```
essai models
```
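When scripting several `essai` calls, the defaults from the table can be kept in one place so that an override is applied only when needed. This is a small illustrative sketch; `DEFAULT_MODELS` copies the table above, and the `effective_model` helper is hypothetical rather than part of montaigne.

```python
from typing import Optional

# Defaults copied from the table above.
DEFAULT_MODELS = {
    "script": "gemini-3-pro-preview",
    "audio": "gemini-2.5-pro-preview-tts",
    "translate": "gemini-3-pro-image-preview",
}


def effective_model(subcommand: str, override: Optional[str] = None) -> str:
    """Return the override if one is given, otherwise the documented default."""
    return override or DEFAULT_MODELS[subcommand]


if __name__ == "__main__":
    print(effective_model("script"))
    print(effective_model("script", "gemini-2.5-flash"))
```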

Voiceover Script Format

Scripts should follow this markdown format:

```
## SLIDE 1: Title
**[Duration: ~45 seconds]**

Your narration text for slide 1 goes here.

---

## SLIDE 2: Next Topic
**[Duration: ~60 seconds]**

Narration for slide 2.
```
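To check that a hand-written script follows this layout before generating audio, a small parser can help. The sketch below reads the format shown above with regular expressions; it is an independent illustration, not montaigne's own parser, and the `parse_script` name and the returned dictionary keys are hypothetical.

```python
import re

SLIDE_HEADING = re.compile(r"^##\s+SLIDE\s+(\d+):\s*(.*)$")
DURATION = re.compile(r"^\*\*\[Duration:\s*~?(\d+)\s*seconds?\]\*\*$")


def parse_script(markdown: str) -> list[dict]:
    """Split a voiceover script into slides with number, title, duration, and narration."""
    slides = []
    current = None
    for raw_line in markdown.splitlines():
        line = raw_line.strip()
        heading = SLIDE_HEADING.match(line)
        if heading:
            current = {
                "number": int(heading.group(1)),
                "title": heading.group(2),
                "duration_s": None,
                "narration": [],
            }
            slides.append(current)
            continue
        # Ignore anything before the first slide and the --- separators.
        if current is None or line == "---":
            continue
        duration = DURATION.match(line)
        if duration:
            current["duration_s"] = int(duration.group(1))
        elif line:
            current["narration"].append(line)
    for slide in slides:
        slide["narration"] = " ".join(slide["narration"])
    return slides
```

A slide whose `duration_s` is `None` or whose `narration` is empty is a sign that the heading or duration line deviates from the expected pattern.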

Demo

See the demo/hamlet/ folder for a complete example with:

Sample PDF presentation
Voiceover script
Image asset

```
cd demo/hamlet
essai localize --lang French
```

Requirements

Python 3.10+
Google Gemini API key
ffmpeg (for video generation)
Dependencies: google-genai, python-dotenv, pymupdf, python-pptx, Pillow

Optional Dependencies

edit: streamlit - Web editor interface
annotate: flask - Video/audio annotation tool
coqui: TTS, torch, torchaudio - Local TTS with Coqui XTTS-v2 (no API key required)
cloud: fastapi, uvicorn, google-cloud-storage - Cloud API deployment
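To see which extras are usable in the current environment without importing heavy packages, the standard library can look up module specs. This is a sketch under assumptions: the import names in `EXTRAS` are inferred from the package names listed above and may differ from what montaigne actually imports, and the helper names are hypothetical.

```python
from importlib.util import find_spec

# Import names assumed for each extra (inferred from the list above).
EXTRAS = {
    "edit": ["streamlit"],
    "annotate": ["flask"],
    "coqui": ["TTS", "torch", "torchaudio"],
    "cloud": ["fastapi", "uvicorn", "google.cloud.storage"],
}


def is_importable(module_name: str) -> bool:
    """True if the module can be located without fully importing it."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (for example "google") is missing.
        return False


def missing_modules(extra: str) -> list[str]:
    """List the assumed modules of an extra that are not installed."""
    return [name for name in EXTRAS[extra] if not is_importable(name)]


if __name__ == "__main__":
    for extra in EXTRAS:
        missing = missing_modules(extra)
        print(f"{extra}: {'ok' if not missing else 'missing ' + ', '.join(missing)}")
```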

Project details

Maintainers

slevin48

Release history

Version	Release date
1.4.0 (this version)	Mar 2, 2026
1.3.0	Mar 2, 2026
1.2.0	Feb 28, 2026
1.1.2	Feb 28, 2026
1.1.1	Jan 25, 2026
1.1.0	Jan 24, 2026
1.0.1	Jan 20, 2026
1.0.0	Jan 19, 2026
0.9.9	Jan 19, 2026
0.9.8	Jan 19, 2026
0.9.7	Jan 19, 2026
0.9.6	Jan 18, 2026
0.9.5	Jan 18, 2026
0.9.4	Jan 18, 2026
0.9.3	Jan 18, 2026
0.9.2	Jan 18, 2026
0.9.1	Jan 17, 2026
0.9.0	Jan 17, 2026
0.8.9	Jan 17, 2026
0.8.4	Jan 17, 2026
0.8.3	Jan 17, 2026
0.8.2	Jan 17, 2026
0.8.1	Jan 17, 2026
0.8.0	Jan 17, 2026
0.7.0	Jan 12, 2026
0.6.0	Jan 12, 2026
0.5.1	Jan 11, 2026
0.4.0	Jan 9, 2026
0.3.0	Jan 7, 2026
0.2.1	Jan 7, 2026
0.2.0	Jan 7, 2026

Download files

Source Distribution

montaigne-1.4.0.tar.gz (99.0 kB)

Uploaded Mar 2, 2026 Source

Built Distribution

montaigne-1.4.0-py3-none-any.whl (93.2 kB)

Uploaded Mar 2, 2026 Python 3

File details

Details for the file montaigne-1.4.0.tar.gz.

File metadata

Download URL: montaigne-1.4.0.tar.gz
Upload date: Mar 2, 2026
Size: 99.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for montaigne-1.4.0.tar.gz
Algorithm	Hash digest
SHA256	`e5034b01e4822aa00a1a476c12ccde4cd197dd8f65f51cc30f03cb236847bbde`
MD5	`e8b978bce1f1bf30b6dc0cff9fb0f727`
BLAKE2b-256	`44aa8cf54fdfd890a508ee291de500e10f689ac63148b5c6f966f968e0132bc5`
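A downloaded file can be checked against the SHA256 digest in the table above with the standard library. The sketch below streams the file in chunks so large archives do not need to fit in memory; the helper names are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the expected value is the SHA256 row above, for example `matches_published_hash("montaigne-1.4.0.tar.gz", "e5034b01e4822aa00a1a476c12ccde4cd197dd8f65f51cc30f03cb236847bbde")`.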

Provenance

The following attestation bundles were made for montaigne-1.4.0.tar.gz:

Publisher: python-publish.yml on yanndebray/montaigne

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: montaigne-1.4.0.tar.gz
- Subject digest: e5034b01e4822aa00a1a476c12ccde4cd197dd8f65f51cc30f03cb236847bbde
- Sigstore transparency entry: 1011014908
- Sigstore integration time: Mar 2, 2026
Source repository:
- Permalink: yanndebray/montaigne@00ff1bde90a0c15d5ca325f6d436b4d138f748ee
- Branch / Tag: refs/tags/v1.4.0
- Owner: https://github.com/yanndebray
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@00ff1bde90a0c15d5ca325f6d436b4d138f748ee
- Trigger Event: release

File details

Details for the file montaigne-1.4.0-py3-none-any.whl.

File metadata

Download URL: montaigne-1.4.0-py3-none-any.whl
Upload date: Mar 2, 2026
Size: 93.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for montaigne-1.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`139af908ff77cef4de496a02c115bd3e4ea2400a1542f9c3cb106c82b6001bd0`
MD5	`37615bcee19837080cf95a3aa32a8313`
BLAKE2b-256	`b7683c84336e4d9ff4bc17cf01418ca7f109b76eb85bc880ca3452c5a4f17355`

Provenance

The following attestation bundles were made for montaigne-1.4.0-py3-none-any.whl:

Publisher: python-publish.yml on yanndebray/montaigne

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: montaigne-1.4.0-py3-none-any.whl
- Subject digest: 139af908ff77cef4de496a02c115bd3e4ea2400a1542f9c3cb106c82b6001bd0
- Sigstore transparency entry: 1011014965
- Sigstore integration time: Mar 2, 2026
Source repository:
- Permalink: yanndebray/montaigne@00ff1bde90a0c15d5ca325f6d436b4d138f748ee
- Branch / Tag: refs/tags/v1.4.0
- Owner: https://github.com/yanndebray
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@00ff1bde90a0c15d5ca325f6d436b4d138f748ee
- Trigger Event: release
