
Sora Extend

AI-Planned, Scene-Exact Video Generation with Continuity using Sora 2

Originally built by Matt Shumer. CLI package conversion by Mason Hall.

Usage

uvx sora-extend generate "A tron motorcycle racing through NYC at night" --api-key <your-api-key>

Overview

Sora Extend enables you to generate extended videos (>12s) using Sora 2 by:

  1. AI Planning: Using an LLM (e.g., GPT-5) to intelligently plan N scene prompts from a base idea, optimized for continuity
  2. Chained Generation: Rendering each segment with Sora 2, passing the prior segment's final frame as input_reference for seamless transitions
  3. Automatic Concatenation: Combining all segments into a single MP4

This approach allows you to create videos much longer than Sora's standard 12-second limit while maintaining visual continuity across segments.

Installation

Using pip

pip install -e .

Using uv (recommended)

uv pip install -e .

Prerequisites

  • Python 3.12 or higher
  • OpenAI API key with access to:
    • Sora 2 (video generation)
    • A reasoning model for planning (e.g., gpt-5, with gpt-4o as a fallback)

Note: This tool also supports Echo API keys. If your API key starts with echo_, the tool will automatically route requests through Echo's API gateway.

Performance

This package is optimized for fast startup using lazy imports:

  • Initial import: ~1-2ms (heavy dependencies not loaded)
  • CLI startup: Fast, loads only what's needed
  • No breaking changes: All APIs work exactly the same

Heavy libraries (OpenCV, MoviePy, Typer, Rich) are only loaded when their functions are actually called, making programmatic usage extremely fast.
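The lazy-import pattern described above can be sketched as follows. This is a minimal illustration of the technique, not the package's actual internals; the `lazy` helper is hypothetical:

```python
import importlib
from functools import lru_cache

@lru_cache(maxsize=None)
def lazy(module_name: str):
    """Import a module only on first use, then reuse the cached module."""
    return importlib.import_module(module_name)

def extract_frames(video_path: str):
    # OpenCV is loaded here, on the first call, rather than at package
    # import time, so `import sora_extend` stays fast.
    cv2 = lazy("cv2")
    ...
```

Because the heavy import happens inside the function body, simply importing the package never pays the cost of loading OpenCV, MoviePy, Typer, or Rich.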

For more details and benchmarks, see PERFORMANCE.md.

Usage

Basic Command

sora-extend generate "Your video description here"

Full Example

export OPENAI_API_KEY="your-api-key-here"

sora-extend generate \
  "Gameplay footage of a game releasing in 2027, a car driving through a futuristic city" \
  --seconds 8 \
  --num-segments 3 \
  --output-dir ./my_video

This will create a 24-second video (3 segments × 8 seconds each).

Command-Line Options

sora-extend generate PROMPT [OPTIONS]

Arguments:
  PROMPT  Base prompt describing the video concept [required]

Options:
  -k, --api-key TEXT           OpenAI API key (or set OPENAI_API_KEY env var)
  -s, --seconds INTEGER        Duration of each segment in seconds (4, 8, or 12) [default: 8]
  -n, --num-segments INTEGER   Number of segments to generate [default: 2]
  --sora-model TEXT           Sora model to use (sora-2 or sora-2-pro) [default: sora-2]
  --planner-model TEXT        Planning model for generating prompts [default: gpt-5]
  --size TEXT                 Video size; must stay constant across all segments [default: 1280x720]
  -o, --output-dir PATH       Output directory for generated videos [default: sora_ai_planned_chain]
  --no-concatenate            Skip concatenating segments into final video
  -q, --quiet                 Suppress progress output
  --help                      Show this message and exit

Examples

Create a 16-second video (2 × 8s segments):

sora-extend generate "A drone flying over a cyberpunk city at night"

Create a 48-second video using 12-second segments:

sora-extend generate "A nature documentary about arctic foxes" \
  --seconds 12 \
  --num-segments 4

Generate segments without concatenating:

sora-extend generate "Product showcase for a new smartphone" \
  --seconds 8 \
  --num-segments 3 \
  --no-concatenate

Use Sora 2 Pro model:

sora-extend generate "High-quality cinematic scene" \
  --sora-model sora-2-pro \
  --seconds 12 \
  --num-segments 2

How It Works

1. AI Planning Phase

The planner model (default: GPT-5) receives your base prompt and generates detailed, scene-specific prompts for each segment. These prompts include:

  • Context sections: Guidance for the AI about what came before
  • Continuity instructions: Explicit directions to start from the previous segment's final frame
  • Visual consistency: Maintained lighting, style, and subject identity across segments

Detailed steps:

  • [1.1] Preparing planning request
  • [1.2] Calling the planner model (e.g., GPT-5)
  • [1.3] Parsing AI response
  • [1.4] Validating and formatting segments
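To illustrate the continuity structure the planner aims for, a per-segment prompt might be assembled like this. The `build_segment_prompt` helper is purely hypothetical, not the package's real planner code:

```python
def build_segment_prompt(base_prompt: str, scene: str, index: int, total: int) -> str:
    """Assemble one continuity-aware scene prompt (hypothetical sketch)."""
    parts = [f"Scene {index} of {total}: {scene}"]
    if index > 1:
        # Continuity instructions: every segment after the first starts
        # from the previous segment's final frame.
        parts.append(
            "Continue seamlessly from the provided reference image, which is "
            "the final frame of the previous segment. Keep lighting, visual "
            "style, and subject identity unchanged."
        )
    parts.append(f"Overall concept: {base_prompt}")
    return "\n".join(parts)
```

The first segment gets no continuity clause, since there is no prior frame to reference.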

2. Chained Video Generation

For each segment:

  1. Generate video using Sora 2 with the planned prompt
  2. Extract the final frame from the generated video
  3. Use that frame as the input_reference for the next segment

This creates smooth transitions between segments.
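The last-frame step can be sketched with OpenCV, following the segment_NN_last.jpg filename convention used in the output directory. This is a minimal sketch, and `extract_last_frame` is a hypothetical helper, not the package's exact API:

```python
from pathlib import Path

def last_frame_path(segment: Path) -> Path:
    """segment_01.mp4 -> segment_01_last.jpg (the reference-frame name)."""
    return segment.with_name(segment.stem + "_last.jpg")

def extract_last_frame(video_path: Path) -> Path:
    """Save a segment's final frame as a JPEG for the next input_reference."""
    import cv2  # heavy dependency, imported lazily

    cap = cv2.VideoCapture(str(video_path))
    last_index = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) - 1
    cap.set(cv2.CAP_PROP_POS_FRAMES, last_index)  # seek to the final frame
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError(f"could not read final frame of {video_path}")
    out = last_frame_path(video_path)
    cv2.imwrite(str(out), frame)
    return out
```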

Detailed steps (per segment N):

  • [2.N.1] Creating video generation job
  • [2.N.2] Job started confirmation
  • [2.N.3] Polling for completion
  • [2.N.4] Downloading video segment
  • [2.N.5] Video saved confirmation
  • [2.N.6] Extracting last frame for continuity
  • [2.N.7] Reference frame saved

3. Concatenation

All segments are concatenated into a single MP4 file using MoviePy, preserving the framerate and quality.

Detailed steps:

  • [3.1] Loading video segments
  • [3.2] Combining video segments
  • [3.3] Writing final video
  • [3.4] Cleaning up resources
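The concatenation step can be sketched with MoviePy as below. This assumes MoviePy 1.x's moviepy.editor import path, and the `ordered_segments` helper is hypothetical:

```python
from pathlib import Path

def ordered_segments(out_dir: Path) -> list[Path]:
    """Collect segment files in generation order (segment_01, segment_02, ...)."""
    return sorted(out_dir.glob("segment_*.mp4"))

def concatenate(segment_paths: list[Path], output_path: Path) -> Path:
    """Join the segments into one MP4."""
    from moviepy.editor import VideoFileClip, concatenate_videoclips

    clips = [VideoFileClip(str(p)) for p in segment_paths]
    final = concatenate_videoclips(clips)
    final.write_videofile(str(output_path), codec="libx264", audio_codec="aac")
    for clip in clips:  # release file handles
        clip.close()
    return output_path
```

Zero-padded names (segment_01, segment_02, ...) make the lexicographic sort match the generation order.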

Output Structure

sora_ai_planned_chain/
├── segment_01.mp4        # First video segment
├── segment_01_last.jpg   # Last frame of first segment
├── segment_02.mp4        # Second video segment
├── segment_02_last.jpg   # Last frame of second segment
├── ...
└── combined.mp4          # Final concatenated video

API Usage

You can also use Sora Extend as a Python library:

from openai import OpenAI
from pathlib import Path
from sora_extend import plan_prompts_with_ai, chain_generate_sora, concatenate_segments

# Initialize client (supports both OpenAI and Echo API keys)
api_key = "your-api-key"
if api_key.startswith("echo_"):
    client = OpenAI(api_key=api_key, base_url="https://echo.router.merit.systems")
else:
    client = OpenAI(api_key=api_key)

output_dir = Path("./output")
output_dir.mkdir(exist_ok=True)

# Plan segments
segments = plan_prompts_with_ai(
    base_prompt="A journey through space",
    seconds_per_segment=8,
    num_generations=3,
    planner_model="gpt-5",
    client=client,
)

# Generate videos
segment_paths = chain_generate_sora(
    segments=segments,
    size="1280x720",
    model="sora-2",
    api_key="your-api-key",
    out_dir=output_dir,
)

# Concatenate
final_video = concatenate_segments(segment_paths, output_dir / "final.mp4")
print(f"Video saved to: {final_video}")

Configuration

Environment Variables

  • OPENAI_API_KEY: Your OpenAI API key (required)
    • Supports both OpenAI keys and Echo keys (starting with echo_)
    • Echo keys automatically route through https://echo.router.merit.systems

Supported Video Sizes

  • 1280x720 (default)
  • 1920x1080
  • 1080x1920 (vertical)
  • Other sizes supported by Sora 2

Segment Durations

Recommended values: 4, 8, or 12 seconds

The total video duration will be: seconds_per_segment × num_segments
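For example, as a purely illustrative helper (not part of the package's API):

```python
VALID_SEGMENT_SECONDS = {4, 8, 12}  # durations accepted by --seconds

def total_duration(seconds_per_segment: int, num_segments: int) -> int:
    """Compute the final video length in seconds."""
    if seconds_per_segment not in VALID_SEGMENT_SECONDS:
        raise ValueError(f"seconds must be one of {sorted(VALID_SEGMENT_SECONDS)}")
    return seconds_per_segment * num_segments
```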

Tips for Best Results

  1. Clear Base Prompts: Be specific about the setting, style, and mood
  2. Consistent Style: Mention visual style in your base prompt (e.g., "cinematic", "documentary style")
  3. Subject Continuity: If featuring a character/object, describe them clearly
  4. Reasonable Segment Count: Start with 2-3 segments to test, then scale up
  5. Check Planner Output: Review the AI-planned prompts; they are printed during planning unless you pass --quiet

Troubleshooting

"Planner returned no text"

  • Try a different planner model: --planner-model gpt-4o

"Create video failed"

  • Check your API key has Sora 2 access
  • Verify your prompt doesn't violate content policies

Discontinuities between segments

  • Try shorter segments (e.g., 4s instead of 12s)
  • Make your base prompt more specific about visual style

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is based on the original Jupyter notebook by Matt Shumer.

Credits

  • Original concept and notebook: Matt Shumer
  • CLI package conversion: Mason Hall
