AI agent that converts screenplays to consistent Text-to-Video prompts. Supports Sora, Runway, Kling, Veo via LangGraph & LLMs.

Maintainers

deepeng

Project description

story-to-shot (PenShot)

PenShot is a multi-agent screenplay storyboarding system. It splits scripts of any format into units sized to the clip-duration limits of AI text-to-video models, and outputs detailed storyboard fragment descriptions while preserving narrative continuity. Built on LangChain and LangGraph, it uses LLMs to turn each unit into a text-to-video prompt fragment compatible with mainstream AI video models. It also supports task-pool priority queuing, multi-level memory management, and Chroma vector retrieval.

From Story to Shot - Transform your scripts into AI-powered storyboards.

Named "penshot" on PyPI - because every story starts with a pen.

Core Features

Feature	Description
Intelligent Script Parsing	Automatically identifies scenes, dialogue, and action cues; understands narrative structure; supports long-text chunking.
Precise Temporal Planning	Intelligently segments content at the shot level, allocating optimal durations that strictly comply with AI video model constraints.
Continuity Guard	Leverages task pool priority queuing, multi-level memory (short/mid/long-term), and Chroma vector retrieval to ensure high consistency in character states, scenes, and plot across adjacent shots.
High-Quality Prompt Output	Generates detailed bilingual (Chinese/English) visual descriptions, negative prompts, and audio prompts, ready for immediate use.
Multi-Model Compatibility	Supports OpenAI, Qwen, DeepSeek, Ollama, and other major LLM providers with plug-and-play switching.
Multi-Protocol Integration	Provides Python SDK, REST API, LangGraph nodes, A2A collaboration protocol, and standard MCP interfaces.
Robustness & Traceability	Built-in auto-retry and error fallback mechanisms. Every storyboard fragment is bidirectionally traceable to its original script location.
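To make the temporal-planning idea concrete, the sketch below splits one shot's estimated duration into the fewest equal clips that each fit under a model's maximum clip length. The function `split_duration` and the limits in `MAX_CLIP_SECONDS` are hypothetical placeholders, not values or APIs taken from penshot; consult each video model's documentation for its real limits.

```python
import math

# Hypothetical per-model clip limits in seconds (placeholders, not official values).
MAX_CLIP_SECONDS = {"model_a": 5.0, "model_b": 10.0}


def split_duration(total_seconds: float, max_clip: float) -> list[float]:
    """Split a shot into the fewest equal-length clips that each fit max_clip."""
    if total_seconds <= 0 or max_clip <= 0:
        raise ValueError("durations must be positive")
    count = math.ceil(total_seconds / max_clip)
    return [round(total_seconds / count, 3)] * count


if __name__ == "__main__":
    # A 12-second shot on a model limited to 5-second clips becomes three clips.
    print(split_duration(12.0, MAX_CLIP_SECONDS["model_a"]))
```

Equal-length clips are only one possible policy; a real planner would more likely cut on action or dialogue boundaries.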

System Architecture & Workflow

flowchart TD
    subgraph Input [Input Layer]
        A1[Client / Upstream Agent] --> A2[REST API / MCP / A2A]
        A2 --> A3[Task Manager]
    end

    subgraph Core [LangGraph Multi-Agent Core Workflow]
        direction TB
        
        P1[Script Parser Agent] --> P2[Storyboard Generator Agent]
        P2 --> P3[Video Splitter Agent]
        P3 --> P4[Prompt Converter Agent]
        P4 --> P5[Quality Auditor Agent]
        P5 --> P6[Continuity Guardian Agent]
        P6 --> P7[Auxiliary Generator Agent<br/>Three-view/Background/Keyframe]
        
        subgraph Control [Control Nodes]
            C1[Loop Check] --> C2[Error Handling]
            C2 --> C3[Human Intervention]
            C3 --> C4[Result Generation]
        end
        
        P1 -.->|Retry/Fix| Control
        P2 -.->|Retry/Fix| Control
        P3 -.->|Retry/Fix| Control
        P4 -.->|Retry/Fix| Control
        P5 -.->|Retry/Fix| Control
        P6 -.->|Retry/Fix| Control
        Control -.->|Routing Decision| P1
    end

    subgraph Memory [Memory Layer]
        M1[(Short-term Memory)]
        M2[(Medium-term Memory)]
        M3[(Long-term Memory)]
        M4[(Vector Database<br/>Chroma)]
        
        M1 <--> Core
        M2 <--> Core
        M3 <--> Core
        M4 <--> Core
    end

    subgraph Output [Output Layer]
        O1[Workflow Output Fixer<br/>Segment Sequence Repair] --> O2[Result Formatting]
        O2 --> O3[JSON / SDK / MCP / A2A]
    end

    subgraph Downstream [Downstream Rendering]
        D1[Multi-model Adapter] --> D2[Sora/Veo/Runway/Kling/SVD]
        D2 --> D3[FFmpeg Synthesis]
        D3 --> D4[Final Video]
    end

    A3 --> P1
    P7 --> O1
    O3 --> D1

The system is a natural language processing (NLP) application: multiple agents collaborate, backed by layered memory, to convert a script into storyboard fragments end to end. For the architectural design, memory pool implementation, and continuity mechanisms, see: Architecture Design & Implementation
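The retry routing in the diagram can be approximated in plain Python, without LangGraph, to show the control flow. In this sketch the stage names mirror the diagram, but `run_pipeline`, `MAX_RETRIES`, and the two stage functions are hypothetical stand-ins rather than penshot's implementation.

```python
from typing import Callable

State = dict
Stage = Callable[[State], State]

MAX_RETRIES = 2  # hypothetical retry budget per stage


def run_pipeline(stages: list[tuple[str, Stage]], state: State) -> State:
    """Run stages in order; retry a failing stage, then fail loudly."""
    for name, stage in stages:
        for attempt in range(MAX_RETRIES + 1):
            try:
                state = stage(state)
                break  # stage succeeded, move on to the next one
            except Exception as exc:
                state.setdefault("errors", []).append(f"{name}#{attempt}: {exc}")
        else:
            # Retries exhausted: this is where human intervention would start.
            raise RuntimeError(f"stage {name!r} failed after retries")
    return state


def parse_script(state: State) -> State:
    # Stand-in for the Script Parser Agent: one scene per non-empty line.
    scenes = [s.strip() for s in state["script"].splitlines() if s.strip()]
    return {**state, "scenes": scenes}


def to_prompts(state: State) -> State:
    # Stand-in for the Prompt Converter Agent.
    return {**state, "prompts": [f"Cinematic shot: {s}" for s in state["scenes"]]}


if __name__ == "__main__":
    result = run_pipeline(
        [("script_parser", parse_script), ("prompt_converter", to_prompts)],
        {"script": "A girl reads in a cafe.\nSunlight streams in."},
    )
    print(result["prompts"])
```

Recording each failure in the state keeps a trace that a later control node, or a human reviewer, can inspect.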

Quick Start

1. Environment Setup

# Install via PyPI
pip install penshot

Note: penshot is the PyPI package name, while story-shot-agent is the GitHub repository name. Both refer to the same project.

2. Configuration

cp .env.example .env

Edit the .env file to configure the required LLM and Embedding parameters:

########################## LLM Configuration #########################
PENSHOT_LLM__DEFAULT__BASE_URL=https://api.openai.com/v1
PENSHOT_LLM__DEFAULT__API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PENSHOT_LLM__DEFAULT__MODEL_NAME=gpt-4o
PENSHOT_LLM__DEFAULT__TIMEOUT=30

########################## Embedding Model Configuration #########################
PENSHOT_EMBED__DEFAULT__BASE_URL=https://api.openai.com/v1
PENSHOT_EMBED__DEFAULT__API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PENSHOT_EMBED__DEFAULT__MODEL_NAME=text-embedding-v4

########################## Redis Configuration ##########################
PENSHOT_REDIS_URL=redis://:123456@localhost:6379/0
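The double underscore in these variable names reads naturally as a nesting separator (`LLM`, then `DEFAULT`, then `BASE_URL`). The sketch below shows one way such variables could be folded into a nested dictionary. `load_penshot_env` is a hypothetical helper, and how penshot itself parses these settings is an assumption here, not something this page states.

```python
import os
from typing import Mapping, Optional

PREFIX = "PENSHOT_"


def load_penshot_env(environ: Optional[Mapping[str, str]] = None) -> dict:
    """Fold PENSHOT_A__B__C=value into {"a": {"b": {"c": value}}}."""
    environ = os.environ if environ is None else environ
    config: dict = {}
    for key, value in environ.items():
        if not key.startswith(PREFIX):
            continue  # ignore unrelated environment variables
        parts = key[len(PREFIX):].lower().split("__")
        node = config
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return config


if __name__ == "__main__":
    sample = {
        "PENSHOT_LLM__DEFAULT__MODEL_NAME": "gpt-4o",
        "PENSHOT_LLM__DEFAULT__TIMEOUT": "30",
        "PENSHOT_REDIS_URL": "redis://localhost:6379/0",
    }
    print(load_penshot_env(sample))
```

Values stay as strings in this sketch; numeric settings such as the timeout would still need converting.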

3. Usage Methods

1. Python SDK

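import asyncio

from penshot.api import create_penshot_agent


async def main():
    agent = create_penshot_agent(max_concurrent=5)

    script = "Morning, a girl reading in a cafe, sunlight streaming through the window..."
    # Submit the script for breakdown and keep the returned task ID
    task_id = agent.breakdown_script_async(
        script,
        callback=lambda r: print(f"Task {r.task_id} completed"),
    )

    # Check the current status, then wait for the final result
    status = agent.get_task_status(task_id)
    result = await agent.wait_for_result_async(task_id)
    print(status, result)


asyncio.run(main())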

Full example: direct_usage.py

2. FastAPI Web Application Integration

Integrate into existing systems via standard HTTP endpoints:

from fastapi import FastAPI
from penshot.api import create_penshot_agent

app = FastAPI(title="Penshot API", version="0.1.0")
agent = create_penshot_agent(max_concurrent=5)


@app.post("/api/generate")
async def generate(script_text: str):
    # script_text is a plain str parameter, so FastAPI reads it from the query string
    task_id = agent.breakdown_script_async(script_text)
    # Respond with the task ID so clients can look up the result later
    return {"task_id": task_id, "status": "PENDING"}

Full example: web_app.py
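Because the endpoint above declares `script_text` as a plain `str`, FastAPI expects it in the query string rather than in a JSON body. A client could therefore look roughly like the sketch below, which uses only the standard library. The address in `BASE_URL` and the helper names are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local development address


def build_generate_request(script_text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request with script_text passed as a query parameter."""
    query = urllib.parse.urlencode({"script_text": script_text})
    return urllib.request.Request(f"{base_url}/api/generate?{query}", method="POST")


def submit(script_text: str) -> dict:
    """Send the request and decode the JSON body (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(script_text)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    req = build_generate_request("Morning, a girl reading in a cafe")
    print(req.get_method(), req.full_url)
```

For long scripts a request body is usually a better fit than a query string, which would mean changing the endpoint to accept a body model.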

3. LangGraph Node Integration

Penshot can be embedded as an independent node in LangChain/LangGraph workflows for end-to-end automation. Full example: langgraph_integration.py

4. A2A Protocol Collaboration

Penshot supports context passing and task orchestration with upstream scriptwriting agents and downstream text-to-video and editing agents. Full example: a2a_integration.py

5. MCP (Model Context Protocol) Support

Start the MCP Server:

python -m penshot.mcp_server --max-concurrent 5 --queue-size 500

Clients can call the breakdown_script and get_task_result tools to seamlessly integrate with MCP-compatible IDEs or agent frameworks. Full example: mcp_client.py
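MCP tool invocations travel as JSON-RPC 2.0 messages with the method `tools/call`. The sketch below only builds such messages, to show their shape. The tool names come from the text above, while the argument names (`script_text`, `task_id`) are hypothetical; a real client should read the input schema the server advertises.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC request IDs must be unique per session


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Argument names below are hypothetical; list the server's tools to see the real schema.
    print(tool_call("breakdown_script", {"script_text": "Morning, a girl reading in a cafe"}))
    print(tool_call("get_task_result", {"task_id": "<task id from the first call>"}))
```

In practice an MCP client SDK would normally send these messages and also handle the initialization handshake that precedes any tool call.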

Output Data Structure

The system returns standardized JSON containing video prompts, negative prompts, duration estimates, style parameters, and accompanying audio prompts:

{
  "fragments": [
    {
      "fragment_id": "frag_001",
      "prompt": "Cinematic wide shot: midnight 11 PM in a compact urban apartment living room...",
      "negative_prompt": "cartoon, anime, 3D render, bright lighting, text, watermark...",
      "duration": 4.2,
      "model": "runway_gen2",
      "style": "cinematic 35mm film, moody realism, shallow depth of field...",
      "audio_prompt": {
        "audio_id": "audio_001",
        "prompt": "Low-frequency rain ambience (intensity 0.95), distant muffled TV static...",
        "model_type": "AudioLDM_3",
        "audio_style": "cinematic"
      }
    }
  ]
}
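A downstream consumer might load this structure as shown in the sketch below, which keeps the core video fields of each fragment and totals the planned running time. The `Fragment` dataclass and helper names are illustrative, and treating `negative_prompt` and `model` as optional is an assumption rather than a documented guarantee.

```python
import json
from dataclasses import dataclass


@dataclass
class Fragment:
    fragment_id: str
    prompt: str
    negative_prompt: str
    duration: float
    model: str


def load_fragments(payload: str) -> list[Fragment]:
    """Parse the JSON payload and keep the core video fields of each fragment."""
    data = json.loads(payload)
    return [
        Fragment(
            fragment_id=item["fragment_id"],
            prompt=item["prompt"],
            negative_prompt=item.get("negative_prompt", ""),
            duration=float(item["duration"]),
            model=item.get("model", ""),
        )
        for item in data.get("fragments", [])
    ]


def total_duration(fragments: list[Fragment]) -> float:
    """Sum the planned clip durations in seconds."""
    return sum(f.duration for f in fragments)


if __name__ == "__main__":
    sample = '{"fragments": [{"fragment_id": "frag_001", "prompt": "Cinematic wide shot", "duration": 4.2}]}'
    frags = load_fragments(sample)
    print(frags[0].fragment_id, total_duration(frags))
```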

System Notes & Considerations

Category	Description
Network Dependency	Requires stable access to external LLM APIs. A proxy or regional mirror is recommended where direct access is unreliable.
Long Text Processing	For extremely long scripts, segmented input is advised. The system includes built-in context memory and RAG mechanisms.
Generation Duration	AI video models may return clips whose length deviates from the requested duration by roughly ±10%; plan downstream editing with that margin in mind.
Multilingual Support	Currently optimized for Chinese scripts. Support for other languages is under active iteration.
Audio Synchronization	Audio prompts are provided. Lip-sync and environmental sound fusion require downstream tooling.
Error Handling	Auto-retry and fallback mechanisms are built-in. Extreme edge cases may require manual intervention.
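The duration variance noted in the table can be checked downstream before clips are assembled. The helper below is a small sketch of such a check; `within_tolerance` is a hypothetical name, and the 10% default simply mirrors the figure quoted above.

```python
def within_tolerance(planned: float, actual: float, tolerance: float = 0.10) -> bool:
    """Return True when actual is within ±tolerance (relative) of planned."""
    if planned <= 0:
        raise ValueError("planned duration must be positive")
    # The small epsilon absorbs floating-point rounding at the boundary.
    return abs(actual - planned) <= tolerance * planned + 1e-9


if __name__ == "__main__":
    # A 4.2-second fragment rendered as a 4.5-second clip
    print(within_tolerance(4.2, 4.5))
```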

Development Roadmap

Short-Term

Optimize long-shot segmentation logic for action continuity
Implement consistency validators for character clothing, positioning, and props
Specialized prompt format adaptation for Sora, Pika, and other models
Hybrid architecture combining rule-based engines and LLMs
Full English script support and intelligent node failure fallback
Fragment confidence scoring and debug mode (intermediate result persistence)

Mid-Term

Advanced camera language support (pan, tilt, zoom, tracking, follow)
Emotion-driven automatic visual style adjustment
Ultra-long script chunking + vector DB context memory
Multi-script batch queue processing & Web visualization interface
Character/scene reference image integration & multi-format export (XML/EDL/JSON)

Long-Term

Multimodal input (image + audio + text hybrid)
Real-time low-resolution preview & automatic continuity repair
Professional editing software plugins (Premiere/FCP/DaVinci)
Multi-user collaboration, version control, & autonomous learning from feedback
Bidirectional script-fragment traceability, semantic alignment detection, & multi-round correction mechanisms

Ultimate Goal

Achieve zero-information-loss visualization for scripts of any length, language, or genre, delivering a standardized workflow that meets professional director-level storyboarding standards. The system will feature customizable styles, full traceability, automatic optimization loops, and cross-modal high consistency.

Contributing

We welcome contributions via Issues or Pull Requests:

Bug Reports: Please provide reproduction steps, environment details, and error logs.
Feature Requests: Use the enhancement label.
Code Optimization: Performance tuning, architectural refactoring, or adding test cases.
Documentation: Translations, example additions, or technical corrections.

Quick dev environment setup:

git clone https://github.com/neopen/story-shot-agent.git
cd story-shot-agent
pip install -e ".[dev]"
pytest tests/

License

Contact

Project Homepage: https://github.com/neopen/story-shot-agent
Documentation: https://pengline.cn/2026/02/7e6cd67dd5ee45248f2276ac145555f5/

Special thanks to LangChain, LangGraph, Chroma, Ollama, and the open-source community for their technical support. If this project has been helpful to your work, please consider starring the repository and sharing your feedback.

Release history

Version	Release date
0.3.4	May 21, 2026
0.3.3	May 15, 2026
0.3.2	May 5, 2026
0.3.1	May 3, 2026
0.3.0 (this version)	Apr 27, 2026
0.2.4	Apr 8, 2026
0.2.3	Apr 6, 2026
0.2.2	Apr 1, 2026
0.2.1	Mar 29, 2026

Download files

Source Distribution

penshot-0.3.0.tar.gz (423.6 kB), uploaded Apr 27, 2026, Source

Built Distribution

penshot-0.3.0-py3-none-any.whl (463.4 kB), uploaded Apr 27, 2026, Python 3

File details

Details for the file penshot-0.3.0.tar.gz.

File metadata

Download URL: penshot-0.3.0.tar.gz
Upload date: Apr 27, 2026
Size: 423.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for penshot-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`f7449168de0246cf03860607816bb332e466a4dc48446e4c00697c38b9117401`
MD5	`e8da9241dee91f7adc359d5f8950dd5e`
BLAKE2b-256	`bf3715b2643dfdb8c0443e4fb50521c9434ee2a10d4c6acfe18876acb91efddd`
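A downloaded archive can be checked against the published SHA256 digest with the standard library, as in the sketch below. The local file path is an assumption; the expected digest is the SHA256 value listed in the table above.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for penshot-0.3.0.tar.gz
EXPECTED_SHA256 = "f7449168de0246cf03860607816bb332e466a4dc48446e4c00697c38b9117401"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large archives need little memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the expected hex string, ignoring case."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("penshot-0.3.0.tar.gz")  # assumed download location
    if archive.exists():
        print("OK" if verify(archive) else "MISMATCH")
```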

Provenance

The following attestation bundles were made for penshot-0.3.0.tar.gz:

Publisher: publish-pypi.yml on neopen/story-shot-agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: penshot-0.3.0.tar.gz
- Subject digest: f7449168de0246cf03860607816bb332e466a4dc48446e4c00697c38b9117401
- Sigstore transparency entry: 1394010825
- Sigstore integration time: Apr 27, 2026
Source repository:
- Permalink: neopen/story-shot-agent@7edb36eaf01bdbc7e08740451088d971f7324e4a
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/neopen
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@7edb36eaf01bdbc7e08740451088d971f7324e4a
- Trigger Event: release

File details

Details for the file penshot-0.3.0-py3-none-any.whl.

File metadata

Download URL: penshot-0.3.0-py3-none-any.whl
Upload date: Apr 27, 2026
Size: 463.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for penshot-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f4cf9a02319f9db224ba73718ebf03fa18c15bc2a5c36b6a3c61ba59a318d1ad`
MD5	`7746eed5fb5c4008b6cd4b576dc93977`
BLAKE2b-256	`54161d994722e5b777d1df7da7e1f3cf1f605bf3a92d1ac4dbc5eb4107c4f9cb`

Provenance

The following attestation bundles were made for penshot-0.3.0-py3-none-any.whl:

Publisher: publish-pypi.yml on neopen/story-shot-agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: penshot-0.3.0-py3-none-any.whl
- Subject digest: f4cf9a02319f9db224ba73718ebf03fa18c15bc2a5c36b6a3c61ba59a318d1ad
- Sigstore transparency entry: 1394010835
- Sigstore integration time: Apr 27, 2026
Source repository:
- Permalink: neopen/story-shot-agent@7edb36eaf01bdbc7e08740451088d971f7324e4a
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/neopen
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@7edb36eaf01bdbc7e08740451088d971f7324e4a
- Trigger Event: release
