A tool to convert a screenplay PDF to JSON format using OpenAI Vision Transformer analysis.

Project description

Screenplay PDF to JSON Converter

Description

After years of trying to convert screenplay PDFs into a consistent machine-readable format with Final Draft and Python PDF tools, we decided to build our own screenplay PDF to JSON converter using OpenAI vision transformers. The results are much more reliable for our purposes.

The package converts a screenplay PDF into JSON using the OpenAI API. It returns the JSON and also writes it to a local .json file named after the screenplay file. Converting an hour-long pilot currently costs about 50 cents in OpenAI API usage, and we expect that price to fall over time.

Below is an example of the JSON structure output to file:

[
    {
        "type": "dialogue",
        "name": "JOHN",
        "modifier": "(V.O.)",
        "content": "Hello, how are you?",
        "page": 1
    },
    {
        "type": "action",
        "content": "John walks into the room.",
        "page": 3
    },
    {
        "type": "dialogue",
        "name": "MARY",
        "modifier": "(smiling)",
        "content": "I'm good, thank you!",
        "page": 4
    },
    {
        "type": "dialogue",
        "name": "JOHN",
        "content": "That's great to hear.",
        "page": 4
    },
    {
        "type": "scene",
        "content": "INT. LIVING ROOM - DAY",
        "page": 5
    }
]
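Because the output is a flat list of elements, it can be post-processed with the standard library alone. The following is a minimal sketch, assuming only the keys shown in the example above (`type`, `name`, `modifier`, `content`, `page`). The helper name `dialogue_by_character` and the inline sample are illustrative and are not part of the package.

```python
import json
from collections import defaultdict

# Inline sample in the structure shown above; in practice this would be
# read from the JSON file that the converter writes.
SAMPLE = """
[
    {"type": "dialogue", "name": "JOHN", "modifier": "(V.O.)",
     "content": "Hello, how are you?", "page": 1},
    {"type": "action", "content": "John walks into the room.", "page": 3},
    {"type": "dialogue", "name": "MARY", "modifier": "(smiling)",
     "content": "I'm good, thank you!", "page": 4},
    {"type": "dialogue", "name": "JOHN",
     "content": "That's great to hear.", "page": 4},
    {"type": "scene", "content": "INT. LIVING ROOM - DAY", "page": 5}
]
"""


def dialogue_by_character(elements):
    """Group dialogue lines by speaker (hypothetical helper)."""
    lines = defaultdict(list)
    for element in elements:
        # Only dialogue elements carry a "name" key in the example output.
        if element.get("type") == "dialogue":
            lines[element["name"]].append(element["content"])
    return dict(lines)


elements = json.loads(SAMPLE)
grouped = dialogue_by_character(elements)
for name, spoken in grouped.items():
    print(f"{name}: {len(spoken)} line(s)")
```

Note that `modifier` is missing from the fourth element in the example, so code that reads it should use `element.get("modifier")` rather than indexing directly.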

Getting Started

Installation

pip install screenplay-pdf-to-json-openai

OpenAI

You will need to provide your own OpenAI API key. See the OpenAI documentation for instructions on creating one.
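One common way to avoid hard-coding the key in scripts is to read it from an environment variable. The sketch below assumes the key is stored in a variable named `OPENAI_API_KEY`; that name and the helper `load_openai_key` are assumptions for illustration, not something the package requires.

```python
import os


def load_openai_key(env=os.environ, var_name="OPENAI_API_KEY"):
    """Return the key stored in `var_name`, or raise a clear error.

    Hypothetical helper; `env` can be any mapping, which makes it easy to test.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key
```

The returned string can then be passed as the `api_key` argument when constructing the converter.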

Known Issues

  • Because the converter relies on a statistical model, its output is not fully deterministic: an action passage that spans several lines is sometimes returned as a single JSON action element and sometimes as multiple action elements.
  • Slug lines written WITHOUT INT or EXT are handled unpredictably, although the resulting errors are easy to detect.
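Both issues above can be softened with a small post-processing pass over the output. The sketch below is one possible approach, not the package's own behaviour. It assumes the element structure from the example output, that a conventional slug line starts with INT, EXT, or I/E, and that a mis-handled slug line may show up as a short all-caps action element. The function names are hypothetical.

```python
import re

# Assumption: conventional slug lines start with INT, EXT, or I/E.
SLUG_PREFIX = re.compile(r"^(INT|EXT|I/E)\b", re.IGNORECASE)


def merge_adjacent_actions(elements):
    """Join consecutive action elements on the same page into one element."""
    merged = []
    for element in elements:
        previous = merged[-1] if merged else None
        if (
            previous is not None
            and previous["type"] == "action"
            and element["type"] == "action"
            and previous.get("page") == element.get("page")
        ):
            previous["content"] = f'{previous["content"]} {element["content"]}'
        else:
            merged.append(dict(element))  # copy so the input is not mutated
    return merged


def suspicious_slug_lines(elements, max_words=8):
    """Flag elements that may be slug lines written without INT or EXT."""
    flagged = []
    for element in elements:
        content = element.get("content", "").strip()
        if element["type"] == "scene":
            # A scene element without the usual prefix deserves a manual look.
            if not SLUG_PREFIX.match(content):
                flagged.append(element)
        elif element["type"] == "action":
            # Heuristic: a short, all-caps action line may be a missed slug line.
            if content.isupper() and len(content.split()) <= max_words:
                flagged.append(element)
    return flagged
```

Merging makes the action output consistent between runs at the cost of paragraph granularity, since genuinely separate action paragraphs on the same page are also joined. The flagged elements are only candidates for manual review.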

Quickstart

1. Convert Whole Screenplay and Save to JSON File

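from screenplay_pdf_to_json import ScreenplayPDFToJSON

# Replace the placeholder with your own OpenAI API key.
sptj = ScreenplayPDFToJSON(api_key='YOUR_OPENAI_KEY')
# Returns the JSON data and writes it to a local file.
data = sptj.convert('TheEmpireStrikesBack.pdf')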

2. Convert First 3 Pages of a Screenplay and Save to JSON File

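from screenplay_pdf_to_json import ScreenplayPDFToJSON

# Replace the placeholder with your own OpenAI API key.
sptj = ScreenplayPDFToJSON(api_key='YOUR_OPENAI_KEY')
# end_page=3 limits the conversion to the first 3 pages.
data = sptj.convert('TheEmpireStrikesBack.pdf', end_page=3)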

3. Estimate Cost of Converting a Screenplay

from screenplay_pdf_to_json import ScreenplayPDFToJSON

# Replace the placeholder with your own OpenAI API key.
sptj = ScreenplayPDFToJSON(api_key='YOUR_OPENAI_KEY')
cost = sptj.estimate_cost('TheEmpireStrikesBack.pdf')
print(f"Estimated cost to convert screenplay: ${cost:.2f}")
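The estimate can also act as a guard before spending API credit. The sketch below assumes that `estimate_cost` returns a number in US dollars, as the formatting in the example above suggests. The helper `convert_within_budget` is hypothetical and works with any object that exposes `estimate_cost` and `convert`.

```python
def convert_within_budget(converter, pdf_path, max_cost_usd):
    """Convert only if the estimate fits the budget; return None otherwise.

    Hypothetical helper: `converter` is any object with `estimate_cost(path)`
    and `convert(path)` methods, such as a ScreenplayPDFToJSON instance.
    """
    estimate = converter.estimate_cost(pdf_path)
    if estimate > max_cost_usd:
        print(
            f"Skipping {pdf_path}: estimated ${estimate:.2f} "
            f"exceeds budget of ${max_cost_usd:.2f}"
        )
        return None
    return converter.convert(pdf_path)
```

For example, `convert_within_budget(sptj, 'TheEmpireStrikesBack.pdf', max_cost_usd=1.00)` would run the conversion only when the estimate is one dollar or less.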

4. Convert Screenplay with No Title Page and Save to JSON File

from screenplay_pdf_to_json import ScreenplayPDFToJSON

# The screenplay has no title page, so do not skip the first page.
sptj = ScreenplayPDFToJSON(api_key='YOUR_OPENAI_KEY', skip_title_page=False)
data = sptj.convert('screenplay_with_no_title_page.pdf')

