
A tool to convert a screenplay PDF to JSON format using OpenAI Vision Transformer analysis.


# Screenplay PDF to JSON Converter

## Description

After years of trying to convert screenplay PDFs to a machine-readable format consistently using Final Draft and Python PDF tools, we decided to create our own screenplay PDF to JSON converter using OpenAI vision transformers. The results are much more reliable for our purposes.

The package converts a screenplay PDF into JSON using the OpenAI API. It returns the JSON and also writes it to a local file named `<screenplay filename>.json`. Converting an hour-long pilot currently costs about 50 cents in OpenAI API usage, and we expect that price to fall over time.

Below is an example of the JSON structure written to the file:
```json
[
  {
    "type": "dialogue",
    "name": "JOHN",
    "modifier": "(V.O.)",
    "content": "Hello, how are you?",
    "page": 1
  },
  {
    "type": "action",
    "content": "John walks into the room.",
    "page": 3
  },
  {
    "type": "dialogue",
    "name": "MARY",
    "modifier": "(smiling)",
    "content": "I'm good, thank you!",
    "page": 4
  },
  {
    "type": "dialogue",
    "name": "JOHN",
    "content": "That's great to hear.",
    "page": 4
  },
  {
    "type": "scene",
    "content": "INT. LIVING ROOM - DAY",
    "page": 5
  }
]
```
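
Once the JSON exists, it can be processed with the standard library alone. The sketch below is illustrative and not part of the package: it parses a copy of the example above and counts dialogue elements per character. The helper names `dialogue_counts` and `elements_on_page` are hypothetical, and the code assumes every element has `type` and `page` keys, as in the example.

```python
import json
from collections import Counter

# A copy of the example structure above. In practice you would call
# json.load() on the file written by the converter instead.
SAMPLE_JSON = """
[
  {"type": "dialogue", "name": "JOHN", "modifier": "(V.O.)", "content": "Hello, how are you?", "page": 1},
  {"type": "action", "content": "John walks into the room.", "page": 3},
  {"type": "dialogue", "name": "MARY", "modifier": "(smiling)", "content": "I'm good, thank you!", "page": 4},
  {"type": "dialogue", "name": "JOHN", "content": "That's great to hear.", "page": 4},
  {"type": "scene", "content": "INT. LIVING ROOM - DAY", "page": 5}
]
"""


def dialogue_counts(elements):
    """Count dialogue elements per character name (hypothetical helper)."""
    return Counter(
        el["name"] for el in elements if el.get("type") == "dialogue"
    )


def elements_on_page(elements, page):
    """Return every element recorded on the given page (hypothetical helper)."""
    return [el for el in elements if el.get("page") == page]


elements = json.loads(SAMPLE_JSON)
counts = dialogue_counts(elements)
page_four = elements_on_page(elements, 4)

for name, count in counts.most_common():
    print(f"{name}: {count} dialogue element(s)")
```

Note that `modifier` is optional in the example (the last JOHN line has none), so downstream code should read it with `el.get("modifier")` rather than `el["modifier"]`.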

## Getting Started

### Installation
```bash
pip install screenplay-pdf-to-json-openai
```

### OpenAI
You will need to provide your own OpenAI API key. Follow the instructions [here](https://platform.openai.com/docs/quickstart).

### Known Issues
- Because the converter uses a statistical model, an action passage that is split across several lines is sometimes returned as a single JSON action element and sometimes as multiple action elements.
- If slug lines are written WITHOUT INT or EXT, their handling is unpredictable, but the affected lines are easy to detect.
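
Both issues can be handled in post-processing. The following is a minimal sketch under stated assumptions, not something the package provides: `merge_adjacent_actions` joins consecutive action elements on the same page so the output is consistent between runs, and `suspicious_slug_lines` flags scene elements that lack an INT/EXT prefix, plus short all-caps action elements that may be misclassified slug lines. Both function names, the prefix pattern, and the six-word threshold are hypothetical choices for illustration.

```python
import re

# Assumed slug-line prefixes: INT, EXT, or I/E followed by '.', '/', or a space.
SLUG_PREFIX = re.compile(r"^(INT|EXT|I/E)[./\s]", re.IGNORECASE)


def merge_adjacent_actions(elements):
    """Join consecutive action elements that share a page into one element."""
    merged = []
    for el in elements:
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.get("type") == "action"
            and el.get("type") == "action"
            and prev.get("page") == el.get("page")
        ):
            prev["content"] = prev["content"] + " " + el["content"]
        else:
            merged.append(dict(el))  # copy so the input list is not mutated
    return merged


def suspicious_slug_lines(elements):
    """Flag elements that look like slug lines written without INT/EXT."""
    flagged = []
    for el in elements:
        content = el.get("content", "").strip()
        if el.get("type") == "scene" and not SLUG_PREFIX.match(content):
            flagged.append(el)
        elif (
            el.get("type") == "action"
            and content.isupper()
            and len(content.split()) <= 6  # heuristic: short all-caps line
        ):
            flagged.append(el)
    return flagged


sample = [
    {"type": "action", "content": "John walks in.", "page": 1},
    {"type": "action", "content": "He sits.", "page": 1},
    {"type": "scene", "content": "KITCHEN - LATER", "page": 2},
    {"type": "scene", "content": "INT. LIVING ROOM - DAY", "page": 2},
    {"type": "action", "content": "BACK ALLEY", "page": 3},
]

merged = merge_adjacent_actions(sample)
flagged = suspicious_slug_lines(merged)
```

Merging adjacent actions discards the original line breaks, so it only suits workflows where action text is treated as a continuous passage.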

## Quickstart

### 1. Convert Whole Screenplay and Save to JSON File
```python
from screenplay_pdf_to_json import ScreenplayPDFToJSON
sptj = ScreenplayPDFToJSON(api_key="YOUR_OPENAI_API_KEY")  # replace with your own key
data = sptj.convert('TheEmpireStrikesBack.pdf')
```

### 2. Convert First 3 Pages of a Screenplay and Save to JSON File
```python
from screenplay_pdf_to_json import ScreenplayPDFToJSON
sptj = ScreenplayPDFToJSON(api_key="YOUR_OPENAI_API_KEY")  # replace with your own key
data = sptj.convert('TheEmpireStrikesBack.pdf', end_page=3)
```

### 3. Estimate Cost of Converting a Screenplay
```python
from screenplay_pdf_to_json import ScreenplayPDFToJSON
sptj = ScreenplayPDFToJSON(api_key="YOUR_OPENAI_API_KEY")  # replace with your own key
cost = sptj.estimate_cost('TheEmpireStrikesBack.pdf')
print(f"Estimated cost to convert screenplay: ${cost:.2f}")
```

### 4. Convert Screenplay with No Title Page and Save to JSON File
```python
from screenplay_pdf_to_json import ScreenplayPDFToJSON
# The PDF has no title page, so turn off title-page skipping.
sptj = ScreenplayPDFToJSON(api_key="YOUR_OPENAI_API_KEY", skip_title_page=False)
data = sptj.convert('screenplay_with_no_title_page.pdf')
```
