A tool to convert a screenplay PDF to JSON format using OpenAI vision transformer analysis.
DESCRIPTION
After years of trying to convert screenplay PDFs to a machine-readable format consistently
using Final Draft and Python PDF tools, we decided to create our own screenplay PDF to JSON
converter using OpenAI vision transformers. The results are much more reliable for our purposes.
The package converts a screenplay PDF into JSON using the OpenAI API, returns the JSON,
and writes it to a local file named after the screenplay (filename.json).
The process currently costs about 50 cents via the OpenAI API to convert an hour-long pilot,
but the price will no doubt come down over time.
Below is an example of the JSON structure output to file:
[
    {
        "type": "Dialogue",
        "Name": "JOHN",
        "Modifier": "(V.O.)",
        "Speech": "Hello, how are you?",
        "page": 1
    },
    {
        "type": "Action",
        "text": "John walks into the room.",
        "page": 3
    },
    {
        "type": "Dialogue",
        "Name": "MARY",
        "Parenthetical": "(smiling)",
        "Speech": "I'm good, thank you!",
        "page": 4
    },
    {
        "type": "Dialogue",
        "Name": "JOHN",
        "Speech": "That's great to hear.",
        "page": 4
    },
    {
        "type": "Scene",
        "text": "INT. LIVING ROOM - DAY",
        "page": 5
    }
]
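As a rough illustration of working with this structure, the sketch below groups dialogue by character. It assumes only the key names shown in the example above (type, Name, Speech). The helper name speeches_by_character and the inline sample are illustrative and not part of the package.

```python
import json
from collections import defaultdict

# Inline sample mirroring the structure above. In practice, load the
# .json file written by the converter with json.load instead.
SAMPLE = '''
[
    {"type": "Dialogue", "Name": "JOHN", "Modifier": "(V.O.)",
     "Speech": "Hello, how are you?", "page": 1},
    {"type": "Action", "text": "John walks into the room.", "page": 3},
    {"type": "Dialogue", "Name": "MARY", "Parenthetical": "(smiling)",
     "Speech": "I'm good, thank you!", "page": 4},
    {"type": "Dialogue", "Name": "JOHN",
     "Speech": "That's great to hear.", "page": 4},
    {"type": "Scene", "text": "INT. LIVING ROOM - DAY", "page": 5}
]
'''


def speeches_by_character(elements):
    """Group Speech strings by character Name (hypothetical helper)."""
    grouped = defaultdict(list)
    for element in elements:
        # Only Dialogue elements carry Name and Speech in the example.
        if element.get("type") == "Dialogue":
            grouped[element["Name"]].append(element["Speech"])
    return dict(grouped)


elements = json.loads(SAMPLE)
grouped = speeches_by_character(elements)
for name, speeches in grouped.items():
    print(name, len(speeches))
```

Optional keys such as Modifier and Parenthetical appear only on some Dialogue elements, so reading them with element.get(...) is the safer choice.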
GETTING STARTED
Installation:
pip install screenplay_pdf_to_json
OpenAI:
You will need to provide your own OpenAI API key.
Follow the instructions here: https://platform.openai.com/docs/quickstart
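One common way to avoid hard-coding the key is to read it from an environment variable. The sketch below assumes the key is stored in OPENAI_API_KEY, the variable name used by the OpenAI quickstart. Whether this package reads that variable on its own is not stated here, so the helper (a hypothetical name, load_openai_key) returns the key for you to pass in explicitly.

```python
import os


def load_openai_key(env_var="OPENAI_API_KEY"):
    """Return the key stored in env_var, or raise if it is missing.

    Hypothetical helper, not part of the package.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

The returned value can then be passed as api_key=load_openai_key() when constructing the converter.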
Known Issues:
Because the converter relies on a statistical model, action lines separated by a blank line (\n\n) are sometimes combined into a single JSON Action element and sometimes split into multiple Action elements.
Also, slug-lines written WITHOUT INT or EXT are handled unpredictably, though the resulting errors are easy to detect.
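Both issues can be handled in a post-processing pass over the returned elements. The sketch below is one possible approach, not something the package provides: merge_adjacent_actions joins consecutive Action elements on the same page so the output is consistent either way, and flag_possible_sluglines lists elements worth checking by hand. The helper names, the INT/EXT/I/E prefix pattern, and the all-caps heuristic for misfiled slug-lines are all assumptions.

```python
import re

# Assumed slug-line prefixes; extend this if your scripts use others.
SLUG_PREFIX = re.compile(r"^\s*(INT|EXT|I/E)\b", re.IGNORECASE)


def merge_adjacent_actions(elements):
    """Join consecutive Action elements on the same page into one."""
    merged = []
    for element in elements:
        prev = merged[-1] if merged else None
        if (prev is not None
                and prev.get("type") == "Action"
                and element.get("type") == "Action"
                and prev.get("page") == element.get("page")):
            prev["text"] = prev["text"] + "\n\n" + element["text"]
        else:
            merged.append(dict(element))  # copy so the input is not mutated
    return merged


def flag_possible_sluglines(elements):
    """Return elements that may be slug-lines written without INT/EXT."""
    flagged = []
    for element in elements:
        text = element.get("text", "")
        kind = element.get("type")
        if kind == "Scene" and not SLUG_PREFIX.match(text):
            flagged.append(element)  # Scene without a conventional prefix
        elif kind == "Action" and "\n" not in text and text.isupper():
            flagged.append(element)  # one-line all-caps Action, heuristic only
    return flagged


sample = [
    {"type": "Action", "text": "John walks in.", "page": 1},
    {"type": "Action", "text": "He sits.", "page": 1},
    {"type": "Action", "text": "LATER", "page": 2},
    {"type": "Scene", "text": "INT. LIVING ROOM - DAY", "page": 2},
    {"type": "Scene", "text": "THE ROOFTOP", "page": 3},
]
merged = merge_adjacent_actions(sample)
flagged = flag_possible_sluglines(merged)
print([e["text"] for e in flagged])
```

Merging always produces the combined form, so any paragraph-level distinction the model did capture is lost. If that distinction matters, skip the merge and compare on joined text instead.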
QUICKSTART
1. CONVERT WHOLE SCREENPLAY
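from screenplay_pdf_to_json import ScreenplayPDFToJSON
# Replace the placeholder with your own OpenAI API key.
sptj = ScreenplayPDFToJSON(api_key="<your_openai_key>")
# Returns the JSON data and also writes it to a local .json file.
data = sptj.convert('TheEmpireStrikesBack.pdf')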
2. CONVERT FIRST 3 PAGES OF A SCREENPLAY
from screenplay_pdf_to_json import ScreenplayPDFToJSON
sptj = ScreenplayPDFToJSON(api_key="<your_openai_key>")
data = sptj.convert('TheEmpireStrikesBack.pdf', end_page=3)
3. ESTIMATE COST OF CONVERTING A SCREENPLAY
from screenplay_pdf_to_json import ScreenplayPDFToJSON
sptj = ScreenplayPDFToJSON(api_key="<your_openai_key>")
cost = sptj.estimate_cost('TheEmpireStrikesBack.pdf')
print(f"Estimated cost to convert screenplay: ${cost:.2f}")
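The estimate can also act as a guard before spending money on a conversion. The sketch below assumes only what the examples above show: an object with estimate_cost(path), taken to return a number in US dollars, and convert(path). The function name convert_within_budget and the max_cost_usd parameter are hypothetical.

```python
def convert_within_budget(converter, pdf_path, max_cost_usd):
    """Convert pdf_path only if the estimate is within max_cost_usd.

    Returns the converted data, or None when the estimate is too high.
    """
    estimate = converter.estimate_cost(pdf_path)
    if estimate > max_cost_usd:
        print(f"Skipping {pdf_path}: estimate ${estimate:.2f} "
              f"exceeds budget ${max_cost_usd:.2f}")
        return None
    return converter.convert(pdf_path)
```

With the converter from the examples above, a call might look like data = convert_within_budget(sptj, 'TheEmpireStrikesBack.pdf', 1.00).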
4. CONVERT SCREENPLAY WITH NO TITLE PAGE
from screenplay_pdf_to_json import ScreenplayPDFToJSON
# skip_title_page=False keeps the first page, since there is no title page to skip.
sptj = ScreenplayPDFToJSON(api_key="<your_openai_key>", skip_title_page=False)
data = sptj.convert('screenplay_with_no_title_page.pdf')
File details
Details for the file screenplay_to_json_openai-1.0.1.tar.gz.
File metadata
- Download URL: screenplay_to_json_openai-1.0.1.tar.gz
- Upload date:
- Size: 6.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | f3211b6988bee0f87624cf2375dc35cfdec64a0e56284e8153158402183bae30
MD5 | 063138b4517b7c82535a068b7279f6a6
BLAKE2b-256 | 44bc4b28c542abf3b6263ee83e3178e30e3fe82a60ac55182e8c5ef2586c77ba
File details
Details for the file screenplay_to_json_openai-1.0.1-py3-none-any.whl.
File metadata
- Download URL: screenplay_to_json_openai-1.0.1-py3-none-any.whl
- Upload date:
- Size: 6.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | b556d28eae320a3f1880ba47bc4ae2a389f8916e7dd2c42812571563dfbf8c3f
MD5 | 1b726a7dcd0f510d0d52c3a58b6916cf
BLAKE2b-256 | cc7eca6f124ddb3d4d1f1bdba20e2f75291710760efe8a774103e31b3ac37ee7