Turn handwritten forms, notes, and scanned paperwork into automation-ready JSON.

These details have not been verified by PyPI

Project description

Handwriting JSON

Turn handwritten forms, notes, and scanned paperwork into automation-ready JSON.

Handwriting JSON is a Python package and CLI for automating handwritten document workflows. It uses vision LLMs and optional schema guidance to convert PDFs/images into structured JSON your applications can use.

It is built for the messy documents that still slow teams down: registration forms, field notes, inspection sheets, school permission slips, donation forms, clinic intake paperwork, KYC forms, surveys, maintenance reports, delivery notes, lab slips, and old scanned records.

The project was inspired by OmmSai, a healthcare automation project that processed roughly 15,000 handwritten prescription files for a charitable healthcare event. Prescriptions are now just one example. The package is designed for many handwritten-document automation workflows.

Why It Exists

Many business workflows still start with paper. A person fills out a form, writes a note, signs a slip, or scans an old record. OCR can return text, but automation usually needs structured data.

Handwriting JSON focuses on the automation step:

handwritten document -> schema-guided extraction -> structured JSON -> downstream workflow

Use it to turn:

handwritten signup sheets into CRM records
field notes into tickets
inspection forms into compliance reports
school slips into student records
donation forms into spreadsheets
clinic intake forms into review queues
scanned records into searchable JSON

Features

Extract structured JSON from handwritten PDFs and images.
Guide extraction with JSON Schema or an example JSON object.
Use multiple vision LLM providers through LiteLLM.
Process one document or a directory of documents from the CLI.
Use the same extraction path from Python code or the command line.
Keep domain-specific behavior in examples and presets.

Install

pip install handwriting-json

Provider credentials are configured through the environment variables expected by LiteLLM for the model you choose.

For local development:

git clone https://github.com/ramdhavepreetam/handwriting-json.git
cd handwriting-json
python3 -m pip install -e ".[dev]"

Python API

from handwriting_json import extract

result = extract(
    "handwritten_registration_form.jpg",
    model="anthropic/claude-sonnet-4-5",
    schema={
        "full_name": "",
        "phone": "",
        "email": "",
        "address": "",
        "date": "",
        "notes": "",
        "signature_present": False
    },
)

print(result.data)

CLI

handwriting-json extract \
  --input handwritten_signup_form.jpg \
  --schema examples/signup_form_schema.json \
  --output result.json \
  --model anthropic/claude-sonnet-4-5

Batch mode:

handwriting-json batch \
  --input-dir ./forms \
  --output results.jsonl \
  --model anthropic/claude-sonnet-4-5

Check installation:

handwriting-json version
python3 -m handwriting_json --help

Example Schemas

Registration form:

{
  "full_name": "",
  "phone": "",
  "email": "",
  "address": "",
  "date": "",
  "notes": "",
  "signature_present": false
}

Field inspection note:

{
  "site_name": "",
  "inspection_date": "",
  "inspector": "",
  "issues": [],
  "recommended_action": "",
  "urgency": ""
}

School permission slip:

{
  "student_name": "",
  "parent_name": "",
  "class": "",
  "event": "",
  "consent_given": false,
  "emergency_contact": ""
}

More examples live in examples/.

Schema Guidance

You can pass either:

a formal JSON Schema, or
a simpler example JSON object.

The schema is injected into the prompt so the model knows the desired output shape. Formal JSON Schema responses are also validated after extraction.

Why This Is Not Just OCR

OCR asks:

What text is visible?

Handwriting JSON asks:

What structured data should this document become so software can use it?

That distinction matters for automation. A CRM, ticketing system, spreadsheet import, compliance workflow, or review queue does not need a paragraph of text. It needs fields.

Why LiteLLM, Not LangChain/LangGraph?

V0.1 is a focused extraction library: normalize input, build a schema-guided prompt, call a vision model, parse JSON, and optionally validate the output.

LiteLLM solves the provider-routing problem without adding orchestration complexity. LangChain or LangGraph may become useful later for multi-step workflows such as OCR fallback, validation repair loops, routing by document type, and human review queues.

Roadmap

V0.1: Python package, CLI, schema guidance, LiteLLM provider abstraction.
V0.1.x: stronger examples, provider setup docs, README demos.
V0.2: checkpointed batch processing and validation repair loop.
Later: Docker image, REST API mode, OCR fallback, cost reporting, field-level evidence.

License

MIT

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.1.1

May 18, 2026

0.1.0

May 18, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

handwriting_json-0.1.1.tar.gz (17.1 kB view details)

Uploaded May 18, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

handwriting_json-0.1.1-py3-none-any.whl (15.3 kB view details)

Uploaded May 18, 2026 Python 3

File details

Details for the file handwriting_json-0.1.1.tar.gz.

File metadata

Download URL: handwriting_json-0.1.1.tar.gz
Upload date: May 18, 2026
Size: 17.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.1

File hashes

Hashes for handwriting_json-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`e6a00510e57aece2eb856871bf6c5d6ee3665882d63abf62a3d461cb31cd50e9`
MD5	`7dbe2fbba70f8af60a01c9ff5e65561f`
BLAKE2b-256	`cef3a6dbe360dd4ea7918c9e90984b42ac3505e71d4a547fadeae07721db9d01`

See more details on using hashes here.

File details

Details for the file handwriting_json-0.1.1-py3-none-any.whl.

File metadata

Download URL: handwriting_json-0.1.1-py3-none-any.whl
Upload date: May 18, 2026
Size: 15.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.1

File hashes

Hashes for handwriting_json-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fd77004ede1aa316ea5b7caa2fa4c8e4edc1ebaef5adafc660c17008ef01917a`
MD5	`ff18c75a8cba48f85c5856cdc3614a10`
BLAKE2b-256	`55d25348b9b4c01b89a8d85e8599c5b4d62661a3e6d86282fe53f617364474e8`

See more details on using hashes here.

handwriting-json 0.1.1

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

Handwriting JSON

Why It Exists

Features

Install

Python API

CLI

Example Schemas

Schema Guidance

Why This Is Not Just OCR

Why LiteLLM, Not LangChain/LangGraph?

Roadmap

Links

License

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes