LiteParse Python

Python bindings for LiteParse — fast, lightweight PDF and document parsing with spatial text extraction.

Installation

pip install liteparse

This also installs the lit CLI command.

Quick Start

from liteparse import LiteParse

parser = LiteParse()
result = parser.parse("document.pdf")
print(result.text)

# Access structured data
for page in result.pages:
    print(f"Page {page.page_num}: {len(page.text_items)} text items")

Configuration

All options are passed to the constructor:

parser = LiteParse(
    ocr_enabled=True,              # Enable OCR (default: True)
    ocr_language="eng",            # Tesseract language code
    ocr_server_url=None,           # HTTP OCR server URL (optional)
    tessdata_path=None,            # Path to tessdata directory (optional)
    max_pages=1000,                # Max pages to parse
    target_pages="1-5,10",         # Specific pages (optional)
    dpi=150,                       # Rendering DPI
    preserve_very_small_text=False, # Keep tiny text
    password=None,                 # Password for protected documents
    quiet=False,                   # Suppress progress output
    num_workers=4,                 # Concurrent OCR workers
)
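The target_pages value above combines single pages and ranges in one string. As a rough illustration of how such a string reads, the sketch below expands it into explicit page numbers. expand_page_spec is a hypothetical helper, not part of liteparse, and it assumes 1-based pages and inclusive ranges; check the library's own behaviour before relying on either assumption.

```python
def expand_page_spec(spec: str) -> list[int]:
    """Expand a spec such as "1-5,10" into sorted, unique page numbers.

    Hypothetical helper for illustration only.
    Assumes 1-based page numbers and inclusive ranges.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    if any(page < 1 for page in pages):
        raise ValueError("page numbers must be >= 1")
    return sorted(pages)


print(expand_page_spec("1-5,10"))  # [1, 2, 3, 4, 5, 10]
```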

Parsing from Bytes

Pass raw PDF bytes directly — useful for web uploads or downloaded files:

with open("document.pdf", "rb") as f:
    result = parser.parse(f.read())
print(result.text)
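When the bytes come from an upload rather than a trusted file, it can help to reject payloads that are clearly not PDFs before handing them to the parser. The sketch below is one way to do that. parse_pdf_bytes is a hypothetical wrapper, not part of liteparse; it only assumes that the parser object has the parse method shown above.

```python
from typing import Any

PDF_MAGIC = b"%PDF-"


def parse_pdf_bytes(parser: Any, data: bytes) -> Any:
    """Parse in-memory PDF bytes, rejecting payloads that are clearly not PDF.

    Hypothetical wrapper for illustration; `parser` is any object with a
    parse(bytes) method, such as a LiteParse instance.
    """
    # PDF files begin with "%PDF-"; many readers tolerate a little leading
    # junk, so look within the first 1024 bytes rather than at offset 0 only.
    if PDF_MAGIC not in data[:1024]:
        raise ValueError("payload does not look like a PDF")
    return parser.parse(data)
```

With the parser from the Quick Start, a call would look like parse_pdf_bytes(parser, uploaded_bytes), where uploaded_bytes stands for whatever your web framework hands you.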

Screenshots

Generate PNG screenshots of document pages:

screenshots = parser.screenshot("document.pdf", page_numbers=[1, 2, 3])
for s in screenshots:
    print(f"Page {s.page_num}: {s.width}x{s.height}")
    with open(f"page_{s.page_num}.png", "wb") as f:
        f.write(s.image_bytes)
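For larger documents you may want the files in their own directory, plus a sanity check that each payload really is a PNG. The following sketch does both using only the page_num and image_bytes attributes shown above. save_screenshots and png_size are hypothetical helpers, not part of liteparse, and the zero-padded file name assumes page_num is an integer.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(image_bytes: bytes) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of a PNG."""
    if image_bytes[:8] != PNG_SIGNATURE or image_bytes[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    # IHDR stores width and height as big-endian 32-bit unsigned integers.
    return struct.unpack(">II", image_bytes[16:24])


def save_screenshots(screenshots, out_dir) -> list[Path]:
    """Write each screenshot to out_dir/page_NNNN.png and return the paths.

    Hypothetical helper; relies only on .page_num and .image_bytes.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for shot in screenshots:
        path = out / f"page_{shot.page_num:04d}.png"  # assumes an int page_num
        path.write_bytes(shot.image_bytes)
        paths.append(path)
    return paths
```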

Supported Formats

  • PDF (.pdf)
  • Microsoft Office (.docx, .xlsx, .pptx, etc.) — requires LibreOffice
  • OpenDocument (.odt, .ods, .odp) — requires LibreOffice
  • Images (.png, .jpg, .tiff, etc.) — requires ImageMagick
  • And more!
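Because some formats in the list above depend on external tools, a quick pre-flight check can give a clearer error than a failed parse. The sketch below maps a file extension to the tool the list names and looks for a matching executable on PATH. The helper names are hypothetical, the extension sets cover only the extensions spelled out above, and the executable names (soffice, libreoffice, magick, convert) are the usual ones for these tools; whether liteparse locates them the same way is an assumption.

```python
import shutil
from pathlib import Path
from typing import Optional

# Only the extensions named explicitly in the list above.
OFFICE_EXTS = {".docx", ".xlsx", ".pptx", ".odt", ".ods", ".odp"}
IMAGE_EXTS = {".png", ".jpg", ".tiff"}

# Common executable names; an assumption, not taken from liteparse itself.
# Note that on Windows "convert" can resolve to an unrelated system utility.
EXECUTABLES = {
    "LibreOffice": ("soffice", "libreoffice"),
    "ImageMagick": ("magick", "convert"),
}


def required_tool(path: str) -> Optional[str]:
    """Return the external tool a file would need, or None (e.g. for PDF)."""
    ext = Path(path).suffix.lower()
    if ext in OFFICE_EXTS:
        return "LibreOffice"
    if ext in IMAGE_EXTS:
        return "ImageMagick"
    return None


def tool_available(tool: str) -> bool:
    """Check whether any known executable for the tool is on PATH."""
    return any(shutil.which(name) is not None for name in EXECUTABLES[tool])


tool = required_tool("slides.pptx")
if tool is not None and not tool_available(tool):
    print(f"{tool} does not appear to be installed")
```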

CLI

The Python package includes the lit CLI:

lit parse document.pdf
lit parse document.pdf --format json -o output.json
lit screenshot document.pdf -o ./screenshots
lit batch-parse ./input ./output
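If you need to drive the CLI from a Python script, for example in a build step, the standard subprocess module is enough. The sketch below builds the same lit parse invocation shown above and uses only the flags that appear there (--format and -o). build_parse_command and run_parse are hypothetical helpers, not part of the package.

```python
import subprocess
from typing import Optional


def build_parse_command(
    input_path: str,
    output_path: Optional[str] = None,
    fmt: Optional[str] = None,
) -> list[str]:
    """Build an argument list for `lit parse` (hypothetical helper)."""
    cmd = ["lit", "parse", str(input_path)]
    if fmt:
        cmd += ["--format", fmt]
    if output_path:
        cmd += ["-o", str(output_path)]
    return cmd


def run_parse(
    input_path: str,
    output_path: Optional[str] = None,
    fmt: Optional[str] = None,
) -> subprocess.CompletedProcess:
    """Run `lit parse`; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(
        build_parse_command(input_path, output_path, fmt),
        check=True,
        capture_output=True,
        text=True,
    )


print(build_parse_command("document.pdf", "output.json", "json"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.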

Download files

Download the file for your platform.

Source Distribution

  • liteparse-2.0.1.tar.gz (114.5 kB): Source

Built Distributions

  • liteparse-2.0.1-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl (13.2 MB): PyPy, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.1-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl (13.0 MB): PyPy, manylinux: glibc 2.28+ ARM64
  • liteparse-2.0.1-cp315-cp315-manylinux_2_28_x86_64.whl (13.2 MB): CPython 3.15, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.1-cp315-cp315-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.15, manylinux: glibc 2.28+ ARM64
  • liteparse-2.0.1-cp314-cp314-win_amd64.whl (11.1 MB): CPython 3.14, Windows x86-64
  • liteparse-2.0.1-cp314-cp314-manylinux_2_28_x86_64.whl (13.2 MB): CPython 3.14, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.1-cp314-cp314-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.14, manylinux: glibc 2.28+ ARM64
  • liteparse-2.0.1-cp314-cp314-macosx_11_0_arm64.whl (11.0 MB): CPython 3.14, macOS 11.0+ ARM64
  • liteparse-2.0.1-cp313-cp313-win_amd64.whl (11.1 MB): CPython 3.13, Windows x86-64
  • liteparse-2.0.1-cp313-cp313-manylinux_2_28_x86_64.whl (13.1 MB): CPython 3.13, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.1-cp313-cp313-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.13, manylinux: glibc 2.28+ ARM64
  • liteparse-2.0.1-cp313-cp313-macosx_11_0_arm64.whl (11.0 MB): CPython 3.13, macOS 11.0+ ARM64
  • liteparse-2.0.1-cp312-cp312-win_amd64.whl (11.1 MB): CPython 3.12, Windows x86-64
  • liteparse-2.0.1-cp312-cp312-manylinux_2_28_x86_64.whl (13.1 MB): CPython 3.12, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.1-cp312-cp312-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.12, manylinux: glibc 2.28+ ARM64
  • liteparse-2.0.1-cp312-cp312-macosx_11_0_arm64.whl (11.0 MB): CPython 3.12, macOS 11.0+ ARM64
  • liteparse-2.0.1-cp311-cp311-win_amd64.whl (11.1 MB): CPython 3.11, Windows x86-64
  • liteparse-2.0.1-cp311-cp311-manylinux_2_28_x86_64.whl (13.2 MB): CPython 3.11, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.1-cp311-cp311-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.11, manylinux: glibc 2.28+ ARM64
  • liteparse-2.0.1-cp311-cp311-macosx_11_0_arm64.whl (11.0 MB): CPython 3.11, macOS 11.0+ ARM64
  • liteparse-2.0.1-cp310-cp310-win_amd64.whl (11.1 MB): CPython 3.10, Windows x86-64
  • liteparse-2.0.1-cp310-cp310-manylinux_2_28_x86_64.whl (13.2 MB): CPython 3.10, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.1-cp310-cp310-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.10, manylinux: glibc 2.28+ ARM64

File details

Details for the file liteparse-2.0.1.tar.gz.

File metadata

  • Download URL: liteparse-2.0.1.tar.gz
  • Upload date:
  • Size: 114.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for liteparse-2.0.1.tar.gz
Algorithm Hash digest
SHA256 c036eae33edaa8da527e0e8f9b5449fe36eadb0b6dc48a2bf3136095f918c170
MD5 2f7fcb74eccc6f473226cf345585bee1
BLAKE2b-256 07d93bc239a79beac3374f3811f9d62d4b462a158370fcb3cda20e07991fe892

See more details on using hashes here.
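To check a downloaded file against the SHA256 digest listed above, the standard hashlib module is sufficient. This is a minimal sketch; sha256_of_file and matches_sha256 are hypothetical helper names, and the local path in the usage note is an assumption about where you saved the download.

```python
import hashlib

# SHA256 digest for liteparse-2.0.1.tar.gz, copied from the table above.
EXPECTED_SDIST_SHA256 = (
    "c036eae33edaa8da527e0e8f9b5449fe36eadb0b6dc48a2bf3136095f918c170"
)


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_sha256("liteparse-2.0.1.tar.gz", EXPECTED_SDIST_SHA256) should return True for an intact download saved under that name in the current directory.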

File details

Details for the file liteparse-2.0.1-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.1-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 de988d0856b2ead34ab52dbed831f24b65539ec91fc77034ca7522958a186193
MD5 3d534939bd2e9a677efffefd5184e31b
BLAKE2b-256 f04a214de14a01d955f85c671f000ad8a1f985997e72ccfa7b42a322aa5c6852

File details

Details for the file liteparse-2.0.1-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.1-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 46e7d946cbb619b8a62d52a2fd99cbf6e72e022b0eb86a9d903ce0888ff80e6b
MD5 1f710581d9545643388775d362a9ed47
BLAKE2b-256 9d37ebff1f2325717030893ece2398e12714b22dfab47b8044c9a4c4f3cc4de1

File details

Details for the file liteparse-2.0.1-cp315-cp315-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.1-cp315-cp315-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 1026656b8245a68604dc59e5dece0063b70f3568f266843c4048d6632b3aaa08
MD5 4dca5e6f5066f9a261bc741ad48d158e
BLAKE2b-256 57b65fbfb8fbae1d0868f3a5de65b7bfe683f7a4a326eb9132b7658a0deeafbf

File details

Details for the file liteparse-2.0.1-cp315-cp315-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.1-cp315-cp315-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 a22d466b18887aa656e4955664b6cd417eecc883b909a8687e4952e0b1f78949
MD5 21940ad00c6449d5794ba4cf7e8ddc98
BLAKE2b-256 ab31b27db864276597b31c2f065ff8cdc2d219c418e33df0e85ecd87d56e6e0f

File details

Details for the file liteparse-2.0.1-cp314-cp314-win_amd64.whl.

File metadata

  • Download URL: liteparse-2.0.1-cp314-cp314-win_amd64.whl
  • Upload date:
  • Size: 11.1 MB
  • Tags: CPython 3.14, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for liteparse-2.0.1-cp314-cp314-win_amd64.whl
Algorithm Hash digest
SHA256 af4d13e1cb510e17b400cf57350f36e8b0cefe0f2cb128472ac842b29be41633
MD5 d3fa1cc8c703f6dfefd61a250b6fed1b
BLAKE2b-256 0de72a6f2379c4f2229f4b9a27f5e967c31b497582522a6e19d7329d575d7a60

File details

Details for the file liteparse-2.0.1-cp314-cp314-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.1-cp314-cp314-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 e1abe036f9aa71c44cfde35fbde8db260f18d71b8783af1724e05d53d7ddfd45
MD5 66c17254d379d0e9741a540a85451078
BLAKE2b-256 093ea64ed861c40355c61c8139007d5c4c8a4454340b429bfec7176adf68b89d

File details

Details for the file liteparse-2.0.1-cp314-cp314-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.1-cp314-cp314-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 6b5da9915563f2cc34c0097bb6a9ed7dbee9994091c94b2e88137dc7ef88e41b
MD5 b7c2ff38e3f0f71da94d8aaa0f9f4b3d
BLAKE2b-256 aee74bc6963bb9eb06621ef4bd162c17c33158a00cad077de81109f80d46a59d

File details

Details for the file liteparse-2.0.1-cp314-cp314-macosx_11_0_arm64.whl.

File hashes

Hashes for liteparse-2.0.1-cp314-cp314-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 e6b1e5ffdb63f131a6e28b70c582ca02fe8149fc5da8d827426eeb0b33e51fc3
MD5 7b77574d8d2501a7eb5fed1ff6a8be61
BLAKE2b-256 a3378a485829ff3f37f201eba9c5294e370651d93203cd6e1ea022751cdddc24

File details

Details for the file liteparse-2.0.1-cp313-cp313-win_amd64.whl.

File metadata

  • Download URL: liteparse-2.0.1-cp313-cp313-win_amd64.whl
  • Upload date:
  • Size: 11.1 MB
  • Tags: CPython 3.13, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for liteparse-2.0.1-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 c5522ea627c484f0c097cf850e2f303b931f10d765a817dc49c808c43af3353d
MD5 81782d36335c704e270eb5aa273851b7
BLAKE2b-256 21a9f7626c5f1772ea0d0d26a6de2d28d19af4f6de3849b7bf7af37930ec9820

File details

Details for the file liteparse-2.0.1-cp313-cp313-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.1-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 dc5c281254694c40b9d9f94fc36c02d8392c33282cd49a2791d9724bc3e70e85
MD5 1aad6433639087f8d3a709f08f8d43a2
BLAKE2b-256 2f663fe8d200de8ae4af10d935655c7a6a2a568c9c865ec6a2383fecb5c0420e

File details

Details for the file liteparse-2.0.1-cp313-cp313-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.1-cp313-cp313-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 b383b6042e748dd6606e1b864c46af64551a18b85d7a16dc2fb09b41124843b3
MD5 ae2fdd9f09cb6e3b29783fadda0c95e6
BLAKE2b-256 b064512661ec5334491c5acf398e9ea6903ad6941a2c6883d7df9ca9ce411d52

File details

Details for the file liteparse-2.0.1-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for liteparse-2.0.1-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 09bfa365a9612b23053f4021d4d42d2ea9f271e82b2191d665b0c1fb5062bd0d
MD5 b9b7a43e1ce3323cbf97c620571f7266
BLAKE2b-256 e9980f817e45e8d0ba79e86cb9c2253fa458d809e92534ff1f48df7fcdd7afe0

File details

Details for the file liteparse-2.0.1-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: liteparse-2.0.1-cp312-cp312-win_amd64.whl
  • Upload date:
  • Size: 11.1 MB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for liteparse-2.0.1-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 cd82a3cfdfbebbc7f0807522a3f1fea3ac002bcdfb86238757f99a5e91473fa0
MD5 34195c51dd1eaedb3468171bf3c43862
BLAKE2b-256 b08d41d0ba979fbabef5a0e046592278990e803951accb57f30da4986d9fe964

File details

Details for the file liteparse-2.0.1-cp312-cp312-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.1-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 f18f17755d5b95f222e3ac55b22096f6f48ae03c4931eca3af92b4de4b16d1e8
MD5 dce2c3997678c81c5027d0c752c7a423
BLAKE2b-256 548118f0136613363060721680b2e67cd4c342275c26cdde00e6f8f60a0fb301

File details

Details for the file liteparse-2.0.1-cp312-cp312-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.1-cp312-cp312-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 baa662c179e122980c2cb36fb75d3a7c3b3f0d1b03956629d6725ff9b4b7c810
MD5 0b712f18aa4fc1aa9bddaa271939d30f
BLAKE2b-256 d3273c80a35c9294bef646ff4c166e73f0b8bd78d4fc8b8c6fa194fe967a15dd

File details

Details for the file liteparse-2.0.1-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for liteparse-2.0.1-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 77145b5623b0f62553be313b4c1755464a9f185e99990ad0faee5592b8223fa3
MD5 bab01ee91a8c9691592c69c03aecb107
BLAKE2b-256 3edbae675fd5b45518c12950ce2b35f1fe2f70510a4abe5eb6e42918f0bde5f4

File details

Details for the file liteparse-2.0.1-cp311-cp311-win_amd64.whl.

File metadata

  • Download URL: liteparse-2.0.1-cp311-cp311-win_amd64.whl
  • Upload date:
  • Size: 11.1 MB
  • Tags: CPython 3.11, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for liteparse-2.0.1-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 a04911b1a778f76374d47a371e4950f6525886545b3a18db35b5bd927e127926
MD5 49841b581509e7548c06590f8a7211b5
BLAKE2b-256 c1e06f67c867f08ea955e6c12518fb7836101f90f72010cc29f1f4af73f8eaf2

File details

Details for the file liteparse-2.0.1-cp311-cp311-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.1-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 f52997f874fa7b58a6c89f34d51cb93af2ea585d1b77f2b9fbc90eeb257d82e1
MD5 fba41abaeeaf9a0ab43de2d8b6f68dd7
BLAKE2b-256 0791554d1e85901a95c9a49db5817e5e2e2b8e546632f217f23fa9e78793228d

File details

Details for the file liteparse-2.0.1-cp311-cp311-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.1-cp311-cp311-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 309e31a8ba260e1ec09280c6762f4807724cd1fcf43a618f167784139ce3048c
MD5 491dcb0f87d2c209ac2d956fbb22569d
BLAKE2b-256 3c5842fc6230849bfb4a975a48d84a6ac34f6230b1ed3f7354f59ec404a0bcc6

File details

Details for the file liteparse-2.0.1-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for liteparse-2.0.1-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 62cf80956fc4eb2d99de6b46428031b82fa47c04af26c394d855eb337eec66e0
MD5 402b1ff663e35afc59b4f643b47dad3c
BLAKE2b-256 db387926db1f786856e33a5a6da88a06fa9f3f88d045b64c0ac93658d7824fda

File details

Details for the file liteparse-2.0.1-cp310-cp310-win_amd64.whl.

File metadata

  • Download URL: liteparse-2.0.1-cp310-cp310-win_amd64.whl
  • Upload date:
  • Size: 11.1 MB
  • Tags: CPython 3.10, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for liteparse-2.0.1-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 4e2ed6349dabb51c22d80e5003a33fee6276e7590c0085739c5e167c205028dc
MD5 c2da842c77a8142847becadfa5df5580
BLAKE2b-256 14a206bd6d6725f7f5e1908c335189dc282269503eaf13a40468522c2fcf19e3

File details

Details for the file liteparse-2.0.1-cp310-cp310-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.1-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 36afd5c9c51f566ee960bd3101471826f9c57b3f244784c3c0476b6641e18b2e
MD5 a8c71b53dd36263dd00b3f84353c64e2
BLAKE2b-256 5c77fac7abd789d45bb49557321d4f41dda0dd30b905946278359489f4aa1d04

File details

Details for the file liteparse-2.0.1-cp310-cp310-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.1-cp310-cp310-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 fb731a44784078279d89717b4c9328c4f0dffc79d05d6375f5ad42a196dc5066
MD5 1ed981064708c4a2b1880dc95452a59d
BLAKE2b-256 52973076aa6dbf3b2528f95d105970e9b7cd759923ff9c5f26421738934e075a
