LiteParse Python

Python bindings for LiteParse — fast, lightweight PDF and document parsing with spatial text extraction.

Installation

pip install liteparse

This also installs the lit CLI command.

Quick Start

from liteparse import LiteParse

parser = LiteParse()
result = parser.parse("document.pdf")
print(result.text)

# Access structured data
for page in result.pages:
    print(f"Page {page.page_num}: {len(page.text_items)} text items")

Configuration

All options are passed to the constructor:

parser = LiteParse(
    ocr_enabled=True,              # Enable OCR (default: True)
    ocr_language="eng",            # Tesseract language code
    ocr_server_url=None,           # HTTP OCR server URL (optional)
    tessdata_path=None,            # Path to tessdata directory (optional)
    max_pages=1000,                # Max pages to parse
    target_pages="1-5,10",         # Specific pages (optional)
    dpi=150,                       # Rendering DPI
    preserve_very_small_text=False, # Keep tiny text
    password=None,                 # Password for protected documents
    quiet=False,                   # Suppress progress output
    num_workers=4,                 # Concurrent OCR workers
)
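
The target_pages value above is a string of comma-separated pages and ranges. When the pages you want live in a Python list, a small helper can build that string. This is a minimal sketch: to_target_pages is a hypothetical helper, not part of liteparse, and it assumes the "1-5,10" syntax shown above generalizes to any mix of single pages and ranges.

```python
from typing import Iterable


def to_target_pages(pages: Iterable[int]) -> str:
    """Collapse page numbers into a "1-5,10"-style string (hypothetical helper)."""
    ordered = sorted(set(pages))  # drop duplicates, ascending order
    if not ordered:
        return ""
    parts = []
    start = prev = ordered[0]
    for page in ordered[1:]:
        if page == prev + 1:
            # Still inside a consecutive run; extend it.
            prev = page
            continue
        # The run ended: emit "N" for a single page or "N-M" for a range.
        parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = page
    parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


if __name__ == "__main__":
    print(to_target_pages([1, 2, 3, 4, 5, 10]))  # prints 1-5,10
```

The result can then be passed as LiteParse(target_pages=to_target_pages(wanted)).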

Parsing from Bytes

Pass raw PDF bytes directly — useful for web uploads or downloaded files:

with open("document.pdf", "rb") as f:
    result = parser.parse(f.read())
print(result.text)
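
For web uploads it can help to reject obviously wrong payloads before they reach the parser. The sketch below checks for the %PDF- marker that PDF files start with, then calls parse() on whatever parser object it is given. parse_upload and looks_like_pdf are hypothetical helpers, and the check is deliberately strict: a file with stray bytes before the header would be rejected.

```python
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the payload begins with the PDF header marker."""
    return data.startswith(PDF_MAGIC)


def parse_upload(parser, data: bytes):
    """Validate an uploaded payload, then hand it to parser.parse().

    `parser` is any object with a parse(bytes) method, for example a
    LiteParse instance created as in the Quick Start.
    """
    if not data:
        raise ValueError("empty upload")
    if not looks_like_pdf(data):
        raise ValueError("upload does not look like a PDF")
    return parser.parse(data)
```

In a web handler this might be called as result = parse_upload(parser, request_body), where request_body holds the uploaded bytes.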

Screenshots

Generate PNG screenshots of document pages:

screenshots = parser.screenshot("document.pdf", page_numbers=[1, 2, 3])
for s in screenshots:
    print(f"Page {s.page_num}: {s.width}x{s.height}")
    with open(f"page_{s.page_num}.png", "wb") as f:
        f.write(s.image_bytes)
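
The same loop can be wrapped in a helper that writes into a dedicated directory and returns the paths it created. This is a sketch only: save_screenshots is a hypothetical function, and it assumes page_num is an integer and image_bytes holds PNG data, as the example above suggests.

```python
from pathlib import Path


def save_screenshots(screenshots, out_dir):
    """Write each screenshot's image bytes into out_dir and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the directory if needed
    paths = []
    for shot in screenshots:
        # Zero-pad the page number so page_0002.png sorts before page_0010.png.
        path = out / f"page_{shot.page_num:04d}.png"
        path.write_bytes(shot.image_bytes)
        paths.append(path)
    return paths
```

Usage would look like save_screenshots(parser.screenshot("document.pdf", page_numbers=[1, 2, 3]), "./screenshots").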

Supported Formats

  • PDF (.pdf)
  • Microsoft Office (.docx, .xlsx, .pptx, etc.) — requires LibreOffice
  • OpenDocument (.odt, .ods, .odp) — requires LibreOffice
  • Images (.png, .jpg, .tiff, etc.) — requires ImageMagick
  • And more!
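
Because some formats depend on external tools, a pre-flight check can give a clearer error than a failed parse. The sketch below maps the extensions listed above to the tool they require and looks for that tool on PATH. The executable names (soffice, libreoffice, magick, convert) are assumptions about a typical install, not names confirmed by liteparse, and the "etc." entries in the list are not expanded.

```python
import shutil
from pathlib import Path

# Extension groups copied from the list above.
OFFICE_EXTS = {".docx", ".xlsx", ".pptx", ".odt", ".ods", ".odp"}
IMAGE_EXTS = {".png", ".jpg", ".tiff"}

# Assumed executable names for each tool; adjust for your system.
TOOL_BINARIES = {
    "LibreOffice": ("soffice", "libreoffice"),
    "ImageMagick": ("magick", "convert"),
}


def required_tool(filename):
    """Return the external tool a file needs, or None if none is listed."""
    ext = Path(filename).suffix.lower()
    if ext in OFFICE_EXTS:
        return "LibreOffice"
    if ext in IMAGE_EXTS:
        return "ImageMagick"
    return None


def tool_available(tool):
    """Return True if any known executable for the tool is on PATH."""
    return any(shutil.which(name) for name in TOOL_BINARIES[tool])


def check_dependencies(filename):
    """Raise RuntimeError if the file needs a tool that cannot be found."""
    tool = required_tool(filename)
    if tool is not None and not tool_available(tool):
        raise RuntimeError(f"{filename} needs {tool}, which was not found on PATH")
```

On Windows, convert can resolve to an unrelated system utility, so checking for magick alone may be more reliable there.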

CLI

The Python package includes the lit CLI:

lit parse document.pdf
lit parse document.pdf --format json -o output.json
lit screenshot document.pdf -o ./screenshots
lit batch-parse ./input ./output
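
The CLI can also be driven from a script. The following is a sketch, not an official wrapper: build_parse_command and run_parse are hypothetical helpers that assemble the JSON-output command shown above and run it with subprocess. It assumes lit exits with a non-zero status on failure.

```python
import shutil
import subprocess


def build_parse_command(input_path, output_path):
    """Assemble the `lit parse ... --format json -o ...` argument list."""
    return ["lit", "parse", str(input_path), "--format", "json", "-o", str(output_path)]


def run_parse(input_path, output_path):
    """Run `lit parse` and return the CompletedProcess."""
    if shutil.which("lit") is None:
        raise FileNotFoundError("lit not found on PATH; is liteparse installed?")
    # check=True raises CalledProcessError if the command exits non-zero.
    return subprocess.run(
        build_parse_command(input_path, output_path),
        check=True,
        capture_output=True,
        text=True,
    )
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.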

Download files

Download the file for your platform.

Source Distribution

  • liteparse-2.0.0.tar.gz (109.6 kB)

Built Distributions

  • liteparse-2.0.0-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl (16.6 MB): PyPy, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl (16.5 MB): PyPy, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0-cp315-cp315-manylinux_2_28_x86_64.whl (16.6 MB): CPython 3.15, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0-cp315-cp315-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.15, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0-cp314-cp314-win_amd64.whl (11.1 MB): CPython 3.14, Windows x86-64
  • liteparse-2.0.0-cp314-cp314-manylinux_2_28_x86_64.whl (16.6 MB): CPython 3.14, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0-cp314-cp314-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.14, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0-cp314-cp314-macosx_11_0_arm64.whl (11.0 MB): CPython 3.14, macOS 11.0+, ARM64
  • liteparse-2.0.0-cp313-cp313-win_amd64.whl (11.1 MB): CPython 3.13, Windows x86-64
  • liteparse-2.0.0-cp313-cp313-manylinux_2_28_x86_64.whl (16.6 MB): CPython 3.13, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0-cp313-cp313-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.13, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0-cp313-cp313-macosx_11_0_arm64.whl (11.0 MB): CPython 3.13, macOS 11.0+, ARM64
  • liteparse-2.0.0-cp312-cp312-win_amd64.whl (11.1 MB): CPython 3.12, Windows x86-64
  • liteparse-2.0.0-cp312-cp312-manylinux_2_28_x86_64.whl (16.6 MB): CPython 3.12, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0-cp312-cp312-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.12, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0-cp312-cp312-macosx_11_0_arm64.whl (11.0 MB): CPython 3.12, macOS 11.0+, ARM64
  • liteparse-2.0.0-cp311-cp311-win_amd64.whl (11.1 MB): CPython 3.11, Windows x86-64
  • liteparse-2.0.0-cp311-cp311-manylinux_2_28_x86_64.whl (16.6 MB): CPython 3.11, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0-cp311-cp311-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.11, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0-cp311-cp311-macosx_11_0_arm64.whl (11.0 MB): CPython 3.11, macOS 11.0+, ARM64
  • liteparse-2.0.0-cp310-cp310-win_amd64.whl (11.1 MB): CPython 3.10, Windows x86-64
  • liteparse-2.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (16.6 MB): CPython 3.10, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0-cp310-cp310-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.10, manylinux (glibc 2.28+), ARM64