Python bindings for LiteParse - fast, lightweight PDF and document parsing

LiteParse Python

Python bindings for LiteParse — fast, lightweight PDF and document parsing with spatial text extraction.

Installation

pip install liteparse

This also installs the lit CLI command.
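
To confirm the install from Python, one option is to look up the distribution's version and check that the lit executable is on PATH. The sketch below uses only the standard library; the helper name install_report is illustrative and is not part of LiteParse.

```python
import shutil
from importlib import metadata


def install_report(package="liteparse", cli="lit"):
    """Return the installed version of `package` and the path of `cli` (None if missing)."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None
    # shutil.which returns None when the executable is not found on PATH.
    return {"version": version, "cli_path": shutil.which(cli)}


if __name__ == "__main__":
    print(install_report())
```

A None value for either key suggests the package or its CLI entry point is not visible in the current environment.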

Quick Start

from liteparse import LiteParse

parser = LiteParse()
result = parser.parse("document.pdf")
print(result.text)

# Access structured data
for page in result.pages:
    print(f"Page {page.page_num}: {len(page.text_items)} text items")

Configuration

All options are passed to the constructor:

parser = LiteParse(
    ocr_enabled=True,              # Enable OCR (default: True)
    ocr_language="eng",            # Tesseract language code
    ocr_server_url=None,           # HTTP OCR server URL (optional)
    tessdata_path=None,            # Path to tessdata directory (optional)
    max_pages=1000,                # Max pages to parse
    target_pages="1-5,10",         # Specific pages (optional)
    dpi=150,                       # Rendering DPI
    preserve_very_small_text=False, # Keep tiny text
    password=None,                 # Password for protected documents
    quiet=False,                   # Suppress progress output
    num_workers=4,                 # Concurrent OCR workers
)
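
The target_pages value "1-5,10" reads as a comma-separated list of single pages and ranges. The sketch below shows one plausible interpretation, assuming 1-based page numbers and inclusive ranges; LiteParse's own handling may differ. The helper expand_page_spec is hypothetical and only useful for checking a spec before passing it to the constructor.

```python
def expand_page_spec(spec):
    """Expand a spec such as "1-5,10" into a sorted list of unique page numbers.

    Assumes 1-based pages and inclusive ranges; this is an illustration,
    not LiteParse's own parser.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.add(page)
    return sorted(pages)


print(expand_page_spec("1-5,10"))  # [1, 2, 3, 4, 5, 10]
```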

Parsing from Bytes

Pass raw PDF bytes directly — useful for web uploads or downloaded files:

with open("document.pdf", "rb") as f:
    result = parser.parse(f.read())
print(result.text)
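
For web uploads it can help to cap how much data is read into memory before parsing. The following is a minimal sketch, assuming a buffered binary stream whose read(n) returns up to n bytes; read_limited, parse_stream and the 50 MB default are illustrative choices, and EchoParser is a stand-in for a real LiteParse instance so the example runs without any documents.

```python
import io


def read_limited(stream, max_bytes=50 * 1024 * 1024):
    """Read a binary stream fully, raising ValueError if it exceeds max_bytes."""
    data = stream.read(max_bytes + 1)  # one extra byte reveals oversized input
    if len(data) > max_bytes:
        raise ValueError(f"input exceeds {max_bytes} bytes")
    return data


def parse_stream(parser, stream, max_bytes=50 * 1024 * 1024):
    """Read `stream` into memory and hand the bytes to parser.parse()."""
    return parser.parse(read_limited(stream, max_bytes))


class EchoParser:
    """Stand-in for LiteParse in this sketch: returns the number of bytes received."""

    def parse(self, data):
        return len(data)


print(parse_stream(EchoParser(), io.BytesIO(b"%PDF-1.7 ...")))
```

With the real library, the same call would be parse_stream(parser, upload_stream), where parser is the LiteParse instance created earlier.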

Screenshots

Generate PNG screenshots of document pages:

screenshots = parser.screenshot("document.pdf", page_numbers=[1, 2, 3])
for s in screenshots:
    print(f"Page {s.page_num}: {s.width}x{s.height}")
    with open(f"page_{s.page_num}.png", "wb") as f:
        f.write(s.image_bytes)
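
When saving many pages, a small helper keeps the files in one directory with sortable names. This sketch relies only on the page_num and image_bytes attributes used above; save_screenshots and the zero-padded naming scheme are assumptions made for illustration, and the PNG signature check is an optional sanity test.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def save_screenshots(screenshots, out_dir):
    """Write each screenshot to out_dir/page_NNNN.png and return the written paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for shot in screenshots:
        if not shot.image_bytes.startswith(PNG_SIGNATURE):
            raise ValueError(f"page {shot.page_num}: data is not a PNG")
        target = out_path / f"page_{shot.page_num:04d}.png"
        target.write_bytes(shot.image_bytes)
        written.append(target)
    return written
```

Usage would look like save_screenshots(parser.screenshot("document.pdf", page_numbers=[1, 2, 3]), "./screenshots").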

Supported Formats

  • PDF (.pdf)
  • Microsoft Office (.docx, .xlsx, .pptx, etc.) — requires LibreOffice
  • OpenDocument (.odt, .ods, .odp) — requires LibreOffice
  • Images (.png, .jpg, .tiff, etc.) — requires ImageMagick
  • And more!
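
Because some formats depend on external tools, it can be useful to check for them before parsing. The sketch below maps the extensions listed above to the tool they need and looks for common executable names on PATH. The executable names (soffice, libreoffice, magick, convert) are the usual ones for those projects, but how LiteParse itself locates the tools is an assumption here, and both helper functions are hypothetical.

```python
import shutil
from pathlib import Path

# Extensions mirror the examples in the list above; they are not exhaustive.
OFFICE_SUFFIXES = {".docx", ".xlsx", ".pptx", ".odt", ".ods", ".odp"}
IMAGE_SUFFIXES = {".png", ".jpg", ".tiff"}

# Common executable names. On Windows, `convert` may resolve to an unrelated
# system utility, so treat a positive result there with caution.
TOOL_EXECUTABLES = {
    "LibreOffice": ("soffice", "libreoffice"),
    "ImageMagick": ("magick", "convert"),
}


def required_tool(filename):
    """Return the external tool a file likely needs, or None (for example, for PDFs)."""
    suffix = Path(filename).suffix.lower()
    if suffix in OFFICE_SUFFIXES:
        return "LibreOffice"
    if suffix in IMAGE_SUFFIXES:
        return "ImageMagick"
    return None


def tool_available(tool):
    """Return True if any known executable for `tool` is found on PATH."""
    return any(shutil.which(name) is not None for name in TOOL_EXECUTABLES[tool])
```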

CLI

The Python package includes the lit CLI:

lit parse document.pdf
lit parse document.pdf --format json -o output.json
lit screenshot document.pdf -o ./screenshots
lit batch-parse ./input ./output
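
The CLI can also be driven from a script. The following sketch builds the lit parse command using only the flags shown above and runs it with subprocess; the function names are illustrative, and it assumes lit returns a non-zero exit code on failure, which the source does not state.

```python
import shutil
import subprocess


def build_parse_command(input_path, output_path=None, fmt=None):
    """Build the argv list for `lit parse`, using only the flags shown above."""
    command = ["lit", "parse", str(input_path)]
    if fmt is not None:
        command += ["--format", fmt]
    if output_path is not None:
        command += ["-o", str(output_path)]
    return command


def run_parse(input_path, output_path=None, fmt=None):
    """Run `lit parse`; raises CalledProcessError if the exit code is non-zero."""
    if shutil.which("lit") is None:
        raise FileNotFoundError("the lit executable was not found on PATH")
    return subprocess.run(
        build_parse_command(input_path, output_path, fmt),
        check=True,
        capture_output=True,
        text=True,
    )
```

For example, run_parse("document.pdf", "output.json", fmt="json") corresponds to the second command above.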

Download files

Download the file for your platform.

Source Distribution

  • liteparse-2.0.4.tar.gz (115.5 kB): Source

Built Distributions

  • liteparse-2.0.4-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl (13.2 MB): PyPy, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.4-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl (13.0 MB): PyPy, manylinux: glibc 2.28+ ARM64
  • liteparse-2.0.4-cp315-cp315-manylinux_2_28_x86_64.whl (13.1 MB): CPython 3.15, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.4-cp315-cp315-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.15, manylinux: glibc 2.28+ ARM64
  • liteparse-2.0.4-cp314-cp314-win_amd64.whl (11.1 MB): CPython 3.14, Windows x86-64
  • liteparse-2.0.4-cp314-cp314-manylinux_2_28_x86_64.whl (13.1 MB): CPython 3.14, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.4-cp314-cp314-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.14, manylinux: glibc 2.28+ ARM64
  • liteparse-2.0.4-cp314-cp314-macosx_11_0_arm64.whl (11.0 MB): CPython 3.14, macOS 11.0+ ARM64
  • liteparse-2.0.4-cp313-cp313-win_amd64.whl (11.1 MB): CPython 3.13, Windows x86-64
  • liteparse-2.0.4-cp313-cp313-manylinux_2_28_x86_64.whl (13.1 MB): CPython 3.13, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.4-cp313-cp313-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.13, manylinux: glibc 2.28+ ARM64
  • liteparse-2.0.4-cp313-cp313-macosx_11_0_arm64.whl (11.0 MB): CPython 3.13, macOS 11.0+ ARM64
  • liteparse-2.0.4-cp312-cp312-win_amd64.whl (11.1 MB): CPython 3.12, Windows x86-64
  • liteparse-2.0.4-cp312-cp312-manylinux_2_28_x86_64.whl (13.1 MB): CPython 3.12, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.4-cp312-cp312-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.12, manylinux: glibc 2.28+ ARM64
  • liteparse-2.0.4-cp312-cp312-macosx_11_0_arm64.whl (11.0 MB): CPython 3.12, macOS 11.0+ ARM64
  • liteparse-2.0.4-cp311-cp311-win_amd64.whl (11.1 MB): CPython 3.11, Windows x86-64
  • liteparse-2.0.4-cp311-cp311-manylinux_2_28_x86_64.whl (13.1 MB): CPython 3.11, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.4-cp311-cp311-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.11, manylinux: glibc 2.28+ ARM64
  • liteparse-2.0.4-cp311-cp311-macosx_11_0_arm64.whl (11.0 MB): CPython 3.11, macOS 11.0+ ARM64
  • liteparse-2.0.4-cp310-cp310-win_amd64.whl (11.1 MB): CPython 3.10, Windows x86-64
  • liteparse-2.0.4-cp310-cp310-manylinux_2_28_x86_64.whl (13.1 MB): CPython 3.10, manylinux: glibc 2.28+ x86-64
  • liteparse-2.0.4-cp310-cp310-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.10, manylinux: glibc 2.28+ ARM64
