Python bindings for LiteParse - fast, lightweight PDF and document parsing

LiteParse Python

Python bindings for LiteParse — fast, lightweight PDF and document parsing with spatial text extraction.

Installation

pip install liteparse

This also installs the lit CLI command.

Quick Start

from liteparse import LiteParse

parser = LiteParse()
result = parser.parse("document.pdf")
print(result.text)

# Access structured data
for page in result.pages:
    print(f"Page {page.page_num}: {len(page.text_items)} text items")
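The structured data can also be condensed into a per-page summary, for example to spot pages that yielded no text. The sketch below relies only on the attributes shown above (result.pages, page.page_num, page.text_items); the helper names and the stand-in result object are hypothetical and exist only so the snippet runs without a document.

```python
from types import SimpleNamespace


def summarize_pages(result):
    """Return (page_num, text_item_count) pairs for a parse result."""
    return [(page.page_num, len(page.text_items)) for page in result.pages]


def pages_without_text(result):
    """Return the page numbers that produced no text items."""
    return [num for num, count in summarize_pages(result) if count == 0]


# Stand-in for illustration only; a real result comes from parser.parse().
demo = SimpleNamespace(
    text="hello",
    pages=[
        SimpleNamespace(page_num=1, text_items=["hello"]),
        SimpleNamespace(page_num=2, text_items=[]),
    ],
)
print(summarize_pages(demo))
print(pages_without_text(demo))
```

With a real document, pass the object returned by parser.parse() instead of demo.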

Configuration

All options are passed to the constructor:

parser = LiteParse(
    ocr_enabled=True,              # Enable OCR (default: True)
    ocr_language="eng",            # Tesseract language code
    ocr_server_url=None,           # HTTP OCR server URL (optional)
    tessdata_path=None,            # Path to tessdata directory (optional)
    max_pages=1000,                # Max pages to parse
    target_pages="1-5,10",         # Specific pages (optional)
    dpi=150,                       # Rendering DPI
    preserve_very_small_text=False, # Keep tiny text
    password=None,                 # Password for protected documents
    quiet=False,                   # Suppress progress output
    num_workers=4,                 # Concurrent OCR workers
)
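The target_pages value above looks like a comma-separated list of single pages and ranges. The helper below is a hypothetical sketch, not part of LiteParse, that expands such a string into explicit page numbers so a selection can be checked before parsing. It assumes ranges are inclusive and pages are numbered from 1; the grammar LiteParse actually accepts may differ.

```python
def expand_page_spec(spec):
    """Expand a spec such as "1-5,10" into a sorted list of page numbers.

    Assumes inclusive ranges and 1-based numbering (not confirmed by the
    LiteParse documentation).
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    if any(page < 1 for page in pages):
        raise ValueError("page numbers are assumed to start at 1")
    return sorted(pages)


print(expand_page_spec("1-5,10"))
```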

Parsing from Bytes

Pass raw PDF bytes directly — useful for web uploads or downloaded files:

with open("document.pdf", "rb") as f:
    result = parser.parse(f.read())
print(result.text)

Screenshots

Generate PNG screenshots of document pages:

screenshots = parser.screenshot("document.pdf", page_numbers=[1, 2, 3])
for s in screenshots:
    print(f"Page {s.page_num}: {s.width}x{s.height}")
    with open(f"page_{s.page_num}.png", "wb") as f:
        f.write(s.image_bytes)
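For larger documents, writing the images into a dedicated directory with zero-padded names keeps them sorted. This sketch uses only the page_num and image_bytes attributes shown above; save_screenshots and the stand-in screenshot object are hypothetical. The PNG signature check is an extra safeguard, not something LiteParse requires.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def save_screenshots(screenshots, out_dir):
    """Write each screenshot to out_dir as page_0001.png, page_0002.png, ..."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for shot in screenshots:
        if not shot.image_bytes.startswith(PNG_SIGNATURE):
            raise ValueError(f"page {shot.page_num}: data is not a PNG image")
        target = out_path / f"page_{shot.page_num:04d}.png"
        target.write_bytes(shot.image_bytes)
        written.append(target)
    return written


# Stand-in screenshot for illustration; real ones come from parser.screenshot().
with tempfile.TemporaryDirectory() as tmp:
    fake = SimpleNamespace(page_num=3, image_bytes=PNG_SIGNATURE + b"demo")
    demo_names = [path.name for path in save_screenshots([fake], tmp)]
print(demo_names)
```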

Supported Formats

  • PDF (.pdf)
  • Microsoft Office (.docx, .xlsx, .pptx, etc.) — requires LibreOffice
  • OpenDocument (.odt, .ods, .odp) — requires LibreOffice
  • Images (.png, .jpg, .tiff, etc.) — requires ImageMagick
  • And more!
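Because some formats depend on external tools, a quick pre-flight check can produce a clearer error than a failed parse. The sketch below maps only the extensions named in the list above to the tool they require. The executable names (soffice, libreoffice, magick, convert) are assumptions about typical installations; how LiteParse itself locates these tools is not documented here.

```python
import shutil
from pathlib import Path

# Extension -> external tool, limited to the extensions named in the list.
EXTERNAL_TOOL = {
    ".docx": "LibreOffice", ".xlsx": "LibreOffice", ".pptx": "LibreOffice",
    ".odt": "LibreOffice", ".ods": "LibreOffice", ".odp": "LibreOffice",
    ".png": "ImageMagick", ".jpg": "ImageMagick", ".tiff": "ImageMagick",
}

# Assumed executable names; adjust for your platform. On Windows, "convert"
# can also refer to an unrelated system utility.
TOOL_EXECUTABLES = {
    "LibreOffice": ("soffice", "libreoffice"),
    "ImageMagick": ("magick", "convert"),
}


def required_tool(path):
    """Return the external tool needed for this file, or None (e.g. for PDF)."""
    return EXTERNAL_TOOL.get(Path(path).suffix.lower())


def missing_tool(path):
    """Return the name of a required tool that is not on PATH, else None."""
    tool = required_tool(path)
    if tool is None:
        return None
    if any(shutil.which(exe) for exe in TOOL_EXECUTABLES[tool]):
        return None
    return tool


print(required_tool("report.docx"), required_tool("document.pdf"))
```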

CLI

The Python package includes the lit CLI:

lit parse document.pdf
lit parse document.pdf --format json -o output.json
lit screenshot document.pdf -o ./screenshots
lit batch-parse ./input ./output
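The CLI can also be driven from a Python script, for instance inside a build step. This is a sketch using only the subcommand and flags shown above (parse, --format, -o); the helper names are hypothetical, and it assumes the two flags can be given independently of each other.

```python
import shutil
import subprocess


def build_parse_command(input_path, output_path=None, fmt=None):
    """Build the argument list for `lit parse`, using flags from the examples."""
    cmd = ["lit", "parse", str(input_path)]
    if fmt is not None:
        cmd += ["--format", fmt]
    if output_path is not None:
        cmd += ["-o", str(output_path)]
    return cmd


def run_parse(input_path, output_path=None, fmt=None):
    """Run `lit parse` and return the CompletedProcess; raises on failure."""
    if shutil.which("lit") is None:
        raise FileNotFoundError("the lit command is not on PATH")
    return subprocess.run(
        build_parse_command(input_path, output_path, fmt),
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,  # keep stdout/stderr for the caller
        text=True,
    )


print(build_parse_command("document.pdf", "output.json", "json"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.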

Download files

Source Distribution

  • liteparse-2.0.0b3.tar.gz (107.2 kB)

Built Distributions

  • liteparse-2.0.0b3-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl (16.7 MB): PyPy, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b3-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl (16.5 MB): PyPy, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b3-cp315-cp315-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.15, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b3-cp315-cp315-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.15, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b3-cp314-cp314-win_amd64.whl (11.1 MB): CPython 3.14, Windows, x86-64
  • liteparse-2.0.0b3-cp314-cp314-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.14, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b3-cp314-cp314-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.14, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b3-cp314-cp314-macosx_11_0_arm64.whl (11.0 MB): CPython 3.14, macOS 11.0+, ARM64
  • liteparse-2.0.0b3-cp313-cp313-win_amd64.whl (11.1 MB): CPython 3.13, Windows, x86-64
  • liteparse-2.0.0b3-cp313-cp313-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.13, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b3-cp313-cp313-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.13, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b3-cp313-cp313-macosx_11_0_arm64.whl (11.0 MB): CPython 3.13, macOS 11.0+, ARM64
  • liteparse-2.0.0b3-cp312-cp312-win_amd64.whl (11.1 MB): CPython 3.12, Windows, x86-64
  • liteparse-2.0.0b3-cp312-cp312-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.12, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b3-cp312-cp312-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.12, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b3-cp312-cp312-macosx_11_0_arm64.whl (11.0 MB): CPython 3.12, macOS 11.0+, ARM64
  • liteparse-2.0.0b3-cp311-cp311-win_amd64.whl (11.1 MB): CPython 3.11, Windows, x86-64
  • liteparse-2.0.0b3-cp311-cp311-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.11, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b3-cp311-cp311-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.11, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b3-cp311-cp311-macosx_11_0_arm64.whl (11.0 MB): CPython 3.11, macOS 11.0+, ARM64
  • liteparse-2.0.0b3-cp310-cp310-win_amd64.whl (11.1 MB): CPython 3.10, Windows, x86-64
  • liteparse-2.0.0b3-cp310-cp310-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.10, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b3-cp310-cp310-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.10, manylinux (glibc 2.28+), ARM64
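The wheel names above follow the standard wheel naming scheme: distribution, version, an optional build tag, then the Python, ABI, and platform tags. The sketch below splits a file name into those fields, which can help when picking a file by hand. parse_wheel_filename and WheelTags are hypothetical helpers written for this illustration; installers such as pip do this matching automatically.

```python
from typing import NamedTuple, Optional


class WheelTags(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        name, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelTags(name, version, build, py, abi, plat)


tags = parse_wheel_filename("liteparse-2.0.0b3-cp314-cp314-macosx_11_0_arm64.whl")
print(tags.python_tag, tags.platform_tag)
```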
