LiteParse Python

Python bindings for LiteParse — fast, lightweight PDF and document parsing with spatial text extraction.

Installation

pip install liteparse

This also installs the lit CLI command.

Quick Start

from liteparse import LiteParse

parser = LiteParse()
result = parser.parse("document.pdf")
print(result.text)

# Access structured data
for page in result.pages:
    print(f"Page {page.page_num}: {len(page.text_items)} text items")
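
The per-page loop above can be wrapped in a small helper when the same summary is needed in several places. The sketch below relies only on the attributes shown in the Quick Start (pages, page_num, text_items); the name page_summary is hypothetical and not part of LiteParse.

```python
from typing import Any, List, Tuple


def page_summary(result: Any) -> List[Tuple[int, int]]:
    """Return (page_num, number_of_text_items) for each parsed page.

    `result` is any object shaped like the value returned by
    LiteParse().parse(...): it needs a `pages` iterable whose items
    expose `page_num` and `text_items`. Illustrative helper only.
    """
    return [(page.page_num, len(page.text_items)) for page in result.pages]
```

With a real document this would be called as page_summary(parser.parse("document.pdf")).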

Configuration

All options are passed to the constructor:

parser = LiteParse(
    ocr_enabled=True,              # Enable OCR (default: True)
    ocr_language="eng",            # Tesseract language code
    ocr_server_url=None,           # HTTP OCR server URL (optional)
    tessdata_path=None,            # Path to tessdata directory (optional)
    max_pages=1000,                # Max pages to parse
    target_pages="1-5,10",         # Specific pages (optional)
    dpi=150,                       # Rendering DPI
    preserve_very_small_text=False, # Keep tiny text
    password=None,                 # Password for protected documents
    quiet=False,                   # Suppress progress output
    num_workers=4,                 # Concurrent OCR workers
)
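
The target_pages value above is a compact page specification. To make its likely meaning concrete, the sketch below expands such a string into explicit page numbers. The grammar is an assumption, not something the LiteParse documentation confirms here: comma-separated entries, each either a single 1-based page or an inclusive start-end range. The function expand_page_spec is hypothetical and is not part of the library.

```python
from typing import List


def expand_page_spec(spec: str) -> List[int]:
    """Expand a spec such as "1-5,10" into a sorted list of page numbers.

    Assumed grammar (not confirmed by LiteParse): comma-separated entries,
    each a single 1-based page or an inclusive "start-end" range.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    if any(p < 1 for p in pages):
        raise ValueError("page numbers are assumed to start at 1")
    return sorted(pages)
```

Under these assumptions, "1-5,10" selects pages 1 through 5 plus page 10.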

Parsing from Bytes

Pass raw PDF bytes directly to parse(), which is useful for web uploads or downloaded files:

with open("document.pdf", "rb") as f:
    result = parser.parse(f.read())
print(result.text)
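
For the web-upload case, the bytes usually arrive as a binary stream rather than a file on disk. The following is a minimal sketch, assuming only that parser.parse() accepts bytes as shown above. The helper parse_upload and the early header check are illustrative additions, not LiteParse features.

```python
from typing import Any, BinaryIO

PDF_MAGIC = b"%PDF-"


def parse_upload(parser: Any, stream: BinaryIO) -> Any:
    """Read a binary stream fully and hand the bytes to parser.parse().

    `parser` is expected to be a LiteParse instance; `stream` is any
    binary file-like object (an upload, a response body, io.BytesIO).
    """
    data = stream.read()
    # PDF files carry a "%PDF-" header near the start; rejecting anything
    # else early avoids sending arbitrary bytes to the parser.
    if PDF_MAGIC not in data[:1024]:
        raise ValueError("input does not look like a PDF")
    return parser.parse(data)
```

The parser is passed in as an argument so that one configured LiteParse instance can be reused across requests.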

Screenshots

Generate PNG screenshots of document pages:

screenshots = parser.screenshot("document.pdf", page_numbers=[1, 2, 3])
for s in screenshots:
    print(f"Page {s.page_num}: {s.width}x{s.height}")
    with open(f"page_{s.page_num}.png", "wb") as f:
        f.write(s.image_bytes)
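
When screenshots need to land in a dedicated directory, the loop above can be factored into a helper. This sketch uses only the page_num and image_bytes attributes shown above; the function name save_screenshots and the zero-padded file naming are hypothetical choices.

```python
from pathlib import Path
from typing import Any, Iterable, List


def save_screenshots(screenshots: Iterable[Any], out_dir: str) -> List[Path]:
    """Write each screenshot's PNG bytes to out_dir/page_<NNNN>.png."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for shot in screenshots:
        # Zero-padding keeps files in page order when sorted by name.
        path = target / f"page_{shot.page_num:04d}.png"
        path.write_bytes(shot.image_bytes)
        written.append(path)
    return written
```

A typical call would be save_screenshots(parser.screenshot("document.pdf", page_numbers=[1, 2, 3]), "screenshots").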

Supported Formats

  • PDF (.pdf)
  • Microsoft Office (.docx, .xlsx, .pptx, etc.) — requires LibreOffice
  • OpenDocument (.odt, .ods, .odp) — requires LibreOffice
  • Images (.png, .jpg, .tiff, etc.) — requires ImageMagick
  • And more!
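
Because some formats depend on external tools, it can help to check for them before parsing. The sketch below maps the extensions named in the list above to their stated dependency and looks for an executable on PATH. The executable names (soffice, libreoffice, magick, convert) are common defaults and an assumption here; how LiteParse itself locates these tools is not described in this document.

```python
import shutil
from pathlib import Path
from typing import Dict, Optional, Tuple

# Extension -> external dependency, taken from the list above.
REQUIRES: Dict[str, str] = {
    ".docx": "LibreOffice", ".xlsx": "LibreOffice", ".pptx": "LibreOffice",
    ".odt": "LibreOffice", ".ods": "LibreOffice", ".odp": "LibreOffice",
    ".png": "ImageMagick", ".jpg": "ImageMagick", ".tiff": "ImageMagick",
}

# Assumed executable names; adjust for the local installation.
EXECUTABLES: Dict[str, Tuple[str, ...]] = {
    "LibreOffice": ("soffice", "libreoffice"),
    "ImageMagick": ("magick", "convert"),
}


def required_tool(filename: str) -> Optional[str]:
    """Return the external tool a file needs, or None (for example, PDF)."""
    return REQUIRES.get(Path(filename).suffix.lower())


def tool_available(tool: str) -> bool:
    """True if any assumed executable for `tool` is found on PATH."""
    return any(shutil.which(name) for name in EXECUTABLES[tool])
```

On Windows, convert may resolve to an unrelated system utility, so a positive result for ImageMagick should be treated as a hint rather than proof.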

CLI

The Python package includes the lit CLI:

lit parse document.pdf
lit parse document.pdf --format json -o output.json
lit screenshot document.pdf -o ./screenshots
lit batch-parse ./input ./output
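
The CLI can also be driven from Python through subprocess, for instance in a build script. The sketch below uses only the subcommand and flags shown above. The function names are hypothetical, and it assumes lit is on PATH and exits with a non-zero status on failure.

```python
import subprocess
from typing import List


def build_parse_command(source: str, output: str) -> List[str]:
    """Argument list equivalent to: lit parse <source> --format json -o <output>."""
    return ["lit", "parse", source, "--format", "json", "-o", output]


def run_parse(source: str, output: str) -> None:
    """Run the CLI; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_parse_command(source, output), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.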

Download files

Source Distribution

  • liteparse-2.0.0b2.tar.gz (107.2 kB)

Built Distributions

  • liteparse-2.0.0b2-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl (16.7 MB): PyPy, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b2-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl (16.5 MB): PyPy, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b2-cp315-cp315-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.15, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b2-cp315-cp315-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.15, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b2-cp314-cp314-win_amd64.whl (11.1 MB): CPython 3.14, Windows x86-64
  • liteparse-2.0.0b2-cp314-cp314-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.14, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b2-cp314-cp314-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.14, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b2-cp314-cp314-macosx_11_0_arm64.whl (11.0 MB): CPython 3.14, macOS 11.0+, ARM64
  • liteparse-2.0.0b2-cp313-cp313-win_amd64.whl (11.1 MB): CPython 3.13, Windows x86-64
  • liteparse-2.0.0b2-cp313-cp313-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.13, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b2-cp313-cp313-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.13, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b2-cp313-cp313-macosx_11_0_arm64.whl (11.0 MB): CPython 3.13, macOS 11.0+, ARM64
  • liteparse-2.0.0b2-cp312-cp312-win_amd64.whl (11.1 MB): CPython 3.12, Windows x86-64
  • liteparse-2.0.0b2-cp312-cp312-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.12, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b2-cp312-cp312-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.12, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b2-cp312-cp312-macosx_11_0_arm64.whl (11.0 MB): CPython 3.12, macOS 11.0+, ARM64
  • liteparse-2.0.0b2-cp311-cp311-win_amd64.whl (11.1 MB): CPython 3.11, Windows x86-64
  • liteparse-2.0.0b2-cp311-cp311-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.11, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b2-cp311-cp311-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.11, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b2-cp311-cp311-macosx_11_0_arm64.whl (11.0 MB): CPython 3.11, macOS 11.0+, ARM64
  • liteparse-2.0.0b2-cp310-cp310-win_amd64.whl (11.1 MB): CPython 3.10, Windows x86-64
  • liteparse-2.0.0b2-cp310-cp310-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.10, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b2-cp310-cp310-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.10, manylinux (glibc 2.28+), ARM64