LiteParse Python

Python bindings for LiteParse — fast, lightweight PDF and document parsing with spatial text extraction.

Installation

pip install liteparse

This also installs the lit CLI command.

Quick Start

from liteparse import LiteParse

parser = LiteParse()
result = parser.parse("document.pdf")
print(result.text)

# Access structured data
for page in result.pages:
    print(f"Page {page.page_num}: {len(page.text_items)} text items")
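
The loop above can be wrapped in a small helper. This is a minimal sketch that relies only on the attributes shown in the Quick Start (`result.pages`, `page.page_num`, `page.text_items`). The name `page_summary` is hypothetical and not part of liteparse, and the example runs against stand-in objects rather than a real parse result.

```python
from types import SimpleNamespace


def page_summary(result):
    """Return {page_num: number of text items} for a parse result.

    Uses only the attributes shown in the Quick Start:
    result.pages, page.page_num and page.text_items.
    """
    return {page.page_num: len(page.text_items) for page in result.pages}


# Stand-in for a real result so the sketch runs without a document.
fake_result = SimpleNamespace(
    pages=[
        SimpleNamespace(page_num=1, text_items=["Title", "Intro"]),
        SimpleNamespace(page_num=2, text_items=[]),
    ]
)

# Pages with no text items may deserve a closer look (for example blank pages).
empty_pages = [n for n, count in page_summary(fake_result).items() if count == 0]
print(empty_pages)
```

With a real document, `page_summary(parser.parse("document.pdf"))` should work the same way, provided the result exposes those attributes as in the Quick Start.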

Configuration

All options are passed to the constructor:

parser = LiteParse(
    ocr_enabled=True,              # Enable OCR (default: True)
    ocr_language="eng",            # Tesseract language code
    ocr_server_url=None,           # HTTP OCR server URL (optional)
    tessdata_path=None,            # Path to tessdata directory (optional)
    max_pages=1000,                # Max pages to parse
    target_pages="1-5,10",         # Specific pages (optional)
    dpi=150,                       # Rendering DPI
    preserve_very_small_text=False, # Keep tiny text
    password=None,                 # Password for protected documents
    quiet=False,                   # Suppress progress output
    num_workers=4,                 # Concurrent OCR workers
)
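
To make the `target_pages` format concrete, the sketch below expands a string such as "1-5,10" into individual page numbers. It assumes that ranges are inclusive and that pages are numbered from 1; the source does not state either, and `expand_target_pages` is a hypothetical helper, not a liteparse function.

```python
def expand_target_pages(spec):
    """Expand a page spec such as "1-5,10" into a sorted list of page numbers.

    Assumptions (not confirmed by the liteparse docs): ranges are
    inclusive and page numbers are 1-based.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


print(expand_target_pages("1-5,10"))
```

Validating the string this way before building the parser can surface typos early, although the library may well accept forms this sketch does not handle.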

Parsing from Bytes

Pass raw PDF bytes directly — useful for web uploads or downloaded files:

with open("document.pdf", "rb") as f:
    result = parser.parse(f.read())
print(result.text)
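
For web uploads it can help to reject payloads that are clearly not PDFs before parsing them. The following is a sketch under stated assumptions: `parse_pdf_bytes` is a hypothetical wrapper, `parser` is any object with a `parse(bytes)` method (such as the `LiteParse` instance above), and the check only looks for the `%PDF-` marker near the start of the data.

```python
def parse_pdf_bytes(parser, data):
    """Check that data looks like a PDF, then hand it to parser.parse().

    `parser` is any object with a parse(bytes) method; this wrapper is
    illustrative and not part of liteparse.
    """
    if not isinstance(data, (bytes, bytearray)):
        raise TypeError("expected bytes")
    # PDF files carry a "%PDF-" header. Some files have a little leading
    # junk, so search the first 1024 bytes instead of byte 0 only.
    if b"%PDF-" not in bytes(data[:1024]):
        raise ValueError("payload does not contain a PDF header")
    return parser.parse(bytes(data))


class EchoParser:
    """Stand-in parser so the sketch runs without liteparse installed."""

    def parse(self, data):
        return len(data)


print(parse_pdf_bytes(EchoParser(), b"%PDF-1.7\n%%EOF\n"))
```

The check is deliberately shallow: it catches an HTML error page saved as a .pdf, not a corrupt PDF.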

Screenshots

Generate PNG screenshots of document pages:

screenshots = parser.screenshot("document.pdf", page_numbers=[1, 2, 3])
for s in screenshots:
    print(f"Page {s.page_num}: {s.width}x{s.height}")
    with open(f"page_{s.page_num}.png", "wb") as f:
        f.write(s.image_bytes)
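
The saving loop above can be factored into a reusable function. This sketch assumes only the `page_num` and `image_bytes` attributes shown above; `save_screenshots` is a hypothetical name, and the zero-padded file names are a choice made here so that files sort in page order.

```python
from pathlib import Path

# Every PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_screenshots(screenshots, out_dir):
    """Write each screenshot to out_dir/page_<nnn>.png and return the paths.

    Expects objects with page_num and image_bytes attributes, as in the
    example above. Raises ValueError if the data is not PNG.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for s in screenshots:
        if not s.image_bytes.startswith(PNG_SIGNATURE):
            raise ValueError(f"page {s.page_num}: data is not a PNG")
        path = out / f"page_{s.page_num:03d}.png"
        path.write_bytes(s.image_bytes)
        paths.append(path)
    return paths
```

A call such as `save_screenshots(parser.screenshot("document.pdf", page_numbers=[1, 2, 3]), "./screenshots")` would then replace the explicit loop.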

Supported Formats

  • PDF (.pdf)
  • Microsoft Office (.docx, .xlsx, .pptx, etc.) — requires LibreOffice
  • OpenDocument (.odt, .ods, .odp) — requires LibreOffice
  • Images (.png, .jpg, .tiff, etc.) — requires ImageMagick
  • And more!
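
Because some formats depend on external tools, a pre-flight check can give a clearer error than a failed parse. The sketch below maps the extensions listed above to the tool they require and looks for a matching executable on PATH. The executable names (`soffice`/`libreoffice`, `magick`/`convert`) are assumptions about typical installs, not something liteparse documents, and on Windows `convert` is also a system utility, so treat the result as a hint.

```python
import shutil
from pathlib import Path

# Extension -> external tool, taken from the list above.
REQUIRES = {
    **dict.fromkeys((".docx", ".xlsx", ".pptx", ".odt", ".ods", ".odp"), "LibreOffice"),
    **dict.fromkeys((".png", ".jpg", ".tiff"), "ImageMagick"),
}

# Assumed executable names for each tool (typical installs only).
EXECUTABLES = {
    "LibreOffice": ("soffice", "libreoffice"),
    "ImageMagick": ("magick", "convert"),
}


def required_tool(path):
    """Return the external tool a file needs, or None (for example for PDF)."""
    return REQUIRES.get(Path(path).suffix.lower())


def missing_tool(path, which=shutil.which):
    """Return the name of a required tool that is not on PATH, else None."""
    tool = required_tool(path)
    if tool is None:
        return None
    if any(which(exe) for exe in EXECUTABLES[tool]):
        return None
    return tool


print(required_tool("report.DOCX"))
```

The `which` parameter exists only so the lookup can be replaced in tests.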

CLI

The Python package includes the lit CLI:

lit parse document.pdf
lit parse document.pdf --format json -o output.json
lit screenshot document.pdf -o ./screenshots
lit batch-parse ./input ./output
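
The CLI can also be driven from a script. The sketch below builds the argument list for `lit parse` using only the flags shown above (`--format`, `-o`) and runs it with `subprocess`. The function names are hypothetical, and what the command writes to standard output is not specified here.

```python
import subprocess


def lit_parse_command(path, fmt=None, output=None):
    """Build the argv list for `lit parse`, using only the flags shown above."""
    cmd = ["lit", "parse", str(path)]
    if fmt is not None:
        cmd += ["--format", fmt]
    if output is not None:
        cmd += ["-o", str(output)]
    return cmd


def run_lit_parse(path, fmt=None, output=None):
    """Run `lit parse`; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(
        lit_parse_command(path, fmt, output),
        check=True,
        capture_output=True,
        text=True,
    )


print(lit_parse_command("document.pdf", fmt="json", output="output.json"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.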

Download files

Source Distribution

  • liteparse-2.0.3.tar.gz (114.9 kB): Source

Built Distributions

  • liteparse-2.0.3-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl (13.2 MB): PyPy, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.3-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl (13.0 MB): PyPy, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.3-cp315-cp315-manylinux_2_28_x86_64.whl (13.2 MB): CPython 3.15, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.3-cp315-cp315-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.15, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.3-cp314-cp314-win_amd64.whl (11.1 MB): CPython 3.14, Windows x86-64
  • liteparse-2.0.3-cp314-cp314-manylinux_2_28_x86_64.whl (13.2 MB): CPython 3.14, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.3-cp314-cp314-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.14, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.3-cp314-cp314-macosx_11_0_arm64.whl (11.0 MB): CPython 3.14, macOS 11.0+ ARM64
  • liteparse-2.0.3-cp313-cp313-win_amd64.whl (11.1 MB): CPython 3.13, Windows x86-64
  • liteparse-2.0.3-cp313-cp313-manylinux_2_28_x86_64.whl (13.1 MB): CPython 3.13, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.3-cp313-cp313-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.13, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.3-cp313-cp313-macosx_11_0_arm64.whl (11.0 MB): CPython 3.13, macOS 11.0+ ARM64
  • liteparse-2.0.3-cp312-cp312-win_amd64.whl (11.1 MB): CPython 3.12, Windows x86-64
  • liteparse-2.0.3-cp312-cp312-manylinux_2_28_x86_64.whl (13.1 MB): CPython 3.12, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.3-cp312-cp312-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.12, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.3-cp312-cp312-macosx_11_0_arm64.whl (11.0 MB): CPython 3.12, macOS 11.0+ ARM64
  • liteparse-2.0.3-cp311-cp311-win_amd64.whl (11.1 MB): CPython 3.11, Windows x86-64
  • liteparse-2.0.3-cp311-cp311-manylinux_2_28_x86_64.whl (13.2 MB): CPython 3.11, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.3-cp311-cp311-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.11, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.3-cp311-cp311-macosx_11_0_arm64.whl (11.0 MB): CPython 3.11, macOS 11.0+ ARM64
  • liteparse-2.0.3-cp310-cp310-win_amd64.whl (11.1 MB): CPython 3.10, Windows x86-64
  • liteparse-2.0.3-cp310-cp310-manylinux_2_28_x86_64.whl (13.2 MB): CPython 3.10, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.3-cp310-cp310-manylinux_2_28_aarch64.whl (13.0 MB): CPython 3.10, manylinux (glibc 2.28+), ARM64

