LiteParse Python

Python bindings for LiteParse — fast, lightweight PDF and document parsing with spatial text extraction.

Installation

pip install liteparse

This also installs the lit CLI command.

Quick Start

from liteparse import LiteParse

parser = LiteParse()
result = parser.parse("document.pdf")
print(result.text)

# Access structured data
for page in result.pages:
    print(f"Page {page.page_num}: {len(page.text_items)} text items")
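The per-page structure can also be used to spot pages that produced no text, which may be worth a second look (for example scanned pages parsed with OCR disabled). The sketch below relies only on the page_num and text_items attributes shown above; pages_without_text is a hypothetical helper, not part of LiteParse.

```python
def pages_without_text(result) -> list[int]:
    """Return the page numbers whose text_items collection is empty.

    Hypothetical helper: it only assumes that `result.pages` is iterable and
    that each page exposes `page_num` and `text_items`, as in the Quick Start.
    """
    return [page.page_num for page in result.pages if len(page.text_items) == 0]
```

Calling pages_without_text(result) on the result of parser.parse("document.pdf") gives a list that can be logged or fed into a retry with different settings.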

Configuration

All options are passed to the constructor:

parser = LiteParse(
    ocr_enabled=True,              # Enable OCR (default: True)
    ocr_language="eng",            # Tesseract language code
    ocr_server_url=None,           # HTTP OCR server URL (optional)
    tessdata_path=None,            # Path to tessdata directory (optional)
    max_pages=1000,                # Max pages to parse
    target_pages="1-5,10",         # Specific pages (optional)
    dpi=150,                       # Rendering DPI
    preserve_very_small_text=False, # Keep tiny text
    password=None,                 # Password for protected documents
    quiet=False,                   # Suppress progress output
    num_workers=4,                 # Concurrent OCR workers
)
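The target_pages value looks like a comma-separated list of page numbers and ranges. The sketch below expands such a string into explicit page numbers, which can help validate user input before building the parser. expand_page_spec is a hypothetical helper, not part of LiteParse, and it assumes that pages are 1-based and that ranges are inclusive; the source does not confirm either point.

```python
def expand_page_spec(spec: str) -> list[int]:
    """Expand a spec such as "1-5,10" into a sorted list of page numbers.

    Hypothetical helper. Assumes 1-based page numbers and inclusive ranges.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray or trailing commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    if any(page < 1 for page in pages):
        raise ValueError("page numbers must be >= 1")
    return sorted(pages)
```

A malformed spec such as "a-b" raises ValueError from int(), so the helper can double as a cheap input check.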

Parsing from Bytes

Pass raw PDF bytes directly — useful for web uploads or downloaded files:

with open("document.pdf", "rb") as f:
    result = parser.parse(f.read())
print(result.text)
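When the bytes come from an upload, it can be worth checking that they look like a PDF before handing them to the parser. The sketch below does that with the "%PDF-" header marker. parse_upload and looks_like_pdf are hypothetical helpers, and the parser argument is assumed to be a LiteParse instance (anything with a parse method that accepts bytes works).

```python
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the PDF header marker appears near the start.

    The marker normally sits at offset 0; searching the first 1024 bytes
    tolerates files that carry a little leading junk.
    """
    return PDF_MAGIC in data[:1024]


def parse_upload(parser, data: bytes):
    """Validate the payload, then pass the raw bytes to parser.parse().

    Hypothetical wrapper: `parser` is assumed to be a LiteParse instance.
    """
    if not looks_like_pdf(data):
        raise ValueError("payload does not look like a PDF")
    return parser.parse(data)
```

This check only applies to PDF input; it would reject other document types and so should be skipped if non-PDF bytes are expected.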

Screenshots

Generate PNG screenshots of document pages:

screenshots = parser.screenshot("document.pdf", page_numbers=[1, 2, 3])
for s in screenshots:
    print(f"Page {s.page_num}: {s.width}x{s.height}")
    with open(f"page_{s.page_num}.png", "wb") as f:
        f.write(s.image_bytes)
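For larger documents it may be more convenient to write all screenshots into one directory. The sketch below is one way to do that with pathlib; save_screenshots is a hypothetical helper that uses only the page_num and image_bytes attributes shown above, and it assumes page_num is an integer so that file names can be zero-padded.

```python
from pathlib import Path


def save_screenshots(screenshots, out_dir) -> list[Path]:
    """Write each screenshot to out_dir as page_NNNN.png and return the paths.

    Hypothetical helper: assumes each item has an integer `page_num` and
    PNG data in `image_bytes`.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create missing parent directories
    written = []
    for shot in screenshots:
        # Zero-padding keeps files in page order when sorted by name.
        target = out / f"page_{shot.page_num:04d}.png"
        target.write_bytes(shot.image_bytes)
        written.append(target)
    return written
```

It could be called as save_screenshots(parser.screenshot("document.pdf", page_numbers=[1, 2, 3]), "screenshots").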

Supported Formats

  • PDF (.pdf)
  • Microsoft Office (.docx, .xlsx, .pptx, etc.) — requires LibreOffice
  • OpenDocument (.odt, .ods, .odp) — requires LibreOffice
  • Images (.png, .jpg, .tiff, etc.) — requires ImageMagick
  • And more!

CLI

The lit CLI installed with the package can parse single documents, render screenshots, and process a directory in batch:

lit parse document.pdf
lit parse document.pdf --format json -o output.json
lit screenshot document.pdf -o ./screenshots
lit batch-parse ./input ./output
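The CLI can also be driven from a script. The sketch below builds the argument list for lit parse using only the flags shown above (--format and -o) and runs it with subprocess. build_parse_command and run_parse are hypothetical helpers, and running them requires the lit command to be on PATH.

```python
import subprocess


def build_parse_command(input_path, output_path=None, fmt=None) -> list[str]:
    """Build the argv list for `lit parse`, using only the flags shown above."""
    cmd = ["lit", "parse", str(input_path)]
    if fmt:
        cmd += ["--format", fmt]
    if output_path:
        cmd += ["-o", str(output_path)]
    return cmd


def run_parse(input_path, output_path=None, fmt=None) -> subprocess.CompletedProcess:
    """Run `lit parse` and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(
        build_parse_command(input_path, output_path, fmt),
        check=True,
        capture_output=True,
        text=True,
    )
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.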

