LiteParse Python

Python bindings for LiteParse — fast, lightweight PDF and document parsing with spatial text extraction.

Installation

pip install liteparse

This also installs the lit CLI command.

Quick Start

from liteparse import LiteParse

parser = LiteParse()
result = parser.parse("document.pdf")
print(result.text)

# Access structured data
for page in result.pages:
    print(f"Page {page.page_num}: {len(page.text_items)} text items")
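A small helper can turn the per-page data into a quick overview, for example to spot pages that produced no text items. This is only a sketch: it relies solely on the `pages`, `page_num` and `text_items` attributes shown above, and it runs against a stand-in object rather than a real parse result.

```python
from types import SimpleNamespace


def summarize_pages(result):
    """Return (page_num, text_item_count) pairs for a parse result."""
    return [(page.page_num, len(page.text_items)) for page in result.pages]


# Stand-in with the same attribute names as the Quick Start result,
# so the helper can be tried without a real document.
fake_result = SimpleNamespace(
    text="hello",
    pages=[
        SimpleNamespace(page_num=1, text_items=["hello"]),
        SimpleNamespace(page_num=2, text_items=[]),
    ],
)

# Pages with no text items may be worth a closer look (e.g. blank or scanned pages).
empty_pages = [num for num, count in summarize_pages(fake_result) if count == 0]
print(empty_pages)
```

With a real document, pass the object returned by `parser.parse(...)` instead of `fake_result`.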

Configuration

All options are passed to the constructor:

parser = LiteParse(
    ocr_enabled=True,              # Enable OCR (default: True)
    ocr_language="eng",            # Tesseract language code
    ocr_server_url=None,           # HTTP OCR server URL (optional)
    tessdata_path=None,            # Path to tessdata directory (optional)
    max_pages=1000,                # Max pages to parse
    target_pages="1-5,10",         # Specific pages (optional)
    dpi=150,                       # Rendering DPI
    preserve_very_small_text=False, # Keep tiny text
    password=None,                 # Password for protected documents
    quiet=False,                   # Suppress progress output
    num_workers=4,                 # Concurrent OCR workers
)
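The `target_pages` value above looks like a comma-separated list of page numbers and inclusive ranges. The exact grammar LiteParse accepts is not documented here, so the helper below is only an illustration of that reading; `expand_page_spec` is a hypothetical name and not part of the library.

```python
def expand_page_spec(spec: str) -> list[int]:
    """Expand a spec such as "1-5,10" into a sorted list of page numbers.

    Assumes comma-separated entries, each either a single page ("10")
    or an inclusive range ("1-5"). This mirrors the example value only;
    LiteParse's own parser may accept more or less than this.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


print(expand_page_spec("1-5,10"))
```

This can be handy for checking a user-supplied spec before constructing the parser.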

Parsing from Bytes

Pass raw PDF bytes directly — useful for web uploads or downloaded files:

with open("document.pdf", "rb") as f:
    result = parser.parse(f.read())
print(result.text)
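When the bytes come from an upload, it can help to reject obviously wrong payloads before parsing. The sketch below checks for the `%PDF-` marker that PDF files normally start with and then calls `parse()` on whatever parser object it is given. `parse_pdf_bytes` and `EchoParser` are hypothetical names used for illustration; only `parse(bytes)` comes from the example above.

```python
def parse_pdf_bytes(parser, data: bytes):
    """Parse raw bytes only if they start with the PDF header marker."""
    # Strict check: some PDF readers tolerate leading junk before the
    # header, so relax this if such files need to be accepted.
    if not data.startswith(b"%PDF-"):
        raise ValueError("payload does not look like a PDF")
    return parser.parse(data)


class EchoParser:
    """Stand-in for a LiteParse instance; reports the size of its input."""

    def parse(self, data):
        return len(data)


print(parse_pdf_bytes(EchoParser(), b"%PDF-1.7\n"))
```

In real use, pass a `LiteParse()` instance in place of `EchoParser()`.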

Screenshots

Generate PNG screenshots of document pages:

screenshots = parser.screenshot("document.pdf", page_numbers=[1, 2, 3])
for s in screenshots:
    print(f"Page {s.page_num}: {s.width}x{s.height}")
    with open(f"page_{s.page_num}.png", "wb") as f:
        f.write(s.image_bytes)
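For larger documents it may be tidier to write the images into a dedicated directory with zero-padded names so they sort correctly. The helper below is a sketch that uses only the `page_num` and `image_bytes` attributes shown above; `save_screenshots` is a hypothetical name, and the demo runs on stand-in objects rather than real screenshots.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_screenshots(screenshots, out_dir):
    """Write each screenshot to out_dir as page_NNNN.png and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for shot in screenshots:
        path = out / f"page_{shot.page_num:04d}.png"
        path.write_bytes(shot.image_bytes)
        written.append(path)
    return written


# Demo with stand-in objects (the bytes are placeholders, not valid PNG data).
demo_shots = [
    SimpleNamespace(page_num=1, image_bytes=b"a"),
    SimpleNamespace(page_num=12, image_bytes=b"bc"),
]
with tempfile.TemporaryDirectory() as tmp:
    demo_names = [p.name for p in save_screenshots(demo_shots, tmp)]
print(demo_names)
```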

Supported Formats

  • PDF (.pdf)
  • Microsoft Office (.docx, .xlsx, .pptx, etc.) — requires LibreOffice
  • OpenDocument (.odt, .ods, .odp) — requires LibreOffice
  • Images (.png, .jpg, .tiff, etc.) — requires ImageMagick
  • And more!
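Because some formats depend on an external tool, a caller may want to know up front which tool a given file needs. The lookup below is a sketch built only from the extensions and requirements listed above; it is not exhaustive, and `required_tool` is a hypothetical helper rather than part of LiteParse.

```python
from pathlib import Path

# Extension -> external tool, taken from the list above (not exhaustive).
REQUIRED_TOOL = {
    ".docx": "LibreOffice",
    ".xlsx": "LibreOffice",
    ".pptx": "LibreOffice",
    ".odt": "LibreOffice",
    ".ods": "LibreOffice",
    ".odp": "LibreOffice",
    ".png": "ImageMagick",
    ".jpg": "ImageMagick",
    ".tiff": "ImageMagick",
}


def required_tool(filename):
    """Return the external tool a file needs, or None if none is listed."""
    return REQUIRED_TOOL.get(Path(filename).suffix.lower())


for name in ("report.DOCX", "scan.tiff", "document.pdf"):
    print(name, "->", required_tool(name))
```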

CLI

The Python package includes the lit CLI:

lit parse document.pdf
lit parse document.pdf --format json -o output.json
lit screenshot document.pdf -o ./screenshots
lit batch-parse ./input ./output
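The CLI can also be driven from a script with the standard `subprocess` module. The sketch below builds the same `lit parse ... --format json -o ...` command shown above and runs it only if `lit` is on the PATH. The function names are illustrative, and nothing is assumed about the structure of the JSON that gets written.

```python
import shutil
import subprocess


def lit_parse_command(src, out_json):
    """Build the argument list for `lit parse <src> --format json -o <out_json>`."""
    return ["lit", "parse", str(src), "--format", "json", "-o", str(out_json)]


def run_lit_parse(src, out_json):
    """Run the command, failing early if the lit executable is missing."""
    if shutil.which("lit") is None:
        raise FileNotFoundError("lit CLI not found on PATH; is liteparse installed?")
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(lit_parse_command(src, out_json), check=True)


print(lit_parse_command("document.pdf", "output.json"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.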

Download files

Download the file for your platform.

Source Distribution

  • liteparse-2.0.0b1.tar.gz (107.2 kB): Source

Built Distributions

  • liteparse-2.0.0b1-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl (16.7 MB): PyPy, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b1-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl (16.5 MB): PyPy, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b1-cp315-cp315-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.15, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b1-cp315-cp315-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.15, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b1-cp314-cp314-win_amd64.whl (11.1 MB): CPython 3.14, Windows x86-64
  • liteparse-2.0.0b1-cp314-cp314-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.14, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b1-cp314-cp314-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.14, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b1-cp314-cp314-macosx_11_0_arm64.whl (11.0 MB): CPython 3.14, macOS 11.0+ ARM64
  • liteparse-2.0.0b1-cp313-cp313-win_amd64.whl (11.1 MB): CPython 3.13, Windows x86-64
  • liteparse-2.0.0b1-cp313-cp313-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.13, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b1-cp313-cp313-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.13, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b1-cp313-cp313-macosx_11_0_arm64.whl (11.0 MB): CPython 3.13, macOS 11.0+ ARM64
  • liteparse-2.0.0b1-cp312-cp312-win_amd64.whl (11.1 MB): CPython 3.12, Windows x86-64
  • liteparse-2.0.0b1-cp312-cp312-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.12, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b1-cp312-cp312-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.12, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b1-cp312-cp312-macosx_11_0_arm64.whl (11.0 MB): CPython 3.12, macOS 11.0+ ARM64
  • liteparse-2.0.0b1-cp311-cp311-win_amd64.whl (11.1 MB): CPython 3.11, Windows x86-64
  • liteparse-2.0.0b1-cp311-cp311-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.11, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b1-cp311-cp311-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.11, manylinux (glibc 2.28+), ARM64
  • liteparse-2.0.0b1-cp311-cp311-macosx_11_0_arm64.whl (11.0 MB): CPython 3.11, macOS 11.0+ ARM64
  • liteparse-2.0.0b1-cp310-cp310-win_amd64.whl (11.1 MB): CPython 3.10, Windows x86-64
  • liteparse-2.0.0b1-cp310-cp310-manylinux_2_28_x86_64.whl (16.7 MB): CPython 3.10, manylinux (glibc 2.28+), x86-64
  • liteparse-2.0.0b1-cp310-cp310-manylinux_2_28_aarch64.whl (16.5 MB): CPython 3.10, manylinux (glibc 2.28+), ARM64
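The wheel names above follow the standard wheel naming scheme: name, version, an optional build tag, then Python tag, ABI tag and platform tag, separated by hyphens. The following sketch splits a file name into those parts, which can help when picking a file by hand. `parse_wheel_filename` is an illustrative helper; in practice `pip` makes this selection automatically.

```python
def parse_wheel_filename(filename: str) -> dict[str, str]:
    """Split a wheel file name into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts without a build tag, six with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


print(parse_wheel_filename("liteparse-2.0.0b1-cp312-cp312-manylinux_2_28_x86_64.whl"))
```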

File details

Details for the file liteparse-2.0.0b1.tar.gz.

File metadata

  • Download URL: liteparse-2.0.0b1.tar.gz
  • Upload date:
  • Size: 107.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for liteparse-2.0.0b1.tar.gz
Algorithm Hash digest
SHA256 0502780df8b259b57682e7c296b4d840f039d1ad80ff39fa757c50d37a49ca72
MD5 db2db747f346dda197293681c27eb730
BLAKE2b-256 f785df49c9d2ebe5df0e9f650b4f44a0d934b487cbb88213dd1f291e83b26ae5

See more details on using hashes here.
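A downloaded file can be checked against its published SHA256 digest with the standard `hashlib` module. The sketch below assumes the source distribution has been saved locally under its original name; the helper names are illustrative.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 listed for liteparse-2.0.0b1.tar.gz.
SDIST_SHA256 = "0502780df8b259b57682e7c296b4d840f039d1ad80ff39fa757c50d37a49ca72"

# After downloading the file next to this script:
# print(matches_sha256("liteparse-2.0.0b1.tar.gz", SDIST_SHA256))
```

The same helper works for any of the wheel files by swapping in the file name and its listed SHA256 value.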

File details

Details for the file liteparse-2.0.0b1-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.0b1-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 a85d10e7bfc237a6216614c9c466dc0bb34f3e9835e83814895e87eb343166fd
MD5 42ddcbffabf51d14a50e3f18cbe6cfbc
BLAKE2b-256 adcf83963ef046904ed31384b9ecdc3ce42319d7827b7c15bf8bda787dff4805

File details

Details for the file liteparse-2.0.0b1-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.0b1-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 0cc9cfc78beb7ad4ccbe794d5492936bed0228e92cfd9a7701b3de9d3c08ee9e
MD5 996ec0f5dcec7645eedfd3662af82baa
BLAKE2b-256 775303112ead0cdf6d4baeb8adc384de595b8d8faad6aa7e4a3559d408e706b8

File details

Details for the file liteparse-2.0.0b1-cp315-cp315-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp315-cp315-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 93337dc1b659ff99a5e72ec1677c9bf6e6a7d8a642f6007eff41eea512988914
MD5 4febce2d174e5765ca5089eee10665e0
BLAKE2b-256 f14c62223481857f015587b16ed94e03a466473818415b2c0acae0e3d0080303

File details

Details for the file liteparse-2.0.0b1-cp315-cp315-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp315-cp315-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 031d014b2a022c18770c459f9c5470202a54e72fa1e5e5420045fc1467d71547
MD5 3f89c6fec45ce1ec4c9749447ec234d6
BLAKE2b-256 a71051e4d8f7873beb048a687364a0660808c2956dda92aa0cecda55e5649b4b

File details

Details for the file liteparse-2.0.0b1-cp314-cp314-win_amd64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp314-cp314-win_amd64.whl
Algorithm Hash digest
SHA256 4e639fa00cda937c16fc65a4973f4861b625f0c0c8e63aae5b5e6f5a94be66fc
MD5 c0753717104089b23960640abea1dfc1
BLAKE2b-256 48340f187476b327d7f0702d9f340e53de5b8a584af4184cc90910a163a4b707

File details

Details for the file liteparse-2.0.0b1-cp314-cp314-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp314-cp314-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 ccc30dd4fc056b5cf63461ddab8a2abd84bdfd62efcf7bb29548d4a03ead1067
MD5 7a292434fad5060451f77ed5b30d07e7
BLAKE2b-256 77a38ceb6562661e08444bc737bdbe427ac5b4a4beb70a8e8f579b400da2fcb1

File details

Details for the file liteparse-2.0.0b1-cp314-cp314-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp314-cp314-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 ef6065153ef2af1e9e7ff2ec7c4ddec1ef9ee91b4cb10707368b3642442f115a
MD5 f09cbbc72b3edef5779e129de6a9fb88
BLAKE2b-256 99a1c9ecd1b2a54ede8acdfa8dcc43bd7f8238c7a77f70168d3bbc2414452755

File details

Details for the file liteparse-2.0.0b1-cp314-cp314-macosx_11_0_arm64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp314-cp314-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 45a0d468a14e4e8e8c968bf9038326c9d0f57c72a82aad1f7bc2d8e0d24a4198
MD5 1728e2051a97b8ef3c76b562bec1109c
BLAKE2b-256 372eeab8638d6f6ddc1c29f5325a85ccdf159f69adee0420a83353602be9046b

File details

Details for the file liteparse-2.0.0b1-cp313-cp313-win_amd64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 e42c88311fe34a5da23dd402e4e5ee44c8717824f4bf3b0cb935115d3693cf67
MD5 6864000f07f3e54618bdb89f8c4b04ed
BLAKE2b-256 e26c40c003eda86e652923c217c570539cbd6d52a3f1dc808d91a039e242dd20

File details

Details for the file liteparse-2.0.0b1-cp313-cp313-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 694616978e23b5d39e512d72fa56cd847d8efeb874af11f2dd68d0f395328868
MD5 1b82194401b1b1f2e7a25d7a3cbb70b2
BLAKE2b-256 22142131c5cc36e3fa79b826e80d14fc32597e6e230355dcddc9c42753963953

File details

Details for the file liteparse-2.0.0b1-cp313-cp313-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp313-cp313-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 1a3190227d87114227e1577092c1d4bb40fd1a6a9047a7c10f118c21bfa50043
MD5 e53e5a9b3deb07304783108018ce87f3
BLAKE2b-256 e079328632962990267a10554131d487b41f7c23205400ab815637152c43990a

File details

Details for the file liteparse-2.0.0b1-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 ef8bde5f215e5cad730063ca19594734c298f0a00b4e0d3737e9949d6655238a
MD5 397fb00849c13d94bd3e35db2ee2834e
BLAKE2b-256 49c9f9d93d913f227d1c71cd88a65dbafbaad47c478b7e1ef4e7fbd9c7cd5112

File details

Details for the file liteparse-2.0.0b1-cp312-cp312-win_amd64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 5e85968f5c0cc3a7371a1daf328329ba1f6e5d55a9950f1f52a912d7e985d54f
MD5 347bb1171d5468a1a87f8f4d143171a2
BLAKE2b-256 16aa7561079470c431a9cfda0ce694b3c7b12ea51e94ef806d040b782dcfe96f

File details

Details for the file liteparse-2.0.0b1-cp312-cp312-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 44084bea900c7532006f51eaff260d9e6f07354eee98ef227bcf5e4381f72c3d
MD5 d1d737cd293b6b4f4ec1d69875cadeaa
BLAKE2b-256 dd1cc6e4d76ad2d042adb7a6307d04feabc571b443ebbbf1d639e2c328a998e6

File details

Details for the file liteparse-2.0.0b1-cp312-cp312-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp312-cp312-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 58e7fdfbf5cbdc3698e7655e88ad6fb605fb3498d9e163b01cee650cdc67ca11
MD5 3cac546cd2e9e79dfa582e3e55559d57
BLAKE2b-256 69613c8b432e031810f7d445c81c755829fd81b9f09e2af7f785ac2c6793fa92

File details

Details for the file liteparse-2.0.0b1-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 8ea4eea538ebaba439a72981b23543acf0166ac477039fb8607d2325defefb05
MD5 96251642ee7c927201a0afea89405724
BLAKE2b-256 2d29effa1b471366a28e2a73c814758e47f98717cbd270536569eb249ebdccd3

File details

Details for the file liteparse-2.0.0b1-cp311-cp311-win_amd64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 c64878178da68846ec2cb0e4872f2d975b700173ea1466e203e55d3be051e6b6
MD5 093b0feef161aa23176f960f068ba66b
BLAKE2b-256 7de9c1606aec0b01738f57942acd2bdd1d8303855b50d686fb2ebcafd98d8e7e

File details

Details for the file liteparse-2.0.0b1-cp311-cp311-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 4cbc5094c12bc86e2f239a5c2c61db2f57d5f40e37190d80f5988b37d5e9d93a
MD5 f163563293e463ef257fa950fcfdd59a
BLAKE2b-256 72b68af08073d41d0998c344391635f13316a51c397ad03cf4382ed72f8392b1

File details

Details for the file liteparse-2.0.0b1-cp311-cp311-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp311-cp311-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 abcc3d90f6beabdd7ccdec40985f88e7210395a240fb862e82420f7c063544c7
MD5 fd719a3a7eb9bafcfbb3fe382c015a54
BLAKE2b-256 da0f1f63baeb7894657217d78c980844dc0e1ea20ebf0ec7c0823b885b79fe6c

File details

Details for the file liteparse-2.0.0b1-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 982e45697ee158ad5c33d00a46b848546cebb0f36b4e4c2e02a0e331854612ba
MD5 bf388eaef849322be102a807ba37c96c
BLAKE2b-256 664191f170f6c6dcc1d210b36bcb5d6e8afd7e95c58ec4941de45c3ac5101700

File details

Details for the file liteparse-2.0.0b1-cp310-cp310-win_amd64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 8a5bc9a1b74a7b1037e3bb69725a6551e1ddf289dd7ad1c96605f0a956ab859a
MD5 f3c1ede91df2624e22222d64a91a7e9c
BLAKE2b-256 e35b5d22884dfd9bf6cebc8e71aa131f5067a011202c2c307134c827703599e2

File details

Details for the file liteparse-2.0.0b1-cp310-cp310-manylinux_2_28_x86_64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 7f7f4801e8f12e738c625fa9202f09f2cc6b47e84a11765b16758c02d24e537d
MD5 ad7ced1b7a5e142d81836d32ac225e3a
BLAKE2b-256 548350e9adeb6ccc3d23a9ab194fe9afc8b92660fb31a6cf17e297ce02e8e0cf

File details

Details for the file liteparse-2.0.0b1-cp310-cp310-manylinux_2_28_aarch64.whl.

File hashes

Hashes for liteparse-2.0.0b1-cp310-cp310-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 f61a7bc0041f21e02e70d21be80780336fa4e781d878ff2590e7da61e09b0131
MD5 77a12589e9315899404be1328bac9e63
BLAKE2b-256 c0de682d02ac5056020cef412bd6744ccc5d8fade7e8d5d67f7d0acdfbdd6ba1
