Parse data from documents into a form optimised for downstream LLM tasks.

Project description

LLM Parse

LLM Parse is a Python library designed for parsing and extracting data from files, specifically optimized for downstream tasks involving large language models (LLMs).

It builds on several popular document-parsing libraries and adds further text processing, so that the extracted data is better suited to downstream LLM tasks such as RAG, summarization, and drafting.

Getting started

Install the package:

pip install llm-parse

Examples

Parse a PDF to Markdown.

from llm_parse.pdf_2_md_parser import PDF2MDParser

# Parse example.pdf and get its content as Markdown.
parser = PDF2MDParser()
text = parser.load_data("example.pdf")

Parse a PDF to text.

from llm_parse.pdf_2_text_parser import PDF2TextParser

# Parse example.pdf and get its content as plain text.
parser = PDF2TextParser()
text = parser.load_data("example.pdf")
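
For a downstream task such as RAG, the parsed Markdown usually has to be cut into chunks before indexing. The following is a minimal sketch of one way to do that, not part of LLM Parse itself. It assumes the Markdown is available as a single string (the return type of load_data is not stated here, so convert the result to a string first if it is something else), and split_markdown_by_heading is a hypothetical helper name. A hard-coded sample string stands in for real parser output.

```python
import re


def split_markdown_by_heading(markdown_text: str) -> list[str]:
    """Split Markdown into chunks, each starting at an ATX heading line.

    Hypothetical helper for illustration; not part of llm_parse.
    """
    # Split at the start of every line that begins with 1-6 '#' and a space.
    # The lookahead keeps the heading text inside the chunk that follows it.
    parts = re.split(r"(?m)^(?=#{1,6} )", markdown_text)
    # Drop empty pieces and trim surrounding whitespace.
    return [part.strip() for part in parts if part.strip()]


# Stand-in for parser output (assumption: a plain Markdown string).
sample = (
    "# Title\n\nIntro paragraph.\n\n"
    "## Section A\n\nBody of A.\n\n"
    "## Section B\n\nBody of B.\n"
)

chunks = split_markdown_by_heading(sample)
for chunk in chunks:
    print(repr(chunk))
```

Splitting on headings keeps each section's title together with its body, which tends to give retrieval more context than fixed-size character windows. Text that appears before the first heading becomes its own chunk.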

Download files

Source Distribution

llm_parse-0.1.5.tar.gz (7.3 kB view details)

Uploaded Source

Built Distribution

llm_parse-0.1.5-py3-none-any.whl (9.2 kB view details)

Uploaded Python 3

File details

Details for the file llm_parse-0.1.5.tar.gz.

File metadata

  • Download URL: llm_parse-0.1.5.tar.gz
  • Upload date:
  • Size: 7.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.13.4 Darwin/24.5.0

File hashes

Hashes for llm_parse-0.1.5.tar.gz
Algorithm Hash digest
SHA256 6ca4d8c4702ae7d39f168d985437b2d1ef5bb977325d92bc4803548ede6a9a26
MD5 6bc75116ff5542cd62b9ca3342bfdf60
BLAKE2b-256 33a5ee5ad5bbee5c9ca8066d25dfd6c946c5d47ae75e6f52c5d6d2ca718f2527
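
A published digest like the SHA256 value above can be checked against a downloaded file before installing it. The sketch below uses only the Python standard library. The function names sha256_of_file and matches_published_hash are illustrative, and the local file path is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for llm_parse-0.1.5.tar.gz.
EXPECTED_SHA256 = "6ca4d8c4702ae7d39f168d985437b2d1ef5bb977325d92bc4803548ede6a9a26"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size blocks until read() returns an empty bytes object.
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Assumed location of the downloaded archive; adjust as needed.
archive = Path("llm_parse-0.1.5.tar.gz")
if archive.exists():
    print(matches_published_hash(str(archive), EXPECTED_SHA256))
```

Reading in blocks keeps memory use flat regardless of file size. The same pattern works for the wheel by swapping in its file name and SHA256 digest.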

File details

Details for the file llm_parse-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: llm_parse-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 9.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.13.4 Darwin/24.5.0

File hashes

Hashes for llm_parse-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 89e80cf0b75f459144e80d216b82f778681e556b3224fa76ea54715f1c6b25fa
MD5 f2b2b80413396c9e3e6202c8df509b2b
BLAKE2b-256 86fc3a07ff0dbc5f1ef6463d2fa95dfe7ef8d7d354b22b98b0e16a3059dc459f
