Parse data from documents, optimised for downstream LLM tasks.
LLM Parse
LLM Parse is a Python library designed for parsing and extracting data from files, specifically optimized for downstream tasks involving large language models (LLMs).
It is built on several popular document parsing libraries and adds further text processing, so that the data is represented in a form better suited to downstream LLM tasks such as RAG, summarization, and drafting.
Getting started
Install the package:
pip install llm-parse
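To confirm the install worked, one option is to look up the installed version from Python. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name, not part of LLM Parse, and it assumes the distribution name matches the one in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "llm-parse" is the distribution name from the pip command above.
    print(installed_version("llm-parse"))
```

A result of `None` means the package is not visible to the interpreter that ran the script, which usually points to a different virtual environment.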
Examples
Parse a PDF to Markdown.
from llm_parse.pdf_2_md_parser import PDF2MDParser
parser = PDF2MDParser()
text = parser.load_data("example.pdf")
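For RAG, Markdown output is often split into sections before indexing. The following is a minimal sketch of heading-based chunking, assuming the parser result is (or has been joined into) a single Markdown string that uses `#`-style headings. `split_by_heading` is a hypothetical helper and not part of LLM Parse; it also does not special-case `#` lines inside fenced code.

```python
import re
from typing import List, Tuple

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_heading(markdown: str) -> List[Tuple[str, str]]:
    """Split Markdown into (heading, body) chunks at ATX headings."""
    chunks: List[Tuple[str, str]] = []
    heading = ""  # text before the first heading gets an empty title
    body: List[str] = []

    def flush() -> None:
        # Skip a leading chunk that has neither a title nor any content.
        if heading or any(line.strip() for line in body):
            chunks.append((heading, "\n".join(body).strip()))

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            heading = match.group(2).strip()
            body = []
        else:
            body.append(line)
    flush()
    return chunks
```

Each `(heading, body)` pair can then be embedded or summarized on its own, with the heading kept as lightweight context for the chunk.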
Parse a PDF to text.
from llm_parse.pdf_2_text_parser import PDF2TextParser
parser = PDF2TextParser()
text = parser.load_data("example.pdf")
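Both parsers above are used through the same `load_data` call, so a small loop can process several files. This is a sketch under that assumption only: `parse_many` is a hypothetical helper, and because the return type of `load_data` is not documented here, the results are stored unchanged.

```python
from pathlib import Path
from typing import Any, Dict, Iterable, Union


def parse_many(parser: Any, paths: Iterable[Union[str, Path]]) -> Dict[str, Any]:
    """Call parser.load_data on each path and collect results keyed by path."""
    results: Dict[str, Any] = {}
    for path in paths:
        # load_data is the only method the examples above rely on;
        # whatever it returns is kept as-is.
        results[str(path)] = parser.load_data(str(path))
    return results
```

For example, `parse_many(PDF2TextParser(), Path("docs").glob("*.pdf"))` would parse every PDF in a `docs` directory, where `docs` is an illustrative folder name.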
Download files
File details
Details for the file llm_parse-0.1.5.tar.gz.
File metadata
- Download URL: llm_parse-0.1.5.tar.gz
- Upload date:
- Size: 7.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.3 CPython/3.13.4 Darwin/24.5.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6ca4d8c4702ae7d39f168d985437b2d1ef5bb977325d92bc4803548ede6a9a26 |
| MD5 | 6bc75116ff5542cd62b9ca3342bfdf60 |
| BLAKE2b-256 | 33a5ee5ad5bbee5c9ca8066d25dfd6c946c5d47ae75e6f52c5d6d2ca718f2527 |
File details
Details for the file llm_parse-0.1.5-py3-none-any.whl.
File metadata
- Download URL: llm_parse-0.1.5-py3-none-any.whl
- Upload date:
- Size: 9.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.3 CPython/3.13.4 Darwin/24.5.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 89e80cf0b75f459144e80d216b82f778681e556b3224fa76ea54715f1c6b25fa |
| MD5 | f2b2b80413396c9e3e6202c8df509b2b |
| BLAKE2b-256 | 86fc3a07ff0dbc5f1ef6463d2fa95dfe7ef8d7d354b22b98b0e16a3059dc459f |
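The SHA256 digests listed above can be used to check a downloaded file before installing it. Below is a minimal sketch using only the standard library; `sha256_of` and `matches_sha256` are hypothetical helper names, and the local file path is whatever location the file was downloaded to.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size blocks until read() returns an empty bytes object.
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the call would be `matches_sha256("llm_parse-0.1.5.tar.gz", "6ca4d8c4702ae7d39f168d985437b2d1ef5bb977325d92bc4803548ede6a9a26")`, assuming the file sits in the current directory.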