Use an LLM to parse PDFs and produce better chunks for retrieval
Project description
LLMDocParser
A package for parsing PDFs and analyzing their content using LLMs.
Installation
pip install llmdocparser
Usage
from llmdocparser import get_image_content

content = get_image_content(
    llm_type="azure",
    pdf_path="path/to/your/pdf",
    output_dir="path/to/output/directory",
    max_concurrency=5,
    azure_deployment="azure-gpt-4o",
    azure_endpoint="your_azure_endpoint",
    api_key="your_api_key"
)
print(content)
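The usage example passes the endpoint and API key as string literals. A minimal sketch of one alternative is to read them from environment variables so they stay out of source control. The helper names `build_azure_kwargs` and `parse_pdf` and the environment variable names are assumptions made for illustration, not part of the package; only the keyword arguments are taken from the usage example.

```python
import os


def build_azure_kwargs(pdf_path, output_dir, env=None, max_concurrency=5):
    """Collect the keyword arguments shown in the usage example.

    Hypothetical helper: the environment variable names below are
    assumptions, not something llmdocparser defines or reads itself.
    """
    env = os.environ if env is None else env
    required = (
        "AZURE_OPENAI_DEPLOYMENT",
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_API_KEY",
    )
    # Fail early with a clear message instead of sending an empty credential.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "llm_type": "azure",
        "pdf_path": pdf_path,
        "output_dir": output_dir,
        "max_concurrency": max_concurrency,
        "azure_deployment": env["AZURE_OPENAI_DEPLOYMENT"],
        "azure_endpoint": env["AZURE_OPENAI_ENDPOINT"],
        "api_key": env["AZURE_OPENAI_API_KEY"],
    }


def parse_pdf(pdf_path, output_dir):
    """Call get_image_content with settings taken from the environment."""
    # Imported lazily so build_azure_kwargs works without the package installed.
    from llmdocparser import get_image_content

    return get_image_content(**build_azure_kwargs(pdf_path, output_dir))
```

With the three variables exported in the shell, `parse_pdf("path/to/your/pdf", "path/to/output/directory")` would make the same call as the usage example.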
Download files
Source Distribution
llmdocparser-0.1.0.tar.gz (1.1 MB)
Built Distribution
llmdocparser-0.1.0-py3-none-any.whl (1.1 MB)
File details
Details for the file llmdocparser-0.1.0.tar.gz.
File metadata
- Download URL: llmdocparser-0.1.0.tar.gz
- Upload date:
- Size: 1.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.11.5 Darwin/23.5.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9232eb657afb8c60eb14e75de70f1816e8557e60171ef738c98aad064cbfb636
MD5 | 2b99c5cc6e2b54415cbab942bf4b421f
BLAKE2b-256 | 0e0ceda6a070cf817b744305e61202dd85e8ccfe2fea3daacee73941c154de7a
File details
Details for the file llmdocparser-0.1.0-py3-none-any.whl.
File metadata
- Download URL: llmdocparser-0.1.0-py3-none-any.whl
- Upload date:
- Size: 1.1 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.11.5 Darwin/23.5.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | d8d3265b3e6a9b2f530653a3248e9cfcb29d5f5a94dee7f0ffc5ff65fc68ecf5
MD5 | a3681afbfb210e8ddc364395b283eb55
BLAKE2b-256 | 0e7a01b6380a8de6501f885df69987baa8ba4f50f08e2e857ad5a3a5652a7a55
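The SHA256 digests listed above can be used to check a manually downloaded file before installing it. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of` and `matches_published_digest` are hypothetical, and the local file path in the trailing comment is an assumption about where the download was saved.

```python
import hashlib

# SHA256 digests copied from the file hash tables above.
SDIST_SHA256 = "9232eb657afb8c60eb14e75de70f1816e8557e60171ef738c98aad064cbfb636"
WHEEL_SHA256 = "d8d3265b3e6a9b2f530653a3248e9cfcb29d5f5a94dee7f0ffc5ff65fc68ecf5"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example (assumes the sdist was saved to the current directory):
# matches_published_digest("llmdocparser-0.1.0.tar.gz", SDIST_SHA256)
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for a 1.1 MB archive but keeps the helper reusable for larger downloads.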