Using an LLM to parse PDFs and produce better chunks for retrieval

Project description

LLMDocParser

A package for parsing PDFs and analyzing their content using LLMs.

Installation

pip install llmdocparser

Usage

from llmdocparser import get_image_content

content = get_image_content(
    llm_type="azure",
    pdf_path="path/to/your/pdf",
    output_dir="path/to/output/directory",
    max_concurrency=5,
    azure_deployment="azure-gpt-4o",
    azure_endpoint="your_azure_endpoint",
    api_key="your_api_key"
)
print(content)
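
The usage example above passes the Azure credentials as string literals. One way to keep them out of source code is to read them from environment variables first; the following is a minimal sketch under that assumption. The helper name `azure_kwargs_from_env` and the environment variable names are hypothetical and are not part of llmdocparser; only the keyword names (`llm_type`, `azure_deployment`, `azure_endpoint`, `api_key`) come from the example above.

```python
import os


def azure_kwargs_from_env(env=None):
    """Build the Azure-related keyword arguments from environment variables.

    The variable names below are illustrative assumptions, not names
    defined by llmdocparser.
    """
    if env is None:
        env = os.environ
    # keyword argument from the usage example -> hypothetical variable name
    required = {
        "azure_deployment": "AZURE_OPENAI_DEPLOYMENT",
        "azure_endpoint": "AZURE_OPENAI_ENDPOINT",
        "api_key": "AZURE_OPENAI_API_KEY",
    }
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    kwargs = {name: env[var] for name, var in required.items()}
    kwargs["llm_type"] = "azure"
    return kwargs


# Intended use, mirroring the call shown above:
#   from llmdocparser import get_image_content
#   content = get_image_content(
#       pdf_path="path/to/your/pdf",
#       output_dir="path/to/output/directory",
#       max_concurrency=5,
#       **azure_kwargs_from_env(),
#   )
```

Failing early with a `KeyError` that lists the missing variables makes a misconfigured environment easier to diagnose than an authentication error from the remote endpoint.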

Download files

Download the file for your platform.

Source Distribution

llmdocparser-0.1.0.tar.gz (1.1 MB)

Uploaded Source

Built Distribution

llmdocparser-0.1.0-py3-none-any.whl (1.1 MB)

Uploaded Python 3

File details

Details for the file llmdocparser-0.1.0.tar.gz.

File metadata

  • Download URL: llmdocparser-0.1.0.tar.gz
  • Upload date:
  • Size: 1.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.11.5 Darwin/23.5.0

File hashes

Hashes for llmdocparser-0.1.0.tar.gz
Algorithm     Hash digest
SHA256        9232eb657afb8c60eb14e75de70f1816e8557e60171ef738c98aad064cbfb636
MD5           2b99c5cc6e2b54415cbab942bf4b421f
BLAKE2b-256   0e0ceda6a070cf817b744305e61202dd85e8ccfe2fea3daacee73941c154de7a
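
A downloaded archive can be checked against the SHA256 digest listed above before installing it. The following is a minimal sketch using only the standard library; the helper names and the local file location are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for llmdocparser-0.1.0.tar.gz
EXPECTED_SHA256 = "9232eb657afb8c60eb14e75de70f1816e8557e60171ef738c98aad064cbfb636"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.lower()


if __name__ == "__main__":
    # Assumed location: the archive was downloaded into the current directory.
    archive = Path("llmdocparser-0.1.0.tar.gz")
    if archive.exists():
        print("OK" if matches_expected(archive, EXPECTED_SHA256) else "MISMATCH")
    else:
        print("archive not found:", archive)
```

The same check applies to the wheel by substituting its file name and the SHA256 digest listed for it.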

File details

Details for the file llmdocparser-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: llmdocparser-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 1.1 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.11.5 Darwin/23.5.0

File hashes

Hashes for llmdocparser-0.1.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        d8d3265b3e6a9b2f530653a3248e9cfcb29d5f5a94dee7f0ffc5ff65fc68ecf5
MD5           a3681afbfb210e8ddc364395b283eb55
BLAKE2b-256   0e7a01b6380a8de6501f885df69987baa8ba4f50f08e2e857ad5a3a5652a7a55