A simple configuration manager with Pydantic and JSON export.

Project description

OCR & LLM Parser

A powerful Python package for parsing and processing documents using multiple providers:

  • Mistral OCR — Extracts text from PDFs and images with high accuracy.
  • LangChain — Processes or summarizes text using LLMs.
  • Llama Parser — Advanced parsing with Markdown or text output.
  • HuggingFace — OCR and document question answering with transformer models.

The package provides a unified interface so you can switch between providers easily using a factory pattern.
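To illustrate what a factory with a unified interface can look like, here is a minimal sketch. It is not HowdenParser's actual implementation: `BaseParser`, `SketchFactory`, `PlainTextParser`, and `HeadingParser` are hypothetical names, and the two toy parsers only read local text files instead of calling a real OCR or LLM provider.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Callable, Dict, Type


class BaseParser(ABC):
    """Hypothetical common interface that every provider wrapper implements."""

    def __init__(self, result_type: str = "md") -> None:
        self.result_type = result_type

    @abstractmethod
    def parse(self, path: Path) -> str:
        """Return the extracted text for the file at `path`."""


class SketchFactory:
    """Illustrative registry that maps provider names to parser classes."""

    _registry: Dict[str, Type[BaseParser]] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[Type[BaseParser]], Type[BaseParser]]:
        def decorator(parser_cls: Type[BaseParser]) -> Type[BaseParser]:
            cls._registry[name] = parser_cls
            return parser_cls

        return decorator

    @classmethod
    def get_parser(cls, name: str, **kwargs) -> BaseParser:
        try:
            parser_cls = cls._registry[name]
        except KeyError:
            known = ", ".join(sorted(cls._registry)) or "none"
            raise ValueError(f"Unknown provider {name!r}; registered: {known}") from None
        return parser_cls(**kwargs)


@SketchFactory.register("plain")
class PlainTextParser(BaseParser):
    def parse(self, path: Path) -> str:
        return path.read_text(encoding="utf-8")


@SketchFactory.register("heading")
class HeadingParser(BaseParser):
    def parse(self, path: Path) -> str:
        text = path.read_text(encoding="utf-8")
        # Prefix a Markdown heading only when Markdown output is requested.
        return f"# {path.name}\n\n{text}" if self.result_type == "md" else text
```

With this shape, calling code only changes the name passed to `get_parser` when it switches providers, because every parser exposes the same `parse(path)` method.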


🚀 Features

  • Extract text from PDFs or images
  • Summarize or process text using LLMs
  • Support for Markdown or plain text output
  • Plug-and-play factory for switching providers with minimal code changes
  • Loads API keys from environment variables automatically

🔑 Tokens

Create a .env file in your project root and add the API keys for the services you want to use.

Mistral OCR

MISTRAL-OCR-API-TOKEN=your_mistral_api_key

Llama Parser

LLAMA-PARSER-API-TOKEN=your_llama_parser_api_key

HuggingFace

HF-API-TOKEN=your_huggingface_api_key

Only include the keys for the providers you plan to use.
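If you want to check which providers have a token configured before creating a parser, a small helper along these lines may be useful. It is a sketch, not part of the package: `TOKEN_VARS` and `configured_providers` are hypothetical names, the variable names are copied from the entries above, and the code assumes the .env values have already been loaded into the process environment.

```python
import os
from typing import Dict, List, Mapping, Optional

# Variable names copied from the Tokens section; the provider labels are illustrative.
TOKEN_VARS: Dict[str, str] = {
    "Mistral OCR": "MISTRAL-OCR-API-TOKEN",
    "Llama Parser": "LLAMA-PARSER-API-TOKEN",
    "HuggingFace": "HF-API-TOKEN",
}


def configured_providers(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the providers whose token variable is set to a non-blank value."""
    env = os.environ if env is None else env
    return [name for name, var in TOKEN_VARS.items() if env.get(var, "").strip()]
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise in tests, for example `configured_providers({"HF-API-TOKEN": "abc"})`.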


🛠️ Usage

from HowdenParser import ParserFactory

from pathlib import Path

parser = ParserFactory.get_parser("mistralocr:", result_type="md")

text = parser.parse(Path("document.pdf"))

print(text)

If you are using the HowdenConfig package:

parser = ParserFactory.get_parser("mistralocr:", **config.parameter.dump_model())

text = parser.parse(Path("document.pdf"))
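The same call pattern extends to a whole folder. The helper below is a sketch under two assumptions: the parser object exposes `parse(path)` as in the usage above, and that call returns a string. `SupportsParse` and `parse_folder` are hypothetical names that are not provided by the package.

```python
from pathlib import Path
from typing import Dict, Protocol


class SupportsParse(Protocol):
    """Anything with a parse(path) method that returns text (assumed shape)."""

    def parse(self, path: Path) -> str: ...


def parse_folder(parser: SupportsParse, folder: Path, pattern: str = "*.pdf") -> Dict[str, str]:
    """Parse every file in `folder` matching `pattern`, keyed by file name."""
    results: Dict[str, str] = {}
    # Sort the paths so the output order is stable across runs and platforms.
    for path in sorted(folder.glob(pattern)):
        results[path.name] = parser.parse(path)
    return results
```

A parser obtained from `ParserFactory.get_parser(...)` could be passed in as `parser`, provided it behaves as assumed here.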


Download files

Source Distribution

howdenparser-0.1.8.tar.gz (3.5 kB view details)

Uploaded Source

Built Distributions

howdenparser-0.1.8-py3-none-any.whl (5.2 kB view details)

Uploaded Python 3

howdenparser-0.1.8-py2.py3-none-any.whl (5.2 kB view details)

Uploaded Python 2, Python 3

File details

Details for the file howdenparser-0.1.8.tar.gz.

File metadata

  • Download URL: howdenparser-0.1.8.tar.gz
  • Upload date:
  • Size: 3.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.2 CPython/3.13.3 Windows/11

File hashes

Hashes for howdenparser-0.1.8.tar.gz
Algorithm Hash digest
SHA256 28f77ff07d506b28b624a25b8c8559a25d0087fae858ca023f98cf001d5e466d
MD5 07c8a88da046772879b4858cef7c5598
BLAKE2b-256 d2282af7e995d7f5804597dc426cb78f8f6209c89e76a5e92a16492c87c6b2aa

See more details on using hashes here.
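One way to use the SHA256 digest listed above is to compare it with the digest of the file you downloaded. The sketch below relies only on the standard library. `sha256_of` and `matches_sha256` are hypothetical helper names, and the expected value is copied from the hash listing for howdenparser-0.1.8.tar.gz.

```python
import hashlib
from pathlib import Path

# SHA256 digest copied from the listing for howdenparser-0.1.8.tar.gz.
EXPECTED_SDIST_SHA256 = "28f77ff07d506b28b624a25b8c8559a25d0087fae858ca023f98cf001d5e466d"


def sha256_of(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_sha256(Path("howdenparser-0.1.8.tar.gz"), EXPECTED_SDIST_SHA256)` should return `True` for an unmodified download of that file.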

File details

Details for the file howdenparser-0.1.8-py3-none-any.whl.

File metadata

  • Download URL: howdenparser-0.1.8-py3-none-any.whl
  • Upload date:
  • Size: 5.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.2 CPython/3.13.3 Windows/11

File hashes

Hashes for howdenparser-0.1.8-py3-none-any.whl
Algorithm Hash digest
SHA256 74dda074a4ef631fd809ea20e882e113ba0e2b451fe75229420e2c9f3c164a16
MD5 8b9dfa9e697db34487f6bab11f6d1239
BLAKE2b-256 f19ca7fb3f9e3fdc396f6df1339fd76f470e659fbea46fbf13d4d0be1d708e9b

File details

Details for the file howdenparser-0.1.8-py2.py3-none-any.whl.

File metadata

  • Download URL: howdenparser-0.1.8-py2.py3-none-any.whl
  • Upload date:
  • Size: 5.2 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.2 CPython/3.13.3 Windows/11

File hashes

Hashes for howdenparser-0.1.8-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 b9796dc82e34371fa1941c407b51c35e7014188249b1d7a9019a24085c3a6dd6
MD5 b6ade8123170f7f3613c591e60c89068
BLAKE2b-256 56dc09a891428610a3a6b3564d016b002c376d92263f7a26943219f1ce948865
