
Automatically parse PDFs and text into dataclasses


piah



Piah automatically parses data from PDFs or text based only on the dataclass you provide, and returns an instance of that dataclass filled with the extracted values. Piah is based on OxyParser.


Installation

pip install piah

Example

from dataclasses import dataclass

from piah import Piah


# The dataclass describes the fields to extract.
@dataclass
class Person:
    name: str
    age: int


# Pass the name of the LLM model to use.
parser = Piah("gpt-3.5-turbo")
result = parser.parse("Hello Iam python and I have 33 years old", Person)

To parse PDFs:

from pathlib import Path

# Pass the PDF location as a string...
result = parser.parse("example.pdf", Person)
# ...or as a Path object
result = parser.parse(Path("example.pdf"), Person)
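
Both calls above rely on the dataclass alone to describe the expected output. The sketch below shows, using only the standard library, how much information a dataclass exposes for that purpose. It is not piah's actual implementation, and `describe_fields` is a hypothetical helper introduced for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def describe_fields(cls):
    """Map each field name of a dataclass to the name of its annotated type.

    Hypothetical helper, not part of piah.
    """
    return {f.name: getattr(f.type, "__name__", str(f.type)) for f in fields(cls)}


# Field names and types are available without any extra schema definition.
print(describe_fields(Person))
```

A description like this is the kind of input a parser could hand to an LLM so that it knows which values to look for and what type each one should have.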

TODO

  • Write docstrings
  • Improve allowed types
  • Improve system prompt

Known Issues

piah's tests do not pass on every run, because the LLM does not always parse large PDFs correctly.
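
One possible way to work around results that vary between runs is to validate each result and retry on failure. The following is a minimal sketch under that assumption, not something piah provides: `is_valid` and `parse_with_retry` are hypothetical helpers, and `parse` stands for any callable that takes a source and a dataclass.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def is_valid(instance, cls):
    """Check that instance is a cls and its values match simple annotated types."""
    if not isinstance(instance, cls):
        return False
    return all(
        isinstance(getattr(instance, f.name), f.type)
        for f in fields(cls)
        if type(f.type) is type  # only check ordinary classes such as str or int
    )


def parse_with_retry(parse, source, cls, attempts=3):
    """Call parse(source, cls) until the result validates or attempts run out."""
    last_error = None
    for _ in range(attempts):
        try:
            result = parse(source, cls)
        except Exception as exc:  # the underlying call may also fail outright
            last_error = exc
            continue
        if is_valid(result, cls):
            return result
    raise ValueError(f"no valid {cls.__name__} after {attempts} attempts") from last_error
```

With the parser from the example, this could be called as `parse_with_retry(parser.parse, "example.pdf", Person)`. Note that each retry is another LLM request, so a small `attempts` value keeps cost bounded.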

License

piah is distributed under the terms of the MIT license.
