
Automatically parse PDFs and text into dataclasses


piah


Piah automatically parses data from PDFs or text based only on the dataclass you provide, and returns an instance of that dataclass filled in with the extracted values. Piah is built on OxyParser.

Installation

pip install piah

Example

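from dataclasses import dataclass

from piah import Piah

# The dataclass describes the fields to extract.
@dataclass
class Person:
    name: str
    age: int

# Piah is constructed with the name of the LLM to use.
parser = Piah("gpt-3.5-turbo")
result = parser.parse("Hello, I am Python and I am 33 years old", Person)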

To parse PDFs, pass the file path as a string or as a Path:

from pathlib import Path

result = parser.parse("example.pdf", Person)
# or
result = parser.parse(Path("example.pdf"), Person)
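The idea of driving extraction from the dataclass alone can be illustrated without any LLM call. The sketch below is not piah's implementation: `describe_fields` and `fill_dataclass` are hypothetical helpers that show how the standard `dataclasses.fields` function exposes the names and types a parser would need, and how raw extracted values could be coerced back into the dataclass. It assumes every field type is a simple callable such as `str` or `int`.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def describe_fields(model):
    """Map each dataclass field name to the name of its annotated type."""
    return {f.name: getattr(f.type, "__name__", str(f.type)) for f in fields(model)}


def fill_dataclass(model, raw):
    """Build `model` from a dict of raw values, coercing each to the field type.

    Assumption: each field type can be called on the raw value (str, int, float).
    """
    kwargs = {f.name: f.type(raw[f.name]) for f in fields(model)}
    return model(**kwargs)


# The schema is what a prompt could be built from.
schema = describe_fields(Person)  # {'name': 'str', 'age': 'int'}

# Raw values, as an extraction step might return them, become a typed instance.
person = fill_dataclass(Person, {"name": "python", "age": "33"})
print(person)  # Person(name='python', age=33)
```

Nested dataclasses, optional fields, and container types would need more handling than this sketch provides.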

TODO

  • Write docstrings
  • Improve allowed types
  • Improve system prompt

Known Issues

piah does not pass its tests on every run, because the LLM does not always parse large PDFs correctly.
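One possible way to work around intermittent failures is to validate the result and retry. The following is a minimal sketch, not part of piah: `is_valid` and `parse_with_retry` are hypothetical helpers, and the only assumption about piah is the `parse(source, model)` call shape shown in the example above. The type check only handles fields annotated with plain classes such as `str` or `int`.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def is_valid(result, model):
    """True if `result` is a `model` instance whose fields match their annotated types."""
    if not isinstance(result, model):
        return False
    return all(isinstance(getattr(result, f.name), f.type) for f in fields(model))


def parse_with_retry(parse, source, model, attempts=3):
    """Call `parse(source, model)` until the result validates or attempts run out."""
    last_error = None
    for _ in range(attempts):
        try:
            result = parse(source, model)
        except Exception as exc:  # the LLM call or output decoding may fail
            last_error = exc
            continue
        if is_valid(result, model):
            return result
    raise ValueError(
        f"no valid {model.__name__} after {attempts} attempts"
    ) from last_error
```

With a parser created as in the example, this could be called as `parse_with_retry(parser.parse, "example.pdf", Person)`. Retrying costs additional LLM calls, so a small `attempts` value is a sensible default.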

License

piah is distributed under the terms of the MIT license.
