Automatically parse PDFs and text into dataclasses
piah
Piah automatically parses data from PDFs or text, based only on the dataclass you provide, and returns that dataclass filled in with the extracted values. Piah is based on OxyParser.
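To make "based only on the dataclass" concrete, here is a minimal sketch of how a dataclass can be introspected into a field-to-type description that a parser could hand to an LLM. This is an illustration using the standard library only; `describe_fields` is a hypothetical helper and is not taken from Piah's or OxyParser's code.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def describe_fields(cls) -> dict[str, str]:
    """Map each dataclass field name to a readable type name (hypothetical helper)."""
    return {f.name: getattr(f.type, "__name__", str(f.type)) for f in fields(cls)}


schema = describe_fields(Person)
print(schema)  # {'name': 'str', 'age': 'int'}
```

A description like this is all the information the dataclass itself carries, which is why field names and types should be chosen to be self-explanatory.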
Installation
pip install piah
Example
from dataclasses import dataclass

from piah import Piah

# Describe the data to extract as a plain dataclass
@dataclass
class Person:
    name: str
    age: int

parser = Piah("gpt-3.5-turbo")
# Parse free text into a Person instance
result = parser.parse("Hello Iam python and I have 33 years old", Person)
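The last step of such a parse, turning extracted values into a typed dataclass instance, can be sketched as follows. This is an assumption about the general technique, not Piah's actual implementation: `from_raw` is a hypothetical helper, and the raw dictionary stands in for whatever the LLM returns.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def from_raw(cls, raw: dict):
    """Build a dataclass instance, casting each raw value to its annotated type."""
    values = {}
    for f in fields(cls):
        value = raw[f.name]
        # Only cast when the annotation is a plain class such as str or int.
        values[f.name] = f.type(value) if isinstance(f.type, type) else value
    return cls(**values)


# Hypothetical raw values, as they might come back from a model as strings.
person = from_raw(Person, {"name": "python", "age": "33"})
print(person)  # Person(name='python', age=33)
```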
To parse PDFs:
from pathlib import Path

result = parser.parse("example.pdf", Person)
# or
result = parser.parse(Path("example.pdf"), Person)
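Because `parse` accepts free text, a path string, or a `Path`, the parser has to decide how to read its input. The rule Piah actually applies is not documented here; the sketch below shows one plausible rule, with `looks_like_pdf` as a hypothetical helper.

```python
from pathlib import Path
from typing import Union


def looks_like_pdf(source: Union[str, Path]) -> bool:
    """Return True when the source should be read as a PDF file rather than raw text."""
    if isinstance(source, Path):
        return source.suffix.lower() == ".pdf"
    # Assumed rule for strings: anything ending in .pdf is treated as a path.
    return source.strip().lower().endswith(".pdf")
```

If the real rule is similar, passing a `Path` is the less ambiguous choice, since a plain string could be either text or a file name.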
TODO
- Write docstrings
- Improve allowed types
- Improve system prompt
Known Issues
piah does not pass its tests every time, because the LLM does not always
parse large PDFs correctly.
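Until this is resolved, one possible workaround is to retry a parse that fails or returns the wrong type. The wrapper below is a sketch, not part of piah: `parse_with_retry` is a hypothetical helper that takes any callable with the shape `parse(source, cls)`.

```python
from typing import Any, Callable, Optional


def parse_with_retry(
    parse: Callable[[Any, type], Any],
    source: Any,
    cls: type,
    attempts: int = 3,
) -> Any:
    """Call parse(source, cls) until it returns an instance of cls or attempts run out."""
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            result = parse(source, cls)
        except Exception as exc:  # an LLM-backed call can fail in many ways
            last_error = exc
            continue
        if isinstance(result, cls):
            return result
    raise RuntimeError(f"no valid {cls.__name__} after {attempts} attempts") from last_error
```

With the objects from the example above, this would be called as `parse_with_retry(parser.parse, "example.pdf", Person)`. Each retry is another LLM call, so `attempts` should stay small.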
License
piah is distributed under the terms of the MIT license.
File details
Details for the file piah-0.1.0.tar.gz.
File metadata
- Download URL: piah-0.1.0.tar.gz
- Upload date:
- Size: 53.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: python-httpx/0.27.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 654079328d7727a601d697d6fbd1d6c24425d329907963bd93785321cb6544bf |
| MD5 | 719723e7e80afda8b31a4e8621cd893c |
| BLAKE2b-256 | 0cce29927781b94c931bdb3344d8fa8f6aa168093d961291f7563e241347af9d |
File details
Details for the file piah-0.1.0-py3-none-any.whl.
File metadata
- Download URL: piah-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: python-httpx/0.27.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1a45dd609b880e68f0998a5b380076e33d77a230b9150e8927ce263ddea8951c |
| MD5 | 87c7fe96e658e73e03dac6fd847e8327 |
| BLAKE2b-256 | c2b5bbbb52755b3913505981c118c4d2330f9742e46e5466a22139686256556c |
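The SHA256 digests listed above can be checked against a downloaded file with the standard library. This is a minimal sketch; `sha256_of` and `matches` are hypothetical helper names, and the file is assumed to be already present on disk.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published one, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches(Path("piah-0.1.0.tar.gz"), "<SHA256 from the table>")` should return `True` for an intact download of the source distribution.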