Extract receipt, invoice, and order data from your Gmail account.
Receiptor Package
Overview
Receiptor is a Python package that extracts receipt, invoice, and order data from a user's Gmail account. It gives developers a simple interface for fetching and structuring email data, including attachments. It can also use LLMs (large language models) to structure the extracted data as JSON.
Features
- Extract receipt/invoice/order data from Gmail
- Parse email attachments
- Structure extracted data using LLMs
Installation
To install the Receiptor package, use pip:
pip install receiptor
Usage
1. Import required modules
from receiptor import Receiptor
from llm_parser.gpt_4o_mini_parser.gpt_4o_mini import DocumentStructureExtractor
from dotenv import load_dotenv
2. Load environment variables (if needed)
load_dotenv()
Set up your OpenAI API keys in one of two ways. Either create a .env file and set the keys as follows:
OPENAI_API_KEY="your api key"
ORG_ID="org_id"  # Optional
Or pass the API keys directly to the function:
# `data` is one item yielded by obj.fetch_receipt_data (see the fetch step below)
structured_data = DocumentStructureExtractor.structure_document_data(
    raw_text=data.attachments[0].attachment_raw_text,
    openai_api_key="",
    org_id="",
)
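The two configuration routes above can be combined in a small helper. The following is a minimal sketch and not part of Receiptor: `resolve_openai_credentials` is a hypothetical name, and the only things it assumes are the `OPENAI_API_KEY` and `ORG_ID` variable names shown above. An explicitly passed value takes precedence over the environment.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_openai_credentials(
    api_key: Optional[str] = None,
    org_id: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, Optional[str]]:
    """Return (api_key, org_id), preferring explicit arguments over the environment."""
    # Hypothetical helper: falls back to os.environ when no mapping is supplied.
    env = os.environ if env is None else env
    key = api_key or env.get("OPENAI_API_KEY")
    if not key:
        raise ValueError("No OpenAI API key: pass one explicitly or set OPENAI_API_KEY")
    # ORG_ID is optional, so a missing or empty value is returned as None.
    return key, (org_id or env.get("ORG_ID") or None)
```

The returned pair could then be passed as the `openai_api_key` and `org_id` arguments of `structure_document_data`.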
3. Initialize the Receiptor object
obj = Receiptor()
4. Set up Gmail access token
Obtain a Gmail access token through the OAuth2 flow. Store this token securely.
access_token = "Your_Gmail_access_token_here"
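One common way to run the OAuth2 flow from a desktop script is Google's `google-auth-oauthlib` package. The sketch below reflects assumptions about your setup rather than anything Receiptor prescribes: it assumes `pip install google-auth-oauthlib`, an OAuth client file saved as `credentials.json` (a hypothetical file name), and that the read-only Gmail scope is sufficient. `get_gmail_access_token` is an illustrative name.

```python
# Assumed scope: read-only Gmail access. This README does not state which
# scope Receiptor actually requires, so adjust it if fetching fails.
GMAIL_SCOPES = ["https://www.googleapis.com/auth/gmail.readonly"]


def get_gmail_access_token(client_secrets_path: str = "credentials.json") -> str:
    """Run the installed-app OAuth2 flow in a browser and return the access token."""
    # Imported lazily so this module still loads when the package is not installed.
    from google_auth_oauthlib.flow import InstalledAppFlow

    flow = InstalledAppFlow.from_client_secrets_file(client_secrets_path, GMAIL_SCOPES)
    credentials = flow.run_local_server(port=0)  # opens a browser for user consent
    return credentials.token
```

Access tokens obtained this way are short-lived, so a long-running job would need to refresh them or rerun the flow.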
5. Fetch and process receipt data
for data in obj.fetch_receipt_data(access_token=access_token):
    print(data)
    if data.attachments:
        # Print the raw text of the first attachment
        print(data.attachments[0].attachment_raw_text)
        # Structure the attachment text using DocumentStructureExtractor
        structured_data = DocumentStructureExtractor.structure_document_data(
            raw_text=data.attachments[0].attachment_raw_text
        )
        print(structured_data)
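The loop above only looks at the first attachment and prints the result. If you want to keep the results, a variant along the following lines may help. It is a sketch under stated assumptions: `collect_structured` is a hypothetical helper, it processes every attachment rather than only the first, and it assumes each yielded item exposes `message_id` and `attachments` as attributes. Stand-in objects are used here so that the example runs without Gmail or OpenAI access.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterable, List


def collect_structured(records: Iterable[Any], extract: Callable[..., Any]) -> List[dict]:
    """Run `extract` on every non-empty attachment text and gather the results."""
    results = []
    for record in records:
        for index, attachment in enumerate(getattr(record, "attachments", None) or []):
            text = getattr(attachment, "attachment_raw_text", None)
            if not text:
                continue  # skip attachments with no extracted text
            results.append({
                "message_id": getattr(record, "message_id", None),
                "attachment_index": index,
                "structured": extract(raw_text=text),
            })
    return results


# Stand-in record that mimics the attributes used in the loop above.
fake_record = SimpleNamespace(
    message_id="1dsse2342dfs3",
    attachments=[
        SimpleNamespace(attachment_raw_text="Zomato Food Order"),
        SimpleNamespace(attachment_raw_text=""),
    ],
)
# A trivial extractor stands in for DocumentStructureExtractor.structure_document_data.
demo = collect_structured([fake_record], extract=lambda raw_text: {"length": len(raw_text)})
```

With the real package, the same helper could presumably be called as `collect_structured(obj.fetch_receipt_data(access_token=access_token), DocumentStructureExtractor.structure_document_data)`.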
Example Output
Main Data
{
"message_id": "1dsse2342dfs3",
"body": "body text",
"company": "zomato.com",
"attachments": [
"<models.attachment.Attachment object at 0x1040d45c0>",
"<models.attachment.Attachment object at 0x10407b440>",
"<models.attachment.Attachment object at 0x103f90980>"
],
"attachment_extension": "pdf"
}
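In the output above, the attachments appear as object representations rather than as their contents. If you need a fully JSON-serializable record, a recursive conversion such as the following sketch may work. It assumes that the record and its `Attachment` objects are plain Python objects whose fields live in `__dict__`, which this README does not confirm; `to_jsonable` is a hypothetical helper and the record below is a stand-in.

```python
import json
from types import SimpleNamespace


def to_jsonable(value):
    """Recursively convert objects with a __dict__ into plain dicts for json.dumps."""
    if isinstance(value, (str, int, float, bool)) or value is None:
        return value
    if isinstance(value, dict):
        return {str(k): to_jsonable(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    if hasattr(value, "__dict__"):
        # Skip private attributes; keep everything else.
        return {k: to_jsonable(v) for k, v in vars(value).items() if not k.startswith("_")}
    return str(value)  # fallback for other types, e.g. bytes or datetimes


# Stand-in record shaped like the example output above.
record = SimpleNamespace(
    message_id="1dsse2342dfs3",
    company="zomato.com",
    attachments=[SimpleNamespace(attachment_raw_text="Zomato Food Order: Summary and Receipt")],
)
as_json = json.dumps(to_jsonable(record), indent=2)
```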
Attachment Raw Text
Zomato Food Order: Summary and Receipt
Structured Document Data
{
"brand": "brand name",
"total_cost": "189",
"location": "New york",
"purchase_category": "Food",
"brand_category": "Culinary Services",
"Date": "01-01-2024",
"currency": "INR",
"filename": "filename",
"payment_method": null,
"metadata": null
}
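Several fields in the structured output are strings or null, so downstream code will usually want typed values. The sketch below shows one way to normalize them. It assumes the result is available as a dict (or a JSON string that can be parsed into one) and that `Date` uses the DD-MM-YYYY format; neither is stated in this README. `normalize` is a hypothetical helper.

```python
import json
from datetime import datetime
from decimal import Decimal

# A shortened copy of the example output above.
raw = (
    '{"brand": "brand name", "total_cost": "189", "Date": "01-01-2024", '
    '"currency": "INR", "payment_method": null}'
)


def normalize(structured: dict) -> dict:
    """Convert string fields into typed values; leave missing fields as None."""
    cost = structured.get("total_cost")
    day = structured.get("Date")
    return {
        "brand": structured.get("brand"),
        # Decimal avoids float rounding for monetary amounts.
        "total_cost": Decimal(cost) if cost is not None else None,
        # Assumes DD-MM-YYYY; the example date reads the same under MM-DD-YYYY.
        "date": datetime.strptime(day, "%d-%m-%Y").date() if day else None,
        "currency": structured.get("currency"),
        "payment_method": structured.get("payment_method"),
    }


normalized = normalize(json.loads(raw))
```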
Contributing
We welcome contributions to the Receiptor package. Please feel free to submit issues, feature requests, or pull requests on our GitHub repository.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Support
Thank you for using Receiptor! We hope this package simplifies your receipt and invoice data extraction process.