Extract receipt, invoice, and order data from your Gmail account.

Project description

Receiptor Package

Overview

Receiptor is a Python package that extracts receipt, invoice, and order data from a user's Gmail account. It gives developers a simple interface for fetching and structuring email data, including attachments. It can also use LLMs (large language models) to structure the extracted text as JSON.

Features

  • Extract receipt/invoice/order data from Gmail
  • Parse email attachments
  • Structure extracted data using LLMs

Installation

To install the Receiptor package, use pip:

pip install receiptor

Usage

1. Import required modules

from receiptor import Receiptor
from llm_parser.gpt_4o_mini_parser.gpt_4o_mini import DocumentStructureExtractor
from dotenv import load_dotenv

2. Load environment variables (if needed)

load_dotenv()

You can provide the OpenAI API keys in either of two ways.

Create a .env file and set the keys as follows:

OPENAI_API_KEY="your api key"
ORG_ID="org_id" # Optional

Alternatively, pass the API keys directly to the function:

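# `data` is one item yielded by obj.fetch_receipt_data(...) in the fetch step below
structured_data = DocumentStructureExtractor.structure_document_data(
    raw_text=data.attachments[0].attachment_raw_text,
    openai_api_key="your api key",
    org_id="org_id",  # optional
)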
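
The two options above can be combined. The following is a minimal sketch, assuming load_dotenv() has already run: it reads the keys from the environment and builds the keyword arguments to pass explicitly. The helper name resolve_openai_credentials is hypothetical and not part of Receiptor; only the OPENAI_API_KEY and ORG_ID variable names and the openai_api_key and org_id parameters come from the text above.

```python
import os
from typing import Mapping, Optional


def resolve_openai_credentials(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for structure_document_data from environment values.

    Hypothetical helper for illustration; it is not part of Receiptor.
    """
    env = os.environ if env is None else env

    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        # Fail early with a readable message instead of a late API error.
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or pass it directly")

    kwargs = {"openai_api_key": api_key}

    org_id = env.get("ORG_ID", "").strip()
    if org_id:  # ORG_ID is optional, so only include it when present
        kwargs["org_id"] = org_id
    return kwargs
```

The result could then be unpacked into the call, for example structure_document_data(raw_text=text, **resolve_openai_credentials()).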

3. Initialize the Receiptor object

obj = Receiptor()

4. Set up Gmail access token

Obtain a Gmail access token through the OAuth2 flow. Store this token securely.

access_token = "Your_Gmail_access_token_here"
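
One possible way to obtain the token is the installed-app OAuth2 flow from the google-auth-oauthlib package. This is a sketch under assumptions, not Receiptor's documented method: the function name obtain_access_token and the credentials.json path are hypothetical, and the read-only Gmail scope is a guess, since the text above does not say which scope Receiptor requires.

```python
from typing import List

# Assumption: read-only Gmail access is sufficient; the scope Receiptor
# actually needs is not stated in its documentation.
GMAIL_SCOPES: List[str] = ["https://www.googleapis.com/auth/gmail.readonly"]


def obtain_access_token(client_secrets_path: str = "credentials.json") -> str:
    """Run the installed-app OAuth2 flow and return a Gmail access token.

    Hypothetical helper; requires `pip install google-auth-oauthlib` and an
    OAuth client secrets file downloaded from the Google Cloud Console.
    """
    # Imported lazily so the module loads even without the optional dependency.
    from google_auth_oauthlib.flow import InstalledAppFlow

    flow = InstalledAppFlow.from_client_secrets_file(
        client_secrets_path, scopes=GMAIL_SCOPES
    )
    # Opens a browser window for user consent and waits for the redirect.
    credentials = flow.run_local_server(port=0)
    return credentials.token
```

With this in place, access_token = obtain_access_token() would replace the hard-coded string. Access tokens expire, so a long-running job would also need to handle refresh.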

5. Fetch and process receipt data

for data in obj.fetch_receipt_data(access_token=access_token):
    print(data)
    if data.attachments:
        # Print the raw text of the first attachment
        print(data.attachments[0].attachment_raw_text)
        
        # Structure the attachment text using DocumentStructureExtractor
        structured_data = DocumentStructureExtractor.structure_document_data(
            raw_text=data.attachments[0].attachment_raw_text
        )
        print(structured_data)
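
The loop above structures only the first attachment of each email. As a hedged sketch, the helper below walks every attachment and records failures instead of stopping at the first one. The name structure_all_attachments is hypothetical, and it assumes each record exposes message_id and attachments attributes (as the example output suggests). The structuring function is passed in as a parameter, so DocumentStructureExtractor.structure_document_data could be wrapped in a small function and supplied there.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterable, List


def structure_all_attachments(
    records: Iterable[Any], structure_fn: Callable[[str], Any]
) -> List[dict]:
    """Apply structure_fn to the raw text of every attachment of every record.

    Hypothetical helper; attribute names are assumed from the example output.
    """
    results = []
    for record in records:
        for index, attachment in enumerate(record.attachments or []):
            raw_text = attachment.attachment_raw_text
            if not raw_text:
                continue  # nothing to structure for empty attachments
            try:
                structured, error = structure_fn(raw_text), None
            except Exception as exc:  # keep going if one attachment fails
                structured, error = None, str(exc)
            results.append(
                {
                    "message_id": getattr(record, "message_id", None),
                    "attachment_index": index,
                    "structured": structured,
                    "error": error,
                }
            )
    return results


# Stand-in data for illustration only; real records would come from
# obj.fetch_receipt_data(access_token=access_token).
fake_records = [
    SimpleNamespace(
        message_id="m1",
        attachments=[SimpleNamespace(attachment_raw_text="Zomato Food Order")],
    ),
    SimpleNamespace(message_id="m2", attachments=[]),
]
demo = structure_all_attachments(fake_records, lambda text: {"length": len(text)})
```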

Example Output

Main Data

{
"message_id": "1dsse2342dfs3",
"body": "body text",
"company": "zomato.com",
"attachments": [
"<models.attachment.Attachment object at 0x1040d45c0>",
"<models.attachment.Attachment object at 0x10407b440>",
"<models.attachment.Attachment object at 0x103f90980>"
],
"attachment_extension": "pdf"
}
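
In the output above, the attachments appear as object representations rather than readable values. A minimal sketch of one way to get a JSON-serializable view is shown below. It assumes the keys in the example output correspond to attributes on the record; record_to_dict is a hypothetical helper, not part of Receiptor.

```python
import json
from types import SimpleNamespace
from typing import Any


def record_to_dict(record: Any) -> dict:
    """Convert a fetched record into plain, JSON-serializable data.

    Hypothetical helper; attribute names are assumed from the example output.
    """
    attachments = getattr(record, "attachments", None) or []
    return {
        "message_id": getattr(record, "message_id", None),
        "body": getattr(record, "body", None),
        "company": getattr(record, "company", None),
        "attachment_extension": getattr(record, "attachment_extension", None),
        # Replace each attachment object with its extracted raw text.
        "attachments": [getattr(a, "attachment_raw_text", None) for a in attachments],
    }


# Stand-in record for illustration only.
fake_record = SimpleNamespace(
    message_id="1dsse2342dfs3",
    body="body text",
    company="zomato.com",
    attachment_extension="pdf",
    attachments=[SimpleNamespace(attachment_raw_text="Zomato Food Order: Summary and Receipt")],
)
as_json = json.dumps(record_to_dict(fake_record), indent=2)
```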

Attachment Raw Text

Zomato Food Order: Summary and Receipt

Structured Document Data

{
"brand": "brand name",
"total_cost": "189",
"location": "New york",
"purchase_category": "Food",
"brand_category": "Culinary Services",
"Date": "01-01-2024",
"currency": "INR",
"filename": "filename",
"payment_method": null,
"metadata": null
}
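
The structured output stores total_cost and Date as strings, so downstream code will usually want typed values. The sketch below shows one way to convert them. The function names are hypothetical, and the DD-MM-YYYY date format is an assumption: the sample value "01-01-2024" does not reveal whether the day or the month comes first.

```python
from datetime import date, datetime
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_total_cost(value) -> Optional[Decimal]:
    """Convert a cost such as "189" or "1,189.50" to Decimal, or None if invalid."""
    if value is None:
        return None
    try:
        return Decimal(str(value).replace(",", "").strip())
    except InvalidOperation:
        return None


def parse_receipt_date(value, fmt: str = "%d-%m-%Y") -> Optional[date]:
    """Parse the "Date" field; the default format is an assumption (DD-MM-YYYY)."""
    if not value:
        return None
    try:
        return datetime.strptime(value, fmt).date()
    except ValueError:
        return None


# Values copied from the example output above.
sample = {"total_cost": "189", "Date": "01-01-2024", "currency": "INR"}
normalized = {
    "total_cost": parse_total_cost(sample["total_cost"]),
    "date": parse_receipt_date(sample["Date"]),  # note the capitalized "Date" key
    "currency": sample["currency"],
}
```

Using Decimal rather than float avoids rounding surprises when totals are summed across receipts.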

Contributing

We welcome contributions to the Receiptor package. Please feel free to submit issues, feature requests, or pull requests on our GitHub repository.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Support

Thank you for using Receiptor! We hope this package simplifies your receipt and invoice data extraction process.
