txn-harvester extracts and structures financial transaction data from unstructured text, parsing amounts, dates, and categories for automation.

txn-harvester

Extract and structure financial transaction data from unstructured text

txn_harvester is a Python package that parses raw text (e.g., bank statements, transaction logs) into structured transaction records and validates the result. It uses LLM7 (via langchain_llm7) by default and accepts any LangChain-compatible chat model in its place.


🚀 Features

  • Extracts transaction details (amount, date, description, category) from unstructured text
  • Validates output against predefined financial patterns using regex
  • Supports custom LLMs (OpenAI, Anthropic, Google, etc.) via LangChain
  • Lightweight and easy to integrate into financial workflows
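
As a rough illustration of the regex validation step, the sketch below checks that a record carries a dollar amount and an ISO date. The patterns and the `is_valid_transaction` helper are assumptions made for this example and are not taken from the package; its own patterns may be stricter or looser.

```python
import re

# Hypothetical patterns for illustration; the package's own patterns may differ.
AMOUNT_RE = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?")  # e.g. $125.50 or $1,500.00
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")               # e.g. 2024-05-15


def is_valid_transaction(record: dict) -> bool:
    """Return True if the record has a well-formed amount and ISO-style date."""
    amount = record.get("amount")
    date = record.get("date")
    return (
        isinstance(amount, str)
        and isinstance(date, str)
        and AMOUNT_RE.fullmatch(amount) is not None
        and DATE_RE.fullmatch(date) is not None
    )
```

A check like this only confirms the shape of each field; it does not confirm that the date is a real calendar date or that the amount matches the source text.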

📦 Installation

pip install txn_harvester

🔧 Usage

Basic Usage (Default LLM7)

from txn_harvester import txn_harvester

user_input = """
Paid for groceries at Whole Foods: $125.50 on 2024-05-15
Rent payment: $1500.00 (due 2024-05-20)
"""

response = txn_harvester(user_input)
print(response)

Custom LLM (OpenAI Example)

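# Requires the langchain-openai package and OPENAI_API_KEY in the environment.
from langchain_openai import ChatOpenAI
from txn_harvester import txn_harvester

user_input = "Rent payment: $1500.00 (due 2024-05-20)"

llm = ChatOpenAI(model="gpt-4")
response = txn_harvester(user_input, llm=llm)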

Custom LLM (Anthropic Example)

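# Requires the langchain-anthropic package and ANTHROPIC_API_KEY in the environment.
# Substitute a model name that is available to your account.
from langchain_anthropic import ChatAnthropic
from txn_harvester import txn_harvester

user_input = "Rent payment: $1500.00 (due 2024-05-20)"

llm = ChatAnthropic(model="claude-2")
response = txn_harvester(user_input, llm=llm)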

Custom LLM (Google Example)

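# Requires the langchain-google-genai package and GOOGLE_API_KEY in the environment.
from langchain_google_genai import ChatGoogleGenerativeAI
from txn_harvester import txn_harvester

user_input = "Rent payment: $1500.00 (due 2024-05-20)"

llm = ChatGoogleGenerativeAI(model="gemini-pro")
response = txn_harvester(user_input, llm=llm)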

🔑 API Key Configuration

  • Default: Uses LLM7_API_KEY from environment variables.
  • Manual: Pass via api_key parameter or set LLM7_API_KEY in your shell:
    export LLM7_API_KEY="your_api_key_here"
    
  • Free Tier: Sufficient for most use cases (rate limits apply).
  • Get Key: Register at https://token.llm7.io
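
The precedence described above (an explicit key first, then the environment variable) can be sketched as follows. `resolve_api_key` is a hypothetical helper written for this example; it is not part of the package, whose internal lookup may differ.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: prefer an explicit key, else fall back to LLM7_API_KEY."""
    return api_key or os.environ.get("LLM7_API_KEY")
```

The resolved value could then be passed as the api_key argument when calling txn_harvester.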

📝 Parameters

Parameter  | Type                    | Description
-----------|-------------------------|------------------------------------------------------------
user_input | str                     | Raw text containing financial transactions (required).
api_key    | Optional[str]           | LLM7 API key (optional; defaults to LLM7_API_KEY).
llm        | Optional[BaseChatModel] | Custom LangChain LLM (optional; defaults to ChatLLM7).

📝 Output

Returns a list of structured transaction data (e.g., [{"amount": "$125.50", "date": "2024-05-15", ...}]).
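
Because the amounts and dates in the example above are strings, downstream code will usually want typed values. The sketch below assumes records shaped like that example, with "$"-prefixed amounts and ISO dates. The `normalize` helper and the `sample` list are hand-written for illustration and are not real package output.

```python
from datetime import date
from decimal import Decimal


def normalize(record: dict) -> dict:
    """Convert string fields to typed values; assumes "$1,234.56" amounts and ISO dates."""
    amount = Decimal(record["amount"].replace("$", "").replace(",", ""))
    return {**record, "amount": amount, "date": date.fromisoformat(record["date"])}


# Hand-written sample in the shape shown above, not actual txn_harvester output.
sample = [
    {"amount": "$125.50", "date": "2024-05-15"},
    {"amount": "$1500.00", "date": "2024-05-20"},
]

normalized = [normalize(r) for r in sample]
total = sum(r["amount"] for r in normalized)
print(total)  # 1625.50
```

Decimal is used rather than float so that currency sums stay exact.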


🔄 Customization

  • Modify regex patterns in prompts.py to adapt to specific transaction formats.
  • Extend the package by subclassing txn_harvester for domain-specific parsing.
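
The usage examples call txn_harvester as a function, so one way to add domain-specific behavior without editing the package is to wrap it. The sketch below is an assumption-laden illustration of that wrapping approach rather than the subclassing route: `with_categories`, `CATEGORY_RULES`, and `fake_harvester` are hypothetical names, and the stub stands in for txn_harvester so the example runs without an LLM.

```python
from typing import Callable, Dict, List

Harvester = Callable[[str], List[dict]]

# Hypothetical keyword-to-category rules, for illustration only.
CATEGORY_RULES: Dict[str, str] = {"rent": "housing", "groceries": "food"}


def with_categories(harvest: Harvester) -> Harvester:
    """Wrap a harvester so records lacking a category get a rule-based one."""
    def wrapped(text: str) -> List[dict]:
        results = []
        for record in harvest(text):
            description = str(record.get("description", "")).lower()
            category = record.get("category")
            if not category:
                # First matching keyword wins; otherwise mark as uncategorized.
                category = next(
                    (cat for key, cat in CATEGORY_RULES.items() if key in description),
                    "uncategorized",
                )
            results.append({**record, "category": category})
        return results
    return wrapped


# Stub standing in for txn_harvester (hypothetical output shape).
def fake_harvester(text: str) -> List[dict]:
    return [{"amount": "$1500.00", "date": "2024-05-20", "description": "Rent payment"}]


harvest = with_categories(fake_harvester)
```

In real use, the stub would be replaced by txn_harvester itself, or by a small lambda that also passes llm or api_key.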

📝 License

MIT

