txn-harvester extracts and structures financial transaction data from unstructured text, parsing amounts, dates, and categories for automation.
txn-harvester
Extract and structure financial transaction data from unstructured text
txn_harvester is a Python package designed to parse and validate financial transaction data from raw text inputs (e.g., bank statements, transaction logs) into structured formats. It leverages LLM7 (via langchain_llm7) by default, but supports any LangChain-compatible LLM for flexibility.
🚀 Features
- Extracts transaction details (amount, date, description, category) from unstructured text
- Validates output against predefined financial patterns using regex
- Supports custom LLMs (OpenAI, Anthropic, Google, etc.) via LangChain
- Lightweight and easy to integrate into financial workflows
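The regex validation step listed above can be pictured with a small sketch. The two patterns and the `is_valid_transaction` helper are hypothetical, chosen only to match the sample output format shown in this README; the package's actual patterns may differ.

```python
import re

# Hypothetical patterns (not taken from the package source).
# Amount: a dollar sign, digits with optional thousands separators, optional cents.
AMOUNT_RE = re.compile(r"^\$\d+(?:,\d{3})*(?:\.\d{2})?$")
# Date: ISO-style YYYY-MM-DD, checked for shape only, not calendar validity.
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def is_valid_transaction(txn: dict) -> bool:
    """Return True when 'amount' and 'date' are strings matching the assumed patterns."""
    amount = txn.get("amount")
    date = txn.get("date")
    if not isinstance(amount, str) or not isinstance(date, str):
        return False
    return bool(AMOUNT_RE.match(amount)) and bool(DATE_RE.match(date))
```

A check like this only confirms the shape of each field; it does not confirm that the amount or date is correct for the source text.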
📦 Installation
pip install txn_harvester
🔧 Usage
Basic Usage (Default LLM7)
from txn_harvester import txn_harvester
user_input = """
Paid for groceries at Whole Foods: $125.50 on 2024-05-15
Rent payment: $1500.00 (due 2024-05-20)
"""
response = txn_harvester(user_input)
print(response)
Custom LLM (OpenAI Example)
from langchain_openai import ChatOpenAI
from txn_harvester import txn_harvester
llm = ChatOpenAI(model="gpt-4")
response = txn_harvester(user_input, llm=llm)
Custom LLM (Anthropic Example)
from langchain_anthropic import ChatAnthropic
from txn_harvester import txn_harvester
llm = ChatAnthropic(model="claude-2")
response = txn_harvester(user_input, llm=llm)
Custom LLM (Google Example)
from langchain_google_genai import ChatGoogleGenerativeAI
from txn_harvester import txn_harvester
llm = ChatGoogleGenerativeAI(model="gemini-pro")
response = txn_harvester(user_input, llm=llm)
🔑 API Key Configuration
- Default: Uses `LLM7_API_KEY` from environment variables.
- Manual: Pass via the `api_key` parameter, or set `LLM7_API_KEY` in your shell: `export LLM7_API_KEY="your_api_key_here"`
- Free Tier: Sufficient for most use cases (rate limits apply).
- Get Key: Register at https://token.llm7.io
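The lookup order described above might be sketched as follows. The `resolve_api_key` helper is hypothetical and not part of the package; it assumes an explicitly passed key takes precedence over the `LLM7_API_KEY` environment variable.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> Optional[str]:
    """Prefer an explicit key; otherwise fall back to LLM7_API_KEY (assumed precedence)."""
    if api_key:
        return api_key
    # Returns None when the variable is not set.
    return os.environ.get("LLM7_API_KEY")
```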
📝 Parameters
| Parameter | Type | Description |
|---|---|---|
| `user_input` | `str` | Raw text containing financial transactions (required). |
| `api_key` | `Optional[str]` | LLM7 API key (optional; defaults to `LLM7_API_KEY`). |
| `llm` | `Optional[BaseChatModel]` | Custom LangChain LLM (optional; defaults to `ChatLLM7`). |
📝 Output
Returns a list of structured transaction data (e.g., [{"amount": "$125.50", "date": "2024-05-15", ...}]).
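Since the example output carries amounts and dates as strings, downstream code will often want typed values. The sketch below is one way to do that; the `Transaction` dataclass and `to_record` helper are hypothetical, and it assumes each item has `amount` and `date` keys in the formats shown above.

```python
import datetime
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Transaction:
    amount: Decimal
    date: datetime.date


def to_record(txn: dict) -> Transaction:
    """Convert one item such as {"amount": "$125.50", "date": "2024-05-15"} to typed values."""
    # Strip the currency symbol and thousands separators before parsing.
    raw_amount = txn["amount"].replace("$", "").replace(",", "")
    return Transaction(
        amount=Decimal(raw_amount),
        date=datetime.date.fromisoformat(txn["date"]),
    )
```

`Decimal` is used rather than `float` so that currency values are not subject to binary rounding.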
🔄 Customization
- Modify regex patterns in `prompts.py` to adapt to specific transaction formats.
- Extend the package by subclassing `txn_harvester` for domain-specific parsing.
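Because the usage examples call `txn_harvester` as a function, a lightweight alternative for domain-specific behavior is a wrapper around it. The `harvest_over` helper below is hypothetical; it accepts any callable that returns a list of dicts with an `amount` string in the format shown in the Output section, and keeps only transactions at or above a threshold.

```python
from typing import Callable, Dict, List


def harvest_over(
    harvester: Callable[[str], List[Dict]],
    text: str,
    minimum: float,
) -> List[Dict]:
    """Run `harvester` on `text` and keep transactions whose amount is >= `minimum`."""
    kept = []
    for txn in harvester(text):
        # Assumes amounts look like "$1,500.00"; float is used only for the comparison.
        value = float(txn["amount"].replace("$", "").replace(",", ""))
        if value >= minimum:
            kept.append(txn)
    return kept
```

Passing the harvester as an argument keeps the wrapper testable with a stub, so no LLM call is needed to check the filtering logic.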
📝 License
MIT
📧 Support & Issues
- GitHub Issues: https://github.com/chigwell/txn-harvester/issues
- Author: Eugene Evstafev (hi@euegne.plus)
File details
Details for the file txn_harvester-2025.12.22084352.tar.gz.
File metadata
- Download URL: txn_harvester-2025.12.22084352.tar.gz
- Upload date:
- Size: 5.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 55792f5773ba5c728c15d5b680a92ceef3778102a386bef34e2ce9667762667d |
| MD5 | 40d2a312771020ba2cd08cc2c143a9fb |
| BLAKE2b-256 | 2752b605d33a7c504555bfc633a678a3c53662d8887600abcddd89532b8f042c |
File details
Details for the file txn_harvester-2025.12.22084352-py3-none-any.whl.
File metadata
- Download URL: txn_harvester-2025.12.22084352-py3-none-any.whl
- Upload date:
- Size: 6.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ac9f77032bd88bfab066bb35db1cf453f574cf9acad5aa7734640a8d1c5d788a |
| MD5 | 8dbcc416580acd105c56b9230508f72e |
| BLAKE2b-256 | d8bc8c4d233a843f5de885625b3c92fabf4c5d2d64bed144e909a2f78afa0ee7 |