Thai bank PDF statement parser (KBank, BBL, SCB)

Project description

thanakan-statement

Thai bank PDF statement parser for KBank, BBL, and SCB.

Installation

pip install thanakan-statement

Usage

Parse single PDF

from thanakan_statement import parse_pdf

statement = parse_pdf("statement.pdf", password="DDMMYYYY")

print(f"Account: {statement.account_number}")
print(f"Bank: {statement.bank}")
print(f"Transactions: {len(statement.transactions)}")

Parse directory

from thanakan_statement import parse_all_pdfs

statements = parse_all_pdfs("./statements/", password="DDMMYYYY")
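
When a directory mixes good and problematic files, it can help to parse them one at a time so that a single failure does not stop the whole batch. The sketch below is an illustration only and is not part of thanakan-statement: `parse_each` is a hypothetical helper, and the parser is passed in as a callable so that nothing is assumed about which exceptions the library raises.

```python
from pathlib import Path
from typing import Any, Callable


def parse_each(directory: str, parser: Callable[[str], Any]):
    """Hypothetical helper: run `parser` on every *.pdf file in `directory`.

    Returns (results, failures), where failures is a list of
    (file name, exception) pairs for the files that could not be parsed.
    """
    results = []
    failures = []
    # glob("*.pdf") is not recursive and is case-sensitive on most Linux systems
    for path in sorted(Path(directory).glob("*.pdf")):
        try:
            results.append(parser(str(path)))
        except Exception as exc:  # broad on purpose: the library's exception types are not assumed
            failures.append((path.name, exc))
    return results, failures
```

With the library installed, this could be called as `parse_each("./statements/", lambda p: parse_pdf(p, password="DDMMYYYY"))`, then the `failures` list inspected before moving on.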

Consolidate by account

from thanakan_statement import parse_all_pdfs, consolidate_by_account

statements = parse_all_pdfs("./statements/")
accounts = consolidate_by_account(statements, preferred_language="en")

for account in accounts:
    print(f"{account.account_number}: {len(account.all_transactions)} transactions")
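
To make the idea of consolidation concrete, here is a minimal sketch of grouping statements by account number using only the standard library. It is not the library's implementation: `Statement` is a stand-in dataclass with just the `account_number` and `transactions` fields seen above, and `group_by_account` is a hypothetical name. What `consolidate_by_account` does about duplicate transactions or `preferred_language` is not covered here.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Statement:
    """Stand-in for a parsed statement (assumption: only these two fields)."""
    account_number: str
    transactions: list = field(default_factory=list)


def group_by_account(statements):
    """Map each account number to the combined transactions of its statements."""
    grouped = defaultdict(list)
    for statement in statements:
        # Transactions are appended in the order the statements are given
        grouped[statement.account_number].extend(statement.transactions)
    return dict(grouped)
```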

Export

from thanakan_statement import (
    parse_all_pdfs,
    consolidate_by_account,
    export_to_json,
    export_to_csv,
    export_to_excel,
)

statements = parse_all_pdfs("./statements/")
accounts = consolidate_by_account(statements)

# Export to JSON
export_to_json(accounts, "output.json")

# Export to CSV (one file per account)
export_to_csv(accounts, "./csv_output/")

# Export to Excel (one sheet per account)
export_to_excel(accounts, "output.xlsx")

Validate balance continuity

from thanakan_statement import parse_all_pdfs, validate_balance_continuity

statements = parse_all_pdfs("./statements/")
statements.sort(key=lambda s: s.statement_period_start)

is_valid, issues = validate_balance_continuity(statements)
if not is_valid:
    for issue in issues:
        print(f"Issue in {issue.statement.source_pdf}")
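
As a rough picture of what a continuity check looks for, the sketch below compares the closing balance of each statement with the opening balance of the next one. It is an assumption about the general technique, not the library's code: `Period`, `opening_balance`, `closing_balance`, and `find_balance_gaps` are hypothetical names, and the input is expected to be sorted chronologically and to belong to a single account.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Period:
    """Hypothetical stand-in for one statement; field names are assumptions."""
    source_pdf: str
    opening_balance: Decimal
    closing_balance: Decimal


def find_balance_gaps(periods):
    """Return (previous file, current file) pairs whose balances do not line up.

    `periods` must already be sorted from oldest to newest.
    """
    gaps = []
    for previous, current in zip(periods, periods[1:]):
        # Decimal avoids the rounding surprises of float for money amounts
        if previous.closing_balance != current.opening_balance:
            gaps.append((previous.source_pdf, current.source_pdf))
    return gaps
```

A gap between two files usually means a statement for the period in between is missing from the directory.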

Supported Banks

Bank     Statement Format    Languages
KBank    PDF                 Thai, English
BBL      PDF                 Thai, English
SCB      PDF                 Thai, English

PDF Password

Most bank statement PDFs are password-protected, with the account holder's birthdate in DDMMYYYY format as the password:

# Specify directly
statement = parse_pdf("statement.pdf", password="02011995")

# Or use environment variable
import os
os.environ["PDF_PASS"] = "02011995"
statement = parse_pdf("statement.pdf")
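
If the birthdate is already available as a `datetime.date`, the DDMMYYYY string can be built rather than typed by hand, which avoids mixing up day and month. `birthdate_password` is a hypothetical helper name, not something exported by the library.

```python
from datetime import date


def birthdate_password(birthdate: date) -> str:
    """Format a date as DDMMYYYY with zero-padded day and month."""
    return birthdate.strftime("%d%m%Y")


# 2 January 1995 gives the same string as the example above
print(birthdate_password(date(1995, 1, 2)))  # 02011995
```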

Documentation

Full documentation: https://ninyawee.github.io/thanakan/libraries/thanakan-statement/

Download files

Source Distribution

thanakan_statement-0.1.1.tar.gz (12.6 kB view details)

Uploaded Source

Built Distribution

thanakan_statement-0.1.1-py3-none-any.whl (15.0 kB view details)

Uploaded Python 3

File details

Details for the file thanakan_statement-0.1.1.tar.gz.

File metadata

  • Download URL: thanakan_statement-0.1.1.tar.gz
  • Upload date:
  • Size: 12.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.16 (publish subcommand, Ubuntu 24.04 noble)

File hashes

Hashes for thanakan_statement-0.1.1.tar.gz
Algorithm Hash digest
SHA256 90f6c44a4c254cb4390b7a6fdc5585ae7f01ae598b9e580b3af2e94c71bc6285
MD5 630003c15bd97015714bb82d8c7be3c1
BLAKE2b-256 8c14db9051d45cb10526e9ad1b655384d67eebd286073ccf4b869c07da9ceeb9

File details

Details for the file thanakan_statement-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: thanakan_statement-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 15.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.16 (publish subcommand, Ubuntu 24.04 noble)

File hashes

Hashes for thanakan_statement-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 a2527e44c32a5324be5ae1b8bd53fc8ec5b3a422c9cd4b1c7218f70a93cb7d71
MD5 249bbb5d4b3ce2e62ac6a2d6d2922a9e
BLAKE2b-256 41116de35883c5a245022376aca37ee3ab32b89c070e4d9de24dfe5ae5cbd54b
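
A downloaded file can be checked against the SHA256 digests listed above with the standard library alone. The sketch below is a generic example: `sha256_of` is a hypothetical helper name, and the file is read in chunks so that large files do not need to fit in memory.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing `sha256_of("thanakan_statement-0.1.1-py3-none-any.whl")` with the SHA256 value listed for that file confirms the download is intact.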
