Thai bank PDF statement parser (KBank, BBL, SCB)

thanakan-statement

Thai bank PDF statement parser for KBank, BBL, and SCB.

Installation

pip install thanakan-statement

Usage

Parse a single PDF

from thanakan_statement import parse_pdf

statement = parse_pdf("statement.pdf", password="DDMMYYYY")

print(f"Account: {statement.account_number}")
print(f"Bank: {statement.bank}")
print(f"Transactions: {len(statement.transactions)}")

Parse a directory

from thanakan_statement import parse_all_pdfs

statements = parse_all_pdfs("./statements/", password="DDMMYYYY")

Consolidate by account

from thanakan_statement import parse_all_pdfs, consolidate_by_account

statements = parse_all_pdfs("./statements/")
accounts = consolidate_by_account(statements, preferred_language="en")

for account in accounts:
    print(f"{account.account_number}: {len(account.all_transactions)} transactions")
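
To make the idea of consolidation concrete, here is a minimal standard-library sketch of merging several statements into one transaction list per account. It is not the library's implementation: `merge_by_account` and the dictionary layout are hypothetical stand-ins, and the sketch does not model what `preferred_language` does.

```python
from collections import defaultdict


def merge_by_account(statements):
    """Merge transactions from many statements into one list per account.

    `statements` is a list of plain dicts (a hypothetical stand-in for
    parsed statements), each with "account_number" and "transactions".
    """
    merged = defaultdict(list)
    for statement in statements:
        merged[statement["account_number"]].extend(statement["transactions"])
    return dict(merged)


# Made-up sample data: two statements for one account, one for another.
sample_statements = [
    {"account_number": "111-1", "transactions": ["t1", "t2"]},
    {"account_number": "222-2", "transactions": ["t3"]},
    {"account_number": "111-1", "transactions": ["t4"]},
]

merged = merge_by_account(sample_statements)
for account_number, transactions in merged.items():
    print(f"{account_number}: {len(transactions)} transactions")
```

Transactions keep the order in which their statements were supplied, so sorting the statements by period first would give a chronological list.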

Export

from thanakan_statement import (
    parse_all_pdfs,
    consolidate_by_account,
    export_to_json,
    export_to_csv,
    export_to_excel,
)

statements = parse_all_pdfs("./statements/")
accounts = consolidate_by_account(statements)

# Export to JSON
export_to_json(accounts, "output.json")

# Export to CSV (one file per account)
export_to_csv(accounts, "./csv_output/")

# Export to Excel (one sheet per account)
export_to_excel(accounts, "output.xlsx")
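
For readers who want to see what "one file per account" can look like, the following sketch writes one CSV per account with the standard `csv` module. Everything here is an assumption made for illustration: the helper `write_csv_per_account`, the file naming, and the column names are hypothetical and may differ from what `export_to_csv` actually produces.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical column names; the library's real CSV layout may differ.
FIELDNAMES = ["date", "description", "amount"]


def write_csv_per_account(accounts, output_dir):
    """Write <account_number>.csv for each account and return the paths."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for account_number, rows in accounts.items():
        path = out / f"{account_number}.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
            writer.writeheader()
            writer.writerows(rows)
        paths.append(path)
    return paths


# Made-up sample data keyed by account number.
sample_accounts = {
    "111-1": [{"date": "2024-01-05", "description": "Transfer", "amount": "-250.00"}],
    "222-2": [{"date": "2024-01-09", "description": "Deposit", "amount": "1000.00"}],
}

with tempfile.TemporaryDirectory() as tmp:
    written = write_csv_per_account(sample_accounts, tmp)
    names = sorted(path.name for path in written)
    first_file_text = written[0].read_text(encoding="utf-8")
    print(names)
```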

Validate balance continuity

from thanakan_statement import parse_all_pdfs, validate_balance_continuity

statements = parse_all_pdfs("./statements/")
statements.sort(key=lambda s: s.statement_period_start)

is_valid, issues = validate_balance_continuity(statements)
if not is_valid:
    for issue in issues:
        print(f"Issue in {issue.statement.source_pdf}")
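
As a rough illustration of what a continuity check involves, the sketch below compares each statement's closing balance with the opening balance of the next one. It is a simplified stand-in under stated assumptions, not the library's code: `StatementSummary`, its field names, and `find_balance_gaps` are hypothetical, and the balances are made-up sample values.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class StatementSummary:
    """Hypothetical stand-in; not the library's statement model."""
    source_pdf: str
    period_start: date
    opening_balance: Decimal
    closing_balance: Decimal


def find_balance_gaps(summaries):
    """Return (previous, current) pairs whose balances do not line up."""
    ordered = sorted(summaries, key=lambda s: s.period_start)
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if previous.closing_balance != current.opening_balance:
            gaps.append((previous, current))
    return gaps


summaries = [
    StatementSummary("jan.pdf", date(2024, 1, 1), Decimal("1000.00"), Decimal("1500.00")),
    StatementSummary("feb.pdf", date(2024, 2, 1), Decimal("1500.00"), Decimal("1200.00")),
    StatementSummary("mar.pdf", date(2024, 3, 1), Decimal("1250.00"), Decimal("900.00")),
]

gaps = find_balance_gaps(summaries)
for previous, current in gaps:
    print(f"Gap between {previous.source_pdf} and {current.source_pdf}")
```

Using `Decimal` rather than `float` keeps the equality comparison exact for currency amounts. A gap usually means a statement is missing from the directory or the periods overlap.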

Supported Banks

Bank  | Statement Format | Languages
KBank | PDF              | Thai, English
BBL   | PDF              | Thai, English
SCB   | PDF              | Thai, English

PDF Password

Most bank statement PDFs are password-protected, and the password is usually the account holder's birthdate in DDMMYYYY format:

# Specify directly
statement = parse_pdf("statement.pdf", password="02011995")

# Or use environment variable
import os
os.environ["PDF_PASS"] = "02011995"
statement = parse_pdf("statement.pdf")
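
If the birthdate is already available as a `date` object, the DDMMYYYY string can be built with the standard library instead of typed by hand. The helper name `birthdate_password` is hypothetical and not part of thanakan-statement; this is only a small sketch of the formatting step.

```python
from datetime import date


def birthdate_password(birthdate: date) -> str:
    """Format a birthdate as DDMMYYYY (day, month, four-digit year)."""
    return birthdate.strftime("%d%m%Y")


# 2 January 1995 matches the example password used above.
password = birthdate_password(date(1995, 1, 2))
print(password)  # 02011995
```

The resulting string can then be passed as the `password` argument or stored in the `PDF_PASS` environment variable.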

Documentation

Full documentation: https://ninyawee.github.io/thanakan/libraries/thanakan-statement/
