Command-line tool for converting PDF bank statements into CSV

Project description

pdf-bank-statement-parser

I built this command-line tool because First National Bank (FNB) South Africa only lets you export historical statements as PDF, which is a terrible format for any downstream task other than reading.

This tool uses pdfplumber for text extraction from the PDF and has no other package dependencies.

This tool extracts all transactions from a PDF bank statement and writes them to a CSV file. It does this by converting the PDF contents to text and then extracting the transactions and balances using regular expressions.
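To make the text-to-CSV step concrete, here is a minimal sketch of regex-based extraction. It is not the tool's actual code: the line layout, the `Cr` suffix convention, the pattern `LINE_RE`, and the helper names `parse_transactions` and `to_csv` are all assumptions made up for illustration, and the sample text stands in for whatever pdfplumber would return for a real statement.

```python
import csv
import io
import re
from decimal import Decimal

# Hypothetical line layout (an assumption, not FNB's real format):
#   "<day> <Mon> <description> <amount>[Cr] <balance>[Cr]"
# A trailing "Cr" marks a credit; amounts without it are treated as debits.
LINE_RE = re.compile(
    r"^(?P<date>\d{2} [A-Z][a-z]{2}) "
    r"(?P<description>.+?) "
    r"(?P<amount>[\d,]+\.\d{2})(?P<amount_cr>Cr)? "
    r"(?P<balance>[\d,]+\.\d{2})(?P<balance_cr>Cr)?$"
)


def to_decimal(number, is_credit):
    """Convert '1,234.56' to a signed Decimal (credits positive, debits negative)."""
    value = Decimal(number.replace(",", ""))
    return value if is_credit else -value


def parse_transactions(text):
    """Return one dict per line that matches the transaction pattern."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # headers, footers, and other non-transaction lines
        rows.append({
            "date": m["date"],
            "description": m["description"],
            "amount": to_decimal(m["amount"], m["amount_cr"] is not None),
            "balance": to_decimal(m["balance"], m["balance_cr"] is not None),
        })
    return rows


def to_csv(rows):
    """Serialise the parsed rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["date", "description", "amount", "balance"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample text standing in for the extracted PDF contents.
sample_text = """\
Statement page 1
01 Mar Salary 10,000.00Cr 12,500.00Cr
02 Mar Groceries 850.25 11,649.75Cr
"""

rows = parse_transactions(sample_text)
print(to_csv(rows))
```

Amounts are kept as `Decimal` rather than `float` so that the balance checks on the parsed rows can use exact equality.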

The parsed results are verified as follows:

  1. For every extracted transaction, the reported balance must equal the previous balance plus the transaction amount.

  2. The opening balance reported on the statement plus the sum of all extracted transaction amounts must equal the closing balance reported on the statement.
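The two checks above can be sketched as follows. This is an illustration under assumptions, not the tool's implementation: the function name `verify`, its signature, and the representation of each transaction as an `(amount, balance)` pair of `Decimal` values are hypothetical.

```python
from decimal import Decimal


def verify(opening_balance, closing_balance, transactions):
    """Return a list of problems found; an empty list means both checks pass.

    `transactions` is a sequence of (amount, balance) pairs in statement order.
    """
    problems = []

    # Check 1: each balance equals the previous balance plus the amount.
    previous = opening_balance
    for row, (amount, balance) in enumerate(transactions, start=1):
        expected = previous + amount
        if expected != balance:
            problems.append(f"row {row}: expected balance {expected}, got {balance}")
        previous = balance  # continue from the reported balance

    # Check 2: opening balance plus all amounts equals the closing balance.
    total = sum((amount for amount, _ in transactions), Decimal("0"))
    if opening_balance + total != closing_balance:
        problems.append(
            f"closing balance: expected {opening_balance + total}, got {closing_balance}"
        )
    return problems


# Invented figures for illustration.
good = [
    (Decimal("10000.00"), Decimal("12500.00")),
    (Decimal("-850.25"), Decimal("11649.75")),
]
ok_result = verify(Decimal("2500.00"), Decimal("11649.75"), good)

# Same statement with the second balance misread by 1.00.
bad = [good[0], (Decimal("-850.25"), Decimal("11650.75"))]
bad_result = verify(Decimal("2500.00"), Decimal("11649.75"), bad)
```

The two checks catch different failures: the first pinpoints the row where a balance or amount was misread, while the second also catches a transaction that was missed entirely at the start or end of the statement.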

This tool currently only works on First National Bank (FNB) current account statements, but I'm happy to extend it to other bank statement formats if there is a need.

Download files

Source Distribution

pdf_bank_statement_parser-0.1.0.tar.gz (28.4 kB view details)

Uploaded Source

Built Distribution

pdf_bank_statement_parser-0.1.0-py3-none-any.whl (21.2 kB view details)

Uploaded Python 3

File details

Details for the file pdf_bank_statement_parser-0.1.0.tar.gz.

File hashes

Hashes for pdf_bank_statement_parser-0.1.0.tar.gz
Algorithm Hash digest
SHA256 25e20dd0ce93d9bdce161bfa89c3eacf45d08751aee1b49dd20cd6c709a79df2
MD5 72f8bd08dc249e933fc122026794825a
BLAKE2b-256 e3bd8932efe1c25f619fdac7af56ff5bb68e51826ab68d78d78f6b133389dff0

File details

Details for the file pdf_bank_statement_parser-0.1.0-py3-none-any.whl.

File hashes

Hashes for pdf_bank_statement_parser-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 8a39889c52de8c12d2acd4f07fb58873f5c210ff7e05cc88c995a04e4f71d103
MD5 07f88f28764f97b901b80c7c805bb844
BLAKE2b-256 727ab033e5f0bdb57095bdff59a847954e134234e6a54d58895cb5f10e076fa8
