Aggregate account PDF statements into JSON and visualize aggregated financial data as timeline

Project description

PDF aggregator

Works offline, using tika for PDF parsing and matplotlib for plotting. Regular expressions stored in simple configuration files extract each statement's balance, date, account number, and similar fields.
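To illustrate the idea of regex-driven extraction, here is a minimal sketch in plain Python. The pattern names, the dictionary layout, and the sample statement text are all hypothetical; the project's actual configuration file format is not shown in this description and may differ.

```python
import re

# Hypothetical patterns for illustration only; not the project's real config format.
PATTERNS = {
    "account": r"Account number:\s*(\d+)",
    "date": r"Ending balance on (\d+)/(\d+)/(\d+)",
    "balance": r"Ending balance on \S+\s+\$([\d,]+\.\d{2})",
}


def extract_fields(text, patterns):
    """Return the captured groups of every pattern that matches the text."""
    fields = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            fields[name] = match.groups()
    return fields


# Made-up statement text standing in for the output of a PDF parser.
sample = "Account number: 12345678\nEnding balance on 1/31/2020 $1,234.56\n"
result = extract_fields(sample, PATTERNS)
```

Keeping the patterns in data rather than in code is what would let a new bank be supported by adding a configuration file instead of editing the script.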

Installation

pip install -r requirements.txt

Usage

Aggregate

Scan PDF files and aggregate financial data into an accounts.json summary file:

python aggregate.py path/to/folder/with/PDF

or

python aggregate.py path/to/file.pdf

Run with --help for more options.
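As a rough picture of what aggregating statements into a JSON summary could look like, the sketch below groups extracted records by account and sorts them by date. The record layout and the resulting JSON structure are assumptions made for illustration; the real accounts.json schema is not documented here.

```python
import json
from collections import defaultdict


def aggregate(records):
    """Group (account, date, balance) records by account, sorted by date."""
    accounts = defaultdict(list)
    for account, date, balance in records:
        accounts[account].append({"date": date, "balance": balance})
    for entries in accounts.values():
        # ISO 8601 date strings sort chronologically when compared as text.
        entries.sort(key=lambda entry: entry["date"])
    return dict(accounts)


# Made-up records, as if extracted from three statements.
records = [
    ("12345678", "2020-02-29", 1300.00),
    ("12345678", "2020-01-31", 1234.56),
    ("87654321", "2020-01-31", 50.00),
]
summary = aggregate(records)
summary_json = json.dumps(summary, indent=2, sort_keys=True)
```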

Add a new config

python aggregate.py path/to/PDF/file --test

With --test alone, the script should print the text content of the PDF. Then test a regular expression against it:

python aggregate.py path/to/PDF/file --test 'Ending balance on (\d+)/(\d+)/(\d+)'

You can then create a conf file and test detection with -vvv:

python aggregate.py path/to/PDF/file -vvv
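Before writing a conf file, it can help to try a candidate regular expression in plain Python. The sketch below uses the "Ending balance on" pattern from the example above; the sample text and the month/day/year interpretation of the three groups are assumptions, and the function name is hypothetical.

```python
import re
from datetime import date

PATTERN = re.compile(r"Ending balance on (\d+)/(\d+)/(\d+)")


def parse_statement_date(text):
    """Return the statement date, or None when the pattern does not match."""
    match = PATTERN.search(text)
    if match is None:
        return None
    # Assumed group order: month, day, year.
    month, day, year = (int(group) for group in match.groups())
    return date(year, month, day)


parsed = parse_statement_date("Ending balance on 1/31/2020 $1,234.56")
missing = parse_statement_date("No balance line here")
```

Returning None on a miss makes it easy to tell a pattern that needs adjusting from a statement that simply lacks the field.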

Plot

Plot aggregated data:

python plot.py path/to/folder/with/multiple/accounts.json

or

python plot.py path/to/accounts.json

Run with --help for more options.

Example:

python.exe .\plot.py .\accounts\ --subtotals --no_real_estate_appreciation
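To give a sense of the data behind a balance timeline, the sketch below combines per-account series into one total series by carrying each account's latest known balance forward. This is only one plausible approach, not necessarily what plot.py does; the function name and the input layout are hypothetical.

```python
from datetime import date


def total_timeline(accounts):
    """Sum the latest known balance of every account at each statement date."""
    all_dates = sorted({day for series in accounts.values() for day, _ in series})
    totals = []
    for day in all_dates:
        total = 0.0
        for series in accounts.values():
            # Latest balance on or before this day; an account with no
            # statement yet contributes nothing.
            known = [balance for when, balance in sorted(series) if when <= day]
            if known:
                total += known[-1]
        totals.append((day, total))
    return totals


# Made-up data: account name -> list of (statement date, balance).
accounts = {
    "checking": [(date(2020, 1, 31), 1000.0), (date(2020, 2, 29), 1100.0)],
    "savings": [(date(2020, 2, 15), 500.0)],
}
timeline = total_timeline(accounts)
```

The dates and totals in such a list could then be passed to matplotlib's plot function as the x and y values.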
