Aggregate account PDF statements into JSON and visualize the aggregated financial data as a timeline.

PDF aggregator

Works offline, relying on tika for PDF parsing and matplotlib for plotting. Regular expressions stored in simple configuration files extract the balance, date, account number, and similar fields from each bank statement.
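As a rough illustration of the regex-driven approach, the sketch below applies a small set of named patterns to statement text. The CONFIG dictionary, its keys, the patterns, and the sample text are all hypothetical; the project's actual configuration file format is not shown here.

```python
import re

# Hypothetical configuration: one regular expression per field to extract.
# The real project's config file format may look quite different.
CONFIG = {
    "account": r"Account number:\s*(\d+)",
    "date": r"Ending balance on (\d+/\d+/\d+)",
    "balance": r"Ending balance on \d+/\d+/\d+\s+\$?([\d,]+\.\d{2})",
}


def extract_fields(text, config):
    """Return the first captured group of each pattern, or None when it does not match."""
    fields = {}
    for name, pattern in config.items():
        match = re.search(pattern, text)
        fields[name] = match.group(1) if match else None
    return fields


# Made-up statement text standing in for the text extracted from a PDF.
sample = "Account number: 12345678\nEnding balance on 12/31/2020 $1,234.56\n"
record = extract_fields(sample, CONFIG)
print(record)
```

Returning None for a field that does not match, rather than raising, makes it easy to see which pattern in a new configuration still needs work.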

Installation

pip install -r requirements.txt

Usage

Aggregate

Scan PDF files and aggregate financial data into an accounts.json summary file:

python aggregate.py path/to/folder/with/PDF

or

python aggregate.py path/to/file.pdf

Run with --help for more options.
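Since the script accepts either a folder or a single file, the input handling might look something like the sketch below. The collect_pdfs helper is hypothetical, and searching sub-folders recursively is an assumption, not documented behavior of aggregate.py.

```python
from pathlib import Path


def collect_pdfs(path):
    """Return a sorted list of PDF paths for a single file or a folder.

    Hypothetical helper: folders are searched recursively (an assumption),
    and the .pdf extension is matched case-insensitively.
    """
    target = Path(path)
    if target.is_file():
        return [target] if target.suffix.lower() == ".pdf" else []
    return sorted(
        p for p in target.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

A caller could then loop over collect_pdfs("path/to/folder/with/PDF") and parse each file in turn.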

Add a new config

python aggregate.py path/to/PDF/file --test

This should print the text content of the PDF. Then test a regular expression against that text:

python aggregate.py path/to/PDF/file --test 'Ending balance on (\d+)/(\d+)/(\d+)'
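The same pattern can also be tried out directly in Python before putting it in a configuration file. This is a minimal sketch: the sample line is made up, and reading the three groups as month/day/year is an assumption about the statement format.

```python
import re
from datetime import date

# Pattern taken from the command above; the sample line is made up.
PATTERN = r"Ending balance on (\d+)/(\d+)/(\d+)"
sample_line = "Ending balance on 12/31/2020"

match = re.search(PATTERN, sample_line)
if match:
    # Assumes month/day/year order, which the pattern itself does not guarantee.
    month, day, year = (int(group) for group in match.groups())
    statement_date = date(year, month, day)
    print(match.groups(), statement_date.isoformat())
else:
    statement_date = None
    print("no match")
```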

You can then create a configuration file and check that it is detected by running with -vvv:

python aggregate.py path/to/PDF/file -vvv

Plot

Plot aggregated data:

python plot.py path/to/folder/with/multiple/accounts.json

or

python plot.py path/to/accounts.json

Run with --help for more options.

Example:

python.exe .\plot.py .\accounts\ --subtotals --no_real_estate_appreciation
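To illustrate how balances from several accounts could be combined into one timeline, here is a minimal sketch. The JSON layout it assumes (account name mapped to a list of entries with an ISO "date" string and a numeric "balance") is hypothetical, as are the load_accounts and total_timeline helpers; the real accounts.json schema and plot.py logic may differ.

```python
import json
from pathlib import Path


def load_accounts(path):
    """Load one accounts.json file, or merge every accounts.json found under a folder.

    Assumed (hypothetical) layout: {"account name": [{"date": "YYYY-MM-DD", "balance": 1.0}, ...]}
    """
    target = Path(path)
    files = [target] if target.is_file() else sorted(target.rglob("accounts.json"))
    merged = {}
    for file in files:
        for account, entries in json.loads(file.read_text()).items():
            merged.setdefault(account, []).extend(entries)
    return merged


def total_timeline(accounts):
    """Return (date, total) pairs, carrying each account's last known balance forward."""
    # ISO date strings sort chronologically when sorted as plain text.
    dates = sorted({e["date"] for entries in accounts.values() for e in entries})
    latest = {}
    timeline = []
    for day in dates:
        for account, entries in accounts.items():
            for entry in entries:
                if entry["date"] == day:
                    latest[account] = entry["balance"]
        timeline.append((day, sum(latest.values())))
    return timeline


# Made-up data in the assumed layout.
example = {
    "checking": [
        {"date": "2020-01-31", "balance": 100.0},
        {"date": "2020-02-29", "balance": 120.0},
    ],
    "savings": [{"date": "2020-02-15", "balance": 50.0}],
}
print(total_timeline(example))
```

The resulting pairs could then be split into dates and totals and passed to matplotlib's pyplot.plot to draw the timeline.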

Download files

Source Distribution

pdf-aggregator-0.0.1.tar.gz (8.8 kB)

Uploaded Source

Built Distribution

pdf_aggregator-0.0.1-py3-none-any.whl (10.6 kB)

Uploaded Python 3

File details

Details for the file pdf-aggregator-0.0.1.tar.gz.

File metadata

  • Download URL: pdf-aggregator-0.0.1.tar.gz
  • Upload date:
  • Size: 8.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/51.0.0 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.6.8

File hashes

Hashes for pdf-aggregator-0.0.1.tar.gz
Algorithm    Hash digest
SHA256       7bb694f89ff1b590429d7f285134567cf58f480f29601b4d04aecc9984216793
MD5          28c607048e83ed6918934598ba12b48d
BLAKE2b-256  66fd84db9f9b053a42debb170be330f241f22f5d55166922731a260b33f0299b

File details

Details for the file pdf_aggregator-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: pdf_aggregator-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 10.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/51.0.0 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.6.8

File hashes

Hashes for pdf_aggregator-0.0.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       8a23cb0760b27b8457c5e84276124c0aac28c8b2a62919f111eb58bc693d80bd
MD5          6dcf5c2028dccbe4030cce77ad08d310
BLAKE2b-256  c0d3526724bcb3ff1fee559d9b1b35aade81584e832013aad60a50b80af8df37
