library for interacting with Czech Raiffeisen Bank's text bank statements

Project description

rbcz.py

rbcz is a Python library for parsing the plain-text bank statements that Raiffeisen Bank sends out via email. It exposes a simple API to either parse statements stored on your local filesystem or search through your email and retrieve them via IMAP.

Install

Either retrieve from PyPI using pip:

$ pip install rbcz

or clone this repo, and install using setup.py:

$ git clone https://github.com/smcl/rbcz.py
$ cd rbcz.py
$ python setup.py install

Methods

There are three simple functions - read_statement, read_statements and read_statements_from_imap. To parse a single statement we can use the read_statement function, which takes a single parameter - the path to the bank statement on the local filesystem - and returns a Statement object:

import rbcz
statement = rbcz.read_statement("/path/to/stmt_january_czk.txt")

If we have a number of statements locally we can use read_statements which accepts a list of filenames to parse, and returns a list of Statement:

import rbcz

statement_filenames = [
    "stmt_jan_czk.txt",
    "stmt_feb_czk.txt",
    "stmt_mar_czk.txt"
]

statements = rbcz.read_statements(statement_filenames)
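
Rather than listing each file by hand, the list of filenames can be built from a directory. The following is a minimal sketch: the helper name statement_paths is hypothetical (it is not part of rbcz), and the "*.txt" pattern assumes the statements were saved with a .txt extension as in the filenames above.

```python
from pathlib import Path


def statement_paths(directory, pattern="*.txt"):
    """Return the statement files in `directory` as a sorted list of path strings.

    Hypothetical helper, not part of rbcz. The default pattern assumes the
    statements were saved with a .txt extension.
    """
    return sorted(str(p) for p in Path(directory).glob(pattern) if p.is_file())
```

The result could then be passed straight to read_statements, for example rbcz.read_statements(statement_paths("./rb")).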

If we don’t have all our statements stored locally we can use read_statements_from_imap to connect to an IMAP server and search it for emails from the “info@rb.cz” address, download and parse the attachments and return a list of Statement.

from rbcz import *

statements = read_statements_from_imap("imap.gmail.com", "my.email.address@gmail.com", "password123", "inbox")
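
To avoid hard-coding a password in a script, the connection details could be read from the environment instead. This is only a sketch under assumptions: the RBCZ_IMAP_* variable names and both helper functions are made up for illustration, and the argument order (host, username, password, folder) is inferred from the call above.

```python
import os


def imap_settings(env=os.environ):
    """Collect the four positional arguments used in the call above.

    The RBCZ_IMAP_* names are hypothetical; choose whatever suits your setup.
    A missing user or password raises KeyError rather than falling back.
    """
    return (
        env.get("RBCZ_IMAP_HOST", "imap.gmail.com"),
        env["RBCZ_IMAP_USER"],
        env["RBCZ_IMAP_PASSWORD"],
        env.get("RBCZ_IMAP_FOLDER", "inbox"),
    )


def load_statements(env=os.environ):
    """Fetch statements over IMAP using settings taken from the environment."""
    import rbcz  # imported here so imap_settings can be used without the library

    return rbcz.read_statements_from_imap(*imap_settings(env))
```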

Types

There are two types - Statement and Movement.

Statement

A Statement represents a monthly statement:

  • account_name - (string) the name of the main account holder (your name!)
  • account_number - (string) your account number
  • iban - (string) the IBAN of your account
  • currency - (string) the currency the account holds
  • number - (int) the number of the statement (your first statement will be 1)
  • from_date - (datetime) the opening date of the statement
  • to_date - (datetime) the closing date of the statement
  • opening_balance - (Decimal) the balance at the opening date of the statement
  • income - (Decimal) the income you’ve received during the statement’s reporting period
  • expenses - (Decimal) the expenses you’ve paid out during the statement’s reporting period
  • closing_balance - (Decimal) the balance at the closing date of the statement
  • blocked - (Decimal) amount ringfenced for payments out
  • receivable - (Decimal) amount received but yet to clear/settle
  • available_balance - (Decimal) amount of money available to withdraw at the closing date of the statement
  • movements - (List of Movement) the individual cash movements (payments in or out) during the reporting period
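
As an illustration of how these fields might be used together, here is a small sketch that builds a one-line summary of a statement. The summarise function is hypothetical, and the example object is a stand-in with made-up values so that the snippet runs without any statement files; a real Statement would come from read_statement.

```python
from datetime import datetime
from decimal import Decimal
from types import SimpleNamespace


def summarise(stmt):
    """Return a one-line summary built from the Statement fields listed above."""
    net = stmt.closing_balance - stmt.opening_balance
    return "#{} {:%Y-%m-%d}..{:%Y-%m-%d} {} {:+.2f} ({} movements)".format(
        stmt.number, stmt.from_date, stmt.to_date, stmt.currency, net,
        len(stmt.movements))


# Stand-in object with made-up values, not a real rbcz Statement.
example = SimpleNamespace(
    number=1,
    currency="CZK",
    from_date=datetime(2017, 1, 1),
    to_date=datetime(2017, 1, 31),
    opening_balance=Decimal("1000.00"),
    closing_balance=Decimal("1250.50"),
    movements=[],
)
print(summarise(example))
```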

Movement

A Movement is an individual transaction - for example an ATM withdrawal or debit card payment. Each Statement has a list of Movement called movements covering all the transactions during the reporting period. Each Movement has the following:

  • number - (int) id of the movement in the current statement
  • amount - (Decimal) amount of the transaction
  • date_deducted - (datetime) the date the transaction was originally submitted
  • date_completed - (datetime) the date and time the transaction was finalised
  • counterparty_account_number - (string) the account the payment was sent to or received from
  • counterparty_details - (string) information about the account the payment was sent to or received from, if available
  • narrative - (string) additional information about the transaction
  • transaction_type - (string) what type of transaction occurred
  • specific_symbol - (string) specific symbol for the movement
  • variable_symbol - (string) variable symbol for the movement
  • constant_symbol - (string) constant symbol for the movement
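
One way these fields might be used is to total the movements by transaction_type. The sketch below is illustrative only: Movement here is a stand-in namedtuple with just the two fields needed, the transaction type strings in the usage note are made up, and it assumes outgoing payments carry a negative amount.

```python
from collections import defaultdict, namedtuple
from decimal import Decimal

# Stand-in for rbcz's Movement with only the two fields used here (illustrative).
Movement = namedtuple("Movement", ["amount", "transaction_type"])


def totals_by_type(movements):
    """Sum the amount of each movement, grouped by transaction_type."""
    totals = defaultdict(Decimal)  # Decimal() is zero, so sums start at 0
    for m in movements:
        totals[m.transaction_type] += m.amount
    return dict(totals)
```

With a parsed statement this could be called as totals_by_type(statement.movements). With stand-in data, two "Card payment" movements of -100.00 and -50.50 would be reported as a single total of -150.50.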

Example

The following script will attempt to parse all the statements in the ./rb directory, then take the closing balance and high/low water marks of each period and plot it on a graph.

#!/usr/bin/python

# system/lib imports
import os
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.dates import YearLocator, MonthLocator, DateFormatter, drange, date2num
from numpy import arange

# rbcz library
import rbcz

# load and sort the statements
statements = sorted(
    rbcz.read_statements([ "./rb/" + f for f in os.listdir("./rb") ]),
    key=lambda stmt: stmt.from_date)

# function to determine high/low-water mark on account
def high_low_water(stmt):
    bal = stmt.opening_balance
    hwm = bal
    lwm = bal
    for m in stmt.movements:
        bal += m.amount
        if bal > hwm:
            hwm = bal
        if bal < lwm:
            lwm = bal
    return (lwm, hwm)

#plt.gca().set_color_cycle(['green', 'black', 'red'])


# extract high/low-water marks
water_marks = [ high_low_water(s) for s in statements ]
low_water_marks = [ wm[0] for wm in water_marks ]
high_water_marks = [ wm[1] for wm in water_marks ]

# extract closing balance and dates
closing_balances = [ s.closing_balance for s in statements ]
dates = date2num([ s.from_date for s in statements ])

# prepare and display the chart using matplotlib
fig, ax = plt.subplots()
# set_prop_cycle replaces the deprecated set_color_cycle
ax.set_prop_cycle(color=['green', 'black', 'red'])
ax.plot_date(dates, high_water_marks, "o-")
ax.plot_date(dates, closing_balances, "o-")
ax.plot_date(dates, low_water_marks, "o-")

# fix up the axes
ax.xaxis.set_major_locator(YearLocator())
ax.xaxis.set_minor_locator(MonthLocator())
ax.xaxis.set_major_formatter(DateFormatter('%Y-%m-%d'))

ax.fmt_xdata = DateFormatter('%Y-%m-%d')
fig.autofmt_xdate()

# add a legend
ax.legend(['highest', 'closing', 'lowest'], loc='upper left')

plt.show()

Depending on the content of the bank statements this will generate a graph like the following:

rbcz.png
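
To make the high/low-water calculation concrete, here is a small worked example that runs without any statement files. It restates the running-balance logic from the script and applies it to a made-up stand-in statement; it assumes the movements are listed in chronological order and that outgoing payments have a negative amount.

```python
from decimal import Decimal
from types import SimpleNamespace


def high_low_water(stmt):
    """Track the running balance and return its (lowest, highest) value."""
    bal = stmt.opening_balance
    hwm = lwm = bal
    for m in stmt.movements:
        bal += m.amount
        hwm = max(hwm, bal)
        lwm = min(lwm, bal)
    return (lwm, hwm)


# Made-up statement: opens at 1000, then -300, +500, -900.
stmt = SimpleNamespace(
    opening_balance=Decimal("1000"),
    movements=[SimpleNamespace(amount=Decimal(a)) for a in ("-300", "500", "-900")],
)
low, high = high_low_water(stmt)
print(low, high)
```

The running balance goes 1000, 700, 1200, 300, so the low-water mark is 300 and the high-water mark is 1200.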


TODO

  • get coverage to 100%
  • decide whether an error parsing an IMAP statement should be swallowed, printed or raised as an exception
  • check if it’s possible to improve the parsing - there are a LOT of regexes that I throw around and it’s not pretty…
  • check if anyone I know gets Czech statements, see if we can parse them too. Are there any other languages - German?
  • check if it works for non-Czech-Republic Raiffeisen
