Augment Beancount importers with machine learning functionality.

https://github.com/beancount/smart_importer

Status

Working prototype; development status: beta

Installation

smart_importer can be installed from PyPI:

pip install smart_importer

Quick Start

This package provides import hooks that can modify the imported entries. When running the importer, the existing entries will be used as training data for a machine learning model, which will then predict entry attributes.

The following example shows how to apply the PredictPostings hook to an existing CSV importer:

from beangulp.importers import csv
from beangulp.importers.csv import Col

from smart_importer import PredictPostings


class MyBankImporter(csv.Importer):
    '''Conventional importer for MyBank'''

    def __init__(self, *, account):
        super().__init__(
            {Col.DATE: 'Date',
             Col.PAYEE: 'Transaction Details',
             Col.AMOUNT_DEBIT: 'Funds Out',
             Col.AMOUNT_CREDIT: 'Funds In'},
            account,
            'EUR',
            (
                'Date, Transaction Details, Funds Out, Funds In'
            )
        )


CONFIG = [
    MyBankImporter(account='Assets:MyBank:MyAccount'),
]

HOOKS = [
    PredictPostings().hook
]

Documentation

This section explains in detail the relevant concepts and artifacts needed for enhancing Beancount importers with machine learning.

Beancount Importers

Let’s assume you have created an importer for “MyBank” called MyBankImporter:

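# Assumes beangulp is installed; its importer module provides the Importer base class.
from beangulp import importer


class MyBankImporter(importer.Importer):
    """My existing importer"""
    # the actual importer logic would be here...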

Note: This documentation assumes you already know how to create Beancount/Beangulp importers. Relevant documentation can be found in the beancount import documentation. With beangulp, users can write their own importers and use them to convert downloaded bank statements into lists of Beancount entries. Examples are provided in beangulp's source code under examples/importers.

smart_importer works only by appending postings to incomplete, single-legged transactions (i.e., it will not modify postings that use placeholder accounts like “Expenses:TODO”). The extract method in the importer should follow the latest interface and include an existing_entries argument.
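To make the "single-legged" requirement concrete, here is a minimal sketch. The Posting and Transaction classes and the is_single_legged helper are hypothetical stand-ins written for this illustration; real importers produce beancount.core.data.Transaction objects, and this is not how smart_importer itself performs the check.

```python
from dataclasses import dataclass, field


@dataclass
class Posting:
    """Hypothetical stand-in for a Beancount posting."""
    account: str
    amount: str


@dataclass
class Transaction:
    """Hypothetical stand-in for a Beancount transaction."""
    payee: str
    postings: list = field(default_factory=list)


def is_single_legged(txn):
    """Return True if the transaction has exactly one posting."""
    return len(txn.postings) == 1


# What an importer should emit: only the bank-account leg is present,
# leaving the second leg to be predicted.
incomplete = Transaction(
    "Supermarket",
    [Posting("Assets:MyBank:MyAccount", "-12.50 EUR")],
)

# An importer that already fills in a placeholder leg leaves nothing to append.
with_placeholder = Transaction(
    "Supermarket",
    [
        Posting("Assets:MyBank:MyAccount", "-12.50 EUR"),
        Posting("Expenses:TODO", "12.50 EUR"),
    ],
)

print(is_single_legged(incomplete))
print(is_single_legged(with_placeholder))
```

If your importer currently emits a placeholder second leg, dropping that leg is likely the simplest way to make its output usable here.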

Using smart_importer as a beangulp hook

Beangulp has the notion of hooks; for a detailed example, see the beangulp hook example (https://github.com/beancount/beangulp/blob/ead8a2517d4f34c7ac7d48e4ef6d21a88be7363c/examples/import.py#L50). Hooks can be used to apply smart_importer to all importers.

  • PredictPostings - predict the list of postings.

  • PredictPayees - predict the payee of the transaction.

For example, to convert an existing MyBankImporter into a smart importer:

from your_custom_importer import MyBankImporter
from smart_importer import PredictPayees, PredictPostings

CONFIG = [
    MyBankImporter('whatever', 'config', 'is', 'needed'),
]

HOOKS = [
    PredictPostings().hook,
    PredictPayees().hook
]

Wrapping an importer to become a smart_importer

Instead of using a beangulp hook, it’s possible to wrap any importer to turn it into a smart importer; this modifies only the wrapped importer.

  • PredictPostings - predict the list of postings.

  • PredictPayees - predict the payee of the transaction.

For example, to convert an existing MyBankImporter into a smart importer:

from your_custom_importer import MyBankImporter
from smart_importer import PredictPayees, PredictPostings

CONFIG = [
    PredictPostings().wrap(
        PredictPayees().wrap(
            MyBankImporter('whatever', 'config', 'is', 'needed')
        )
    ),
]

HOOKS = [
]

Specifying Training Data

The smart_importer hooks need training data, i.e., an existing list of transactions, in order to be effective. Training data can be specified by running the extract command with an argument that references existing Beancount transactions, e.g., import.py extract -e existing_transactions.beancount. When using the importer in Fava, the existing entries are used as training data automatically.
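The following toy sketch illustrates why existing transactions matter. It swaps the machine learning model for simple frequency counting, so it is not the package's actual algorithm; the data and the train and predict helpers are hypothetical and exist only for this illustration.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for existing ledger entries: (payee, account) pairs.
existing = [
    ("Supermarket", "Expenses:Groceries"),
    ("Supermarket", "Expenses:Groceries"),
    ("Supermarket", "Expenses:Household"),
    ("Railway", "Expenses:Transport"),
]


def train(pairs):
    """Count how often each payee was booked to each account."""
    counts = defaultdict(Counter)
    for payee, account in pairs:
        counts[payee][account] += 1
    return counts


def predict(model, payee):
    """Return the most frequent account for a payee, or None if unseen."""
    if payee not in model:
        return None
    return model[payee].most_common(1)[0][0]


model = train(existing)
print(predict(model, "Supermarket"))   # Expenses:Groceries
print(predict(model, "Unknown Shop"))  # None
```

With no existing entries there is nothing to learn from, so no prediction can be made for any payee; the same intuition applies to the real hooks.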

Usage with Fava

Smart importers play nice with Fava. This means you can use smart importers with Fava exactly as you would a conventional importer. See Fava’s help on importers for more information.

Development

Pull requests welcome!

Executing the Unit Tests

Simply run (requires tox):

make test

Configuring Logging

The smart_importer module uses Python’s logging module. The log level can be changed as follows:

import logging
logging.getLogger('smart_importer').setLevel(logging.DEBUG)
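Setting the level alone shows nothing unless a handler is configured. The sketch below, which uses only the standard library, sets up a basic handler and raises verbosity for the smart_importer logger only; the format string and the sample message are illustrative choices, not something the package requires.

```python
import logging

# Attach a handler to the root logger; other libraries stay at INFO.
logging.basicConfig(
    level=logging.INFO,
    format="%(name)s %(levelname)s: %(message)s",
)

# Lower the threshold for the smart_importer logger only.
smart_logger = logging.getLogger("smart_importer")
smart_logger.setLevel(logging.DEBUG)

# Records from this logger propagate to the root handler, so this is shown.
smart_logger.debug("debug output is now visible")
```

Placing this near the top of your import configuration file should make the debug output appear when the importers run.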

Using Tokenizer

Custom tokenizers let smart_importer support more languages, e.g., Chinese.

If you are looking for a Chinese tokenizer, you can follow this example:

First make sure that jieba is installed in your Python environment:

pip install jieba

In your importer code, you can then pass jieba to be used as tokenizer:

from smart_importer import PredictPostings
import jieba

jieba.initialize()
tokenizer = lambda s: list(jieba.cut(s))

predictor = PredictPostings(string_tokenizer=tokenizer)
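As the jieba example suggests, a tokenizer is just a callable that takes a string and returns a list of tokens. The following is a minimal sketch of a dependency-free alternative; simple_tokenizer is a hypothetical name, and the lowercasing and regular expression are illustrative choices rather than the package's defaults.

```python
import re


def simple_tokenizer(text):
    """Lowercase the text and split it into runs of word characters."""
    return re.findall(r"\w+", text.lower())


print(simple_tokenizer("REWE Markt GmbH, Berlin"))
# ['rewe', 'markt', 'gmbh', 'berlin']

# It could then be passed the same way as the jieba tokenizer above:
# predictor = PredictPostings(string_tokenizer=simple_tokenizer)
```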
