Configurable Table Loader package for Python
ConTabLo
Description
ConTabLo is a Python package providing a Configurable Table Loader.
With ConTabLo, you can define import configurations for a number of different CSV file formats, with the goal of importing them into a defined table format with defined data types for each column. Given a number of configurations, each CSV file can be matched to its configuration via metadata such as delimiter, file format and column headers.
A CLI tool is provided that can automatically generate configuration file templates from a number of CSV files. These templates can be edited to get a working configuration. Additionally, a JSON schema can be created from a configuration file to support editing the configurations.
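To illustrate the idea of matching a CSV file to its configuration, here is a minimal sketch that compares the header row of a file against the column labels of each candidate configuration. It does not use ConTabLo's API: the `CONFIGS` dictionary, the `delimiter` key and the `match_config` function are hypothetical names chosen for this example, and the real package may consider more metadata.

```python
import csv
import io

# Hypothetical import configurations. Only the "label" entries mirror the
# configuration format used in this README; the "delimiter" key and the
# matching logic below are assumptions made for this sketch.
CONFIGS = {
    "bank": {
        "delimiter": ";",
        "columns": [
            {"label": "Datum"}, {"label": "Name"}, {"label": "Text"},
            {"label": "Betrag (EUR)"}, {"label": "Saldo (EUR)"},
        ],
    },
    "broker": {
        "delimiter": ",",
        "columns": [{"label": "Date"}, {"label": "Symbol"}, {"label": "Price"}],
    },
}


def match_config(csv_text, configs):
    """Return the name of the first config whose labels equal the CSV header."""
    lines = csv_text.splitlines()
    if not lines:
        return None
    for name, config in configs.items():
        # Split the header row with the delimiter this config expects.
        reader = csv.reader(io.StringIO(lines[0]), delimiter=config["delimiter"])
        header = next(reader)
        labels = [column["label"] for column in config["columns"]]
        if header == labels:
            return name
    return None


sample = 'Datum;Name;Text;Betrag (EUR);Saldo (EUR)\n30.10.2024;"A";"B";"-217,50";"539,11"\n'
print(match_config(sample, CONFIGS))
```

Comparing the full list of labels is why the labels in a configuration should match the CSV header exactly.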
Installation
ConTabLo can be installed from the Python Package Index:
python3 -m pip install contablo
Usage
Used as a command line tool, ConTabLo provides the following functionality:
$ contablo -h
Usage: contablo [OPTIONS] COMMAND [ARGS]...
CLI tool to the ConTabLo python package.
Options:
-v, --verbose
-h, --help Show this message and exit.
Commands:
convert Load the given CSV file(s) based on their...
mk-import-tmpl Create an input configuration template for the given...
mk-schema Create a JSON schema for validation of JSON based...
Example
Create an import template from a number of CSV files
Any application that uses ConTabLo probably has a clear idea of the structure of the table that data should be imported into.
This idea can be expressed by creating a list of FieldSpec objects (or rather, objects of classes that follow the FieldSpec protocol). Since the command line tool contablo is rather generic, it provides a way to initialize this table specification from a simple JSON file like the following:
[
{ "name": "date", "type": "date", "help": "Date of payment" },
{ "name": "payee","type": "string", "help": "Person or institution sending or receiving the payment" },
{ "name": "note", "type": "string", "help": "Transaction notes" },
{ "name": "iban", "type": "string", "help": "Payee IBAN" },
{ "name": "bic", "type": "string", "help": "Payee BIC, if applicable" },
{ "name": "amount", "type": "number", "help": "Payment amount" },
{ "name": "balance", "type": "number", "help": "New account balance" }
]
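For applications that want to read such a file themselves, the following sketch loads the JSON list into plain Python objects. `SimpleFieldSpec` and `load_field_specs` are hypothetical names for this example and are not part of ConTabLo; the package's own FieldSpec protocol may require different attributes.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleFieldSpec:
    """Hypothetical stand-in for one target field; not a ConTabLo class."""
    name: str
    type: str
    help: str = ""


def load_field_specs(json_text):
    """Parse a JSON list of field descriptions and reject duplicate names."""
    specs = [SimpleFieldSpec(**entry) for entry in json.loads(json_text)]
    names = [spec.name for spec in specs]
    duplicates = {name for name in names if names.count(name) > 1}
    if duplicates:
        raise ValueError(f"duplicate field names: {sorted(duplicates)}")
    return specs


example = """
[
  { "name": "date", "type": "date", "help": "Date of payment" },
  { "name": "amount", "type": "number", "help": "Payment amount" }
]
"""
specs = load_field_specs(example)
print([spec.name for spec in specs])
```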
With this field specification, we can create an import template from a number of CSV files. The template already contains hints for possible target fields:
$ contablo mk-import-tmpl -t fieldspec-banking.json banking*.csv -o bank-tmpl
Found 1 distict file formats:
# 1: 2 files, iso-8859-1 encoded, with 1 chunks.
# 1: 5 columns, 87 samples
Datum;Name;Text;Betrag (EUR);Saldo (EUR)
30.10.2024;"Some random payee Inc.";"SEPA-Basislasts [..] 24 Kunden-Referenz: xxxyyyzzz";"-217,50";"539,11"
Template was written to bank-tmpl-20241103_165348-01.json
The resulting template now contains specifications for each of the five columns, e.g.:
{
"columns": [
{
"label": "Datum",
"field": "date",
"format": "dd.mm.yyyy"
},
{
"label": "Name",
"field": "payee|note|iban|bic",
"format": "",
"samples": [
"Some random payee Inc.",
"Another random payee Inc.",
"Someone completely different"
]
},
{
"label": "Text",
"field": "payee|note|iban|bic",
"format": "",
"samples": [
"Note 1",
"Note 2"
]
},
{
"label": "Betrag (EUR)",
"field": "amount|balance",
"format": "-1.000,00"
},
{
"label": "Saldo (EUR)",
"field": "amount|balance",
"format": "1.000,00"
}
]
}
Each "field" entry contains suggestions for target fields to choose from, depending on the type of input (again, see subclasses of FieldSpec for built-in types).
After a manual cleanup, the same template becomes an import specification:
{
"columns": [
{
"label": "Datum",
"field": "date",
"format": "dd.mm.yyyy"
},
{
"label": "Name",
"field": "payee"
},
{
"label": "Text",
"field": "note"
},
{
"label": "Betrag (EUR)",
"field": "amount",
"format": "-1.000,00"
},
{
"label": "Saldo (EUR)",
"field": "balance",
"format": "1.000,00"
}
]
}
Note how the samples and empty format elements were removed and the field entries were reduced to the desired target field name. The labels must stay unchanged: they are an essential identification marker for matching a CSV import file to the appropriate import configuration. Since the file is no longer a template, we rename it to something more appropriate, e.g. bank-import-config.json.
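The manual cleanup described above can also be scripted. The sketch below is only an illustration under stated assumptions: `clean_template` and its `choices` argument are hypothetical and not part of ConTabLo. It keeps the labels untouched, drops the samples and empty formats, and requires an explicit choice wherever the template offers more than one candidate field.

```python
import json


def clean_template(template, choices):
    """Turn a generated template into an import configuration.

    `choices` maps a column label to the target field picked by the editor.
    """
    columns = []
    for column in template["columns"]:
        label = column["label"]
        candidates = column["field"].split("|")
        # A single candidate needs no explicit choice; otherwise one is required.
        default = candidates[0] if len(candidates) == 1 else None
        field = choices.get(label, default)
        if field not in candidates:
            raise ValueError(f"no valid target field chosen for column {label!r}")
        cleaned = {"label": label, "field": field}
        if column.get("format"):  # keep only non-empty formats
            cleaned["format"] = column["format"]
        columns.append(cleaned)  # "samples" is dropped on purpose
    return {"columns": columns}


template = {
    "columns": [
        {"label": "Datum", "field": "date", "format": "dd.mm.yyyy"},
        {
            "label": "Name",
            "field": "payee|note|iban|bic",
            "format": "",
            "samples": ["Some random payee Inc."],
        },
    ]
}
config = clean_template(template, {"Name": "payee"})
print(json.dumps(config, indent=2))
```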
We are now ready to convert CSV files of the type described by the import specification:
$ contablo convert -t fieldspec-banking.json -c bank-import-config.json banking-*.csv -o merged_data.csv
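To give a rough idea of what converting a row involves, the sketch below parses the sample values from the banking file into typed Python values. It is not ConTabLo's implementation: the helper names are hypothetical, and it assumes that the format strings "dd.mm.yyyy" and "-1.000,00" denote German-style dates and numbers (dot as thousands separator, comma as decimal mark).

```python
from datetime import date, datetime
from decimal import Decimal


def parse_german_date(text):
    """Parse a date written as dd.mm.yyyy, e.g. "30.10.2024"."""
    return datetime.strptime(text, "%d.%m.%Y").date()


def parse_german_number(text):
    """Parse a number like "-1.000,00": '.' groups thousands, ',' is the decimal mark."""
    return Decimal(text.replace(".", "").replace(",", "."))


# One input row, keyed by the CSV column labels.
row = {"Datum": "30.10.2024", "Betrag (EUR)": "-217,50", "Saldo (EUR)": "539,11"}

# The same row, keyed by target field names and converted to typed values.
record = {
    "date": parse_german_date(row["Datum"]),
    "amount": parse_german_number(row["Betrag (EUR)"]),
    "balance": parse_german_number(row["Saldo (EUR)"]),
}
print(record)
```

Using `Decimal` rather than `float` avoids rounding surprises with monetary amounts.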
Contributing
If you want to contribute to this project, please use the following steps:
- Fork the project.
- Create a new branch (git checkout -b feature/awesome-feature).
- Commit your changes (git commit -m 'Add some feature').
- Push to the branch (git push origin feature/awesome-feature).
- Open a pull request.
Commit Message Structure
This project aims to follow the Conventional Commits guidelines.
When writing commit messages, use one of the following categories to clearly describe the purpose of your commit:
- feat / feature: ✨ Introducing new features
- fix / bugfix: 🐛 Addressing bug fixes
- perf: 🚀 Enhancing performance
- refactor: 🔄 Refactoring code - Not displayed in CHANGELOG
- test / tests: ✅ Adding or updating tests - Not displayed in CHANGELOG
- build / ci: 🛠️ Build system or CI/CD updates - Not displayed in CHANGELOG
- doc / docs: 📚 Documentation changes - Not displayed in CHANGELOG
- style: 🎨 Code style or formatting changes - Not displayed in CHANGELOG
- chore: 🔧 Miscellaneous chores
- other: 🌟 Other significant changes
Example Commit Messages
feat: Add cool new feature
fix: Resolve unexpected behavior with translation
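If you would like to check a commit subject locally before pushing, a small script along the following lines could help. It is a sketch only: the category names are copied from the list above, while the regular expression and the function name `check_commit_subject` are assumptions and not part of this project's tooling.

```python
import re

# Categories copied from the list above.
CATEGORIES = [
    "feat", "feature", "fix", "bugfix", "perf", "refactor", "test", "tests",
    "build", "ci", "doc", "docs", "style", "chore", "other",
]

# Assumed shape: "<category>(<optional scope>)<optional !>: <summary>"
PATTERN = re.compile(r"^(?P<category>[a-z]+)(\([^)]+\))?!?: (?P<summary>\S.*)$")


def check_commit_subject(subject):
    """Return True if the subject line starts with a known category prefix."""
    match = PATTERN.match(subject)
    return bool(match) and match.group("category") in CATEGORIES


print(check_commit_subject("feat: Add cool new feature"))
print(check_commit_subject("Add cool new feature"))
```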
License
This project is licensed under the MIT License.