A package for extracting JSON data from Maybank PDF account statements
maybankpdf2json
A small Python library to extract transaction data from Maybank PDF account statements.
Install
Requires Python 3.8 or newer.
pip install maybankpdf2json
Quick Start
from maybankpdf2json import MaybankPdf2Json

with open("statement.pdf", "rb") as f:
    extractor = MaybankPdf2Json(f, "your_pdf_password")

    # Raw Python data
    transactions = extractor.data()
    print(transactions[0])

    # Nicely formatted JSON string
    print(extractor.dumps())

    # Full output with account metadata
    print(extractor.dumps_v2())
API
MaybankPdf2Json(buffer, pwd)
- json() -> List[Output]
  - Returns transaction rows with fields: date, desc, trans, bal.
- data() -> List[Output]
  - Clearer alias for json().
- jsonV2() -> dict
  - Returns:
    - account_number: statement account number when available
    - statement_date: statement date in dd/mm/yy
    - transactions: same list as json()
- data_v2() -> dict
  - Clearer alias for jsonV2().
- dumps(indent=2) -> str
  - Returns transaction data as nicely formatted JSON text.
- dumps_v2(indent=2) -> str
  - Returns account metadata plus transactions as nicely formatted JSON text.
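The row shape listed above can be written down as a typed dictionary, which may help when type-checking code that consumes the rows. This is a sketch only: `TransactionRow` and `is_transaction_row` are hypothetical names inferred from the documented fields and sample output, not the library's own `Output` definition.

```python
from typing import List, TypedDict


class TransactionRow(TypedDict):
    """Assumed shape of one row; the library's real Output type may differ."""
    date: str     # dd/mm/yy
    desc: str     # transaction description
    trans: float  # signed transaction amount
    bal: float    # balance after the transaction


def is_transaction_row(row: dict) -> bool:
    """Check that a dict has exactly the four documented fields with plausible types."""
    if set(row) != {"date", "desc", "trans", "bal"}:
        return False
    return (
        isinstance(row["date"], str)
        and isinstance(row["desc"], str)
        and isinstance(row["trans"], (int, float))
        and isinstance(row["bal"], (int, float))
    )


rows: List[TransactionRow] = [
    {"date": "01/09/24", "desc": "BEGINNING BALANCE", "trans": 0, "bal": 3285.77},
]
print(all(is_transaction_row(r) for r in rows))
```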
Output Notes
- Dates use dd/mm/yy.
- Amounts support trailing sign notation from statements:
  - 123.45- -> -123.45
  - 123.45+ -> 123.45
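The trailing-sign rule can be illustrated with a short sketch. The helper name `parse_trailing_sign` is hypothetical and not part of the library, and the comma handling is an assumption about how statements write thousands; the library's own parser may behave differently.

```python
def parse_trailing_sign(text: str) -> float:
    """Convert a statement amount such as '123.45-' or '123.45+' to a float.

    Hypothetical helper for illustration only; not the library's parser.
    """
    # Assumption: commas are thousands separators and can be dropped.
    cleaned = text.strip().replace(",", "")
    sign = 1.0
    if cleaned.endswith("-"):
        sign = -1.0
        cleaned = cleaned[:-1]
    elif cleaned.endswith("+"):
        cleaned = cleaned[:-1]
    return sign * float(cleaned)


print(parse_trailing_sign("123.45-"))  # -123.45
print(parse_trailing_sign("123.45+"))  # 123.45
```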
Example pretty-printed output:
Transaction list output from dumps():
[
  {
    "date": "01/09/24",
    "desc": "BEGINNING BALANCE",
    "trans": 0,
    "bal": 3285.77
  },
  {
    "date": "01/09/24",
    "desc": "TRANSFER FROM A/C MBBQR1714285 * 11111755387009 124998670Q",
    "trans": -10.0,
    "bal": 3275.77
  }
]
Full output from dumps_v2():
{
  "account_number": "162021-851156",
  "statement_date": "30/09/24",
  "transactions": [
    {
      "date": "01/09/24",
      "desc": "BEGINNING BALANCE",
      "trans": 0,
      "bal": 3285.77
    }
  ]
}
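As a sketch of how the JSON text might be consumed downstream, the snippet below loads a string in the dumps_v2() shape, converts the dd/mm/yy dates, and totals the signed amounts. The sample string is assembled from the two examples above rather than taken from a real statement, and the `parsed_date` key is added here for illustration only.

```python
import json
from datetime import datetime

# Sample text in the dumps_v2() shape, combined from the examples above.
raw = """
{
  "account_number": "162021-851156",
  "statement_date": "30/09/24",
  "transactions": [
    {"date": "01/09/24", "desc": "BEGINNING BALANCE", "trans": 0, "bal": 3285.77},
    {"date": "01/09/24",
     "desc": "TRANSFER FROM A/C MBBQR1714285 * 11111755387009 124998670Q",
     "trans": -10.0, "bal": 3275.77}
  ]
}
"""

statement = json.loads(raw)

# Dates are dd/mm/yy strings; convert them to date objects for sorting or filtering.
for row in statement["transactions"]:
    row["parsed_date"] = datetime.strptime(row["date"], "%d/%m/%y").date()

# Amounts are already signed, so a plain sum gives the net movement.
net_change = sum(row["trans"] for row in statement["transactions"])
closing_balance = statement["transactions"][-1]["bal"]

print(statement["transactions"][0]["parsed_date"].isoformat())
print(net_change)
print(closing_balance)
```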
Development
Install project dependencies:
make install
Run tests:
make test
Alternative test command:
pytest tests/
See CONTRIBUTING.md for development workflow and docs/ARCHITECTURE.md for parser internals.
Release
See CHANGELOG.md for release history.
Automatic PyPI publishing is configured with GitHub Actions in .github/workflows/publish.yml.
One-time setup on PyPI:
- Open the project on PyPI.
- Add a Trusted Publisher for this GitHub repository.
- Use workflow name publish.yml.
- Use environment name pypi.
Release flow:
- Move items from [Unreleased] in CHANGELOG.md into a new version section.
- Update the version in pyproject.toml and setup.py.
- Commit and push main.
- Create and push a version tag such as v0.1.53.
- GitHub Actions builds the package and publishes it to PyPI automatically.
Example:
git tag v0.1.53
git push origin v0.1.53
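Because the version is set by hand in two files and again in the tag, a quick consistency check before pushing the tag may be useful. The sketch below is not part of the project: the function names are hypothetical, and it assumes pyproject.toml holds a plain `version = "..."` line.

```python
import re


def version_from_pyproject(text: str) -> str:
    """Return the first `version = "..."` value found in pyproject.toml text."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no version field found")
    return match.group(1)


def tag_matches_version(tag: str, version: str) -> bool:
    """A tag such as 'v0.1.53' matches version '0.1.53'."""
    return tag == f"v{version}"


# Assumed file contents for illustration; read the real file in practice.
sample = '[project]\nname = "maybankpdf2json"\nversion = "0.1.53"\n'
version = version_from_pyproject(sample)
print(tag_matches_version("v0.1.53", version))  # True
print(tag_matches_version("v0.1.54", version))  # False
```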
Local manual release remains available for maintainers:
make release
This builds and uploads to PyPI using Twine. Run only with valid release credentials.
License
MIT. See LICENSE.