Extract bank transactions from Crédit Mutuel PDF statements
Crédit Mutuel PDF Extractor
A robust Python utility to extract transaction data from Crédit Mutuel bank statement PDFs, validate data integrity, and export to structured formats (JSON/CSV) or Google Sheets.
Features
- Automated Extraction: Parses transaction dates, descriptions, and amounts from multiple accounts per PDF.
- Balance Validation: Sums the transactions of each account and checks the result against the starting and ending balances stated in the statement.
- Strict CLI: Explicit input file list and mandatory --output flag (with .csv or .json validation).
- French Format Support: Handles French number formatting (e.g., 1.234,56 or 1 234,56).
- Structured Logging: Uses the Python logging module for clean, professional output and error reporting.
- Automation: Includes a Justfile for common tasks like run and clean.
- Account Mapping: Support for custom account labels via YAML configuration.
- Google Sheets Export: Direct export to a Google Spreadsheet.
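The French number formats listed above can be normalized with a few string operations. The following is a minimal sketch, not the package's actual code: the helper name parse_french_amount is hypothetical, and it returns a Decimal to keep the example exact, whereas the package itself converts amounts to floats.

```python
from decimal import Decimal


def parse_french_amount(text: str) -> Decimal:
    """Convert a French-formatted amount such as '1.234,56' or '1 234,56' to a Decimal."""
    cleaned = text.strip()
    # Drop thousands separators: plain, non-breaking and narrow non-breaking spaces, and dots.
    for separator in (" ", "\u00a0", "\u202f", "."):
        cleaned = cleaned.replace(separator, "")
    # In French notation the comma is the decimal separator.
    cleaned = cleaned.replace(",", ".")
    return Decimal(cleaned)


print(parse_french_amount("1.234,56"))  # 1234.56
print(parse_french_amount("1 234,56"))  # 1234.56
print(parse_french_amount("-12,00"))    # -12.00
```

This sketch assumes a dot is always a thousands separator, so it would misread an English-style value such as 12.5.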
Installation
You can install the extractor directly from PyPI:
pip install credit_mutuel_pdf_extractor
Or using uv:
uv tool install credit_mutuel_pdf_extractor
Usage
Global Command
Once installed, you can use the cmut_process_pdf command from anywhere:
cmut_process_pdf data/*.pdf --output results.csv --config config.yaml
Using Just (Development)
If you have the source code and the just command runner installed:
To process all PDFs in the data/ directory using the labels defined in config.yaml (outputs to transactions.csv):
just run
To output in JSON format:
just run json
To clean up all generated files:
just clean
Configuration
Account Mapping
You can map account numbers to custom labels by creating a config.yaml file. See config.example.yaml for a template.
account_mapping:
  21945407: "Crequi"
  21945409: "Prevost"
Note: Account numbers are matched as integers (leading zeros are ignored).
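Integer matching means that an account number read from a statement with leading zeros still finds its label. The following is a minimal sketch of that lookup. The function name label_for_account is hypothetical, and falling back to the raw number for unmapped accounts is an assumption, not documented behavior.

```python
def label_for_account(raw_number: str, account_mapping: dict[int, str]) -> str:
    """Return the configured label for an account number, or the raw number if unmapped."""
    # int("00021945407") == 21945407, so leading zeros do not affect the lookup.
    key = int(raw_number)
    # Assumption: unmapped accounts keep their raw number as the label.
    return account_mapping.get(key, raw_number)


account_mapping = {21945407: "Crequi", 21945409: "Prevost"}
print(label_for_account("00021945407", account_mapping))  # Crequi
print(label_for_account("12345678", account_mapping))     # 12345678
```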
Description Mapping
You can automatically rename transactions by adding a description_mapping section. If any key is found as a substring (case-insensitive) in the transaction description, it will be replaced by the corresponding label.
description_mapping:
  "VIR SEPA FROM": "Transfer"
  "NETFLIX": "Entertainment"
  "AMAZON": "Shopping"
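The matching rule can be illustrated with a short sketch. It rests on two assumptions: the whole description is replaced by the label, and the first matching key in configuration order wins. The function name apply_description_mapping is hypothetical.

```python
def apply_description_mapping(description: str, description_mapping: dict[str, str]) -> str:
    """Return the label of the first key found in the description, else the description unchanged."""
    lowered = description.lower()
    for key, label in description_mapping.items():
        # Case-insensitive substring test.
        if key.lower() in lowered:
            return label
    return description


description_mapping = {
    "VIR SEPA FROM": "Transfer",
    "NETFLIX": "Entertainment",
    "AMAZON": "Shopping",
}
print(apply_description_mapping("PRLV Netflix.com 12/03", description_mapping))  # Entertainment
print(apply_description_mapping("BOULANGERIE DUPONT", description_mapping))      # BOULANGERIE DUPONT
```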
Google Sheets Export
To enable Google Sheets export, add a google_sheets section to your config.yaml:
google_sheets:
  spreadsheet_id: "your-spreadsheet-id"
  sheet_name: "Transactions"
  credentials_file: "credentials.json"
Service Account Setup:
- Create a project in Google Cloud Console.
- Enable both Google Sheets API and Google Drive API.
- Create a Service Account (APIs & Services > Credentials > Create Credentials > Service Account).
- Create a JSON Key for that service account and download it.
- Save the key as credentials.json (or any path specified in your config.yaml).
- Permission: Share your Google Spreadsheet with the service account email (found in the JSON) with Editor access. No broad IAM roles are needed if the spreadsheet is shared directly.
Command Line Interface
You can explicitly specify files, the output format, and enable Google Sheets export:
uv run credit-mutuel-extractor data/*.pdf --output results.csv --config config.yaml --gsheet --include-source-file
Requirements:
- At least one input PDF file.
- The --output flag is mandatory, and its value must end in .csv or .json.
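These requirements could be enforced with the standard argparse module. The following is a minimal sketch that is not taken from the package's source: the function names are hypothetical, and the help text for --include-source-file reflects an assumed meaning.

```python
import argparse
from pathlib import Path


def output_path(value: str) -> Path:
    """Accept only output paths ending in .csv or .json."""
    path = Path(value)
    if path.suffix.lower() not in {".csv", ".json"}:
        raise argparse.ArgumentTypeError("output file must end in .csv or .json")
    return path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Extract transactions from statement PDFs")
    # nargs="+" requires at least one input file.
    parser.add_argument("pdfs", nargs="+", type=Path, help="input PDF files")
    parser.add_argument("--output", required=True, type=output_path, help=".csv or .json file")
    parser.add_argument("--config", type=Path, help="YAML configuration file")
    parser.add_argument("--gsheet", action="store_true", help="also export to Google Sheets")
    # Assumed meaning: add the originating PDF name to each exported row.
    parser.add_argument("--include-source-file", action="store_true")
    return parser


args = build_parser().parse_args(["data/a.pdf", "--output", "results.csv"])
print(args.output.suffix)  # .csv
```

With this setup, a missing --output or an unsupported extension makes argparse print a usage error and exit with status 2.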
Technical Details
- Account Identification: Uses vertical Y-coordinate mapping to associate tables with the correct account number headers.
- Data Normalization: Amounts are cleaned and converted to standard floats.
- Validation: If Starting Balance + Σ(Transactions) != Ending Balance, the script will report a CRITICAL error and halt execution.
- Modular Design: Utility functions are separated into utils.py for maintainability.
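The balance check can be sketched as follows. This is an illustration only, with hypothetical function names and an assumed exit code. It uses Decimal so that the equality test is exact; with floats, as the package uses, a small tolerance would be needed instead.

```python
import logging
import sys
from decimal import Decimal

logger = logging.getLogger("balance_check")


def balances_match(start: Decimal, transactions: list[Decimal], end: Decimal) -> bool:
    """Return True when start + sum(transactions) equals the ending balance."""
    return start + sum(transactions, Decimal("0")) == end


def validate_or_exit(start: Decimal, transactions: list[Decimal], end: Decimal) -> None:
    """Log a CRITICAL message and stop the program when the balances do not reconcile."""
    if not balances_match(start, transactions, end):
        computed = start + sum(transactions, Decimal("0"))
        logger.critical("Balance mismatch: computed %s, statement says %s", computed, end)
        sys.exit(1)  # Assumed exit code.


transactions = [Decimal("-45.90"), Decimal("1200.00"), Decimal("-12.34")]
print(balances_match(Decimal("1000.00"), transactions, Decimal("2141.76")))  # True
print(balances_match(Decimal("1000.00"), transactions, Decimal("2141.75")))  # False
```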
Security & Publishing
Secret Leak Prevention
This project uses pre-commit and detect-secrets to prevent accidental commits of sensitive data.
Before committing, the hooks will scan for potential secrets.
Publishing to PyPI
Publishing is automated via the Justfile and integrated with 1Password for security.
- Store your PyPI Token: Create a "Login" or "Password" item in 1Password.
- Add Environment Variable: Add a field named UV_PUBLISH_TOKEN containing your PyPI API token.
- Publish:
just publish
This uses op run to securely inject the token into the uv publish command without it ever being stored in plain text or shell history.