EquipeBaie Invoice Automation
Equipe Baie — Invoice Automation
Automated invoicing pipeline that parses PDF invoices, enriches them with client accounting references, and generates Excel accounting exports — with a desktop UI for day-to-day operation.
Project structure
EquipeBaie_Freelance-project/
├── src/
│ ├── ebia/ # Core library (published to PyPI)
│ │ ├── parser.py # PDF invoice parser
│ │ ├── xls_generator.py # Excel (.xlsx) generator
│ │ └── cli.py # CLI entry point
│ └── ebia_ui/ # Desktop application (not on PyPI)
│ ├── paths.py # All filesystem paths — single source of truth
│ ├── logging_config.py # App-wide logging setup
│ ├── main.py # Application entry point
│ ├── core/
│ │ ├── config_manager.py # Persistent JSON config
│ │ ├── client_manager.py # Client reference registry
│ │ ├── engine.py # Orchestration layer (RunEngine)
│ │ ├── manifest.py # Processed-file tracking (SHA-256)
│ │ └── _mp_worker.py # Subprocess entry point (multiprocessing)
│ ├── ui/
│ │ ├── main_window.py
│ │ ├── components/ # Reusable widgets, sidebar
│ │ └── views/ # Run, Config, Client, About pages
│ └── assets/ # logo.png, logo.ico, default_clients.json
├── installer/
│ ├── EquipeBaie.iss # Inno Setup installer script
│ └── equipebaie.cer # Self-signed code-signing certificate (public key)
├── tests/
│ ├── unit/ # Pure-function tests, no I/O
│ ├── integration/ # Real PDF/Excel tests
│ └── e2e/ # CLI subprocess tests
├── scripts/
│ └── build_exe.py # PyInstaller Windows build (onedir)
└── pyproject.toml
Components
ebia — core library
Stateless, no side effects. Can be used standalone via CLI or imported directly.
- parser.py — extracts Client, date, total_ttc from a French-format PDF invoice using pdfplumber; parse_many() parses a list of PDFs in parallel using ProcessPoolExecutor, returning (path, dict | Exception, list[LogRecord]) triples — log records collected in the worker are re-emitted by the caller so they reach the UI and log file on all platforms
- xls_generator.py — generates one .xlsx per month with one sheet per day; 3 accounting rows per invoice (411 / 44571 / 701). The Compte auxiliaire (client reference) appears only on the 411 row; rows 44571 and 701 leave that column blank, matching the accounting system's expectation. patch_month_workbook is called by the engine for open-month invoices — it appends new rows at the correct day position with doc numbers read directly from each invoice's invoice_id and sequential piece numbers continuing from the month's last counter (no renumbering of existing rows).
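The three-row pattern can be sketched as follows. This is an illustration, not the library's code: the function name `accounting_rows`, the dictionary keys, the debit/credit layout, and the rounding rule are assumptions. Only the account numbers (411 / 44571 / 701) and the rule that the client reference appears on the 411 row alone come from the description above.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
ZERO = Decimal("0.00")


def accounting_rows(total_ttc: str, vat_rate: str, client_ref: str) -> list[dict]:
    """Split a VAT-inclusive total (TTC) into three balanced rows (hypothetical layout)."""
    ttc = Decimal(total_ttc).quantize(CENT, rounding=ROUND_HALF_UP)
    # Net amount (HT) is derived from the gross total and rounded to cents.
    ht = (ttc / (Decimal(1) + Decimal(vat_rate))).quantize(CENT, rounding=ROUND_HALF_UP)
    # VAT is taken as the remainder, so the debit always equals the sum of the credits.
    vat = ttc - ht
    return [
        # Only the 411 row carries the client reference (Compte auxiliaire).
        {"account": "411", "aux": client_ref, "debit": ttc, "credit": ZERO},
        {"account": "44571", "aux": "", "debit": ZERO, "credit": vat},
        {"account": "701", "aux": "", "debit": ZERO, "credit": ht},
    ]


rows = accounting_rows("120.00", "0.20", "C0042")
for row in rows:
    print(row)
```

Taking VAT as the remainder rather than rounding it independently keeps each invoice balanced to the cent, which is usually what an accounting import expects.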
ebia_ui — desktop application
Built with customtkinter. Requires Python + display (not headless).
- Run view — manual trigger + recurring scheduler (daily / weekly / monthly). The engine runs in a separate process (own GIL) so the UI stays responsive during execution. Includes a shortcut button to open the reports folder directly in File Explorer, and a run history table showing the last 20 executions (manual and scheduled) with status, report count, invoice count, and error details.
- Config view — VAT rate, accounting exercise start month, invoice folder path, configurable reports output folder, manual PDF archive trigger (runs in background — UI stays responsive on network drives), workspace folder for multi-machine setups
- Client view — client reference registry (accounting code ↔ client name) backed by a ttk.Treeview for instant rendering regardless of list size. Live search bar filters by reference or description as you type. Add and edit via modal popup dialog; delete with a confirmation prompt. Pre-populated with a default client list on first launch.
- About view — app version (read dynamically from the installed package)
- Manifest — SHA-256 deduplication: already-processed PDFs are skipped on subsequent runs; only recorded after successful Excel generation
- Accounting exercise counters — the Pièce and Document columns are handled separately:
  - Pièce: sequential counter managed by the engine. Continues across all months within the same accounting exercise (e.g. Nov → Oct), resets to 1 on the exercise start month. Configurable via Config → Exercice comptable. The one-time seed (id_piece in config.json) lets you continue from an existing number at first install.
  - Document: read directly from the invoice_id field embedded in each PDF (format YY-MM-XXXXX — the XXXXX part is used as the Document number). The engine does not manage or increment this counter; it is entirely controlled by EBP and reflected in the invoice PDF.
  - Re-running the engine for a past month never resets or conflicts with other months — each month stores its last-used Pièce counter and resumes correctly.
- Sub-folder support — the engine recurses into sub-folders when scanning for PDF invoices, so the client can organise invoices by month inside the source folder
- Invoice routing — six-way split per invoice:
  - No client reference — the parsed client name has no matching entry in the client registry → the invoice is skipped, added to result.no_ref_skipped, and a warning dialog lists the affected files. The user must add the client via the Client panel and re-run. The PDF is not recorded in the manifest.
  - Finalized month — the month's xlsx exists AND the following month's xlsx also exists → the invoice is blocked, added to result.late_skipped, and a warning dialog is shown. The PDF is not recorded in the manifest so the warning reappears on every run until the file is moved or handled manually in the accounting software.
  - Open month (patch) — the month's xlsx exists but the following month's xlsx does not yet exist → the invoice is appended to the existing workbook at the correct day sheet; the Document number comes from the invoice's invoice_id (XXXXX part) and the Pièce number continues sequentially from the month's last counter. No renumbering of existing rows.
  - Out-of-order month — the month has no xlsx yet, but a later month's xlsx already exists in the output folder → the invoice is blocked, added to result.order_blocked, and a distinct warning dialog is shown. The PDF is not recorded in the manifest.
  - Too-old month — the month has no xlsx and no doc counter, and the invoice date is from 2+ calendar years ago → almost certainly a stray file from a previous year. The invoice is skipped, added to result.too_old_skipped, and a warning dialog is shown. The PDF is not recorded in the manifest.
  - New month — no xlsx yet and no later month has been processed → a fresh workbook is generated.
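The six-way routing decision can be expressed as a small pure function. This is a sketch under assumptions: the names `Route` and `route_invoice`, the `(year, month)` tuple representation, and the order in which the rules are checked are hypothetical; the six outcomes and their conditions follow the list above.

```python
from enum import Enum

Month = tuple[int, int]  # (year, month), compared chronologically as a tuple


class Route(Enum):
    NO_REF = "no_ref_skipped"
    LATE = "late_skipped"
    PATCH = "patch"
    ORDER_BLOCKED = "order_blocked"
    TOO_OLD = "too_old_skipped"
    NEW = "new"


def next_month(m: Month) -> Month:
    year, month = m
    return (year + 1, 1) if month == 12 else (year, month + 1)


def route_invoice(month: Month, has_client_ref: bool, existing: set[Month],
                  has_counter: bool, current_year: int) -> Route:
    """existing = months that already have an xlsx in the output folder."""
    if not has_client_ref:
        return Route.NO_REF
    if month in existing:
        # Finalized when the following month's workbook exists, otherwise still open.
        return Route.LATE if next_month(month) in existing else Route.PATCH
    if any(other > month for other in existing):
        return Route.ORDER_BLOCKED
    # "2+ calendar years ago" is read here as a difference of calendar years.
    if not has_counter and current_year - month[0] >= 2:
        return Route.TOO_OLD
    return Route.NEW


existing = {(2025, 11), (2025, 12)}
print(route_invoice((2025, 12), True, existing, True, 2026))  # open month
```

Keeping the decision free of file I/O makes each branch easy to unit-test; the caller only has to list the workbooks already present in the output folder.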
ebia_ui/paths.py — centralised path configuration
All filesystem paths are defined in one place, derived from APP_DIR (the workspace):
APP_DIR = _resolve_app_dir() # local default or shared drive (see Workspace below)
REPORTS_DIR = Path.home() / "Documents" / "Equipe_Baie" / "Rapports"
ARCHIVES_DIR = Path.home() / "Documents" / "Equipe_Baie" / "factures_archives"
Monitoring
The app can send email notifications via Gmail when invoice runs complete or errors occur. Monitoring is opt-in — the app works normally without it.
Setup
Step 1 — Credentials (required for monitoring to activate)
On first launch the app creates ~/.equipe_baie/.env.example as a template (only if .env does not already exist). Create .env in the same folder and fill in your credentials:
cp ~/.equipe_baie/.env.example ~/.equipe_baie/.env
Edit ~/.equipe_baie/.env:
# Gmail — standard passwords are blocked; use an App Password instead:
# Google Account → Security → 2-Step Verification → App passwords
MAIL_SENDER=your-sender@gmail.com
MAIL_PASSWORD=xxxx-xxxx-xxxx-xxxx
# Primary recipient for all monitoring emails
MAIL_RECEIVER=notify@example.com
# Additional recipients for error alerts only (comma-separated, optional)
MAIL_RECEIVERS_ERROR=
# Display name used in email subjects (default: EquipeBaie)
APP_NAME=EquipeBaie
Restart the app after saving — monitoring activates automatically.
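As an illustration of how these variables might be consumed, the sketch below validates them and builds the recipient list for error alerts. The names `MailSettings` and `read_mail_settings` are hypothetical, and treating MAIL_SENDER, MAIL_PASSWORD, and MAIL_RECEIVER as the keys required for activation is an assumption. In practice the mapping would typically be `os.environ` after python-dotenv has loaded the file.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MailSettings:
    sender: str
    password: str
    receiver: str
    error_receivers: tuple[str, ...]
    app_name: str


def read_mail_settings(env: Mapping[str, str]) -> Optional[MailSettings]:
    """Return None when credentials are incomplete (monitoring stays off)."""
    required = [env.get(key, "").strip()
                for key in ("MAIL_SENDER", "MAIL_PASSWORD", "MAIL_RECEIVER")]
    if not all(required):
        return None
    sender, password, receiver = required
    # Error alerts go to the primary recipient plus the optional extra addresses.
    extra = [addr.strip()
             for addr in env.get("MAIL_RECEIVERS_ERROR", "").split(",")
             if addr.strip()]
    app_name = env.get("APP_NAME", "").strip() or "EquipeBaie"
    return MailSettings(sender, password, receiver, (receiver, *extra), app_name)


settings = read_mail_settings({
    "MAIL_SENDER": "your-sender@gmail.com",
    "MAIL_PASSWORD": "xxxx-xxxx-xxxx-xxxx",
    "MAIL_RECEIVER": "notify@example.com",
    "MAIL_RECEIVERS_ERROR": "dev@example.com, ops@example.com",
})
print(settings.error_receivers)
```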
Step 2 — Behaviour (optional, defaults are production-ready)
On first launch the app creates ~/.equipe_baie/monitoring.json with sensible defaults. Edit it directly to tune monitoring behaviour — changes are picked up on the next run, no restart needed:
{
"digest_frequency": "weekly",
"digest_weekday": 0,
"digest_time": "08:00",
"smtp_host": "smtp.gmail.com",
"smtp_port": 587,
"alert_cooldown_seconds": 300,
"log_retention_days": 30,
"notify_on_scheduler_success": true,
"notify_on_manual_error": true
}
| Key | Default | Description |
|---|---|---|
| digest_frequency | "weekly" | Digest schedule: "daily", "weekly", "monthly", or "none" |
| digest_weekday | 0 | Day for weekly digest: 0=Monday … 6=Sunday |
| digest_time | "08:00" | Fire time (HH:MM) for the digest task — independent of the main scheduler time |
| smtp_host | "smtp.gmail.com" | SMTP server host |
| smtp_port | 587 | SMTP port (STARTTLS) |
| alert_cooldown_seconds | 300 | Minimum seconds between consecutive error alert emails |
| log_retention_days | 30 | Run log files older than this are deleted on startup |
| notify_on_scheduler_success | true | Set to false to suppress success emails from scheduled runs |
| notify_on_manual_error | true | Set to false to suppress error alerts from manual UI runs |
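A minimal sketch of how such a file could be read on every run, with missing keys falling back to the defaults listed above. The function name `load_monitoring` is hypothetical, and ignoring unknown keys and falling back to defaults on an unreadable file are assumptions rather than documented behaviour.

```python
import json
import tempfile
from pathlib import Path

DEFAULTS = {
    "digest_frequency": "weekly",
    "digest_weekday": 0,
    "digest_time": "08:00",
    "smtp_host": "smtp.gmail.com",
    "smtp_port": 587,
    "alert_cooldown_seconds": 300,
    "log_retention_days": 30,
    "notify_on_scheduler_success": True,
    "notify_on_manual_error": True,
}


def load_monitoring(path: Path) -> dict:
    """Merge the user's file over the defaults; tolerate a missing or broken file."""
    try:
        user = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        user = {}
    if not isinstance(user, dict):
        user = {}
    # Keys that are not part of the known settings are dropped.
    return {**DEFAULTS, **{k: v for k, v in user.items() if k in DEFAULTS}}


with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "monitoring.json"
    cfg_path.write_text('{"digest_frequency": "daily", "typo_key": 1}', encoding="utf-8")
    loaded = load_monitoring(cfg_path)
    missing = load_monitoring(Path(tmp) / "absent.json")

print(loaded["digest_frequency"], loaded["smtp_port"])
```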
Tip: For the first 2 weeks after a new install, set "digest_frequency": "daily" for maximum visibility, then switch back to "weekly".
What gets sent and when
| Trigger | Outcome | Email sent |
|---|---|---|
| Scheduled run (Task Scheduler) | Success | ✅ Green summary (if notify_on_scheduler_success=true) |
| Scheduled run (Task Scheduler) | Error / partial | ✅ Red alert with first error message |
| Manual run (UI) | Success | ❌ No email (intentional) |
| Manual run (UI) | Error | ✅ Red alert (if notify_on_manual_error=true) |
| Digest | Per digest_frequency | ✅ Summary of runs in the past period |
| Live ERROR/CRITICAL log | Any session | ✅ Alert email (subject to alert_cooldown_seconds) |
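The per-run rows of the table and the alert cooldown can be summarised in code. This is a sketch with hypothetical names (`should_email`, `AlertThrottle`); treating any non-success outcome of a manual run as an error is an assumption, since the table only lists "Error" for manual runs.

```python
from typing import Optional


def should_email(trigger: str, outcome: str, cfg: dict) -> bool:
    """trigger: 'scheduler' or 'manual'; outcome: 'success', 'error' or 'partial'."""
    if trigger == "scheduler":
        if outcome == "success":
            return bool(cfg.get("notify_on_scheduler_success", True))
        return True  # errors and partial results from scheduled runs always alert
    if trigger == "manual":
        if outcome == "success":
            return False  # intentional: the user is already looking at the UI
        return bool(cfg.get("notify_on_manual_error", True))
    raise ValueError(f"unknown trigger: {trigger!r}")


class AlertThrottle:
    """Allow at most one alert per cooldown window (timestamps in seconds)."""

    def __init__(self, cooldown_seconds: float) -> None:
        self.cooldown = cooldown_seconds
        self._last: Optional[float] = None

    def allow(self, now: float) -> bool:
        if self._last is not None and now - self._last < self.cooldown:
            return False
        self._last = now
        return True


throttle = AlertThrottle(cooldown_seconds=300)
print(should_email("manual", "success", {}), throttle.allow(now=0.0))
```

Passing the clock value in as an argument, instead of calling `time.monotonic()` inside `allow`, keeps the throttle deterministic in tests.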
End-user notification email
Separate from the monitoring alerts (which go to the developer/operator), a per-run notification can be sent directly to the app end-user whenever invoices are ignored (wrong client reference, finalised month, out-of-order, etc.).
Set the recipient address in ~/.equipe_baie/config.json:
"notification_email": "utilisateur@example.com"
Leave the value empty ("") to disable. This email is sent automatically after every scheduled run that produces warnings, using the same SMTP credentials as the monitoring system. No notification is sent if monitoring SMTP is not configured.
Digest task
EquipeBaie_WeeklyReport is a dedicated Windows Task Scheduler entry that sends the monitoring digest. It is fully independent of the main scheduler — it is created automatically every time the app starts (whether or not the user has enabled the invoice scheduler), and it survives enabling/disabling the main task.
Its schedule is controlled entirely by monitoring.json:
| Setting | Controls |
|---|---|
| digest_frequency | "daily" / "weekly" / "monthly" / "none" |
| digest_weekday | Day of the week (weekly only) |
| digest_time | Fire time (HH:MM), e.g. "08:00" — not tied to the main scheduler time |
Setting "digest_frequency": "none" removes the digest task. Any other change to these three keys takes effect the next time the app is launched (the startup check re-registers if the task is missing or was manually deleted).
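For illustration, here is how the three keys could map onto a Windows `schtasks` command line. Whether the app registers the task through `schtasks` is not stated in this document, so this is an assumption, as are the function name `digest_task_args` and the executable path used in the example. The sketch only builds the argument list and does not run it; quoting of the `/TR` value may need adjusting for a real call.

```python
WEEKDAYS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]  # 0=Monday ... 6=Sunday


def digest_task_args(cfg: dict, exe_path: str,
                     task_name: str = "EquipeBaie_WeeklyReport") -> list[str]:
    """Translate monitoring.json digest settings into schtasks arguments."""
    frequency = cfg["digest_frequency"]
    if frequency == "none":
        # "none" removes the digest task.
        return ["schtasks", "/Delete", "/TN", task_name, "/F"]
    if frequency not in ("daily", "weekly", "monthly"):
        raise ValueError(f"unsupported digest_frequency: {frequency!r}")
    args = [
        "schtasks", "/Create", "/F",
        "/TN", task_name,
        "/TR", f'"{exe_path}" --weekly-report',
        "/SC", frequency.upper(),
        "/ST", cfg["digest_time"],
    ]
    if frequency == "weekly":
        # digest_weekday only applies to the weekly schedule.
        args += ["/D", WEEKDAYS[cfg["digest_weekday"]]]
    return args


weekly = digest_task_args(
    {"digest_frequency": "weekly", "digest_weekday": 0, "digest_time": "08:00"},
    r"C:\Program Files\EquipeBaie\EquipeBaie.exe",  # hypothetical install path
)
print(" ".join(weekly))
```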
Uninstall: the Windows uninstaller removes both EquipeBaie and EquipeBaie_WeeklyReport from Task Scheduler.
You can also trigger the digest manually:
EquipeBaie.exe --weekly-report
Machine identity in emails
Every monitoring email (both success reports and error alerts) includes a Machine field showing user@hostname — the Windows login name and computer name of the machine that sent the email. This makes it immediately clear which installation triggered the notification when the app runs on more than one machine.
Per-run log files
Every execution (scheduled or manual) creates a log file at:
~/.equipe_baie/logs/runs/RUN-YYYYMMDD-HHMMSS-{SCHEDULER|MANUAL}.log
Log files are cleaned up automatically on startup after log_retention_days days (default: 30).
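The naming scheme and the retention rule can be sketched as below. The helper names `run_log_name` and `purge_old_run_logs` are hypothetical, and reading a file's age from the timestamp in its name (rather than from its modification time) is an assumption.

```python
import tempfile
from datetime import datetime, timedelta
from pathlib import Path


def run_log_name(started: datetime, trigger: str) -> str:
    if trigger not in ("SCHEDULER", "MANUAL"):
        raise ValueError(f"unknown trigger: {trigger!r}")
    return f"RUN-{started:%Y%m%d-%H%M%S}-{trigger}.log"


def purge_old_run_logs(folder: Path, retention_days: int, now: datetime) -> list[Path]:
    """Delete RUN-*.log files whose embedded timestamp is older than the cutoff."""
    cutoff = now - timedelta(days=retention_days)
    removed = []
    for path in sorted(folder.glob("RUN-*.log")):
        parts = path.stem.split("-")  # RUN, YYYYMMDD, HHMMSS, TRIGGER
        try:
            stamp = datetime.strptime(f"{parts[1]}-{parts[2]}", "%Y%m%d-%H%M%S")
        except (IndexError, ValueError):
            continue  # leave files that do not follow the naming scheme
        if stamp < cutoff:
            path.unlink()
            removed.append(path)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / run_log_name(datetime(2025, 1, 1, 8, 0, 0), "SCHEDULER")).touch()
    (folder / run_log_name(datetime(2025, 3, 1, 8, 0, 0), "MANUAL")).touch()
    removed = purge_old_run_logs(folder, 30, now=datetime(2025, 3, 10))
    removed_names = [p.name for p in removed]
    remaining = sorted(p.name for p in folder.iterdir())

print(removed_names, remaining)
```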
Application data locations
| Data | Default location |
|---|---|
| Config (VAT rate, folder paths, scheduler, notification_email, etc.) | ~/.equipe_baie/config.json |
| Client reference registry | ~/.equipe_baie/clients.json |
| Processed-file manifest | ~/.equipe_baie/processed.json |
| Run history | ~/.equipe_baie/run_history.json (last 100 runs, all triggers) |
| Application logs | ~/.equipe_baie/logs/ebia-{machine-name}.log — one file per registered machine (5 MB × 3 rotating backups) |
| Per-run logs | ~/.equipe_baie/logs/runs/RUN-*.log (kept log_retention_days days, default 30) |
| Monitoring credentials | ~/.equipe_baie/.env (create from .env.example template to enable) |
| Monitoring behaviour | ~/.equipe_baie/monitoring.json (created automatically on first launch) |
| Generated Excel reports | ~/Documents/Equipe_Baie/Rapports/ (configurable in Config view) |
| Archived PDFs | ~/Documents/Equipe_Baie/factures_archives/ (configurable in Config view) |
| Workspace pointer | ~/.equipe_baie/workspace.txt (only present when using a shared workspace) |
On Windows ~ resolves to C:\Users\<user>, on Linux to /home/<user>, and on macOS to /Users/<user>.
Note: When a shared workspace is configured (see below), every ~/.equipe_baie/ path in the table above points into the shared folder instead. The workspace.txt file itself always stays on the local machine.
Accounting counters
The engine tracks one numbering sequence (Pièce) and persists it in config.json. Document numbers are not managed by the engine — they come directly from each PDF's invoice_id field.
| Key | Resets | Description |
|---|---|---|
| piece_counters | exercise_start_month each year | Per-month last-used Pièce number. Continues across all months of the accounting exercise and resets to 1 when the exercise restarts. |
| exercise_start_month | — | Month number (1–12) when the accounting exercise begins. Default: 11 (November). Change it in Config → Exercice comptable. |
| id_piece | — | One-time seed for the Pièce counter — use it if installing mid-exercise to continue from an existing number. Auto-resets to 1 after first use. |
Document numbers are extracted from the invoice_id embedded in each PDF (format YY-MM-XXXXX). The XXXXX part is used as-is as the Document number — exactly as EBP assigned it, with no modification by the engine. Invoices without a readable invoice_id are rejected.
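A minimal sketch of that extraction, assuming the invoice_id has already been read from the PDF as a string. The function name `document_number` is hypothetical, and requiring exactly two, two, and five digits is an assumption drawn from the YY-MM-XXXXX pattern.

```python
import re

# YY-MM-XXXXX: two-digit year, two-digit month, five-digit EBP document number.
INVOICE_ID = re.compile(r"(\d{2})-(\d{2})-(\d{5})")


def document_number(invoice_id: str) -> str:
    """Return the XXXXX part unchanged (leading zeros preserved)."""
    match = INVOICE_ID.fullmatch(invoice_id.strip())
    if match is None:
        raise ValueError(f"unreadable invoice_id: {invoice_id!r}")
    return match.group(3)


print(document_number("26-01-00042"))
```

Returning a string rather than an integer keeps the number exactly as EBP wrote it.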
Example — client exercise starts November 1:
exercise_start_month = 11
Nov 2025: Pièce 1, 2, 3 Doc read from invoice_id (e.g. 00001, 00002, 00003)
Dec 2025: Pièce 4, 5 Doc read from invoice_id (e.g. 00004, 00005)
Jan 2026: Pièce 6, 7 Doc read from invoice_id (EBP resets: 00001, 00002)
...
Oct 2026: Pièce N Doc from invoice_id
Nov 2026: Pièce 1, 2 ... Doc from invoice_id ← Pièce resets, Doc unchanged
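The carry-over in this example can be reproduced with a short sketch. The storage shape (a dictionary keyed by "YYYY-MM" holding each month's last-used Pièce) and the helper names are assumptions, and the one-time id_piece seed is left out for brevity.

```python
from typing import Iterator


def month_key(year: int, month: int) -> str:
    return f"{year:04d}-{month:02d}"


def back_to_exercise_start(year: int, month: int, start_month: int) -> Iterator[tuple[int, int]]:
    """Yield months from the given one back to the exercise start month, inclusive."""
    y, m = year, month
    while True:
        yield y, m
        if m == start_month:
            return
        m -= 1
        if m == 0:
            y, m = y - 1, 12


def next_piece(year: int, month: int, counters: dict[str, int], start_month: int = 11) -> int:
    # Resume from the most recent month of the same exercise that has a counter.
    for y, m in back_to_exercise_start(year, month, start_month):
        last = counters.get(month_key(y, m))
        if last is not None:
            return last + 1
    return 1  # nothing recorded yet in this exercise


def assign_pieces(year: int, month: int, count: int, counters: dict[str, int],
                  start_month: int = 11) -> list[int]:
    first = next_piece(year, month, counters, start_month)
    pieces = list(range(first, first + count))
    if pieces:
        counters[month_key(year, month)] = pieces[-1]
    return pieces


counters: dict[str, int] = {}
nov_2025 = assign_pieces(2025, 11, 3, counters)
dec_2025 = assign_pieces(2025, 12, 2, counters)
jan_2026 = assign_pieces(2026, 1, 2, counters)
nov_2026 = assign_pieces(2026, 11, 2, counters)
print(nov_2025, dec_2025, jan_2026, nov_2026)
```

Because the lookup never walks past the exercise start month, a counter from the previous exercise cannot leak into the new one.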
Multi-machine workspace (shared drive)
Two machines can share the same configuration, client registry, and processed-file manifest by pointing both installations to a single folder on a shared drive.
Prerequisites: the two machines must never run the app simultaneously, and the shared drive must be reachable before launching the app.
Setup — machine 1 (existing installation)
- Open Config → Dossier de travail → click Changer
- Pick the shared drive folder (e.g. Z:\EquipeBaie\)
- The app copies your existing data (config, clients, manifest, .env, monitoring.json) to the shared folder and writes a local workspace.txt pointer
- Confirm the restart prompt — the app reloads from the shared folder
Setup — machine 2 (fresh installation)
- Install the app normally
- Open Config → Dossier de travail → click Changer
- Pick the same shared drive folder (Z:\EquipeBaie\)
- The app writes workspace.txt locally; existing files in the shared folder are not overwritten
- Confirm the restart prompt
Both machines now read and write the same config.json, clients.json, processed.json, and monitoring files. The workspace pointer (workspace.txt) is machine-local and is the only file that differs between the two machines.
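The pointer mechanism can be sketched as follows. This is not the actual `_resolve_app_dir()` from paths.py: the function name `resolve_app_dir`, the `home` parameter, and the fallback to the local folder when the shared folder is unreachable are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def resolve_app_dir(home: Path) -> Path:
    """Return the shared workspace if workspace.txt points to one, else the local folder."""
    local = home / ".equipe_baie"
    pointer = local / "workspace.txt"
    try:
        target = pointer.read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return local  # no pointer: plain single-machine installation
    if target and Path(target).is_dir():
        return Path(target)
    return local  # assumed fallback when the shared folder cannot be reached


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "home"
    shared = Path(tmp) / "shared"
    (home / ".equipe_baie").mkdir(parents=True)
    shared.mkdir()
    before = resolve_app_dir(home)
    (home / ".equipe_baie" / "workspace.txt").write_text(str(shared), encoding="utf-8")
    after = resolve_app_dir(home)
    expected_local, expected_shared = home / ".equipe_baie", shared

print(before.name, after.name)
```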
Installation
Windows — client machines
Download EquipeBaie_Setup.exe from the latest GitHub Release.
Before running the installer, unblock the downloaded file to prevent Windows SmartScreen from prompting. Open PowerShell and run:
Unblock-File -Path "$env:USERPROFILE\Downloads\EquipeBaie_Setup.exe"
Then double-click EquipeBaie_Setup.exe. The installer will:
- Install the application to Program Files\EquipeBaie
- Silently import the code-signing certificate into the Windows Trusted Root store
- Create a Start Menu shortcut (and optionally a Desktop shortcut)
- Register an uninstaller in Programs & Features
After the first installation, all future updates signed with the same certificate will run without any SmartScreen warning — no Unblock-File step needed again.
Library only (CLI usage)
pip install ebia
Development (library + UI + tests)
git clone https://github.com/Alamajdoub9/EquipeBaie_Freelance-project.git
cd EquipeBaie_Freelance-project
python -m venv .venv
source .venv/bin/activate # Linux / macOS
# .venv\Scripts\Activate.ps1 # Windows PowerShell
pip install -e ".[dev,app]"
Running the application
ebia-ui
# or
python -m ebia_ui.main
On first launch:
- Go to Configurations and set the invoice folder path (where your PDFs are stored)
- Optionally adjust the VAT rate, the exercise start month, the reports output folder, and the archive folder
- Go to Clients to review the pre-loaded client list and add or edit entries as needed
- Go to Exécution and click Lancer maintenant
Generated Excel files are written to the reports folder configured in the Config view (default: ~/Documents/Equipe_Baie/Rapports/).
To archive processed PDFs after a run, go to Configurations → Archivage des Factures and click Archiver les factures traitées. Only PDFs already recorded in the manifest are moved. After archiving, any sub-folders left empty inside the invoice folder are removed automatically.
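The archive step described above can be sketched like this. The function names, the representation of the manifest as a set of SHA-256 hex digests, and the choice to mirror the sub-folder layout inside the archive are assumptions; the sketch also assumes the archive folder is not inside the invoice folder.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_processed(invoice_dir: Path, archive_dir: Path, processed: set[str]) -> list[Path]:
    """Move PDFs whose hash is in the manifest, then drop sub-folders left empty."""
    moved = []
    for pdf in sorted(invoice_dir.rglob("*.pdf")):
        if sha256_of(pdf) in processed:
            dest = archive_dir / pdf.relative_to(invoice_dir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(pdf), str(dest))
            moved.append(dest)
    # Deepest folders first, so a parent emptied by its children is removed too.
    subdirs = sorted((p for p in invoice_dir.rglob("*") if p.is_dir()),
                     key=lambda p: len(p.parts), reverse=True)
    for sub in subdirs:
        if not any(sub.iterdir()):
            sub.rmdir()
    return moved


with tempfile.TemporaryDirectory() as tmp:
    invoices, archive = Path(tmp) / "invoices", Path(tmp) / "archive"
    (invoices / "2025-11").mkdir(parents=True)
    (invoices / "2025-12").mkdir()
    done, pending = invoices / "2025-11" / "a.pdf", invoices / "2025-12" / "b.pdf"
    done.write_bytes(b"processed invoice")
    pending.write_bytes(b"new invoice")
    moved = archive_processed(invoices, archive, {sha256_of(done)})
    moved_names = [p.relative_to(archive).as_posix() for p in moved]
    left = sorted(p.relative_to(invoices).as_posix() for p in invoices.rglob("*"))

print(moved_names, left)
```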
CLI usage (library only)
# Parse a single PDF and print extracted fields
ebia --path facture.pdf
# Generate Excel from a single PDF (explicit piece and document numbers)
ebia --path facture.pdf --out output.xlsx --piece 0001 --document 00042
# Generate Excel from a folder of PDFs (doc numbers read from each invoice_id)
ebia --path ./invoices --out ./reports --start-piece 1
Running tests
# All tests
pytest
# By level
pytest -m unit
pytest -m integration
pytest -m e2e
# With coverage
pytest --cov=src/ebia --cov=src/ebia_ui/core --cov-report=term-missing
Test matrix:
| Suite | What it covers |
|---|---|
| unit/test_parser.py | PDF field extraction logic, parse_many contract (3-tuple return, log records) |
| unit/test_generator.py | Excel row generation and formatting |
| unit/test_manifest.py | SHA-256 manifest: load, save, deduplication |
| unit/test_ebia_ui_core.py | ConfigManager, ClientManager, RunEngine error paths and counter carry-over logic |
| integration/test_engine.py | Full pipeline: PDFs → parse → enrich → xlsx |
| integration/test_generator_xlsx.py | Multi-month/multi-day workbook structure |
| integration/test_parser_pdf.py | Real PDF corpus parsing |
| e2e/test_cli.py | CLI invoked as subprocess |
Delivery
The project has two deliverables released together by the same workflow.
1. ebia PyPI package
pip install ebia
Published automatically on every release. The ebia_ui package is excluded from the wheel (internal app, not a public library).
2. Windows installer (EquipeBaie_Setup.exe)
Built via PyInstaller (onedir) + Inno Setup on a Windows runner and attached to the GitHub Release alongside the wheel. No Python installation required on the target machine.
# Build the app folder locally (requires PyInstaller)
python scripts/build_exe.py
# produces: dist/EquipeBaie/EquipeBaie.exe
# Build the installer locally (requires Inno Setup installed)
iscc /DMyAppVersion=x.y.z installer\EquipeBaie.iss
# produces: dist/EquipeBaie_Setup.exe
Code signing
The Windows installer and executable are signed with a self-signed certificate (installer/equipebaie.cer). The certificate is automatically imported into the client machine's Trusted Root store during installation, so no SmartScreen warnings appear on subsequent runs.
The private key (.pfx) is stored as a GitHub Actions secret (CODESIGN_PFX_B64) and never committed to the repository. The public certificate (.cer) is committed and bundled into the installer.
To regenerate the certificate (e.g. after expiry in 2031):
- Run New-SelfSignedCertificate as described in the signing setup guide
- Export the new .pfx and .cer
- Update the CODESIGN_PFX_B64 and CODESIGN_PFX_PASSWORD GitHub secrets
- Replace installer/equipebaie.cer with the new .cer
- Re-run the one-time Import-Certificate step on each client machine
Versioning and release
Both deliverables share a single version number defined in pyproject.toml. The About view reads it dynamically via importlib.metadata, so no manual update is needed.
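Reading the installed version with importlib.metadata looks roughly like this. The helper name `app_version` and the "unknown" fallback are assumptions; the distribution name `ebia` is the one used with pip in this document.

```python
from importlib.metadata import PackageNotFoundError, version


def app_version(dist_name: str = "ebia", fallback: str = "unknown") -> str:
    """Return the installed distribution's version, or a fallback when not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Happens when running from a source checkout that was never pip-installed.
        return fallback


print(app_version())
```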
To publish a new release, trigger the Release workflow from the GitHub Actions UI and pick the bump type:
| Input | Effect | Example |
|---|---|---|
| patch (default) | Bug fixes | 0.2.0 → 0.2.1 |
| minor | New features, backwards-compatible | 0.2.0 → 0.3.0 |
| major | Breaking changes | 0.2.0 → 1.0.0 |
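The effect of each bump type in the table corresponds to the usual semantic-versioning rule, sketched below. The function `bump` is hypothetical and is not taken from the release workflow; it assumes a plain X.Y.Z version string.

```python
def bump(current: str, part: str = "patch") -> str:
    """Increment one component of an X.Y.Z version and reset the lower ones."""
    major, minor, patch = (int(piece) for piece in current.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump type: {part!r}")


for kind in ("patch", "minor", "major"):
    print(kind, bump("0.2.0", kind))
```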
The workflow will:
- Bump the version in pyproject.toml and push a chore: bump version to X.Y.Z commit to main
- Run the full test suite (gate)
- Build and publish the wheel to PyPI
- Create a GitHub Release with the wheel attached
- Build EquipeBaie/ (onedir) with PyInstaller on a Windows runner
- Sign EquipeBaie.exe with the code-signing certificate
- Build EquipeBaie_Setup.exe with Inno Setup
- Sign the installer
- Attach EquipeBaie_Setup.exe to the GitHub Release
Requirements
- Python >= 3.12
- Library deps: pdfplumber, openpyxl
- UI extra deps: customtkinter >= 5.2.0, Pillow >= 10.0.0, watchdog >= 4.0, python-dotenv >= 1.0
- Dev deps: pytest, pytest-cov, pytest-mock, ruff, mypy

watchdog and python-dotenv are bundled in the Windows installer. They are only active when ~/.equipe_baie/.env is present and correctly filled. Monitoring behaviour is controlled by ~/.equipe_baie/monitoring.json, which is created automatically on first launch.