CardQL — chat with your credit card statements. Fetch statements, normalize, export to CSV/SQLite, and query with a local LLM (Streamlit + Ollama).
CardQL
Chat with your credit card statements!
CardQL pulls credit card statement PDFs from your email (via IMAP), turns them into a single CSV and SQLite database, then runs a local text-to-SQL loop: small Qwen models through Ollama, with a Streamlit chat UI. No cloud API is required for parsing or querying.
What it does (three parts)
- Fetch: Connect to your mailbox over IMAP, match sender rules, and download password-protected PDFs into data/raw-pdfs/<bank>/<card>/.
- Parse and export: Extract text, run bank-specific parsers (regex and layout heuristics), normalize rows, and merge to data/exports/master.csv and transactions.sqlite.
- Query: Natural language → SQL with validation, LangChain + Ollama, answers grounded on query results; or open a raw sqlite3 shell with cardql sql.
The project touches the messy parts of real-world data work: inconsistent email subjects and attachments, PDF engines that reflow tables differently per issuer, one-shot prompting for tiny models (tight JSON/SQL output, retries), and SQL verification, so you are not trusting free-form arithmetic from the LLM.
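To make the SQL verification idea concrete, here is a minimal sketch of one way to vet model-generated SQL before running it. It is not CardQL's actual validator (see docs/LLM_QUERY.md for that). The function name validate_select and the transactions table with its columns are assumptions made up for this example.

```python
import sqlite3


def validate_select(conn: sqlite3.Connection, sql: str) -> str:
    """Return the statement if it is a single SELECT that compiles against the schema."""
    stmt = sql.strip().rstrip(";").strip()
    # Conservative check: reject any remaining semicolon, even inside a string literal.
    if ";" in stmt:
        raise ValueError("multiple statements are not allowed")
    if not stmt.lower().startswith("select"):
        raise ValueError("only SELECT queries are allowed")
    try:
        # EXPLAIN compiles the query against the real schema without executing it,
        # so unknown tables or columns are caught here.
        conn.execute(f"EXPLAIN {stmt}")
    except sqlite3.Error as exc:
        raise ValueError(f"SQL does not compile: {exc}") from exc
    return stmt


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (txn_date TEXT, merchant TEXT, amount REAL)")
conn.execute("INSERT INTO transactions VALUES ('2024-03-12', 'GROCER', 250.0)")

checked = validate_select(conn, "SELECT SUM(amount) FROM transactions;")
total = conn.execute(checked).fetchone()[0]
```

The arithmetic is done by SQLite rather than by the model, which is the point of grounding answers on query results. A rejected statement can be fed back to the model as the error message for a retry.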
This runs inference locally with Qwen3.5-0.8B, Qwen3.5-4B, or Qwen3-Coder-30B (via Ollama) — models you can run completely on a laptop. I recommend the Qwen3.5-4B preset for balanced performance.
Stack
Python · pypdf · SQLite · LangChain · Ollama · Streamlit · local-first / on-device inference
Streamlit UI
Quick start
pip install cardql
cardql init
Edit the templates under .local/config/ — at least secrets.json and card_rules.json. See docs/CONFIG.md for the full reference (mailbox setup: docs/IMAP_SETUP.md).
cardql
That default command fetches (if IMAP is configured), normalizes PDFs, writes master CSV and SQLite, and opens the Streamlit UI (after Ollama setup). Flags include --force, -o (master CSV path), --no-open, --no-fetch, --skip-ollama, and --no-ui.
For a slower walkthrough, see docs/USER_GUIDE.md.
Commands
| Command | What it does |
|---|---|
| cardql | Full stack: init, optional IMAP fetch, normalize, master.csv + transactions.sqlite, open CSV, Ollama, Streamlit. |
| cardql init | Create .local/ and data/, write config templates (never overwrites existing files). |
| cardql fetch | Fetch new statement PDFs via IMAP into data/raw-pdfs/. |
| cardql parse | Normalize PDFs, merge to master.csv + SQLite, open CSV (--no-open to skip). Optional single PDF; with -o out.json writes JSON only for that file. |
| cardql ollama | Ensure Ollama is reachable and pull the default chat model. |
| cardql ui | Streamlit NL chat over transactions.sqlite. |
| cardql sql | Interactive sqlite3 on the transactions DB (default data/exports/transactions.sqlite). |
| cardql check | Run validation checks (month gaps between statements; more later). cardql check --gaps narrows to the gap check. |
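As an illustration of what a month-gap check can look like, the sketch below lists the calendar months missing between the earliest and latest statement. It is an assumption about the general idea, not the code behind cardql check; the function name month_gaps and its input shape are hypothetical.

```python
from datetime import date


def month_gaps(statement_dates: list[date]) -> list[str]:
    """Return YYYY-MM labels missing between the earliest and latest statement."""
    if not statement_dates:
        return []
    seen = {(d.year, d.month) for d in statement_dates}
    (year, month), last = min(seen), max(seen)
    missing = []
    while (year, month) < last:
        if (year, month) not in seen:
            missing.append(f"{year:04d}-{month:02d}")
        # Advance one calendar month, rolling over after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return missing


gaps = month_gaps([date(2023, 11, 5), date(2024, 2, 5), date(2023, 12, 5)])
```

Here the January 2024 statement is absent, so gaps holds a single entry for that month.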
Logging: CARDQL_LOG=DEBUG cardql for verbose output.
Environment: variables use the CARDQL_ prefix (e.g. CARDQL_OLLAMA_MODEL, CARDQL_OLLAMA_BASE_URL).
Documentation
| Doc | Why open it |
|---|---|
| docs/ARCHITECTURE.md | Pipeline, folders, bare cardql steps. |
| docs/LLM_QUERY.md | NL→SQL, retries, Qwen/Ollama, safety. |
| docs/PDF_PARSING.md | How parsers work; adding a bank or variant. |
| docs/CONFIG.md | card_rules.json, tags, optional app.json. |
| docs/IMAP_SETUP.md | Mailbox credentials and provider notes. |
| docs/USER_GUIDE.md | Short non-developer path. |
Email (IMAP)
CardQL fetches attachments over IMAP and tracks state under .local/state/. Host, folder, and credentials live in .local/config/ — see docs/IMAP_SETUP.md for provider-specific setup (app passwords, etc.).
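The sender-matching step can be sketched with the standard library alone. This is a simplified illustration, not CardQL's fetch code: the function name pdf_attachments, the example address, and the fallback filename are all assumptions.

```python
from email.message import EmailMessage
from email.utils import parseaddr


def pdf_attachments(msg: EmailMessage, allowed_senders: set[str]) -> list[tuple[str, bytes]]:
    """Return (filename, content) for each PDF attachment if the sender is allowed."""
    sender = parseaddr(msg.get("From", ""))[1].lower()
    if sender not in allowed_senders:
        return []
    found = []
    for part in msg.iter_attachments():
        if part.get_content_type() == "application/pdf":
            # Fallback filename is a made-up default for attachments without one.
            found.append((part.get_filename() or "statement.pdf", part.get_content()))
    return found


# Build a message in memory so the example runs without a mailbox.
msg = EmailMessage()
msg["From"] = "Example Bank <statements@bank.example>"
msg["Subject"] = "Your card statement"
msg.set_content("Statement attached.")
msg.add_attachment(b"%PDF-1.4 demo", maintype="application", subtype="pdf", filename="mar.pdf")

matched = pdf_attachments(msg, {"statements@bank.example"})
ignored = pdf_attachments(msg, {"someone.else@bank.example"})
```

With a real mailbox, the raw bytes returned by imaplib would first be parsed with email.message_from_bytes(raw, policy=email.policy.default), which yields the same EmailMessage type used here.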
Banks and cards
Configure .local/config/card_rules.json — one block per bank/card:
- bank, card: stored under data/raw-pdfs/<bank>/<card>/
- from_emails: sender addresses to match
- passwords: PDF passwords
Details: docs/CONFIG.md.
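For orientation, here is a hedged sketch of how such rules could be loaded and used. The four field names come from the list above, but the surrounding JSON shape (a top-level list of objects), the example values, and the helpers target_dir and rule_for_sender are assumptions; docs/CONFIG.md remains the authoritative reference.

```python
import json
from pathlib import Path

# Hypothetical rules content; the real schema is described in docs/CONFIG.md.
RULES_JSON = """
[
  {
    "bank": "examplebank",
    "card": "platinum",
    "from_emails": ["statements@bank.example"],
    "passwords": ["<pdf-password>"]
  }
]
"""


def target_dir(rule: dict, root: Path = Path("data/raw-pdfs")) -> Path:
    """Directory where PDFs for this bank/card would be stored."""
    return root / rule["bank"] / rule["card"]


def rule_for_sender(rules: list[dict], sender: str):
    """Return the first rule whose from_emails contains the sender, else None."""
    sender = sender.lower()
    for rule in rules:
        if sender in (addr.lower() for addr in rule["from_emails"]):
            return rule
    return None


rules = json.loads(RULES_JSON)
match = rule_for_sender(rules, "Statements@bank.example")
```

Matching is case-insensitive here because mail servers often vary the capitalization of sender addresses.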
Parsers
Parsers live in src/cardql/parsers/banks/. Supported issuers today include Axis, HDFC, HSBC, ICICI, IndusInd, SBI. Adding another card or fixing a format: docs/PDF_PARSING.md.
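To give a feel for what a regex-based row parser involves, here is a small sketch. The row layout, the Cr/Dr suffix, the sign convention (credits negative), and the output keys are all invented for this example; real issuer layouts differ, and the actual parser interface is described in docs/PDF_PARSING.md.

```python
import re
from datetime import datetime
from decimal import Decimal

# Hypothetical row layout: "12/03/2024 AMAZON PAY INDIA 1,234.56 Dr"
ROW = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4})\s+"
    r"(?P<desc>.+?)\s+"
    r"(?P<amount>[\d,]+\.\d{2})\s*"
    r"(?P<kind>Cr|Dr)?$"
)


def parse_row(line: str):
    """Parse one extracted text line into a normalized dict, or None if it is not a row."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    amount = Decimal(m["amount"].replace(",", ""))
    return {
        "date": datetime.strptime(m["date"], "%d/%m/%Y").date().isoformat(),
        "description": m["desc"],
        # Assumed convention: credits are stored as negative amounts.
        "amount": -amount if m["kind"] == "Cr" else amount,
    }


purchase = parse_row("12/03/2024 AMAZON PAY INDIA 1,234.56 Dr")
payment = parse_row("05/03/2024 PAYMENT RECEIVED 5,000.00 Cr")
noise = parse_row("Page 1 of 3")
```

Returning None for non-matching lines lets a parser skip headers, footers, and page markers without special cases. Decimal is used instead of float so amounts keep their exact two-decimal values.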
Help improve CardQL
If your issuer bank is not covered or a statement layout breaks, a focused parser patch helps everyone. Open a pull request with a parser or tests against redacted samples, or open an issue to coordinate with the maintainers. Convention details live in CONTRIBUTING.md.
Security and privacy
Your statements and spend history are sensitive. CardQL keeps parsing and inference on your machine: PDFs and transactions.sqlite stay local; the NL path uses a small local model via Ollama so you are not uploading statement bytes to a third-party chat API. That is a different tradeoff than “send PDFs to a hosted LLM or document AI” — fewer moving parts, and no vendor fine-tuning on your data.
.local/ and data/ are gitignored by design. Do not commit credentials, PDFs, or databases.
Project layout
- data/raw-pdfs/<bank>/<card>/: downloaded PDFs
- data/normalized/<bank>/<card>/: per-statement JSON
- data/exports/master.csv, transactions.sqlite: merged data for CSV tools and SQL
- .local/config/: secrets and rules
- src/cardql/: ingest/, parsers/, export/, query/, ui/, cli/
Architecture: docs/ARCHITECTURE.md · Contributing: CONTRIBUTING.md