Synthetic financial transaction data generation with persona-driven behavior simulation.
FinForge v2.0.0
FinForge is a Python library for generating realistic synthetic financial transaction datasets with persistent behavioral identity, temporal balance consistency, and reproducible cashflow simulation.
FinForge v2.0.0 expands the engine beyond student and salaried users into richer financial lives: business owners, freelancers, households, retired users, mixed-population datasets, irregular income, business cashflow, and quarterly tax activity.
Why FinForge v2.0.0 is different
FinForge is designed to simulate financial lives, not random rows.
- Persistent user identity: users carry stable behavioral traits such as spending_style, merchant_loyalty, savings_tendency, and night_activity_score.
- Temporal financial rhythm: balances evolve chronologically across salaries, subscriptions, tax, business income, bills, and discretionary spending.
- Realistic behavioral adaptation: low-balance users suppress discretionary activity, while high-liquidity users spend more freely without becoming unrealistic.
- Mixed real-world personas: v2 includes consumer, household, freelance, retirement, and business cashflow behavior in one framework.
- Reproducible synthetic data: the same seed and config generate the same dataset, which makes FinForge useful for testing, QA, analytics, and benchmarking.
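The reproducibility property can be illustrated without FinForge's internals. The sketch below is not FinForge code: generate_rows and fingerprint are hypothetical helpers, and drawing everything from one private random.Random(seed) instance is simply a common way to make a generator deterministic.

```python
import hashlib
import json
import random


def generate_rows(seed: int, n: int = 5) -> list[dict]:
    """Hypothetical stand-in generator: all randomness comes from one seeded RNG."""
    rng = random.Random(seed)  # private RNG, independent of global random state
    return [
        {"user_id": i, "amount": round(rng.uniform(-500.0, 500.0), 2)}
        for i in range(n)
    ]


def fingerprint(rows: list[dict]) -> str:
    """Stable hash of a dataset, handy for regression tests."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


first = fingerprint(generate_rows(seed=42))
second = fingerprint(generate_rows(seed=42))
other = fingerprint(generate_rows(seed=43))
```

Comparing fingerprints of two runs with the same seed and config is a cheap regression check; a different seed should normally produce a different fingerprint.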
FinForge v2.0.0: Business & Irregular Income Simulation
New in v2:
- business_owner persona with business income, vendor payments, payroll, office rent, professional services, business travel, tax, and personal spending
- freelancer persona with irregular client/platform income and software/professional expenses
- household persona with family-oriented groceries, healthcare, education, insurance, and utility behavior
- retired persona with pension income, healthcare-heavy spending, and low discretionary intensity
- mixed persona mode for heterogeneous datasets
- irregular income engine with variable dates, amounts, and sources
- business cashflow engine with seasonal business income and quarterly tax payments
- business vs personal account simulation and flags
- expanded exported metadata for downstream testing and scenario analysis
Features
- Persona-driven user generation
- Persistent behavioral identity traits
- Deterministic seed reproducibility
- Balance-aware spending suppression
- Session-based transaction bursts
- Stable subscription recurrence
- Explicit overdraft metadata
- Merchant/category consistency
- Mixed persona simulation
- Irregular income generation
- Business cashflow and seasonality
- Quarterly tax events
- CSV export and pandas DataFrame output
Installation
pip install finforge
For local development:
pip install -e ".[dev]"
Quickstart
from finforge import DatasetGenerator

df = (
    DatasetGenerator(seed=42)
    .with_users(100)
    .with_persona("salaried")
    .for_months(6)
    .generate()
)
print(df.head())
Business owner example:
from finforge import DatasetGenerator

df = (
    DatasetGenerator(seed=42)
    .with_users(5)
    .with_persona("business_owner")
    .for_months(6)
    .generate()
)
Mixed persona example:
from finforge import DatasetGenerator

df = (
    DatasetGenerator(seed=42)
    .with_users(100)
    .with_persona("mixed")
    .for_months(6)
    .generate()
)
Mixed mode includes all supported v2 personas:
- student
- salaried
- freelancer
- business_owner
- household
- retired
When user_count is at least the number of supported personas, FinForge guarantees at least one user per persona. Remaining users are assigned with a deterministic weighted distribution driven by the configured seed.
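The assignment rule described above can be sketched as follows. This is an illustration, not FinForge's implementation: assign_personas and the WEIGHTS values are hypothetical, and only the persona names come from this README.

```python
import random

PERSONAS = ["student", "salaried", "freelancer", "business_owner", "household", "retired"]

# Hypothetical weights for illustration only; FinForge's real weights are not documented here.
WEIGHTS = [0.15, 0.35, 0.15, 0.10, 0.15, 0.10]


def assign_personas(user_count: int, seed: int) -> list[str]:
    """Assign one persona per user, deterministically for a given seed."""
    rng = random.Random(seed)
    assigned: list[str] = []
    if user_count >= len(PERSONAS):
        assigned.extend(PERSONAS)  # coverage guarantee: one user per persona
    remaining = user_count - len(assigned)
    # Fill the rest with a seeded weighted draw (returns [] when remaining is 0).
    assigned.extend(rng.choices(PERSONAS, weights=WEIGHTS, k=remaining))
    rng.shuffle(assigned)  # avoid the first six users always following list order
    return assigned


personas = assign_personas(user_count=100, seed=42)
```

With fewer users than personas, this sketch skips the coverage step and falls back to the weighted draw alone.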
Student behavioral example with CSV export:
from finforge import DatasetGenerator

dataset = (
    DatasetGenerator(seed=101)
    .with_users(3)
    .with_persona("student")
    .for_months(2)
    .generate()
)
dataset.to_csv("transactionsBehaviour.csv", index=False)
Supported personas
- student
- salaried
- freelancer
- business_owner
- household
- retired
- mixed
Architecture overview
Core modules:
- finforge.core: models, enums, configuration, constants
- finforge.personas: persona definitions and recurring behavior
- finforge.generators: user generation, scheduling, transaction generation
- finforge.merchants: consumer and business merchant catalogs
- finforge.exporters: CSV export
- finforge.dataset: fluent public API
Behavior modules:
- identity.py: long-lived user behavioral traits
- merchant_affinity.py: preferred merchants and weighted reuse
- adaptive_spending.py: balance-aware and month-phase-aware spending controls
- budgeting.py: overspend memory and discretionary budget state
- subscriptions.py: dedicated subscription assignment
- overdraft.py: explicit negative-balance policy decisions
- lifecycle.py: irregular income, household flows, business cashflow, and quarterly tax activity
- sessions.py: clustered transaction sessions
LLM-related runtime behavior is intentionally not implemented. Any future AI extension is expected to remain compatible with a local, Ollama-only architecture.
Behavioral identity engine
Every user exports stable identity metadata:
- persona
- spending_style
- savings_tendency
- merchant_loyalty
- impulse_buying_score
- lifestyle_score
- night_activity_score
These fields are not cosmetic. They directly influence:
- transaction frequency
- merchant reuse
- late-night behavior
- session probability
- discretionary suppression
- category mix
Spending styles
FinForge uses four reusable spending styles across personas:
- budget_conscious
- lifestyle_spender
- minimalist
- impulsive_student
Expected effects:
- minimalist: fewest transactions, essentials-heavy, fewer sessions
- budget_conscious: restrained discretionary behavior and strong balance sensitivity
- lifestyle_spender: higher food, shopping, entertainment, and weekend intensity
- impulsive_student: burstier activity, more late-night behavior, weaker spending discipline
Subscription engine
Subscriptions are handled by a dedicated recurring system.
- Subscription merchants are separate from discretionary entertainment merchants
- Assigned subscriptions recur exactly once per month
- Subscription merchant and amount remain stable across months
- Subscription rows are marked with is_subscription=True
- Subscription rows are never session-linked discretionary entertainment noise
Supported subscription merchants include:
- Netflix
- Spotify
- Amazon Prime
- YouTube Premium
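The recurrence rules above (one charge per month, stable merchant and amount) can be sketched as below. The helper names, price range, and billing-day rule are hypothetical illustration choices, not FinForge's API. Debits are written as negative amounts, which is an assumption about the sign convention.

```python
import random
from datetime import date

SUBSCRIPTION_MERCHANTS = ["Netflix", "Spotify", "Amazon Prime", "YouTube Premium"]


def assign_subscription(rng: random.Random) -> dict:
    """Pick merchant, amount, and billing day once; they stay fixed afterwards."""
    return {
        "merchant": rng.choice(SUBSCRIPTION_MERCHANTS),
        "amount": -round(rng.uniform(5.0, 20.0), 2),  # hypothetical price range; negative = debit
        "billing_day": rng.randint(1, 28),  # capped at 28 so every month has this day
    }


def subscription_rows(sub: dict, start_year: int, start_month: int, months: int) -> list[dict]:
    """Emit exactly one row per calendar month, reusing the assigned values."""
    rows = []
    for offset in range(months):
        month_index = (start_month - 1) + offset
        year = start_year + month_index // 12
        month = month_index % 12 + 1
        rows.append({
            "date": date(year, month, sub["billing_day"]),
            "merchant": sub["merchant"],
            "amount": sub["amount"],
            "is_subscription": True,
        })
    return rows


sub = assign_subscription(random.Random(7))
rows = subscription_rows(sub, start_year=2024, start_month=11, months=4)
```

Because the merchant and amount are drawn once at assignment time rather than once per month, stability across months follows from the structure rather than from any later correction step.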
Balance-aware spending and overdrafts
FinForge does not generate independent random balances.
Each transaction updates the running balance chronologically:
balance_before + amount = balance_after
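A small checker makes the identity concrete. This is a sketch under assumptions: the row keys mirror the identity above, the helpers are hypothetical, and Decimal is used here only to keep the comparison exact. For float columns in an exported DataFrame, a tolerance-based comparison would be more appropriate.

```python
from decimal import Decimal


def apply_transactions(opening_balance: Decimal, amounts: list[Decimal]) -> list[dict]:
    """Build rows in chronological order, carrying the balance forward."""
    rows = []
    balance = opening_balance
    for amount in amounts:
        rows.append({
            "balance_before": balance,
            "amount": amount,
            "balance_after": balance + amount,
        })
        balance += amount
    return rows


def check_balance_integrity(rows: list[dict]) -> bool:
    """True when every row satisfies the identity and chains to the next row."""
    for i, row in enumerate(rows):
        if row["balance_before"] + row["amount"] != row["balance_after"]:
            return False
        if i > 0 and rows[i - 1]["balance_after"] != row["balance_before"]:
            return False
    return True


rows = apply_transactions(
    Decimal("1000.00"),
    [Decimal("2500.00"), Decimal("-15.99"), Decimal("-420.50")],
)
```

The second condition in check_balance_integrity is what distinguishes a chronological running balance from rows that are merely self-consistent one at a time.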
Behavior adapts to financial condition:
- low balance reduces discretionary probability and ticket sizes
- month-end stress suppresses entertainment and shopping
- overspending creates future pullback pressure
- overdrafts are either prevented or explicitly marked
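One way to picture these adaptations is as multipliers applied to a base purchase probability. The function below is a hypothetical sketch: its name, thresholds, and multipliers are illustration values and are not taken from FinForge. It covers probability only, not ticket sizes.

```python
def discretionary_probability(
    base_probability: float,
    balance: float,
    comfort_balance: float,
    days_left_in_month: int,
    recent_overspend: float = 0.0,
) -> float:
    """Scale a base purchase probability down under financial stress.

    All thresholds and multipliers are hypothetical illustration values.
    comfort_balance must be positive.
    """
    # Low balance: scale linearly up to the comfort level, with a 10% floor.
    liquidity = max(0.1, min(1.0, balance / comfort_balance))
    # Month-end stress: the last five days get a flat 30% reduction.
    month_end = 0.7 if days_left_in_month <= 5 else 1.0
    # Pullback: earlier overspending (as a fraction of budget) dampens spending.
    pullback = 1.0 / (1.0 + max(0.0, recent_overspend))
    return base_probability * liquidity * month_end * pullback


relaxed = discretionary_probability(
    0.4, balance=5000, comfort_balance=2000, days_left_in_month=20
)
stressed = discretionary_probability(
    0.4, balance=300, comfort_balance=2000, days_left_in_month=3, recent_overspend=0.5
)
```

Capping the liquidity factor at 1.0 reflects the idea that high-liquidity users spend freely but not without bound.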
Important metadata:
- balance_state
- is_overdraft
- overdraft_amount
Business and irregular income simulation
FinForge v2 introduces non-salaried cashflow:
- freelancer income from clients and platforms such as Upwork, Fiverr, Stripe, and Razorpay
- business owner cashflow from client payments, settlements, vendor payments, inventory, payroll, and tax
- household secondary income or family transfers
- pension plus irregular interest/family support for retired users
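Irregular income with variable dates, amounts, and sources can be sketched with a seeded RNG. The platform names come from the list above; the function name, gap lengths, and amount distribution are hypothetical and chosen only for illustration.

```python
import random
from datetime import date, timedelta

# Platform names are taken from the list above; amounts and gaps are hypothetical.
FREELANCE_SOURCES = ["Upwork", "Fiverr", "Stripe", "Razorpay"]


def irregular_income(seed: int, start: date, end: date) -> list[dict]:
    """Generate income events with variable dates, amounts, and sources."""
    rng = random.Random(seed)
    events = []
    current = start + timedelta(days=rng.randint(0, 20))
    while current <= end:
        events.append({
            "date": current,
            "income_source": rng.choice(FREELANCE_SOURCES),
            # Log-normal draw: many small payments, a few large ones.
            "amount": round(rng.lognormvariate(6.5, 0.6), 2),
        })
        current += timedelta(days=rng.randint(5, 30))  # variable gap between payments
    return events


events = irregular_income(seed=42, start=date(2024, 1, 1), end=date(2024, 6, 30))
```

Unlike a salary schedule, nothing here is tied to a fixed day of the month, so some months can contain several payments and others none.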
Business owners also export:
- account_type
- business_context
- is_business_transaction
- is_business_expense
- is_tax_related
- is_vendor_payment
- is_payroll
- seasonal_factor
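To show how a seasonal_factor and quarterly tax events might interact, here is a minimal sketch. Everything in it is an assumption made for illustration: the cosine seasonality, the December peak, the flat 25% rate, and the payment dates are not FinForge's actual model or any specific jurisdiction's rules.

```python
import math
from datetime import date


def seasonal_factor(month: int, amplitude: float = 0.2, peak_month: int = 12) -> float:
    """Hypothetical smooth seasonal multiplier peaking in `peak_month`."""
    return 1.0 + amplitude * math.cos(2 * math.pi * (month - peak_month) / 12)


def quarterly_tax_dates(year: int, day: int = 15) -> list[date]:
    """Hypothetical schedule: one payment in the month after each quarter ends."""
    return [date(year, 4, day), date(year, 7, day), date(year, 10, day), date(year + 1, 1, day)]


def quarterly_tax_due(monthly_income: list[float], tax_rate: float = 0.25) -> list[float]:
    """Tax per quarter as a flat share of that quarter's income (illustrative rate)."""
    return [
        round(sum(monthly_income[q * 3:(q + 1) * 3]) * tax_rate, 2)
        for q in range(4)
    ]


base_income = 10_000.0
income = [round(base_income * seasonal_factor(m), 2) for m in range(1, 13)]
taxes = quarterly_tax_due(income)
```

In this sketch the fourth-quarter tax payment is the largest because income peaks in December, which is the kind of seasonal pattern a downstream cashflow test could assert on.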
Exported metadata columns
v1 metadata is preserved:
- persona
- spending_style
- savings_tendency
- merchant_loyalty
- impulse_buying_score
- lifestyle_score
- night_activity_score
- is_recurring
- is_subscription
- is_discretionary
- recurrence_type
- session_id
- day_type
- balance_state
- is_overdraft
- overdraft_amount
v2 metadata adds:
- income_source
- expense_nature
- cashflow_type
- business_context
- account_type
- is_business_transaction
- is_business_expense
- is_tax_related
- is_vendor_payment
- is_payroll
- seasonal_factor
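These flags are intended for downstream filtering. The sketch below reads a tiny hand-written CSV with the standard library to show the idea. The column names come from the lists above, but the sample rows, the "inflow"/"outflow" and "business"/"personal" values, and the True/False string encoding are assumptions, not FinForge output.

```python
import csv
import io

# Tiny hand-written sample standing in for an exported file; the values are made up.
SAMPLE_CSV = """amount,cashflow_type,account_type,is_business_transaction,is_tax_related,is_payroll
12000.00,inflow,business,True,False,False
-3000.00,outflow,business,True,False,True
-2500.00,outflow,business,True,True,False
-45.20,outflow,personal,False,False,False
"""


def as_bool(value: str) -> bool:
    """Parse a boolean flag, assuming it was serialized as 'True' or 'False'."""
    return value.strip().lower() == "true"


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
business_rows = [r for r in rows if as_bool(r["is_business_transaction"])]
tax_total = sum(float(r["amount"]) for r in rows if as_bool(r["is_tax_related"]))
payroll_total = sum(float(r["amount"]) for r in rows if as_bool(r["is_payroll"]))
```

With a real export, the same filters can be applied to the DataFrame returned by generate() before or after writing it to CSV.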
Example scripts
Example scripts are available in the examples directory:
- behavioral_generation.py
- business_owner_generation.py
- freelancer_generation.py
- household_generation.py
- retired_generation.py
- persona_comparison_v2.py
Testing guarantees
The test suite covers:
- balance integrity
- chronological ordering
- merchant/category consistency
- seed reproducibility
- subscription recurrence and stability
- low-balance suppression
- session calibration
- business cashflow generation
- freelancer irregular income
- quarterly tax behavior
- mixed persona generation
- backward compatibility for v1 personas
Run the suite with:
pytest
Roadmap
- richer scenario presets
- more regional merchant catalogs
- card vs bank-transfer vs wallet distinctions
- local Ollama-based narrative/explanation layers without changing the core simulation path
Contributing
Contributions are welcome, especially around:
- new persona modules
- merchant catalog expansion
- calibration improvements
- documentation
- test coverage
Typical development flow:
- Create a feature branch.
- Add or update tests.
- Run pytest.
- Open a pull request with a clear behavioral rationale.
License
MIT