Synthetic financial transaction data generation with persona-driven behavior simulation.

FinForge v2.0.0

FinForge is a Python library for generating realistic synthetic financial transaction datasets with persistent behavioral identity, temporal balance consistency, and reproducible cashflow simulation.

FinForge v2.0.0 expands the engine beyond student and salaried users into richer financial lives: business owners, freelancers, households, retired users, mixed-population datasets, irregular income, business cashflow, and quarterly tax activity.

Why FinForge v2.0.0 is different

FinForge is designed to simulate financial lives, not random rows.

  • Persistent user identity: users carry stable behavioral traits such as spending_style, merchant_loyalty, savings_tendency, and night_activity_score.
  • Temporal financial rhythm: balances evolve chronologically across salaries, subscriptions, tax, business income, bills, and discretionary spending.
  • Realistic behavioral adaptation: low-balance users suppress discretionary activity, while high-liquidity users spend more freely without becoming unrealistic.
  • Mixed real-world personas: v2 includes consumer, household, freelance, retirement, and business cashflow behavior in one framework.
  • Reproducible synthetic data: the same seed and config generate the same dataset, which makes FinForge useful for testing, QA, analytics, and benchmarking.
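One common way to get this kind of seed-driven reproducibility is to derive every random draw from the configured seed. The sketch below is a generic illustration using only the standard library; it is not FinForge's internal implementation, and `generate_amounts` is a hypothetical helper.

```python
import random


def generate_amounts(seed: int, user_count: int, per_user: int = 3) -> list[tuple[int, float]]:
    """Return (user_index, amount) pairs that depend only on the inputs."""
    rows = []
    for user_index in range(user_count):
        # One RNG per user, derived from the global seed, so a user's rows
        # do not change when other users are added or removed.
        rng = random.Random(f"{seed}:{user_index}")
        for _ in range(per_user):
            rows.append((user_index, round(rng.uniform(5.0, 500.0), 2)))
    return rows


first = generate_amounts(seed=42, user_count=4)
second = generate_amounts(seed=42, user_count=4)
other = generate_amounts(seed=43, user_count=4)

print("same seed, same rows:", first == second)
print("different seed, same rows:", first == other)
```

Deriving a separate generator per user is a design choice of this sketch: it keeps each user's rows stable even if the population size changes.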

FinForge v2.0.0: Business & Irregular Income Simulation

New in v2:

  • business_owner persona with business income, vendor payments, payroll, office rent, professional services, business travel, tax, and personal spending
  • freelancer persona with irregular client/platform income and software/professional expenses
  • household persona with family-oriented groceries, healthcare, education, insurance, and utility behavior
  • retired persona with pension income, healthcare-heavy spending, and low discretionary intensity
  • mixed persona mode for heterogeneous datasets
  • irregular income engine with variable dates, amounts, and sources
  • business cashflow engine with seasonal business income and quarterly tax payments
  • business vs personal account simulation and flags
  • expanded exported metadata for downstream testing and scenario analysis

Features

  • Persona-driven user generation
  • Persistent behavioral identity traits
  • Deterministic seed reproducibility
  • Balance-aware spending suppression
  • Session-based transaction bursts
  • Stable subscription recurrence
  • Explicit overdraft metadata
  • Merchant/category consistency
  • Mixed persona simulation
  • Irregular income generation
  • Business cashflow and seasonality
  • Quarterly tax events
  • CSV export and pandas DataFrame output

Installation

pip install finforge

For local development:

pip install -e .[dev]

Quickstart

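from finforge import DatasetGenerator

# Six months of transactions for 100 salaried users.
# The seed makes the run repeatable.
df = (
    DatasetGenerator(seed=42)
    .with_users(100)
    .with_persona("salaried")
    .for_months(6)
    .generate()
)

# generate() returns a pandas DataFrame.
print(df.head())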

Business owner example:

from finforge import DatasetGenerator

df = (
    DatasetGenerator(seed=42)
    .with_users(5)
    .with_persona("business_owner")
    .for_months(6)
    .generate()
)

Mixed persona example:

from finforge import DatasetGenerator

df = (
    DatasetGenerator(seed=42)
    .with_users(100)
    .with_persona("mixed")
    .for_months(6)
    .generate()
)

Mixed mode includes all supported v2 personas:

  • student
  • salaried
  • freelancer
  • business_owner
  • household
  • retired

When user_count is at least the number of supported personas, FinForge guarantees at least one user per persona. Remaining users are assigned with a deterministic weighted distribution driven by the configured seed.
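The assignment rule described above can be sketched as follows. This is an illustration only: the weights and the `assign_personas` function are assumptions, not FinForge's actual values or internals.

```python
import random
from collections import Counter

PERSONAS = ["student", "salaried", "freelancer", "business_owner", "household", "retired"]
# Hypothetical weights, chosen only for illustration.
WEIGHTS = [0.20, 0.35, 0.15, 0.10, 0.15, 0.05]


def assign_personas(user_count: int, seed: int) -> list[str]:
    """Assign one persona per user, deterministically for a given seed."""
    rng = random.Random(seed)
    assigned: list[str] = []
    # Guarantee one user per persona when the population is large enough.
    if user_count >= len(PERSONAS):
        assigned.extend(PERSONAS)
    # Fill the remaining slots with a seeded weighted draw.
    remaining = user_count - len(assigned)
    assigned.extend(rng.choices(PERSONAS, weights=WEIGHTS, k=remaining))
    # Shuffle so the guaranteed personas are not always the first users.
    rng.shuffle(assigned)
    return assigned


population = assign_personas(100, seed=42)
print(Counter(population))
```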

Student behavioral example with CSV export:

from finforge import DatasetGenerator

dataset = (
    DatasetGenerator(seed=101)
    .with_users(3)
    .with_persona("student")
    .for_months(2)
    .generate()
)

dataset.to_csv("transactionsBehaviour.csv", index=False)

Supported personas

  • student
  • salaried
  • freelancer
  • business_owner
  • household
  • retired
  • mixed

Architecture overview

Core modules:

  • finforge.core: models, enums, configuration, constants
  • finforge.personas: persona definitions and recurring behavior
  • finforge.generators: user generation, scheduling, transaction generation
  • finforge.merchants: consumer and business merchant catalogs
  • finforge.exporters: CSV export
  • finforge.dataset: fluent public API

Behavior modules:

  • identity.py: long-lived user behavioral traits
  • merchant_affinity.py: preferred merchants and weighted reuse
  • adaptive_spending.py: balance-aware and month-phase-aware spending controls
  • budgeting.py: overspend memory and discretionary budget state
  • subscriptions.py: dedicated subscription assignment
  • overdraft.py: explicit negative-balance policy decisions
  • lifecycle.py: irregular income, household flows, business cashflow, and quarterly tax activity
  • sessions.py: clustered transaction sessions

LLM-related runtime behavior is intentionally not implemented. Any future AI extension is expected to remain compatible with a local, Ollama-only architecture.

Behavioral identity engine

Every user exports stable identity metadata:

  • persona
  • spending_style
  • savings_tendency
  • merchant_loyalty
  • impulse_buying_score
  • lifestyle_score
  • night_activity_score

These fields are not cosmetic. They directly influence:

  • transaction frequency
  • merchant reuse
  • late-night behavior
  • session probability
  • discretionary suppression
  • category mix
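As a rough illustration of how a trait such as merchant_loyalty could drive merchant reuse, the sketch below reuses a preferred merchant with probability equal to the trait. It assumes the trait is a score between 0 and 1; `pick_merchant` and the merchant names are hypothetical and do not come from FinForge.

```python
import random


def pick_merchant(
    rng: random.Random,
    preferred: list[str],
    catalog: list[str],
    merchant_loyalty: float,
) -> str:
    """Reuse a preferred merchant with probability merchant_loyalty."""
    if preferred and rng.random() < merchant_loyalty:
        return rng.choice(preferred)
    # Otherwise explore the full catalog (which may include preferred merchants).
    return rng.choice(catalog)


rng = random.Random(7)
catalog = ["Cafe A", "Grocer B", "Bookshop C", "Diner D"]  # hypothetical names
preferred = ["Cafe A"]

loyal = [pick_merchant(rng, preferred, catalog, merchant_loyalty=0.9) for _ in range(200)]
explorer = [pick_merchant(rng, preferred, catalog, merchant_loyalty=0.1) for _ in range(200)]

print("loyal user reuse share:", loyal.count("Cafe A") / len(loyal))
print("explorer reuse share:", explorer.count("Cafe A") / len(explorer))
```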

Spending styles

FinForge uses four reusable spending styles across personas:

  • budget_conscious
  • lifestyle_spender
  • minimalist
  • impulsive_student

Expected effects:

  • minimalist: fewest transactions, essentials-heavy, lower sessions
  • budget_conscious: restrained discretionary behavior and strong balance sensitivity
  • lifestyle_spender: higher food, shopping, entertainment, and weekend intensity
  • impulsive_student: burstier activity, more late-night behavior, weaker spending discipline

Subscription engine

Subscriptions are handled by a dedicated recurring system.

  • Subscription merchants are separate from discretionary entertainment merchants
  • Assigned subscriptions recur exactly once per month
  • Subscription merchant and amount remain stable across months
  • Subscription rows are marked with is_subscription=True
  • Subscription rows are never session-linked discretionary entertainment noise
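The recurrence rules above can be sketched with a small standalone example. It is a minimal illustration under stated assumptions: the prices, the fixed billing day, the negative sign for expenses, and both helper functions are hypothetical, not FinForge's implementation.

```python
import random
from datetime import date

# Hypothetical monthly prices, for illustration only.
PRICES = {"Netflix": 15.49, "Spotify": 10.99, "Amazon Prime": 14.99, "YouTube Premium": 13.99}


def assign_subscriptions(rng: random.Random, max_count: int = 2) -> dict[str, tuple[float, int]]:
    """Pick merchants once per user and fix their amount and billing day."""
    merchants = rng.sample(sorted(PRICES), k=rng.randint(1, max_count))
    # Billing day is capped at 28 so it exists in every month.
    return {m: (PRICES[m], rng.randint(1, 28)) for m in merchants}


def subscription_rows(subs: dict[str, tuple[float, int]], start_year: int,
                      start_month: int, months: int) -> list[dict]:
    """Emit exactly one row per subscription per month."""
    rows = []
    for offset in range(months):
        year = start_year + (start_month - 1 + offset) // 12
        month = (start_month - 1 + offset) % 12 + 1
        for merchant, (amount, day) in subs.items():
            rows.append({
                "date": date(year, month, day),
                "merchant": merchant,
                "amount": -amount,  # assumption: expenses are negative
                "is_subscription": True,
            })
    return rows


rng = random.Random(101)
subs = assign_subscriptions(rng)
rows = subscription_rows(subs, start_year=2024, start_month=11, months=4)
for row in rows:
    print(row["date"], row["merchant"], row["amount"])
```

Because the merchant, amount, and billing day are chosen once and then reused, stability across months follows from the data structure rather than from a later consistency check.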

Supported subscription merchants include:

  • Netflix
  • Spotify
  • Amazon Prime
  • YouTube Premium

Balance-aware spending and overdrafts

FinForge does not generate independent random balances.

Each transaction updates the running balance chronologically:

balance_before + amount = balance_after
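This invariant can be checked on generated data. The sketch below assumes rows for a single user in chronological order, with fields named balance_before, amount, and balance_after, and with expenses stored as negative amounts; the sample values and `check_balance_chain` are hypothetical.

```python
from decimal import Decimal


def check_balance_chain(rows: list[dict]) -> list[int]:
    """Return indices of rows that break the running-balance invariant."""
    bad = []
    previous_after = None
    for index, row in enumerate(rows):
        # Each row must satisfy balance_before + amount = balance_after.
        row_ok = row["balance_before"] + row["amount"] == row["balance_after"]
        # Each row must start where the previous row ended.
        chain_ok = previous_after is None or row["balance_before"] == previous_after
        if not (row_ok and chain_ok):
            bad.append(index)
        previous_after = row["balance_after"]
    return bad


D = Decimal  # Decimal avoids floating-point rounding in equality checks
rows = [
    {"balance_before": D("1000.00"), "amount": D("-120.50"), "balance_after": D("879.50")},
    {"balance_before": D("879.50"), "amount": D("2500.00"), "balance_after": D("3379.50")},
    {"balance_before": D("3379.50"), "amount": D("-79.99"), "balance_after": D("3299.51")},
]
print(check_balance_chain(rows))

# Corrupt one row to show that both that row and the next one are reported.
broken = [dict(row) for row in rows]
broken[1]["balance_after"] = D("3400.00")
print(check_balance_chain(broken))
```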

Behavior adapts to financial condition:

  • low balance reduces discretionary probability and ticket sizes
  • month-end stress suppresses entertainment and shopping
  • overspending creates future pullback pressure
  • overdrafts are either prevented or explicitly marked
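A simplified sketch of how these adaptations might be combined is shown below. All thresholds and multipliers are hypothetical, and both functions are illustrative stand-ins rather than FinForge's adaptive_spending or overdraft modules.

```python
def discretionary_probability(base: float, balance: float, day_of_month: int,
                              overspend_pressure: float = 0.0) -> float:
    """Scale a base probability down under financial stress."""
    probability = base
    # Low balance suppresses discretionary activity (hypothetical thresholds).
    if balance < 100:
        probability *= 0.2
    elif balance < 500:
        probability *= 0.6
    # Month-end stress (hypothetical cutoff: day 25 onward).
    if day_of_month >= 25:
        probability *= 0.7
    # Earlier overspending creates pullback pressure in the range [0, 1].
    pressure = min(max(overspend_pressure, 0.0), 1.0)
    probability *= 1.0 - 0.5 * pressure
    return max(0.0, min(1.0, probability))


def apply_overdraft_policy(balance: float, amount: float,
                           allow_overdraft: bool) -> tuple[bool, bool, float]:
    """Return (accepted, is_overdraft, overdraft_amount) for a signed amount."""
    balance_after = balance + amount
    if balance_after >= 0:
        return True, False, 0.0
    if not allow_overdraft:
        # Prevented: the transaction is dropped instead of going negative.
        return False, False, 0.0
    # Allowed: the row is explicitly marked with the size of the shortfall.
    return True, True, -balance_after


print(discretionary_probability(0.5, balance=5000, day_of_month=10))
print(discretionary_probability(0.5, balance=50, day_of_month=28))
print(apply_overdraft_policy(50.0, -80.0, allow_overdraft=False))
print(apply_overdraft_policy(50.0, -80.0, allow_overdraft=True))
```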

Important metadata:

  • balance_state
  • is_overdraft
  • overdraft_amount

Business and irregular income simulation

FinForge v2 introduces non-salaried cashflow:

  • freelancer income from clients and platforms such as Upwork, Fiverr, Stripe, and Razorpay
  • business owner cashflow from client payments, settlements, vendor payments, inventory, payroll, and tax
  • household secondary income or family transfers
  • pension plus irregular interest/family support for retired users

Business owners also export:

  • account_type
  • business_context
  • is_business_transaction
  • is_business_expense
  • is_tax_related
  • is_vendor_payment
  • is_payroll
  • seasonal_factor
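To make seasonal income and quarterly tax activity concrete, here is a minimal sketch. The cosine-shaped seasonal curve, the December peak, the quarter-end months, and the due day are all assumptions for illustration; FinForge's actual schedule and the meaning of its seasonal_factor column may differ.

```python
import math
from datetime import date


def seasonal_factor(month: int, amplitude: float = 0.25, peak_month: int = 12) -> float:
    """Smooth yearly multiplier around 1.0 that is highest in peak_month."""
    return 1.0 + amplitude * math.cos(2 * math.pi * (month - peak_month) / 12)


def quarterly_tax_dates(start: date, months: int, due_day: int = 15) -> list[date]:
    """Tax dates in quarter-end months (Mar, Jun, Sep, Dec) within the window."""
    dates = []
    for offset in range(months):
        year = start.year + (start.month - 1 + offset) // 12
        month = (start.month - 1 + offset) % 12 + 1
        if month % 3 == 0:
            dates.append(date(year, month, due_day))
    return dates


base_income = 10_000.0  # hypothetical monthly business income
for month in (3, 6, 9, 12):
    print(month, round(base_income * seasonal_factor(month), 2))

print(quarterly_tax_dates(date(2024, 1, 1), months=6))
```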

Exported metadata columns

v1 metadata is preserved:

  • persona
  • spending_style
  • savings_tendency
  • merchant_loyalty
  • impulse_buying_score
  • lifestyle_score
  • night_activity_score
  • is_recurring
  • is_subscription
  • is_discretionary
  • recurrence_type
  • session_id
  • day_type
  • balance_state
  • is_overdraft
  • overdraft_amount

v2 metadata adds:

  • income_source
  • expense_nature
  • cashflow_type
  • business_context
  • account_type
  • is_business_transaction
  • is_business_expense
  • is_tax_related
  • is_vendor_payment
  • is_payroll
  • seasonal_factor
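These flags make it straightforward to slice an exported file for scenario analysis. The sketch below uses only the standard library on an in-memory sample; the sample values, the amount column name, the negative sign for expenses, and the True/False text encoding of booleans are assumptions about the CSV layout.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the assumed export layout.
SAMPLE_CSV = """amount,is_business_transaction,is_tax_related,is_payroll,is_vendor_payment
12000.00,True,False,False,False
-3500.00,True,False,True,False
-1800.00,True,False,False,True
-2100.00,True,True,False,False
-45.50,False,False,False,False
-700.00,True,False,False,True
"""


def total_by_flag(rows: list[dict], flags: list[str]) -> dict[str, float]:
    """Sum signed amounts for every row where a flag column is True."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        for flag in flags:
            if row[flag] == "True":
                totals[flag] += float(row["amount"])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
totals = total_by_flag(rows, ["is_tax_related", "is_payroll", "is_vendor_payment"])
for flag, total in sorted(totals.items()):
    print(flag, total)
```

To run the same summary on a real export, replace `io.StringIO(SAMPLE_CSV)` with an open file handle.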

Example scripts

Examples are available in the examples directory.

Testing guarantees

The test suite covers:

  • balance integrity
  • chronological ordering
  • merchant/category consistency
  • seed reproducibility
  • subscription recurrence and stability
  • low-balance suppression
  • session calibration
  • business cashflow generation
  • freelancer irregular income
  • quarterly tax behavior
  • mixed persona generation
  • backward compatibility for v1 personas

Run the suite with:

pytest

Roadmap

  • richer scenario presets
  • more regional merchant catalogs
  • card vs bank-transfer vs wallet distinctions
  • local Ollama-based narrative/explanation layers without changing the core simulation path

Contributing

Contributions are welcome, especially around:

  • new persona modules
  • merchant catalog expansion
  • calibration improvements
  • documentation
  • test coverage

Typical development flow:

  1. Create a feature branch.
  2. Add or update tests.
  3. Run pytest.
  4. Open a pull request with a clear behavioral rationale.

License

MIT
