StatForge

Automated Bayesian-frequentist statistics and publication-ready reports.

StatForge is an open-source Python library and command-line interface that automates statistical analysis and generates publication-ready reports. Built for academic researchers, biostatisticians, and data scientists, it covers the path from raw data ingestion to formatted output (PDF, DOCX, HTML).

Overview

StatForge runs a six-stage pipeline:

  1. DataLoader: Ingests data from CSV, Excel, SPSS (.sav), and Parquet formats.
  2. AssumptionChecker: Runs statistical assumption checks (e.g., normality, homoscedasticity), using a SHA-256 keyed caching layer (joblib.Memory) so that repeated checks on unchanged inputs are not recomputed.
  3. MethodSelector: Automatically ranks and selects appropriate tests based on data characteristics and assumption results.
  4. ModelFitter: Dispatches analysis to a plugin registry supporting both frequentist methods (SciPy, statsmodels) and Bayesian inference (PyMC).
  5. ResultFormatter: Structures statistical output including effect sizes for standardized reporting.
  6. ReportBuilder: Assembles the final document from Jinja2 templates, generating APA- or Vancouver-styled tables, automated methods summaries, and figure captions.
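
The caching idea in stage 2 can be illustrated with a minimal sketch. It is not StatForge's code: it uses an in-memory dict as a stand-in for joblib.Memory, and the names data_key, cached_check, and variance_ratio are hypothetical. The key is a SHA-256 digest of the check name plus the exact data, so the check is recomputed only when either changes.

```python
import hashlib
import json
import statistics

# Hypothetical in-memory stand-in for an on-disk cache such as joblib.Memory.
_cache = {}
calls = {"computed": 0}  # counts real computations, for illustration only


def data_key(check_name, groups):
    """Build a SHA-256 key from the check name and the exact data values."""
    payload = json.dumps({"check": check_name, "groups": groups}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def variance_ratio(groups):
    """Crude homoscedasticity screen: largest sample variance over smallest."""
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / min(variances)


def cached_check(check_name, func, groups):
    """Run func(groups) once per distinct input and reuse the stored result."""
    key = data_key(check_name, groups)
    if key not in _cache:
        calls["computed"] += 1
        _cache[key] = func(groups)
    return _cache[key]


groups = [[4.1, 5.0, 5.9, 5.2], [3.8, 6.4, 2.9, 7.1]]
first = cached_check("variance_ratio", variance_ratio, groups)
second = cached_check("variance_ratio", variance_ratio, groups)  # served from cache
print(first == second, calls["computed"])
```

Changing any value in groups produces a different digest, so stale results are never returned for edited data.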

Installation

pip install statforge

Quick Start

1. Interactive CLI Wizard

The easiest way to begin an analysis is the interactive wizard. From the directory containing your dataset, run:

statforge run dataset.csv

The wizard will prompt you to:

  • Select the outcome variable.
  • Select grouping or predictor variables.
  • Choose a report style (e.g., APA7).

2. Validating Data Quality

Before running a full analysis, generate a data quality report to flag missing values, outliers, or type mismatches:

statforge validate dataset.csv
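
To make the three kinds of flags concrete, here is a rough standard-library sketch of such a check. It is an assumption about what a data quality report might compute, not the logic behind statforge validate; the function name quality_report and the 1.5 x IQR outlier rule are illustrative choices.

```python
import csv
import io
import statistics


def quality_report(csv_text, numeric_columns):
    """Count missing cells, non-numeric entries, and IQR outliers per column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    report = {}
    for col in numeric_columns:
        missing, mismatched, values = 0, 0, []
        for row in rows:
            cell = (row.get(col) or "").strip()
            if cell == "":
                missing += 1
                continue
            try:
                values.append(float(cell))
            except ValueError:
                mismatched += 1  # text where a number was expected
        outliers = []
        if len(values) >= 4:
            q1, _, q3 = statistics.quantiles(values, n=4)
            iqr = q3 - q1
            low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
            outliers = [v for v in values if v < low or v > high]
        report[col] = {
            "missing": missing,
            "type_mismatches": mismatched,
            "outliers": outliers,
        }
    return report


sample = """id,score
1,10
2,11
3,
4,abc
5,12
6,11
7,10
8,12
9,11
10,95
"""
report = quality_report(sample, ["score"])
print(report)
```

In this sample the empty cell counts as missing, "abc" as a type mismatch, and 95 falls outside the IQR fences.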

3. Generating a Configuration File

For reproducible analyses, generate a configuration scaffold:

statforge config

This creates a statforge_config.yaml file that you can customize and version control.

Bayesian Analysis & PriorAdvisor

StatForge lowers the barrier to Bayesian analysis through its PriorAdvisor module.

  • Guided Priors: PriorAdvisor suggests data-driven, weakly informative priors (e.g., assigning a Normal distribution with $\mu$ equal to the observed mean and $\sigma$ equal to twice the observed standard deviation).
  • Transparency: The rationale for the selected priors is clearly documented and included in the generated report's methodology section.
  • Sensitivity Analysis: The pipeline automatically evaluates posterior stability across weakly informative, uninformative, and highly informative prior variants to assess robustness.
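
The prior rule and the sensitivity idea above can be sketched in a few lines. This is an illustration under stated assumptions, not PriorAdvisor's API: suggest_prior and posterior_mean are hypothetical names, the closed-form conjugate Normal-Normal update stands in for PyMC sampling, the noise SD is treated as known, and the variant priors are centered at an arbitrary value of 0 so that their influence is visible.

```python
import statistics


def suggest_prior(values):
    """Normal prior with mu = observed mean and sigma = 2 * observed SD."""
    return {
        "dist": "Normal",
        "mu": statistics.mean(values),
        "sigma": 2 * statistics.stdev(values),
    }


def posterior_mean(values, prior_mu, prior_sigma, noise_sigma):
    """Conjugate Normal-Normal update for a mean with known noise SD."""
    prior_precision = 1 / prior_sigma ** 2
    data_precision = len(values) / noise_sigma ** 2
    weighted = prior_precision * prior_mu + data_precision * statistics.mean(values)
    return weighted / (prior_precision + data_precision)


values = [4.8, 5.1, 5.4, 4.9, 5.3]
sd = statistics.stdev(values)
prior = suggest_prior(values)

# Sensitivity check: same prior center (0.0, chosen for illustration),
# three prior widths, compare the resulting posterior means.
variants = {
    "highly_informative": 0.1 * sd,
    "weakly_informative": 2 * sd,
    "uninformative": 1e6,
}
sensitivity = {
    name: posterior_mean(values, prior_mu=0.0, prior_sigma=sigma, noise_sigma=sd)
    for name, sigma in variants.items()
}
spread = max(sensitivity.values()) - min(sensitivity.values())
print(prior, sensitivity, spread)
```

A large spread across the variants signals that the posterior is driven by the prior rather than the data; a small spread suggests the conclusion is stable.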

Model Plugin Registry

StatForge uses a @register decorator pattern for integrating custom analytical models. Users can drop custom .py model definitions into ~/.statforge/plugins/, and the pipeline loads them dynamically. See CONTRIBUTING.md for details on writing custom plugins.
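
As a rough illustration of this pattern, the sketch below builds a registry with a decorator and imports every .py file from a directory so that the decorators run at import time. It is not StatForge's implementation: MODEL_REGISTRY, register's signature, and load_plugins are hypothetical, a temporary directory stands in for ~/.statforge/plugins/, and the decorator is injected into each plugin module for self-containment (a real plugin would presumably import it from the package).

```python
import importlib.util
import tempfile
from pathlib import Path

MODEL_REGISTRY = {}  # hypothetical registry: model name -> callable


def register(name):
    """Decorator that stores the decorated callable under `name`."""
    def decorator(func):
        MODEL_REGISTRY[name] = func
        return func
    return decorator


@register("mean_difference")
def mean_difference(a, b):
    """Built-in example model: difference of group means."""
    return sum(a) / len(a) - sum(b) / len(b)


def load_plugins(directory):
    """Import every .py file in `directory`; @register runs at import time."""
    loaded = []
    for path in sorted(Path(directory).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        module.register = register  # sketch shortcut: expose the decorator
        spec.loader.exec_module(module)
        loaded.append(path.stem)
    return loaded


# Demo: write a plugin file into a temporary directory, then load it.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "median_gap.py").write_text(
        'import statistics\n'
        '\n'
        '@register("median_gap")\n'
        'def median_gap(a, b):\n'
        '    return statistics.median(a) - statistics.median(b)\n'
    )
    loaded = load_plugins(tmp)

print(loaded, sorted(MODEL_REGISTRY))
```

Dispatch then reduces to a dictionary lookup, for example MODEL_REGISTRY["median_gap"](group_a, group_b).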

Cite StatForge

If you use StatForge in your research, please cite our JOSS paper (DOI pending). See paper/paper.md and paper/paper.bib for citation details.


Made by Samvardhan Singh. Licensed under the Apache License 2.0.
