Python companion for ModernDive: a tidy simulation-inference grammar, regression helpers, datasets, and dual-engine plotly/plotnine plots

moderndive (Python)

The Python companion package for ModernDive: Statistical Inference via Data Science — a faithful port of the R moderndive and infer packages to a modern Python data-science stack (polars, plotly, plotnine, statsmodels).

📖 Documentation (with runnable examples): https://moderndive.readthedocs.io

It is intentionally pure-Python (no compiled extensions) so it installs under Pyodide via micropip for in-browser execution.
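
As a rough sketch of what that in-browser install could look like: the helper below (install_in_browser is a hypothetical name, not part of moderndive) calls micropip.install only when the interpreter reports the emscripten platform that Pyodide runs on, and does nothing elsewhere. It assumes the package name on PyPI is moderndive, as in the pip command below.

```python
import asyncio
import sys


async def install_in_browser(package: str = "moderndive") -> bool:
    """Install `package` with micropip under Pyodide; return False elsewhere.

    Hypothetical helper for illustration only, not part of moderndive.
    """
    if sys.platform != "emscripten":
        # Regular CPython: use `pip install moderndive` instead.
        return False
    import micropip  # shipped with Pyodide, not needed on regular CPython

    await micropip.install(package)
    return True


# Outside the browser the helper is a no-op and reports False.
installed = asyncio.run(install_in_browser()) if sys.platform != "emscripten" else None
```

Inside a Pyodide console or notebook, where top-level await is available, the call would be `await install_in_browser()`.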

Installation

pip install moderndive          # from PyPI
# or, from source:
pip install git+https://github.com/moderndive/moderndive-python

What’s inside

  • A tidy simulation-inference grammar mirroring R infer: specify → hypothesize → generate → calculate, plus fit() for multiple regression, observe(), and assume() (theoretical t/z/F/Chisq). specify() is also available as a DataFrame method, so you can write df.specify(...) just like R’s df %>% specify(...). calculate(stat=...) takes the full infer vocabulary or any custom callable test statistic. Summaries via get_p_value / get_confidence_interval (percentile, SE, bias-corrected); British-spelling and short aliases included.
  • Dual-engine plots: visualize / shade_p_value / shade_confidence_interval (and every plot helper) take engine="plotly" (default, interactive) or engine="plotnine" — same code, your choice of output.
  • Theory-based wrapper tests: t_test, prop_test, chisq_test, t_stat, chisq_stat, plus the moderndive.theory module.
  • Regression & summary helpers mirroring R moderndive: get_regression_table, get_regression_points, get_regression_summaries, get_correlation, pop_sd, tidy_summary, count_missing (built on statsmodels where relevant, returning polars frames), plus the model plots gg_parallel_slopes / geom_parallel_slopes and gg_categorical_model / geom_categorical_model, and pairplot (the GGally::ggpairs analog).
  • Sampling: rep_slice_sample / rep_sample_n for sampling-distribution activities.
  • 58 datasets: load_*() loaders returning polars DataFrames (the moderndive/infer, nycflights23, gapminder, ISLR2, and FiveThirtyEight datasets used in the book).
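
To make the "percentile" and "SE" interval types mentioned above concrete, here is a minimal standard-library sketch of the idea behind them. It is not moderndive's implementation: the function names, the toy sample, and the use of a normal-quantile multiplier for the SE interval are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_means(sample, reps=1000, seed=0):
    """Resample with replacement `reps` times and record each resample's mean."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(reps)]


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def percentile_ci(stats, level=0.95):
    """Percentile interval: the middle `level` share of the bootstrap statistics."""
    ordered = sorted(stats)
    alpha = (1 - level) / 2
    return quantile(ordered, alpha), quantile(ordered, 1 - alpha)


def se_ci(stats, point_estimate, level=0.95):
    """SE interval: point estimate +/- multiplier * bootstrap standard error.

    The normal-quantile multiplier is an assumption of this sketch.
    """
    z = statistics.NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = statistics.stdev(stats)
    return point_estimate - z * se, point_estimate + z * se


# Made-up toy sample, for illustration only.
sample = [12.1, 9.8, 11.4, 10.3, 13.0, 8.7, 10.9, 11.8, 9.5, 12.6]
point = statistics.fmean(sample)
boot = bootstrap_means(sample, reps=1000, seed=76)
pct_lo, pct_hi = percentile_ci(boot)
se_lo, se_hi = se_ci(boot, point)
print(f"percentile: ({pct_lo:.2f}, {pct_hi:.2f})  SE: ({se_lo:.2f}, {se_hi:.2f})")
```

The percentile interval reads its endpoints directly off the bootstrap distribution, while the SE interval is symmetric around the observed statistic by construction.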

Quick start

Are tracks more likely to be popular in metal than in deep house? Compute the observed difference in “popular” rates, then permute the genre labels 1000 times to build a null distribution and read off a p-value.

import moderndive as md
from moderndive import get_p_value, visualize, shade_p_value

spotify = md.load_spotify_metal_deephouse()

# Observed difference in popularity rates (metal − deep house)
obs = (
    spotify
    .specify(formula="popular_or_not ~ track_genre", success="popular")
    .calculate(stat="diff in props", order=("metal", "deep-house"))
)
obs

ObservedStatistic(stat='diff in props', value=0.034)

# Permutation null distribution + p-value
null = (
    spotify
    .specify(formula="popular_or_not ~ track_genre", success="popular")
    .hypothesize(null="independence")
    .generate(reps=1000, type="permute", seed=76)
    .calculate(stat="diff in props", order=("metal", "deep-house"))
)
print(get_p_value(null, obs_stat=obs, direction="right"))

shape: (1, 1)
┌─────────┐
│ p_value │
│ ---     │
│ f64     │
╞═════════╡
│ 0.075   │
└─────────┘

# Visualize: interactive plotly by default; engine="plotnine" for ggplot-style
visualize(null) + shade_p_value(obs_stat=obs, direction="right")
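
For readers who want to see what the permutation pipeline above does conceptually, here is a standard-library sketch of the same steps: compute the observed difference in proportions, shuffle the group labels repeatedly, and take the share of shuffled statistics at least as large as the observed one as the right-tailed p-value. This is not moderndive's code; the function names and the 20-row toy data are invented for illustration and will not reproduce the numbers shown above.

```python
import random


def diff_in_props(outcomes, groups, success, order):
    """Proportion of `success` in order[0] minus the proportion in order[1]."""
    first, second = order

    def prop(group):
        values = [o for o, g in zip(outcomes, groups) if g == group]
        return sum(v == success for v in values) / len(values)

    return prop(first) - prop(second)


def permutation_p_value(outcomes, groups, success, order, reps=1000, seed=0):
    """Right-tailed permutation test under the independence null."""
    rng = random.Random(seed)
    observed = diff_in_props(outcomes, groups, success, order)
    shuffled = list(groups)
    null = []
    for _ in range(reps):
        # Shuffling labels breaks any link between group and outcome.
        rng.shuffle(shuffled)
        null.append(diff_in_props(outcomes, shuffled, success, order))
    p_value = sum(stat >= observed for stat in null) / reps
    return observed, null, p_value


# Made-up toy data: 10 tracks per genre.
groups = ["metal"] * 10 + ["deep-house"] * 10
outcomes = (
    ["popular"] * 6 + ["not popular"] * 4      # metal
    + ["popular"] * 4 + ["not popular"] * 6    # deep-house
)
observed, null, p_value = permutation_p_value(
    outcomes, groups, success="popular", order=("metal", "deep-house"),
    reps=1000, seed=76,
)
print(f"observed difference: {observed:.2f}, p-value: {p_value:.3f}")
```

Fixing the seed makes the shuffles, and therefore the p-value, reproducible from run to run, which is the role seed=76 plays in the generate() call above.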

Development

This repo uses uv.

uv sync --extra dev          # create the environment
make test                    # run the test suite (enforces 100% coverage)
make readme                  # re-render README.md from README.qmd (needs Quarto)
make build-data              # rebuild the bundled Parquet datasets (needs R; see tools/)
make build                   # build the wheel/sdist

The test suite is held at 100% statement coverage (enforced in CI via --cov-fail-under=100). Releases are automated on v* tags — see RELEASING.md.

License

This software package is MIT-licensed. The ModernDive book content is licensed separately under CC-BY-NC-SA 4.0.
