Computation of confidence intervals for binomial proportions and for difference of binomial proportions.

Confidence Intervals for Difference of Binomial Proportions

[GitHub Pages] [Read the Docs]

NEW: Streamlit support! An app is deployed on Streamlit Community Cloud.

Installation

Run

python -m pip install diff-binom-confint

or install the latest version from GitHub using

python -m pip install git+https://github.com/DeepPSP/DBCI.git

or git clone this repository and install locally via

cd DBCI
python -m pip install .

Numba accelerated version

Install using

python -m pip install "diff-binom-confint[acc]"

Usage examples

from diff_binom_confint import compute_difference_confidence_interval

n_positive, n_total = 84, 101
ref_positive, ref_total = 89, 105

confint = compute_difference_confidence_interval(
    n_positive,
    n_total,
    ref_positive,
    ref_total,
    conf_level=0.95,
    method="wilson",
)
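To illustrate what a score-type interval for a difference of proportions looks like, here is a minimal, standard-library-only sketch of Newcombe's hybrid score interval, which combines two Wilson intervals. It is an independent illustration, not the package's implementation; the helper names `wilson_interval` and `newcombe_difference_interval` are hypothetical, and the assumption that this construction matches the package's "wilson" method is not confirmed here.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(n_positive, n_total, conf_level=0.95):
    """Wilson score interval for a single binomial proportion (hypothetical helper)."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    p = n_positive / n_total
    denom = 1 + z**2 / n_total
    center = (p + z**2 / (2 * n_total)) / denom
    half = z * sqrt(p * (1 - p) / n_total + z**2 / (4 * n_total**2)) / denom
    return center - half, center + half


def newcombe_difference_interval(
    n_positive, n_total, ref_positive, ref_total, conf_level=0.95
):
    """Hybrid score interval for p1 - p2, built from two Wilson intervals."""
    p1 = n_positive / n_total
    p2 = ref_positive / ref_total
    l1, u1 = wilson_interval(n_positive, n_total, conf_level)
    l2, u2 = wilson_interval(ref_positive, ref_total, conf_level)
    delta = p1 - p2
    lower = delta - sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = delta + sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper


# Same counts as in the usage example above
lower, upper = newcombe_difference_interval(84, 101, 89, 105)
print(f"difference: {84 / 101 - 89 / 105:.4f}, 95% CI: ({lower:.4f}, {upper:.4f})")
```

The interval is generally not symmetric around the observed difference, because each Wilson interval is itself asymmetric around its point estimate.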

Implemented methods

Confidence intervals for binomial proportions

Method (type) Implemented
wilson :heavy_check_mark:
wilson-cc :heavy_check_mark:
wald :heavy_check_mark:
wald-cc :heavy_check_mark:
agresti-coull :heavy_check_mark:
jeffreys :heavy_check_mark:
clopper-pearson :heavy_check_mark:
arcsine :heavy_check_mark:
logit :heavy_check_mark:
pratt :heavy_check_mark:
witting :heavy_check_mark:
mid-p :heavy_check_mark:
lik :heavy_check_mark:
blaker :heavy_check_mark:
modified-wilson :heavy_check_mark:
modified-jeffreys :heavy_check_mark:
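As a rough illustration of how two of the simpler methods in this list differ, the following sketch computes Wald and Agresti-Coull intervals with the standard library only. The function names are hypothetical and the code is not taken from the package; clipping the bounds to [0, 1] is an assumption made for the sketch.

```python
from math import sqrt
from statistics import NormalDist


def _z(conf_level):
    """Two-sided standard normal quantile for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - conf_level) / 2)


def wald_interval(n_positive, n_total, conf_level=0.95):
    """Wald interval: p +/- z * sqrt(p * (1 - p) / n), clipped to [0, 1]."""
    z = _z(conf_level)
    p = n_positive / n_total
    half = z * sqrt(p * (1 - p) / n_total)
    return max(0.0, p - half), min(1.0, p + half)


def agresti_coull_interval(n_positive, n_total, conf_level=0.95):
    """Agresti-Coull interval: a Wald-type interval around an adjusted estimate."""
    z = _z(conf_level)
    n_adj = n_total + z**2
    p_adj = (n_positive + z**2 / 2) / n_adj
    half = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half), min(1.0, p_adj + half)


for name, func in [("wald", wald_interval), ("agresti-coull", agresti_coull_interval)]:
    lower, upper = func(84, 101)
    print(f"{name}: ({lower:.4f}, {upper:.4f})")
```

The Agresti-Coull adjustment pulls the center of the interval toward 0.5, which is why it tends to behave better than the plain Wald interval for proportions near 0 or 1.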

Confidence intervals for difference of binomial proportions

Method (type) Implemented
wilson :heavy_check_mark:
wilson-cc :heavy_check_mark:
wald :heavy_check_mark:
wald-cc :heavy_check_mark:
haldane :heavy_check_mark:
jeffreys-perks :heavy_check_mark:
mee :heavy_check_mark:
miettinen-nurminen :heavy_check_mark:
true-profile :heavy_check_mark:
hauck-anderson :heavy_check_mark:
agresti-caffo :heavy_check_mark:
carlin-louis :heavy_check_mark:
brown-li :heavy_check_mark:
brown-li-jeffrey :heavy_check_mark:
miettinen-nurminen-brown-li :heavy_check_mark:
exact :x:
mid-p :x:
santner-snell :x:
chan-zhang :x:
agresti-min :x:
wang :x:
pradhan-banerjee :x:

Creating report

One can use the make_risk_report function to create a report of the confidence intervals for difference of binomial proportions.

from diff_binom_confint import make_risk_report

# df_train and df_test are pandas.DataFrame objects providing the data
table = make_risk_report((df_train, df_test), target="binary_target")
# or, if df_data is a single pandas.DataFrame containing both training and testing data
table = make_risk_report(df_data, target="binary_target")

For more details, see the corresponding documentation. The produced table is similar to the following:

[Image: risk report]

References

  1. SAS
  2. PASS
  3. statsmodels.stats.proportion
  4. scipy.stats._binomtest
  5. corplingstats
  6. DescTools.StatsAndCIs
  7. Newcombe

NOTE

Reference 1 contains errors in its description of the Wilson CC, Mee, and Miettinen-Nurminen methods. The correct computation of Wilson CC is given in Reference 5. The correct computations of Mee and Miettinen-Nurminen are given in the code blocks in Reference 1.

Test data

Test data are

  1. taken from Reference 1 (with slight modifications, e.g. to the upper_bound of the miettinen-nurminen-brown-li method in the edge case file) to automatically test the correctness of the implemented algorithms.

  2. generated using DescTools.StatsAndCIs via

    library("DescTools")
    library("data.table")
    
    results = data.table()
    for (m in c("wilson", "wald", "waldcc", "agresti-coull", "jeffreys",
                    "modified wilson", "wilsoncc", "modified jeffreys",
                    "clopper-pearson", "arcsine", "logit", "witting", "pratt",
                    "midp", "lik", "blaker")){
        ci = BinomCI(84,101,method = m)
        new_row = data.table("method" = m, "ratio"=ci[1], "lower_bound" = ci[2], "upper_bound" = ci[3])
        results = rbindlist(list(results, new_row))
    }
    fwrite(results, "./test/test-data/example-84-101.csv")  # with manual slight adjustment of method names
    
  3. taken from Reference 7 (Table II).

The filenames have the following patterns:

# for computing confidence interval for difference of binomial proportions
"example-(?P<n_positive>[\\d]+)-(?P<n_total>[\\d]+)-vs-(?P<ref_positive>[\\d]+)-(?P<ref_total>[\\d]+)\\.csv"

# for computing confidence interval for binomial proportions
"example-(?P<n_positive>[\\d]+)-(?P<n_total>[\\d]+)\\.csv"

Note that the out-of-range values (e.g. > 1) are left as empty values in the .csv files.
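The filename patterns above can be applied with Python's `re` module to recover the counts a test file was generated for. The sketch below is illustrative only: the helper names `parse_test_filename` and `parse_bound` are hypothetical, the patterns are rewritten as raw strings (so `\d` replaces the doubly escaped `[\\d]`), and mapping empty cells to `None` is one possible way to handle the empty out-of-range values.

```python
import re

# Same patterns as above, written as raw strings
DIFF_PATTERN = re.compile(
    r"example-(?P<n_positive>\d+)-(?P<n_total>\d+)"
    r"-vs-(?P<ref_positive>\d+)-(?P<ref_total>\d+)\.csv"
)
SINGLE_PATTERN = re.compile(r"example-(?P<n_positive>\d+)-(?P<n_total>\d+)\.csv")


def parse_test_filename(filename):
    """Return the counts encoded in a test-data filename as a dict of ints."""
    for pattern in (DIFF_PATTERN, SINGLE_PATTERN):
        match = pattern.fullmatch(filename)
        if match is not None:
            return {key: int(value) for key, value in match.groupdict().items()}
    raise ValueError(f"unrecognized test-data filename: {filename!r}")


def parse_bound(cell):
    """Convert a CSV cell to float; an empty (out-of-range) cell becomes None."""
    cell = cell.strip()
    return float(cell) if cell else None


# The first filename is an illustrative name that follows the difference pattern
print(parse_test_filename("example-84-101-vs-89-105.csv"))
print(parse_test_filename("example-84-101.csv"))
```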

Known Issues

  1. Edge cases are handled incorrectly by the true-profile method.
