
TableOne


tableone is a package for creating “Table 1” summary statistics for a patient population. It was inspired by the R package of the same name by Yoshida and Bohn.

Documentation

Documentation is available on Read the Docs. An executable demonstration of the package is available on GitHub as a Jupyter Notebook. The easiest way to try out this notebook is to open it in Google Colaboratory.

Suggested citation

If you use tableone in your study, please cite the following paper:

Tom J Pollard, Alistair E W Johnson, Jesse D Raffa, Roger G Mark;
tableone: An open source Python package for producing summary statistics
for research papers, JAMIA Open, Volume 1, Issue 1, 1 July 2018, Pages 26–31,
https://doi.org/10.1093/jamiaopen/ooy012

Download the BibTeX file from: https://academic.oup.com/jamiaopen/downloadcitation/5001910?format=bibtex

A note for users of tableone

While we have tried to use best practices in creating this package, automation of even basic statistical tasks can be unsound if done without supervision. We encourage use of tableone alongside other methods of descriptive statistics and, in particular, visualization to ensure appropriate data handling.

It is beyond the scope of our documentation to provide detailed guidance on summary statistics, but as a primer we provide some considerations for choosing parameters when creating a summary table in our documentation.

Guidance should be sought from a statistician when using `tableone` for a research study, especially prior to submitting the study for publication.

Overview

At a high level, you can use the package as follows:

  • Import the data into a pandas DataFrame.

  • Run tableone on this DataFrame to output summary statistics.

  • Specify your desired output format: text, LaTeX, Markdown, etc.

Additional options include:

  • Select a subset of columns.

  • Specify the data type (e.g. categorical, numerical, nonnormal).

  • Compute p-values, and adjust for multiple testing (e.g. with the Bonferroni correction).

  • Compute standardized mean differences (SMDs).

  • Provide a list of alternative labels for variables.

  • Limit the output of categorical variables to the top N rows.

  • Display remarks relating to the appropriateness of summary measures (for example, computing tests for multimodality and normality).
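
As a rough illustration of two of the options above, the following sketch computes a Bonferroni adjustment and a standardized mean difference using only the Python standard library. It is not tableone's own code: the function names `bonferroni` and `smd`, the pooled-standard-deviation form of the SMD, and the input values are all assumptions made for this example, and the package's internal calculations may differ.

```python
import math
import statistics


def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def smd(group_a, group_b):
    """Standardized mean difference for a continuous variable.

    Assumes the common form: difference in means divided by the square
    root of the average of the two sample variances.
    """
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_sd = math.sqrt(
        (statistics.variance(group_a) + statistics.variance(group_b)) / 2
    )
    return mean_diff / pooled_sd


# Hypothetical raw p-values for three variables in a summary table.
raw = [0.018, 0.2, 0.5]
adjusted = bonferroni(raw)

# Hypothetical ages in two treatment groups.
ages_a = [50, 52, 54, 56, 58]
ages_b = [48, 50, 52, 54, 56]
age_smd = smd(ages_a, ages_b)

print([round(p, 3) for p in adjusted], round(age_smd, 3))
```

With three tests, a raw p-value of 0.018 no longer falls below 0.05 after adjustment, which is the kind of change the multiple-testing option is meant to surface.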

Installation

To install the package with pip, run:

pip install tableone

To install this package with conda, run:

conda install -c conda-forge tableone

Example

  1. Import libraries:

    from tableone import TableOne, load_dataset
    import pandas as pd
  2. Load sample data into a pandas DataFrame:

    data = load_dataset('pn2012')
  3. Optionally, specify a list of columns to be included in Table 1:

    columns = ['age', 'bili', 'albumin', 'ast', 'platelet', 'protime',
               'ascites', 'hepato', 'spiders', 'edema', 'sex', 'trt']
  4. Optionally, specify a list of columns containing categorical variables:

    categorical = ['ascites', 'hepato', 'edema', 'sex', 'spiders', 'trt']
  5. Optionally, specify a categorical variable for stratification and a list of non-normal variables:

    groupby = 'trt'
    nonnormal = ['bili']
  6. Create an instance of TableOne with the input arguments:

    mytable = TableOne(data, columns=columns, categorical=categorical,
                       groupby=groupby, nonnormal=nonnormal)
  7. Display the table using the tabulate method. The tablefmt argument allows the table to be displayed in multiple formats, including “github”, “grid”, “fancy_grid”, “rst”, “html”, and “latex”:

    print(mytable.tabulate(tablefmt="github"))
  8. …which prints a table like the following to screen (the example output shown includes variables beyond those in the columns list above):

    Stratified by trt
                           1.0                2.0                 missing
    ---------------------  -----------------  -----------------  --------
    n                      158                154                     106
    time (mean (std))      2015.62 (1094.12)  1996.86 (1155.93)         0
    age (mean (std))       51.42 (11.01)      48.58 (9.96)              0
    bili (median [IQR])    1.40 [0.80,3.20]   1.30 [0.72,3.60]          0
    chol (mean (std))      365.01 (209.54)    373.88 (252.48)         134
    albumin (mean (std))   3.52 (0.44)        3.52 (0.40)               0
    copper (mean (std))    97.64 (90.59)      97.65 (80.49)           108
    alk.phos (mean (std))  2021.30 (2183.44)  1943.01 (2101.69)       106
    ast (mean (std))       120.21 (54.52)     124.97 (58.93)          106
    trig (mean (std))      124.14 (71.54)     125.25 (58.52)          136
    platelet (mean (std))  258.75 (100.32)    265.20 (90.73)           11
    protime (mean (std))   10.65 (0.85)       10.80 (1.14)              2
    status (n (%))                                                      0
    0                      83 (52.53)         85 (55.19)
    1                      10 (6.33)          9 (5.84)
    2                      65 (41.14)         60 (38.96)
    ascites (n (%))                                                   106
    0.0                    144 (91.14)        144 (93.51)
    1.0                    14 (8.86)          10 (6.49)
    hepato (n (%))                                                    106
    0.0                    85 (53.80)         67 (43.51)
    1.0                    73 (46.20)         87 (56.49)
    spiders (n (%))                                                   106
    0.0                    113 (71.52)        109 (70.78)
    1.0                    45 (28.48)         45 (29.22)
    edema (n (%))                                                       0
    0.0                    132 (83.54)        131 (85.06)
    0.5                    16 (10.13)         13 (8.44)
    1.0                    10 (6.33)          10 (6.49)
    stage (n (%))                                                       6
    1.0                    12 (7.59)          4 (2.60)
    2.0                    35 (22.15)         32 (20.78)
    3.0                    56 (35.44)         64 (41.56)
    4.0                    55 (34.81)         54 (35.06)
    sex (n (%))                                                         0
    f                      137 (86.71)        139 (90.26)
    m                      21 (13.29)         15 (9.74)
  9. Compute p-values by setting the pval argument to True:

    mytable = TableOne(data, columns=columns, categorical=categorical,
                       groupby=groupby, nonnormal=nonnormal, pval=True)
  10. …which, when displayed with the tabulate method, prints:

    Stratified by trt
                           1.0                2.0                 missing  pval    test
    ---------------------  -----------------  -----------------  --------  ------  --------------
    n                      158                154                     106
    time (mean (std))      2015.62 (1094.12)  1996.86 (1155.93)         0  0.883   One_way_ANOVA
    age (mean (std))       51.42 (11.01)      48.58 (9.96)              0  0.018   One_way_ANOVA
    bili (median [IQR])    1.40 [0.80,3.20]   1.30 [0.72,3.60]          0  0.842   Kruskal-Wallis
    chol (mean (std))      365.01 (209.54)    373.88 (252.48)         134  0.748   One_way_ANOVA
    albumin (mean (std))   3.52 (0.44)        3.52 (0.40)               0  0.874   One_way_ANOVA
    copper (mean (std))    97.64 (90.59)      97.65 (80.49)           108  0.999   One_way_ANOVA
    alk.phos (mean (std))  2021.30 (2183.44)  1943.01 (2101.69)       106  0.747   One_way_ANOVA
    ast (mean (std))       120.21 (54.52)     124.97 (58.93)          106  0.460   One_way_ANOVA
    trig (mean (std))      124.14 (71.54)     125.25 (58.52)          136  0.886   One_way_ANOVA
    platelet (mean (std))  258.75 (100.32)    265.20 (90.73)           11  0.555   One_way_ANOVA
    protime (mean (std))   10.65 (0.85)       10.80 (1.14)              2  0.197   One_way_ANOVA
    status (n (%))                                                      0  0.894   Chi-squared
    0                      83 (52.53)         85 (55.19)
    1                      10 (6.33)          9 (5.84)
    2                      65 (41.14)         60 (38.96)
    ascites (n (%))                                                   106  0.567   Chi-squared
    0.0                    144 (91.14)        144 (93.51)
    1.0                    14 (8.86)          10 (6.49)
    hepato (n (%))                                                    106  0.088   Chi-squared
    0.0                    85 (53.80)         67 (43.51)
    1.0                    73 (46.20)         87 (56.49)
    spiders (n (%))                                                   106  0.985   Chi-squared
    0.0                    113 (71.52)        109 (70.78)
    1.0                    45 (28.48)         45 (29.22)
    edema (n (%))                                                       0  0.877   Chi-squared
    0.0                    132 (83.54)        131 (85.06)
    0.5                    16 (10.13)         13 (8.44)
    1.0                    10 (6.33)          10 (6.49)
    stage (n (%))                                                       6  0.201   Chi-squared
    1.0                    12 (7.59)          4 (2.60)
    2.0                    35 (22.15)         32 (20.78)
    3.0                    56 (35.44)         64 (41.56)
    4.0                    55 (34.81)         54 (35.06)
    sex (n (%))                                                         0  0.421   Chi-squared
    f                      137 (86.71)        139 (90.26)
    m                      21 (13.29)         15 (9.74)
  11. Tables can be exported to file in various formats, including LaTeX, CSV, and HTML. Files are exported by calling the corresponding export method on the table, such as to_latex, to_csv, or to_html. For example, mytable can be exported to an Excel spreadsheet named ‘mytable.xlsx’ with the following command:

    mytable.to_excel('mytable.xlsx')
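
To make the row labels in the example tables more concrete, the sketch below reproduces the three summary styles shown there, “mean (std)”, “median [IQR]”, and “n (%)”, using only the Python standard library. It is an illustration under stated assumptions rather than tableone's implementation: the helper names, the sample values, and the choice of the “inclusive” quartile method are all hypothetical, and the package may compute quartiles differently.

```python
import statistics
from collections import Counter


def mean_std(values):
    """Format a roughly normal variable as 'mean (std)' using the sample SD."""
    return f"{statistics.mean(values):.2f} ({statistics.stdev(values):.2f})"


def median_iqr(values):
    """Format a skewed variable as 'median [Q1,Q3]'."""
    # Quartile method is an assumption; other methods give slightly different cut points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return f"{statistics.median(values):.2f} [{q1:.2f},{q3:.2f}]"


def count_pct(values):
    """Format each level of a categorical variable as 'n (%)'."""
    counts = Counter(values)
    total = len(values)
    return {
        level: f"{n} ({100 * n / total:.2f})"
        for level, n in sorted(counts.items())
    }


# Hypothetical, right-skewed bilirubin values: one large value pulls the mean up.
bili = [0.6, 0.8, 1.4, 3.2, 12.0]
# Hypothetical categorical variable.
sex = ["f", "f", "f", "m"]

print("bili (mean (std)):   ", mean_std(bili))
print("bili (median [IQR]): ", median_iqr(bili))
print("sex (n (%)):         ", count_pct(sex))
```

For the skewed sample, the mean (3.60) sits well above the median (1.40), which is why a variable such as bili is listed under nonnormal and summarized with the median and interquartile range.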
