Skip to main content

A Python package for the life sciences to conduct hypothesis testing from CSV files.

Project description

csv-stats

Python package for rapid hypothesis testing on CSV files with data in long table format, which are common in the life sciences. Test results are saved to PDF as a rendered JSON string, or can be returned as a Python dictionary.

Installation

pip install csv-stats

Examples

All code examples below use the following constants:

DATA_PATH = "path/to/data.csv" # Path to your CSV file
DATA_COLUMN = 'values' # The column to run the hypothesis tests on
GROUP_COLUMN = 'groups' # Grouping variable (i.e. statistical factor)
REPEATED_MEASURES_COLUMN = 'subject_id' # Column indicating repeated measures (e.g. subject IDs)

ANOVA

One way ANOVA is currently supported. They include tests of homogeneity of variance and normality of residuals. Repeated measures ANOVA is also supported, including tests of sphericity.

NOTE: Two- and three-way ANOVA support is planned, but not yet implemented.

from csv_stats.anova import anova1way

# One way ANOVA, independent samples
result_anova1way = anova1way(DATA_PATH, 
                            GROUP_COLUMN, 
                            DATA_COLUMN,                            
                            filename = "anova1way_results.pdf", # Default save name. Enter `None` to not save.
                            render_plot = False # For speed, by default no plots are generated
                        )

# One way ANOVA, repeated measures
result_anova1way_rm = anova1way(DATA_PATH, 
                            GROUP_COLUMN, 
                            DATA_COLUMN,                             
                            repeated_measures_column = REPEATED_MEASURES_COLUMN,
                            filename = "anova1way_results.pdf", # Default save name. Enter `None` to not save.     
                            render_plot = False # For speed, by default no plots are generated                       
                        )

t-test

Both independent samples (one and two samples) and paired samples t-tests are supported. They include tests of homogeneity of variance and normality of residuals.

from csv_stats.ttest import ttest_ind, ttest_dep

# Independent samples t-test 
# Two sample when the GROUP_COLUMN has two groups
# One smaple when the GROUP_COLUMN has one group
result_ttest_ind = ttest_ind(DATA_PATH, 
                            GROUP_COLUMN, 
                            DATA_COLUMN,
                            popmean = 0, # Test against a population mean of 0 (default)
                            filename = "ttest_ind_results.pdf", # Default save name. Enter `None` to not save.
                            render_plot = False # For speed, by default no plots are generated
                        )

# Paired samples t-test
result_ttest_rel = ttest_dep(DATA_PATH, 
                            GROUP_COLUMN, 
                            DATA_COLUMN, 
                            repeated_measures_column = REPEATED_MEASURES_COLUMN,
                            filename = "ttest_dep_results.pdf", # Default save name. Enter `None` to not save.
                            render_plot = False # For speed, by default no plots are generated
                        )

Multiple Data Columns

If you have multiple columns of data that you want to run the same test on, you can specify the data_column argument as "_". This will automatically loop over all columns. Note that to save these results to a file, the filename should be an f-string containing f"{data_column}". The test function will replace data_column with the column name in the file name.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

csv_stats-0.1.5.tar.gz (12.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

csv_stats-0.1.5-py3-none-any.whl (2.2 kB view details)

Uploaded Python 3

File details

Details for the file csv_stats-0.1.5.tar.gz.

File metadata

  • Download URL: csv_stats-0.1.5.tar.gz
  • Upload date:
  • Size: 12.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for csv_stats-0.1.5.tar.gz
Algorithm Hash digest
SHA256 1c3667859f398ecae6038f5ee8e4fa3df1921464eacdc9ebbf7ed16e3c7bbec2
MD5 b62e4b4c5f0dd9191dbe4a18c29b567c
BLAKE2b-256 42977d110480415a019d2f73c0066ef2b2d85fe8f1957df2a058abb75afea171

See more details on using hashes here.

File details

Details for the file csv_stats-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: csv_stats-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 2.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for csv_stats-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 61bb74c8ba1b0fc5a81d9a793f79ebcb99450860c370f105f5f4a2088c4f08bb
MD5 28304c7d6af09632a1a8de86369a7075
BLAKE2b-256 015b03cc926833b50132ec28fb8d69ac50d6345f5492a99a1350608439ce54d3

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page