Skip to main content

A Python package for the life sciences to conduct hypothesis testing from CSV files.

Project description

csv-stats

Python package for rapid hypothesis testing on CSV files with data in long table format, which are common in the life sciences. Test results are saved to PDF as a rendered JSON string, or can be returned as a Python dictionary.

Installation

pip install csv-stats

Examples

All code examples below use the following constants:

DATA_PATH = "path/to/data.csv" # Path to your CSV file
DATA_COLUMN = 'values' # The column to run the hypothesis tests on
GROUP_COLUMN = 'groups' # Grouping variable (i.e. statistical factor)
REPEATED_MEASURES_COLUMN = 'subject_id' # Column indicating repeated measures (e.g. subject IDs)

ANOVA

One way ANOVA is currently supported. They include tests of homogeneity of variance and normality of residuals. Repeated measures ANOVA is also supported, including tests of sphericity.

NOTE: Two- and three-way ANOVA support is planned, but not yet implemented.

from csv_stats.anova import anova1way

# One way ANOVA, independent samples
result_anova1way = anova1way(DATA_PATH, 
                            GROUP_COLUMN, 
                            DATA_COLUMN,                            
                            filename = "anova1way_results.pdf", # Default save name. Enter `None` to not save.
                            render_plot = False # For speed, by default no plots are generated
                        )

# One way ANOVA, repeated measures
result_anova1way_rm = anova1way(DATA_PATH, 
                            GROUP_COLUMN, 
                            DATA_COLUMN,                             
                            repeated_measures_column = REPEATED_MEASURES_COLUMN,
                            filename = "anova1way_results.pdf", # Default save name. Enter `None` to not save.     
                            render_plot = False # For speed, by default no plots are generated                       
                        )

t-test

Both independent samples (one and two samples) and paired samples t-tests are supported. They include tests of homogeneity of variance and normality of residuals.

from csv_stats.ttest import ttest_ind, ttest_dep

# Independent samples t-test 
# Two sample when the GROUP_COLUMN has two groups
# One smaple when the GROUP_COLUMN has one group
result_ttest_ind = ttest_ind(DATA_PATH, 
                            GROUP_COLUMN, 
                            DATA_COLUMN,
                            popmean = 0, # Test against a population mean of 0 (default)
                            filename = "ttest_ind_results.pdf", # Default save name. Enter `None` to not save.
                            render_plot = False # For speed, by default no plots are generated
                        )

# Paired samples t-test
result_ttest_rel = ttest_dep(DATA_PATH, 
                            GROUP_COLUMN, 
                            DATA_COLUMN, 
                            repeated_measures_column = REPEATED_MEASURES_COLUMN,
                            filename = "ttest_dep_results.pdf", # Default save name. Enter `None` to not save.
                            render_plot = False # For speed, by default no plots are generated
                        )

Multiple Data Columns

If you have multiple columns of data that you want to run the same test on, you can specify the data_column argument as "_". This will automatically loop over all columns. Note that to save these results to a file, the filename should be an f-string containing f"{data_column}". The test function will replace data_column with the column name in the file name.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

csv_stats-0.1.9.tar.gz (12.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

csv_stats-0.1.9-py3-none-any.whl (17.0 kB view details)

Uploaded Python 3

File details

Details for the file csv_stats-0.1.9.tar.gz.

File metadata

  • Download URL: csv_stats-0.1.9.tar.gz
  • Upload date:
  • Size: 12.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for csv_stats-0.1.9.tar.gz
Algorithm Hash digest
SHA256 86a2d5d9e54a1c26db2840e82939c8e3a2d424a17e6bef940f5bc8279b411264
MD5 745a2617c3762303a498a2413284dd3f
BLAKE2b-256 a4623d5d7727c279f8ea7aa04920d02ab8d6e0ae4400361f57903256cee514d6

See more details on using hashes here.

File details

Details for the file csv_stats-0.1.9-py3-none-any.whl.

File metadata

  • Download URL: csv_stats-0.1.9-py3-none-any.whl
  • Upload date:
  • Size: 17.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for csv_stats-0.1.9-py3-none-any.whl
Algorithm Hash digest
SHA256 cfce2a65441bcb9105fc5369e675b840ac25905d88cafb0be0431a5b58dc63b6
MD5 3fabfd9d32bb7ecce5096e23eff64c7b
BLAKE2b-256 f5a9132e2d9a009cc29bc45f92757a6da73f201aca1ab9d691a581fa9743d3d9

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page