A Python package for the life sciences to conduct hypothesis testing from CSV files.

csv-stats

Python package for rapid hypothesis testing on CSV files containing data in long table format, which is common in the life sciences. Test results are saved to a PDF as a rendered JSON string, or can be returned as a Python dictionary.

Installation

pip install csv-stats

Examples

All code examples below use the following constants:

DATA_PATH = "path/to/data.csv" # Path to your CSV file
DATA_COLUMN = 'values' # The column to run the hypothesis tests on
GROUP_COLUMN = 'groups' # Grouping variable (i.e. statistical factor)
REPEATED_MEASURES_COLUMN = 'subject_id' # Column indicating repeated measures (e.g. subject IDs)
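
For readers unfamiliar with long table format, the sketch below writes a tiny CSV with one observation per row, using the column names from the constants above. The data values, the file location, and the helper names `write_long_csv` and `read_header` are made up for illustration and are not part of csv-stats.

```python
import csv
import os
import tempfile

# Hypothetical example data in long format: one row per observation.
# Each subject appears once per group (a repeated-measures layout).
rows = [
    {"subject_id": "s1", "groups": "control", "values": 4.1},
    {"subject_id": "s1", "groups": "treated", "values": 5.3},
    {"subject_id": "s2", "groups": "control", "values": 3.8},
    {"subject_id": "s2", "groups": "treated", "values": 5.9},
]


def write_long_csv(path, rows):
    """Write the observations to a CSV file with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["subject_id", "groups", "values"])
        writer.writeheader()
        writer.writerows(rows)


def read_header(path):
    """Return the column names found in the first line of the CSV."""
    with open(path, newline="") as f:
        return next(csv.reader(f))


# Write to a temporary directory so the example leaves no files behind in the project.
example_path = os.path.join(tempfile.mkdtemp(), "data.csv")
write_long_csv(example_path, rows)
```

A file like this could then be passed as DATA_PATH, with 'values', 'groups', and 'subject_id' as the column constants.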

ANOVA

One-way ANOVA is currently supported. The results include tests of homogeneity of variance and normality of residuals. Repeated measures ANOVA is also supported, and its results include tests of sphericity.

NOTE: Two- and three-way ANOVA support is planned, but not yet implemented.

from csv_stats.anova import anova1way

# One-way ANOVA, independent samples
result_anova1way = anova1way(DATA_PATH, 
                            GROUP_COLUMN, 
                            DATA_COLUMN,                            
                            filename = "anova1way_results.pdf", # Default save name. Enter `None` to not save.
                            render_plot = False # For speed, by default no plots are generated
                        )

# One-way ANOVA, repeated measures
result_anova1way_rm = anova1way(DATA_PATH, 
                            GROUP_COLUMN, 
                            DATA_COLUMN,                             
                            repeated_measures_column = REPEATED_MEASURES_COLUMN,
                            filename = "anova1way_results.pdf", # Default save name. Enter `None` to not save.     
                            render_plot = False # For speed, by default no plots are generated                       
                        )
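
For intuition about what a one-way ANOVA on independent samples computes from long-format data, here is a minimal pure-Python sketch of the F statistic. It is not the csv-stats implementation; the function name `one_way_f` and the input shape (a list of `(group, value)` pairs) are assumptions made for this example, and it omits the p-value and the assumption checks that the package reports.

```python
from collections import defaultdict


def one_way_f(observations):
    """Return (F, df_between, df_within) for an iterable of (group, value) pairs."""
    # Collect the values belonging to each group.
    groups = defaultdict(list)
    for group, value in observations:
        groups[group].append(value)

    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    group_means = {g: sum(vals) / len(vals) for g, vals in groups.items()}

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(vals) * (group_means[g] - grand_mean) ** 2 for g, vals in groups.items()
    )
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum(
        (v - group_means[g]) ** 2 for g, vals in groups.items() for v in vals
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data: three groups of three observations each.
example = [
    ("a", 1.0), ("a", 2.0), ("a", 3.0),
    ("b", 2.0), ("b", 3.0), ("b", 4.0),
    ("c", 3.0), ("c", 4.0), ("c", 5.0),
]
f_stat, df_between, df_within = one_way_f(example)
```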

t-test

Independent-samples (one- and two-sample) and paired-samples t-tests are supported. The results include tests of homogeneity of variance and normality of residuals.

from csv_stats.ttest import ttest_ind, ttest_dep

# Independent samples t-test 
# Two-sample when the GROUP_COLUMN has two groups
# One-sample when the GROUP_COLUMN has one group
result_ttest_ind = ttest_ind(DATA_PATH, 
                            GROUP_COLUMN, 
                            DATA_COLUMN,
                            popmean = 0, # Test against a population mean of 0 (default)
                            filename = "ttest_ind_results.pdf", # Default save name. Enter `None` to not save.
                            render_plot = False # For speed, by default no plots are generated
                        )

# Paired samples t-test
result_ttest_rel = ttest_dep(DATA_PATH, 
                            GROUP_COLUMN, 
                            DATA_COLUMN, 
                            repeated_measures_column = REPEATED_MEASURES_COLUMN,
                            filename = "ttest_dep_results.pdf", # Default save name. Enter `None` to not save.
                            render_plot = False # For speed, by default no plots are generated
                        )
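
As a rough illustration of what a paired-samples t-test computes, the sketch below derives the t statistic from the per-subject differences using only the standard library. It is not how csv-stats is implemented; the function name `paired_t` and the two-list input are assumptions for this example. In a long-format CSV, the pairs would first have to be matched by the repeated measures column.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, degrees_of_freedom) for two equal-length lists of paired values."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # The test is a one-sample t-test on the within-pair differences.
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation, n - 1).
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_err, n - 1


# Hypothetical measurements for four subjects under two conditions.
t_stat, dof = paired_t([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 8.0])
```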

Multiple Data Columns

To run the same test on multiple columns of data, specify the data_column argument as "_". The test will then loop over all columns automatically. To save these results to files, the filename should be an f-string containing f"{data_column}"; the test function replaces data_column with the column name in each file name.
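
The README does not show how the file name substitution is performed. The sketch below illustrates one plausible reading, assuming the filename is a plain string containing the literal placeholder `{data_column}` that gets replaced once per column. The helper `expand_filename` and the column names are hypothetical and not part of csv-stats.

```python
def expand_filename(template, column_names):
    """Map each column name to the file name produced from the template."""
    # Assumption: the placeholder is the literal text "{data_column}".
    return {
        name: template.replace("{data_column}", name) for name in column_names
    }


# Hypothetical data columns in the CSV file.
names = expand_filename("anova1way_{data_column}.pdf", ["height", "weight"])
```

Under this assumption, each data column gets its own results file, so results from one column do not overwrite another.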

Download files

Source Distribution

csv_stats-0.1.7.tar.gz (12.7 kB view details)

Uploaded Source

Built Distribution

csv_stats-0.1.7-py3-none-any.whl (16.5 kB view details)

Uploaded Python 3

File details

Details for the file csv_stats-0.1.7.tar.gz.

File metadata

  • Download URL: csv_stats-0.1.7.tar.gz
  • Size: 12.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.0

File hashes

Hashes for csv_stats-0.1.7.tar.gz
Algorithm Hash digest
SHA256 525f2da1cb4eef29dd293593429d6ad850c0055e1a66ae018c5b16e8ceb7d89c
MD5 71259d3d40226ab815eef656652f7c95
BLAKE2b-256 71f916f131515366f213b78c48332175fa89742dea52ba7daa67dfe8164c0d73

File details

Details for the file csv_stats-0.1.7-py3-none-any.whl.

File metadata

  • Download URL: csv_stats-0.1.7-py3-none-any.whl
  • Size: 16.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.0

File hashes

Hashes for csv_stats-0.1.7-py3-none-any.whl
Algorithm Hash digest
SHA256 156f3610f1c8733d40626086204bfda97900b27c6639cabd0cbd1735c311a7a1
MD5 27ca16bede9247f4fa18a2f92c7058bc
BLAKE2b-256 3c52619ffa611f767ace8004599b4ba650288e338706fa2347494a27a194f637
