Package for ANCOVA analysis and visualization.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

README for ANCOVA Analysis Script

Overview

This script provides tools for performing ANCOVA (Analysis of Covariance) and related statistical analyses. Its primary function, do_ancova, integrates the steps of an ANCOVA analysis and allows flexible customization of inputs and outputs, including graphical representations of the results.

Key Functionality: `do_ancova`

The main purpose of the do_ancova function is to perform parametric or non-parametric ANCOVA on a dataset. It accepts a DataFrame containing the dependent variable, categorical variables, and covariates to evaluate the relationship between them while adjusting for covariates.

Features:

- Parametric and Non-Parametric ANCOVA: Automatically switches between parametric and ranked (non-parametric) ANCOVA depending on whether the assumptions of normality and homoscedasticity hold.
- Interaction Effects: Allows inclusion of interactions between variables.
- Post-Hoc Analysis: Automatically performs Tukey or Dunn post-hoc tests when significant differences are found between groups.
- Data Visualization: Generates boxplots and scatterplots with regression lines, including statistical significance indicators.
- Customizable Options: Users can customize interactions, colors, and plot details.
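The switch between parametric and ranked ANCOVA can be pictured with a minimal sketch. It is not the package's implementation: the helper name choose_ancova_model, the choice of Shapiro-Wilk and Levene tests, and the 0.05 threshold are all assumptions made for illustration. It also ranks only the response, while some ranked ANCOVA variants rank the covariate as well.

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf


def choose_ancova_model(df, response, group, covariate, alpha=0.05):
    """Hypothetical helper: fit a parametric ANCOVA and fall back to a
    rank-transformed response when the residual checks fail."""
    formula = f"{response} ~ C({group}) + {covariate}"
    model = smf.ols(formula, data=df).fit()

    # Normality of the residuals (Shapiro-Wilk)
    _, p_normal = stats.shapiro(model.resid)

    # Equal residual variance across groups (Levene)
    resid_by_group = [model.resid[df[group] == level] for level in df[group].unique()]
    _, p_equal_var = stats.levene(*resid_by_group)

    if p_normal > alpha and p_equal_var > alpha:
        return "parametric", model

    # Assumption: the non-parametric variant ranks the response only
    ranked = df.assign(**{response: df[response].rank()})
    return "ranked", smf.ols(formula, data=ranked).fit()


# Small invented dataset; column names must be valid Python identifiers here
rng = np.random.default_rng(0)
n = 150
demo = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], n // 3),
    "age": rng.integers(20, 70, size=n),
})
demo["y"] = 1000 - 3 * demo["age"] + rng.normal(0, 50, size=n)

kind, fitted = choose_ancova_model(demo, "y", "group", "age")
print(kind)
```

Which tests and thresholds do_ancova actually applies is not stated here, so treat the branch condition as a placeholder.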

Usage: `do_ancova`

Parameters:

data:
A pandas DataFrame containing:
- Column 1: Dependent (response) variable.
- Columns 2 to n (one column per categorical variable, as set by categories): Categorical independent variable(s).
- Remaining columns: Continuous covariates.
interactions (Optional):
Specifies interactions between variables:
- "ALL": Includes all interactions.
- list: List of tuples specifying interacting variables.
plot (Default: False):
If True, generates a regression plot and a boxplot.
save_plot (Default: False):
If provided with a file path, saves the generated plots to the specified location.
covariate_to_plot (Optional):
Specifies the covariate to display in plots.
palette (Optional):
A dictionary mapping categorical levels to colors.
categories (Default: 1):
Number of categorical variables.
ax (Optional):
A Matplotlib axis for custom plotting.
y_lab (Default: False):
Label for the y-axis in the generated plot. False means no label.
x_lab (Default: False):
Label for the x-axis in the generated plot. False means no label.
sum_of_squares_type (Default: 2):
Specifies the type of sums of squares for ANCOVA. The default is Type 2.
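To make the column-order convention and the interactions parameter more concrete, here is a sketch of how such inputs could be turned into a statsmodels-style formula. The helper build_formula is hypothetical, and reading "ALL" as "every pairwise interaction" is an assumption; the package may build its model differently.

```python
from itertools import combinations


def build_formula(columns, categories=1, interactions=None):
    """Hypothetical helper: map [response, categoricals..., covariates...]
    plus an interactions spec to a formula string."""
    response, predictors = columns[0], list(columns[1:])
    categorical = set(predictors[:categories])

    def term(name):
        quoted = f'Q("{name}")'  # Q() lets patsy handle names containing spaces
        return f"C({quoted})" if name in categorical else quoted

    terms = [term(p) for p in predictors]
    if interactions == "ALL":
        # Assumption: "ALL" means every pairwise interaction
        pairs = list(combinations(predictors, 2))
    else:
        pairs = list(interactions or [])
    terms += [f"{term(a)}:{term(b)}" for a, b in pairs]
    return f'Q("{response}") ~ ' + " + ".join(terms)


formula = build_formula(
    ["Number of T Cells", "HIV Status", "Sex", "Age"],
    categories=2,
    interactions=[("HIV Status", "Age")],
)
print(formula)
```

With categories=2 the first two predictors are wrapped in C(...), which is how the position of a column decides whether it is treated as categorical.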

Output:

Results:
- A summary data frame with the ANCOVA parameters and outcomes.
- An ANCOVA table with p-values for each effect.
- Post-hoc results (if applicable).
Plots:
- Scatterplot with regression lines for covariates, plus a boxplot for the main categorical comparisons.
- When ax is provided: a Matplotlib axis with a boxplot for categorical comparisons (allows further customization).
Files (Optional):
Saves plots to the specified file path if save_plot is provided.

Dependencies

The script relies on the following Python packages:

numpy
pandas
statsmodels
scipy
seaborn
matplotlib
scikit_posthocs

Install these dependencies using:

pip install numpy pandas statsmodels scipy seaborn matplotlib scikit-posthocs
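A quick way to confirm the installation is to check that each dependency can be imported. The sketch below uses only the standard library; note that scikit-posthocs is installed under that name but imported as scikit_posthocs. The helper missing_packages is illustrative and not part of the package.

```python
from importlib.util import find_spec

# pip name -> import name (they differ only for scikit-posthocs)
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "statsmodels": "statsmodels",
    "scipy": "scipy",
    "seaborn": "seaborn",
    "matplotlib": "matplotlib",
    "scikit-posthocs": "scikit_posthocs",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All dependencies are importable")
```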

Notes

Ensure that your dataset has the shape cases × variables (one row per case, one column per variable).
The script assumes the columns are ordered like this: [Response variable, Main category to compare, Other categorical co-variables (optional), Other continuous co-variables].
For multiple categorical variables, specify how many there are using the categories parameter.
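Because the column order carries meaning, it can help to reorder a DataFrame explicitly before the analysis. The helper below, order_columns, is a hypothetical convenience and not part of the package; it returns the reordered frame together with the value to pass as categories.

```python
import pandas as pd


def order_columns(df, response, categorical, covariates):
    """Hypothetical helper: reorder columns as
    [response, categorical..., covariates...] and count the categoricals."""
    ordered = [response, *categorical, *covariates]
    missing = [c for c in ordered if c not in df.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    return df[ordered], len(categorical)


# Invented rows with the columns in an arbitrary order
raw = pd.DataFrame({
    "Age": [34, 51, 45],
    "Sex": ["Female", "Male", "Female"],
    "Number of T Cells": [980, 760, 910],
    "HIV Status": ["HIV-", "HIV+ (Untreated)", "HIV-"],
})

ordered, n_categories = order_columns(
    raw,
    response="Number of T Cells",
    categorical=["HIV Status", "Sex"],  # the first entry is the main category to compare
    covariates=["Age"],
)
print(list(ordered.columns), n_categories)
```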

AN EXAMPLE OF USE:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Import the main function from our package
from Ancova_analysis import do_ancova

This invented dataset contains 150 entries with the following columns:

Number of T Cells: The number of T cells, which is affected by the individual's age and HIV status. Individuals with HIV+ (Untreated) have a significant reduction in T cells, while HIV+ (TAR Treatment) individuals have a minimal reduction compared to HIV- individuals.

HIV Status: A categorical variable representing the individual's HIV status. It can take three values:

  -> HIV- (no HIV)

  -> HIV+ (TAR Treatment) (HIV positive, receiving treatment)

  -> HIV+ (Untreated) (HIV positive, not receiving treatment)

Sex: The individual's sex, either Male or Female.
Age: The individual's age in years, from 20 to 69 (np.random.randint excludes the upper bound of 70).

The Number of T Cells decreases with age, and untreated HIV+ individuals show the largest additional reduction.

# Set the seed for reproducibility
np.random.seed(4)

# Number of samples
n = 150

# Categorical variables
sex = np.random.choice(['Male', 'Female'], size=n)
hiv_status = np.random.choice(['HIV-', 'HIV+ (TAR Treatment)', 'HIV+ (Untreated)'], size=n, p=[0.4, 0.3, 0.3])

# Covariate: Age
age = np.random.randint(20, 70, size=n)

# Generate T cell count
t_cells = []
for i in range(n):
    base_t_cells = 1000  # General base for T cells
    age_effect = -3 * (age[i] - 30)  # Mild effect of age
    if hiv_status[i] == 'HIV+ (Untreated)':
        hiv_effect = -200  # Significant reduction for untreated
    elif hiv_status[i] == 'HIV+ (TAR Treatment)':
        hiv_effect = -30  # Minimal reduction for treated
    else:
        hiv_effect = 0  # No effect for HIV-
    noise = np.random.normal(0, 50)  # Random noise
    t_cells.append(base_t_cells + age_effect + hiv_effect + noise)

# Define a palette to choose the plotting color for each category; otherwise the colors would be assigned randomly
palette = {"HIV-":"skyblue",
           "HIV+ (Untreated)":"salmon",
           "HIV+ (TAR Treatment)":"orange"}


# Create the DataFrame
data_hiv = pd.DataFrame({
    'Number of T Cells': np.round(t_cells).astype(int),
    'HIV Status': hiv_status,
    'Sex': sex,
    'Age': age
})

data_hiv.head()

Let's see whether the ANCOVA analysis is able to capture these differences:

# Run the main function and display the results
from IPython.display import display  # already available inside Jupyter notebooks

df_results, ancova_summary, post_hoc = do_ancova(
    data=data_hiv,
    palette=palette,
    categories=2,  # HIV Status and Sex
    interactions=[('HIV Status', 'Age')],  # Test the significance of the interaction of these variables
    y_lab="CD4 T Cells (count)",  # Set the y-axis label
    plot=True,  # Create the plot
    save_plot="./Images/ANCOVA_Regression_boxplot.png",  # Saves the plot to that path
)

display(df_results)
display(ancova_summary)
display(post_hoc)
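As an independent point of comparison for the post-hoc output, a plain Tukey HSD test can be run with statsmodels. This sketch uses its own invented data with the same three HIV status labels and does not adjust for Age, so its numbers are not expected to match the post_hoc table from do_ancova.

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Invented T cell counts for three groups of 40 individuals each
rng = np.random.default_rng(1)
groups = np.repeat(["HIV-", "HIV+ (TAR Treatment)", "HIV+ (Untreated)"], 40)
means = {"HIV-": 1000, "HIV+ (TAR Treatment)": 970, "HIV+ (Untreated)": 800}
values = np.array([means[g] for g in groups]) + rng.normal(0, 50, size=groups.size)

# All pairwise comparisons with family-wise error rate 0.05
result = pairwise_tukeyhsd(endog=values, groups=groups, alpha=0.05)
print(result.summary())
```

The summary lists one row per pair of groups with the mean difference, the adjusted p-value, and whether the null hypothesis is rejected.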

Example Plot

import os

# Create two subplots in a row
fig, axs = plt.subplots(ncols=2, figsize=(12, 6))

# When an axis is provided, do_ancova also returns the boxplot axis,
# so it can be combined with other subplots as you wish
df_results, ancova_summary, post_hoc, ax = do_ancova(
    data=data_hiv,
    palette=palette,
    categories=2,
    y_lab="CD4 T Cells (count)",
    plot=True,
    ax=axs[0],
)

# Reorder the columns so that Sex becomes the main category to compare
data_hiv_sex = data_hiv[['Number of T Cells', 'Sex', 'HIV Status', 'Age']]

df_results, ancova_summary, post_hoc, ax = do_ancova(
    data=data_hiv_sex,
    categories=2,
    y_lab="CD4 T Cells (count)",
    plot=True,
    ax=axs[1],  # The other subplot
)

# Save and show (create the output directory first so savefig does not fail)
os.makedirs("./Images", exist_ok=True)
plt.savefig("./Images/ANCOVA_two_boxplots.png", bbox_inches="tight")
plt.show()

Example Plot 2

Release history

- 0.1.2 (this version): Nov 25, 2024
- 0.1.1: Nov 25, 2024

Download files

Source Distribution

ancova-0.1.2.tar.gz (10.0 kB)

Uploaded Nov 25, 2024 Source

Built Distribution

ANCOVA-0.1.2-py3-none-any.whl (10.3 kB)

Uploaded Nov 25, 2024 Python 3

File details

Details for the file ancova-0.1.2.tar.gz.

File metadata

Download URL: ancova-0.1.2.tar.gz
Upload date: Nov 25, 2024
Size: 10.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for ancova-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`ab0dd632d256e99c90d8718f84908c8f8e5516d6fcfe95960be478510c7a9bbb`
MD5	`2dfc3c5edc0c36cf7fb73bc807ca339c`
BLAKE2b-256	`9b0e134c1f46c139dcd0cc61475c487b667dd411280ac98dd3063b425da35d74`

File details

Details for the file ANCOVA-0.1.2-py3-none-any.whl.

File metadata

Download URL: ANCOVA-0.1.2-py3-none-any.whl
Upload date: Nov 25, 2024
Size: 10.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for ANCOVA-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2756e59e04a3520256c875cf60df2c94b657457860d43f421ae6be006647d5f6`
MD5	`a63cb53d06cd7190926dd28d8dd0dfe3`
BLAKE2b-256	`4e4488c66f4c0fcf8fa6a5d32eb7c6338d0f7776a5986bfcd6e39f2525b90870`
