A Python Library for Statistical Data Visualization
pltstat: A Python Library for Statistical Data Visualization
pltstat is a Python library designed to facilitate the visualization of statistical data analysis. This library includes a variety of tools and methods to streamline data exploration, statistical computation, and graphical representation.
Installation
Requirements
Before installing, make sure that you are using Python 3.12.
You can check your Python version by running:
python --version
If needed, you can download Python 3.12 from the official Python website.
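The same version check can also be done from inside Python, for example at the top of a script. The following is a minimal sketch and not part of pltstat: the helper name `is_supported_python` is hypothetical, and treating only 3.12.x as supported is an assumption based on the requirement stated above.

```python
import sys

# Major and minor version taken from the requirement stated above.
REQUIRED = (3, 12)


def is_supported_python(version_info=sys.version_info):
    """Return True if the major.minor part of version_info equals REQUIRED."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    if is_supported_python():
        print("Python version OK")
    else:
        found = sys.version.split()[0]
        print(f"Python {REQUIRED[0]}.{REQUIRED[1]} expected, found {found}")
```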
Installation
To install the pltstat library, simply run the following command:
pip install pltstat
This will install the library along with all the required dependencies as specified in the requirements.txt file.
After installing the package, you can start using pltstat by importing the necessary modules in your Python scripts.
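To confirm that the installation succeeded before importing anything, one option is to ask Python's import system whether the package can be found. This sketch uses only the standard library; the helper name `is_installed` is hypothetical and not part of pltstat.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the import system can locate a top-level package."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Reports whether the pltstat package is visible to this interpreter.
    print("pltstat installed:", is_installed("pltstat"))
```

Because `find_spec` only locates the package without importing it, this check does not trigger loading of the package's own dependencies.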
File Descriptions
Python Modules
-
- Marks the directory as a Python package. This file allows you to import modules from the pltstat package.
-
- Dedicated to the analysis and visualization of single-variable features, including plotting functions such as pie charts, count plots, and histograms.
-
- Provides tools for analyzing interactions between two features. Includes functions for creating crosstabs, computing correlations, and visualizing results using violin plots, boxplots, and distribution box plots. These functions also display p-values and other statistical metrics to summarize relationships between the two features.
-
- Provides tools for analyzing relationships between multiple features. Includes visualization functions for analyzing missing data, comparing distributions, and visualizing dimensionality reductions. Additionally, it provides methods for creating heatmaps that display correlations and p-values, including Spearman's correlation, Mann-Whitney p-values, and Phik correlations.
-
- Contains functions and methods related to circular statistical visualizations, such as radar charts or circular histograms.
-
- Contains custom colormap utilities for visualizations, such as rendering correlation matrices or creating two-colored maps for p-values with a threshold (e.g., alpha).
-
- Includes methods for calculating correlation matrices and related statistical relationships.
-
- Provides utilities for reading, writing, and preprocessing input and output data files.
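Several of the modules above report Spearman's correlation. For readers who want to see what that statistic measures, here is a minimal pure-Python sketch: it ranks both variables (tied values share the mean of their ranks) and then computes the Pearson correlation of the ranks. This is an illustration only and not pltstat's implementation; the function names `average_ranks`, `pearson`, and `spearman` are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks are used, any strictly increasing relationship yields a coefficient of 1, even when it is far from linear.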
Other Files
-
- Specifies intentionally untracked files to ignore in the repository, such as virtual environments and temporary files.
-
README.md - This file provides an overview of the project, including file descriptions and usage instructions.
-
- Lists the Python dependencies required to run the library. Install them using:
pip install -r requirements.txt
Getting Started
-
Clone the repository:
git clone https://github.com/trojanskehesten/pltstat.git
-
Navigate to the project directory:
cd pltstat
-
Python Version: This library is compatible with Python 3.12. Ensure you have this version installed before running the project.
-
R Installation: Ensure that the R language is installed on your system, as the rpy2 library (used in this project) requires it.
-
Install dependencies:
pip install -r requirements.txt
-
Explore the modules and utilize the library in your projects.
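Since the steps above require R to be installed, a quick way to see whether an R executable is reachable is to look it up on the PATH. This is a hedged sketch using only the standard library; the helper name `find_executable` is hypothetical. rpy2 may also locate R by other means (for example an `R_HOME` environment variable), so a missing PATH entry is a hint rather than proof that R is absent.

```python
import shutil


def find_executable(name):
    """Return the full path of `name` on PATH, or None if it is not found."""
    return shutil.which(name)


if __name__ == "__main__":
    r_path = find_executable("R")
    if r_path is None:
        print("R was not found on PATH; install R before using rpy2.")
    else:
        print("R found at:", r_path)
```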
Usage
Each module in pltstat is designed to be self-contained and reusable. Import the required module and use its functions to visualize your statistical data.
Example 1: Pie Chart
import pandas as pd
from pltstat import singlefeat as sf
data = {
    "Age": [25, 30, 22, 27, 35],
    "A/B Test Group": ["A", "B", "A", "B", "A"],
}
df = pd.DataFrame(data)
# Plot a pie chart
sf.pie(df["A/B Test Group"])
Result 1
Example 2: Boxplot
import pandas as pd
from pltstat import twofeats as tf
# Data creation:
data = {
    "gender": ["male", "female", "female", "male", "male", "female", "female", "male", "male",
               "female", "male", "female", "male", "male", "female", "male", "female", "male",
               "female", "male", "female", "male", "female", "male", "female", "male", "female",
               "female", "male", "male", "male", "male"],
    "age": [22, 20, 17, 16, 19, 17, 11, 29, 24, 12, 22, 20, 19, 16, 11, 29, 24, 20, 16, 22,
            17, 29, 24, 16, 17, 29, 22, 19, 22, 22, 24, 29]
}
df = pd.DataFrame(data)
# Boxplot creation:
tf.boxplot(df, "gender", "age")
Result 2
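A box plot is a picture of a few summary numbers per group: the median, the quartiles, and the extremes. As a rough illustration of those numbers, the sketch below computes a five-number summary per category with the standard library. It is not how pltstat draws its boxes: the function names are hypothetical, and the "inclusive" quantile method is an assumption (plotting libraries may use a different quartile rule and draw whiskers differently).

```python
import statistics


def five_number_summary(values):
    """Return min, quartiles, and max for a sequence of at least two numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


def summarize_by_group(groups, values):
    """Group values by their category label and summarize each group."""
    grouped = {}
    for label, value in zip(groups, values):
        grouped.setdefault(label, []).append(value)
    return {label: five_number_summary(vals) for label, vals in grouped.items()}


if __name__ == "__main__":
    # Small made-up sample, unrelated to the data in Example 2.
    labels = ["a", "b", "a", "b", "a", "b"]
    numbers = [1, 10, 3, 30, 5, 20]
    for label, summary in summarize_by_group(labels, numbers).items():
        print(label, summary)
```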
Example 3: Boxplot and Distribution Plot
import numpy as np
import pandas as pd
from pltstat import twofeats as tf
# Example DataFrame
np.random.seed(42)
df = pd.DataFrame({
    'category': np.random.choice(['A', 'B'], size=100),
    'value': np.random.randn(100),
})
# Create a boxplot and a distribution plot
tf.dis_box_plot(df, cat_feat='category', num_feat='value')
Result 3
Contributing
Contributions are welcome! If you'd like to improve the library or fix issues, please:
- Fork the repository.
- Create a new branch.
- Make your changes and commit them.
- Submit a pull request.
License
This project is licensed under the BSD 3-Clause License. See the LICENSE file for details.
Download files
Source Distribution
File details
Details for the file pltstat-0.10.0.tar.gz.
File metadata
- Download URL: pltstat-0.10.0.tar.gz
- Upload date:
- Size: 34.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3815e6343e550bf30c0424b54306779d4996d1f5ee7245cdc8a8b7c41a9c37b1 |
| MD5 | 6d5eb45a55f7a3d21f31fcf25522f95b |
| BLAKE2b-256 | 255941066f74b921280fe8224ad4fc63db16c4ab5269e229b6c085ad66198f39 |