Vertical summary statistics for data frames
showstats: quick and compact summary statistics
showstats quickly produces compact summary statistic tables with vertical orientation.
from showstats import show_stats
show_stats(df)
-Date and datetime columns------------------------------------------------------
Var. N=100      NA%  Min                  Max                  Median
date_col        0    1501-01-20           1996-04-09           1755-07-20 00:00:00
date_col_2      0    1511-12-06           1999-05-05           1776-03-03 00:00:00
datetime_col    0    1501-01-20 14:37:46  1996-04-09 06:29:29  1755-07-20 10:10:59
datetime_col_2  0    1511-12-06 23:40:13  1999-05-05 14:12:20  1776-03-03 13:25:50
-Numerical columns--------------------------------------------------------------
Var. N=100         NA%  Avg     SD     Min     Max     Median
float_mean_2       0    2.0     0.89   -0.36   4.12    2.0
float_std_2        0    0.14    2.0    -5.17   4.91    0.14
float_min_-7       0    -4.64   0.89   -7.0    -2.51   -4.63
float_max_17       0    14.88   0.89   12.51   17.0    14.88
float_big          0    1.23E6  0.89   1.23E6  1.23E6  1.23E6
float_col          0    0.5     0.29   0.0     0.99    0.5
U                  0    0.54    0.26   0.02    0.98    0.57
int_col            0    49.5    29.01  0       99      49.5
int_with_missings  5    48.32   28.8   0       99      49.0
bool_col           26   0.5     0.5    false   true    0.5
null_col           100
-Categorical columns------------------------------------------------------------
Var. N=100       NA%  Uniques  Top 1       Top 2        Top 3
str_col          48   5        foo (15%)   ABC (13%)    bar (12%)
categorical_col  0    2        Fara (57%)  Car (43%)
enum_col         0    3        best (36%)  worst (35%)  medium (29%)
# Show only one column type
show_stats(df, "cat")  # The other types are num and time
-Categorical columns------------------------------------------------------------
Var. N=100       NA%  Uniques  Top 1       Top 2        Top 3
str_col          48   5        foo (15%)   ABC (13%)    bar (12%)
categorical_col  0    2        Fara (57%)  Car (43%)
enum_col         0    3        best (36%)  worst (35%)  medium (29%)
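To make the Top 1 to Top 3 columns concrete, here is a minimal standard-library sketch of how such entries could be computed by hand. It is not showstats' own code: the function name `top_categories` is hypothetical, and dividing by all rows (including missing ones) is an assumption about how the percentages are defined.

```python
from collections import Counter
from typing import List, Optional, Sequence


def top_categories(values: Sequence[Optional[str]], k: int = 3) -> List[str]:
    """Return the k most frequent non-missing values with their share of all rows.

    Assumption: the percentage is taken over all rows, missing ones included.
    """
    total = len(values)
    # Missing values (None) are not counted as a category.
    counts = Counter(v for v in values if v is not None)
    return [
        f"{value} ({round(100 * n / total)}%)"
        for value, n in counts.most_common(k)
    ]


# Hypothetical column with the same proportions as the categorical_col row.
sample = ["Fara"] * 57 + ["Car"] * 43
print(top_categories(sample))
```

With only two distinct values the list has two entries, which matches the empty Top 3 cell for categorical_col.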
# Importing showstats adds the stats namespace
df.select("U", "int_col").stats.show()
-Numerical columns--------------------------------------------------------------
Var. N=100  NA%  Avg    SD     Min    Max    Median
U           0    0.54   0.26   0.02   0.98   0.57
int_col     0    49.5   29.01  0      99     49.5
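For readers who want to check what the numerical columns mean, the following standard-library sketch computes one table row by hand. It is an illustration, not the showstats implementation: the function name `summarize_numeric` and the rounding to two decimals are assumptions. The integers 0 to 99 are consistent with the int_col row shown above, including an SD of 29.01, which corresponds to the sample standard deviation (n - 1 in the denominator).

```python
import statistics
from typing import Dict, Optional, Sequence


def summarize_numeric(values: Sequence[Optional[float]]) -> Dict[str, object]:
    """Compute one row of a numerical summary; None marks a missing value."""
    if not values:
        raise ValueError("values must not be empty")
    present = [v for v in values if v is not None]
    row: Dict[str, object] = {
        "NA%": round(100 * (len(values) - len(present)) / len(values))
    }
    if present:
        row["Avg"] = round(statistics.mean(present), 2)
        # Sample standard deviation; needs at least two observations.
        row["SD"] = round(statistics.stdev(present), 2) if len(present) > 1 else None
        row["Min"] = min(present)
        row["Max"] = max(present)
        row["Median"] = round(statistics.median(present), 2)
    return row


# Hypothetical data: the integers 0..99.
print(summarize_numeric(list(range(100))))
```

A column that contains only missing values yields just the NA% entry, which mirrors the otherwise empty null_col row.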
- Primarily built for polars data frames, showstats also converts other inputs.
  - For full compatibility with pandas.DataFrames install via
    pip install showstats[pandas].
- Heavily inspired by the great R packages skimr and modelsummary.
- Numbers with many digits are automatically converted to scientific notation.
- Because showstats leverages polars' efficiency, it's fast: <1 second for a 1,000,000 × 1,000 data frame, running on an M1 MacBook.
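The switch to scientific notation mentioned in the list above can be sketched as follows. This is only an illustration of the idea, not showstats' formatting code: the function name `format_compact`, the threshold of 1e6, and the two decimals are assumptions chosen so that a value like the one in the float_big row renders as 1.23E6.

```python
def format_compact(value: float, sci_threshold: float = 1e6, decimals: int = 2) -> str:
    """Round ordinary numbers; use compact scientific notation for large ones.

    Assumption: only the magnitude decides when to switch notation.
    """
    if abs(value) >= sci_threshold:
        # "1.23E+06" -> mantissa "1.23", exponent "+06" -> "1.23E6"
        mantissa, exponent = f"{value:.{decimals}E}".split("E")
        return f"{mantissa}E{int(exponent)}"
    return str(round(value, decimals))


for number in (1234567.0, 29.0115, -4.64, 2.0):
    print(format_compact(number))
```

Dropping the sign and leading zero of the exponent saves three characters per cell compared with Python's default `E+06` style, which helps in a narrow table.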
File details
Details for the file showstats-0.0.3.tar.gz.
File metadata
- Download URL: showstats-0.0.3.tar.gz
- Upload date:
- Size: 16.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.8.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 08f53ff92e7e406fc9323f81cf8cfa7cb1f22ac2486a9046b5cd026e28cb7370 |
| MD5 | ea934db4ee5f48dfc588c9ea7247f945 |
| BLAKE2b-256 | 03e218df62cc21973550e578fe314b451a19c78435c0f4a26e52a77847fcc22d |
File details
Details for the file showstats-0.0.3-py3-none-any.whl.
File metadata
- Download URL: showstats-0.0.3-py3-none-any.whl
- Upload date:
- Size: 8.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.8.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 49bb2a8100f4f7c1beaf5b3ac50766893715a7908eb87bb8f503a5b56fdf3d44 |
| MD5 | 99c270783af4d51692330e59477ce8f4 |
| BLAKE2b-256 | 447a26702024c7ec152ae146facd2ab1a8e8ae18c8b2341f310cb58cc70b3e89 |