
pyskim


Quickly create summary statistics for a given dataframe.

This package aspires to be as awesome as skimr.

Installation

$ pip install pyskim

Usage

Command-line tool

pyskim can be used from the command line:

$ pyskim iris.csv
── Data Summary ────────────────────────────────────────────────────────────────────────────────────
type                 value
-----------------  -------
Number of rows         150
Number of columns        5
──────────────────────────────────────────────────
Column type frequency:
           Count
-------  -------
Float64        4
string         1

── Variable type: number ───────────────────────────────────────────────────────────────────────────
    name            na_count    mean     sd    p0    p25    p50    p75    p100  hist
--  ------------  ----------  ------  -----  ----  -----  -----  -----  ------  ----------
 0  sepal_length           0    5.84  0.828   4.3    5.1   5.8     6.4     7.9  ▂▆▃▇▄▇▅▁▁▁
 1  sepal_width            0    3.06  0.436   2      2.8   3       3.3     4.4  ▁▁▄▅▇▆▂▂▁▁
 2  petal_length           0    3.76  1.77    1      1.6   4.35    5.1     6.9  ▇▃▁▁▂▅▆▄▃▁
 3  petal_width            0    1.2   0.762   0.1    0.3   1.3     1.8     2.5  ▇▂▁▂▂▆▁▄▂▃

── Variable type: string ───────────────────────────────────────────────────────────────────────────
    name       na_count    n_unique  top_counts
--  -------  ----------  ----------  -----------------------------------------
 0  species           0           3  setosa: 50, versicolor: 50, virginica: 50

Full overview of the available options:

$ pyskim --help
Usage: pyskim [OPTIONS] <file>

  Quickly create summary statistics for a given dataframe.

Options:
  -d, --delimiter TEXT   Delimiter of file.
  -i, --interactive      Open prompt with dataframe as `df` after displaying
                         summary.
  --no-dtype-conversion  Skip automatic dtype conversion.
  --groupby TEXT         Group dataframe by this/these variable(s).
  --help                 Show this message and exit.
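
The options listed above can also be combined from a script. The following is a minimal sketch, not part of pyskim: `build_pyskim_command` is a hypothetical helper that only assembles the argument list from the flags shown in the help text, and the command is run only if a `pyskim` executable is found on the PATH. How `--groupby` expects several variables to be written is not stated in the help text, so the value is passed through unchanged.

```python
import shutil
import subprocess


def build_pyskim_command(path, delimiter=None, groupby=None,
                         no_dtype_conversion=False):
    """Assemble an argv list for the pyskim CLI (hypothetical helper)."""
    cmd = ["pyskim"]
    if delimiter is not None:
        cmd += ["--delimiter", delimiter]      # same as -d
    if groupby is not None:
        cmd += ["--groupby", groupby]          # passed through as given
    if no_dtype_conversion:
        cmd.append("--no-dtype-conversion")
    cmd.append(path)                           # the <file> argument comes last
    return cmd


if __name__ == "__main__":
    cmd = build_pyskim_command("iris.csv", groupby="species")
    print(" ".join(cmd))
    # Only invoke the tool when it is actually installed.
    if shutil.which("pyskim"):
        subprocess.run(cmd, check=False)
```

The `--interactive` flag is left out on purpose, since it opens a prompt and would block a non-interactive script.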

Python API

Alternatively, pyskim can be used directly from Python code:

>>> from pyskim import skim
>>> from seaborn import load_dataset

>>> iris = load_dataset('iris')
>>> skim(iris)
# ── Data Summary ────────────────────────────────────────────────────────────────────────────────────
# type                 value
# -----------------  -------
# Number of rows         150
# Number of columns        5
# ──────────────────────────────────────────────────
# Column type frequency:
#            Count
# -------  -------
# float64        4
# string         1
#
# ── Variable type: number ───────────────────────────────────────────────────────────────────────────
#     name            na_count    mean     sd    p0    p25    p50    p75    p100  hist
# --  ------------  ----------  ------  -----  ----  -----  -----  -----  ------  ----------
#  0  sepal_length           0    5.84  0.828   4.3    5.1   5.8     6.4     7.9  ▂▆▃▇▄▇▅▁▁▁
#  1  sepal_width            0    3.06  0.436   2      2.8   3       3.3     4.4  ▁▁▄▅▇▆▂▂▁▁
#  2  petal_length           0    3.76  1.77    1      1.6   4.35    5.1     6.9  ▇▃▁▁▂▅▆▄▃▁
#  3  petal_width            0    1.2   0.762   0.1    0.3   1.3     1.8     2.5  ▇▂▁▂▂▆▁▄▂▃
#
# ── Variable type: string ───────────────────────────────────────────────────────────────────────────
#     name               na_count    n_unique  top_counts
# --  ---------------  ----------  ----------  -----------------------------------------
#  0          species           0           3  versicolor: 50, setosa: 50, virginica: 50
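
To make the columns of the numeric summary easier to read, here is a rough standard-library sketch of how such values could be computed for a single variable. It is an illustration under assumptions, not pyskim's implementation: the function names `summarize` and `sparkline`, the linear percentile interpolation, the ten equal-width bins, and the scaling of bar heights are all choices made for this example, so the output need not match pyskim's exactly.

```python
import math
import statistics

BLOCKS = "▁▂▃▄▅▆▇"  # seven bar heights, like those in the hist column


def sparkline(values, bins=10):
    """Render an equal-width histogram as a string of block characters."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0            # avoid zero width for constant data
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)   # the maximum goes in the last bin
        counts[idx] += 1
    top = max(counts)
    return "".join(BLOCKS[round(c / top * (len(BLOCKS) - 1))] for c in counts)


def summarize(values, bins=10):
    """Compute na_count, mean, sd, p0..p100 and hist for one variable.

    Needs at least two non-missing values (required by stdev/quantiles).
    """
    present = [v for v in values
               if v is not None and not (isinstance(v, float) and math.isnan(v))]
    ordered = sorted(present)
    # "inclusive" interpolates linearly between the minimum and maximum.
    p25, p50, p75 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "na_count": len(values) - len(present),
        "mean": statistics.fmean(present),
        "sd": statistics.stdev(present),       # sample standard deviation
        "p0": ordered[0],
        "p25": p25,
        "p50": p50,
        "p75": p75,
        "p100": ordered[-1],
        "hist": sparkline(present, bins),
    }


if __name__ == "__main__":
    print(summarize([1.0, 2.0, 3.0, 4.0, None]))
```

Missing values are counted in `na_count` and excluded from every other statistic, which is the usual convention for summaries of this kind.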

