skimpy

Read the documentation at https://skimpy.readthedocs.io/

Welcome

Welcome to skimpy! skimpy is a lightweight tool that provides summary statistics about variables in data frames within the console. Think of it as a super version of pandas' df.describe().

Quickstart

Skim a pandas dataframe, here called df, and produce summary statistics within the console using:

from skimpy import skim

skim(df)

If you need a dataset to try skimpy out on, you can use the built-in test dataframe:

from skimpy import skim, generate_test_data

df = generate_test_data()
skim(df)

Example output (image): https://raw.githubusercontent.com/aeturrell/skimpy/master/img/skimpy_example.png

It is recommended that you set your datatypes before using skimpy (for example converting any text columns to pandas string datatype), as this will produce richer statistical summaries.
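
As an illustration of setting datatypes first, here is a minimal sketch that uses only pandas. The dataframe and its column names (name, group, joined, score) are hypothetical and chosen just for this example; they are not part of skimpy.

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "name": ["ann", "bob", "cat"],
        "group": ["a", "b", "a"],
        "joined": ["2021-01-01", "2021-02-15", "2021-03-30"],
        "score": [1.5, 2.0, 3.25],
    }
)

# Text column -> pandas string datatype.
df["name"] = df["name"].astype("string")
# Repeated labels -> category datatype.
df["group"] = df["group"].astype("category")
# Date strings -> datetime datatype.
df["joined"] = pd.to_datetime(df["joined"])

# Show the resulting datatypes before handing the frame to skim(df).
print(df.dtypes)
```

After these conversions, the dataframe can be passed to skim(df) as in the Quickstart.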

skim accepts keyword arguments that change the colour of the datatypes as displayed. For example, to change the colour of datetimes to be chartreuse instead of red, the code is:

skim(df, datetime="chartreuse1")

You can also change the colours of the headers of the first three summary tables, for example:

skim(df, header_style="italic green")

You can try this package out right now in your browser using this Google Colab notebook (requires a Google account). Note that the Google Colab notebook uses the latest package released on PyPI (rather than the development release).

Features

  • Support for boolean, numeric, datetime, string, and category datatypes

  • Command line interface in addition to interactive console functionality

  • Lightweight, with results printed to terminal using the rich package

  • Support for different colours for different types of output

Requirements

You can find a full list of requirements in the pyproject.toml file. The main requirements are:

  • python = ">=3.7.1,<4.0.0"

  • click = "^8.0.1"

  • rich = "^10.9.0"

  • pandas = "^1.3.2"
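
To see which versions of these main requirements are present in the current environment, something like the following sketch may help. It assumes Python 3.8 or later, where importlib.metadata is in the standard library, and the helper name installed_versions is hypothetical. It only reports versions; it does not check them against the constraints above.

```python
from importlib.metadata import PackageNotFoundError, version

# The main third-party requirements listed above.
MAIN_REQUIREMENTS = ("click", "rich", "pandas")


def installed_versions(packages=MAIN_REQUIREMENTS):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```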

Installation

You can install the latest release of skimpy via pip from PyPI:

$ pip install skimpy

To install the development version from git, use:

$ pip install git+https://github.com/aeturrell/skimpy.git

For development, see the Contributor Guide.

Usage

This package is mostly designed to be used within an interactive console session or Jupyter notebook:

from skimpy import skim

skim(df)

However, you can also use it on the command line:

$ skimpy file.csv

skimpy will do its best to infer column datatypes.
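
If you have no CSV file to hand, one way to try the command line route is to write a small file first. The sketch below uses only the Python standard library; the file name skimpy_demo.csv and the columns are hypothetical and chosen just for illustration.

```python
import csv
from pathlib import Path

# Hypothetical rows mixing integer, text, float, and date-like columns.
rows = [
    {"id": 1, "city": "Leeds", "temperature": 11.5, "measured": "2021-09-01"},
    {"id": 2, "city": "York", "temperature": 12.0, "measured": "2021-09-02"},
    {"id": 3, "city": "Hull", "temperature": 10.25, "measured": "2021-09-03"},
]

csv_path = Path("skimpy_demo.csv")

# Write a header row followed by the data rows.
with csv_path.open("w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)

print(f"Wrote {len(rows)} rows to {csv_path}")
```

The resulting file can then be passed to the command shown above in place of file.csv.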

Contributing

Contributions are very welcome. To learn more, see the Contributor Guide.

License

Distributed under the terms of the MIT license, skimpy is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Credits

This project was generated from @cjolowicz’s Hypermodern Python Cookiecutter template.

skimpy was inspired by the R package skimr and by exploratory Python packages including pandas_profiling and dataprep.

