Skip to main content

Utility functions to help with exploratory data analysis on top the Pandas APIs

Project description

Pandas Data Exploration Utility Package

Table of content

Overview

Pandas Data Exploration utility is an interactive, notebook based library for quickly profiling and exploring the shape of data and the relationships between data. Using existing APIs from IpyWidget, Plot.ly, and Pandas, it creates a flexible point and click widget that allows the user to easily explore and visualize the dataset.

Installation

pip install Pandas-Data-Exploration-Utility-Package

Usage

import pandas as pd
import pandas_exploration_util.viz.explore as pe

global_temp = pd.read_csv("./data/GlobalTemperatures.csv", parse_dates = [0], infer_datetime_format=True)

pe.generate_widget(global_temp)

see /test for sample data and test jupyter notebook https://github.com/yifeihuang/pandas_exploration_util/tree/master/test


Pareto plot

Visualize the top values of any column as ranked by aggregation of any other column. Support aggregation functions include 'count', 'sum', 'mean', 'std', 'max', 'min', 'uniques'

Distribution plot

Visualize distribution of any numerical value. Binning is automatically determined by the plot.ly histogram method.

X-Y plot

Visualize the X-Y scatter of any column vs aggregation of any other column. Support aggregation functions include 'count', 'sum', 'mean', 'std', 'max', 'min', 'uniques'

Recommended development setup

Local Dev

  1. Setup virtualenv
  2. Create a virtual environment using virtualenv /path/to/env/dir
  3. Activate virtual environment using source /path/to/env/dir/bin/activate
  4. Clone the repo locally
  5. Navigate the root directory of the repo where the setup.py lives
  6. Install the module in development mode using python setup.py develop
  7. Run the Jupyter notebook that is in the virtual environment directory, which should have installed as the part of the dependency of the module
  8. Dev away
  9. When done uninstall the package using python setup.py develop --uninstall
  10. Deactive the environment using deactivate

Building and distributing

https://packaging.python.org/tutorials/packaging-projects/ Assuming all relevant tools are installed and the relevant project files are properly defined

  1. build the distribution using python3 setup.py sdist bdist_wheel
  2. upload the distribution using twine upload dist/*

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Filename, size & hash SHA256 hash help File type Python version Upload date
Pandas_Data_Exploration_Utility_Package-0.0.2-py3-none-any.whl (5.1 kB) Copy SHA256 hash SHA256 Wheel py3 Aug 16, 2018
Pandas Data Exploration Utility Package-0.0.2.tar.gz (4.5 kB) Copy SHA256 hash SHA256 Source None Aug 16, 2018

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page