overviewpy

Easily Extracting Information About Your Data

Installation

$ pip install overviewpy

Usage

The goal of overviewpy is to make it easy to get an overview of a data set by displaying relevant sample information. At the moment, there are the following functions:

  • overview_tab generates a tabular overview of the sample and returns it as a data frame. The result is a two-column table with the id in the left column and the time frame in the right column.
  • overview_na plots an overview of missing values by variable (both by row and by column).

overviewpy seeks to mirror the functionality of overviewR and will extend its features with the following functionality in the future:

  • overview_crosstab generates a cross table. The conditional column allows you to disaggregate the overview table by specifying two conditions, resulting in a 2x2 table. This makes it easy to visualize the time and scope conditions as well as theoretical assumptions with examples from the data set.
  • overview_latex converts the output of both overview_tab and overview_crosstab into LaTeX code and/or writes it directly to a .tex file.
  • overview_plot is an alternative way to visualize the sample (a way to present results from overview_tab).
  • overview_crossplot is an alternative way to visualize a cross table (a way to present results from overview_crosstab).
  • overview_heat plots a heat map of your timeline.
  • overview_overlap plots comparison plots (bar graph and Venn diagram) to compare two data frames.
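Since overview_crosstab is not part of the package yet, the following is only a rough sketch of the idea behind a 2x2 cross table. It assumes that each condition is turned into a yes/no split by comparing a variable against a threshold. The function name crosstab_2x2, the thresholds, and the toy records are all hypothetical and not part of overviewpy.

```python
# Hypothetical toy records: (id, year, value of condition 1, value of condition 2).
# The numbers are made up purely for illustration.
records = [
    ("RWA", 2021, 0.2, 10),
    ("RWA", 2022, 0.8, 10),
    ("GAB", 2020, 0.9, 40),
    ("FRA", 2019, 0.1, 50),
]


def crosstab_2x2(records, threshold1, threshold2):
    """Sort id-year observations into the four cells of a 2x2 table.

    Assumption: a condition is "fulfilled" when its value is at or above
    the given threshold. Cells are keyed by (cond1 fulfilled, cond2 fulfilled).
    """
    cells = {(c1, c2): [] for c1 in (True, False) for c2 in (True, False)}
    for unit, year, value1, value2 in records:
        key = (value1 >= threshold1, value2 >= threshold2)
        cells[key].append((unit, year))
    return cells


cells = crosstab_2x2(records, threshold1=0.5, threshold2=30)
for key, observations in cells.items():
    print(key, observations)
```

Each cell then holds the id-year observations that meet that combination of conditions, which is the information a cross table would display.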

overview_tab

Generate a general overview of the data set using the time and scope conditions with overview_tab. The resulting data frame collapses the time condition for each id, taking potential gaps in the time frame into account.

from overviewpy.overviewpy import overview_tab
import pandas as pd

# Toy panel: one row per id-year observation
data = {
    'id': ['RWA', 'RWA', 'RWA', 'GAB', 'GAB', 'FRA', 'FRA', 'BEL', 'BEL', 'ARG'],
    'year': [2022, 2023, 2021, 2023, 2020, 2019, 2015, 2014, 2013, 2002]
}

df = pd.DataFrame(data)

# Collapse the years observed for each id into one overview row
df_overview = overview_tab(df=df, id='id', time='year')
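To make "collapsing the time condition while accounting for gaps" more concrete, here is a minimal standard-library sketch of that idea. It is not the package's implementation, and the exact output format of overview_tab may differ. The helper collapse_years is a hypothetical name introduced only for this illustration.

```python
from collections import defaultdict


def collapse_years(years):
    """Collapse years into consecutive runs, e.g. [2013, 2014, 2019] -> '2013-2014, 2019'."""
    ordered = sorted(set(years))
    if not ordered:
        return ""
    runs = []
    start = prev = ordered[0]
    for year in ordered[1:]:
        if year == prev + 1:
            prev = year              # still inside a consecutive run
            continue
        runs.append((start, prev))   # a gap closes the current run
        start = prev = year
    runs.append((start, prev))
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in runs)


# Same toy data as in the overview_tab example.
ids = ['RWA', 'RWA', 'RWA', 'GAB', 'GAB', 'FRA', 'FRA', 'BEL', 'BEL', 'ARG']
years = [2022, 2023, 2021, 2023, 2020, 2019, 2015, 2014, 2013, 2002]

# Group the observed years by id.
years_by_id = defaultdict(list)
for unit, year in zip(ids, years):
    years_by_id[unit].append(year)

overview = {unit: collapse_years(ys) for unit, ys in sorted(years_by_id.items())}
for unit, time_frame in overview.items():
    print(unit, time_frame)
```

In this sketch, consecutive years are merged into a single range (RWA becomes 2021-2023), while a gap starts a new entry (FRA becomes 2015, 2019).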

overview_na

overview_na is a simple function that provides information about the content of all variables in your data, not only the time and scope conditions. It returns a horizontal ggplot bar plot that indicates the amount of missing data (NAs) for each variable (on the y-axis). You can choose whether to display the relative amount of NAs for each variable in percentage (the default) or the total number of NAs.

from overviewpy.overviewpy import overview_na
import pandas as pd
import numpy as np

data_na = {
    'id': ['RWA', 'RWA', 'RWA', np.nan, 'GAB', 'GAB', 'FRA', 'FRA', 'BEL', 'BEL', 'ARG', np.nan, np.nan],
    'year': [2022, 2001, 2000, 2023, 2021, 2023, 2020, 2019, np.nan, 2015, 2014, 2013, 2002]
}

df_na = pd.DataFrame(data_na)

overview_na(df_na)
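The numbers behind such a plot can be reproduced with plain pandas. The sketch below is not how overview_na is implemented internally. It only shows, for the same toy data, how the total and the relative amount of NAs per variable can be computed. The names na_total, na_percent, and summary are chosen for this illustration.

```python
import numpy as np
import pandas as pd

df_na = pd.DataFrame({
    'id': ['RWA', 'RWA', 'RWA', np.nan, 'GAB', 'GAB', 'FRA', 'FRA', 'BEL', 'BEL', 'ARG', np.nan, np.nan],
    'year': [2022, 2001, 2000, 2023, 2021, 2023, 2020, 2019, np.nan, 2015, 2014, 2013, 2002],
})

# Total number of missing values per variable.
na_total = df_na.isna().sum()

# Relative amount of missing values per variable, in percent.
na_percent = df_na.isna().mean() * 100

summary = pd.DataFrame({'na_total': na_total, 'na_percent': na_percent.round(1)})
print(summary)
```

Here id has 3 missing values out of 13 rows and year has 1, which are the two quantities (percentage or total) that the plot can display.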

Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

overviewpy was created by Cosima Meyer. It is licensed under the terms of the BSD 3-Clause license.

Credits

overviewpy was created with cookiecutter and the py-pkgs-cookiecutter template.
