overviewpy
Easily Extracting Information About Your Data
Installation
$ pip install overviewpy
Usage
The goal of overviewpy is to make it easy to get an overview of a data
set by displaying relevant sample information. At the moment, there are
the following functions:
- overview_tab generates a tabular overview of the sample and returns a data frame. The overview is a two-column table with the id in the left column and the time frame in the right column.
- overview_na plots an overview of missing values by variable (both by row and by column).
overviewpy seeks to mirror the functionality of overviewR and will extend its features with the following functionality in the future:
- overview_crosstab generates a cross table. The conditional column makes it possible to disaggregate the overview table by two conditions, resulting in a 2x2 table. This makes it easy to visualize the time and scope conditions as well as theoretical assumptions with examples from the data set.
- overview_latex converts the output of both overview_tab and overview_crosstab into LaTeX code and/or directly into a .tex file.
- overview_plot is an alternative way to visualize the sample (a way to present results from overview_tab).
- overview_crossplot is an alternative way to visualize a cross table (a way to present results from overview_crosstab).
- overview_heat plots a heat map of your time line.
- overview_overlap plots comparison plots (bar graph and Venn diagram) to compare two data frames.
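Until overview_crosstab is available, the idea of a 2x2 conditional table can be approximated with plain pandas. The following is only a rough sketch and not the planned implementation: the toy data, the column names gdp and population, and both thresholds are hypothetical and chosen purely for illustration.

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "id": ["RWA", "RWA", "GAB", "GAB", "FRA", "FRA"],
    "year": [1990, 1991, 1990, 1991, 1990, 1991],
    "gdp": [5, 6, 12, 13, 40, 42],
    "population": [7, 7, 1, 1, 58, 59],
})

# Hypothetical thresholds that split each condition into "low" and "high".
gdp_threshold = 10
population_threshold = 5

labelled = df.assign(
    gdp_level=(df["gdp"] >= gdp_threshold).map(
        {True: "gdp high", False: "gdp low"}
    ),
    population_level=(df["population"] >= population_threshold).map(
        {True: "population high", False: "population low"}
    ),
    # One label per observation, e.g. "RWA (1990)".
    label=df["id"] + " (" + df["year"].astype(str) + ")",
)

# Collect the labels that fall into each of the four cells.
crosstab = (
    labelled.groupby(["gdp_level", "population_level"])["label"]
    .agg(", ".join)
    .unstack(fill_value="")
)
print(crosstab)
```

Each cell lists the id-year observations that satisfy the corresponding combination of conditions; combinations without any observation are left empty.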
overview_tab
Generate a general overview of the data set based on the time and scope
conditions with overview_tab. The resulting data frame collapses the time condition for each id,
taking potential gaps in the time frame into account.
from overviewpy.overviewpy import overview_tab
import pandas as pd

# Example data: one row per id-year observation
data = {
    'id': ['RWA', 'RWA', 'RWA', 'GAB', 'GAB', 'FRA', 'FRA', 'BEL', 'BEL', 'ARG'],
    'year': [2022, 2023, 2021, 2023, 2020, 2019, 2015, 2014, 2013, 2002]
}
df = pd.DataFrame(data)

# Collapse the years observed for each id into a single time frame
df_overview = overview_tab(df=df, id='id', time='year')
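To make "collapsing the time condition while accounting for gaps" more concrete, here is a minimal pandas-only sketch of the idea. It is not overview_tab's implementation, and its output format may differ from what overview_tab returns; the helper collapse_years and the column name time_frame are hypothetical names introduced for this illustration.

```python
import pandas as pd


def collapse_years(years):
    """Collapse years into a string of ranges, keeping gaps visible.

    Hypothetical helper for illustration; not part of overviewpy.
    """
    ordered = sorted(set(int(y) for y in years))
    if not ordered:
        return ""
    ranges = []
    start = prev = ordered[0]
    for year in ordered[1:]:
        if year == prev + 1:
            # Still inside a run of consecutive years.
            prev = year
            continue
        # A gap was found: close the current run and start a new one.
        ranges.append((start, prev))
        start = prev = year
    ranges.append((start, prev))
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


data = {
    "id": ["RWA", "RWA", "RWA", "GAB", "GAB", "FRA", "FRA", "BEL", "BEL", "ARG"],
    "year": [2022, 2023, 2021, 2023, 2020, 2019, 2015, 2014, 2013, 2002],
}
df = pd.DataFrame(data)

# One row per id, with the observed years collapsed into ranges.
sketch = (
    df.groupby("id")["year"]
    .apply(collapse_years)
    .reset_index(name="time_frame")
)
print(sketch)
```

In this sketch, consecutive years such as 2021, 2022, 2023 collapse into a single range, while non-consecutive years such as 2020 and 2023 stay separate so that the gap remains visible.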
overview_na
overview_na is a simple function that provides information about the
content of all variables in your data, not only the time and scope
conditions. It returns a horizontal ggplot bar plot that indicates the
amount of missing data (NAs) for each variable (on the y-axis). You can
choose whether to display the relative amount of NAs for each variable
in percentage (the default) or the total number of NAs.
from overviewpy.overviewpy import overview_na
import pandas as pd
import numpy as np

# Example data with missing values (np.nan) in both columns
data_na = {
    'id': ['RWA', 'RWA', 'RWA', np.nan, 'GAB', 'GAB', 'FRA', 'FRA', 'BEL', 'BEL', 'ARG', np.nan, np.nan],
    'year': [2022, 2001, 2000, 2023, 2021, 2023, 2020, 2019, np.nan, 2015, 2014, 2013, 2002]
}
df_na = pd.DataFrame(data_na)

# Plot the amount of missing data for each variable
overview_na(df_na)
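The two quantities described above, the relative and the total amount of NAs per variable, can also be computed directly with pandas when a numeric summary is needed alongside the plot. This is a small sketch under that assumption, not a description of how overview_na works internally; the column names na_total and na_percent are hypothetical.

```python
import numpy as np
import pandas as pd

data_na = {
    "id": ["RWA", "RWA", "RWA", np.nan, "GAB", "GAB", "FRA", "FRA",
           "BEL", "BEL", "ARG", np.nan, np.nan],
    "year": [2022, 2001, 2000, 2023, 2021, 2023, 2020, 2019,
             np.nan, 2015, 2014, 2013, 2002],
}
df_na = pd.DataFrame(data_na)

# Total number of missing values per variable.
na_total = df_na.isna().sum()

# Share of missing values per variable, in percent.
na_percent = df_na.isna().mean() * 100

summary = pd.DataFrame(
    {"na_total": na_total, "na_percent": na_percent.round(1)}
)
print(summary)
```

Here id has 3 missing values out of 13 rows and year has 1, which corresponds to roughly 23.1 and 7.7 percent.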
Contributing
Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.
License
overviewpy was created by Cosima Meyer. It is licensed under the terms of the BSD 3-Clause license.
Credits
overviewpy was created with cookiecutter and the py-pkgs-cookiecutter template.