A package for streamlining EDA processes for basic Data Analysis

Project description

gdphelper

This package takes the URL of any of the several dozen GDP-related CSV datasets on the Canadian Government Open Data Portal and downloads, cleans, loads, summarizes, and visualizes the data it contains.

It contains 4 functions:

gdpimporter: Downloads the zipped data, extracts, renames the appropriate csv, and returns a dataframe along with the title from the meta data.
gdpcleaner: Loads the data, removes spurious columns, renames the columns that are used, and scrubs any data issues. Returns a basic data frame and some category flags.
gdpdescribe: Evaluates the data category and generates summary statistics by year, region, industry, etc.
gdpplotter: Generates a set of visualizations of the data set according to the user's choices.
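
The download-and-extract step described for gdpimporter can be sketched with the standard library alone. This is a minimal illustration, not the package's actual implementation: the helper name read_data_csv_from_zip is hypothetical, and the sketch assumes the archive holds one data CSV plus a metadata CSV whose file name contains "MetaData".

```python
import csv
import io
import urllib.request
import zipfile


def download(url):
    """Fetch the raw bytes of a zipped dataset (not called in this sketch)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def read_data_csv_from_zip(zip_bytes):
    """Return the rows of the first non-metadata CSV found in a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Assumption: the metadata file is identified by "metadata" in its name.
        csv_names = [
            name for name in archive.namelist()
            if name.lower().endswith(".csv") and "metadata" not in name.lower()
        ]
        if not csv_names:
            raise ValueError("no data CSV found in archive")
        with archive.open(csv_names[0]) as raw:
            # utf-8-sig drops a leading byte-order mark if one is present.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))
```

Calling read_data_csv_from_zip(download(URL)) would return the table as a list of dictionaries. The package itself returns a pandas data frame, so treat this only as an outline of the steps.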

This package is built on several popular packages in the Python ecosystem, including zipfile, matplotlib, and pandas. What sets it apart is that it combines their common functionality into a single workflow, from downloading the data to performing simple EDA, tailored to the GDP-related data from the Canadian Government Open Data Portal.

Installation

$ pip install gdphelper

Usage

from gdphelper import gdpimporter
from gdphelper import gdpcleaner
from gdphelper import gdpdescribe
from gdphelper import gdpplotter

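# Zipped CSV table from the Canadian Government Open Data Portal
URL = "https://www150.statcan.gc.ca/n1/tbl/csv/36100400-eng.zip"
# Download and extract the data; returns a dataframe and the title from the metadata
data_frame, title = gdpimporter.gdpimporter(URL)
# Remove spurious columns, rename the used columns, and scrub data issues
clean_frame = gdpcleaner.gdpcleaner(data_frame)
# Summary statistics of "Value" grouped by "Location", rounded to 2 decimals
gdpdescribe.gdpdescribe(clean_frame, "Value", "Location", stats=["mean", "median", "sd", "min", "max", "range_"], dec=2)
# Generate visualizations of the cleaned data
gdpplotter.gdpplotter(clean_frame)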
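
To make the statistics requested in the gdpdescribe call concrete, here is a dependency-free sketch of a grouped summary. It is an illustration only, not the package's code: the function describe_by_group is hypothetical, the statistic names are borrowed from the call above, and "sd" is assumed to mean the sample standard deviation.

```python
import statistics
from collections import defaultdict


def describe_by_group(rows, value_key, group_key, dec=2):
    """Summarize value_key for each distinct group_key in a list of dict rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(float(row[value_key]))

    summary = {}
    for group, values in groups.items():
        summary[group] = {
            "mean": round(statistics.mean(values), dec),
            "median": round(statistics.median(values), dec),
            # The sample standard deviation needs at least two observations.
            "sd": round(statistics.stdev(values), dec) if len(values) > 1 else None,
            "min": round(min(values), dec),
            "max": round(max(values), dec),
            "range_": round(max(values) - min(values), dec),
        }
    return summary
```

For example, describe_by_group(rows, "Value", "Location") returns one dictionary of statistics per location. With pandas, the same idea is usually written with groupby and agg.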

For more detailed documentation, see: https://gdphelper.readthedocs.io/en/latest/

Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

gdphelper was created by Aldo Barros, Gabriel Fairbrother, Ramiro Mejia, and Wanying Ye. It is licensed under the terms of the MIT license.

Credits

gdphelper was created with cookiecutter and the py-pkgs-cookiecutter template.

Download files

Source Distribution

gdphelper-1.1.11.tar.gz (8.0 kB view details)

Uploaded Source

Built Distribution

gdphelper-1.1.11-py3-none-any.whl (8.7 kB view details)

Uploaded Python 3

File details

Details for the file gdphelper-1.1.11.tar.gz.

File metadata

  • Download URL: gdphelper-1.1.11.tar.gz
  • Upload date:
  • Size: 8.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.62.3 importlib-metadata/4.10.1 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.10

File hashes

Hashes for gdphelper-1.1.11.tar.gz
Algorithm Hash digest
SHA256 12afa16b57b94e0be45d890ef1ac04a6e450af782dbe8af58f168b0077833c5e
MD5 45242c36e6e7b5f56c8d0fc3da00f164
BLAKE2b-256 bfa82ee986e88174c9c462a45af977299472c58d3e6599c190a3be885f46bfde

File details

Details for the file gdphelper-1.1.11-py3-none-any.whl.

File metadata

  • Download URL: gdphelper-1.1.11-py3-none-any.whl
  • Upload date:
  • Size: 8.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.62.3 importlib-metadata/4.10.1 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.10

File hashes

Hashes for gdphelper-1.1.11-py3-none-any.whl
Algorithm Hash digest
SHA256 df807e2a65e994f96cb099500969f732f9d3adf5bbcefd1a93e279a1f661c602
MD5 6a54e3dc03f1754134caf755b7614511
BLAKE2b-256 79c89417b28c9b55843454ba077a943bbc74e7c376f1bef111494cb0ad56deef
