Dataprep: Data Preparation in Python

Documentation | Mail List & Forum

Dataprep lets you prepare your data using a single library with a few lines of code.

Currently, you can use dataprep to:

  • Collect data from common data sources (through dataprep.data_connector)
  • Do your exploratory data analysis (through dataprep.eda)
  • ...more modules are coming

Installation

pip install dataprep

Examples & Usage

The following examples give an impression of what dataprep can do:

EDA

Common tasks in the exploratory data analysis stage include taking a quick look at the distribution of each column and understanding the correlations between columns.

The EDA module groups these tasks into functions, so you can finish each one with a single function call.

  • Want to understand the distributions for each DataFrame column? Use plot.
  • Want to understand the correlation between columns? Use plot_correlation.
  • Or, if you want to understand the impact of the missing values for each column, use plot_missing.
  • You can drill down to get more information by giving plot, plot_correlation, or plot_missing a column name.
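The calls listed above can be sketched as a small helper. This is a minimal illustration, not part of dataprep: the helper name `explore` and its return format are hypothetical, and it assumes the column name is accepted as the argument right after the DataFrame, as the drill-down bullet describes.

```python
def explore(df, column=None):
    """Run the three EDA plots on df, optionally drilling into one column.

    `explore` is a hypothetical helper for illustration; only plot,
    plot_correlation and plot_missing come from dataprep.eda.
    """
    # Imported inside the function so the helper can be defined even
    # before dataprep has been installed.
    from dataprep.eda import plot, plot_correlation, plot_missing

    # Assumption: the argument right after df names the column to drill into.
    args = (df,) if column is None else (df, column)
    return {
        "distribution": plot(*args),
        "correlation": plot_correlation(*args),
        "missing": plot_missing(*args),
    }
```

Calling `explore(df)` gives the overview for every column, while `explore(df, "price")` drills into a single column (here `"price"` stands in for any column name in your own DataFrame).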

Don't forget to check out the examples folder for detailed demonstrations!

Data Connector

You can download Yelp business search results into a pandas DataFrame with two lines of code, without digging deep into the Yelp documentation!

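from dataprep.data_connector import Connector

# Create a connector for Yelp, authenticated with your access token.
dc = Connector("yelp", auth_params={"access_token": "<Your yelp access token>"})
# Search businesses; the result is returned as a pandas DataFrame.
df = dc.query("businesses", term="korean", location="seattle")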
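In practice you may prefer not to hard-code the access token. The sketch below wraps the same two calls and reads the token from an environment variable. The helper names and the variable name `YELP_ACCESS_TOKEN` are assumptions made for illustration, not part of dataprep.

```python
import os


def yelp_auth_params(env_var="YELP_ACCESS_TOKEN"):
    """Build the auth_params dict from an environment variable.

    The variable name is a hypothetical convention, not defined by dataprep.
    """
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set {env_var} to your Yelp access token")
    return {"access_token": token}


def search_businesses(term, location):
    """Query Yelp businesses with the same calls as the example above."""
    # Imported inside the function so the helpers can be defined even
    # before dataprep has been installed.
    from dataprep.data_connector import Connector

    dc = Connector("yelp", auth_params=yelp_auth_params())
    return dc.query("businesses", term=term, location=location)
```

With the token exported in your shell, `search_businesses("korean", "seattle")` mirrors the query shown above.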

Contribution

Dataprep is in its early stages. Contributions of any kind are greatly appreciated, including:

  • Filing an issue
  • Providing use cases
  • Writing down your user experience
  • Submitting a PR
  • ...

Please take a look at our wiki for development documentation!
