dproc

A convenient data flow to preprocess data using metadata.

Install

pip install dproc

How to use

Import

import pandas as pd  # needed for pd.read_excel below
from dproc import *

Load the definition file

Load the data definition from the location of your choice (a local file, a server, or cloud storage).

dproc.meta.definition = pd.read_excel('your-data-definition-file')

This file contains all the meta information about your data.
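
The exact layout of the definition file is determined by dproc and is not specified in this description. As a purely hypothetical illustration, a definition table could hold one row per column of each entity. Every column name below (entity, raw_name, name, dtype, needed) is an assumption, not dproc's documented schema:

```python
import pandas as pd

# Hypothetical definition table: one row per column of each entity.
# The column names are illustrative assumptions, not dproc's schema.
definition = pd.DataFrame(
    {
        "entity": ["customer", "customer", "customer", "order"],
        "raw_name": ["ID", "Created At", "Notes", "ID"],
        "name": ["id", "created", "notes", "id"],
        "dtype": ["int64", "datetime64[ns]", "object", "int64"],
        "needed": [True, True, False, True],
    }
)

# Rows describing a single entity can be selected with a boolean mask.
customer_definition = definition.loc[definition["entity"] == "customer"]

print(customer_definition)
```

A table like this could be saved as a spreadsheet and loaded back with pd.read_excel, as in the line above.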

In order to generate a specific entity definition ...

dproc.meta.entity = 'your-entity'

and then you can apply the dataflow steps:

entity_cleaned = (entity_raw
                  .step_rename_cols()
                  .step_replace_missing_with_nan()
                  .step_remove_not_needed_cols()
                  .step_remove_rows_with_missing_ids()
                  .step_remove_duplicate_rows()
                  .step_format_dates(cols=['created'])
                  .step_format_dates(cols=['modified'])
                  .step_format_round_numeric_cols(cols=['rating'], decimal_places=2)
                  .step_change_dtypes()
                 )
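
To make the kind of work these steps do more concrete, here is a minimal pandas-only sketch of a comparable metadata-driven pipeline. It does not use dproc and is not its implementation: the helper functions, the metadata columns (raw_name, name, needed, is_id) and the sample data are all hypothetical, and the steps are chained with pandas' DataFrame.pipe.

```python
import numpy as np
import pandas as pd

# Hypothetical metadata for one entity; dproc's real schema may differ.
entity_def = pd.DataFrame(
    {
        "raw_name": ["ID", "Created", "Rating", "Notes"],
        "name": ["id", "created", "rating", "notes"],
        "needed": [True, True, True, False],
        "is_id": [True, False, False, False],
    }
)


def rename_cols(df, meta):
    """Rename raw column names to the names given in the metadata."""
    return df.rename(columns=dict(zip(meta["raw_name"], meta["name"])))


def replace_missing_with_nan(df, markers=("", "N/A", "null")):
    """Turn common placeholder strings into NaN."""
    return df.replace(list(markers), np.nan)


def remove_not_needed_cols(df, meta):
    """Keep only the columns flagged as needed in the metadata."""
    keep = set(meta.loc[meta["needed"], "name"])
    return df[[col for col in df.columns if col in keep]]


def remove_rows_with_missing_ids(df, meta):
    """Drop rows whose ID columns are missing."""
    id_cols = list(meta.loc[meta["is_id"], "name"])
    return df.dropna(subset=id_cols)


def format_dates(df, cols):
    """Parse the given columns as datetimes."""
    out = df.copy()
    for col in cols:
        out[col] = pd.to_datetime(out[col], errors="coerce")
    return out


def round_numeric_cols(df, cols, decimal_places):
    """Convert the given columns to numbers and round them."""
    out = df.copy()
    for col in cols:
        out[col] = pd.to_numeric(out[col], errors="coerce").round(decimal_places)
    return out


# Made-up raw data: one row has a missing ID, one row is a duplicate.
entity_raw = pd.DataFrame(
    {
        "ID": ["1", "2", "", "2"],
        "Created": ["2020-01-05", "2020-02-10", "2020-03-01", "2020-02-10"],
        "Rating": ["4.456", "3.1", "5", "3.1"],
        "Notes": ["a", "b", "c", "b"],
    }
)

entity_cleaned = (
    entity_raw
    .pipe(rename_cols, entity_def)
    .pipe(replace_missing_with_nan)
    .pipe(remove_not_needed_cols, entity_def)
    .pipe(remove_rows_with_missing_ids, entity_def)
    .drop_duplicates()
    .pipe(format_dates, cols=["created"])
    .pipe(round_numeric_cols, cols=["rating"], decimal_places=2)
    .reset_index(drop=True)
)

print(entity_cleaned)
```

Keeping each step as a small function that takes and returns a DataFrame is what makes the chain readable: steps can be reordered, removed, or tested on their own.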
