dproc
A convenient data flow to preprocess data using metadata.
Install
pip install dproc
How to use
Import
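import pandas as pd  # needed for pd.read_excel below
import dproc  # binds the name used by dproc.meta below
from dproc import *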
Load the definition file
Load the data definition from the location of your choice (local disk, server, or cloud).
dproc.meta.definition = pd.read_excel('your-data-definition-file')
This file contains all meta information such as
In order to generate a specific entity definition ...
dproc.meta.entity = 'your-entity'
and then you can apply the dataflow steps:
entity_cleaned = (entity_raw
    .step_rename_cols()
    .step_replace_missing_with_nan()
    .step_remove_not_needed_cols()
    .step_remove_rows_with_missing_ids()
    .step_remove_duplicate_rows()
    .step_format_dates(cols=['created'])
    .step_format_dates(cols=['modified'])
    .step_format_round_numeric_cols(cols=['rating'], decimal_places=2)
    .step_change_dtypes()
)
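To give a feel for what a metadata-driven flow like the one above might do, here is a rough sketch in plain pandas. It is not dproc's implementation. The layout of the definition table (the raw_name, name, keep, is_id and dtype columns), the helper functions, and the sample data are all assumptions made up for this illustration, and each helper is only a guess at the behaviour suggested by the corresponding step name.

```python
import numpy as np
import pandas as pd

# Hypothetical metadata table: one row per raw column of the entity.
# These column names are assumptions for this sketch, not dproc's format.
definition = pd.DataFrame({
    "raw_name": ["ID", "Created", "Modified", "Rating", "Junk"],
    "name": ["id", "created", "modified", "rating", "junk"],
    "keep": [True, True, True, True, False],
    "is_id": [True, False, False, False, False],
    "dtype": ["int64", "datetime64[ns]", "datetime64[ns]", "float64", "object"],
})

# Made-up raw data: one duplicated row and one row without an id.
entity_raw = pd.DataFrame({
    "ID": ["1", "2", "2", ""],
    "Created": ["2020-01-01", "2020-01-02", "2020-01-02", "2020-01-03"],
    "Modified": ["2020-02-01", "2020-02-02", "2020-02-02", "2020-02-03"],
    "Rating": ["4.567", "3.214", "3.214", "5"],
    "Junk": ["a", "b", "b", "c"],
})

def rename_cols(df, meta):
    # Map raw column names to the names given in the metadata.
    return df.rename(columns=dict(zip(meta["raw_name"], meta["name"])))

def replace_missing_with_nan(df):
    # Treat empty or whitespace-only strings as missing values.
    return df.replace(r"^\s*$", np.nan, regex=True)

def remove_not_needed_cols(df, meta):
    # Keep only the columns flagged as needed in the metadata.
    return df[meta.loc[meta["keep"], "name"].tolist()]

def remove_rows_with_missing_ids(df, meta):
    # Drop rows where any column flagged as an id is missing.
    return df.dropna(subset=meta.loc[meta["is_id"], "name"].tolist())

def format_dates(df, cols):
    # Parse the given columns as datetimes.
    return df.assign(**{c: pd.to_datetime(df[c]) for c in cols})

def round_numeric_cols(df, cols, decimal_places):
    # Convert the given columns to numbers and round them.
    return df.assign(
        **{c: pd.to_numeric(df[c]).round(decimal_places) for c in cols}
    )

def change_dtypes(df, meta):
    # Cast every remaining column to the dtype named in the metadata.
    dtypes = meta.set_index("name")["dtype"]
    return df.astype({c: dtypes[c] for c in df.columns})

entity_cleaned = (entity_raw
    .pipe(rename_cols, definition)
    .pipe(replace_missing_with_nan)
    .pipe(remove_not_needed_cols, definition)
    .pipe(remove_rows_with_missing_ids, definition)
    .drop_duplicates()
    .pipe(format_dates, cols=["created", "modified"])
    .pipe(round_numeric_cols, cols=["rating"], decimal_places=2)
    .pipe(change_dtypes, definition)
)
print(entity_cleaned)
```

The sketch chains the helpers with DataFrame.pipe so that it reads in the same order as the step_ calls above. In this layout every rule (which columns to keep, which ones are ids, which dtypes to cast to) lives in the definition table, so cleaning a different entity would mean swapping the table rather than editing the code.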