pywasher

General Information

Pywasher will make it easier to clean data and prepare it for analysis.

If you wish to use this cleaner locally for testing purposes, you can install it using:

pip install pywasher

Then import it in your code:

import pywasher as pw

Now the interface is accessible in your code by prefixing with 'pw'.

Exposed Classes

This section describes all the available functions of the module.

Column based

explore_datatypes

The explore_datatypes function returns the index, datatype, column name and pandas datatype for each column in the dataframe. The difference between the two is that the datatype ignores empty values and gives a more general datatype (string, number, datetime, boolean, list), whereas the pandas datatype is the dtype pandas assigned to the column.

df.pw.explore_datatypes
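
To illustrate the difference between a general datatype and the pandas datatype, here is a minimal sketch that uses plain pandas only. It is not pywasher's implementation: the sample dataframe and the `overview` table are made up for illustration, and the labels produced by `infer_dtype` ("string", "floating", "boolean") are pandas' own and differ from the categories listed above.

```python
import pandas as pd

# Hypothetical sample data; every column contains an empty value
df = pd.DataFrame({
    "name": ["Ann", None, "Cor"],
    "age": [31, None, 27],
    "active": [True, False, None],
})

# infer_dtype with skipna=True ignores empty values when guessing the type,
# while the dtype attribute reports what pandas actually stores
overview = pd.DataFrame({
    "column": list(df.columns),
    "general_type": [pd.api.types.infer_dtype(df[c], skipna=True) for c in df.columns],
    "pandas_dtype": [str(df[c].dtype) for c in df.columns],
})
print(overview)
```

The "active" column shows why the two can differ: pandas stores it as a generic object column because of the empty value, while the inferred type is still boolean.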

column_merge

The column_merge function merges all the given columns. Where the first column is empty, the value from the other columns is filled in. If delete is True, the columns that were merged in are deleted afterwards.

df.pw.column_merge([columns], delete = False)
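
The merge behaviour described above can be approximated with plain pandas. This is a sketch under assumptions, not pywasher's code: it assumes the first column in the list receives the merged values and that columns are consulted from left to right. The column names and the data are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with gaps in the first two columns
df = pd.DataFrame({
    "phone": ["0611", None, None],
    "mobile": [None, "0622", None],
    "work": ["0699", "0688", "0633"],
})

columns = ["phone", "mobile", "work"]
delete = True  # mirrors the delete argument described above

# Fill gaps in the first column with values from the following columns,
# taking the first non-empty value found from left to right
merged = df[columns[0]]
for col in columns[1:]:
    merged = merged.combine_first(df[col])
df[columns[0]] = merged

# Optionally drop the columns that were merged into the first one
if delete:
    df = df.drop(columns=columns[1:])

print(df)
```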

column_to_numeric

The column_to_numeric function checks whether all the values in the given columns can be converted to a float or an integer. If so, it converts every value in the column to an int or float. If force is True, every cell that cannot be converted to a number is changed to NA. It returns a dataframe in which the values of the given columns are numbers.

df.pw.column_to_numeric([columns], force = False)
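
The conversion rule can be sketched with `pandas.to_numeric`. The helper name `to_numeric_sketch` is hypothetical and the logic is an assumption about the behaviour described above, not the library's implementation: without force a column is converted only when every filled cell parses as a number, and with force the cells that fail become NaN.

```python
import pandas as pd

def to_numeric_sketch(df, columns, force=False):
    """Return a copy of df with the given columns converted to numbers."""
    out = df.copy()
    for col in columns:
        # errors="coerce" turns every value that cannot be parsed into NaN
        converted = pd.to_numeric(out[col], errors="coerce")
        # Cells that held a value but could not be parsed as a number
        failed = converted.isna() & out[col].notna()
        if force or not failed.any():
            out[col] = converted
    return out

# Hypothetical sample data: "amount" holds one value that is not a number
df = pd.DataFrame({"price": ["1.5", "2", "3"], "amount": ["4", "five", "6"]})

strict = to_numeric_sketch(df, ["price", "amount"])
forced = to_numeric_sketch(df, ["price", "amount"], force=True)
print(strict)
print(forced)
```

In `strict` only "price" is converted, because "amount" contains a cell that is not a number. In `forced` that cell becomes NaN and the rest of "amount" is numeric.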

explore_column_names

The explore_column_names function shows how the column names will be changed when they are sent to the Clappform database.

df.pw.explore_columnnames

replace_double_column_names

The replace_double_column_names function adds numbers to column names that occur more than once in the dataframe.

df.pw.replace_double_column_names
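
As an illustration of numbering repeated column names, here is a small sketch in plain Python. The helper `number_duplicates` is hypothetical, and the `_1`, `_2` suffix format is an assumption; the text above does not state which format pywasher uses.

```python
def number_duplicates(names):
    """Append a running number to every repeated occurrence of a name."""
    seen = {}
    result = []
    for name in names:
        count = seen.get(name, 0)
        seen[name] = count + 1
        # The first occurrence keeps its name, later ones get a suffix
        result.append(name if count == 0 else f"{name}_{count}")
    return result

renamed = number_duplicates(["id", "name", "id", "id"])
print(renamed)  # ['id', 'name', 'id_1', 'id_2']
```

With pandas the result could be assigned back through `df.columns = number_duplicates(df.columns)`. The sketch does not guard against a generated name colliding with an existing column such as "id_1".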

sorting

The sorting function reorders the columns alphabetically.

df.pw.sorting()

explore_double

The explore_double function shows all the double (duplicate) columns in a dataframe.

df.pw.explore_double()
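
The text does not say whether "double" refers to repeated column names or to repeated column contents. The sketch below assumes repeated names and uses plain pandas to select them; the sample data is hypothetical and this is not pywasher's own code.

```python
import pandas as pd

# Hypothetical sample data in which the name "id" occurs twice
df = pd.DataFrame([[1, "a", 2], [3, "b", 4]], columns=["id", "name", "id"])

# keep=False marks every occurrence of a repeated name, not only the later ones
is_double = df.columns.duplicated(keep=False)
doubles = df.loc[:, is_double]
print(doubles)
```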

cleaning

The cleaning function cleans the dataframe. It removes double spaces, replaces spaces with underscores in the columns and makes sure the column names are valid variable names for JavaScript.

df.pw.cleaning()
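
The column name part of this cleaning step might look like the sketch below. The helper `clean_name` is hypothetical and its rules are assumptions: it keeps only ASCII letters, digits, `_` and `$`, and prefixes an underscore when a name would start with a digit. JavaScript itself also allows Unicode letters and forbids reserved words, which this sketch does not handle.

```python
import re

def clean_name(name):
    """Turn a column name into a simple ASCII JavaScript identifier."""
    # Collapse repeated whitespace into a single space
    name = re.sub(r"\s+", " ", name.strip())
    # Replace the remaining spaces with underscores
    name = name.replace(" ", "_")
    # Drop every character outside the ASCII identifier set
    name = re.sub(r"[^0-9A-Za-z_$]", "", name)
    # An identifier cannot be empty or start with a digit
    if not name or name[0].isdigit():
        name = "_" + name
    return name

cleaned = [clean_name(n) for n in ["first  name", "2nd value", "price (€)"]]
print(cleaned)  # ['first_name', '_2nd_value', 'price_']
```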

Cell based

list_splitter

The list_splitter function returns a dataframe in which every value found in the chosen columns gets its own column. These new columns contain True or False, depending on whether the value occurs in the chosen column for that row. The input is a list with the names of the columns that need to be split; the output is a modified dataframe.

df.pw.list_splitter([columns])
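
A possible reading of this behaviour is sketched below with plain pandas. The helper `split_lists`, the `column_value` naming of the new columns and the choice to keep the original list column are all assumptions for illustration; the text does not specify them.

```python
import pandas as pd

def split_lists(df, columns):
    """Add one True/False column per distinct value in the given list columns."""
    out = df.copy()
    for col in columns:
        # Collect every distinct value that occurs in any list of this column
        values = sorted({v for cell in out[col] for v in cell})
        for v in values:
            # True where the row's list contains the value, False otherwise
            out[f"{col}_{v}"] = out[col].apply(lambda cell, v=v: v in cell)
    return out

# Hypothetical sample data with a column of lists
df = pd.DataFrame({
    "id": [1, 2, 3],
    "tags": [["red", "blue"], ["blue"], []],
})

result = split_lists(df, ["tags"])
print(result)
```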

