pywasher

General Information

Pywasher will make it easier to clean data and prepare it for analysis.

If you wish to use this cleaner locally for testing purposes, you can install it using:

pip install pywasher

Then import it in your code:

import pywasher as pw

Now the interface is accessible in your code by prefixing with 'pw'.

Exposed Classes

In this section all the available functions of the module will be described.

Column based

explore_datatypes

The explore_datatypes function returns the index, column name, datatype and pandas datatype for each column in the dataframe. Unlike the pandas datatype, the datatype reported here ignores empty values and is more general: string, number, datetime, boolean or list.

df.pw.explore_datatypes
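The difference between a general datatype and a storage datatype can be illustrated with a minimal sketch on a plain Python list. This is not pywasher's implementation: the helper name `general_datatype`, the set of values treated as empty, the `None` result for an all-empty column, and the fallback to "string" for mixed columns are all assumptions made for illustration.

```python
from datetime import datetime


def is_empty(value):
    """Treat None, the empty string and NaN as empty cells (assumed rule)."""
    if value is None:
        return True
    if isinstance(value, str):
        return value == ""
    # NaN is the only float that is not equal to itself.
    return isinstance(value, float) and value != value


def general_datatype(values):
    """Return a coarse datatype name for one column, ignoring empty cells."""
    present = [v for v in values if not is_empty(v)]
    if not present:
        return None  # nothing to inspect (assumed behaviour)
    # bool is checked first because bool is a subclass of int.
    if all(isinstance(v, bool) for v in present):
        return "boolean"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return "number"
    if all(isinstance(v, datetime) for v in present):
        return "datetime"
    if all(isinstance(v, list) for v in present):
        return "list"
    return "string"  # fallback for text and mixed columns (assumed)


column = [1, None, 2.5]
print(general_datatype(column))
```

A column such as `[1, None, 2.5]` is typically stored by pandas as a float or object column, while the rule above simply reports it as a number.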

column_merge

The column_merge function merges the given columns into the first one. Where the first column is empty, it takes the value from the other given columns. If delete is True, the columns that were merged into the first column are deleted.

df.pw.column_merge([columns], delete = False)
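The merge rule described above can be sketched without pandas, using a dict that maps column names to lists of cells. This is an illustration only, not the library's code: the helper name `merge_columns`, the use of `None` for an empty cell, and the left-to-right priority among the remaining columns are assumptions.

```python
def merge_columns(table, columns, delete=False):
    """Fill empty cells of the first listed column from the other listed columns."""
    target, *others = columns
    merged = list(table[target])
    for name in others:
        for i, value in enumerate(table[name]):
            # Only fill cells that are still empty; earlier columns win.
            if merged[i] is None and value is not None:
                merged[i] = value
    # Copy the table so the caller's data is not modified.
    result = {key: list(cells) for key, cells in table.items()}
    result[target] = merged
    if delete:
        for name in others:
            result.pop(name, None)
    return result


table = {
    "phone": ["1", None, None],
    "mobile": [None, "2", None],
    "work": ["9", "8", "3"],
}
print(merge_columns(table, ["phone", "mobile", "work"], delete=True))
```

With delete left at False, the "mobile" and "work" columns in this example would remain in the result next to the filled "phone" column.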

column_to_numeric

The column_to_numeric function checks whether all values in the given columns can be converted to a float or an integer. If so, it converts every value in the column to an int or float. If force is True, every cell that cannot be converted to a number is set to NA. It returns a dataframe in which the values of the given columns have been converted to numbers.

df.pw.column_to_numeric([columns], force = False)
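The all-or-nothing conversion and the effect of force can be sketched on a single column held as a plain list. The names `to_number` and `to_numeric_column` are hypothetical, `None` stands in for NA, and the choice to return the column unchanged when a cell fails without force is an assumption about the described behaviour.

```python
def to_number(value):
    """Convert one cell to int or float; raise ValueError or TypeError if impossible."""
    if isinstance(value, (int, float)):
        return value  # already numeric, keep as is
    try:
        return int(value)
    except ValueError:
        # Not an integer literal such as "3"; try a float literal such as "2.5".
        return float(value)


def to_numeric_column(values, force=False):
    """Convert a whole column to numbers, or leave it untouched if that fails."""
    converted = []
    for value in values:
        try:
            converted.append(to_number(value))
        except (TypeError, ValueError):
            if not force:
                # One cell is not numeric: return the original column unchanged.
                return list(values)
            converted.append(None)  # None stands in for NA here
    return converted


print(to_numeric_column(["1", "2.5", "3"]))
print(to_numeric_column(["1", "x"]))
print(to_numeric_column(["1", "x"], force=True))
```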

explore_column_names

The explore_columnnames function shows how the column names will be changed when they are sent to the Clappform database.

df.pw.explore_columnnames

replace_double_column_names

The replace_double_columnnames function adds numbers to column names that occur more than once in the dataframe.

df.pw.replace_double_columnnames
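One way to number repeated column names is sketched below on a plain list of names. The numbering scheme is an assumption: the source does not say which suffix format pywasher uses, so the `_1`, `_2` suffixes and the helper name `number_duplicate_names` are hypothetical.

```python
def number_duplicate_names(names):
    """Keep the first occurrence of each name and add a numeric suffix to repeats."""
    seen = set()
    counts = {}
    result = []
    for name in names:
        candidate = name
        # Keep counting up until the candidate does not clash with a name already used.
        while candidate in seen:
            counts[name] = counts.get(name, 0) + 1
            candidate = f"{name}_{counts[name]}"
        seen.add(candidate)
        result.append(candidate)
    return result


print(number_duplicate_names(["id", "name", "id", "id"]))
```

The loop also covers the case where a suffixed name such as "id_1" is already present as a column of its own.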

Cell based

list_splitter

The list_splitter function returns a dataframe in which every value found in the chosen columns gets its own column. These new columns contain True or False, depending on whether the value occurs in the chosen column for that row. The input is a list with the names of the columns that need to be split; the output is a modified dataframe.

df.pw.list_splitter([columns])
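The splitting idea can be sketched on a dict that maps column names to lists of cells, where the chosen column holds a list per row. This is a rough illustration, not pywasher's code: the helper name `split_list_column`, the `column_value` naming of the new columns, and the removal of the original column are assumptions.

```python
def split_list_column(table, column):
    """Replace a column of lists by one True/False column per distinct value."""
    # Treat empty cells as empty lists.
    cells = [cell or [] for cell in table[column]]
    labels = sorted({item for cell in cells for item in cell})
    # Copy every other column; the split column itself is dropped (assumed).
    result = {key: list(values) for key, values in table.items() if key != column}
    for label in labels:
        result[f"{column}_{label}"] = [label in cell for cell in cells]
    return result


table = {"id": [1, 2, 3], "tags": [["red", "blue"], ["blue"], None]}
print(split_list_column(table, "tags"))
```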

Source Distribution

pywasher-1.0.1.tar.gz (4.8 kB)
