Automating Data Preprocessing

Automating Data Preprocessing (ADP for short) is now a Python library. You can install it with the following command:

pip install autodatap

This installs the package on your system.

Purpose of autodatap:

  • to help you in data preprocessing

How to use it:

  • import the package

import autodatap as adp

The main function in the autodatap package is mainMethod, so call it with a link to your dataset:

adp.mainMethod("link to data set")

That's it; you are good to go.

From here on, all interaction happens in the console (run).

Currently supported functions

  • Categorical Values (One-Hot-Encoding)

  • Normalization

  • Check for Imbalanced Data

  • Finding null values and filling them with 0 (filling with the mean is planned)

  • Dropping duplicates
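The steps listed above can be sketched in plain Python. This is only an illustration of the ideas, not autodatap's implementation: the helper names (fill_nulls, drop_duplicates, min_max_normalize, class_counts) and the sample rows are hypothetical, and min-max scaling is an assumption because the kind of normalization is not specified here.

```python
from collections import Counter

def fill_nulls(rows, value=0):
    """Replace None entries with `value` (0, as in the list above)."""
    return [{k: (value if v is None else v) for k, v in row.items()} for row in rows]

def drop_duplicates(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def min_max_normalize(values):
    """Scale numbers to the range [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def class_counts(labels):
    """Count rows per class; very uneven counts suggest imbalanced data."""
    return dict(Counter(labels))

# Hypothetical sample data: one missing age and one duplicated row.
rows = [
    {"age": 20, "label": "yes"},
    {"age": None, "label": "no"},
    {"age": 40, "label": "yes"},
    {"age": 40, "label": "yes"},
]

clean = drop_duplicates(fill_nulls(rows))
ages = min_max_normalize([row["age"] for row in clean])
counts = class_counts(row["label"] for row in clean)
```

Here the nulls are filled before duplicates are dropped, so rows that only become identical after filling are also removed. Whether the library applies the steps in this order is not stated.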

Categorical Values (One-Hot-Encoding):

Categorical values are values that fall into one of two or more classes, as in the example below.

Let's say we have a gender column, which is a categorical variable because it takes 2 or more values (male, female, etc.). Example 1:

gender
male
female
male
female

Most machine learning algorithms accept only numerical values and do not support strings, so we have to convert the strings to numerical values.

To achieve that, we have two (or more) options: either we can assign custom values with the replace function

data.replace("male",1,inplace=True)

or we can use a built-in technique such as label encoding or One-Hot-Encoding.

This library uses One-Hot-Encoding.

So the example above becomes:


gender_male  gender_female
1            0
0            1
1            0
0            1
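The encoding shown above can be reproduced with a short plain-Python sketch. It illustrates the technique only and is not the code autodatap runs: the function name one_hot_encode and the "<column>_<value>" naming scheme are assumptions chosen to match the example columns.

```python
def one_hot_encode(values, column):
    """Return one 0/1 list per distinct value, keyed as '<column>_<value>'."""
    categories = []
    for v in values:
        if v not in categories:  # keep categories in first-seen order
            categories.append(v)
    return {
        f"{column}_{cat}": [1 if v == cat else 0 for v in values]
        for cat in categories
    }

gender = ["male", "female", "male", "female"]
encoded = one_hot_encode(gender, "gender")

# For contrast, the custom-value approach maps each string to one number.
mapping = {"male": 1, "female": 0}  # hypothetical custom values
replaced = [mapping[v] for v in gender]
```

One-hot encoding produces one column per category, so no artificial ordering between categories is implied, whereas a custom numeric mapping keeps a single column.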

How to use:

To use this function, you have to write the exact column name at the preprocessing step. For example, given the columns

[u'name', u'age', u'class', u'code']

In the list above, the column name should be written as u'name'.

Licence

MIT License

Copyright (c) 2023 Syed Syab Ahmad

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Contribution

To contribute to the package, visit the following link:

https://github.com/SyabAhmad/Automating-Data-Preprocessing
