A Python package that automates data preprocessing.
Project description
DataFit
DataFit is a Python package developed for automating data preprocessing.
Note: The commit notes below are written manually, for the convenience of users.
Commit: Changes in descriptions
Note: This package is under development and is open source.
This package is developed by Syed Syab and Hamza Rustam as their Final Year Project at the University of Swat. Our information is given below.
About Project:
DataFit is a Python package developed for automating data preprocessing.
Project initialization date: 01/Oct/2023
Project finalization date: 01/Dec/2023 (expected)
Team Members:
```Professor Naeem Ullah: **Supervisor**```
Basic Information:
[https://facebook.com/Naeem-Munna?mibextid=PzaGJu]
[naeem@uswat.edu.pk]
================================
```Syed Syab: **Student** (Me) ```
Basic Information:
[https://github.com/SyabAhmad]
[https://linkedin.com/SyedSyab]
[syab.se@hotmail.com]
```Hamza Rustam: **Student**```
Basic Information:
[https://github.com/Hamza-Rustam]
[linkedin.com/hamza-rustam-845a2b209]
[hs4647213@gmail.com]
This package is designed to be user-friendly, so that anyone can use it.
Its main purpose is to automate the data preprocessing step and make it easier for machine learning engineers and data scientists.
The current functionality of the package is:
- Displaying information
- Handling null values
- Deleting multiple columns
- Handling categorical values
- Normalization
- Standardization
- Extracting numeric values
- Tokenization
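
This README does not say which formulas the package uses for Normalization and Standardization. As a rough illustration only, the sketch below assumes the common readings: min-max scaling into [0, 1] and z-score scaling to zero mean and unit standard deviation. The function names `min_max_normalize` and `standardize` are hypothetical and are not part of DataFit.

```python
from statistics import mean, pstdev

# Hypothetical helpers for illustration; not DataFit functions.

def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every entry to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift values to zero mean and scale by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Avoid dividing by zero for a constant column.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

ages = [20, 30, 40]
print(min_max_normalize(ages))  # [0.0, 0.5, 1.0]
print(standardize(ages))        # symmetric around 0.0
```

Normalization keeps the values inside a fixed range, while standardization centres them on zero; which one suits a dataset depends on the model being trained.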
To install the package:

    pip install datafit

Using the package is quite simple: import it, much as you would import pandas, and then call its functions.

    from datafit import datafit as df

    # Check the information of the data
    # (`data` is assumed to be a pandas DataFrame loaded beforehand)
    df.information(data)

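
The exact output of `df.information` is not documented here. As a rough idea of the kind of per-column summary such a call could report, the sketch below counts rows, missing values, and value types for a small table held as a plain dictionary. The `summarize` function and the sample table are hypothetical and are not part of DataFit.

```python
# Hypothetical illustration; not DataFit's implementation of `information`.

def summarize(table):
    """Return per-column facts: row count, missing-value count, and value types seen."""
    summary = {}
    for name, values in table.items():
        present = [v for v in values if v is not None]
        summary[name] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "types": sorted({type(v).__name__ for v in present}),
        }
    return summary

table = {"name": ["Ali", None, "Sara"], "age": [21, 35, None]}
for column, facts in summarize(table).items():
    print(column, facts)
```

A summary like this is usually the first step before deciding how to handle null values or which columns to delete.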
To handle categorical values in specific columns:

    from datafit import datafit as df

    df.handleCategoricalValues(data, ["column1", "column2"])

If you do not want to name the columns and would rather process all columns, pass None in place of the column names.

    from datafit import datafit as df

    df.handleCategoricalValues(data, None)

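
This README does not state how categorical values are converted. The sketch below assumes simple integer label encoding and mirrors the convention described above, where `None` means "all applicable columns". The `encode_categorical` function and the sample table are hypothetical and are not DataFit's implementation.

```python
# Hypothetical illustration of label encoding; not DataFit's implementation.

def encode_categorical(table, columns=None):
    """Return a copy of `table` with the chosen columns replaced by integer codes.

    `table` maps column names to lists of values.
    When `columns` is None, every column containing a string is encoded.
    """
    if columns is None:
        columns = [name for name, values in table.items()
                   if any(isinstance(v, str) for v in values)]
    encoded = {name: list(values) for name, values in table.items()}
    for name in columns:
        # Assign codes in sorted order so the mapping is reproducible.
        codes = {label: i for i, label in enumerate(sorted(set(table[name])))}
        encoded[name] = [codes[v] for v in table[name]]
    return encoded

table = {
    "city": ["Swat", "Lahore", "Swat"],
    "size": ["S", "L", "M"],
    "age": [21, 35, 28],
}
print(encode_categorical(table, ["city"]))  # only "city" becomes integers
print(encode_categorical(table, None))      # "city" and "size" become integers
```

Integer codes imply an ordering that may not exist in the data, so one-hot encoding is sometimes a better fit for unordered categories.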
To extract numerical values from columns:

    from datafit import datafit as df

    df.extractValues(data, ["columns1", "columns2"])

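
How `extractValues` finds numbers is not described in this README. One plausible approach, sketched below under that assumption, is to pull the first number out of each cell with a regular expression, so that a value such as "72 kg" becomes 72.0. The names `extract_number` and `extract_values` are hypothetical and are not part of DataFit.

```python
import re

# Hypothetical illustration; not DataFit's implementation of `extractValues`.

# Matches an optional minus sign, digits, and an optional decimal part.
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")

def extract_number(text):
    """Return the first number found in `text` as a float, or None if there is none."""
    match = NUMBER.search(str(text))
    return float(match.group()) if match else None

def extract_values(table, columns):
    """Return a copy of `table` with the first number pulled out of each cell in `columns`."""
    result = {name: list(values) for name, values in table.items()}
    for name in columns:
        result[name] = [extract_number(v) for v in table[name]]
    return result

table = {
    "weight": ["72 kg", "65.5kg", "unknown"],
    "price": ["$10", "Rs. 250", "-3.5"],
}
cleaned = extract_values(table, ["weight", "price"])
print(cleaned["weight"])  # [72.0, 65.5, None]
print(cleaned["price"])   # [10.0, 250.0, -3.5]
```

Cells without any digits come back as `None` here, which leaves them to be dealt with as missing values afterwards.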
Note again: This package is under development. If it touches your heart, do share it and follow me on GitHub [https://github.com/SyabAhmad] and LinkedIn [https://linkedin.com/SyedSyab] for more interesting updates.
Source Distribution
Built Distribution
File details
Details for the file datafit-0.2023.2.12.tar.gz.
File metadata
- Download URL: datafit-0.2023.2.12.tar.gz
- Upload date:
- Size: 7.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8a7b5db073c7f2d393af9a00427798d1108d418887697c999a37b03033f5fe42
MD5 | ed26e50ee5ab0e08d12dc57822da2f8d
BLAKE2b-256 | 03d722b0da16598d2d81b71c8be5e7fcc3c769d4493af8a2297a6188a17d293f
File details
Details for the file datafit-0.2023.2.12-py3-none-any.whl.
File metadata
- Download URL: datafit-0.2023.2.12-py3-none-any.whl
- Upload date:
- Size: 7.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 55ff46dfb2935bb19b41c8020fda3677d4c5e43f9e58dde7f26fb3f4670fc33f
MD5 | a4a8ecdb74164f5e8fc72005f6a2bbe7
BLAKE2b-256 | 5b4ce87769acd12e39e7aaa354b75330ee9af94516c4baf195bc41e6c843a3cd