A Python package to handle missing values in the dataset
Project MISSING VALUES
Name Kriti Pandey
Roll no 101703292
Group 3COE13
DESCRIPTION
Data can have missing values for a number of reasons, such as observations that were not recorded or data corruption. Handling missing data is important because many machine learning algorithms do not support data with missing values.
Some typical reasons why data is missing:
- User forgot to fill in a field.
- Data was lost while transferring manually from a legacy database.
- There was a programming error.
- Users chose not to fill out a field tied to their beliefs about how the results would be used or interpreted.
Specifically, there are two steps to handle missing data:
- Mark invalid or corrupt values as missing in your dataset.
- Impute missing values with the mean value of the corresponding column.
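The two steps above can be sketched in plain Python. This is a minimal illustration, not the package's actual implementation: the function names `mark_missing` and `impute_mean`, the choice of `0` as the invalid marker, and the sample column are all assumptions made for the example.

```python
import math

def mark_missing(column, invalid=(0,)):
    """Step 1: replace values that signal bad data with NaN."""
    return [math.nan if value in invalid else value for value in column]

def impute_mean(column):
    """Step 2: fill NaN entries with the mean of the observed values."""
    observed = [v for v in column if not math.isnan(v)]
    if not observed:
        return list(column)  # nothing to average; leave the column unchanged
    mean = sum(observed) / len(observed)
    return [mean if math.isnan(v) else v for v in column]

# Hypothetical column in which 0 is not a plausible measurement.
raw = [72.0, 66.0, 0, 64.0, 0, 40.0]
marked = mark_missing(raw)
filled = impute_mean(marked)
print(filled)  # [72.0, 66.0, 60.5, 64.0, 60.5, 40.0]
```

Here the mean is taken over the observed values only, which is the conventional form of mean imputation.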
Installation
Use the package manager pip to install MissingValues_101703292.
pip install MissingValues_101703292
Usage
Enter the CSV filename, including the .csv extension:
MissingValues_101703292 data.csv
Sample dataset
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|
0 | 6 | 148.0 | 72.0 | 35.0 | NaN | 33.6 | 0.627 | 50 | 1 |
1 | 1 | 85.0 | 66.0 | 29.0 | NaN | 26.6 | 0.351 | 31 | 0 |
2 | 8 | 183.0 | 64.0 | NaN | NaN | 23.3 | 0.672 | 32 | 1 |
3 | 1 | 89.0 | 66.0 | 23.0 | 94.0 | 28.1 | 0.167 | 21 | 0 |
4 | 0 | 137.0 | 40.0 | 35.0 | 168.0 | 43.1 | 2.288 | 33 | 1 |
5 | 5 | 116.0 | 74.0 | NaN | NaN | 25.6 | 0.201 | 30 | 0 |
6 | 3 | 78.0 | 50.0 | 32.0 | 88.0 | 31.0 | 0.248 | 26 | 1 |
7 | 10 | 115.0 | NaN | NaN | NaN | 35.3 | 0.134 | 29 | 0 |
8 | 2 | 197.0 | 70.0 | 45.0 | 543.0 | 30.5 | 0.158 | 53 | 1 |
9 | 8 | 125.0 | 96.0 | NaN | NaN | NaN | 0.232 | 54 | 1 |
10 | 4 | 110.0 | 92.0 | NaN | NaN | 37.6 | 0.191 | 30 | 0 |
11 | 10 | 168.0 | 74.0 | NaN | NaN | 38.0 | 0.537 | 34 | 1 |
12 | 10 | 139.0 | 80.0 | NaN | NaN | 27.1 | 1.441 | 57 | 0 |
13 | 1 | 189.0 | 60.0 | 23.0 | 846.0 | 30.1 | 0.398 | 59 | 1 |
14 | 5 | 166.0 | 72.0 | 19.0 | 175.0 | 25.8 | 0.587 | 51 | 1 |
15 | 7 | 100.0 | NaN | NaN | NaN | 30.0 | 0.484 | 32 | 1 |
16 | 0 | 118.0 | 84.0 | 47.0 | 230.0 | 45.8 | 0.551 | 31 | 1 |
17 | 7 | 107.0 | 74.0 | NaN | NaN | 29.6 | 0.254 | 31 | 1 |
18 | 1 | 103.0 | 30.0 | 38.0 | 83.0 | 43.3 | 0.183 | 33 | 0 |
19 | 1 | 115.0 | 70.0 | 30.0 | 96.0 | 34.6 | 0.529 | 32 | 1 |
Input
MissingValues_101703292 Sampledata.csv
Result
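Index | S No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 6 | 148 | 72.0 | 35.0 | 116.15 | 33.60 | 0.627 | 50 | 1 |
1 | 1 | 1 | 85 | 66.0 | 29.0 | 116.15 | 26.60 | 0.351 | 31 | 0 |
2 | 2 | 8 | 183 | 64.0 | 17.8 | 116.15 | 23.30 | 0.672 | 32 | 1 |
3 | 3 | 1 | 89 | 66.0 | 23.0 | 94.00 | 28.10 | 0.167 | 21 | 0 |
4 | 4 | 0 | 137 | 40.0 | 35.0 | 168.00 | 43.10 | 2.288 | 33 | 1 |
5 | 5 | 5 | 116 | 74.0 | 17.8 | 116.15 | 25.60 | 0.201 | 30 | 0 |
6 | 6 | 3 | 78 | 50.0 | 32.0 | 88.00 | 31.00 | 0.248 | 26 | 1 |
7 | 7 | 10 | 115 | 61.7 | 17.8 | 116.15 | 35.30 | 0.134 | 29 | 0 |
8 | 8 | 2 | 197 | 70.0 | 45.0 | 543.00 | 30.50 | 0.158 | 53 | 1 |
9 | 9 | 8 | 125 | 96.0 | 17.8 | 116.15 | 30.95 | 0.232 | 54 | 1 |
10 | 10 | 4 | 110 | 92.0 | 17.8 | 116.15 | 37.60 | 0.191 | 30 | 0 |
11 | 11 | 10 | 168 | 74.0 | 17.8 | 116.15 | 38.00 | 0.537 | 34 | 1 |
12 | 12 | 10 | 139 | 80.0 | 17.8 | 116.15 | 27.10 | 1.441 | 57 | 0 |
13 | 13 | 1 | 189 | 60.0 | 23.0 | 846.00 | 30.10 | 0.398 | 59 | 1 |
14 | 14 | 5 | 166 | 72.0 | 19.0 | 175.00 | 25.80 | 0.587 | 51 | 1 |
15 | 15 | 7 | 100 | 61.7 | 17.8 | 116.15 | 30.00 | 0.484 | 32 | 1 |
16 | 16 | 0 | 118 | 84.0 | 47.0 | 230.00 | 45.80 | 0.551 | 31 | 1 |
17 | 17 | 7 | 107 | 74.0 | 17.8 | 116.15 | 29.60 | 0.254 | 31 | 1 |
18 | 18 | 1 | 103 | 30.0 | 38.0 | 83.00 | 43.30 | 0.183 | 33 | 0 |
19 | 19 | 1 | 115 | 70.0 | 30.0 | 96.00 | 34.60 | 0.529 | 32 | 1 |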
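The fill values in this result can be checked by hand. The short calculation below uses the 11 observed values of column 4 from the sample dataset and compares two possible denominators. It suggests, although the README does not state it, that the fill value 17.8 is the column sum divided by all 20 rows (missing cells counted as zero) rather than the mean of the observed values alone. The variable names are illustrative only.

```python
# Observed (non-NaN) values of column 4 in the sample dataset.
observed = [35.0, 29.0, 23.0, 35.0, 32.0, 45.0, 23.0, 19.0, 47.0, 38.0, 30.0]
total_rows = 20  # the sample dataset has rows 0 through 19

mean_over_all_rows = sum(observed) / total_rows     # missing cells count as 0
mean_over_observed = sum(observed) / len(observed)  # conventional mean imputation

print(round(mean_over_all_rows, 2))  # 17.8
print(round(mean_over_observed, 2))  # 32.36
```

The same pattern holds for column 5, where the nine observed values sum to 2323 and 2323 / 20 gives the 116.15 shown in the result.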
Constraint
Your CSV file should not contain categorical data; every column must be numeric.
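One way to check this constraint before running the tool is to scan the file for cells that cannot be parsed as numbers. The sketch below uses only the standard library; the helper name `non_numeric_columns` and the sample text are hypothetical, and it assumes the file has no header row and that empty cells or `NaN` mark missing values.

```python
import csv
import io

def non_numeric_columns(csv_text):
    """Return the indexes of columns holding any value that is not a number."""
    bad = set()
    for row in csv.reader(io.StringIO(csv_text)):
        for index, cell in enumerate(row):
            cell = cell.strip()
            if cell == "" or cell.lower() == "nan":
                continue  # treat empty cells and NaN as missing, not categorical
            try:
                float(cell)
            except ValueError:
                bad.add(index)
    return sorted(bad)

sample = "6,148,72,NaN\n1,85,,yes\n8,183,64,no\n"
print(non_numeric_columns(sample))  # [3]
```

To check a file on disk, read its contents first, for example `non_numeric_columns(open("data.csv").read())`. Any column index it reports would need to be dropped or encoded numerically.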
License