Data Imputer API
Data Imputer API in Python
Check out the Wiki here.
'imputerApi' Documentation.
Currently Supported Strategies:
- Mean
- Median
- Most-Frequent
- Constant
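To make the four strategies concrete, here is a minimal standard-library sketch of what each one computes for a single column. It is not the package's implementation: the helper names `fill_value` and `impute_column`, and the choice of `""` as the missing marker, are assumptions made for illustration only.

```python
from collections import Counter
from statistics import mean, median


def fill_value(values, strategy, missing="", constant=None):
    """Return the replacement value for one column (hypothetical helper)."""
    present = [v for v in values if v != missing]
    if strategy == "mean":
        return mean(present)
    if strategy == "median":
        return median(present)
    if strategy == "most-frequent":
        # Among equally common values, the one seen first wins.
        return Counter(present).most_common(1)[0][0]
    if strategy == "constant":
        return constant
    raise ValueError(f"unknown strategy: {strategy!r}")


def impute_column(values, strategy, missing="", constant=None):
    """Replace every missing marker with the strategy's fill value."""
    fill = fill_value(values, strategy, missing, constant)
    return [fill if v == missing else v for v in values]


ages = [44, 27, "", 38]
print(impute_column(ages, "median"))                # [44, 27, 38, 38]
print(impute_column(ages, "constant", constant=0))  # [44, 27, 0, 38]
```

Mean and median only make sense for numeric columns, while most-frequent and constant also work for text columns such as Purchased.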
Usage:
Read from a CSV file:
from imputerApi import ImputerApi
# Create instance of class
imm_api = ImputerApi(path_to_file="data.csv", strategy='mean', headers=True)
# Print data in console
imm_api.print_table(imm_api.data)
# Transform data by replacing missing values with mean
# and selecting only columns Age and Salary with indexes 1 and 2
replaced_data = imm_api.transform(column_indexes=[1, 2])
# Print replaced data in console
imm_api.print_table(replaced_data)
# Write new data to csv file
imm_api.dump_data_to_csv('datanew_mean.csv', replaced_data, use_header_from_data=True, override=True)
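The example above expects a `data.csv` file whose contents are not shown. The sketch below writes a small sample file with the standard `csv` module and reads it back. The rows are an assumption modelled on the matrix example in this document, and the file is written to a temporary directory so nothing in your working directory is overwritten.

```python
import csv
import os
import tempfile

# Assumed sample rows; empty strings mark the missing cells.
rows = [
    ["Country", "Age", "Salary", "Purchased"],
    ["France", 44, 72000, "No"],
    ["Germany", 40, "", "Yes"],  # missing Salary
    ["Spain", "", 52000, "No"],  # missing Age
]

csv_path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(csv_path, "w", newline="") as fh:
    csv.writer(fh).writerows(rows)

# Read the file back to see what a CSV consumer receives.
with open(csv_path, newline="") as fh:
    loaded = list(csv.reader(fh))

print(loaded[2])  # ['Germany', '40', '', 'Yes']
```

Note that `csv.reader` returns every cell as a string, so numeric columns need converting before a mean or median can be computed. Whether `ImputerApi` performs that conversion itself is not stated here.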
Read from a Two Dimensional Matrix (Python List):
from imputerApi import ImputerApi
matrix_2d = [
['Country', 'Age', 'Salary', 'Purchased'],
['France', 44, 72000, 'No'],
['Spain', 27, 48000, 'Yes'],
['Germany', 30, 54000, 'No'],
['Spain', 38, 61000, 'No'],
['Germany', 40, '', 'Yes'],
['France', 35, 58000, 'Yes'],
['Spain', '', 52000, 'No'],
['France', 48, 79000, 'Yes'],
['Germany', 50, 83000, 'No'],
['France', 37, 67000, 'Yes']
]
# Create instance of class
imm_api = ImputerApi(matrix_2D=matrix_2d, strategy='median', headers=True)
# Print data in console
imm_api.print_table(imm_api.data)
# Transform data by replacing missing values with median
# and selecting only columns Age and Salary
replaced_data = imm_api.transform(columns_by_header_name=["Age", "Salary"])
# Print replaced data in console
imm_api.print_table(replaced_data)
# Write new data to csv file
imm_api.dump_data_to_csv('datanew_median.csv', replaced_data, use_header_from_data=True, override=True)
# Create instance with strategy most-frequent
imm_api_most_freq = ImputerApi(path_to_file='datanew_median.csv', strategy="most-frequent", headers=True)
imm_api_most_freq.print_table(imm_api_most_freq.data)
# Transform data by replacing missing values with most-frequent
# and selecting only column Purchased
replaced_data = imm_api_most_freq.transform(columns_by_header_name=["Purchased"])
imm_api_most_freq.print_table(replaced_data)
# Write new table to csv file
imm_api_most_freq.dump_data_to_csv('datanew_most_frequent.csv', replaced_data,
use_header_from_data=True, override=True)
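As a cross-check for the median step in the example above, the fill values for Age and Salary can be worked out by hand with the standard `statistics` module. This is only a verification sketch on the same matrix; it does not call `ImputerApi`, and the `fills` dictionary is a name introduced here for illustration.

```python
from statistics import median

matrix_2d = [
    ['Country', 'Age', 'Salary', 'Purchased'],
    ['France', 44, 72000, 'No'],
    ['Spain', 27, 48000, 'Yes'],
    ['Germany', 30, 54000, 'No'],
    ['Spain', 38, 61000, 'No'],
    ['Germany', 40, '', 'Yes'],
    ['France', 35, 58000, 'Yes'],
    ['Spain', '', 52000, 'No'],
    ['France', 48, 79000, 'Yes'],
    ['Germany', 50, 83000, 'No'],
    ['France', 37, 67000, 'Yes'],
]

header, *rows = matrix_2d

fills = {}
for name in ("Age", "Salary"):
    idx = header.index(name)
    # Ignore the empty-string cells when computing the median.
    present = [row[idx] for row in rows if row[idx] != ""]
    fills[name] = median(present)

print(fills)  # {'Age': 38, 'Salary': 61000}
```

Each column has nine known values, so the median is simply the fifth value after sorting. A median-based imputation of this matrix would therefore be expected to put 38 in the empty Age cell and 61000 in the empty Salary cell.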
Integrating with pandas and NumPy:
from imputerApi import ImputerApi
import numpy as np
import pandas as pd
# Read csv data as Pandas DataFrame
df = pd.read_csv('data.csv')
# Convert Pandas Dataframe to Numpy Array
arr = df.values
# Convert Numpy Array to Python List
arr_list = arr.tolist()
# Pass the list to ImputerApi via the matrix_2D parameter; headers=False because df.values does not include the header row
imputer_api = ImputerApi(matrix_2D=arr_list, strategy="mean", headers=False)
# Replace missing values (np.nan) with the mean
replaced_data = imputer_api.transform(column_indexes=[1, 2], missing_value=np.nan)
# Print to console
imputer_api.print_table(arr_2D=replaced_data)
# Write data to CSV file
imputer_api.dump_data_to_csv("data2.csv", replaced_data, override=True)
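One detail worth knowing when `np.nan` is the missing marker: NaN never compares equal to anything, including itself, so an equality test cannot find it. The sketch below shows the difference using only the standard library (`float("nan")` has the same value as `np.nan`). How `ImputerApi` matches `missing_value` internally is not documented here, so treat this as background and not as a description of the package; `is_missing` is a hypothetical helper.

```python
import math
from statistics import mean

nan = float("nan")  # np.nan is an ordinary Python float with this value

column = [44.0, nan, 38.0]

# Equality never matches NaN, so this search finds nothing.
naive_hits = [i for i, v in enumerate(column) if v == nan]


def is_missing(value):
    """Detect NaN safely; non-float cells are never treated as missing."""
    return isinstance(value, float) and math.isnan(value)


nan_hits = [i for i, v in enumerate(column) if is_missing(v)]

# Mean imputation over the known values only.
fill = mean(v for v in column if not is_missing(v))
filled = [fill if is_missing(v) else v for v in column]

print(naive_hits)  # []
print(nan_hits)    # [1]
print(filled)      # [44.0, 41.0, 38.0]
```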
Download files
Source Distribution
ImputerApi-0.0.1.tar.gz (6.5 kB)
Built Distribution
File details
Details for the file ImputerApi-0.0.1.tar.gz.
File metadata
- Download URL: ImputerApi-0.0.1.tar.gz
- Upload date:
- Size: 6.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.20.1 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e72402d19ad6b9e3353252531ac6751908dcf16ee1cfa3915a45b6cec5395e6b |
| MD5 | 57afe8f9ec723f28006e93025deb8ed3 |
| BLAKE2b-256 | 214401ec7828b2107c4bd26982584482fa6225b5eda155842cac95ee98eb0fd3 |
File details
Details for the file ImputerApi-0.0.1-py3-none-any.whl.
File metadata
- Download URL: ImputerApi-0.0.1-py3-none-any.whl
- Upload date:
- Size: 7.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.20.1 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dfd206b8520ca27bfa51e0945730b0e8d70eab1775ffa0e4ab65c8ddf35f67b9 |
| MD5 | e97b070a773481f9e01229e02b4d730d |
| BLAKE2b-256 | 6fb3873c263292ecb5c909ed44da0339daeb8ab5820b6dd6e11f8d2453ce21ec |