Data Imputer API
Data Imputer API in Python
Check out the Wiki here.
'imputerApi' Documentation.
Currently Supported Strategies:
- Mean
- Median
- Most-Frequent
- Constant
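To make the four strategies concrete, here is a minimal sketch of what each one computes for a single column, using only the standard library. It is not ImputerApi's implementation: the helper names `fill_value` and `impute_column`, the `constant` parameter, and the use of an empty string as the missing marker are assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, median

MISSING = ""  # assumed marker for a missing cell, as in the matrix example below


def fill_value(values, strategy, constant=None, missing=MISSING):
    """Return the value that replaces missing cells in one column."""
    present = [v for v in values if v != missing]
    if strategy == "mean":
        return mean(present)
    if strategy == "median":
        return median(present)
    if strategy == "most-frequent":
        # Counter.most_common orders ties by first appearance
        return Counter(present).most_common(1)[0][0]
    if strategy == "constant":
        return constant
    raise ValueError(f"unknown strategy: {strategy!r}")


def impute_column(values, strategy, constant=None, missing=MISSING):
    """Replace every missing cell in a column with the strategy's fill value."""
    fill = fill_value(values, strategy, constant, missing)
    return [fill if v == missing else v for v in values]


ages = [44, 27, 30, "", 40]
print(impute_column(ages, "mean"))    # [44, 27, 30, 35.25, 40]
print(impute_column(ages, "median"))  # [44, 27, 30, 35.0, 40]
print(impute_column(["No", "Yes", "", "No"], "most-frequent"))  # ['No', 'Yes', 'No', 'No']
print(impute_column([1, "", 3], "constant", constant=0))        # [1, 0, 3]
```

Mean and median only make sense for numeric columns, while most-frequent and constant also work for categorical ones such as Purchased.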
Usage:
Read from a CSV file:
from imputerApi import ImputerApi
# Create instance of class
imm_api = ImputerApi(path_to_file="data.csv", strategy='mean', headers=True)
# Print data in console
imm_api.print_table(imm_api.data)
# Transform data by replacing missing values with mean
# and selecting only columns Age and Salary with indexes 1 and 2
replaced_data = imm_api.transform(column_indexes=[1, 2])
# Print replaced data in console
imm_api.print_table(replaced_data)
# Write new data to csv file
imm_api.dump_data_to_csv('datanew_mean.csv', replaced_data, use_header_from_data=True, override=True)
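The example above expects a data.csv file whose contents are not shown here. As a sketch for trying it out, the following writes a small sample file with the standard csv module. The rows are an assumption (a subset of the table used in the matrix example), and `write_sample_csv` is a hypothetical helper, not part of ImputerApi.

```python
import csv
import os
import tempfile

# Assumed contents: a few rows from the matrix example in this README,
# with two empty cells standing in for missing values.
SAMPLE_ROWS = [
    ["Country", "Age", "Salary", "Purchased"],
    ["France", 44, 72000, "No"],
    ["Spain", 27, 48000, "Yes"],
    ["Germany", 40, "", "Yes"],
    ["Spain", "", 52000, "No"],
]


def write_sample_csv(path, rows=SAMPLE_ROWS):
    """Write rows to path as CSV; empty strings become empty fields."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return path


# Write into a temporary directory so no existing data.csv is overwritten.
sample_path = write_sample_csv(os.path.join(tempfile.mkdtemp(), "data.csv"))
with open(sample_path, newline="") as handle:
    loaded = list(csv.reader(handle))
print(loaded[3])  # ['Germany', '40', '', 'Yes']
```

Note that every field read back from a CSV file is a string, so a missing value appears as an empty string rather than as None or NaN.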
Read from a Two Dimensional Matrix (Python List):
from imputerApi import ImputerApi
matrix_2d = [
['Country', 'Age', 'Salary', 'Purchased'],
['France', 44, 72000, 'No'],
['Spain', 27, 48000, 'Yes'],
['Germany', 30, 54000, 'No'],
['Spain', 38, 61000, 'No'],
['Germany', 40, '', 'Yes'],
['France', 35, 58000, 'Yes'],
['Spain', '', 52000, 'No'],
['France', 48, 79000, 'Yes'],
['Germany', 50, 83000, 'No'],
['France', 37, 67000, 'Yes']
]
# Create instance of class
imm_api = ImputerApi(matrix_2D=matrix_2d, strategy='median', headers=True)
# Print data in console
imm_api.print_table(imm_api.data)
# Transform data by replacing missing values with median
# and selecting only columns Age and Salary
replaced_data = imm_api.transform(columns_by_header_name=["Age", "Salary"])
# Print replaced data in console
imm_api.print_table(replaced_data)
# Write new data to csv file
imm_api.dump_data_to_csv('datanew_median.csv', replaced_data, use_header_from_data=True, override=True)
# Create instance with strategy most-frequent
imm_api_most_freq = ImputerApi(path_to_file='datanew_median.csv', strategy="most-frequent", headers=True)
imm_api_most_freq.print_table(imm_api_most_freq.data)
# Transform data by replacing missing values with most-frequent
# and selecting only column Purchased
replaced_data = imm_api_most_freq.transform(columns_by_header_name=["Purchased"])
imm_api_most_freq.print_table(replaced_data)
# Write new table to csv file
imm_api_most_freq.dump_data_to_csv('datanew_most_frequent.csv', replaced_data,
                                   use_header_from_data=True, override=True)
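As a sanity check for the median example, the expected fill values can be computed by hand with the standard library. This sketch only reproduces the arithmetic on the Age and Salary columns of matrix_2d. Whether ImputerApi returns these values as integers or floats is not stated in this document.

```python
from statistics import median

# Age and Salary columns copied from matrix_2d above; '' marks a missing cell.
ages = [44, 27, 30, 38, 40, 35, '', 48, 50, 37]
salaries = [72000, 48000, 54000, 61000, '', 58000, 52000, 79000, 83000, 67000]

# Ignore the missing cells, then take the median of what remains.
age_median = median(v for v in ages if v != '')
salary_median = median(v for v in salaries if v != '')
print(age_median, salary_median)  # 38 61000
```

With these values, the missing Age in the second Spain row would become 38 and the missing Salary in the Germany row would become 61000.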
Integrating with pandas and NumPy:
from imputerApi import ImputerApi
import numpy as np
import pandas as pd
# Read csv data as Pandas DataFrame
df = pd.read_csv('data.csv')
# Convert Pandas Dataframe to Numpy Array
arr = df.values
# Convert Numpy Array to Python List
arr_list = arr.tolist()
# Pass the list to ImputerApi via matrix_2D; headers=False because df.values does not include the header row
imputer_api = ImputerApi(matrix_2D=arr_list, strategy="mean", headers=False)
# Replace missing values (np.nan) with the column mean
replaced_data = imputer_api.transform(column_indexes=[1, 2], missing_value=np.nan)
# Print to console
imputer_api.print_table(arr_2D=replaced_data)
# Write data to CSV file
imputer_api.dump_data_to_csv("data2.csv", replaced_data, override=True)
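One detail worth knowing when passing `missing_value=np.nan`: NaN never compares equal to anything, including itself, so a plain equality test cannot locate NaN cells. The sketch below illustrates this with the standard library only. `is_missing` is a hypothetical helper written for illustration; how ImputerApi matches missing values internally is not described in this document.

```python
import math

nan = float("nan")  # np.nan is an ordinary Python float holding this same value

# NaN never compares equal, not even to itself, so `cell == missing_value` cannot find it.
print(nan == nan)       # False
print(math.isnan(nan))  # True


def is_missing(cell, missing_value):
    """Hypothetical helper: match a cell against a missing-value marker, NaN included."""
    if isinstance(missing_value, float) and math.isnan(missing_value):
        return isinstance(cell, float) and math.isnan(cell)
    return cell == missing_value


row = ["Spain", nan, 52000.0, "No"]
print([is_missing(cell, nan) for cell in row])  # [False, True, False, False]
```

If the imputed output still contains NaN cells, this comparison behaviour is a likely place to look first.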