Outlier Removal Using Z-score or IQR

A library for removing outliers from a pandas DataFrame.

PROJECT 2, UCS633 - Data Analysis and Visualization
Navkiran Singh  
COE17
Roll number: 101703365

Update in 1.1.0 - the command line script method has changed; it can now be called from both Windows and Linux terminals.

Takes two required inputs: the filename of the input CSV and the intended filename of the output CSV.

Two optional arguments may follow; they must be provided together or both left out. The third argument is the threshold, which defaults to 1.5. The fourth argument is the method, either z_score or IQR.

The output is the number of rows removed from the input dataset. The resulting CSV is saved to the output file (output.csv in the script examples).

Installation

pip install outliers_navkiran

Recommended - test in a virtual environment.

Use via command line

outliers_navkiran_cli in.csv out.csv

The defaults are a threshold of 1.5 and the IQR method.

When providing custom threshold and method:

outliers_navkiran_cli in.csv out.csv 3 z_score
outliers_navkiran_cli in.csv out.csv 3 IQR

The first argument after outliers_navkiran_cli is the input CSV filename from which the dataset is read. The second argument is the filename under which the processed dataset is stored.
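The argument rules described above (two required filenames, plus a threshold and method that are given together or not at all) can be sketched as follows. This is only an illustration of those rules, not the package's actual command line code; the helper name parse_cli_args and the constants are hypothetical.

```python
# Hypothetical sketch of the documented argument rules; not the package's code.
DEFAULT_THRESHOLD = 1.5
DEFAULT_METHOD = "IQR"
VALID_METHODS = ("z_score", "IQR")


def parse_cli_args(argv):
    """Turn the arguments after the command name into
    (input_csv, output_csv, threshold, method)."""
    if len(argv) == 2:
        # Only the two filenames were given: fall back to the defaults.
        input_csv, output_csv = argv
        return input_csv, output_csv, DEFAULT_THRESHOLD, DEFAULT_METHOD
    if len(argv) == 4:
        # Threshold and method must be supplied together.
        input_csv, output_csv, raw_threshold, method = argv
        if method not in VALID_METHODS:
            raise ValueError(f"method must be one of {VALID_METHODS}, got {method!r}")
        return input_csv, output_csv, float(raw_threshold), method
    raise ValueError("expected 2 or 4 arguments: in.csv out.csv [threshold method]")


print(parse_cli_args(["in.csv", "out.csv"]))
# ('in.csv', 'out.csv', 1.5, 'IQR')
print(parse_cli_args(["in.csv", "out.csv", "3", "z_score"]))
# ('in.csv', 'out.csv', 3.0, 'z_score')
```

Passing three arguments (a threshold without a method) raises a ValueError in this sketch, which mirrors the rule that the optional arguments come as a pair.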

Use in .py script

from outliers_navkiran import remove_outliers_z, remove_outliers_iqr

# for using z-score
remove_outliers_z('input.csv', 'output.csv', 1.5)
# for using IQR
remove_outliers_iqr('input.csv', 'output.csv', 1.5)
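The package's internals are not shown on this page. As a rough illustration of what the two methods conventionally do with a threshold, here is a minimal pandas sketch. The function names drop_outliers_iqr and drop_outliers_z are hypothetical and work on an in-memory DataFrame rather than CSV files. Filtering only numeric columns, dropping a row if any column is out of range, and using the population standard deviation are all assumptions of this sketch.

```python
import pandas as pd


def drop_outliers_iqr(df, threshold=1.5):
    """Keep rows whose numeric values all lie within
    [Q1 - threshold * IQR, Q3 + threshold * IQR] of their column."""
    numeric = df.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - threshold * iqr
    upper = q3 + threshold * iqr
    # Rows with missing numeric values compare as False and are dropped too.
    inside = numeric.ge(lower, axis="columns") & numeric.le(upper, axis="columns")
    return df[inside.all(axis=1)]


def drop_outliers_z(df, threshold=1.5):
    """Keep rows whose numeric values all have |z-score| <= threshold."""
    numeric = df.select_dtypes(include="number")
    # Population standard deviation; constant columns get std 1 so z stays 0.
    std = numeric.std(ddof=0).replace(0, 1)
    z = (numeric - numeric.mean()) / std
    return df[z.abs().le(threshold).all(axis=1)]


# Illustrative data only: five similar values and one obvious outlier.
df = pd.DataFrame({"id": list("abcdef"), "x": [10, 12, 11, 13, 12, 100]})

cleaned = drop_outliers_iqr(df)
print(len(df) - len(cleaned))  # rows removed: 1 (the value 100)

cleaned = drop_outliers_z(df)
print(len(df) - len(cleaned))  # rows removed: 1 (the value 100)
```

On a sample this small, a single extreme value cannot reach a z-score of 3, so drop_outliers_z(df, threshold=3.0) removes nothing here. The two methods do not treat the same threshold value as equivalent.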


