Outlier Removal Using Z-score or IQR

Project description

A library for removing outliers from a pandas DataFrame.

PROJECT 2, UCS633 - Data Analysis and Visualization
Navkiran Singh  
COE17
Roll number: 101703365

Update in 1.1.0: the command-line script method has changed, and the script can now be called from both Windows and Linux terminals.

The command-line script takes two required inputs: the filename of the input CSV and the intended filename of the output CSV.

Two optional arguments may follow. They must be provided together or both left out. The third argument is the threshold, which defaults to 1.5. The fourth argument is the method, either z_score or IQR.

The script outputs the number of rows removed from the input dataset. The resulting CSV is saved under the output filename you provide.
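The argument rules above can be sketched with the standard library's argparse. This is only an illustration of the "both or neither" rule and the defaults. The function name parse_args is hypothetical, and the package's real script may parse its arguments differently.

```python
import argparse


def parse_args(argv):
    """Hypothetical parser mirroring the argument rules described above.

    Not the package's actual code: it only illustrates two required
    filenames followed by two optional values that must appear together.
    """
    parser = argparse.ArgumentParser(prog="outliers_navkiran_cli")
    parser.add_argument("input_csv", help="CSV file to read")
    parser.add_argument("output_csv", help="CSV file to write")
    # nargs="?" makes a positional argument optional.
    parser.add_argument("threshold", nargs="?", type=float, default=None)
    parser.add_argument("method", nargs="?", choices=["z_score", "IQR"],
                        default=None)
    args = parser.parse_args(argv)

    # Enforce "provided together or left out".
    if (args.threshold is None) != (args.method is None):
        parser.error("threshold and method must be given together")

    # Apply the documented defaults when both are omitted.
    if args.threshold is None:
        args.threshold, args.method = 1.5, "IQR"
    return args


if __name__ == "__main__":
    print(parse_args(["in.csv", "out.csv", "3", "z_score"]))
```

Giving only a threshold without a method makes the parser exit with an error message, which matches the rule that the two optional values travel together.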

Installation

pip install outliers_navkiran

Recommended - test in a virtual environment.

Use via command line

outliers_navkiran_cli in.csv out.csv

By default, the threshold is 1.5 and the method is IQR.
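To make the default concrete, here is a minimal sketch of the usual IQR rule on a single list of numbers: a value is kept if it lies within threshold × IQR of the first and third quartiles. It uses only the standard library rather than pandas, and the helper names (iqr_bounds, remove_outliers_iqr_sketch) are made up for this example. The package's own quartile method and column handling are not documented here and may differ.

```python
import statistics


def iqr_bounds(values, threshold=1.5):
    """Return (low, high) fences for the IQR rule.

    Assumption: quartiles use linear interpolation ("inclusive"),
    which may not match what the package does internally.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - threshold * iqr, q3 + threshold * iqr


def remove_outliers_iqr_sketch(values, threshold=1.5):
    """Keep only the values that fall inside the IQR fences."""
    low, high = iqr_bounds(values, threshold)
    return [v for v in values if low <= v <= high]


if __name__ == "__main__":
    data = [10, 12, 11, 13, 12, 11, 100]
    cleaned = remove_outliers_iqr_sketch(data)
    print(len(data) - len(cleaned), "value(s) removed")
```

A larger threshold widens the fences and removes fewer values.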

When providing custom threshold and method:

outliers_navkiran_cli in.csv out.csv 3 z_score
outliers_navkiran_cli in.csv out.csv 3 IQR

The first argument after outliers_navkiran_cli is the filename of the input CSV from which the dataset is read. The second argument is the filename under which the processed dataset is stored.
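For the z_score method, a common formulation keeps a value when its distance from the mean, measured in standard deviations, does not exceed the threshold. The sketch below shows that rule on a plain list using the standard library. The function name is hypothetical, and the choice of population standard deviation is an assumption, since the package's exact formula is not stated here.

```python
import statistics


def remove_outliers_z_sketch(values, threshold=3.0):
    """Keep values whose absolute z-score is at most `threshold`.

    Assumption: population standard deviation (pstdev); the package
    may use the sample standard deviation instead.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return list(values)
    return [v for v in values if abs((v - mean) / std) <= threshold]


if __name__ == "__main__":
    data = [10] * 20 + [1000]
    cleaned = remove_outliers_z_sketch(data, threshold=3.0)
    print(len(data) - len(cleaned), "value(s) removed")
```

Note that in very small samples no value can reach a high z-score, so a threshold of 3 may remove nothing. The IQR method does not have this limitation.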

Use in a .py script

from outliers_navkiran import remove_outliers_z, remove_outliers_iqr

# Remove outliers using the z-score method with a threshold of 1.5
remove_outliers_z('input.csv', 'output.csv', 1.5)

# Remove outliers using the IQR method with a threshold of 1.5
remove_outliers_iqr('input.csv', 'output.csv', 1.5)
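It is not stated here whether these functions return the number of rows removed or only print it. If you want to check the count yourself, one option is to compare the two CSV files after the call. The helpers below are hypothetical and independent of the package. They assume both files start with a single header row.

```python
import csv


def count_data_rows(path):
    """Count the rows in a CSV file, excluding one header row (assumed)."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row if the file is not empty
        return sum(1 for _ in reader)


def rows_removed(input_path, output_path):
    """Difference in data-row counts between the input and output CSVs."""
    return count_data_rows(input_path) - count_data_rows(output_path)
```

After running either function above, rows_removed('input.csv', 'output.csv') would give the number of rows that were dropped.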

Download files

Source Distribution

outliers_navkiran-1.1.0.tar.gz (2.9 kB view details)

Uploaded Source

Built Distribution

outliers_navkiran-1.1.0-py3-none-any.whl (4.3 kB view details)

Uploaded Python 3

File details

Details for the file outliers_navkiran-1.1.0.tar.gz.

File metadata

  • Download URL: outliers_navkiran-1.1.0.tar.gz
  • Size: 2.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.5

File hashes

Hashes for outliers_navkiran-1.1.0.tar.gz
Algorithm Hash digest
SHA256 1b2da4c9a500098c5acac0ef4736f21ed4e2600835bb0f6acc525fab2cc535f7
MD5 ff9082e7a6b90cf91448693775354c25
BLAKE2b-256 67e02755d1bbe407018d775a7dbf122c9147858d4b5b8c7ec5ff2486bed30762

File details

Details for the file outliers_navkiran-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: outliers_navkiran-1.1.0-py3-none-any.whl
  • Size: 4.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.5

File hashes

Hashes for outliers_navkiran-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 36632ae478d7011f6e3582121b280ae5f38fe220aea1a04874443f1a403f7959
MD5 34a743a4148e3c365e2f0dd39259a8a3
BLAKE2b-256 f22ff7148009c31b5c2edbe12b2f0ecc4bf08eb738bb14e41b0a0e83553b8eac
