data pre-processing library

mern

mern is a Python library for preprocessing datasets. It can process both numeric and text data.

Installation

pip3 install mern

or

git clone https://github.com/bluenet-analytica/mern.git && cd mern && pip3 install -r requirements.txt

1. Remove outliers in numerical data

There are two ways to detect outliers in numerical data:

  1. Z Score
  2. Interquartile Range Score (IQR Score)

from mern import NumericOutlier

obj = NumericOutlier()
x = [11,31,21,19,8,54,35,26,23,13,29,17]

# using Z Score
print(obj.find(x, "zscore"))

# using Interquartile Range (IQR) Score
print(obj.find(x, "iqr"))
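
The README does not show how mern computes either score, so the following is only a minimal standard-library sketch of the two techniques, not mern's implementation. The function names, the z-score threshold of 2.0, the IQR multiplier of 1.5, and the "inclusive" quantile method are all assumptions chosen for illustration; a different quantile method or threshold can flag different values.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds `threshold` (hypothetical helper)."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs(v - mean) / stdev > threshold]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


x = [11, 31, 21, 19, 8, 54, 35, 26, 23, 13, 29, 17]

# Assumed threshold of 2.0; with the common default of 3.0 nothing is flagged here
print(zscore_outliers(x, threshold=2.0))  # [54]
print(iqr_outliers(x))                    # [54]
```

The result depends on the chosen cut-offs, so the output of `obj.find` may differ from this sketch.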

2. Remove outliers in text data

from mern import TextOutlier

obj = TextOutlier()
text = "abcd!G#45!"

# remove punctuation, e.g. !@#$%
no_punctuation = obj.remove_punctuation([text])
print(no_punctuation)

# remove stop words, e.g. this, the, a
# example tweet by @SomeGuyAbides

tweets = "Is a burning compressed liquid hydrogen as rocket fuel feasible for a propellant? Could this process be deleveloped through electrolysis of water? "

no_sw = obj.remove_stopwords([tweets], lang="english")
print(no_sw)
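
For readers who want to see what these two cleaning steps amount to, here is a rough standard-library sketch. It is not how mern implements `remove_punctuation` or `remove_stopwords`; the helper names and the tiny `STOP_WORDS` set are hypothetical, and a real stop-word list for English is much longer.

```python
import string

# Hypothetical, deliberately tiny stop-word list, for illustration only
STOP_WORDS = {"a", "an", "the", "is", "as", "be", "this", "for", "of", "could"}


def strip_punctuation(texts):
    """Delete every ASCII punctuation character from each string."""
    table = str.maketrans("", "", string.punctuation)
    return [t.translate(table) for t in texts]


def strip_stopwords(texts, stop_words=STOP_WORDS):
    """Drop whitespace-separated words that appear in `stop_words` (case-insensitive)."""
    cleaned = []
    for t in texts:
        kept = [w for w in t.split() if w.lower() not in stop_words]
        cleaned.append(" ".join(kept))
    return cleaned


print(strip_punctuation(["abcd!G#45!"]))                      # ['abcdG45']
print(strip_stopwords(["Could this process be developed?"]))  # ['process developed?']
```

Because the sketch splits on whitespace only, a word with trailing punctuation (such as "developed?") is kept as-is; stripping punctuation first avoids that.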

That's it.

Source Distribution

mern-0.6.tar.gz (3.8 kB)
