data pre-processing library

Project description

mern

mern is a Python library for preprocessing datasets. It can process both numeric and text data.

Installation

pip3 install mern

or

git clone https://github.com/bluenet-analytica/mern.git && cd mern && pip3 install -r requirements.txt

1. Remove outliers in numerical data

There are two ways to remove outliers from numerical data:

  1. Z-score
  2. Interquartile range score (IQR score)

from mern import NumericOutlier

obj = NumericOutlier()
x = [11,31,21,19,8,54,35,26,23,13,29,17]

# using Z Score
print(obj.find(x, "zscore"))

# using Inter Quartile Range Score
print(obj.find(x, "iqr"))
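
For readers unfamiliar with the two methods, the following is a minimal standalone sketch of how Z-score and IQR outlier detection are commonly done. It is not mern's implementation: the function names `zscore_outliers` and `iqr_outliers`, the default threshold of 3.0, the IQR multiplier of 1.5, and the quartile method are all assumptions chosen for illustration, and `NumericOutlier.find` may use different cutoffs or return a different shape of result.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds `threshold`.

    Hypothetical helper; the threshold is an assumed default.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs((v - mean) / std) > threshold]


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper; k=1.5 is the conventional multiplier.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


x = [11, 31, 21, 19, 8, 54, 35, 26, 23, 13, 29, 17]

print(zscore_outliers(x, threshold=2.0))  # [54]
print(iqr_outliers(x))                    # [54]
```

With the conventional threshold of 3.0, the Z-score method flags nothing in this small sample, which is why the example lowers it to 2.0.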

2. Remove outliers in text data

from mern import TextOutlier

obj = TextOutlier()
text = "abcd!G#45!"

# remove punctuation, e.g. !@#$%
no_punctuation = obj.remove_punctuation([text])
print(no_punctuation)

# remove stop words, e.g. this, the, a
# sample tweet by @SomeGuyAbides

tweets = "Is a burning compressed liquid hydrogen as rocket fuel feasible for a propellant? Could this process be deleveloped through electrolysis of water? "

no_sw = obj.remove_stopwords([tweets], lang="english")
print(no_sw)
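
As a rough illustration of what these two cleaning steps involve, here is a minimal sketch using only the Python standard library. It is not how mern implements `TextOutlier`: the functions below are standalone stand-ins, and `STOP_WORDS` is a tiny hypothetical list, whereas a real stop-word list for English is much longer.

```python
import string

# Tiny illustrative stop-word list (an assumption, not mern's list).
STOP_WORDS = {"a", "an", "the", "this", "is", "as", "be", "of", "for", "could"}


def remove_punctuation(texts):
    """Strip every ASCII punctuation character from each string."""
    table = str.maketrans("", "", string.punctuation)
    return [t.translate(table) for t in texts]


def remove_stopwords(texts, stop_words=STOP_WORDS):
    """Drop whitespace-separated words found in `stop_words` (case-insensitive)."""
    return [
        " ".join(w for w in t.split() if w.lower() not in stop_words)
        for t in texts
    ]


print(remove_punctuation(["abcd!G#45!"]))                  # ['abcdG45']
print(remove_stopwords(["Is this a test of the rocket"]))  # ['test rocket']
```

Splitting on whitespace leaves punctuation attached to words, so in practice punctuation is usually removed before stop words.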

That's it.
