data pre-processing library
mern
mern is a Python library for pre-processing datasets. It can process both numeric and text data.
Installation
pip3 install mern
or
git clone https://github.com/bluenet-analytica/mern.git && cd mern && pip3 install -r requirements.txt
1. Remove outliers in numerical data
There are two ways to find outliers in numerical data:
- Z-score
- Interquartile range (IQR) score
from mern import NumericOutlier
obj = NumericOutlier()
x = [11,31,21,19,8,54,35,26,23,13,29,17]
# using Z Score
print(obj.find(x, "zscore"))
# using the interquartile range (IQR) score
print(obj.find(x, "iqr"))
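For readers unfamiliar with the two methods, the following is a minimal standalone sketch of what Z-score and IQR outlier detection usually mean. It uses only the Python standard library and is not mern's implementation. The function names, the Z-score threshold of 2.0, the IQR multiplier of 1.5, and the "inclusive" quartile method are all assumptions made for illustration, so the values mern returns may differ.

```python
import statistics


def zscore_outliers(values, threshold=2.0):
    """Return values whose absolute Z-score exceeds `threshold` (assumed cutoff)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values are identical, so nothing can be an outlier
    return [v for v in values if abs(v - mean) / std > threshold]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is the usual convention)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


x = [11, 31, 21, 19, 8, 54, 35, 26, 23, 13, 29, 17]
print(zscore_outliers(x))  # [54]
print(iqr_outliers(x))     # [54]
```

With a stricter Z-score threshold of 3.0, no value in this list is flagged, which shows how much the result depends on the chosen cutoff.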
2. Remove outliers in text data
from mern import TextOutlier
obj = TextOutlier()
text = "abcd!G#45!"
# remove punctuation, e.g. !@#$%
no_punctuation = obj.remove_punctuation([text])
print(no_punctuation)
# remove stop words, e.g. this, the, a
# tweet by @SomeGuyAbides
tweets = "Is a burning compressed liquid hydrogen as rocket fuel feasible for a propellant? Could this process be deleveloped through electrolysis of water? "
no_sw = obj.remove_stopwords([tweets], lang="english")
print(no_sw)
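As a rough illustration of what these two cleaning steps do, here is a standalone sketch using only the Python standard library. It is not mern's implementation: the helper names strip_punctuation and drop_stopwords are hypothetical, and the tiny stop word set is a hand-picked assumption rather than the list mern uses for lang="english".

```python
import string

# Hypothetical, hand-picked stop words for illustration only.
STOP_WORDS = {"is", "a", "as", "for", "this", "be", "of", "the"}


def strip_punctuation(text):
    """Delete every ASCII punctuation character from `text`."""
    return text.translate(str.maketrans("", "", string.punctuation))


def drop_stopwords(text, stop_words=STOP_WORDS):
    """Drop whitespace-separated words that appear in `stop_words` (case-insensitive)."""
    return " ".join(w for w in text.split() if w.lower() not in stop_words)


print(strip_punctuation("abcd!G#45!"))                  # abcdG45
print(drop_stopwords("Is this a feasible propellant"))  # feasible propellant
```

Stripping punctuation before dropping stop words avoids missing words that are attached to a trailing "?" or ",".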
That's it.