Automating Data Science
Project description
GML
Creators
Muhammad Ahmed, Naman Tuli
Contributors
Rafey Iqbal Rahman
Tired of doing Data Science manually? GML is here for you!
GML is an automatic data science library in Python, built on top of multiple Python packages. Installation instructions and the full feature list follow.
Installation:
pip install GML
https://pypi.org/project/GML
Features:
Auto Feature Engineering
from GML import FeatureEngineering
# feateng_steps=0 performs feature selection without feature creation
fe = FeatureEngineering(Data, 'target', fill_missing_data=True, encode_data=True,
                        normalize=True, remove_outliers=True,
                        new_features=True, feateng_steps=2)
X_new, y, test = fe.get_new_data()
Click Here for complete DEMO
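The flag names fill_missing_data, encode_data and normalize suggest the usual preprocessing steps. The sketch below illustrates what such steps conventionally do, using only the Python standard library. It is not GML's implementation; the helper names fill_missing, label_encode and min_max_normalize are hypothetical, and the median fill and min-max scaling are assumptions about one common way to do each step.

```python
from statistics import median


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def label_encode(categories):
    """Map each distinct category to an integer code, in sorted order."""
    mapping = {c: i for i, c in enumerate(sorted(set(categories)))}
    return [mapping[c] for c in categories]


def min_max_normalize(values):
    """Scale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [20, None, 40, 30]
filled = fill_missing(ages)
scaled = min_max_normalize(filled)
codes = label_encode(["red", "blue", "red", "green"])
```

A library such as GML would apply steps like these per column of a DataFrame; the sketch works on single lists to keep the idea visible.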
Auto EDA (Powered by Sweetviz)
from GML import sweetviz
result1 = sweetviz.compare([train,'train'],[test,'test'],'target')
result2 = sweetviz.analyze([train,'train'])
result1.show_html()
result2.show_html()
Click Here for complete DEMO
Auto Machine Learning
from sklearn.metrics import accuracy_score
from GML import AutoML
gml_ml = AutoML()
gml_ml.GMLClassifier(X, y, metric = accuracy_score, folds = 10)
Click Here for complete DEMO
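The folds and metric arguments point to k-fold cross-validation scored with a user-supplied metric. The following is a minimal sketch of that idea in plain Python, assuming unshuffled, unstratified folds and a majority-class baseline in place of a real model. The helpers k_fold_indices, accuracy and majority_class are hypothetical and are not part of GML.

```python
from collections import Counter


def k_fold_indices(n_samples, folds):
    """Yield (train_idx, val_idx) pairs that partition range(n_samples)."""
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, folds)
    start = 0
    for i in range(folds):
        size = base + (1 if i < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_class(labels):
    """Most frequent label; stands in for a trained model here."""
    return Counter(labels).most_common(1)[0][0]


y = [0, 0, 0, 1, 0, 1, 0, 0, 1, 0]
scores = []
for train_idx, val_idx in k_fold_indices(len(y), folds=5):
    prediction = majority_class([y[i] for i in train_idx])
    y_val = [y[i] for i in val_idx]
    scores.append(accuracy(y_val, [prediction] * len(y_val)))

mean_score = sum(scores) / len(scores)
```

Averaging the per-fold scores gives a single number for comparing candidate models, which is presumably how an AutoML routine ranks them.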
Auto Text Cleaning
from GML import AutoNLP
nlp = AutoNLP()
cleanX = X.apply(lambda x: nlp.clean(x))
Click Here for complete DEMO
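What nlp.clean does internally is not described here. As an illustration only, a typical text-cleaning function lowercases the input and strips URLs, HTML tags, punctuation and extra whitespace. The function clean_text below is a hypothetical stand-in built on the standard library, not GML's code.

```python
import re
import string


def clean_text(text):
    """Lowercase text and remove URLs, HTML tags, punctuation, extra spaces."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # drop URLs
    text = re.sub(r"<[^>]+>", " ", text)                # drop HTML tags
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()            # collapse whitespace


sample = "<p>Check THIS out: https://example.com !!</p>"
cleaned = clean_text(sample)
```

A function of this shape can be passed directly to a pandas Series with X.apply(clean_text), mirroring the call pattern shown above.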
Auto Text Classification using transformers
from GML import AutoNLP
nlp = AutoNLP()
nlp.set_params(cleanX, tokenizer_name='roberta-large-mnli', BATCH_SIZE=4,
model_name='roberta-large-mnli', MAX_LEN=200)
# tokenizedX is not defined in this snippet; it is assumed to hold the tokenized form of cleanX
model = nlp.train_model(tokenizedX, y)
Click Here for complete DEMO
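In transformer pipelines, parameters named MAX_LEN and BATCH_SIZE conventionally control the fixed sequence length and the number of sequences per training step. The sketch below shows that convention on plain lists of token ids. It assumes a padding id of 0 and omits attention masks; pad_or_truncate and batches are hypothetical helpers, and GML's own handling may differ.

```python
def pad_or_truncate(token_ids, max_len, pad_id=0):
    """Clip a sequence to max_len, then right-pad it with pad_id."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


sequences = [[5, 6, 7], [8], [9, 10, 11, 12, 13], [14, 15], [16]]
fixed = [pad_or_truncate(s, max_len=4) for s in sequences]
grouped = list(batches(fixed, batch_size=2))
```

A larger MAX_LEN keeps more of each text at the cost of memory, which is usually why BATCH_SIZE is kept small for large models such as roberta-large-mnli.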
Auto Image Classification with Augmentation
from GML import Auto_Image_Processing
gml_image_processing = Auto_Image_Processing()
# `models` must be defined beforehand; its contents are not shown in this snippet
model = gml_image_processing.imgClassificationcsv(img_path='./covid_image_data/train',
                                                  train_path='./covid_image_data/Training_set_covid.csv',
                                                  model_list=models,
                                                  tfms=True, advance_augmentation=True,
                                                  epochs=1)
Click Here for complete DEMO
Text Augmentation using transformers: GPT-2
from GML import AutoNLP
nlp = AutoNLP()
nlp.augmentation_train('./data.csv')
nlp.set_params(X['Text'])
new_Text = nlp.augmentation_generate(y = y, SENTENCES = 100)
Click Here for complete DEMO
More features, including support for other data types such as audio, will be added in the future.
Feel free to give suggestions, report bugs and contribute.
Download files
Source Distribution: GML-3.0.3.tar.gz (15.3 MB)
Built Distribution: GML-3.0.3-py3-none-any.whl (15.4 MB)