
Automating Data Science


GML Brain+Machine Adding AI Revolution


Creators

Muhammad Ahmed
Naman Tuli

Contributors

Rafey Iqbal Rahman

Tired of doing Data Science manually? GML is here for you!

GML is an automated data science library for Python, built on top of several existing Python packages. Installation instructions and the full list of features follow.


Installation:


pip install GML

https://pypi.org/project/GML

Features:


Auto Feature Engineering



from GML import FeatureEngineering

# Set feateng_steps=0 for feature selection without feature creation
fe = FeatureEngineering(Data, 'target', fill_missing_data=True, encode_data=True,
                        normalize=True, remove_outliers=True,
                        new_features=True, feateng_steps=2)

X_new, y, test = fe.get_new_data()

Click Here for complete DEMO
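
To give a feel for what flags such as fill_missing_data, remove_outliers and normalize usually stand for, here is a minimal standard-library sketch of those three steps on a single numeric column. It is not GML's implementation: the helper names, the median fill, the 1.5 x IQR rule and the min-max scaling are all assumptions chosen for illustration.

```python
from statistics import median, quantiles


def fill_missing(values):
    """Replace None entries with the median of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def remove_outliers(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical column with one missing entry and one extreme value
column = [10.0, 12.0, None, 11.0, 13.0, 12.0, 11.0, 500.0]
filled = fill_missing(column)
trimmed = remove_outliers(filled)
scaled = min_max_normalize(trimmed)
```

The missing entry is filled before the outlier check so that every row has a value to test, and the median is used because a mean would be pulled toward the extreme value.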


Auto EDA (Powered by Sweetviz)



from GML import sweetviz

result1 = sweetviz.compare([train,'train'],[test,'test'],'target') 
result2 = sweetviz.analyze([train,'train'])

result1.show_html()
result2.show_html()

Click Here for complete DEMO


Auto Machine Learning



from GML import AutoML
from sklearn.metrics import accuracy_score  # metric passed to GMLClassifier below

gml_ml = AutoML()

gml_ml.GMLClassifier(X, y, metric=accuracy_score, folds=10)

Click Here for complete DEMO
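
The metric and folds arguments suggest that models are scored with k-fold cross-validation. The sketch below shows that idea with the standard library only. It is an illustration under that assumption, not GML's code: accuracy, k_fold_indices, MajorityClassifier and cross_validate are hypothetical names, and the toy model simply predicts the most common training label.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n_samples, folds):
    """Yield (train_idx, test_idx) pairs covering range(n_samples) in `folds` contiguous parts."""
    indices = list(range(n_samples))
    sizes = [n_samples // folds + (1 if i < n_samples % folds else 0)
             for i in range(folds)]
    start = 0
    for size in sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


class MajorityClassifier:
    """Toy model: always predicts the most common label seen during fit."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def cross_validate(model_factory, X, y, metric, folds):
    """Train a fresh model on each fold's training part and average the metric."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), folds):
        model = model_factory().fit([X[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        preds = model.predict([X[i] for i in test_idx])
        scores.append(metric([y[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


# Hypothetical data: 20 samples, 4 of them in the minority class
X = [[i] for i in range(20)]
y = [0, 1, 0, 0, 0] * 4
mean_score = cross_validate(MajorityClassifier, X, y, metric=accuracy, folds=5)
```

An AutoML tool would typically repeat this loop for several candidate models and keep the one with the best average score.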

Auto Text Cleaning



from GML import AutoNLP

nlp = AutoNLP()

cleanX = X.apply(lambda x: nlp.clean(x))

Click Here for complete DEMO
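
For readers unsure what a cleaning step like this typically involves, here is a small sketch using only the standard library. The function clean_text is a hypothetical stand-in, not nlp.clean itself, and the specific steps (lowercasing, removing URLs and HTML tags, stripping punctuation, collapsing whitespace) are assumptions about common practice.

```python
import re
import string


def clean_text(text):
    """Lowercase, drop URLs and HTML tags, strip punctuation, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # remove URLs
    text = re.sub(r"<[^>]+>", " ", text)                 # remove HTML tags
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical input
sample = "<p>Check THIS out: https://example.com !!</p>"
cleaned = clean_text(sample)
```

A function like this can be applied to a pandas Series in the same way as above, for example X.apply(clean_text).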


Auto Text Classification using transformers



from GML import AutoNLP

nlp = AutoNLP()

nlp.set_params(cleanX, tokenizer_name='roberta-large-mnli', BATCH_SIZE=4,
               model_name='roberta-large-mnli', MAX_LEN=200)

# tokenizedX is not defined in this snippet; see the complete demo for how it is obtained
model = nlp.train_model(tokenizedX, y)

Click Here for complete DEMO
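
Going by their names, MAX_LEN and BATCH_SIZE presumably control the fixed token length per text and the number of texts per training step. That reading is an assumption, and the sketch below is not GML's or any tokenizer's actual code. It only shows the conventional meaning with hypothetical helpers and made-up token ids.

```python
def pad_or_truncate(token_ids, max_len, pad_id=0):
    """Cut sequences longer than max_len and right-pad shorter ones with pad_id."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


def batches(sequences, batch_size):
    """Yield consecutive groups of at most batch_size sequences."""
    for start in range(0, len(sequences), batch_size):
        yield sequences[start:start + batch_size]


MAX_LEN = 5      # the example above uses 200; kept small here for readability
BATCH_SIZE = 2   # the example above uses 4

# Hypothetical token id sequences of varying length
raw = [[7, 8, 9], [1, 2, 3, 4, 5, 6, 7], [4], [5, 6], [9, 9, 9, 9, 9]]
fixed = [pad_or_truncate(seq, MAX_LEN) for seq in raw]
grouped = list(batches(fixed, BATCH_SIZE))
```

Larger values of either setting generally increase memory use, which is one reason a large model is often paired with a small batch size.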


Auto Image Classification with Augmentation



from GML import Auto_Image_Processing

gml_image_processing = Auto_Image_Processing()

# `models` is not defined in this snippet; see the complete demo for the list passed as model_list
model = gml_image_processing.imgClassificationcsv(img_path='./covid_image_data/train',
                                                  train_path='./covid_image_data/Training_set_covid.csv',
                                                  model_list=models,
                                                  tfms=True, advance_augmentation=True,
                                                  epochs=1)

Click Here for complete DEMO


Text Augmentation using transformers: GPT-2



from GML import AutoNLP

nlp = AutoNLP()

nlp.augmentation_train('./data.csv')

nlp.set_params(X['Text'])

new_Text = nlp.augmentation_generate(y = y, SENTENCES = 100) 

Click Here for complete DEMO



More features, including support for other data types such as audio, will be added in the future.
Feel free to give suggestions, report bugs, and contribute.


Files for GML, version 3.0.6

Filename                     Size     File type  Python version
GML-3.0.6.tar.gz             15.4 MB  Source     None
GML-3.0.6-py3-none-any.whl   15.5 MB  Wheel      py3
