AUTO Machine Learning & AUTO Feature Engineering with many powerful tools.

GML - Ghalat Machine Learning! Brain+Machine Adding AI Revolution

Tired of engineering the data, analyzing it to make new features and training multiple models and then picking the best among them? No worries now! GML is here for you!

GML is an automatic machine learning and feature engineering library for Python, built on top of multiple machine learning packages. With this library you can find and fill the missing values in your data, encode them, generate new features from them, select the best features, and train on multiple machine learning algorithms and a neural network! Beyond training, GML scales the data (standardization by default) and then tests the trained models on validation data.

AUTO Machine Learning runs in two rounds: in the first round all the models compete for the top 5, and in the second round those top 5 compete for the number one spot. The first-ranked model is returned untrained, so you can train it yourself and check the results.
Already have some models? No problem! Pass them to us, make them compete with our models, and let's see who wins ;-)
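
The two-round competition described above can be pictured with a small sketch. This is not GML's source code: the function name `two_round_competition`, the scorer callables, and the scores are all hypothetical and made up for illustration. It assumes higher scores are better (as with accuracy).

```python
from typing import Callable, List, Tuple

# Hypothetical scorer type: takes a model name and returns a validation
# score where higher is better (e.g. accuracy).
Scorer = Callable[[str], float]


def two_round_competition(
    candidates: List[str],
    round_one: Scorer,
    round_two: Scorer,
    top_k: int = 5,
) -> Tuple[str, List[str]]:
    """Return (winner, finalists) after a two-round ranking."""
    # Round 1: rank every candidate and keep the top_k best.
    ranked = sorted(candidates, key=round_one, reverse=True)
    finalists = ranked[:top_k]
    # Round 2: re-score only the finalists and pick the best one.
    winner = max(finalists, key=round_two)
    return winner, finalists


# Made-up scores for illustration only; these are not GML results.
# "user_model" stands in for a model you passed in yourself.
r1 = {"rf": 0.91, "svm": 0.88, "knn": 0.80, "gbm": 0.92,
      "lr": 0.85, "nb": 0.70, "user_model": 0.89}
r2 = {"rf": 0.90, "svm": 0.87, "gbm": 0.89, "lr": 0.86, "user_model": 0.93}

winner, finalists = two_round_competition(
    list(r1), lambda name: r1[name], lambda name: r2[name]
)
print(finalists)  # the five round-one survivors
print(winner)     # the round-two winner
```

In this toy run the user-supplied model survives round one and wins round two, which is the "let's see who wins" scenario.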

In future updates, many other things will also be automated, such as hyperparameter tuning, multiple neural network architectures, other machine learning algorithms, and many more cool things!

Installation:


pip install GML

https://pypi.org/project/GML


Function description:


Auto_Feature_Engineering


* X
  Data columns excluding the target column
* y
  Target column
* type_of_task
  Either 'Regression' or 'Classification' (default = None).
  Optional in general, but required when generate_features is enabled.
* test_data
  Test data, if any (default = None)
* splits
  Number of splits for stratified k-folds when encoding features with target encoding (default = 5)
* fill_na_
  How to fill missing values in the columns: 'Mean', 'Median', or 'Mode' (default = 'Median'). String/character columns are filled with the mode.
* ratio_drop
  Columns with too many missing values are better dropped than filled; this is the missing-value ratio used as the threshold (default = 0.2)
* generate_features
  Generate new features and keep only the important ones (default = False)
* feateng_steps
  More steps mean more generated features and more computational power required (default = 2)
* max_gb
  Limit in gigabytes (GB)
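
To make the roles of fill_na_ and ratio_drop concrete, here is a minimal standard-library sketch of the behavior described above. It is not GML's implementation: the helper `clean_missing`, the toy table, and the rule "drop a column when its missing share is strictly greater than ratio_drop" are assumptions for illustration.

```python
from statistics import mean, median, mode
from typing import Dict, List, Optional


def clean_missing(
    table: Dict[str, List[Optional[object]]],
    fill_na: str = "Median",
    ratio_drop: float = 0.2,
) -> Dict[str, List[object]]:
    """Drop sparse columns, then fill the remaining gaps (None = missing)."""
    numeric_fill = {"Mean": mean, "Median": median, "Mode": mode}[fill_na]
    cleaned = {}
    for name, values in table.items():
        missing = sum(v is None for v in values)
        # Assumption: drop the column if its missing share exceeds ratio_drop.
        if missing / len(values) > ratio_drop:
            continue
        present = [v for v in values if v is not None]
        if all(isinstance(v, (int, float)) for v in present):
            fill = numeric_fill(present)
        else:
            # String/character data is always filled with the mode.
            fill = mode(present)
        cleaned[name] = [fill if v is None else v for v in values]
    return cleaned


# Toy data, invented for this example.
table = {
    "age": [20, None, 40, 30, 50, 60],          # 1 of 6 missing: kept
    "city": ["A", "B", None, "A", "A", "B"],    # 1 of 6 missing: kept
    "sparse": [None, None, 1, None, 2, 3],      # 3 of 6 missing: dropped
}
cleaned = clean_missing(table)
print(cleaned)
```

With the defaults, "age" is filled with its median, "city" with its most frequent value, and "sparse" is dropped because half of its entries are missing.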

GMLRegressor and GMLClassifier


* X 
  Data columns excluding the target column. This can be either a pandas DataFrame or a NumPy array. Make sure your data contains no missing or non-numeric values (clean it before passing).
* y 
  The target column

The parameters below are optional.

* metric
  Metric used to evaluate the models. By default, it is mean squared error for regression and accuracy score for classification
* test_Size 
  Fraction of the data held out for testing, by default = 0.3 (70% training, 30% testing)
* folds (only in GMLClassifier)
  The data is also validated using K-fold cross-validation. Pass the number of folds. By default folds = 5
* shuffle
  Shuffle the data when splitting for validation. By default = True
* scaler
  For the scaler, pass:  
    'SS' for StandardScaler
    'MM' for MinMaxScaler
    'log' for log scaling
     None for no scaling
  By default: StandardScaler
* models
  Have your own models to compete against ours? Pass them in a list here. default = None
* neural_net
  Want to train on Neural Networks? Pass 'Yes', default = 'No'
* epochs
  for neural networks, by default = 10 
* verbose
  for neural networks, by default = True
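
As a rough guide to what the scaler codes mean, the sketch below applies each option to a single column using only the standard library. It is an illustration, not GML's code: the helper `scale_column` is hypothetical, 'SS' and 'MM' follow the usual standardization and min-max formulas, and the use of log(1 + x) for 'log' is an assumption, since the exact transform is not stated here.

```python
import math
from statistics import mean, pstdev
from typing import List, Optional


def scale_column(values: List[float], scaler: Optional[str] = "SS") -> List[float]:
    """Scale one numeric column according to a scaler code."""
    if scaler is None:
        # No scaling: return the values unchanged.
        return list(values)
    if scaler == "SS":
        # Standardization: zero mean, unit (population) standard deviation.
        # Note: fails on a constant column, where the deviation is zero.
        mu, sigma = mean(values), pstdev(values)
        return [(v - mu) / sigma for v in values]
    if scaler == "MM":
        # Min-max scaling into the range [0, 1].
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) for v in values]
    if scaler == "log":
        # Assumption: log(1 + x), which requires x > -1.
        return [math.log1p(v) for v in values]
    raise ValueError(f"unknown scaler code: {scaler!r}")


data = [1.0, 2.0, 3.0, 4.0, 5.0]
standardized = scale_column(data, "SS")
min_maxed = scale_column(data, "MM")
logged = scale_column(data, "log")
print(standardized, min_maxed, logged, sep="\n")
```

Standardization and min-max scaling only shift and rescale the column, while a log transform compresses large values, which is why the choice matters for skewed features.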

Parameters when creating a GML object

models = Ghalat_Machine_Learning(n_estimators=300)
  • By default n_estimators is 300; you can change it to whatever you want.

As this is the first version of GML, feel free to give suggestions, ask questions, report bugs, etc. in the issues section of this repository!
You can contact me directly at: m.ahmed.memonn@gmail.com

I haven't uploaded the source code to this repo yet. I will upload it later, after writing comments.
