
Generic model API and Model Zoo for TensorFlow, Keras, and PyTorch, with hyperparameter search


mlmodels

This repository is a Model Zoo for PyTorch, TensorFlow, Keras, Gluon, LightGBM, and scikit-learn models. It provides a lightweight functional interface that wraps recent and state-of-the-art deep learning and ML models, plus hyper-parameter search, across platforms. The interface follows the logic of scikit-learn: fit, predict, transform, metrics, save, load, and so on. Recent models are available in the following fields:

  • Time Series,
  • Text classification,
  • Vision,
  • Image generation, text generation,
  • Gradient Boosting, Automatic Machine Learning tuning,
  • Hyper-parameter search.

To turn script or research code into reusable batch code with minimal changes, we use a functional interface instead of pure OOP. A functional style reduces the amount of code needed, which suits scientific computing: we can focus on the computation rather than on the design. It is also easy to maintain for medium-sized projects.
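
The functional convention can be illustrated with a toy module. This is a minimal sketch, not code from mlmodels: the `Model` container and the mean-predicting `fit`/`predict` functions are hypothetical, and only the call shapes (`Model(model_pars, data_pars, compute_pars)`, `fit` returning `(model, sess)`) mirror the examples further down.

```python
# Hypothetical toy module: a thin Model container plus module-level functions.
from statistics import mean


class Model:
    """Holds parameters and fitted state only; behaviour lives in functions."""

    def __init__(self, model_pars=None, data_pars=None, compute_pars=None):
        self.model_pars = model_pars or {}
        self.data_pars = data_pars or {}
        self.compute_pars = compute_pars or {}
        self.state = None  # filled in by fit()


def fit(model, data_pars=None, compute_pars=None, out_pars=None):
    """'Train' by storing the mean of the targets; returns (model, session)."""
    targets = (data_pars or model.data_pars)["y"]
    model.state = mean(targets)
    return model, None  # this toy model needs no session object


def predict(model, sess=None, data_pars=None, compute_pars=None, out_pars=None):
    """Predict the stored mean for every input row."""
    rows = (data_pars or model.data_pars)["X"]
    return [model.state for _ in rows]


data_pars = {"X": [[1], [2], [3]], "y": [2.0, 4.0, 6.0]}
model = Model(model_pars={}, data_pars=data_pars, compute_pars={})
model, sess = fit(model, data_pars=data_pars)
ypred = predict(model, sess, data_pars=data_pars)
print(ypred)
```

Because the behaviour lives in plain functions, wrapping an existing research script mostly means moving its training and inference code into `fit` and `predict`.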

A collection of Deep Learning and Machine Learning research papers is available in this repository.


Benefits :

Having a standard framework for both machine learning and deep learning models is a step towards automatic machine learning. The model zoo spans PyTorch, TensorFlow, and Keras, which removes the dependency on any one framework and opens richer possibilities for model benchmarking and reuse. The main strengths of MLMODELS are a single, simple interface, zero boilerplate code (!), and recent state-of-the-art models and frameworks. The emphasis is not only on traditional machine learning algorithms but also on recent state-of-the-art deep learning algorithms. Deep learning is considered very useful for processing high-dimensional data, and it has produced significant improvements in applications such as computer vision, natural language processing, object detection, facial recognition, and speech recognition.
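
A shared interface is what makes benchmarking across back ends a simple loop. The sketch below assumes two stand-in "modules" built with a hypothetical `make_module` factory; real back ends would be loaded by URI instead, but the benchmarking loop would keep the same shape.

```python
# Stand-in modules only: make_module and the registry names are hypothetical.
from statistics import mean, median
from types import SimpleNamespace


def make_module(summary):
    """Build a stand-in 'module' that predicts summary(y) for every row."""

    def Model(model_pars=None, data_pars=None, compute_pars=None):
        return {"value": None}

    def fit(model, data_pars=None, compute_pars=None, out_pars=None):
        model["value"] = summary(data_pars["y"])
        return model, None

    def predict(model, sess=None, data_pars=None, compute_pars=None, out_pars=None):
        return [model["value"]] * len(data_pars["X"])

    return SimpleNamespace(Model=Model, fit=fit, predict=predict)


registry = {"mean_baseline": make_module(mean), "median_baseline": make_module(median)}
data_pars = {"X": [[0], [1], [2], [3]], "y": [1.0, 2.0, 3.0, 10.0]}

scores = {}
for name, module in registry.items():
    # The same three calls work for every entry in the registry.
    model = module.Model(model_pars={}, data_pars=data_pars, compute_pars={})
    model, sess = module.fit(model, data_pars=data_pars)
    ypred = module.predict(model, sess, data_pars=data_pars)
    errors = [abs(p - t) for p, t in zip(ypred, data_pars["y"])]
    scores[name] = sum(errors) / len(errors)  # mean absolute error

print(scores)
```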

The usage guide is available here.

Model List :

Time Series:

  1. MILA, Nbeats: 2019, Advanced interpretable Time Series Neural Network, [Link]

  2. Amazon Deep AR: 2019, Multi-variate Time Series Neural Network, [Link]

  3. Facebook Prophet: 2017, Time Series prediction, [Link]

  4. ARMDN, Advanced Multi-variate Time series Prediction : 2019, Associative and Recurrent Mixture Density Networks for time series. [Link]

  5. LSTM Neural Network prediction : Stacked Bidirectional and Unidirectional LSTM Recurrent Neural Network for Network-wide Traffic Speed Prediction [Link]

NLP:

  1. Sentence Transformers : 2019, Embedding of full sentences using BERT, [Link]

  2. Transformers Classifier : Using Transformer for Text Classification, [Link]

  3. TextCNN Pytorch : 2016, Text CNN Classifier, [Link]

  4. TextCNN Keras : 2016, Text CNN Classifier, [Link]

  5. Bi-directional Conditional Random Field LSTM for Named Entity Recognition, [Link]

  6. DRMM: Deep Relevance Matching Model for Ad-hoc Retrieval. [Link]

  7. DRMMTKS: Deep Top-K Relevance Matching Model for Ad-hoc Retrieval. [Link]

  8. ARC-I: Convolutional Neural Network Architectures for Matching Natural Language Sentences [Link]

  9. ARC-II: Convolutional Neural Network Architectures for Matching Natural Language Sentences [Link]

TABULAR:

LightGBM : Light Gradient Boosting

AutoML Gluon : 2020, AutoML in Gluon, MxNet using LightGBM, CatBoost

Auto-Keras : 2020, Automatic Keras model selection

All sklearn models :

linear_model.ElasticNet, linear_model.ElasticNetCV, linear_model.Lars, linear_model.LarsCV, linear_model.Lasso, linear_model.LassoCV, linear_model.LassoLars, linear_model.LassoLarsCV, linear_model.LassoLarsIC, linear_model.OrthogonalMatchingPursuit, linear_model.OrthogonalMatchingPursuitCV

svm.LinearSVC, svm.LinearSVR, svm.NuSVC, svm.NuSVR, svm.OneClassSVM, svm.SVC, svm.SVR, svm.l1_min_c

neighbors.KNeighborsClassifier, neighbors.KNeighborsRegressor, neighbors.KNeighborsTransformer
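
Names in this list follow the pattern `<submodule>.<Name>`. As a hedged illustration of how such a string can be turned into a class, the sketch below uses `importlib`; `resolve_estimator` is a hypothetical helper, and whether mlmodels resolves names this way is an assumption.

```python
import importlib


def resolve_estimator(dotted_name, package="sklearn"):
    """Turn '<submodule>.<Name>' into the object package.<submodule>.<Name>."""
    submodule, _, attr = dotted_name.rpartition(".")
    if not submodule:
        raise ValueError(f"expected '<submodule>.<Name>', got {dotted_name!r}")
    # Import only the submodule that is needed, then look the name up on it.
    module = importlib.import_module(f"{package}.{submodule}")
    return getattr(module, attr)
```

With scikit-learn installed, `resolve_estimator("svm.SVC")` would return the `SVC` class, which can then be instantiated with the remaining entries of `model_pars`. Note that `svm.l1_min_c` in the list is a helper function rather than an estimator.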

Binary Neural Prediction from tabular data:

VISION:

Vision Models (pre-trained) :

  1. alexnet: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size [Link]

  2. densenet121: Adversarial Perturbations Prevail in the Y-Channel of the YCbCr Color Space [Link]

  3. densenet169: Classification of TrashNet Dataset Based on Deep Learning Models [Link]

  4. densenet201: Utilization of DenseNet201 for diagnosis of breast abnormality [Link]

  5. densenet161: Automated classification of histopathology images using transfer learning [Link]

  6. inception_v3: Menfish Classification Based on Inception_V3 Convolutional Neural Network [Link]

  7. resnet18: Leveraging the VTA-TVM Hardware-Software Stack for FPGA Acceleration of 8-bit ResNet-18 Inference [Link]

  8. resnet34: Automated Pavement Crack Segmentation Using Fully Convolutional U-Net with a Pretrained ResNet-34 Encoder [Link]

  9. resnet50: Extremely Large Minibatch SGD: Training ResNet-50 on ImageNet in 15 Minutes [Link]

  10. resnet101: Classification of Cervical MR Images using ResNet101 [Link]

  11. resnet152: Deep neural networks show an equivalent and often superior performance to dermatologists in onychomycosis diagnosis: Automatic construction of onychomycosis datasets by region-based convolutional deep neural network [Link]

More resources are available here

######################################################################################

① Installation Guide:

(A) Using pre-installed Setup (one click) :

Read-more

(C) Using Colab :

Read-more

Initialize template and Tests

Will copy template, dataset, example to your folder

ml_models --init  /yourworkingFolder/
To test Hyper-parameter search:
ml_optim
To test model fitting
ml_models

Actual test runs

Read-more

test_fast_linux

test_fast_windows

 All model testing (Linux)


Usage in Jupyter/Colab

Read-more


Command Line tools:

Read-more


Model List

Read-more


How to add a new model

Read-more


Index of functions/methods

Read-more


LSTM example in TensorFlow (Example notebook)

Define model and data definitions

# import library
import mlmodels


model_uri    = "model_tf.1_lstm.py"
# ncol_input / ncol_output: number of input / output columns; define them before this point
model_pars   =  {  "num_layers": 1,
                  "size": ncol_input, "size_layer": 128, "output_size": ncol_output, "timestep": 4,
                }
data_pars    =  {"data_path": "/folder/myfile.csv"  , "data_type": "pandas" }
compute_pars =  { "learning_rate": 0.001, }

out_pars     =  { "path": "ztest_1lstm/", "model_path" : "ztest_1lstm/model/"}
save_pars = { "path" : "ztest_1lstm/model/" }
load_pars = { "path" : "ztest_1lstm/model/" }


#### Load Parameters and Train
from mlmodels.models import module_load

module        =  module_load( model_uri= model_uri )                           # Load file definition
model         =  module.Model(model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars)             # Create Model instance
model, sess   =  module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)


#### Inference
metrics_val   =  module.fit_metrics( model, sess, data_pars, compute_pars, out_pars) # get stats
ypred         = module.predict(model, sess,  data_pars, compute_pars, out_pars)     # predict pipeline
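
Each example starts from a `model_uri` string such as "model_tf.1_lstm.py". How `module_load` resolves it is not shown here; one plausible reading is that the URI names a module file inside the package. A minimal sketch under that assumption (`uri_to_module_name` and `load_module` are hypothetical helpers, not part of mlmodels):

```python
import importlib


def uri_to_module_name(model_uri, package="mlmodels"):
    """Map 'model_tf.1_lstm.py' to 'mlmodels.model_tf.1_lstm' (assumed convention)."""
    stem = model_uri[:-3] if model_uri.endswith(".py") else model_uri
    return f"{package}.{stem}"


def load_module(model_uri, package="mlmodels"):
    """Import the module behind a URI and return it."""
    # importlib accepts names such as '1_lstm', which a plain
    # `import` statement rejects because they start with a digit.
    return importlib.import_module(uri_to_module_name(model_uri, package))


print(uri_to_module_name("model_tf.1_lstm.py"))
```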

AutoML example in Gluon (Example notebook)

# import library
import mlmodels
import autogluon as ag

#### Define model and data definitions
model_uri = "model_gluon.gluon_automl.py"
data_pars = {"train": True, "uri_type": "amazon_aws", "dt_name": "Inc"}

model_pars = {"model_type": "tabular",
              "learning_rate": ag.space.Real(1e-4, 1e-2, default=5e-4, log=True),
              "activation": ag.space.Categorical(*tuple(["relu", "softrelu", "tanh"])),
              "layers": ag.space.Categorical(
                          *tuple([[100], [1000], [200, 100], [300, 200, 100]])),
              'dropout_prob': ag.space.Real(0.0, 0.5, default=0.1),
              'num_boost_round': 10,
              'num_leaves': ag.space.Int(lower=26, upper=66, default=36)   # default must lie within [lower, upper]
             }

compute_pars = {
    "hp_tune": True,
    "num_epochs": 10,
    "time_limits": 120,
    "num_trials": 5,
    "search_strategy": "skopt"
}

out_pars = {
    "out_path": "dataset/"
}



#### Load Parameters and Train
from mlmodels.models import module_load

module        =  module_load( model_uri= model_uri )                           # Load file definition
model         =  module.Model(model_pars=model_pars, compute_pars=compute_pars)             # Create Model instance
model, sess   =  module.fit(model, data_pars=data_pars, model_pars=model_pars, compute_pars=compute_pars, out_pars=out_pars)      


#### Inference
ypred       = module.predict(model, data_pars, compute_pars, out_pars)     # predict pipeline

RandomForest example in Scikit-learn (Example notebook)

# import library
import mlmodels

#### Define model and data definitions
model_uri    = "model_sklearn.sklearn.py"

model_pars   = {"model_name":  "RandomForestClassifier", "max_depth" : 4 , "random_state":0}

data_pars    = {'mode': 'test', 'path': "../mlmodels/dataset", 'data_type' : 'pandas' }

compute_pars = {'return_pred_not': False}

out_pars    = {'path' : "../ztest"}


#### Load Parameters and Train
from mlmodels.models import module_load

module        =  module_load( model_uri= model_uri )                           # Load file definition
model         =  module.Model(model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars)             # Create Model instance
model, sess   =  module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)          # fit the model


#### Inference
ypred       = module.predict(model,  data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)     # predict pipeline

TextCNN example in keras (Example notebook)

# import library
import mlmodels

#### Define model and data definitions
model_uri    = "model_keras.textcnn.py"

data_pars    = {"path" : "../mlmodels/dataset/text/imdb.csv", "train": 1, "maxlen":400, "max_features": 10}

model_pars   = {"maxlen":400, "max_features": 10, "embedding_dims":50}

compute_pars = {"engine": "adam", "loss": "binary_crossentropy", "metrics": ["accuracy"] ,
                        "batch_size": 32, "epochs":1, 'return_pred_not':False}

out_pars     = {"path": "ztest/model_keras/textcnn/"}



#### Load Parameters and Train
from mlmodels.models import module_load

module        =  module_load( model_uri= model_uri )                           # Load file definition
model         =  module.Model(model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars)             # Create Model instance
module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)          # fit the model


#### Inference
data_pars['train'] = 0
ypred       = module.predict(model,  data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)
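
The `maxlen` and `max_features` entries appear in both `data_pars` and `model_pars`. How the library preprocesses text is not shown here; a common reading is that `max_features` caps the vocabulary size and `maxlen` fixes the sequence length. The sketch below illustrates that reading with hypothetical helpers (`build_vocab`, `encode`) and reserved ids chosen for the example.

```python
from collections import Counter

PAD_ID, OOV_ID = 0, 1  # reserved ids, chosen for this sketch


def build_vocab(texts, max_features):
    """Keep the (max_features - 2) most frequent tokens; ids 0 and 1 are reserved."""
    counts = Counter(tok for text in texts for tok in text.split())
    keep = [tok for tok, _ in counts.most_common(max_features - 2)]
    return {tok: idx + 2 for idx, tok in enumerate(keep)}


def encode(tokens, vocab, maxlen):
    """Map tokens to ids, keep the first maxlen ids, and left-pad to exactly maxlen."""
    ids = [vocab.get(tok, OOV_ID) for tok in tokens][:maxlen]
    return [PAD_ID] * (maxlen - len(ids)) + ids


texts = ["good movie", "bad movie", "good good plot"]
vocab = build_vocab(texts, max_features=4)
short = encode("bad movie".split(), vocab, maxlen=4)
long_ = encode("good good plot movie good".split(), vocab, maxlen=3)
print(vocab, short, long_)
```

Every encoded row has exactly `maxlen` entries and no id reaches `max_features`, which is why the two values have to agree between the data and model parameters.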

Using json config file for input (Example notebook, JSON file)

Import library and functions

# import library
import mlmodels

#### Load model and data definitions from json
from mlmodels.models import module_load
from mlmodels.util import load_config

model_uri    = "model_tf.1_lstm.py"
module        =  module_load( model_uri= model_uri )                           # Load file definition

model_pars, data_pars, compute_pars, out_pars = module.get_params(param_pars={
    'choice':'json',
    'config_mode':'test',
    'data_path':'../mlmodels/example/1_lstm.json'
})

#### Load parameters and train
model         =  module.Model(model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars)             # Create Model instance
model, sess   =  module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)          # fit the model

#### Check inference
ypred       = module.predict(model, sess=sess,  data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)     # predict pipeline
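
The JSON file itself is not reproduced here. As a hedged sketch, the layout below assumes one top-level key per `config_mode` ("test" here), each holding the four parameter dictionaries; the values are copied from the LSTM example above, and `load_params` is a hypothetical stand-in for the library's loader.

```python
import json
import tempfile
from pathlib import Path

SECTIONS = ("model_pars", "data_pars", "compute_pars", "out_pars")


def load_params(config_path, config_mode="test"):
    """Return the four parameter dicts stored under config_mode."""
    with open(config_path, encoding="utf-8") as fh:
        config = json.load(fh)
    block = config[config_mode]  # KeyError if the mode is missing
    return tuple(block.get(name, {}) for name in SECTIONS)


example = {
    "test": {
        "model_pars": {"num_layers": 1, "size_layer": 128, "timestep": 4},
        "data_pars": {"data_path": "/folder/myfile.csv", "data_type": "pandas"},
        "compute_pars": {"learning_rate": 0.001},
        "out_pars": {"path": "ztest_1lstm/"},
    }
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "1_lstm.json"
    path.write_text(json.dumps(example, indent=2), encoding="utf-8")
    model_pars, data_pars, compute_pars, out_pars = load_params(path, "test")

print(model_pars["timestep"], compute_pars["learning_rate"])
```

Keeping several modes in one file lets a small "test" configuration and a full "prod" configuration share the same schema.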

Using Scikit-learn's SVM for Titanic Problem from json file (Example notebook, JSON file)

Import library and functions

# import library
import mlmodels

#### Load model and data definitions from json
from mlmodels.models import module_load
from mlmodels.util import load_config

model_uri    = "model_sklearn.sklearn.py"
module        =  module_load( model_uri= model_uri )                           # Load file definition

model_pars, data_pars, compute_pars, out_pars = module.get_params(param_pars={
    'choice':'json',
    'config_mode':'test',
    'data_path':'../mlmodels/example/sklearn_titanic_svm.json'
})

#### Load Parameters and Train

model         =  module.Model(model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars)             # Create Model instance
model, sess   =  module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)          # fit the model


#### Inference
ypred       = module.predict(model,  data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)     # predict pipeline
ypred


#### Check metrics
import pandas as pd
from sklearn.metrics import roc_auc_score

y = pd.read_csv('../mlmodels/dataset/tabular/titanic_train_preprocessed.csv')['Survived'].values
roc_auc_score(y, ypred)
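
Two things are worth keeping in mind when reading this score. The labels come from a file named as training data, so the result reads as an in-sample score. And ROC AUC ranks predictions, so it is more informative with probability-like scores than with hard 0/1 labels. The pure-Python sketch below shows the pairwise definition behind the metric (`roc_auc` is an illustrative helper, not the scikit-learn implementation).

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tied pair counts as half a correct ranking.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y = [0, 0, 1, 1]
auc_scores = roc_auc(y, [0.1, 0.4, 0.35, 0.8])  # probability-like scores
auc_labels = roc_auc(y, [0, 1, 0, 1])           # hard 0/1 predictions
print(auc_scores, auc_labels)
```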

Using Scikit-learn's Random Forest for Titanic Problem from json file (Example notebook, JSON file)

Import library and functions

# import library
import mlmodels

#### Load model and data definitions from json
from mlmodels.models import module_load
from mlmodels.util import load_config

model_uri    = "model_sklearn.sklearn.py"
module        =  module_load( model_uri= model_uri )                           # Load file definition

model_pars, data_pars, compute_pars, out_pars = module.get_params(param_pars={
    'choice':'json',
    'config_mode':'test',
    'data_path':'../mlmodels/example/sklearn_titanic_randomForest.json'
})


#### Load Parameters and Train
model         =  module.Model(model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars)             # Create Model instance
model, sess   =  module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)          # fit the model


#### Inference

ypred       = module.predict(model,  data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)     # predict pipeline
ypred

#### Check metrics
import pandas as pd
from sklearn.metrics import roc_auc_score

y = pd.read_csv('../mlmodels/dataset/tabular/titanic_train_preprocessed.csv')['Survived'].values
roc_auc_score(y, ypred)

Using Autogluon for Titanic Problem from json file (Example notebook, JSON file)

Import library and functions

# import library
import mlmodels

#### Load model and data definitions from json
from mlmodels.models import module_load
from mlmodels.util import load_config

model_uri    = "model_gluon.gluon_automl.py"
module        =  module_load( model_uri= model_uri )                           # Load file definition

model_pars, data_pars, compute_pars, out_pars = module.get_params(
    choice='json',
    config_mode= 'test',
    data_path= '../mlmodels/example/gluon_automl.json'
)


#### Load Parameters and Train
model         =  module.Model(model_pars=model_pars, compute_pars=compute_pars)             # Create Model instance
model   =  module.fit(model, model_pars=model_pars, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)          # fit the model
model.model.fit_summary()


#### Check inference
ypred       = module.predict(model,  data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)     # predict pipeline

#### Check metrics
model.model.model_performance

import pandas as pd
from sklearn.metrics import roc_auc_score

y = pd.read_csv('../mlmodels/dataset/tabular/titanic_train_preprocessed.csv')['Survived'].values
roc_auc_score(y, ypred)


Using hyper-params (optuna) for Titanic Problem from json file (Example notebook, JSON file)

Import library and functions

# import library
from mlmodels.models import module_load
from mlmodels.optim import optim
from mlmodels.util import params_json_load, path_norm


#### Load model and data definitions from json

###  hypermodel_pars, model_pars, ....
model_uri   = "model_sklearn.sklearn.py"
config_path = path_norm( 'example/hyper_titanic_randomForest.json'  )
config_mode = "test"  ### test/prod



#### Model Parameters
hypermodel_pars, model_pars, data_pars, compute_pars, out_pars = params_json_load(config_path, config_mode= config_mode)
print( hypermodel_pars, model_pars, data_pars, compute_pars, out_pars)


module            =  module_load( model_uri= model_uri )                      
model_pars_update = optim(
    model_uri       = model_uri,
    hypermodel_pars = hypermodel_pars,
    model_pars      = model_pars,
    data_pars       = data_pars,
    compute_pars    = compute_pars,
    out_pars        = out_pars
)


#### Load Parameters and Train
model         =  module.Model(model_pars=model_pars_update, data_pars=data_pars, compute_pars=compute_pars)
model, sess   =  module.fit(model, data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)

#### Check inference
ypred         = module.predict(model,  data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)     # predict pipeline
ypred


#### Check metrics
import pandas as pd
from sklearn.metrics import roc_auc_score

y = pd.read_csv( path_norm('dataset/tabular/titanic_train_preprocessed.csv') )
y = y['Survived'].values
roc_auc_score(y, ypred)
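
The search step returns an updated `model_pars` dictionary that is then used to build the final model. The sketch below shows that contract with an exhaustive grid search as a stand-in; it is not how optuna samples, and `grid_search`, `toy_objective`, and the search space are hypothetical.

```python
from itertools import product


def grid_search(objective, space, base_pars):
    """Try every combination in space; return base_pars updated with the best one."""
    names = list(space)
    best_pars, best_score = None, float("-inf")
    for values in product(*(space[name] for name in names)):
        candidate = dict(zip(names, values))
        score = objective({**base_pars, **candidate})
        if score > best_score:
            best_pars, best_score = candidate, score
    return {**base_pars, **best_pars}, best_score


def toy_objective(pars):
    """Stand-in for a validation score; peaks at max_depth=4, n_estimators=100."""
    return -abs(pars["max_depth"] - 4) - abs(pars["n_estimators"] - 100) / 100


space = {"max_depth": [2, 3, 4, 5, 6], "n_estimators": [50, 100, 200]}
base = {"model_name": "RandomForestClassifier", "random_state": 0}
model_pars_update, score = grid_search(toy_objective, space, base)
print(model_pars_update, score)
```

In a real run the objective would fit the model on training data and return a validation metric, and a sampler-based search would replace the full grid once the space grows.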

Using LightGBM for Titanic Problem from json file (Example notebook, JSON file)

Import library and functions

# import library
import mlmodels
from mlmodels.models import module_load
from mlmodels.util import path_norm_dict, path_norm
import json

#### Load model and data definitions from json
# Model definition
model_uri    = "model_sklearn.model_lightgbm.py"
module        =  module_load( model_uri= model_uri)

# Path to JSON
data_path = '../dataset/json/lightgbm_titanic.json'  

# Model Parameters
with open(data_path, mode='r') as fp:   # close the file once it has been read
    pars = json.load(fp)
for key, pdict in  pars.items() :
  globals()[key] = path_norm_dict( pdict   )   ###Normalize path

#### Load Parameters and Train
model = module.Model(model_pars, data_pars, compute_pars) # create model instance
model, session = module.fit(model, data_pars, compute_pars, out_pars) # fit model


#### Check inference
ypred       = module.predict(model,  data_pars=data_pars, compute_pars=compute_pars, out_pars=out_pars)     # get predictions
ypred


#### Check metrics
metrics_val = module.fit_metrics(model, data_pars, compute_pars, out_pars)
metrics_val 

Using Vision CNN RESNET18 for MNIST dataset (Example notebook, JSON file)

# import library
import mlmodels
from mlmodels.models import module_load
from mlmodels.util import path_norm_dict, path_norm, params_json_load
import json


#### Model URI and Config JSON
model_uri   = "model_tch.torchhub.py"
config_path = path_norm( 'model_tch/torchhub_cnn.json'  )
config_mode = "test"  ### test/prod


#### Model Parameters
hypermodel_pars, model_pars, data_pars, compute_pars, out_pars = params_json_load(config_path, config_mode= config_mode)
print( hypermodel_pars, model_pars, data_pars, compute_pars, out_pars)


#### Setup Model 
module         = module_load( model_uri)
model          = module.Model(model_pars, data_pars, compute_pars) 
#### Fit
model, session = module.fit(model, data_pars, compute_pars, out_pars)           #### fit model
metrics_val    = module.fit_metrics(model, data_pars, compute_pars, out_pars)   #### Check fit metrics
print(metrics_val)


#### Inference
ypred          = module.predict(model, session, data_pars, compute_pars, out_pars)   
print(ypred)

Using ARMDN Time Series (Example notebook, JSON file)

# import library
import mlmodels
from mlmodels.models import module_load
from mlmodels.util import path_norm_dict, path_norm, params_json_load
import json


#### Model URI and Config JSON
model_uri   = "model_keras.armdn.py"
config_path = path_norm( 'model_keras/armdn.json'  )
config_mode = "test"  ### test/prod




#### Model Parameters
hypermodel_pars, model_pars, data_pars, compute_pars, out_pars = params_json_load(config_path, config_mode= config_mode)
print( hypermodel_pars, model_pars, data_pars, compute_pars, out_pars)


#### Setup Model 
module         = module_load( model_uri)
model          = module.Model(model_pars, data_pars, compute_pars) 
#### Fit
model, session = module.fit(model, data_pars, compute_pars, out_pars)           #### fit model
metrics_val    = module.fit_metrics(model, data_pars, compute_pars, out_pars)   #### Check fit metrics
print(metrics_val)


#### Inference
ypred          = module.predict(model, session, data_pars, compute_pars, out_pars)   
print(ypred)



#### Save/Load
module.save(model, save_pars ={ 'path': out_pars['path'] +"/model/"})

model2 = module.load(load_pars ={ 'path': out_pars['path'] +"/model/"})
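
The save/load pair takes a dictionary with a `path` entry rather than a bare file name. A minimal round-trip sketch using `pickle` is shown below; the `model.pkl` file name and the dictionary standing in for a model are assumptions, and the actual serialization used by each back end may differ.

```python
import os
import pickle
import tempfile


def save(model, save_pars):
    """Write the model under save_pars['path'], creating the folder if needed."""
    os.makedirs(save_pars["path"], exist_ok=True)
    with open(os.path.join(save_pars["path"], "model.pkl"), "wb") as fh:
        pickle.dump(model, fh)


def load(load_pars):
    """Read the model back; only unpickle files from a trusted source."""
    with open(os.path.join(load_pars["path"], "model.pkl"), "rb") as fh:
        return pickle.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    out_pars = {"path": tmp}
    model_dir = os.path.join(out_pars["path"], "model")  # avoids doubled separators
    model = {"weights": [0.1, 0.2], "timestep": 4}       # stand-in for a fitted model
    save(model, save_pars={"path": model_dir})
    model2 = load(load_pars={"path": model_dir})

print(model2 == model)
```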
