A consistent interface for creating Machine Learning Models compatible with VisualFabriq environment

portalytics

Portable Jupyter Setup for Machine Learning.

Build models using our portalytics module. The module is available as a pip package; install it with:

pip install vf-portalytics

Pay attention to the requirements: the model must be built with the package versions that we support.

There are examples of how to use portalytics, both for a simple model and for more complex models such as MultiModel.

After saving a model with portalytics, make sure that it can be loaded again and still contains all the important information (e.g. is the loaded model able to perform a prediction?).
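One way to run such a check is a save/load round trip that compares predictions before and after. The sketch below is only an illustration under stated assumptions: it uses the standard library's pickle module instead of portalytics' own save and load calls, the "model" is a plain dictionary of fitted state, and fit_mean, predict and roundtrip_ok are hypothetical helper names that do not come from vf_portalytics.

```python
import os
import pickle
import tempfile


def fit_mean(targets):
    """Hypothetical stand-in for training: the fitted state is just the target mean."""
    return {"kind": "mean", "mean": sum(targets) / len(targets)}


def predict(model, rows):
    """Predict the stored mean for every input row."""
    return [model["mean"] for _ in rows]


def roundtrip_ok(model, rows):
    """Pickle the model to a temporary file, reload it, and compare predictions."""
    expected = predict(model, rows)
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "model.pkl")
        with open(path, "wb") as fh:
            pickle.dump(model, fh)
        with open(path, "rb") as fh:
            loaded = pickle.load(fh)
    # The reloaded model must still be able to predict, with identical results.
    return predict(loaded, rows) == expected


rows = [[800], [700]]
model = fit_mean([1800, 1700])
print(roundtrip_ok(model, rows))
```

With a real model the same pattern applies: predict, save, load through the wrapper, predict again, and compare the two results.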

MultiModel and MultiTransformer

MultiModel is a custom sklearn model that contains one model for each group of training data. It is valuable when the dataset varies a lot between groups, but we still want to manage a single model because the problem is the same.

  • Define the groups with the input parameters clusters, a list of all possible groups, and group_col, a string naming the feature in which the groups can be found.

  • selected_features makes it possible to use different features for each group.

  • params makes it possible to use a different model and categorical-feature transformer for each group.
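The idea behind these parameters can be sketched without any dependencies. The class below is not MultiModel and does not reproduce its signature; GroupRatioModel is a hypothetical name, the per-group "model" is a simple ratio between target and summed features, and using account_id as group_col is an assumption made for the example. It only shows how clusters, group_col and selected_features could interact: one sub-model per group, each looking at its own features.

```python
class GroupRatioModel:
    """Hypothetical per-group model: one target/feature ratio per cluster."""

    def __init__(self, group_col, clusters, selected_features):
        self.group_col = group_col
        self.clusters = clusters
        self.selected_features = selected_features

    def _signal(self, row):
        # Sum only the features selected for the group this row belongs to.
        features = self.selected_features[row[self.group_col]]
        return sum(row[name] for name in features)

    def fit(self, rows, targets):
        # Accumulate [feature total, target total] separately for each cluster.
        totals = {cluster: [0.0, 0.0] for cluster in self.clusters}
        for row, target in zip(rows, targets):
            sums = totals[row[self.group_col]]
            sums[0] += self._signal(row)
            sums[1] += target
        # One fitted ratio per cluster that actually received usable data.
        self.ratios_ = {c: y / x for c, (x, y) in totals.items() if x}
        return self

    def predict(self, rows):
        # Route each row to the sub-model of its own group.
        return [self.ratios_[row[self.group_col]] * self._signal(row) for row in rows]


train_rows = [
    {"account_id": "pa_1", "baseline_units": 800, "discount_perc": 10},
    {"account_id": "pa_1", "baseline_units": 700, "discount_perc": 20},
    {"account_id": "pa_2", "baseline_units": 100, "discount_perc": 50},
]
model = GroupRatioModel(
    group_col="account_id",
    clusters=["pa_1", "pa_2"],
    selected_features={"pa_1": ["baseline_units"], "pa_2": ["discount_perc"]},
).fit(train_rows, [1600, 1400, 150])

new_rows = [
    {"account_id": "pa_1", "baseline_units": 500, "discount_perc": 0},
    {"account_id": "pa_2", "baseline_units": 500, "discount_perc": 10},
]
predictions = model.predict(new_rows)
print(predictions)
```

The two new rows share the same baseline_units, yet they are scored by different sub-models on different features, which is the behaviour the bullets above describe.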

The Jupyter notebook multimodel_example.ipynb contains an end-to-end example of how a MultiModel can be trained and saved using the vf_portalytics Model wrapper.

MultiModel can support every sklearn-based model; the only thing that needs to be done is to extend the POTENTIAL_MODELS dictionary. Feel free to raise a PR.
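Extending such a dictionary can be pictured as adding an entry to a name-to-class registry. The sketch below assumes that the dictionary maps a model name to an estimator class, which the text above does not confirm. The POTENTIAL_MODELS defined here is a local stand-in, not the one shipped in vf_portalytics, and build_model, MeanRegressor and MedianRegressor are hypothetical names.

```python
import statistics


class MeanRegressor:
    """Hypothetical estimator that always predicts the training mean."""

    def fit(self, X, y):
        self.value_ = statistics.mean(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


class MedianRegressor:
    """Hypothetical estimator that always predicts the training median."""

    def fit(self, X, y):
        self.value_ = statistics.median(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


# Local stand-in registry (assumed shape: model name -> estimator class).
POTENTIAL_MODELS = {"MeanRegressor": MeanRegressor}


def build_model(name, **kwargs):
    """Look up an estimator class by name and instantiate it."""
    if name not in POTENTIAL_MODELS:
        known = ", ".join(sorted(POTENTIAL_MODELS))
        raise KeyError(f"{name!r} is not registered; known models: {known}")
    return POTENTIAL_MODELS[name](**kwargs)


# Supporting a new model is then a single new registry entry.
POTENTIAL_MODELS["MedianRegressor"] = MedianRegressor

model = build_model("MedianRegressor").fit([[1], [2], [3]], [10, 20, 90])
print(model.predict([[4]]))
```

Keeping the lookup in one place means that a per-group configuration can refer to models by name, and an unknown name fails early with a readable message.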

MultiTransformer is the transformer that is being used inside MultiModel to transform categorical features into numbers. It is a custom sklearn transformer that contains one transformer for each group of training data.

  • It can also be used separately, in the same way as MultiModel. Check the example.
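A per-group categorical transformer can be sketched in the same spirit. GroupOrdinalEncoder below is a hypothetical class, not MultiTransformer itself. The ordinal encoding, the -1 code for unseen categories, the string promotion types and the use of account_id as group column are all choices made for this illustration.

```python
class GroupOrdinalEncoder:
    """Hypothetical per-group transformer: one category-to-integer mapping per group."""

    def __init__(self, group_col, clusters, categorical_col):
        self.group_col = group_col
        self.clusters = clusters
        self.categorical_col = categorical_col

    def fit(self, rows):
        self.mappings_ = {cluster: {} for cluster in self.clusters}
        for row in rows:
            mapping = self.mappings_[row[self.group_col]]
            # Assign the next free integer the first time a category is seen.
            mapping.setdefault(row[self.categorical_col], len(mapping))
        return self

    def transform(self, rows):
        encoded = []
        for row in rows:
            mapping = self.mappings_[row[self.group_col]]
            new_row = dict(row)  # copy so the input rows stay untouched
            # Categories unseen during fit are encoded as -1 (a choice of this sketch).
            new_row[self.categorical_col] = mapping.get(row[self.categorical_col], -1)
            encoded.append(new_row)
        return encoded


rows = [
    {"account_id": "pa_1", "promotion_type": "display"},
    {"account_id": "pa_1", "promotion_type": "flyer"},
    {"account_id": "pa_2", "promotion_type": "flyer"},
]
encoder = GroupOrdinalEncoder("account_id", ["pa_1", "pa_2"], "promotion_type").fit(rows)
codes = [row["promotion_type"] for row in encoder.transform(rows)]
print(codes)
```

The category "flyer" receives a different code in each group, because every group owns its own fitted mapping.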

MultiTransformer can support every sklearn-based transformer; the only thing that needs to be done is to extend the POTENTIAL_TRANSFORMER dictionary. Feel free to raise a PR.

Model

Model is a wrapper for ML models that makes a model more portable and easier to use inside the Visualfabriq environment.

import numpy as np
import pandas as pd
from sklearn.dummy import DummyRegressor

from vf_portalytics.model import PredictionModel

# Create the wrapper: the model name and the directory in which it is saved
model_name = 'test_model'
prediction_model = PredictionModel(model_name, '.')

# Small training set with two promotion rows
train_df = pd.DataFrame({
    'baseline_units': [800, 700],
    'promotion_technical_id': ['promotion_id_1', 'promotion_id_1'],
    'promotion_type': [1, 2],
    'promotion_ext_id': [1, 1],
    'account_id': ['pa_1', 'pa_1'],
    'pid': ['pid_1', 'pid_2']
})
# DummyRegressor ignores the input features and predicts the mean of the targets
dummy_regression = DummyRegressor(strategy="mean")
dummy_regression.fit(train_df, np.array([1800, 1700]))

# Declare the features stored with the model (each mapped to an empty list here)
prediction_model.features = {
    'baseline_units': [],
    'total_baseline_units': [],
    'total_nr_products': [],
    'base_price': [],
    'discount_perc': [],
    'discount_amt': [],
    'account_id': [],
}
# Attach the fitted estimator to the wrapper
prediction_model.model = dummy_regression

# Persist the wrapped model to disk
prediction_model.save()

The save function generates two files, test_model.pkl and test_model.meta, in the current directory. These are the files needed to load the model and make predictions inside the Visualfabriq environment.

For more details, check the example of how to use the model wrapper.
