A machine learning package

In Greek mythology, Pygmalion is a sculptor who fell in love with one of his creations. In the myth, Aphrodite gives life to Galatea, the sculpture he fell in love with. This package is a Python machine learning library that implements models for some common machine learning tasks: everything you need to give inanimate objects a mind of their own.

Installing pygmalion

pygmalion can be installed through pip.

python -m pip install pygmalion

Fast prototyping of models with pygmalion

Architectures for several common machine learning tasks (regression, image classification, machine translation ...) are implemented in this package.

The inputs and outputs of the models are common Python objects (such as NumPy arrays and pandas DataFrames).

In this section we are going to see how to load a dataset, train a model, display some metrics, and save a model.

>>> import pygmalion as ml
>>> import pygmalion.neural_networks as nn
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt

You can download a dataset and split it with the split function.

>>> ml.datasets.boston_housing("./")
>>> df = pd.read_csv("./boston_housing.csv")
>>> df_train, df_val, df_test = ml.utilities.split(df, weights=(0.8, 0.1, 0.1))
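To make the idea of a weighted split concrete, here is a minimal sketch of what such a function could do: shuffle the rows, then cut them in proportion to the weights. The helper name split_frame and its seed parameter are hypothetical and only serve the illustration; this is not pygmalion's implementation, which may behave differently.

```python
import numpy as np
import pandas as pd


def split_frame(df, weights=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the rows of df and cut them into len(weights) parts.

    Hypothetical helper for illustration; not pygmalion's implementation.
    """
    fractions = np.asarray(weights, dtype=float)
    fractions = fractions / fractions.sum()
    # Shuffle the row positions reproducibly.
    order = np.random.default_rng(seed).permutation(len(df))
    # Cumulative fractions give the cut points between consecutive parts.
    cuts = np.round(np.cumsum(fractions)[:-1] * len(df)).astype(int)
    return [df.iloc[idx] for idx in np.split(order, cuts)]


df = pd.DataFrame({"x": range(100), "y": range(100)})
df_train, df_val, df_test = split_frame(df)
print(len(df_train), len(df_val), len(df_test))  # 80 10 10
```

Each row ends up in exactly one of the three parts, so the validation and test sets stay disjoint from the training set.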

Creating and training a model takes only a few lines of code.

>>> inputs, target = [c for c in df.columns if c != "medv"], "medv"
>>> model = nn.DenseRegressor(inputs, target, hidden_layers=[32, 32])
>>> x_train, y_train = model.data_to_tensor(df_train[inputs], df_train[target])
>>> x_val, y_val = model.data_to_tensor(df_val[inputs], df_val[target])
>>> history = model.fit((x_train, y_train), (x_val, y_val), n_steps=1000, patience=100, learning_rate=1.0E-3)

Some useful metrics can easily be evaluated.

For a regressor model, the available metrics are MSE, RMSE and R2. The correlation between target and prediction can be visualized with the plot_fitting function.

>>> f, ax = plt.subplots()
>>> ml.utilities.plot_fitting(df_train[target], model.predict(df_train), ax=ax, label="training")
>>> ml.utilities.plot_fitting(df_val[target], model.predict(df_val), ax=ax, label="validation")
>>> ml.utilities.plot_fitting(df_test[target], model.predict(df_test), ax=ax, label="testing", color="C3")
>>> R2 = ml.utilities.R2(model.predict(df_test), df_test[target])
>>> ax.set_title(f"R²={R2:.3g}")
>>> ax.set_xlabel("target")
>>> ax.set_ylabel("predicted")
>>> plt.show()

[Figure: pairplot]
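For reference, the three regression metrics can be written in a few lines of NumPy. This is a sketch of the standard textbook definitions, with the (predicted, target) argument order borrowed from the R2 call above; the function names are illustrative and pygmalion's own implementations may differ in details.

```python
import numpy as np


def mse(predicted, target):
    """Mean squared error between predictions and targets."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((predicted - target) ** 2))


def rmse(predicted, target):
    """Root mean squared error, in the same unit as the target."""
    return mse(predicted, target) ** 0.5


def r2(predicted, target):
    """Coefficient of determination: 1 - MSE / variance of the target."""
    target = np.asarray(target, dtype=float)
    variance = float(np.mean((target - target.mean()) ** 2))
    return 1.0 - mse(predicted, target) / variance


target = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
print(mse(predicted, target), rmse(predicted, target))  # 1.0 1.0
print(round(r2(predicted, target), 3))  # 0.2
```

An R2 of 1 means a perfect fit, while 0 means the model does no better than always predicting the mean of the target.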

For a classifier model you can evaluate the accuracy, and display the confusion matrix.

>>> ml.datasets.iris("./")
>>> df = pd.read_csv("./iris.csv")
>>> df_train, df_val, df_test = ml.utilities.split(df, weights=(0.7, 0.2, 0.1))
>>> inputs, target = [c for c in df.columns if c != "variety"], "variety"
>>> classes = df[target].unique()
>>> model = nn.DenseClassifier(inputs, target, classes, hidden_layers=[8, 8, 8])
>>> train_data = model.data_to_tensor(df_train[inputs], df_train[target])
>>> val_data = model.data_to_tensor(df_val[inputs], df_val[target])
>>> model.fit(train_data, val_data, n_steps=1000, patience=100)
>>> f, ax = plt.subplots()
>>> y_test, y_pred = df_test[target], model.predict(df_test)
>>> ml.utilities.plot_matrix(ml.utilities.confusion_matrix(y_test, y_pred, classes=classes), ax=ax, cmap="Greens", write_values=True, format=".2%")
>>> acc = ml.utilities.accuracy(y_pred, y_test)
>>> ax.set_title(f"Accuracy: {acc:.2%}")
>>> plt.tight_layout()
>>> plt.show()

[Figure: confusion matrix]
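As an illustration of what these two classification metrics measure, the sketch below computes them with plain NumPy. The function names, and the convention that rows are target classes and columns are predicted classes, are assumptions made for this example and not taken from pygmalion.

```python
import numpy as np


def accuracy(predicted, target):
    """Fraction of observations whose predicted class equals the target."""
    return float(np.mean(np.asarray(predicted) == np.asarray(target)))


def confusion_counts(target, predicted, classes):
    """Count matrix: rows are target classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    counts = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(target, predicted):
        counts[index[t], index[p]] += 1
    return counts


classes = ["a", "b", "c"]
y_test = ["a", "a", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "c"]
counts = confusion_counts(y_test, y_pred, classes)
print(accuracy(y_pred, y_test))  # 0.8
print(counts.tolist())  # [[1, 1, 0], [0, 2, 0], [0, 0, 1]]
```

Dividing the counts by their total gives fractions that can be displayed as percentages, and the sum of the diagonal fractions is the accuracy.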

All the models can be saved directly to the disk with the save method. A model saved on the disk can then be loaded back with the load_model function.

>>> model.save("./model.pth")
>>> model = ml.utilities.load_model("./model.pth")

Implemented models

For examples of model training, see the samples folder on the GitHub page.

Neural networks

The neural networks are implemented with PyTorch under the hood: each model is a PyTorch Module. The fit method of a neural network returns the training loss history, the validation loss history, the gradient scale history and the best step, which can be plotted with the plot_losses function.

>>> train_losses, val_losses, grad, best_step = model.fit(...)
>>> ml.utilities.plot_losses(train_losses, val_losses, grad, best_step)

[Figure: loss history]
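The relation between the validation losses, the patience argument and the best step can be sketched in pure Python. This assumes that patience is the number of steps without improvement of the validation loss after which training stops, which is the usual early-stopping meaning; the function best_step_with_patience is hypothetical and does not come from pygmalion.

```python
def best_step_with_patience(val_losses, patience):
    """Return (best_step, last_step) for a sequence of validation losses.

    Assumed early-stopping rule: stop once the validation loss has not
    improved for `patience` consecutive steps.
    """
    best_step, best_loss = None, float("inf")
    for step, loss in enumerate(val_losses):
        if best_step is None or loss < best_loss:
            # New best validation loss: remember where it happened.
            best_step, best_loss = step, loss
        elif step - best_step >= patience:
            # No improvement for `patience` steps: stop here.
            return best_step, step
    return best_step, len(val_losses) - 1


val_losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
print(best_step_with_patience(val_losses, patience=3))  # (2, 5)
```

Under this rule the model state worth keeping is the one from the best step, not the one from the last step.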

DenseRegressor

A DenseRegressor (or multilayer perceptron regressor) predicts a scalar value given an input of several variables. An example of DenseRegressor training was shown in a previous section.

DenseClassifier

A DenseClassifier (or multilayer perceptron classifier) predicts a str class value given an input of several variables. An example of DenseClassifier training was shown in a previous section.

ProbabilityDistribution

A ProbabilityDistribution is a multilayer perceptron used to learn the CDF (Cumulative Distribution Function) of tabular data in an unsupervised fashion. Contrary to Gaussian mixture models, its PDF (Probability Density Function) is not constrained to be positive, which makes it a degenerate distribution function. This model is useful for anomaly detection, or for learning the training domain.

[Figure: distribution predictions]
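The remark about the sign of the PDF can be illustrated numerically in one dimension: the PDF is the derivative of the CDF, so it is positive only where the CDF increases. The sketch below compares a valid CDF with a hypothetical unconstrained approximation of it; the rippled curve is invented for the illustration and is not the output of a pygmalion model.

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 601)

# A valid CDF: the logistic sigmoid is non-decreasing from 0 to 1.
valid_cdf = 1.0 / (1.0 + np.exp(-x))
# Hypothetical unconstrained approximation: same curve plus a small ripple.
learned_cdf = valid_cdf + 0.05 * np.sin(6.0 * x)

# The PDF is the derivative of the CDF (finite differences here).
valid_pdf = np.gradient(valid_cdf, x)
learned_pdf = np.gradient(learned_cdf, x)

print(bool(valid_pdf.min() >= 0.0))    # True
print(bool(learned_pdf.min() < 0.0))   # True
```

Wherever the approximated CDF locally decreases, the corresponding density is negative, which is what a model without a monotonicity constraint can produce.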

ImageClassifier

An ImageClassifier predicts a str class given an image as input. Below are the predictions of a model trained on the Fashion-MNIST dataset.

[Figure: fashion-MNIST predictions]

It is implemented as a Convolutional Neural Network similar to ResNet.

ImageSegmenter

An ImageSegmenter predicts a class for each pixel of the input image (semantic segmentation). Below are the predictions of a model trained on the Cityscapes dataset.

[Figure: segmented_cityscapes]

It is implemented as a Convolutional Neural Network similar to U-Net.

ImageObjectDetector

An ImageObjectDetector predicts the presence and box coordinates of objects in an image. This model is an implementation of the YOLO convolutional neural network. Below are the predictions of a model trained to detect circles and squares in images generated on the fly:

[Figure: object detection predictions]
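A common way to compare a predicted box with a target box is the intersection over union (IoU). The sketch below is a generic illustration: the (x1, y1, x2, y2) corner format is an assumption, and this page does not say which box format or matching criterion pygmalion uses.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    The corner format is an assumption made for this example.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # 1.0
print(iou((0, 0, 1, 1), (2, 2, 3, 3)))  # 0.0
```

Two 2x2 boxes shifted by one unit in each direction overlap on a 1x1 square, which gives an IoU of 1/7.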

TextClassifier

A TextClassifier classifies text inputs. It is implemented as a transformer encoder. Below are some predictions of the model on a sentiment analysis task where tweets were to be classified as positive, neutral or negative.

@JetBlue Thanks! Her flight leaves at 2 but she's arriving to the airport early. Wedding is in VT in Sept. Grateful you fly to BTV!! :)
>>> positive

@united how are conditions in BOS today? I'm in UA994. Everything appears to be in time but I wanted to check.
>>> neutral

@AmericanAir it's been almost 3 days and it's still frozen. Thanks doll 😘😑
>>> negative

TextTranslator

A TextTranslator model predicts an output string for an input string. It is implemented as an encoder/decoder transformer. Below are some predictions of a model trained to translate Arabic numerals to Roman numerals.

1411 >>> MCDXI
1132 >>> MCXXXII
1354 >>> MCCCLIV
1469 >>> MCDLXIX
1290 >>> MCCXC
1698 >>> MDCXCVIII
657 >>> DCLVII
132 >>> CXXXII
1662 >>> MDCLXII
1150 >>> MCL
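Predictions like the ones above can be checked against a deterministic conversion, which is also a convenient way to generate (input, target) string pairs on the fly. The function below is a standard conversion written for this illustration; how the training data was actually generated is not stated here.

```python
# Values and symbols in decreasing order, including the subtractive forms.
ROMAN_SYMBOLS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number):
    """Convert an integer between 1 and 3999 to a Roman numeral string."""
    if not 0 < number < 4000:
        raise ValueError("number must be between 1 and 3999")
    parts = []
    for value, symbol in ROMAN_SYMBOLS:
        # Take as many copies of the current symbol as fit in the remainder.
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


pairs = [(str(n), to_roman(n)) for n in (1411, 657, 1150)]
print(pairs)  # [('1411', 'MCDXI'), ('657', 'DCLVII'), ('1150', 'MCL')]
```

All ten predictions listed above agree with this reference conversion.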

Download files

Source Distribution

pygmalion-0.1.8.tar.gz (72.5 kB view details)

Uploaded Source

Built Distribution

pygmalion-0.1.8-py3-none-any.whl (106.9 kB view details)

Uploaded Python 3

File details

Details for the file pygmalion-0.1.8.tar.gz.

File metadata

  • Download URL: pygmalion-0.1.8.tar.gz
  • Upload date:
  • Size: 72.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.12

File hashes

Hashes for pygmalion-0.1.8.tar.gz
Algorithm Hash digest
SHA256 45c53cf88d8c2bbe845a59d356599d6f2a0aeb575936612406edd46cd987a54b
MD5 11c158e681fb1d9990bb82bac323addc
BLAKE2b-256 bda0cf9a3ddb2670530daf52ed946b7659b555547d4f1f5921d7a0944454c98e

File details

Details for the file pygmalion-0.1.8-py3-none-any.whl.

File metadata

  • Download URL: pygmalion-0.1.8-py3-none-any.whl
  • Upload date:
  • Size: 106.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.12

File hashes

Hashes for pygmalion-0.1.8-py3-none-any.whl
Algorithm Hash digest
SHA256 a22d4b66c851fafe76f16d09176be1b59a5d67f593dc721e81aaf066af55bd09
MD5 fdeb53fc8fd26e8b42df5e6ab7622c32
BLAKE2b-256 1bdd6352d46a473e6b215e0c9cab9a9bd331ef0dd2b36361f8a58ecd3b30afd2
