Streamlined Recommender System workflows with TensorFlow and Kubeflow

Rexify is a library to streamline recommender systems model development.

In essence, Rexify adapts dynamically to your data, and outputs high-performing TensorFlow models that may be used wherever you want, independently of your data. Rexify also includes modules to deal with feature engineering as Scikit-Learn Transformers and Pipelines.

With Rexify, users may easily train Recommender Systems models, just by specifying what their data looks like. Rexify also comes equipped with pre-built machine learning pipelines which can be used serverlessly.

What is Rexify?

Rexify is a low-code personalization tool that uses traditional machine learning frameworks, such as Scikit-Learn and TensorFlow, to create scalable Recommender Systems workflows that anyone can use.

Who is it for?

Rexify is a project that simplifies and standardizes the workflow of recommender systems. It is geared mostly towards people with little to no machine learning knowledge who want to implement somewhat scalable Recommender Systems in their applications.

Installation

The easiest way to install Rexify is via pip:

pip install rexify

Quick Tour

Rexify is meant to be usable right out of the box. All you need to set up your model is interaction data, which looks something like this:

user_id  item_id  timestamp   event_type
22       67       2021/05/13  Purchase
37       9        2021/04/11  Page View
22       473      2021/04/11  Add to Cart
...      ...      ...         ...
358      51       2021/04/11  Purchase
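
As a quick sanity check before training, rows like the ones above can be parsed with the standard library. This is a minimal sketch, not part of Rexify: `parse_events` is a hypothetical helper, the integer IDs follow the sample rows, and the `YYYY/MM/DD` date format is an assumption inferred from them.

```python
import csv
import io
from datetime import datetime

# Sample rows copied from the table above, written as CSV text.
SAMPLE = """user_id,item_id,timestamp,event_type
22,67,2021/05/13,Purchase
37,9,2021/04/11,Page View
22,473,2021/04/11,Add to Cart
"""

REQUIRED_COLUMNS = {"user_id", "item_id", "timestamp", "event_type"}


def parse_events(text):
    """Parse CSV text into a list of typed event dictionaries (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return [
        {
            "user_id": int(row["user_id"]),
            "item_id": int(row["item_id"]),
            # Assumed date format, inferred from the sample rows.
            "timestamp": datetime.strptime(row["timestamp"], "%Y/%m/%d"),
            "event_type": row["event_type"],
        }
        for row in reader
    ]


events = parse_events(SAMPLE)
```

Failing early on a missing column or a malformed date is usually cheaper than discovering the problem partway through preprocessing.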

Additionally, you need to configure a schema for the data. This schema is what allows Rexify to generate a dynamic model and preprocessing steps. The schema should comprise two dictionaries (user, item) and two key-value pairs: event_type (which should point to the event type column) and timestamp (which should point to the timestamp column).

Each of these dictionaries should map feature names to internal data types, such as id, category, or number. More data types will be available in the future.

{
  "user": {
    "user_id": "id",
    "age": "number"
  },
  "item": {
    "item_id": "id",
    "category": "category"
  },
  "timestamp": "timestamp",
  "event_type": "event_type"
}
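
The structure described above can be checked before handing the schema to Rexify. The sketch below uses only the standard library. `validate_schema` is a hypothetical helper, not a Rexify function; the allowed types are limited to the three named in the text, and the requirement that each dictionary contains an id feature is an assumption.

```python
import json

# Feature types named in the text; the library may support more.
ALLOWED_TYPES = {"id", "category", "number"}

SCHEMA_TEXT = """
{
  "user": {"user_id": "id", "age": "number"},
  "item": {"item_id": "id", "category": "category"},
  "timestamp": "timestamp",
  "event_type": "event_type"
}
"""


def validate_schema(schema):
    """Return a list of problems found in the schema (empty if none)."""
    problems = []
    for target in ("user", "item"):
        features = schema.get(target)
        if not isinstance(features, dict):
            problems.append(f"'{target}' must be a dictionary of features")
            continue
        # Assumption: every user/item dictionary needs one "id" feature.
        if "id" not in features.values():
            problems.append(f"'{target}' has no feature of type 'id'")
        for name, dtype in features.items():
            if dtype not in ALLOWED_TYPES:
                problems.append(f"'{target}.{name}' has unknown type '{dtype}'")
    for key in ("timestamp", "event_type"):
        if not isinstance(schema.get(key), str):
            problems.append(f"'{key}' must name a column")
    return problems


schema = json.loads(SCHEMA_TEXT)
problems = validate_schema(schema)
```

Because `json.loads` rejects malformed JSON, such as a missing comma between keys, parsing the file this way also catches syntax mistakes before they reach the library.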

Essentially, Rexify takes the schema and dynamically adapts to the data.

There are two main components in Rexify workflows: FeatureExtractor and Recommender.

The FeatureExtractor is a scikit-learn Transformer that takes the schema of the data and transforms the event data accordingly. An additional method, .make_dataset(), converts the transformed data into a tf.data.Dataset, correctly configured to be fed to the Recommender model.
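
To illustrate the fit/transform contract that scikit-learn Transformers follow, here is a toy, dependency-free sketch. It is not Rexify's FeatureExtractor: `IdEncoder` is a hypothetical class that only maps the columns marked as id in the schema to contiguous integer indices, which is one common preprocessing step for embedding-based models.

```python
class IdEncoder:
    """Toy transformer following the scikit-learn fit/transform convention."""

    def __init__(self, schema):
        # Keep only the columns the schema marks as "id".
        self.id_columns = [
            name
            for target in ("user", "item")
            for name, dtype in schema[target].items()
            if dtype == "id"
        ]
        self.vocab_ = {}

    def fit(self, rows):
        # Learn one vocabulary (raw value -> integer index) per id column.
        for column in self.id_columns:
            values = sorted({row[column] for row in rows})
            self.vocab_[column] = {value: i for i, value in enumerate(values)}
        return self  # returning self allows fit(...).transform(...)

    def transform(self, rows):
        # Replace raw ids with learned indices; other columns pass through.
        # Ids not seen during fit raise KeyError in this toy version.
        return [
            {
                key: self.vocab_[key][value] if key in self.vocab_ else value
                for key, value in row.items()
            }
            for row in rows
        ]


schema = {"user": {"user_id": "id"}, "item": {"item_id": "id"}}
rows = [
    {"user_id": 22, "item_id": 67},
    {"user_id": 37, "item_id": 9},
    {"user_id": 22, "item_id": 473},
]
encoded = IdEncoder(schema).fit(rows).transform(rows)
```

Separating `fit` (learn the vocabulary) from `transform` (apply it) is what lets the same preprocessing be reused on new data at inference time.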

Recommender is a tfrs.Model that implements the Query and Candidate towers. During training, the Query tower takes the user ID, user features, and context to learn an embedding; the Candidate tower does the same for the item ID and its features.
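
The two-tower idea can be sketched without TensorFlow. In the illustration below, each tower is replaced by a lookup table of random, untrained embeddings, and a user-item pair is scored with a dot product. Both choices are assumptions made for the sake of a small example: a real model learns the embeddings from IDs and features, and nothing here reflects Rexify's internals.

```python
import random

EMBEDDING_DIM = 4
random.seed(0)  # fixed seed so the sketch is reproducible


def random_embedding():
    """Stand-in for a learned embedding vector."""
    return [random.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]


# Stand-ins for the towers: one embedding per known user and per known item.
query_tower = {user_id: random_embedding() for user_id in (22, 37, 358)}
candidate_tower = {item_id: random_embedding() for item_id in (9, 51, 67, 473)}


def score(user_id, item_id):
    """Affinity of a user for an item: dot product of the two embeddings."""
    q = query_tower[user_id]
    c = candidate_tower[item_id]
    return sum(qi * ci for qi, ci in zip(q, c))


def recommend(user_id, k=2):
    """Return the k item ids with the highest score for this user."""
    ranked = sorted(
        candidate_tower,
        key=lambda item_id: score(user_id, item_id),
        reverse=True,
    )
    return ranked[:k]


top_items = recommend(22)
```

Because users and items are embedded independently, candidate embeddings can be computed once and reused for every query, which is the main reason this architecture suits retrieval over large catalogs.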

More information about how the FeatureExtractor and the Recommender work can be found here.

A sample Rexify workflow looks like this:

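import pandas as pd

from rexify import Schema, FeatureExtractor, Recommender

# Load the interaction events and the schema that describes them
events = pd.read_csv('path/to/events/data')
schema = Schema.load('path/to/schema')

# Fit the preprocessing steps and transform the events into model inputs
fe = FeatureExtractor(schema, users='path/to/users/data', items='path/to/items/data', return_dataset=True)
x = fe.fit(events).transform(events)

# Build the model from the parameters derived during preprocessing,
# then train it on the transformed data rather than the raw events
model = Recommender(**fe.model_params)
model.compile()
model.fit(x, batch_size=512)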

When training is complete, you'll have a trained tf.keras.Model ready to be used as you normally would.

Alternatively, you can also run:

python -m rexify.pipeline -p events=$EVENTS_PATH -p users=$USER_PATH -p items=$ITEMS_PATH -p schema=$SCHEMA_PATH

This generates a pipeline.json file that you can use on Kubeflow Pipelines (or Vertex AI Pipelines).

License

MIT

Download files

Source Distribution

rexify-0.1.21.tar.gz (21.3 kB)

Uploaded Source

Built Distribution

rexify-0.1.21-py3-none-any.whl (31.5 kB)

Uploaded Python 3
