An end-to-end machine learning pipeline to train an ML model and deploy it to a real-time inference endpoint
personalization
An end-to-end demo machine learning pipeline to provide an artifact for a real-time inference service
Requirements: we want to create a machine learning pipeline that satisfies the following properties:
1. Multiple Models Support: The code should support maintaining
a wide range of machine learning algorithms,
such as linear regression, decision trees, random forests,
and deep learning models, to meet diverse business requirements.
2. Configurability: The API should be highly configurable to
allow users to customize
the machine learning models to their specific use cases.
This may include hyperparameter tuning, feature selection, and feature engineering.
3. Flexibility: The API should be flexible enough to handle a wide range of data formats,
such as CSV, JSON, and Parquet. It should also support various
deployment environments, such as on-premises, cloud-based, and hybrid environments.
4. Scalability: The API should be designed with scalability in mind,
meaning it can handle large volumes of data, high request rates, and multiple concurrent users.
This may involve incorporating distributed computing
and parallel processing techniques to handle the workload.
5. Versioning: The pipeline should support versioning with MLflow.
6. Documentation: The API should be accompanied by comprehensive documentation,
including user manuals, API reference guides, and developer documentation.
This will make it easier for users to learn
how to use the API and integrate it into their applications.
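To make the flexibility requirement (item 3) more concrete, here is a minimal sketch of a loader that picks a reader based on the file extension. It is an illustration only, not the package's actual code: the names `LOADERS` and `load_records` are hypothetical, and it uses only the Python standard library, so Parquet is left out (it would need a third-party reader registered in the same table).

```python
import csv
import json
from pathlib import Path
from typing import Callable, Dict, List

Record = Dict[str, object]


def _load_csv(path: Path) -> List[Record]:
    # Each row becomes a dict keyed by the header row; values stay strings.
    with path.open(newline="", encoding="utf-8") as handle:
        return [dict(row) for row in csv.DictReader(handle)]


def _load_json(path: Path) -> List[Record]:
    # Assumption: the file holds a JSON array of objects.
    with path.open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, list):
        raise ValueError(f"{path} must contain a JSON array of records")
    return data


# Hypothetical registry mapping a file suffix to its loader function.
LOADERS: Dict[str, Callable[[Path], List[Record]]] = {
    ".csv": _load_csv,
    ".json": _load_json,
}


def load_records(path: str) -> List[Record]:
    """Load tabular records from a file, dispatching on its suffix."""
    file_path = Path(path)
    suffix = file_path.suffix.lower()
    try:
        loader = LOADERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported data format: {suffix!r}") from None
    return loader(file_path)
```

Supporting another format then means adding one entry to `LOADERS`, which keeps the calling code unchanged.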
How to run
- git clone git@github.com:ra312/personalization.git && cd personalization
- obtain sessions.csv and venues.csv and move them to the root folder
- install Poetry on Linux or macOS:
curl -sSL https://install.python-poetry.org | python3 - --version 1.3.2
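Before training, it can help to confirm that both input files are where the steps above expect them. The following is a small optional sketch; the helper name `missing_inputs` is hypothetical and not part of the package.

```python
from pathlib import Path
from typing import Iterable, List

# File names taken from the setup steps above.
REQUIRED_INPUTS = ("sessions.csv", "venues.csv")


def missing_inputs(root: str = ".", names: Iterable[str] = REQUIRED_INPUTS) -> List[str]:
    """Return the required input files that are not present in `root`."""
    base = Path(root)
    return [name for name in names if not (base / name).is_file()]
```

Calling `missing_inputs()` from the repository root returns an empty list when both files are in place.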
How to train the pipeline and get the artifact (copy this into bash)
python3 -m personalization \
--sessions-bucket-path sessions.csv \
--venues-bucket-path venues.csv \
--objective lambdarank \
--num_leaves 100 \
--min_sum_hessian_in_leaf 10 \
--metric ndcg --ndcg_eval_at 10 20 \
--learning_rate 0.8 \
--force_row_wise True \
--num_iterations 10 \
--trained-model-path trained_model.joblib
Read Latest Documentation - Browse GitHub Code Repository