Skip to main content

A Python package for registering machine learning models directly to the Snowflake Model Registry, leveraging Snowflake ML capabilities.

Project description

Fosforml

Overview

The fosforml package is designed to facilitate the registration, management, and deployment of machine learning models with a focus on integration with Snowflake. It provides tools for managing datasets, model metadata, and the lifecycle of models within a Snowflake environment.

Features

  • Model Registration: Register models to the Snowflake Model registry with detailed metadata, including descriptions, types, and dependencies.
  • Dataset Management: Handle datasets within Snowflake, including creation, versioning, and deletion of dataset objects.
  • Metadata Management: Update model registry with descriptions and tags for better organization and retrieval.
  • Snowflake Session Management: Manage Snowflake sessions for executing operations within the Snowflake environment.

Installation

To install the fosforml package, ensure you have Python installed on your system and run the following command:

pip install fosforml

Usage

Register a model with the Snowflake Model Registry using the register_model function. The function supports both Snowflake and Pandas dataframes, catering to different data handling preferences.

Requirements

  • Snowflake DataFrame: If you are using Snowflake as your data warehouse, you must provide a Snowflake DataFrame (snowflake.snowpark.dataframe.DataFrame) that includes model feature names, labels, and output column names.

  • Pandas DataFrame: For users preferring local or in-memory data processing, you must upload the following as Pandas DataFrames (pandas.DataFrame):

    • x_train: Training data with feature columns.
    • y_train: Training data labels.
    • x_test: Test data with feature columns.
    • y_test: Test data labels.
    • y_pred: Predicted labels for the test data.
    • y_prob: Predicted probabilities for the test data classes for classification problems.
  • Numpy data arrays are not allowed as input datasets to register the model

  • dataset_name: Name fo dataset on which model trained.

  • dataset_source: Name fo source from where dataset is pulled/created.

  • source: Model environment name where model getting developed Ex: Notebook/Experiment.

Supported Model Flavours

Currently, the framework supports the following model flavours:

  • Snowflake Models (snowflake): Models that are directly integrated with Snowflake, leveraging Snowflake's data processing capabilities.
  • Scikit-Learn Models (sklearn): Models built using the Scikit-Learn library, a widely used library for machine learning in Python.

Registering a Model

To register a model with the fosforml package, you need to provide the model object, session, and other relevant details such as the model name, description, and type.

For Snowflake Models :

from fosforml import register_model

register_model(
  model_obj=pipeline,
  session=my_session,
  name="MyModel",
  snowflake_df=pred_df,
  dataset_name="HR_CHURN",
  dataset_source="Dataset",
  source="Notebook",
  description="This is a Snowflake model",
  flavour="snowflake",
  model_type="classification",
  conda_dependencies=["scikit-learn==1.3.2"]
)

For Scikit-Learn Models :

from fosforml import register_model

register_model(
  model_obj=model,
  session=session,
  x_train=x_train,
  y_train=y_train,
  x_test=x_test,
  y_test=y_test,
  y_pred=y_pred,
  y_prob=y_prob,
  source="Notebook",
  dataset_name="HR_CHURN",
  dataset_source="InMemory",
  name="MyModel",
  description="This is a sklearn model",
  flavour="sklearn",
  model_type="classification",
  conda_dependencies=["scikit-learn==1.3.2"]
)

Snowflake Session Management

The SnowflakeSession class is used to manage connections to Snowflake, facilitating the execution of operations within the Snowflake environment.

from fosforml.model_manager.snowflakesession import get_session

my_session = get_session()

Managing Datasets

The DatasetManager class allows for the creation, uploading, and removal of datasets associated with models.

from fosforml.model_manager import DatasetManager

dataset_manager = DatasetManager(model_name="MyModel", version_name="v1", session=my_session)
dataset_manager.upload_datasets(session=my_session, datasets={"x_train": x_train_df, "y_train": y_train_df})

Dependencies

  • pandas
  • snowflake-ml-python
  • requests

Ensure these dependencies are installed in your environment to use the fosforml package effectively.

For issues and contributions, please refer to the project's GitHub repository.

Additional Resources

For further assistance and examples on how to register models using fosforml, please refer to the example folder in the project repository. This folder contains Jupyter notebooks that provide step-by-step guidance on model registration and other operations.

Visit www.fosfor.com for more information.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fosforml-1.1.5.tar.gz (34.5 kB view details)

Uploaded Source

Built Distribution

fosforml-1.1.5-py3-none-any.whl (38.4 kB view details)

Uploaded Python 3

File details

Details for the file fosforml-1.1.5.tar.gz.

File metadata

  • Download URL: fosforml-1.1.5.tar.gz
  • Upload date:
  • Size: 34.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.19

File hashes

Hashes for fosforml-1.1.5.tar.gz
Algorithm Hash digest
SHA256 373ed78bafdd9c913ca17f97ea33117dc040ed043cb5d4761412422be7d2dedb
MD5 a8ef3dd58c20d1122ec338b333e5965d
BLAKE2b-256 a1fbb3254e3539e8db8e7584e2cd0a9c8b8420ef1ff1c39f9a0f0d210e1b7da4

See more details on using hashes here.

Provenance

File details

Details for the file fosforml-1.1.5-py3-none-any.whl.

File metadata

  • Download URL: fosforml-1.1.5-py3-none-any.whl
  • Upload date:
  • Size: 38.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.19

File hashes

Hashes for fosforml-1.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 3efe215118a05ba60469b14a40a6ee0b619c34c5d060020f963839c66d6b8e0a
MD5 a8deeb46122c78271f4c6052bc6cafda
BLAKE2b-256 fed2d08131cf9db64e0fc21277a3f49e6874227ceba2c4cee69a3a1f437f6304

See more details on using hashes here.

Provenance

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page