The machine learning client library for interacting with Snowflake to build machine learning solutions.

Project description

Snowpark ML

Snowpark ML is a set of tools including SDKs and underlying infrastructure to build and deploy machine learning models. With Snowpark ML, you can pre-process data, train, manage and deploy ML models all within Snowflake, using a single SDK, and benefit from Snowflake’s proven performance, scalability, stability and governance at every stage of the Machine Learning workflow.

Key Components of Snowpark ML

The Snowpark ML Python SDK provides a number of APIs to support each stage of an end-to-end Machine Learning development and deployment process, and includes two key components.

Snowpark ML Development [Public Preview]

A collection of Python APIs to enable efficient model development directly in Snowflake:

Modeling API (snowflake.ml.modeling) for data preprocessing, feature engineering and model training in Snowflake. This includes snowflake.ml.modeling.preprocessing for scalable data transformations on large data sets utilizing the compute resources of underlying Snowpark Optimized High Memory Warehouses, and a large collection of ML model development classes based on sklearn, xgboost, and lightgbm. See the private preview limited access docs (Preprocessing, Modeling) for more details on these.
Framework Connectors: Optimized, secure and performant data provisioning for PyTorch and TensorFlow frameworks in their native data loader formats.

Snowpark ML Ops [Private Preview]

Snowpark MLOps complements the Snowpark ML Development API, and provides model management capabilities along with integrated deployment into Snowflake. Currently, the API consists of:

FileSet API: FileSet provides a Python fsspec-compliant API for materializing data into a Snowflake internal stage from a query or Snowpark DataFrame, along with a number of convenience APIs.
Model Registry: A Python API for managing models within Snowflake, which also supports deployment of ML models into Snowflake Warehouses as vectorized UDFs.

During Private Preview (PrPr), we are iterating on the API without backward compatibility guarantees. It is best to recreate your registry every time you update the package. This means that, at this time, the registry is not suitable for production use.

Getting started

Have your Snowflake account ready

If you don't have a Snowflake account yet, you can sign up for a 30-day free trial account.

Create a Python virtual environment

Python 3.8 is required. You can use miniconda, anaconda, or virtualenv to create a Python 3.8 virtual environment.

To have the best experience when using this library, creating a local conda environment with the Snowflake channel is recommended.
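Before installing, it can help to confirm that the active interpreter matches the stated requirement. The following is a minimal sketch using only the standard library; the helper name is_required_python is hypothetical and is not part of snowflake-ml-python.

```python
import sys

# Requirement as stated in this section for this release.
REQUIRED = (3, 8)


def is_required_python(version_info=sys.version_info):
    """Return True when the major.minor version equals the required one."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if is_required_python() else "not the required 3.8"
    print(f"Python {current}: {status}")
```

Running this inside the newly created environment shows whether the environment was built with the intended interpreter.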

Install the library to the Python virtual environment

pip install snowflake-ml-python
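
After installation, one way to confirm which version landed in the environment is to query the package metadata. This is a small sketch that relies only on the standard library importlib.metadata module (available from Python 3.8); the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(dist_name="snowflake-ml-python"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version() or "snowflake-ml-python is not installed")
```

Because the registry API changes between preview releases, checking the installed version before opening an existing registry is a cheap safeguard.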

Release History

1.0.2 (2023-06-22)

Behavior Changes

Model Registry: Prohibit non-snowflake-native models from being logged.
Model Registry: _use_local_snowml parameter in options of deploy() has been removed.
Model Registry: An embed_local_ml_library parameter, defaulting to False, has been added to the options of log_model(). When it is False (the default), the version of the local snowflake-ml-python library is recorded and used when deploying the model. When it is True, the local snowflake-ml-python library is embedded into the logged model and is used when you load or deploy the model.

New Features

Model Registry: A new optional argument named code_paths has been added to the arguments of log_model() for users to specify additional code paths to be imported when loading and deploying the model.
Model Registry: A new optional argument named options has been added to the arguments of log_model() to specify any additional options when saving the model.
Model Development: Added metrics:
- d2_absolute_error_score
- d2_pinball_score
- explained_variance_score
- mean_absolute_error
- mean_absolute_percentage_error
- mean_squared_error
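
For readers unfamiliar with some of these regression metrics, the sketch below shows what four of them compute, written in plain Python over lists. It follows the standard textbook definitions (which scikit-learn also uses) and is not the library's Snowpark implementation; the library functions operate on DataFrames and their signatures are not shown here.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between truth and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_percentage_error(y_true, y_pred):
    """Average absolute error relative to the true value.

    Assumption in this sketch: no true value is zero.
    """
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def _variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def explained_variance_score(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - _variance(residuals) / _variance(y_true)


if __name__ == "__main__":
    y_true = [3.0, -0.5, 2.0, 7.0]
    y_pred = [2.5, 0.0, 2.0, 8.0]
    print(mean_absolute_error(y_true, y_pred))
    print(mean_squared_error(y_true, y_pred))
    print(explained_variance_score(y_true, y_pred))
```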

Bug Fixes

Model Development: accuracy_score() now works when given label column names are lists of a single value.

1.0.1 (2023-06-16)

Behavior Changes

Model Development: Changed Metrics APIs to imitate sklearn metrics modules:
- accuracy_score(), confusion_matrix(), precision_recall_fscore_support(), precision_score() methods move from respective modules to metrics.classification.
Model Registry: The default table/stage created by the Registry now uses "SYSTEM" as a prefix.
Model Registry: get_model_history() method has been enhanced to include the history of model deployment.

New Features

Model Registry: A flag named replace_udf, defaulting to False, has been added to the options of deploy(). Setting this to True allows overwriting an existing UDF with the same name when deploying.
Model Development: Added metrics:
- f1_score
- fbeta_score
- recall_score
- roc_auc_score
- roc_curve
- log_loss
- precision_recall_curve
Model Registry: A new argument named permanent has been added to the arguments of deploy(). Setting this to True allows the creation of a permanent deployment without needing to specify the UDF location.
Model Registry: A new method list_deployments() has been added to enumerate all permanent deployments originating from a specific model.
Model Registry: A new method get_deployment() has been added to fetch a deployment by its deployment name.
Model Registry: A new method delete_deployment() has been added to remove an existing permanent deployment.
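
Among the classification metrics added in this release, recall_score, f1_score and fbeta_score are closely related. The plain-Python sketch below shows the standard binary definitions so the relationship is easy to see. It is illustrative only: the helper precision_recall is hypothetical, and the library's own functions work on DataFrames rather than lists.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def fbeta_score(y_true, y_pred, beta=1.0, positive=1):
    """Weighted harmonic mean of precision and recall; beta=1 gives F1."""
    precision, recall = precision_recall(y_true, y_pred, positive)
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0]
    y_pred = [1, 0, 0, 1, 0, 0]
    print(precision_recall(y_true, y_pred))
    print(fbeta_score(y_true, y_pred, beta=1.0))
    print(fbeta_score(y_true, y_pred, beta=2.0))
```

A beta above 1 weights recall more heavily than precision; a beta below 1 does the opposite.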

1.0.0 (2023-06-09)

Behavior Changes

Model Registry: predict() method moves from Registry to ModelReference.
Model Registry: The _snowml_wheel_path parameter in the options of deploy() is replaced with _use_local_snowml, which defaults to False. Setting it to True has the same effect: the local SnowML code is uploaded when the model is executed in the warehouse.
Model Registry: Removed id field from ModelReference constructor.
Model Development: Preprocessing and Metrics move to the modeling package: snowflake.ml.modeling.preprocessing and snowflake.ml.modeling.metrics.
Model Development: get_sklearn_object() method is renamed to to_sklearn(), to_xgboost(), and to_lightgbm() for respective native models.

New Features

Added PolynomialFeatures transformer to the snowflake.ml.modeling.preprocessing module.
Added metrics:
- accuracy_score
- confusion_matrix
- precision_recall_fscore_support
- precision_score

Bug Fixes

Model Registry: Model version can now be any string (not required to be a valid identifier)
Model Deployment: deploy() & predict() methods now correctly escape identifiers
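
To illustrate what escaping an identifier means in Snowflake SQL, the sketch below applies the general quoting rule: wrap the name in double quotes and double any embedded double quote. The function quote_identifier is hypothetical and is not the helper the library uses internally.

```python
def quote_identifier(name: str) -> str:
    """Wrap a name in double quotes, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


if __name__ == "__main__":
    # Names with spaces or quotes are only valid as quoted identifiers.
    print(quote_identifier("my model v1"))
    print(quote_identifier('odd"name'))
```

Quoted identifiers are case-sensitive in Snowflake, so a quoted name must later be referenced with exactly the same spelling and case.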

0.3.2 (2023-05-23)

Behavior Changes

Use cloudpickle to serialize and deserialize models throughout the codebase and removed dependency on joblib.
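
The release note does not state why cloudpickle was chosen. One general difference worth knowing is that the standard pickle module serializes functions by reference to an importable name, so objects such as lambdas cannot be pickled, which is a case cloudpickle is designed to handle. The stdlib-only sketch below demonstrates that pickle limitation; it does not use cloudpickle, and try_pickle is a hypothetical helper.

```python
import pickle


def try_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print(try_pickle({"weights": [0.1, 0.2]}))  # plain data
    print(try_pickle(len))                      # importable builtin
    print(try_pickle(lambda x: x * 2))          # anonymous function
```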

New Features

Model Deployment: Added support for snowflake.ml models.

0.3.1 (2023-05-18)

Behavior Changes

Standardized the registry API as follows:
- Create and open registry take the same set of arguments.
- Create and open can choose which schema to use.
- set_tag, set_metric, etc. now explicitly name their arguments (tag_name, metric_name, etc.).

New Features

Changes to support Python 3.9 and 3.10
Added KBinsDiscretizer
Support for deployment of XGBoost models and int8 data types

0.3.0 (2023-05-11)

Behavior Changes

Big Model Registry Refresh
- Fixed API discrepancies between register_model & log_model.
- Model can be referred by Name + Version (no opaque internal id is required)

New Features

Model Registry: Added support for saving, loading, and deploying scikit-learn (SKL) and XGBoost (XGB) models

0.2.3 (2023-04-27)

Bug Fixes

Allow using OneHotEncoder along with sklearn style estimators in a pipeline.

New Features

Model Registry: Added support for delete_model. Pass delete_artifact = False to unregister the model without deleting the underlying model data.

0.2.2 (2023-04-11)

New Features

Initial version of snowflake-ml modeling package.
- Provide support for training most of scikit-learn and xgboost estimators and transformers.

Bug Fixes

Minor fixes in preprocessing package.

0.2.1 (2023-03-23)

New Features

New in Preprocessing:
- SimpleImputer
- Covariance Matrix
Optimization of Ordinal Encoder client computations.

Bug Fixes

Minor fixes in OneHotEncoder.

0.2.0 (2023-02-27)

New Features

Model Registry
PyTorch & TensorFlow connector file generic FileSet API
New to Preprocessing:
- Binarizer
- Normalizer
- Pearson correlation Matrix
Optimization in Ordinal Encoder to cache vocabulary in temp tables.
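
For reference, the three preprocessing additions listed above compute the following, shown here as a plain-Python sketch over lists. It follows the usual definitions (a strictly-greater-than threshold for binarizing, L2 norm for normalizing) and is not the library's implementation, which runs inside Snowflake; all function names are hypothetical.

```python
import math


def binarize(values, threshold=0.0):
    """Map values strictly above the threshold to 1.0, others to 0.0."""
    return [1.0 if v > threshold else 0.0 for v in values]


def l2_normalize(row):
    """Scale a row so that its Euclidean (L2) norm is 1."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row] if norm else list(row)


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    print(binarize([-1.0, 0.0, 2.5]))
    print(l2_normalize([3.0, 4.0]))
    print(pearson_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```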

0.1.3 (2023-02-02)

New Features

Initial version of transformers including:
- Label Encoder
- Max Abs Scaler
- Min Max Scaler
- One Hot Encoder
- Ordinal Encoder
- Robust Scaler
- Standard Scaler
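
As a quick reference for three of the scalers in this list, the sketch below shows the arithmetic each one applies to a single column, in plain Python. It uses the conventional definitions (population standard deviation for standard scaling) and is not the library's implementation, which pushes the computation into Snowflake; the function names are hypothetical.

```python
def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    low, high = min(values), max(values)
    span = high - low
    return [(v - low) / span if span else 0.0 for v in values]


def standard_scale(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std if std else 0.0 for v in values]


def max_abs_scale(values):
    """Divide by the largest absolute value so results lie in [-1, 1]."""
    largest = max(abs(v) for v in values)
    return [v / largest if largest else 0.0 for v in values]


if __name__ == "__main__":
    column = [1.0, 2.0, 3.0]
    print(min_max_scale(column))
    print(standard_scale(column))
    print(max_abs_scale([-4.0, 2.0, 1.0]))
```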

Release history

1.5.0

May 1, 2024

1.4.1 yanked

Apr 22, 2024

1.4.0

Apr 8, 2024

1.3.1

Mar 21, 2024

1.3.0

Mar 12, 2024

1.2.3

Feb 26, 2024

1.2.2

Feb 13, 2024

1.2.1

Jan 26, 2024

1.2.0

Jan 11, 2024

1.1.2

Dec 18, 2023

1.1.1

Dec 6, 2023

1.1.0

Dec 1, 2023

1.0.12

Nov 13, 2023

1.0.11

Oct 27, 2023

1.0.10

Oct 14, 2023

1.0.9 yanked

Sep 29, 2023

Reason this release was yanked:

Yank 1.0.9

1.0.8

Sep 15, 2023

1.0.7

Sep 5, 2023

1.0.6

Sep 1, 2023

1.0.5

Aug 18, 2023

1.0.4

Jul 28, 2023

1.0.3

Jul 14, 2023

This version

1.0.2

Jun 23, 2023

1.0.1

Jun 16, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.

Built Distribution

snowflake_ml_python-1.0.2-py3-none-any.whl (1.9 MB)

Uploaded Jun 23, 2023 Python 3

Hashes for snowflake_ml_python-1.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bcdc6de32a592c1ae6af956c326b64522dd57ca6e948a0d4c8838c7233c73de2`
MD5	`240405c90887e86c6fb7883224cbb302`
BLAKE2b-256	`59d0e242d23e4dbdd3140d5a2692b5cdf9853095245d2ad48b78b876d97912c5`
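
A downloaded wheel can be checked against the SHA256 digest listed above. The following is a minimal sketch using the standard library hashlib module; the helper names are hypothetical, and the file path in the usage section assumes the wheel was saved to the current directory.

```python
import hashlib

# SHA256 digest copied from the hash table above.
EXPECTED_SHA256 = "bcdc6de32a592c1ae6af956c326b64522dd57ca6e948a0d4c8838c7233c73de2"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True when the file's digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location of the downloaded wheel.
    wheel = "snowflake_ml_python-1.0.2-py3-none-any.whl"
    try:
        print("digest matches" if matches_expected(wheel) else "digest MISMATCH")
    except FileNotFoundError:
        print(f"{wheel} not found in the current directory")
```

pip can perform the same check at install time when the requirement is pinned with a --hash option in a requirements file.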
