hana-ml

Python Machine Learning Client for SAP HANA

Introduction

Welcome to the Python machine learning client for SAP HANA (hana-ml)!

This package enables Python data scientists to access SAP HANA data and build various machine learning models using the data directly in SAP HANA. This page provides an overview of hana-ml.

Overview

Python machine learning client for SAP HANA consists of two main parts:

SAP HANA DataFrame, which provides a set of methods for accessing and querying data in SAP HANA without bringing the data to the client.
A set of machine learning APIs for developing machine learning models.

Specifically, the machine learning APIs are composed of two packages:

PAL package

The PAL package consists of a set of Python algorithms and functions that provide access to the machine learning capabilities of the SAP HANA Predictive Analysis Library (PAL). SAP HANA PAL functions cover a variety of machine learning algorithms for training a model; the trained model is then used for scoring.

APL package

The Automated Predictive Library (APL) package exposes the data mining capabilities of the Automated Analytics engine in SAP HANA through a set of functions. These functions develop a predictive modeling process that analysts can use to answer simple questions on their customer datasets stored in SAP HANA.

In addition to the SAP HANA DataFrame methods and machine learning APIs, hana-ml also offers the following features:

Visualizers: a set of methods to visualize datasets and models, e.g. eda (plot functions such as distribution plot, pie plot and correlation plot), dataset_report (analyzes the dataset and generates a report in HTML format), model_debriefing (visualizes a tree model and explains the model output with Shapley values), unified_report (an integrated dataset and model report for UnifiedClassification() and UnifiedRegression()).
Model storage: offers a series of methods to save, list, load and delete models in SAP HANA. Models are saved into SAP HANA tables in a schema specified by the user.
Text Mining: provides a series of functions, such as tf_analysis and text classification, that operate on the given documents.
Spatial and Graph: introduces additional engines that can be used for analytics focused on geospatial and graph- or network-modeled data.

Please see the Python Machine Learning Client for SAP HANA Documentation for details on the methods.

Prerequisites

hana-ml uses the SAP HANA Python driver (hdbcli) to connect to SAP HANA. Please install it and refer to the following information:

SAP HANA Python driver: hdbcli 2.2.23 (shipped with SAP HANA SP03) or higher. Please see the SAP HANA Client Interface Programming Reference for SAP HANA Service.
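
One way to check the driver requirement is to compare version tuples. The sketch below is only an illustration: meets_minimum is a hypothetical helper, it assumes purely numeric dotted versions, and the lookup through importlib.metadata assumes the driver is installed under the distribution name hdbcli.

```python
from importlib import metadata


def meets_minimum(installed: str, minimum: str = "2.2.23") -> bool:
    """Compare dotted numeric versions component by component."""
    def as_tuple(version):
        return tuple(int(part) for part in version.split("."))
    return as_tuple(installed) >= as_tuple(minimum)


print(meets_minimum("2.2.23"))  # True
print(meets_minimum("2.10.5"))  # True: 10 > 2 when compared as numbers
print(meets_minimum("2.2.9"))   # False

try:
    # Assumption: the driver is installed as the "hdbcli" distribution.
    print("hdbcli OK:", meets_minimum(metadata.version("hdbcli")))
except (metadata.PackageNotFoundError, ValueError):
    print("hdbcli version could not be determined")
```

Comparing integer tuples avoids the pitfall of string comparison, where "2.10.5" would sort before "2.2.23".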

hana-ml uses SAP HANA PAL and SAP HANA APL for its machine learning APIs. Please refer to the following information:

SAP HANA PAL: security roles AFL__SYS_AFL_AFLPAL_EXECUTE and AFL__SYS_AFL_AFLPAL_EXECUTE_WITH_GRANT_OPTION. See SAP HANA Predictive Analysis Library for more information.
SAP HANA APL 1905 or higher: required only when using the APL package. Please see the SAP HANA Automated Predictive Library Reference Guide for more information.

Getting Started

Install from PyPI with pip (run in a shell, not at the Python prompt):

$ pip install hana-ml

Quick Start

First, create a connection to SAP HANA:

>>> from hana_ml import dataframe
>>> conn = dataframe.ConnectionContext(address="<hostname>",
                                       port=3<NN><MM>,
                                       user="<username>",
                                       password="<password>")

NN and MM in the port number are explained as follows:

For HANA tenant databases, use the port number 3NN13 (where NN is the SAP instance number, e.g. 30013 for instance 00).
For HANA system databases in a multitenant system, the port number is 3NN13.
For HANA single-tenant databases, the port number is 3NN15.
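
The port rules above can be captured in a small helper. This is an illustrative sketch only: hana_sql_port and the kind labels ("tenant", "system", "single") are hypothetical names, not part of hana-ml or hdbcli.

```python
def hana_sql_port(instance_number: int, kind: str = "system") -> int:
    """Build a port of the form 3NNMM from the rules listed above.

    `kind` is a hypothetical label: "tenant" and "system" map to MM=13,
    "single" (single-tenant database) maps to MM=15.
    """
    suffix_by_kind = {"tenant": 13, "system": 13, "single": 15}
    if kind not in suffix_by_kind:
        raise ValueError(f"unknown database kind: {kind!r}")
    if not 0 <= instance_number <= 99:
        raise ValueError("instance number must be between 00 and 99")
    # 3 | NN (two-digit instance number) | MM (two-digit suffix)
    return 30000 + instance_number * 100 + suffix_by_kind[kind]


print(hana_sql_port(0, "tenant"))   # 30013
print(hana_sql_port(0, "single"))   # 30015
print(hana_sql_port(90, "system"))  # 39013
```

The returned integer can be passed as the port argument of ConnectionContext.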

Return a DataFrame that references an SAP HANA table:

>>> df = conn.table('MY_TABLE', schema='MY_SCHEMA').filter('COL3>5').select('COL1', 'COL2')

Return a DataFrame from a SELECT statement:

>>> df = dataframe.DataFrame(conn, 'select * from MY_SCHEMA.MY_TABLE')

Convert an SAP HANA DataFrame to a pandas DataFrame:

>>> pandas_df = df.collect()
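
An SAP HANA DataFrame is backed by a SQL statement, so chained calls such as filter() and select() only refine that statement; rows reach the client when collect() runs. The sketch below imitates that idea with Python's built-in sqlite3 module standing in for SAP HANA. LazyFrame is a hypothetical class written for this illustration and is not hana-ml's implementation.

```python
import sqlite3


class LazyFrame:
    """Hypothetical stand-in for a database-backed DataFrame."""

    def __init__(self, conn, select_statement):
        self.conn = conn
        self.select_statement = select_statement

    def filter(self, condition):
        # Wrap the current statement in a subquery; nothing is executed yet.
        sql = f"SELECT * FROM ({self.select_statement}) AS t WHERE {condition}"
        return LazyFrame(self.conn, sql)

    def select(self, *columns):
        cols = ", ".join(columns)
        sql = f"SELECT {cols} FROM ({self.select_statement}) AS t"
        return LazyFrame(self.conn, sql)

    def collect(self):
        # Only here does the statement run and data move to the client.
        return self.conn.execute(self.select_statement).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MY_TABLE (COL1 INTEGER, COL2 TEXT, COL3 INTEGER)")
conn.executemany(
    "INSERT INTO MY_TABLE VALUES (?, ?, ?)",
    [(1, "a", 3), (2, "b", 7), (3, "c", 9)],
)

df = LazyFrame(conn, "SELECT * FROM MY_TABLE").filter("COL3 > 5").select("COL1", "COL2")
print(df.select_statement)  # the composed SQL, still unexecuted
rows = df.collect()
print(rows)
```

Because each call returns a new frame holding a longer statement, the filtering and projection happen in the database and only the final rows are transferred.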

Convert a pandas DataFrame to an SAP HANA DataFrame:

>>> df = dataframe.create_dataframe_from_pandas(conn, pandas_df, 'MY_TABLE', force=True)

Example: Build a UnifiedClassification model and display the dataset and the model with the UnifiedReport function.

Step 1: Import related modules:

>>> from hana_ml import dataframe
>>> from hana_ml.algorithms.pal.unified_classification import UnifiedClassification
>>> from hana_ml.visualizers.unified_report import UnifiedReport

Step 2: Create a ConnectionContext object:

>>> conn = dataframe.ConnectionContext('<address>', <port>, '<user>', '<password>')

Step 3: Create an SAP HANA DataFrame df_fit that points to the table "DATA_TBL_FIT":

>>> df_fit = conn.table("DATA_TBL_FIT")

Step 4: Inspect df_fit:

>>> df_fit.head(6).collect()
  ID   OUTLOOK  TEMP  HUMIDITY WINDY        CLASS
0  1     Sunny    75      70.0   Yes         Play
1  2     Sunny    80      90.0   Yes  Do not Play
2  3     Sunny    85      85.0    No  Do not Play
3  4     Sunny    72      95.0    No  Do not Play
4  5     Sunny    69      70.0    No         Play
5  6  Overcast    72      90.0   Yes         Play

Step 5: Invoke UnifiedReport function to display the dataset:

>>> UnifiedReport(df_fit).build().display()

Step 6: Create a 'UnifiedClassification' instance and specify the parameters:

>>> rdt_params = dict(random_state=2,
                      split_threshold=1e-7,
                      min_samples_leaf=1,
                      n_estimators=10,
                      max_depth=55)
>>> uc_rdt = UnifiedClassification(func='RandomDecisionTree', **rdt_params)

Step 7: Invoke the fit method and inspect one of the returned attributes, importance_:

>>> uc_rdt.fit(data=df_fit,
               partition_method='stratified',
               stratified_column='CLASS',
               partition_random_state=2,
               training_percent=0.7,
               ntiles=2)
>>> print(uc_rdt.importance_.collect())
  VARIABLE_NAME  IMPORTANCE
0       OUTLOOK    0.191748
1          TEMP    0.418285
2      HUMIDITY    0.389968
3         WINDY    0.000000
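
The fit call above asks for a stratified partition on CLASS with 70% of the rows used for training. As a rough illustration of what stratified partitioning means, the sketch below splits the rows class by class so that each class keeps about the same share in both parts. PAL performs its own partitioning inside SAP HANA and its exact sampling may differ; stratified_split is a hypothetical helper, and the labels are the six rows shown in Step 4.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, training_percent, seed):
    """Split rows into (train, validation), sampling each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    train, validation = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        # Round per class; keep at least one row of each class for training.
        cut = max(1, round(len(members) * training_percent))
        train.extend(members[:cut])
        validation.extend(members[cut:])
    return train, validation


# (ID, CLASS) pairs taken from the df_fit preview in Step 4.
rows = [(1, "Play"), (2, "Do not Play"), (3, "Do not Play"),
        (4, "Do not Play"), (5, "Play"), (6, "Play")]
train, validation = stratified_split(rows, lambda r: r[1], 0.7, seed=2)
print(len(train), len(validation))
```

With three rows per class and training_percent=0.7, each class contributes two rows to the training part and one to the validation part.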

Step 8: View the 'UnifiedClassification' model report:

>>> UnifiedReport(uc_rdt).build().display()

Step 9: Create an SAP HANA DataFrame df_predict that points to the table "DATA_TBL_PREDICT":

>>> df_predict = conn.table("DATA_TBL_PREDICT")

Step 10: Preview df_predict:

>>> df_predict.collect()
   ID   OUTLOOK     TEMP  HUMIDITY WINDY
0   0  Overcast     75.0      70.0   Yes
1   1      Rain     78.0      70.0   Yes
2   2     Sunny     66.0      70.0   Yes
3   3     Sunny     69.0      70.0   Yes
4   4      Rain      NaN      70.0   Yes
5   5      None     70.0      70.0   Yes
6   6       ***     70.0      70.0   Yes

Step 11: Invoke the predict method and inspect the result:

>>> result = uc_rdt.predict(df_predict, key="ID", top_k_attributions=10)
>>> print(result.collect())
   ID        SCORE  CONFIDENCE
0   0         Play         0.8
1   1         Play         1.0
2   2         Play         0.6
3   3         Play         1.0
4   4         Play         1.0
5   5  Do not Play         0.8
6   6         Play         0.8
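
The CONFIDENCE values above are all multiples of 0.1, which is consistent with a 10-tree ensemble (n_estimators=10) reporting the share of trees that voted for the winning class. The source does not state how PAL computes this value, so treat the following as a sketch of plain majority voting under that assumption, with made-up per-tree votes.

```python
from collections import Counter


def vote(tree_predictions):
    """Return (winning_label, share_of_trees_that_voted_for_it)."""
    counts = Counter(tree_predictions)
    label, n_votes = counts.most_common(1)[0]
    return label, n_votes / len(tree_predictions)


# Hypothetical votes from 10 trees for a single row.
votes = ["Play"] * 8 + ["Do not Play"] * 2
print(vote(votes))  # ('Play', 0.8)
```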

Step 12: Create an SAP HANA DataFrame df_score that points to the table "DATA_TBL_SCORE":

>>> df_score = conn.table("DATA_TBL_SCORE")

Step 13: Preview df_score:

>>> df_score.collect()
   ID   OUTLOOK     TEMP  HUMIDITY WINDY        CLASS
0   0  Overcast     75.0  -10000.0   Yes         Play
1   1      Rain     78.0      70.0   Yes         Play
2   2     Sunny -10000.0       NaN   Yes  Do not Play
3   3     Sunny     69.0      70.0   Yes  Do not Play
4   4      Rain      NaN      70.0   Yes         Play
5   5      None     70.0      70.0   Yes  Do not Play
6   6       ***     70.0      70.0   Yes         Play

Step 14: Create a TreeModelDebriefing.shapley_explainer object from the prediction result of Step 11 and then invoke summary_plot() to explain the output of the 'UnifiedClassification' model:

>>> from hana_ml.visualizers.model_debriefing import TreeModelDebriefing
>>> shapley_explainer = TreeModelDebriefing.shapley_explainer(result, df_score, key='ID', label='CLASS')
>>> shapley_explainer.summary_plot()

Step 15: Invoke the score method and inspect the result:

>>> score_res = uc_rdt.score(data=df_score,
                             key='ID',
                             max_result_num=2,
                             ntiles=2,
                             attribution_method='tree-shap')[1].head(4)
>>> print(score_res.collect())
   STAT_NAME          STAT_VALUE   CLASS_NAME
0        AUC  0.5102040816326531         None
1     RECALL                   0  Do not Play
2  PRECISION                   0  Do not Play
3   F1_SCORE                   0  Do not Play
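
RECALL, PRECISION and F1_SCORE in the output above are per-class statistics. The sketch below shows how such values are conventionally computed from true and predicted labels. The true labels are the CLASS column of the df_score preview above; the predicted labels are hypothetical (the source does not show them) and were chosen so that no row is predicted as "Do not Play". Reporting an undefined 0/0 ratio as 0 is also an assumption.

```python
def class_metrics(y_true, y_pred, positive):
    """Precision, recall and F1 for one class; 0/0 is reported as 0.0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# CLASS column of df_score, rows 0..6.
y_true = ["Play", "Play", "Do not Play", "Do not Play",
          "Play", "Do not Play", "Play"]
# Hypothetical predictions: every row predicted as "Play".
y_pred = ["Play"] * 7

print(class_metrics(y_true, y_pred, "Do not Play"))  # (0.0, 0.0, 0.0)
print(class_metrics(y_true, y_pred, "Play"))
```

A class that is never predicted gets zero recall, precision and F1 under this convention, which is one way a result like the rows for "Do not Play" can arise.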

Step 16: Close the connection to SAP HANA:

>>> conn.close()

Help

Please see the Python Machine Learning Client for SAP HANA Documentation for details on the methods.

License

The SAP HANA ML API is provided via the SAP Developer License Agreement.

By using this software, you agree that the following text is incorporated into the terms of the Developer Agreement:

If you are an existing SAP customer for On Premise software, your use of this current software is also covered by the terms of your software license agreement with SAP, including the Use Rights, the current version of which can be found at: https://www.sap.com/about/agreements/product-use-and-support-terms.html?tag=agreements:product-use-support-terms/on-premise-software/software-use-rights

Release history

2.29.26061601

Jun 22, 2026

2.28.26042901

Apr 29, 2026

2.28.26042900

Apr 29, 2026

2.28.26031701

Mar 18, 2026

2.27.26022300

Feb 24, 2026

2.27.26020601

Feb 11, 2026

2.27.26020600

Feb 10, 2026

2.27.26012900

Jan 26, 2026

2.27.26011300

Jan 15, 2026

2.27.25122200

Dec 22, 2025

2.27.25121601

Dec 16, 2025

2.26.25111700

Nov 17, 2025

2.26.25110400

Nov 5, 2025

2.26.25102400

Oct 24, 2025

2.26.25101500

Oct 14, 2025

2.26.25091602

Sep 16, 2025

2.25.25080800

Aug 7, 2025

2.25.25072901

Jul 30, 2025

2.25.25072900

Jul 28, 2025

2.25.25062301

Jun 23, 2025

2.25.25061701

Jun 17, 2025

2.24.25051600

May 16, 2025

2.24.25042500

Apr 25, 2025

2.24.25040300

Apr 3, 2025

2.24.25032500

Mar 25, 2025

2.24.25032100

Mar 21, 2025

2.24.25031802

Mar 18, 2025

2.23.25031700

Mar 17, 2025

2.23.25030801

Mar 11, 2025

2.23.25030800

Mar 9, 2025

2.23.25021400

Feb 13, 2025

2.23.25020600

Feb 6, 2025

2.23.25010300

Jan 3, 2025

2.23.24121701

Dec 17, 2024

2.22.24110601

Nov 5, 2024

2.22.24101100

Oct 9, 2024

2.22.24091701

Sep 18, 2024

2.21.24090900

Sep 9, 2024

2.21.24072600

Jul 25, 2024

2.21.24071200

Jul 14, 2024

2.21.24062800

Jun 28, 2024

2.21.24062400

Jun 24, 2024

2.20.24042601

Apr 30, 2024

2.20.24031902

Mar 19, 2024

2.19.24022101

Feb 21, 2024

2.19.24022100

Feb 21, 2024

2.19.24013100

Jan 31, 2024

2.19.24012400

Jan 24, 2024

2.19.24011500

Jan 15, 2024

2.19.24010400

Jan 4, 2024

2.19.23120702

Dec 13, 2023

2.18.23120102

Dec 5, 2023

2.18.23120101

Dec 4, 2023

2.18.23120100

Dec 1, 2023

2.18.23111400

Nov 14, 2023

2.18.23110300

Nov 2, 2023

2.18.23092701

Sep 29, 2023

2.18.23092700

Sep 27, 2023

2.18.23091401

Sep 13, 2023

2.17.23080800

Aug 8, 2023

2.17.23072700

Jul 26, 2023

2.17.23071400

Jul 13, 2023

2.17.23062800

Jun 28, 2023

2.17.23062200

Jun 20, 2023

2.16.23060100

May 31, 2023

2.16.23052600

May 25, 2023

2.16.23051900

May 18, 2023

2.16.23050800

May 8, 2023

2.16.23041300

Apr 12, 2023

2.16.23032300

Mar 23, 2023

2.16.23031601

Mar 16, 2023

2.15.23021701

Feb 15, 2023

2.15.23011100

Jan 10, 2023

2.15.22122300

Dec 23, 2022

2.15.22121601

Dec 17, 2022

2.14.22120800

Dec 8, 2022

2.14.22120100

Dec 1, 2022

2.14.22102800

Oct 29, 2022

2.14.22101400

Oct 13, 2022

2.14.22092300

Sep 23, 2022

2.14.22091801

Sep 16, 2022

2.13.22072200

Jul 19, 2022

2.13.22071500

Jul 11, 2022

2.13.22070101

Jul 1, 2022

2.13.22060800

Jun 7, 2022

2.13.22051101

May 11, 2022

2.12.22042800

Apr 28, 2022

2.12.22042500

Apr 25, 2022

2.12.22040801

Apr 7, 2022

2.12.22032503

Mar 22, 2022

2.11.22020900

Feb 9, 2022

2.11.22010700

Jan 6, 2022

2.11.21121103

Dec 16, 2021

2.10.21091803

Sep 17, 2021

This version

2.9.21072600

Jul 26, 2021

2.9.21070902

Jul 9, 2021

2.9.21063001

Jun 29, 2021

2.9.21061902

Jun 24, 2021

2.8.21042100

Apr 21, 2021

2.6.21012600

Jan 25, 2021

2.6.21011300

Jan 12, 2021

2.6.20120900

Dec 9, 2020

2.6.20110601

Dec 9, 2020

2.6.20110600

Nov 6, 2020

2.6.20101606

Oct 16, 2020

2.5.20062609

Jul 20, 2020

2.5.20062608

Jul 20, 2020

2.5.20062605

Jun 26, 2020

1.0.8.post11

Mar 24, 2020

1.0.8.post8

Mar 23, 2020

1.0.8.post5

Jan 16, 2020

1.0.8.post2

Jan 15, 2020

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

hana_ml-2.9.21072600-py3-none-any.whl (700.3 kB)

Uploaded Jul 26, 2021 Python 3

File details

Details for the file hana_ml-2.9.21072600-py3-none-any.whl.

File metadata

Download URL: hana_ml-2.9.21072600-py3-none-any.whl
Upload date: Jul 26, 2021
Size: 700.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.15.0 pkginfo/1.7.1 requests/2.25.1 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.5.3

File hashes

Hashes for hana_ml-2.9.21072600-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1eec9021813f1bab1aad186300ba975faedd8ac9abf9c3d2677a5e62c0de7ece`
MD5	`d54faf856c2f9e7c7a3167b01848ffea`
BLAKE2b-256	`11a41139c239a5357dfb4c7bf0d380f11f356195222a7feb2dde835c60f9d7a9`
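
A downloaded wheel can be checked against the SHA256 digest listed above with Python's hashlib module. This is a minimal sketch: sha256_of is a hypothetical helper, and the file path is an assumption about where the wheel was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest copied from the table above.
EXPECTED_SHA256 = "1eec9021813f1bab1aad186300ba975faedd8ac9abf9c3d2677a5e62c0de7ece"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large wheel need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed download location: the current working directory.
wheel = Path("hana_ml-2.9.21072600-py3-none-any.whl")
if wheel.exists():
    print("match" if sha256_of(wheel) == EXPECTED_SHA256 else "MISMATCH")
else:
    print(f"{wheel} not found; download it first")
```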
