Python Machine Learning Client for SAP HANA
Introduction
Welcome to the SAP HANA Python Client API for machine learning algorithms. This API enables Python data scientists to access SAP HANA data and build machine learning models using that data directly in SAP HANA.
Overview
The SAP HANA Python Client API for machine learning algorithms provides a set of client-side Python functions for accessing and querying SAP HANA data, and a set of functions for developing machine learning models.
The Python Client API for Machine Learning consists of two main parts:
- A set of machine learning APIs for different algorithms.
- The SAP HANA DataFrame, which provides a set of methods for analyzing data in SAP HANA without bringing the data to the client.
The machine learning APIs are organized into two packages:
- PAL package
- APL package
The Predictive Analysis Library (PAL) package consists of a set of Python algorithms and functions that provide access to the machine learning capabilities in SAP HANA. The PAL Python functions cover a variety of machine learning algorithms for training a model and then using the trained model for scoring. For details on which algorithms are available in this release, refer to the documentation.
The Automated Predictive Library (APL) package exposes the data mining capabilities of the Automated Analytics engine in SAP HANA through a set of functions. These functions develop a predictive modeling process that analysts can use to answer simple questions on their customer datasets stored in SAP HANA.
This Python library uses the SAP HANA Python driver (hdbcli) to connect to and access SAP HANA.
Getting Started
Install from PyPI with pip (run this in a shell, not in the Python interpreter):
$ pip install hana-ml
Quick Start
- For HANA tenant databases, use the port number 3NN13 (where NN is the SAP instance number, e.g. 30013).
- For HANA system databases in a multitenant system, the port number is 3NN13.
- For HANA single-tenant databases, the port number is 3NN15.
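The port patterns above can be expanded mechanically from the instance number. The helper below is a small sketch and is not part of hana-ml: `sql_port`, its `kind` argument, and the `PORT_SUFFIX` table are hypothetical names chosen for illustration, and the mapping only encodes the three cases listed above.

```python
# Hypothetical helper (not part of hana-ml): expand the port patterns
# listed above from a two-digit SAP instance number.
PORT_SUFFIX = {
    "tenant": "13",         # tenant database, as listed above
    "system": "13",         # system database in a multitenant system
    "single_tenant": "15",  # single-tenant database
}


def sql_port(instance_number: int, kind: str = "tenant") -> int:
    """Return the port 3NN<suffix> for instance number NN."""
    if not 0 <= instance_number <= 99:
        raise ValueError("instance number must be between 0 and 99")
    if kind not in PORT_SUFFIX:
        raise ValueError(f"unknown database kind: {kind!r}")
    # Zero-pad the instance number so that 0 becomes "00".
    return int(f"3{instance_number:02d}{PORT_SUFFIX[kind]}")


print(sql_port(0))                   # 30013
print(sql_port(0, "single_tenant"))  # 30015
print(sql_port(90, "system"))        # 39013
```

The returned integer could then be passed as the `port` argument when creating the connection.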
>>> from hana_ml import dataframe
>>> conn = dataframe.ConnectionContext(address="<hostname>", port=3<NN>MM, user="<username>", password="<password>")
Return a DataFrame that references an SAP HANA table.
>>> df = conn.table('MY_TABLE', schema='MY_SCHEMA').filter('COL3>5').select('COL1', 'COL2')
Return a DataFrame from a SELECT statement.
>>> df = dataframe.DataFrame(conn, 'select * from MY_SCHEMA.MY_TABLE')
Convert to a pandas DataFrame (this fetches the data to the client).
>>> pandas_df = df.collect()
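To illustrate why `filter` and `select` do not move any data while `collect` does, here is a toy sketch of a lazily evaluated frame. It is not hana-ml's implementation: the `LazyFrame` class is hypothetical, and an in-memory SQLite database stands in for SAP HANA so that the example runs without a server. The idea it shows is that each method only composes a larger SELECT statement, and the database is queried once at `collect()`.

```python
import sqlite3


class LazyFrame:
    """Toy stand-in for a server-side DataFrame: it stores only a SELECT
    statement and runs it when collect() is called."""

    def __init__(self, conn, select_statement):
        self.conn = conn
        self.select_statement = select_statement

    def filter(self, condition):
        # Wrap the current statement in a subquery; nothing is executed yet.
        sql = f"SELECT * FROM ({self.select_statement}) AS t WHERE {condition}"
        return LazyFrame(self.conn, sql)

    def select(self, *columns):
        # Project columns by wrapping the statement again; still no execution.
        cols = ", ".join(columns)
        sql = f"SELECT {cols} FROM ({self.select_statement}) AS t"
        return LazyFrame(self.conn, sql)

    def collect(self):
        # Only here does data travel from the database to the client.
        return self.conn.execute(self.select_statement).fetchall()


# In-memory SQLite database standing in for SAP HANA (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MY_TABLE (COL1 INTEGER, COL2 TEXT, COL3 INTEGER)")
conn.executemany(
    "INSERT INTO MY_TABLE VALUES (?, ?, ?)",
    [(1, "a", 3), (2, "b", 7), (3, "c", 9)],
)

df = LazyFrame(conn, "SELECT * FROM MY_TABLE").filter("COL3 > 5").select("COL1", "COL2")
print(df.select_statement)  # the composed SQL, not yet executed
rows = df.collect()         # the query runs here
print(rows)
```

Because the filtering and projection happen inside the database, only the rows and columns that survive them are transferred when `collect()` is called.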
Create an SAP HANA DataFrame from a pandas DataFrame.
>>> df = dataframe.create_dataframe_from_pandas(conn, pandas_df, 'MY_TABLE', force=True)
Call SAP HANA machine learning algorithms.
>>> from hana_ml.algorithms.pal.trees import RandomForestClassifier
>>> rfc = RandomForestClassifier(n_estimators=3,
...                              max_features=3,
...                              random_state=2,
...                              split_threshold=0.00001,
...                              calculate_oob=True,
...                              min_samples_leaf=1,
...                              thread_ratio=1.0)
>>> rfc.fit(data=df, features=['OUTLOOK', 'TEMP', 'HUMIDITY', 'WINDY'],
...         label='LABEL')
>>> rfc.feature_importances_.collect()
  VARIABLE_NAME  IMPORTANCE
0       OUTLOOK    0.449550
1          TEMP    0.216216
2      HUMIDITY    0.208108
3         WINDY    0.126126
>>> # df2 is a HANA DataFrame holding the rows to score; it is not defined in this example.
>>> result = rfc.predict(data=df2, key='ID', verbose=False)
>>> result.collect()
   ID SCORE  CONFIDENCE
0   0  Play    0.666667
1   1  Play    0.666667
Help
See the [SAP HANA ML API Reference](https://help.sap.com/doc/1d0ebfe5e8dd44d09606814d83308d4b/2.0.05/en-US/index.html) for details about developing with the SAP HANA ML API.
License
The SAP HANA ML API is provided via the [SAP Developer License Agreement](https://tools.hana.ondemand.com/developer-license-3_1.txt).
By using this software, you agree that the following text is incorporated into the terms of the Developer Agreement:
If you are an existing SAP customer for On Premise software, your use of this current software is also covered by the
terms of your software license agreement with SAP, including the Use Rights, the current version of which can be found at:
https://www.sap.com/about/agreements/product-use-and-support-terms.html?tag=agreements:product-use-support-terms/on-premise-software/software-use-rights
Built Distribution
Hashes for hana_ml-2.6.20110601-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 560d41d1edbb69398cf9ba17f2430829fcec0fe4b7035c913ccc18203967692f
MD5 | 78cada51e1fbc35ac992855ed63f42d2
BLAKE2b-256 | 7202f58a2d63a3b30ed725ad2ae86eb0db549c8e1fdf1a77d7303ca70cfd2e44