Teradata Consulting Python Client Extensions

Teradata ML Extensions

Extensions to the core teradataml library by Teradata Consulting to aid in field development work around BYOM, STO, RTO and AnalyticOps solutions.

Installation

You can install via pip.

pip install tdextensions

Usage

The Python version on your client must match the one used in Teradata (3.6+ at the time of writing), because serialization differs between Python versions (e.g. between 3.5 and 3.6).
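A quick client-side check can catch a version mismatch before any function is serialized. The sketch below is not part of tdextensions: `SERVER_PYTHON` and `versions_match` are hypothetical names, and the server version has to be entered by hand (3.6 is assumed here, following the note above).

```python
import sys

# Assumption: the Python version installed on the Teradata nodes, entered by hand.
SERVER_PYTHON = (3, 6)

def versions_match(client, server):
    """Return True when the major and minor version numbers agree."""
    return tuple(client[:2]) == tuple(server[:2])

if not versions_match(sys.version_info, SERVER_PYTHON):
    # Only warn; whether a mismatch is fatal depends on what gets serialized.
    print("Warning: client Python %d.%d differs from server Python %d.%d"
          % (sys.version_info[0], sys.version_info[1],
             SERVER_PYTHON[0], SERVER_PYTHON[1]))
```

Comparing only major and minor versions is a design choice for this sketch; patch releases are assumed to serialize compatibly.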

from teradataml.dataframe.dataframe import DataFrame
from tdextensions.distributed import DistDataFrame, DistMode
from teradataml import create_context
import pandas as pd
import numpy as np

pd.options.display.max_colwidth = 250

engine = create_context(host="localhost", username="ivsm_user", password="ivsm_user")

A simple map example where we multiply the values of two columns on a row-by-row basis

def my_fun(row):
    return np.array([row.idx, row.sepal_length * row.sepal_width])

df = DistDataFrame("iris_train", dist_mode=DistMode.STO, sto_id="my_dumb_map")
df = df.map(lambda row: my_fun(row),
            returns=[["idx", "INTEGER"], ["my_derived_col", "FLOAT"]])

df.head()
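Before sending a row function to the database, it can help to dry-run the same logic locally on a few hand-made rows. The sketch below uses only the standard library and does not call tdextensions: `Row`, `my_fun_plain`, `declared_columns` and the sample values are hypothetical stand-ins, and the function returns a plain list rather than a NumPy array.

```python
from collections import namedtuple

# Hypothetical stand-in for a row of iris_train; only the columns used below.
Row = namedtuple("Row", ["idx", "sepal_length", "sepal_width"])

def my_fun_plain(row):
    # Same arithmetic as my_fun, returning a plain list instead of a NumPy array.
    return [row.idx, row.sepal_length * row.sepal_width]

# Column names declared in the `returns` argument of the map call.
declared_columns = ["idx", "my_derived_col"]

# Made-up sample rows, for illustration only.
sample = [Row(1, 5.0, 3.5), Row(2, 4.0, 3.0)]

results = [my_fun_plain(row) for row in sample]
for out in results:
    # Each output row needs exactly one value per declared column.
    assert len(out) == len(declared_columns)

print(results)
```

A check like this catches a mismatch between the function's output and the declared columns without a round trip to the database.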

A more advanced example where we train a different model for each partition of a dataset

from sklearn.ensemble import RandomForestClassifier
import base64
import dill

def train(partition):
    X = partition[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']]
    y = partition[['species']]

    clf = RandomForestClassifier()
    clf.fit(X, y.values.ravel())

    return np.array([[partition.species.iloc[0], "my_model_id", base64.b64encode(dill.dumps(clf))]])

df = DistDataFrame("iris_train", dist_mode=DistMode.STO, sto_id="my_model_train")
df = df.map_partition(lambda partition: train(partition), 
                      partition_by="species", 
                      returns=[["partition_id", "VARCHAR(255)"], 
                               ["model_id", "VARCHAR(255)"],
                               ["model_artefact", "CLOB"]])
df.to_pandas().head()
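The `model_artefact` column holds a base64-encoded serialized model, so reading a model back means reversing both steps. The sketch below shows the round trip using the standard-library `pickle` as a stand-in for `dill` (both expose `dumps` and `loads`); `encode_artefact`, `decode_artefact` and the dictionary standing in for a trained classifier are hypothetical.

```python
import base64
import pickle

def encode_artefact(obj):
    # Serialize, then base64-encode so the result is safe to store as text.
    return base64.b64encode(pickle.dumps(obj))

def decode_artefact(blob):
    # Reverse order: base64-decode first, then deserialize.
    return pickle.loads(base64.b64decode(blob))

# Hypothetical placeholder for a trained model object.
model = {"n_estimators": 100, "classes": ["setosa", "versicolor"]}

blob = encode_artefact(model)
restored = decode_artefact(blob)
print(restored == model)
```

With the example above, `dill.loads` would take the place of `pickle.loads`. Deserializing executes code embedded in the payload, so only decode artefacts from a trusted source.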

Permissions

SET SESSION SEARCHUIFDBPATH = <database>;
GRANT EXECUTE PROCEDURE ON <db> TO <user>;
GRANT EXECUTE PROCEDURE ON SYSUIF TO <user>;
GRANT CREATE EXTERNAL PROCEDURE ON <db> TO <user>;
GRANT EXECUTE FUNCTION ON TD_SYSFNLIB.SCRIPT TO <user>;
GRANT EXECUTE ON SYSUIF.DEFAULT_AUTH TO <user>;
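When the same grants are needed for several users, the placeholders can be filled in programmatically. This is a minimal sketch, not part of tdextensions: `render_grants` and `GRANT_TEMPLATES` are hypothetical names, the identifier check is deliberately conservative, and the `SET SESSION` statement is left out because it is a session setting rather than a grant.

```python
import re

# The GRANT statements listed above, with <db> and <user> as format fields.
GRANT_TEMPLATES = [
    "GRANT EXECUTE PROCEDURE ON {db} TO {user};",
    "GRANT EXECUTE PROCEDURE ON SYSUIF TO {user};",
    "GRANT CREATE EXTERNAL PROCEDURE ON {db} TO {user};",
    "GRANT EXECUTE FUNCTION ON TD_SYSFNLIB.SCRIPT TO {user};",
    "GRANT EXECUTE ON SYSUIF.DEFAULT_AUTH TO {user};",
]

# Assumption: plain unquoted identifiers only (letters, digits, underscore).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def render_grants(db, user):
    """Return the GRANT statements with <db> and <user> filled in."""
    for name in (db, user):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError("unsafe identifier: %r" % name)
    return [template.format(db=db, user=user) for template in GRANT_TEMPLATES]

# Example with made-up names.
for statement in render_grants("my_db", "my_user"):
    print(statement)
```

Rejecting anything but simple identifiers keeps the string formatting from being used to inject extra SQL.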

Download files

Source Distribution

tdextensions-1.0.0rc1.tar.gz (6.2 kB)

Built Distribution

tdextensions-1.0.0rc1-py3-none-any.whl (12.5 kB, Python 3)
