A package for Script Table Operator that applies set theory to machine learning in Python.

Project description

`tdstone2` Package

Overview

The tdstone2 package is designed to simplify the operationalization of Python code for data analysis and machine learning on the Teradata Vantage system. It leverages the massively parallel processing (MPP) architecture of Teradata Vantage to run hundreds of Python scripts across hundreds of data partitions. This approach supports the industrialization, lineage tracking, and reproducibility of millions of custom models while minimizing data movement.

Features

Hyper-segmented Model Deployment: Deploy scikit-learn pipelines or custom Python functions across segmented datasets for parallel execution.
Model Lineage and Reproducibility: Automatically track the lineage of models and ensure reproducibility across different data partitions.
Efficient Data Handling: Minimize data movement by leveraging Teradata's parallel processing capabilities to execute models directly on the database.

Installation

To install tdstone2, use pip:

pip install tdstone2

Ensure you have access to a Teradata Vantage system and the necessary credentials to connect and execute queries.
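
After installing, a quick check that the package is visible to the current interpreter can save a confusing import error later. The sketch below uses only the standard library; the helper name `installed_version` is illustrative and not part of tdstone2.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the installed tdstone2 version, or None if it has not been installed yet.
print(installed_version("tdstone2"))
```

If this prints None, the package was most likely installed into a different environment than the one running the script.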

Usage

Hyper-segmented Model Deployment

3.1 Engineering of the Scikit-learn Classifier Pipeline

from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier

# Example usage
steps_classifier = [
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier(max_depth=5, n_estimators=95))
]
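
The list above is in the (name, estimator) form that scikit-learn's `Pipeline` accepts, so it can be tried out locally before any deployment. The following is a minimal sketch under that assumption; the synthetic data from `make_classification` and the fixed `random_state` are illustrative stand-ins, not part of the original example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

steps_classifier = [
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier(max_depth=5, n_estimators=95, random_state=0)),
]

# Synthetic stand-in data: 200 rows, 9 numeric features (not the real dataset).
X, y = make_classification(n_samples=200, n_features=9, random_state=0)

# Wrap the steps in a Pipeline and run a local fit/predict round trip.
pipeline = Pipeline(steps_classifier)
pipeline.fit(X, y)
predictions = pipeline.predict(X)
print(predictions.shape)
```

A local fit like this only checks that the steps chain together; it says nothing about how the pipeline behaves on each real data partition.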

3.2 Deployment of the Scikit-learn Pipeline

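import teradataml as tdml  # assumption: tdml used below refers to the teradataml package
from tdstone2.tdshypermodel import HyperModel
from tdstone2.tdstone import TDStone

# Param is assumed to be a dict of connection settings with 'database' and 'user' keys, defined earlier.
sto = TDStone(schema_name=Param['database'], SEARCHUIFDBPATH=Param['user'])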
model_parameters = {
    "target": 'Y2',
    "column_categorical": ['flag', 'Y2'],
    "column_names_X": ['X1', 'X2', 'X3', 'X4', 'X5', 'X6', 'X7', 'X8', 'X9', 'flag']
}

model = HyperModel(
    tdstone=sto,
    metadata={'project': 'test'},
    skl_pipeline_steps=steps_classifier,
    model_parameters=model_parameters,
    dataset=tdml.in_schema(Param['database'], 'dataset_00'),
    id_row='ID',
    id_partition='Partition_ID',
    id_fold='FOLD',
    fold_training='train',
    convert_to_onnx=False, # <-- set to True if you want to get the ONNX version of your trained models
    store_pickle=True, # <-- to store your full object in pickle format
)

# Model deployment outputs

3.3 Local Execution for Validation/Debugging

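# code_and_data is assumed to be a dict with 'code', 'arguments' and 'data' entries;
# this README does not show how it is retrieved.
exec(code_and_data['code'])  # expected to define MyModel in the current namespace
local_model = MyModel(**code_and_data['arguments'])
df_local = code_and_data['data']
df_local['flag'] = df_local['flag'].astype('category')
df_local['Y2'] = df_local['Y2'].astype('category')
local_model.fit(df_local)
local_model.score(df_local)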
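
The two casts above match the columns listed under "column_categorical" in the model parameters. As a hedged sketch, the same conversion can be written once for any list of columns; the helper `cast_categoricals` and the tiny example frame are hypothetical and not part of tdstone2.

```python
import pandas as pd


def cast_categoricals(df, column_categorical):
    """Return a copy of df with the listed columns converted to the pandas 'category' dtype."""
    out = df.copy()
    for column in column_categorical:
        out[column] = out[column].astype('category')
    return out


# Tiny illustrative frame; real data would come from the partition under test.
df_example = pd.DataFrame({'X1': [0.1, 0.5, 0.9], 'flag': [0, 1, 0], 'Y2': [1, 0, 1]})
df_cast = cast_categoricals(df_example, ['flag', 'Y2'])
print(df_cast.dtypes)
```

Working on a copy keeps the original frame untouched, which makes it easier to rerun the local check with different parameters.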

Execution of the Deployed Hypermodel

4.1 Models Training

model.train()
# Outputs: Trained models are inserted into the specified repository

This operation launches one training run per data partition, as identified by the id_partition column list, using only the rows that belong to the training fold.
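
To make the per-partition behaviour concrete, here is a local analogue in pandas and scikit-learn. It is a sketch of the idea only, not how tdstone2 runs training inside Vantage; the synthetic data, the function `train_per_partition`, and the feature names are assumptions chosen to mirror the column names used earlier.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the dataset: 3 partitions and a FOLD column.
rng = np.random.default_rng(0)
n_rows = 300
df = pd.DataFrame({
    'Partition_ID': rng.integers(0, 3, size=n_rows),
    'FOLD': rng.choice(['train', 'test'], size=n_rows, p=[0.8, 0.2]),
    'X1': rng.normal(size=n_rows),
    'X2': rng.normal(size=n_rows),
})
df['Y2'] = (df['X1'] + df['X2'] > 0).astype(int)


def train_per_partition(data, features, target):
    """Fit one pipeline per Partition_ID, using only rows in the 'train' fold."""
    models = {}
    training_rows = data[data['FOLD'] == 'train']
    for partition_id, part in training_rows.groupby('Partition_ID'):
        pipeline = Pipeline([
            ('scaler', StandardScaler()),
            ('classifier', RandomForestClassifier(max_depth=5, n_estimators=95, random_state=0)),
        ])
        models[partition_id] = pipeline.fit(part[features], part[target])
    return models


models = train_per_partition(df, ['X1', 'X2'], 'Y2')
print(len(models))
```

Locally the loop runs one partition after another; the point of running it in the database is that the partitions can be processed in parallel.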

4.2 Model Scoring

model.score()
# Outputs: Model scores are inserted into the specified scores table

This runs the scoring on all the data, using the latest model available for the corresponding data partition.
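
The "latest model per partition" rule can be illustrated locally. The sketch below is an assumption-laden analogue: the in-memory `registry` list, its `trained_at` field, and the constant `DummyClassifier` models are hypothetical and only stand in for whatever tdstone2 stores in its repository.

```python
import pandas as pd
from sklearn.dummy import DummyClassifier

X_fit = pd.DataFrame({'X1': [0.0, 1.0]})


def constant_model(value):
    """Build a trivial model that always predicts the given class."""
    return DummyClassifier(strategy='constant', constant=value).fit(X_fit, [0, 1])


# Hypothetical registry: two model versions for partition 0, one for partition 1.
registry = [
    {'Partition_ID': 0, 'trained_at': '2024-01-01', 'model': constant_model(0)},
    {'Partition_ID': 0, 'trained_at': '2024-02-01', 'model': constant_model(1)},
    {'Partition_ID': 1, 'trained_at': '2024-01-15', 'model': constant_model(0)},
]

# Walk the records in chronological order so later entries overwrite earlier ones.
latest_models = {}
for record in sorted(registry, key=lambda r: r['trained_at']):
    latest_models[record['Partition_ID']] = record['model']

data = pd.DataFrame({'ID': [1, 2, 3], 'Partition_ID': [0, 1, 0], 'X1': [0.2, 0.4, 0.6]})

# Score every row with the latest model of its own partition.
scored_parts = []
for partition_id, part in data.groupby('Partition_ID'):
    part = part.copy()
    part['prediction'] = latest_models[partition_id].predict(part[['X1']])
    scored_parts.append(part)
scores = pd.concat(scored_parts).sort_values('ID')
print(scores[['ID', 'Partition_ID', 'prediction']])
```

Rows in partition 0 are scored by the February model rather than the January one, which is the behaviour described above.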

Reuse of the Hyper-segmented Model

2. Retrieve the Hyper-segmented Model

sto = TDStone(schema_name=Param['database'], SEARCHUIFDBPATH=Param['user'])
sto.list_hyper_models()
# Outputs: List of hyper models with their metadata

model_id = '0286d259-ecde-4cd0-ae4a-bcb3191383d1'  # Example model ID
existing_model = HyperModel(tdstone=sto)
existing_model.download(id=model_id)

Note that the model is not actually downloaded; this call only establishes a connection between the model hosted in Vantage and the Python client.
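
The example ID above has the shape of a UUID. Assuming model IDs always follow that format (the README does not state this), a small standard-library check can catch copy-paste mistakes before contacting the database. The helper `is_valid_model_id` is illustrative and not part of tdstone2.

```python
import uuid


def is_valid_model_id(candidate):
    """Return True if candidate parses as a UUID in canonical hyphenated form."""
    try:
        return str(uuid.UUID(candidate)) == candidate.lower()
    except (ValueError, AttributeError, TypeError):
        return False


print(is_valid_model_id('0286d259-ecde-4cd0-ae4a-bcb3191383d1'))
print(is_valid_model_id('not-a-model-id'))
```

A well-formed ID does not guarantee that the model exists in the repository; it only rules out malformed strings.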

3. Update the Training

existing_model.train()
# Outputs: Updated trained models are inserted into the specified repository

4. Update the Scoring

existing_model.score()
# Outputs: Updated scores are inserted into the specified scores table

Project details

Release history

0.1.9.3: May 4, 2026
0.1.9.2: May 4, 2026
0.1.9.1: May 3, 2026
0.1.9.0: May 3, 2026
0.1.8.2: Oct 17, 2025
0.1.8.1: Jun 10, 2025
0.1.8.0: Feb 28, 2025
0.1.7.3: Feb 7, 2025
0.1.7.2: Jan 22, 2025
0.1.7.1: Jan 22, 2025
0.1.7.0: Jan 22, 2025
0.1.6.9: Jan 21, 2025
0.1.6.8: Jan 21, 2025
0.1.6.7: Jan 21, 2025
0.1.6.6: Jan 21, 2025
0.1.6.5: Jan 21, 2025
0.1.6.4: Jan 21, 2025
0.1.6.2: Jan 21, 2025
0.1.6.1: Jan 21, 2025
0.1.6.0: Jan 20, 2025
0.1.5.10: Jan 20, 2025
0.1.5.9: Jan 20, 2025
0.1.5.8: Jan 20, 2025
0.1.5.7: Jan 20, 2025
0.1.5.6: Jan 20, 2025
0.1.5.5: Jan 18, 2025
0.1.5.4: Jan 18, 2025
0.1.5.0: Dec 20, 2024
0.1.4.23: Dec 20, 2024
0.1.4.22: Dec 20, 2024
0.1.3.21: Nov 27, 2024 (this version)
0.1.3.20: Nov 27, 2024
0.1.3.19: Nov 22, 2024
0.1.3.18: Nov 20, 2024
0.1.3.17: Nov 20, 2024
0.1.3.16: Nov 15, 2024
0.1.3.15: Nov 7, 2024
0.1.3.14: Nov 7, 2024
0.1.3.13: Oct 28, 2024
0.1.3.12: Oct 28, 2024
0.1.3.11: Oct 15, 2024
0.1.3.10: Oct 11, 2024
0.1.3.9: Oct 9, 2024
0.1.3.8: Oct 9, 2024
0.1.3.7: Oct 8, 2024
0.1.3.6: Oct 8, 2024
0.1.3.5: Oct 8, 2024
0.1.3.4: Oct 8, 2024
0.1.3.2: Oct 7, 2024
0.1.3.1: Oct 5, 2024
0.1.3.0: Oct 4, 2024
0.1.2.16: Jul 24, 2024
0.1.2.15: Jul 23, 2024
0.1.2.14: Jul 23, 2024
0.1.2.13: Jul 22, 2024
0.1.2.12: Apr 4, 2024
0.1.2.11: Apr 2, 2024
0.1.2.10: Apr 2, 2024
0.1.2.9: Apr 2, 2024
0.1.2.8: Mar 28, 2024
0.1.2.7: Mar 27, 2024
0.1.2.6: Feb 22, 2024
0.1.2.5: Feb 22, 2024
0.1.2.4: Feb 22, 2024
0.1.2.3: Oct 11, 2023
0.1.2.2: Oct 10, 2023
0.1.2.1: Oct 5, 2023
0.1.2.0: Oct 5, 2023
0.1.0.3: Aug 31, 2023
0.1.0.2: Aug 31, 2023
0.1.0.1: Aug 31, 2023
0.1.0: Aug 3, 2023

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

tdstone2-0.1.3.21-py3-none-any.whl (100.5 kB view details)

Uploaded Nov 27, 2024 Python 3

File details

Details for the file tdstone2-0.1.3.21-py3-none-any.whl.

File metadata

Download URL: tdstone2-0.1.3.21-py3-none-any.whl
Upload date: Nov 27, 2024
Size: 100.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.12.3

File hashes

Hashes for tdstone2-0.1.3.21-py3-none-any.whl
Algorithm	Hash digest
SHA256	`450472f7b537fa8311b3c964fde77f5798e1cdd9e0516a3d56c63b1982856ab3`
MD5	`2f9946a167e36cc8cebfafd01ab9d349`
BLAKE2b-256	`5d307ad83d5c8e24cf615ddd3b7138e012dd5921167fce423a365987196dc93c`
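
The published digests can be used to confirm that a downloaded wheel is intact. The following is a minimal sketch using the standard `hashlib` module; the function names are illustrative, and the local file path passed to them is whatever location the wheel was saved to.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the wheel listed above, the call would be `matches_published_hash('tdstone2-0.1.3.21-py3-none-any.whl', '450472f7b537fa8311b3c964fde77f5798e1cdd9e0516a3d56c63b1982856ab3')`, which should return True for an unmodified download.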
