Teradata Python package for Generative-AI powered text analytics on Teradata Vantage

Teradata Python package for Generative-AI

teradatagenai is a Generative AI package developed by Teradata. It provides a suite of APIs for text analytics applications. With teradatagenai, users can process and analyze text data from sources such as emails, academic papers, social media posts, and product reviews, and gain insights with precision and depth that rival or surpass human analysis.

For community support, please visit the Teradata Community.

For Teradata customer support, please visit Teradata Support.

Copyright 2025, Teradata. All Rights Reserved.

Documentation

General product information, including installation instructions, is available on the Teradata Documentation website.

Release Notes

Version 20.00.00.00

  • teradatagenai 20.00.00.00 marks the first release of the package.
  • This version supports integrating Hugging Face models into the VantageCloud Lake BYO LLM offering, so that these models can be used for a range of text analytics tasks, including:
    • KeyPhrase Extraction
    • PII (Personally Identifiable Information) Entity Recognition
    • Masking PII Information
    • Language Detection
    • Language Translation
    • Text Summarization
    • Entity Recognition
    • Sentiment Analysis
    • Text Classification
    • Text Embeddings
    • Sentence Similarity
  • The package also features a general-purpose task function that can perform any task supported by the underlying large language model (LLM). The function can be customized to meet specific requirements. Refer to the "Get Embeddings and Similarity Score for Employee Data and Articles" example for details on its usage.

Installation and Requirements

Package Requirements:

  • Python 3.9 or later

Note: 32-bit Python is not supported.
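
The two requirements above can be checked before installing. The following is a minimal sketch using only the standard library; the function name check_interpreter is hypothetical and is not part of teradatagenai.

```python
import struct
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the package requirements


def check_interpreter(version=None, pointer_bits=None):
    """Return a list of problems; an empty list means the interpreter qualifies."""
    if version is None:
        version = sys.version_info[:2]
    if pointer_bits is None:
        # The size of a C pointer in bytes, times 8, gives the interpreter's bitness.
        pointer_bits = struct.calcsize("P") * 8

    problems = []
    if tuple(version) < MIN_PYTHON:
        problems.append("Python %d.%d found; 3.9 or later is required" % tuple(version))
    if pointer_bits != 64:
        problems.append("%d-bit Python found; 32-bit Python is not supported" % pointer_bits)
    return problems


if __name__ == "__main__":
    # Print any problems found for the running interpreter.
    for problem in check_interpreter():
        print(problem)
```

The optional arguments exist only so the check can be exercised with values other than those of the running interpreter.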

Minimum System Requirements:

  • Windows 7 (64-bit) or later
  • macOS 10.9 (64-bit) or later
  • Red Hat 7 or later versions
  • Ubuntu 16.04 or later versions
  • CentOS 7 or later versions
  • SLES 12 or later versions
  • VantageCloud Lake on AWS with Open Analytics Framework in order to use Teradata’s BYO LLM offering.

Installation

Use pip to install the Teradata Python Package for Generative AI:

Platform       Command
macOS/Linux    pip install teradatagenai
Windows        python -m pip install teradatagenai

When upgrading to a new version of teradatagenai, you may need to use pip install's --no-cache-dir option to force the download of the new version.

Platform       Command
macOS/Linux    pip install --no-cache-dir -U teradatagenai
Windows        python -m pip install --no-cache-dir -U teradatagenai
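
After an install or upgrade, the version that Python actually sees can be confirmed from the package metadata. This is a small sketch using the standard library's importlib.metadata; the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("teradatagenai")
    if version is None:
        print("teradatagenai is not installed")
    else:
        print("teradatagenai " + version)
```

Reading the metadata avoids importing the package itself, so the check works even when the package's own dependencies are not yet in place.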

Using the Teradata Python Package for Generative AI:

Your Python script must import the teradatagenai package in order to use the Teradata Python Package for Generative AI. Let us walk through some examples to gain a better understanding. The examples share a common setup that imports the required packages and loads the data.

Common Setup

# Import the modules and create a teradataml DataFrame.
import os
import teradatagenai
from teradatagenai import TeradataAI, TextAnalyticsAI, load_data
from teradataml import DataFrame

load_data('employee', 'employee_data')
data = DataFrame('employee_data')
df_reviews = data.select(["employee_id", "employee_name", "reviews"])
df_articles = data.select(["employee_id", "employee_name", "articles"])

# Define the base directory and script path.
base_dir = os.path.dirname(teradatagenai.__file__)
sentence_similarity_script = os.path.join(base_dir, 'example-data', 'sentence_similarity.py')

Analyze Sentiment of Food Reviews

In this example, we will be using the analyze_sentiment API to analyze the sentiment of food reviews in the reviews column of a teradataml DataFrame using the Hugging Face model distilbert-base-uncased-emotion. Reviews are passed as a column name along with the teradataml DataFrame.

# Define the model name and arguments for the Hugging Face model.
model_name = 'bhadresh-savani/distilbert-base-uncased-emotion'
model_args = {
    'transformer_class': 'AutoModelForSequenceClassification',
    'task': 'text-classification'
}

# Create a TeradataAI object with the specified model.
llm = TeradataAI(api_type="hugging_face", model_name=model_name, model_args=model_args)
# Create a TextAnalyticsAI object.
obj = TextAnalyticsAI(llm=llm)
obj.analyze_sentiment(column='reviews', data=df_reviews, delimiter="#")

Get Embeddings and Similarity Score for Employee Data and Articles

In this example, we will use the task API to perform two tasks: generating embeddings and calculating similarity scores using the Hugging Face model all-MiniLM-L6-v2.

Generate Embeddings for Employee Reviews

We will generate embeddings for the text in the articles column of a teradataml DataFrame using the Hugging Face model all-MiniLM-L6-v2.

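# Import the types needed to describe the script's output columns.
from collections import OrderedDict
from teradatasqlalchemy.types import VARCHAR

# Define the script path for embeddings.
embeddings_script = os.path.join(base_dir, 'example-data', 'embeddings.py')

# Construct the returns argument based on the user script:
# one text column followed by 384 embedding columns (v1 to v384).
returns = OrderedDict([('text', VARCHAR(512))])
for i in range(384):
    returns["v{}".format(i + 1)] = VARCHAR(1000)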

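# Use the task API to generate embeddings.
# Note: `llm` must wrap all-MiniLM-L6-v2 for this step; as written, it still
# refers to the TeradataAI object created in the sentiment example.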
llm.task(
    column="articles",
    data=df_articles,
    script=embeddings_script,
    returns=returns,
    libs='sentence_transformers',
    delimiter='#'
)
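
The shape of the returns argument can be sketched without a Vantage connection. all-MiniLM-L6-v2 produces 384-dimensional embeddings, so the script's output has one text column plus 384 value columns. In this sketch the helper name embedding_columns is hypothetical, and plain strings stand in for the VARCHAR type objects so that the block runs without teradatasqlalchemy.

```python
from collections import OrderedDict

EMBEDDING_DIM = 384  # output dimension of all-MiniLM-L6-v2


def embedding_columns(dim=EMBEDDING_DIM):
    """Map each output column name to a placeholder type string.

    The strings are stand-ins (an assumption for illustration); the real
    call passes VARCHAR type objects instead.
    """
    columns = OrderedDict([("text", "VARCHAR(512)")])
    for i in range(1, dim + 1):
        columns["v%d" % i] = "VARCHAR(1000)"
    return columns


if __name__ == "__main__":
    columns = embedding_columns()
    names = list(columns)
    print(len(columns), names[0], names[1], names[-1])
```

Column order matters here because the output columns are matched to the mapping in sequence, which is why an ordered mapping is built.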

Calculate Similarity Score

We will calculate the similarity score between employee data and articles using the Hugging Face model all-MiniLM-L6-v2.

# Define the model name and arguments for the Hugging Face model.
model_name = 'sentence-transformers/all-MiniLM-L6-v2'
model_args = {
    'transformer_class': 'AutoModelForSequenceClassification',
    'task': 'text-similarity'
}

# Create a TeradataAI object with the specified model.
llm = TeradataAI(api_type="hugging_face", model_name=model_name, model_args=model_args)

# Use the task API to get the similarity score.
llm.task(
    column=["employee_data", "articles"],
    data=data,
    script=sentence_similarity_script,
    libs='sentence_transformers',
    returns={
        "column1": "VARCHAR(10000)",
        "column2": "VARCHAR(10000)",
        "similarity_score": "VARCHAR(10000)"
    },
    delimiter="#"
)
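
The contents of sentence_similarity.py are not shown here. As an illustration only, a similarity score between two embedding vectors is commonly computed as their cosine similarity; the sketch below assumes that measure and is not taken from the bundled script.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy vectors; real embeddings from all-MiniLM-L6-v2 have 384 values.
    print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Scores range from -1 to 1, with values near 1 indicating vectors that point in nearly the same direction.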

License

Use of the Teradata Python package for Generative-AI is governed by the License Agreement for the Teradata Python package for Generative-AI. After installation, the LICENSE and LICENSE-3RD-PARTY.pdf files are located in the teradatagenai directory of the Python installation directory.
