
Gooder AI Package

This package provides a streamlined way to evaluate ("valuate") machine learning models on the Gooder AI platform. It exports a simple yet powerful function, valuate_model, which is designed to work seamlessly with a variety of machine learning frameworks, including scikit-learn, XGBoost, PyTorch, and CatBoost. The function does the following:

  • Valuates models with Gooder AI.
  • Validates and uploads Gooder AI configurations and test datasets for secure storage and processing.
  • Creates or updates a shared "view" on Gooder AI, allowing users to interactively visualize and analyze their model's business performance.

Installation

Install the package using pip:

pip install gooder_ai

Sample Jupyter Notebook

Running valuate_model on fraud detection models (suitable for Jupyter notebooks)


Function Parameters

valuate_model(**kwargs)

The valuate_model function takes the following input arguments:

  1. models: list[ScikitModel]

    • Machine learning models that follow the ScikitModel protocol.
    • Each model must have a scoring function (e.g., predict_proba) that generates probability scores for classification.
    • Each model must also have a classes_ attribute representing the possible target classes.
  2. x_data: ndarray | DataFrame | list[str | int | float] | spmatrix

    • A dataset containing the input features for evaluation.
    • This is the dataset that will be fed into the model for prediction.
  3. y: ndarray | DataFrame | list[str | int | float] | spmatrix

    • A dataset representing the true target values (labels) corresponding to x_data.
    • This helps in validating model performance.
  4. config: dict

    • A dictionary containing model configuration settings.
    • To load a starter configuration provided by Gooder AI, you can use the following example:
      from gooder_ai.configs import load_starter_config
      config = load_starter_config()
      
  5. view_meta: ViewMeta

    • A dictionary containing metadata about the "view" (shared result visualization) being created or updated.
    • Structure:
      {
          "mode": Optional["public" | "protected" | "private"],  # Access control
          "view_id": Optional[str],  # ID of an existing view (if updating)
          "dataset_name": Optional[str]  # Name of the dataset (defaults to timestamp)
      }
      
    • If view_id is provided, an existing view is updated; otherwise, a new one is created.
  6. auth_credentials: Credentials

    • A dictionary with user authentication details for the Gooder AI platform.
    • Structure:
      {
          "email": str,  # User's email
          "password": str  # User's password
      }
      
    • These credentials are used to authenticate when uploading the dataset and configuration. They are optional if upload_data_to_gooder = False and upload_config_to_gooder = False.
  7. model_names: list[str]

    • Used to label the score columns in the output dataset and configuration.
    • If not provided, default names are generated from the model class names.
    • Example: for two models that output binary classification scores, columns named "model1_score" and "model2_score" will be created.
  8. scorer_names: list[str]

    • Used to specify the scorer function for each model.
    • If not provided, the predict_proba function is used by default.
  9. column_names: ColumnNames = {} (optional)

    • A dictionary specifying the column names for the dataset and scores.
    • Structure:
      {
          "dataset_column_names": Optional[list[str]],  # Feature names
          "dependent_variable_name": Optional[str]  # Name of the target variable
      }
      
  10. included_columns: list[str] = [] (optional)

  • An optional list of names specifying which columns to include in the dataset before valuating your models on the Gooder platform. If left unspecified, all columns are included, which generally results in an unnecessarily large data file, since Gooder typically uses only a small number of columns. Even when specified, the model score columns and the dependent variable are always included.
  11. upload_data_to_gooder: Boolean = True (optional)
  • Controls whether the dataset is uploaded to the Gooder AI platform.
  12. upload_config_to_gooder: Boolean = True (optional)
  • Controls whether the config is uploaded to the Gooder AI platform.
  13. aws_variables: AWSVariables = {} (optional)
  • A dictionary containing AWS-related variables.
  • Used for authentication and file uploads.
  • Structure:
    {
        "api_url": Optional[str],
        "app_client_id": Optional[str],
        "identity_pool_id": Optional[str],
        "user_pool_id": Optional[str],
        "bucket_name": Optional[str],
        "base_url": Optional[str],
        "validation_api_url": Optional[str]
    }
    
  • Defaults to global values if not provided.
  14. max_size_uploaded_data: int = 10 (optional)
  • Defines the maximum allowed memory size (in megabytes, MB) for the combined dataset when uploading to Gooder AI.
  • Before uploading, the function calculates the memory usage of the full dataset.
  • If the dataset exceeds this threshold and upload_data_to_gooder is True, the operation is aborted and an exception is raised.
  • This is a safety limit to prevent large uploads that could impact performance or exceed platform limits.
  • Default value is 10MB, which is suitable for most use cases.
  • Increase this value if you need to work with larger datasets, but be aware of potential performance implications.
  15. max_size_saved_data: int = 1000 (optional)
  • Defines the maximum allowed memory size (in megabytes, MB) for the combined dataset when saving locally.
  • Before saving, the function calculates the memory usage of the full dataset.
  • If the dataset exceeds this threshold and upload_data_to_gooder is False, the operation is aborted and an exception is raised.
  • This is a safety limit to prevent excessively large local files that could impact system performance.
  • Default value is 1000MB (approximately 1GB), which allows for much larger local datasets compared to uploads.
  • Increase this value if you need to work with very large datasets locally, but be aware of system memory constraints.
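Putting the parameters above together, a sketch of a call might look like the following. All names and values here are illustrative, not taken from the package; the call itself is shown commented out because it requires trained models and a Gooder AI account:

```python
# Illustrative parameter values; every name below is hypothetical.
view_meta = {
    "mode": "private",          # access control for the shared view
    "view_id": None,            # None -> a new view is created
    "dataset_name": "fraud_test_set",
}

auth_credentials = {
    "email": "user@example.com",
    "password": "your-password",
}

column_names = {
    "dataset_column_names": ["amount", "age", "n_transactions"],
    "dependent_variable_name": "is_fraud",
}

kwargs = dict(
    view_meta=view_meta,
    auth_credentials=auth_credentials,
    model_names=["xgb_model", "logreg_model"],
    column_names=column_names,
    included_columns=["amount"],   # score columns and the target are always kept
    upload_data_to_gooder=True,
    upload_config_to_gooder=True,
    max_size_uploaded_data=10,     # MB
)

# The actual call, with trained models and a starter config:
# from gooder_ai import valuate_model
# from gooder_ai.configs import load_starter_config
# result = valuate_model(
#     models=[model_a, model_b], x_data=X_test, y=y_test,
#     config=load_starter_config(), **kwargs,
# )
```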

Summary

  • The function takes one or more models, a dataset, user credentials, and configuration details.
  • If either the data or the config is set to be uploaded, it authenticates with the Gooder AI platform, validates the config, uploads the files, and then either creates a new shared view or updates an existing one.
  • Finally, it returns the view ID and URL, allowing users to access model evaluation results.

Logging configuration

To configure logging in your notebook, add the following code:

import logging
import sys

logging.basicConfig(
    format='%(asctime)s | %(levelname)s : %(message)s',
    level=logging.INFO,
    stream=sys.stdout
)

Log Levels

The logger supports three levels of verbosity:

  1. ERROR: Only prints error logs
  2. INFO: Prints information logs and error logs (default)
  3. DEBUG: Verbose mode that prints all logs, including warnings

By default, sample notebooks are configured to use the INFO level. You can adjust this level based on your requirements.
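For example, to switch a notebook between verbosity levels after the configuration above, adjust the root logger level using the standard logging API:

```python
import logging
import sys

# Same configuration as shown above, starting at INFO.
logging.basicConfig(
    format='%(asctime)s | %(levelname)s : %(message)s',
    level=logging.INFO,
    stream=sys.stdout,
)

root = logging.getLogger()
root.setLevel(logging.DEBUG)    # verbose mode: print all logs, including warnings
# root.setLevel(logging.ERROR)  # or: only print error logs
```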


Custom Model Wrappers

To work with PyTorch models, the Gooder AI package provides a ModelWrapper abstract base class, which lets valuate_model handle PyTorch models internally the same way it handles scikit-learn and XGBoost models.

Using ModelWrapper

The ModelWrapper class provides a standardized interface that any model can implement:

from gooder_ai import ModelWrapper

class YourCustomModel(ModelWrapper):
    def predict_proba(self, x):
        """Must return probability predictions as a numpy array."""
        ...

    @property
    def classes_(self):
        """Must return an array of class labels."""
        ...
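As a concrete illustration, here is a toy wrapper that satisfies the interface. The ABC below is a local stand-in so the sketch runs standalone; in practice you would subclass gooder_ai's ModelWrapper, and predict_proba would call your trained PyTorch model:

```python
import numpy as np
from abc import ABC, abstractmethod

# Local stand-in mirroring the interface; in practice:
# from gooder_ai import ModelWrapper
class ModelWrapper(ABC):
    @abstractmethod
    def predict_proba(self, x): ...
    @property
    @abstractmethod
    def classes_(self): ...

class ConstantModelWrapper(ModelWrapper):
    """Toy model: returns the same fixed class probabilities for every row."""

    def __init__(self, probs, classes):
        self._probs = np.asarray(probs, dtype=float)
        self._classes = np.asarray(classes)

    def predict_proba(self, x):
        # One row of probabilities per input row, as a numpy array.
        return np.tile(self._probs, (len(x), 1))

    @property
    def classes_(self):
        return self._classes

wrapper = ConstantModelWrapper([0.9, 0.1], classes=[0, 1])
proba = wrapper.predict_proba([[1.0], [2.0], [3.0]])
```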

Sample Workbook: Using valuate_model with PyTorch Models


Common Issues

  1. Mismatch in column names: Ensure that the number of column names matches the dataset shape.
  2. Invalid model type: Ensure that the model conforms to the ScikitModel or XGBoost interface and implements a scoring function (e.g., predict_proba).
  3. Authentication failure: Double-check credentials and the Gooder AI endpoint URL.
  4. Dataset size limits: If you encounter size-related errors, adjust the max_size_uploaded_data or max_size_saved_data parameters.
  5. Model naming issues: Ensure that the model_names list has the same length as the models list; otherwise, default names are generated.

Running within Databricks

1. Sample notebook for Databricks

  • This version is ready for use in Databricks and does not contain any %pip install commands.
  • The %pip install commands are removed because:
    • They can cause cold start issues in Databricks
    • They may conflict with cluster-level package management
    • They can lead to inconsistent environments across users
    • Databricks best practices recommend managing dependencies at the cluster level

2. Setting Up the Databricks Environment

  • Use Environment version 2
  • Dependencies:
    • Ensure the following packages are added to your Databricks cluster environment (via the UI, not by the notebook):
      • gooder_ai
      • seaborn
      • matplotlib
      • xgboost
      • numpy
      • pandas
      • scikit-learn
  • Cluster State:
    • Wait for the cluster to show a "Connected" state before running any cells.

3. Handling Large Data Files

  • The Databricks workspace has a 500 MB file size limit for uploads and downloads.
  • For datasets larger than 500 MB:
    • Split them into multiple smaller files.
    • Upload the split files to Databricks.
    • Add a cell in your notebook to combine the files into a single DataFrame.
    • Pass the combined DataFrame to valuate_model.
    • Operations will fail if they attempt to:
      • Create a file exceeding 500 MB in the workspace
      • Upload a file larger than 500 MB to the workspace
      • Download a file larger than 500 MB into the workspace.
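The split-and-recombine steps above can be sketched as follows (file names, part count, and the toy dataset are illustrative; pandas is assumed to be available, as in the cluster setup above):

```python
import math
import tempfile
from pathlib import Path

import pandas as pd

# Toy stand-in for a large dataset; in practice this is your full test set.
df = pd.DataFrame({"feature": range(10), "label": [0, 1] * 5})

out_dir = Path(tempfile.mkdtemp())
n_parts = 3  # choose so each part file stays under the 500 MB workspace limit
rows_per_part = math.ceil(len(df) / n_parts)

# Split into part files (each would be uploaded to Databricks separately).
for i in range(n_parts):
    chunk = df.iloc[i * rows_per_part:(i + 1) * rows_per_part]
    chunk.to_csv(out_dir / f"data_part{i}.csv", index=False)

# In the notebook: combine the uploaded parts back into one DataFrame.
parts = sorted(out_dir.glob("data_part*.csv"))
combined = pd.concat((pd.read_csv(p) for p in parts), ignore_index=True)
# combined can now be passed to valuate_model as x_data / y.
```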

Important:

After successful execution, the notebook will provide you with:

  • A configuration file to be used with Gooder AI
  • A CSV file containing the scored test data

If you have not instructed valuate_model to pass these files through the cloud to the Gooder AI application, you must download them locally and then upload them to the Gooder AI application to visualize the business performance of your models.

Note

  • valuate_model can be configured to reduce the size of the output CSV file by using the included_columns parameter to specify which columns to include.
