
Context Aware Automated Feature Generators with LLMs


CAAFG - Context Aware Automated Feature Generators

A collection of different context aware automated feature generators.

Usage

Begin by installing the package:

pip3 install caafg

To use this package, choose a generator type; the available types are listed under Generators. Each generator requires a server backend hosting a language model that can be queried. Connectors to the different types of language models are exposed through this project but are mainly implemented in the remoteinference package. For a list of available models, see the linked model reference.

Each generator implements the AbstractGenerator interface, which provides the basic functionality of a generator. As an example, first initialize the generator with a corresponding model backend:

import os

from caafg.generators import BlueprintGenerator
from caafg.models import OpenAILLM

model = OpenAILLM(
    api_key=os.environ.get('OPENAI_API_KEY'),
    model_type='gpt-4o-mini'
    )

generator = BlueprintGenerator(model=model)
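
Because os.environ.get returns None when the variable is missing, a misspelled or unset variable name only surfaces later as an authentication error. The following is a minimal sketch of a helper that fails early instead; read_api_key is a hypothetical name used for illustration and is not part of caafg.

```python
import os


def read_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Hypothetical helper (not part of caafg): raises a RuntimeError with a
    readable message when the variable is unset or empty.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return key
```

The returned value could then be passed as the api_key argument when constructing the model.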

For each dataset, the generator requires the training data, because it includes samples of this data in the instruction prompt, as well as a dataset name and, if available, a dataset description. All of this information is stored in a Dataset object. The model can then be asked to generate n_features features simultaneously:

from caafg.dataset import Dataset

generator = BlueprintGenerator(model=model)

ds = Dataset(
    X=train_X,
    y=train_y,
    dataset_name="Dataset Name",
    dataset_description="Some Description"
    )

features = generator.ask(
    dataset=ds,
    n_features=5,
)

The ask method of the generator will return a dictionary containing all the information that the language model provided for the given generator type. For this example the result will look similar to this:

{
    'blueprint_feature_0': {
        'name': 'f5',
        'operator': 'Add',
        'features': ['1', '2'],
        'features_combination': 'Add(1, 2)',
        'description': 'Some Description',
        'reasoning': 'Some Reasoning'
    },
    'blueprint_feature_1': {
        ...
    },
    ...
}
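
Since the result is an ordinary dictionary keyed by feature identifier, it can be inspected before anything is applied to the data. The sketch below assumes only the keys shown in the example above; the summarize helper is hypothetical and the values are placeholders, not real model output.

```python
# Placeholder result, shaped like the dictionary shown above.
features = {
    "blueprint_feature_0": {
        "name": "f5",
        "operator": "Add",
        "features": ["1", "2"],
        "features_combination": "Add(1, 2)",
        "description": "Some Description",
        "reasoning": "Some Reasoning",
    },
}


def summarize(features: dict) -> list:
    """Build one readable line per proposed feature (hypothetical helper)."""
    lines = []
    for key, spec in features.items():
        lines.append(
            f"{key}: {spec['name']} = {spec['features_combination']} "
            f"({spec['description']})"
        )
    return lines


for line in summarize(features):
    print(line)
```

A review step like this makes it easy to drop proposals whose reasoning looks implausible before calling transform.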

The proposed features can then be applied to the training and test sets by calling the generator's transform method:

train_X, test_X = generator.transform(
    train_X,
    test_X,
    features
    )

Generators

Blueprint Generator

This generator proposes each new feature as a combination of existing features and an operator that is applied to them to create the new feature. Usage:

from caafg.generators import BlueprintGenerator

generator = BlueprintGenerator(model=model)  # model: a backend such as the OpenAILLM instance created under Usage
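
To make the blueprint idea concrete, here is a minimal sketch of how an operator and a list of existing features could be combined into a new column. This is not caafg's implementation: apply_blueprint and the OPERATORS table are hypothetical, only the 'Add' operator name appears in this README, and plain Python lists stand in for the real training data.

```python
import operator

# Hypothetical operator table. Only "Add" is taken from the README example;
# the other names are assumptions for illustration.
OPERATORS = {
    "Add": operator.add,
    "Subtract": operator.sub,
    "Multiply": operator.mul,
}


def apply_blueprint(columns: dict, spec: dict) -> list:
    """Compute a new column by applying a binary operator element-wise."""
    func = OPERATORS[spec["operator"]]
    left, right = (columns[name] for name in spec["features"])
    return [func(a, b) for a, b in zip(left, right)]


# Two existing columns, named "1" and "2" as in the example output.
columns = {"1": [1.0, 2.0, 3.0], "2": [10.0, 20.0, 30.0]}
spec = {"name": "f5", "operator": "Add", "features": ["1", "2"]}

columns[spec["name"]] = apply_blueprint(columns, spec)
print(columns["f5"])  # [11.0, 22.0, 33.0]
```

In practice the generator's transform method performs this step on the training and test sets, so a sketch like this is mainly useful for understanding or checking a single proposal by hand.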

Models



