
Feature Store for the Daipe AI Platform


Feature Store bundle

This package is distributed under the "DataSentics SW packages Terms of Use." See license

The Feature Store bundle lets you store features together with their metadata, such as a description and a category.

Installation

poetry add feature-store-bundle

Getting started

  1. Define an entity and a custom feature decorator
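from pyspark.sql import types as t
from daipecore.decorator.DecoratedDecorator import DecoratedDecorator
from featurestorebundle.entity.Entity import Entity
from featurestorebundle.feature.FeaturesStorage import FeaturesStorage
from featurestorebundle.notebook.decorator.feature import feature

# The entity names what the features describe, plus its ID and time columns.
entity = Entity(
    name="client",
    id_column="UserName",
    id_column_type=t.StringType(),
    time_column="run_date",
    time_column_type=t.DateType(),
)

# Shared storage that collects the results of all decorated feature functions.
# Assumption: the original snippet used features_storage without defining it;
# FeaturesStorage is assumed here to take the entity as its only argument.
features_storage = FeaturesStorage(entity)

# Custom decorator that binds every "client" feature to the entity and storage above.
@DecoratedDecorator
class client_feature(feature):  # noqa N081
    def __init__(self, *args, category=None):
        super().__init__(*args, entity=entity, category=category, features_storage=features_storage)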
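
The pattern behind this step can be illustrated without Spark or Daipe. The following is a minimal plain-Python sketch of a decorator that records feature metadata and results in a shared storage object. It is not the bundle's implementation: `SketchStorage` and `make_feature_decorator` are hypothetical names, and rows are plain dictionaries instead of DataFrames.

```python
import functools
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class SketchStorage:
    """Hypothetical stand-in for a features storage."""

    metadata: List[Dict[str, Optional[str]]] = field(default_factory=list)
    results: List[List[dict]] = field(default_factory=list)


def make_feature_decorator(entity_name: str, storage: SketchStorage) -> Callable:
    """Build a decorator factory bound to one entity and one storage."""

    def feature(*descriptions: Tuple[str, str], category: Optional[str] = None) -> Callable:
        def wrap(func: Callable[..., List[dict]]) -> Callable[..., List[dict]]:
            @functools.wraps(func)
            def wrapper(*args, **kwargs) -> List[dict]:
                rows = func(*args, **kwargs)
                # Record one metadata entry per declared (name, description) pair.
                for name, description in descriptions:
                    storage.metadata.append(
                        {
                            "entity": entity_name,
                            "name": name,
                            "description": description,
                            "category": category,
                        }
                    )
                # Keep the computed rows so they can be written later in one go.
                storage.results.append(rows)
                return rows

            return wrapper

        return wrap

    return feature


storage = SketchStorage()
client_feature = make_feature_decorator("client", storage)


@client_feature(("Age", "Client's age"), ("Gender", "Client's gender"), category="personal")
def client_personal_features() -> List[dict]:
    # Illustrative data only.
    return [{"UserName": "alice", "Age": 34, "Gender": "F"}]


rows = client_personal_features()
```

After the call, `storage.metadata` holds one entry per declared feature and `storage.results` holds the computed rows, which is the information a later write step needs.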
  2. Use the feature decorator to save features as you create them
import datetime as dt

from pyspark.sql import functions as f
from pyspark.sql import DataFrame
from datalakebundle.imports import transformation, read_table

# Assumption: `today` was undefined in the original snippet; here it is the current date,
# which matches the entity's DateType time column.
today = dt.date.today()

@transformation(read_table("silver.tbl_loans"), display=True)
@client_feature(
    # One (feature name, description) pair per feature column returned below.
    ("Age", "Client's age"),
    ("Gender", "Client's gender"),
    ("WorkExperience", "Client's work experience"),
    category="personal",
)
def client_personal_features(df: DataFrame):
    return (
        df.select("UserName", "Age", "Gender", "WorkExperience")
        # One row per client, keyed by the entity's id column.
        .groupBy("UserName")
        .agg(
            f.max("Age").alias("Age"),
            f.first("Gender").alias("Gender"),
            f.first("WorkExperience").alias("WorkExperience"),
        )
        # Add the entity's time column.
        .withColumn("run_date", f.lit(today))
    )
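  3. Write/merge all features in one go
from datalakebundle.imports import notebook_function
from featurestorebundle.delta.DeltaWriter import DeltaWriter

# notebook_function must be applied as a decorator (the original snippet was missing the "@").
@notebook_function()
def write_features(writer: DeltaWriter):
    # Write everything collected in features_storage in a single operation.
    writer.write_latest(features_storage)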
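
To make the idea of writing or merging "in one go" concrete, here is a plain-Python sketch of the general technique: join the collected feature results on the entity key, then upsert the joined rows into an existing table. This is an illustration under assumptions, not how `DeltaWriter` is implemented. The helper names `join_results` and `upsert` and the `LoanCount` feature are hypothetical, and tables are modelled as dictionaries keyed by `(id, run date)`.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str]  # (entity id, run date as an ISO string)


def join_results(results: Iterable[List[dict]], id_column: str, time_column: str) -> Dict[Key, dict]:
    """Combine rows from several feature results into one wide row per key."""
    joined: Dict[Key, dict] = {}
    for rows in results:
        for row in rows:
            key = (row[id_column], row[time_column])
            joined.setdefault(key, {}).update(row)
    return joined


def upsert(table: Dict[Key, dict], updates: Dict[Key, dict]) -> Dict[Key, dict]:
    """Return a new table: matching keys are updated, new keys are inserted."""
    result = {key: dict(row) for key, row in table.items()}
    for key, row in updates.items():
        result.setdefault(key, {}).update(row)
    return result


# Illustrative data only.
personal = [{"UserName": "alice", "run_date": "2021-01-01", "Age": 34}]
loans = [
    {"UserName": "alice", "run_date": "2021-01-01", "LoanCount": 2},
    {"UserName": "bob", "run_date": "2021-01-01", "LoanCount": 1},
]
existing = {
    ("alice", "2021-01-01"): {"UserName": "alice", "run_date": "2021-01-01", "Age": 33},
}

updated = upsert(existing, join_results([personal, loans], "UserName", "run_date"))
```

Collecting all results first and merging once means each key is touched a single time, rather than once per feature function.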


Built Distribution

feature_store_bundle-1.2.0-py3-none-any.whl (22.4 kB)
