A data-designer plugin for creating columns via custom Python functions
Data Designer Lambda Column Plugin
A plugin for data-designer that allows you to define columns using custom Python functions. This enables you to inject logic, transformations, and computations directly into your data generation pipeline.
Features
- Row-wise Operations: Apply a function to each row (similar to pandas.DataFrame.apply(axis=1)).
- Full DataFrame Operations: Apply transformations to the entire DataFrame (e.g., exploding lists, aggregations, filtering, pivoting).
- Dependency Management: Explicitly declare required columns to ensure execution order.
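To illustrate why declaring required columns matters, here is a minimal sketch of how declared dependencies can determine a generation order. It uses the standard-library graphlib module and is not the plugin's or data-designer's actual scheduling code; the column names and the generation_order helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical declarations: column name -> columns it requires.
required = {
    "quantity": [],
    "price": [],
    "total_cost": ["quantity", "price"],
    "total_with_tax": ["total_cost"],
}


def generation_order(required_cols):
    """Return column names ordered so each column follows its requirements."""
    return list(TopologicalSorter(required_cols).static_order())


order = generation_order(required)

# A circular declaration cannot be ordered and is reported as an error.
try:
    generation_order({"a": ["b"], "b": ["a"]})
    cycle_detected = False
except CycleError:
    cycle_detected = True
```

In this sketch, total_cost always comes after quantity and price, and a circular declaration is rejected instead of silently producing a wrong order.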
Installation
This plugin is designed to be used with data-designer.
pip install data-designer-lambda-column
Usage
Basic Row-wise Transformation
Use operation_type="row" (default) to calculate values based on other columns in the same row.
from data_designer_lambda_column.plugin import LambdaColumnConfig
from data_designer.essentials import DataDesignerConfigBuilder, SamplerColumnConfig, CategorySamplerParams
builder = DataDesignerConfigBuilder()
# 1. Add some base data
builder.add_column(
    SamplerColumnConfig(
        name="quantity",
        sampler_type="category",
        params=CategorySamplerParams(values=[10, 20, 30]),
    )
)
builder.add_column(
    SamplerColumnConfig(
        name="price",
        sampler_type="category",
        params=CategorySamplerParams(values=[5.0, 10.0]),
    )
)
# 2. Add a computed column using a lambda function
builder.add_column(
    LambdaColumnConfig(
        name="total_cost",
        required_cols=["quantity", "price"],
        operation_type="row",  # default
        column_function=lambda row: row["quantity"] * row["price"],
    )
)
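Because a row-wise function only looks values up by column name, it can be checked in isolation before it is handed to the builder. The following is a minimal sketch, assuming plain dicts as stand-ins for the pandas Series that the plugin passes in "row" mode; total_cost and sample_rows are illustrative names, not part of the plugin.

```python
def total_cost(row):
    """Row-wise column function: reads 'quantity' and 'price' from one row."""
    return row["quantity"] * row["price"]


# Stand-in rows: dicts support the same row["column"] lookup as a Series.
sample_rows = [
    {"quantity": 10, "price": 5.0},
    {"quantity": 30, "price": 10.0},
]

results = [total_cost(row) for row in sample_rows]
```

A named function like this can be passed as column_function in place of the lambda, which keeps the logic easy to unit test.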
Advanced Full DataFrame Transformation
Use operation_type="full" when you need to change the shape of the DataFrame (e.g., explode, melt) or perform operations that require the full context.
Note: When using operation_type="full", your function receives the entire DataFrame and must return the modified DataFrame.
Warning: Operations that change the number of rows (like explode) may not work as expected in the current version due to validation checks on update records in data_designer.
from data_designer_lambda_column.plugin import LambdaColumnConfig
from data_designer.essentials import DataDesignerConfigBuilder
builder = DataDesignerConfigBuilder()

# Define a function to explode a list column
def explode_items(df):
    # Assume 'items_list' is a column containing lists of items
    # e.g., [['apple', 'banana'], ['orange']]
    # Explode the list so each item gets its own row
    # (this changes the row count; see the warning above)
    expanded_df = df.explode("items_list")
    # The column named in the config ('single_item') must exist
    # in the returned DataFrame
    expanded_df["single_item"] = expanded_df["items_list"]
    return expanded_df

builder.add_column(
    LambdaColumnConfig(
        name="single_item",
        required_cols=["items_list"],
        operation_type="full",
        column_function=explode_items,
    )
)
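Given the warning about row-count changes, one option is to fail fast with a clear message when a "full" function returns a different number of rows than it received. The sketch below is a hypothetical helper, not part of the plugin or data_designer. It relies only on len(), which for a pandas DataFrame is the number of rows; the demonstration uses plain lists so it runs without pandas.

```python
import functools


def preserve_row_count(func):
    """Wrap a full-table function and raise if it changes the row count."""

    @functools.wraps(func)
    def wrapper(df):
        result = func(df)
        if len(result) != len(df):
            raise ValueError(
                f"{func.__name__} changed the row count "
                f"from {len(df)} to {len(result)}"
            )
        return result

    return wrapper


@preserve_row_count
def keep_rows(table):
    # Returns a copy with the same number of rows.
    return list(table)


@preserve_row_count
def drop_first(table):
    # Removes a row, so the wrapper is expected to raise.
    return table[1:]


kept = keep_rows([1, 2, 3])

try:
    drop_first([1, 2, 3])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Assuming column_function accepts any callable, as the parameter table suggests, a wrapped function could be passed in the same way as an unwrapped one.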
Configuration
LambdaColumnConfig accepts the following parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | Required | The name of the column to generate. |
| column_function | callable | Required | The Python function to execute. |
| required_cols | list[str] | [] | List of column names that must exist before this column is generated. |
| operation_type | Literal["row", "full"] | "row" | Type of operation. "row" passes a Series (row) to the function. "full" passes the entire DataFrame. |
Plugin Registration
This package exposes a standard data_designer plugin entry point:
- Entry Point: data_designer.plugins
- Name: lambda-column
- Impl: data_designer_lambda_column.plugin.LambdaColumnGenerator
It will be automatically discovered by data-designer when installed in the same environment.
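To check that the entry point is visible in the current environment, the standard-library importlib.metadata module can list everything registered under a group. This is a minimal sketch; list_plugins is a hypothetical helper, and the group name is taken from the registration details above.

```python
from importlib.metadata import entry_points


def list_plugins(group="data_designer.plugins"):
    """Return {entry point name: target} for the given entry point group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a select() method.
        selected = eps.select(group=group)
    else:
        # Older versions return a dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


plugins = list_plugins()
```

If the package is installed, the returned dict should contain a lambda-column key; an empty dict means no plugin is registered under that group in this environment.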
File details
Details for the file data_designer_lambda_column-0.1.1.tar.gz.
File metadata
- Download URL: data_designer_lambda_column-0.1.1.tar.gz
- Upload date:
- Size: 151.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e2bcbd3912e2130dadc921a5a89cd3ad78237b2ec05643a1bdffc5c78be1a0fa |
| MD5 | a15ad44fb3882ed1f94e5c4c7c178101 |
| BLAKE2b-256 | 00adb66082203dd8365b9b282f994979be53941b618e49d51f0227032578cbcc |
File details
Details for the file data_designer_lambda_column-0.1.1-py3-none-any.whl.
File metadata
- Download URL: data_designer_lambda_column-0.1.1-py3-none-any.whl
- Upload date:
- Size: 4.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 414aa03db8f0e6a61ec23e42f683a2336745fb1281050af724804f075ccce755 |
| MD5 | f827b40d33b727bc90dc00994577c361 |
| BLAKE2b-256 | c33ac0ec867684db069c7a0bf01c6dde70d83b22dc4e5fe69f1d9c596a7743cf |