Creates new DataFrame columns by applying strategically selected operations.

FeaturesCreation

FeaturesCreation creates new DataFrame columns by applying operations to pairs of existing columns and keeping the combinations that a classifier ranks as most relevant to the target column. It works with pandas DataFrames and with classifiers such as LGBMClassifier.

How it works: Transformation Process

The transformation process has four steps:

  1. Instantiation and Fitting:

First, instantiate the FeaturesCreation class, for example fe_cr = FeaturesCreation(). The classifier used for selecting operations is not passed to the constructor; it is an argument of fit.

Then fit the instance to your data by calling fe_cr.fit(x, y, classifier, n_new_features), where x is the feature data (input), y is the target column (output), classifier is the chosen classifier (e.g., LGBMClassifier), and n_new_features is the number of new features to create. The call returns the selected transformations.

  2. Transformation Selection:

During fitting, FeaturesCreation uses the provided classifier to evaluate the importance of each candidate transformation and keeps the top-ranked operations.

  3. Application of Transformations:

After fitting, apply the selected transformations by calling fe_cr.apply_transformation(df, transformations), where df is the original DataFrame and transformations is the value returned by fit.

  4. Resulting DataFrame:

The apply_transformation method returns a new DataFrame containing the newly created columns. To keep the original columns alongside them, concatenate the result with the original DataFrame.
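The selection step can be pictured with a small sketch. This is not the library's implementation: the helper names candidate_features and select_top are hypothetical, only two operations are tried, and the absolute correlation with the target is used as a simple stand-in for the classifier-based importance that FeaturesCreation relies on.

```python
import itertools
import operator

import pandas as pd

# Candidate binary operations, keyed by the token used in the column names.
# Assumption: only these two are tried here; the library may try more.
OPERATIONS = {"mod": operator.mod, "truediv": operator.truediv}


def candidate_features(x: pd.DataFrame) -> pd.DataFrame:
    """Build one column per ordered (feature, operation, feature) combination."""
    columns = {}
    for left, right in itertools.permutations(x.columns, 2):
        for op_name, func in OPERATIONS.items():
            columns[f"{left}__{op_name}__{right}"] = func(x[left], x[right])
    return pd.DataFrame(columns, index=x.index)


def select_top(candidates: pd.DataFrame, y: pd.Series, n_new_features: int) -> list:
    """Rank candidates by |correlation with y| and keep the best ones.

    The correlation score is a stand-in for classifier-based importance.
    """
    scores = candidates.corrwith(y).abs().dropna()
    return scores.sort_values(ascending=False).head(n_new_features).index.tolist()


# Tiny made-up dataset, for illustration only.
x = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.0, 1.0, 4.0, 2.0, 3.0, 1.5],
})
y = pd.Series([0, 1, 0, 1, 1, 1])

candidates = candidate_features(x)
selected = select_top(candidates, y, n_new_features=2)
print(selected)
```

The point of the sketch is the shape of the process: generate many candidate columns, score each one against the target, and keep only the requested number.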

DataFrame Before Transformations

Consider the original DataFrame as follows:

   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2

DataFrame After Transformations

Now, let's apply the transformations to the original DataFrame. The resulting DataFrame contains only the newly created columns, one per selected operation:

   sepal length (cm)__mod__petal length (cm)  sepal length (cm)__truediv__petal length (cm)  sepal width (cm)__truediv__petal width (cm)
0                                        0.9                                       3.642857                                         17.5
1                                        0.7                                       3.500000                                         15.0
2                                        0.8                                       3.615385                                         16.0
3                                        0.1                                       3.066667                                         15.5
4                                        0.8                                       3.571429                                         18.0

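The new columns are named in the format "feature1__operation__feature2", where operation is the name of the operation applied (for example, mod for the remainder and truediv for division). Each column contains the values obtained by applying that operation to the two original columns.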
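As an illustration of the naming format, the sketch below recomputes the three columns shown above from the original values. It assumes that the operation token maps to the function of the same name in Python's standard operator module; build_column is a hypothetical helper, not part of FeaturesCreation.

```python
import operator

import pandas as pd

# The five rows from the "before" table.
df = pd.DataFrame({
    "sepal length (cm)": [5.1, 4.9, 4.7, 4.6, 5.0],
    "sepal width (cm)": [3.5, 3.0, 3.2, 3.1, 3.6],
    "petal length (cm)": [1.4, 1.4, 1.3, 1.5, 1.4],
    "petal width (cm)": [0.2, 0.2, 0.2, 0.2, 0.2],
})

# The column names from the "after" table.
names = [
    "sepal length (cm)__mod__petal length (cm)",
    "sepal length (cm)__truediv__petal length (cm)",
    "sepal width (cm)__truediv__petal width (cm)",
]


def build_column(frame: pd.DataFrame, name: str) -> pd.Series:
    """Split 'feature1__operation__feature2' and apply the operation."""
    left, op_name, right = name.split("__")
    func = getattr(operator, op_name)  # assumed mapping, e.g. "mod" -> operator.mod
    return func(frame[left], frame[right])


new_columns = pd.DataFrame({name: build_column(df, name) for name in names})
print(new_columns.round(6))
```

Rounded to six decimals, the result matches the table above, which suggests that mod is the remainder and truediv is ordinary division of the first column by the second.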

Examples

Examples can be found in examples/.

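# Imports (the FeaturesCreation import path is an assumption based on the package name)
import pandas as pd
from lightgbm import LGBMClassifier
from features_creation import FeaturesCreation

# df is assumed to be a pandas DataFrame that includes the target column,
# and target_column is assumed to hold the name of that column

# Instantiate the FeaturesCreation class and the classifier
fe_cr = FeaturesCreation()
classifier = LGBMClassifier(verbose=-1)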

# Define the number of new features to create
n_new_features = 3

# Separate the features (X) and the target column (y)
x, y = df.drop(columns=[target_column]), df[target_column]

# Create new transformations using FeaturesCreation.fit()
transformations = fe_cr.fit(x, y, classifier, n_new_features)

# Apply the transformations to the DataFrame using FeaturesCreation.apply_transformation()
transformed_df = fe_cr.apply_transformation(df, transformations)

# Concatenate the transformed DataFrame with the original DataFrame
transformed_df = pd.concat([df, transformed_df], axis=1)
