a library for automated table normalization

Project description



AutoNormalize is a Python library for automated data table normalization, intended for use with Featuretools. AutoNormalize lets you build an EntitySet from a single denormalized table and generate features for machine learning.

Before AutoNormalize:

After AutoNormalize:


Install

pip install autonormalize

Uninstall

pip uninstall autonormalize

API Reference

auto_entityset(df, accuracy=0.98, index=None, name=None, time_index=None)

Creates a normalized entityset from a dataframe.


Arguments:

df (pd.DataFrame) : the dataframe containing the data

accuracy (0 < float <= 1.00; default = 0.98) : the accuracy threshold required to conclude that a dependency holds (e.g. with accuracy = 0.98, the dependency LHS --> RHS must hold in at least 98% of the rows)

index (str, optional) : name of the column intended as the index of df

name (str, optional) : name of the created EntitySet

time_index (str, optional) : name of the time column in the dataframe


Returns:

entityset (ft.EntitySet) : the created EntitySet
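
The accuracy threshold can be illustrated with a minimal sketch in plain Python. It is not AutoNormalize's implementation: the helper `dependency_accuracy` and the sample rows are hypothetical, and the sketch assumes that "accuracy" means the fraction of rows that agree with the most common RHS value for their LHS value.

```python
from collections import Counter, defaultdict


def dependency_accuracy(rows, lhs, rhs):
    """Return the fraction of rows consistent with the dependency lhs -> rhs.

    Hypothetical helper: for each distinct LHS value, the most common RHS
    value is treated as correct, and every other row counts as a violation.
    """
    if not rows:
        return 1.0  # an empty table violates nothing
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[col] for col in lhs)
        groups[key][row[rhs]] += 1
    consistent = sum(c.most_common(1)[0][1] for c in groups.values())
    return consistent / len(rows)


rows = [
    {"zip": "02139", "city": "Cambridge"},
    {"zip": "02139", "city": "Cambridge"},
    {"zip": "02139", "city": "Cambrige"},  # a typo breaks the dependency
    {"zip": "10001", "city": "New York"},
]

score = dependency_accuracy(rows, ["zip"], "city")
print(score)          # 0.75
print(score >= 0.98)  # False: rejected at the default threshold
print(score >= 0.75)  # True: accepted at a looser threshold
```

Under this reading, lowering accuracy makes dependency detection tolerant of noisy rows, at the cost of accepting dependencies that do not strictly hold.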

find_dependencies(df, accuracy=0.98, index=None)

Finds dependencies within dataframe with the DFD search algorithm.


Returns:

dependencies (Dependencies) : the dependencies found in the data within the constraints provided
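
To give a sense of what a dependency search produces, here is a deliberately naive sketch. It is not the DFD algorithm and does not use the library: it exhaustively checks every ordered pair of columns for an exact single-column dependency. The function names and sample data are hypothetical.

```python
from itertools import permutations


def holds(rows, lhs, rhs):
    """True if every value of column lhs maps to exactly one value of rhs."""
    seen = {}
    for row in rows:
        # Remember the first RHS value seen for this LHS value and
        # fail as soon as a later row disagrees with it.
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


def naive_dependencies(rows):
    """Brute-force search over all ordered column pairs (no accuracy slack)."""
    columns = list(rows[0])
    return [(lhs, rhs) for lhs, rhs in permutations(columns, 2)
            if holds(rows, lhs, rhs)]


rows = [
    {"order_id": 1, "customer": "ann", "city": "Boston"},
    {"order_id": 2, "customer": "bob", "city": "Denver"},
    {"order_id": 3, "customer": "ann", "city": "Boston"},
    {"order_id": 4, "customer": "cat", "city": "Boston"},
]

for lhs, rhs in naive_dependencies(rows):
    print(f"{lhs} --> {rhs}")
# order_id --> customer
# order_id --> city
# customer --> city
```

A pairwise scan like this ignores multi-column left-hand sides and scales poorly with the number of columns, which is the kind of cost a dedicated search algorithm is meant to avoid.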

normalize_dataframe(df, dependencies)

Normalizes dataframe based on the dependencies given.


Returns:

new_dfs (list[pd.DataFrame]) : list of new dataframes
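
The idea of normalizing on a dependency can be sketched without the library. The hypothetical helper below splits a list of row dictionaries on a single dependency key --> dependents. It only illustrates the general technique and makes no claim about how normalize_dataframe chooses or orders its splits.

```python
def split_on_dependency(rows, key, dependents):
    """Split rows into two tables for the dependency key -> dependents.

    Returns (remaining, extracted): `remaining` keeps every row but drops
    the dependent columns; `extracted` has one row per distinct key value.
    """
    extracted = {}
    for row in rows:
        # The first occurrence of each key value defines its extracted row.
        extracted.setdefault(
            row[key], {col: row[col] for col in [key, *dependents]}
        )
    remaining = [
        {col: val for col, val in row.items() if col not in dependents}
        for row in rows
    ]
    return remaining, list(extracted.values())


rows = [
    {"order_id": 1, "customer": "ann", "city": "Boston"},
    {"order_id": 2, "customer": "bob", "city": "Denver"},
    {"order_id": 3, "customer": "ann", "city": "Boston"},
]

orders, customers = split_on_dependency(rows, "customer", ["city"])
print(orders)
# [{'order_id': 1, 'customer': 'ann'}, {'order_id': 2, 'customer': 'bob'},
#  {'order_id': 3, 'customer': 'ann'}]
print(customers)
# [{'customer': 'ann', 'city': 'Boston'}, {'customer': 'bob', 'city': 'Denver'}]
```

The key column stays in both tables, so the original rows can be recovered by joining them on it.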

make_entityset(df, dependencies, name=None, time_index=None)

Creates a normalized EntitySet from dataframe based on the dependencies given.


Returns:

entityset (ft.EntitySet) : the created EntitySet

Feature Labs


AutoNormalize is an open source project created by Feature Labs. To see the other open source projects we're working on visit Feature Labs Open Source. If building impactful data science pipelines is important to you or your business, please get in touch.

Download files

Files for autonormalize, version 0.0.0

Filename                              Size      File type  Python version
autonormalize-0.0.0-py3-none-any.whl  611.9 kB  Wheel      py3
autonormalize-0.0.0.tar.gz            585.0 kB  Source     None
