A simplified version of featuretools for Spark

featuretoolsOnSpark

Featuretools is a Python library for automated feature engineering.

This repo is a simplified version of featuretools that reuses its automatic feature generation framework. Instead of the complex back-end architecture of featuretools, it mainly uses Spark DataFrames to make feature generation faster (a speed-up of 10x or more).

Installation

Install with pip

pip install featuretoolsOnSpark

Install from source

git clone https://github.com/giantcroc/featuretoolsOnSpark.git
cd featuretoolsOnSpark
python setup.py install

Example

Below is an example of how to use the APIs of this repo. The dataset comes from the Kaggle competition Home-Credit-Default-Risk. The relationships between the tables are shown in the picture below.

[Picture: relationships between the Home Credit tables]

First, make sure that all the CSV files you need have been loaded as Spark DataFrames.

1. Create Spark Session

>> from pyspark.sql import SparkSession

>> spark = SparkSession \
   	.builder \
   	.appName("home-credit") \
   	.enableHiveSupport()\
   	.getOrCreate()

2. Get Spark DataFrame

>> app_train = spark.sql(''' select * from home_credit.app_train ''')

>> bureau = spark.sql(''' select * from home_credit.bureau ''')

>> bureau_balance = spark.sql(''' select * from home_credit.bureau_balance ''')

>> cash = spark.sql(''' select * from home_credit.cash ''')

>> credit = spark.sql(''' select * from home_credit.credit ''')

>> installments = spark.sql(''' select * from home_credit.installments ''')

>> previous = spark.sql(''' select * from home_credit.previous ''')

3. Create TableSet

>> import featuretoolsOnSpark as fts

>> ts = fts.TableSet("home_credit",no_change_columns=["SK_ID_PREV","SK_ID_CURR","SK_ID_BUREAU"],verbose=False)

4. Create Tables From Spark DataFrame

>> ts.table_from_dataframe(table_id="bureau_balance",dataframe=bureau_balance,index='bureau_balance_id',make_index = True)

>> ts.table_from_dataframe(table_id="app_train",dataframe=app_train,index='SK_ID_CURR')

>> ts.table_from_dataframe(table_id="bureau",dataframe=bureau,index='SK_ID_BUREAU')

>> ts.table_from_dataframe(table_id="cash",dataframe=cash,index='cash_id',make_index = True)

>> ts.table_from_dataframe(table_id="credit",dataframe=credit,index='credit_id',make_index = True)

>> ts.table_from_dataframe(table_id="installments",dataframe=installments,index='installments_id',make_index = True)

>> ts.table_from_dataframe(table_id="previous",dataframe=previous,index='SK_ID_PREV')
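The calls above pass `make_index = True` for bureau_balance, cash, credit and installments, which are the tables registered with a new id column name rather than an existing key. A minimal plain-Python sketch of the idea follows, assuming the option behaves like the similarly named option in featuretools and adds a new unique id column. The helper `add_index` and the sample rows are hypothetical and not part of this package.

```python
def add_index(rows, index_name):
    """Return copies of the rows with a new sequential id column added."""
    if any(index_name in row for row in rows):
        raise ValueError("column {!r} already exists".format(index_name))
    return [dict(row, **{index_name: i}) for i, row in enumerate(rows)]


# Made-up rows: SK_ID_BUREAU repeats, so it cannot serve as the index here.
bureau_balance_rows = [
    {"SK_ID_BUREAU": 1, "MONTHS_BALANCE": 0},
    {"SK_ID_BUREAU": 1, "MONTHS_BALANCE": -1},
    {"SK_ID_BUREAU": 2, "MONTHS_BALANCE": 0},
]

indexed = add_index(bureau_balance_rows, "bureau_balance_id")
print([row["bureau_balance_id"] for row in indexed])
```

The original rows are left untouched; only the returned copies carry the new id column.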

5. Add Relationships of Tables

>> re1 = Relationship(ts["app_train"]["SK_ID_CURR"],ts["bureau"]["SK_ID_CURR"])

>> re2 = Relationship(ts["bureau"]["SK_ID_BUREAU"],ts["bureau_balance"]["SK_ID_BUREAU"])

>> re3 = Relationship(ts["app_train"]["SK_ID_CURR"],ts["previous"]["SK_ID_CURR"])

>> re4 = Relationship(ts["previous"]["SK_ID_PREV"],ts["cash"]["SK_ID_PREV"])

>> re5 = Relationship(ts["previous"]["SK_ID_PREV"],ts["credit"]["SK_ID_PREV"])

>> re6 = Relationship(ts["previous"]["SK_ID_PREV"],ts["installments"]["SK_ID_PREV"])

>> ts.add_relationships([re1,re2,re3,re4,re5,re6])
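The six relationships above form a tree rooted at app_train. As a rough plain-Python illustration (not the package's internal representation), the parent/child pairs can be written as tuples and walked to see which tables sit one or two steps away from the target table. `children_by_depth` is a hypothetical helper.

```python
from collections import defaultdict

# (parent table, child table) pairs, copied from re1..re6 above.
RELATIONSHIPS = [
    ("app_train", "bureau"),
    ("bureau", "bureau_balance"),
    ("app_train", "previous"),
    ("previous", "cash"),
    ("previous", "credit"),
    ("previous", "installments"),
]


def children_by_depth(relationships, target, max_depth):
    """Map each depth (1..max_depth) to the child tables reached at that depth."""
    children = defaultdict(list)
    for parent, child in relationships:
        children[parent].append(child)
    levels = {}
    frontier = [target]
    for depth in range(1, max_depth + 1):
        frontier = [c for p in frontier for c in children[p]]
        if not frontier:
            break
        levels[depth] = frontier
    return levels


levels = children_by_depth(RELATIONSHIPS, "app_train", max_depth=2)
print(levels)
```

With a depth of 2, every table in this example is reachable from app_train, which is consistent with using `max_depth=2` in the DFS call.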

6. Run DFS To Generate Features

>> fts.dfs(tableset=ts, agg_primitives=["sum",'min','max','avg'],target_table='app_train',max_depth=2,verbose=False)
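To make the effect of `max_depth=2` more concrete, here is a small plain-Python sketch of stacked aggregation: child rows are grouped by their parent key, each primitive is applied, and the result is aggregated once more into the grandparent table. It is an illustration under simplifying assumptions (tiny made-up data, a hypothetical `aggregate` helper, a simple `<primitive>_<column>` naming scheme), not the package's Spark implementation.

```python
from collections import defaultdict

PRIMITIVES = {
    "sum": sum,
    "min": min,
    "max": max,
    "avg": lambda values: sum(values) / len(values),
}


def aggregate(child_rows, key, columns):
    """Group child_rows by key and apply every primitive to every column."""
    groups = defaultdict(list)
    for row in child_rows:
        groups[row[key]].append(row)
    result = {}
    for key_value, rows in groups.items():
        features = {}
        for column in columns:
            values = [row[column] for row in rows]
            for name, func in PRIMITIVES.items():
                features["{}_{}".format(name, column)] = func(values)
        result[key_value] = features
    return result


# Made-up data: two bureau records for one applicant, each with monthly balances.
bureau_balance = [
    {"SK_ID_BUREAU": 10, "MONTHS_BALANCE": -2},
    {"SK_ID_BUREAU": 10, "MONTHS_BALANCE": -1},
    {"SK_ID_BUREAU": 11, "MONTHS_BALANCE": -5},
]
bureau = [
    {"SK_ID_BUREAU": 10, "SK_ID_CURR": 100},
    {"SK_ID_BUREAU": 11, "SK_ID_CURR": 100},
]

# Depth 1: bureau_balance -> bureau.
depth1 = aggregate(bureau_balance, "SK_ID_BUREAU", ["MONTHS_BALANCE"])
bureau_enriched = [dict(row, **depth1[row["SK_ID_BUREAU"]]) for row in bureau]

# Depth 2: enriched bureau -> app_train, aggregating the depth-1 features again.
depth2 = aggregate(bureau_enriched, "SK_ID_CURR", sorted(depth1[10]))
print(depth2[100]["sum_max_MONTHS_BALANCE"])
```

One source column and four primitives give 4 features at depth 1 and 16 at depth 2, which shows how quickly the feature count grows with depth. The names in the package's real output can also include the table name, for example `sum_max_bureau_balance_MONTHS_BALANCE`.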

7. Get Features

>> new_app_train = ts["app_train"].df

>> old_len = ts["app_train"].old_len

>> print('new_generate_feature_len:{}'.format(len(new_app_train.columns)-old_len))

>> print(new_app_train.columns[old_len:])
new_generate_feature_len:636
['sum_max_bureau_balance_MONTHS_BALANCE', 'sum_CREDIT_DAY_OVERDUE', 'sum_CNT_CREDIT_PROLONG', 'sum_DAYS_CREDIT_ENDDATE', 'sum_DAYS_CREDIT_UPDATE', 'sum_min_bureau_balance_MONTHS_BALANCE', 'sum_sum_bureau_balance_MONTHS_BALANCE', 'sum_AMT_CREDIT_MAX_OVERDUE', 'sum_bureau_AMT_ANNUITY', 'sum_DAYS_ENDDATE_FACT', 'sum_AMT_CREDIT_SUM_LIMIT', 'sum_DAYS_CREDIT', 'sum_avg_bureau_balance_MONTHS_BALANCE', 'sum_AMT_CREDIT_SUM_DEBT', 'sum_AMT_CREDIT_SUM', 'sum_AMT_CREDIT_SUM_OVERDUE', 'min_max_bureau_balance_MONTHS_BALANCE', 'min_CREDIT_DAY_OVERDUE', 'min_CNT_CREDIT_PROLONG', 'min_DAYS_CREDIT_ENDDATE', 'min_DAYS_CREDIT_UPDATE', 'min_min_bureau_balance_MONTHS_BALANCE', 'min_sum_bureau_balance_MONTHS_BALANCE', 'min_AMT_CREDIT_MAX_OVERDUE', 'min_bureau_AMT_ANNUITY', 'min_DAYS_ENDDATE_FACT', 'min_AMT_CREDIT_SUM_LIMIT', 'min_DAYS_CREDIT', 'min_avg_bureau_balance_MONTHS_BALANCE', 'min_AMT_CREDIT_SUM_DEBT', 'min_AMT_CREDIT_SUM', 'min_AMT_CREDIT_SUM_OVERDUE', 'max_max_bureau_balance_MONTHS_BALANCE', 'max_CREDIT_DAY_OVERDUE', 'max_CNT_CREDIT_PROLONG', 'max_DAYS_CREDIT_ENDDATE', 'max_DAYS_CREDIT_UPDATE', 'max_min_bureau_balance_MONTHS_BALANCE', 'max_sum_bureau_balance_MONTHS_BALANCE', 'max_AMT_CREDIT_MAX_OVERDUE', 'max_bureau_AMT_ANNUITY', 'max_DAYS_ENDDATE_FACT', 'max_AMT_CREDIT_SUM_LIMIT', 'max_DAYS_CREDIT', 'max_avg_bureau_balance_MONTHS_BALANCE', 'max_AMT_CREDIT_SUM_DEBT', 'max_AMT_CREDIT_SUM', 'max_AMT_CREDIT_SUM_OVERDUE', 'avg_max_bureau_balance_MONTHS_BALANCE', 'avg_CREDIT_DAY_OVERDUE', 'avg_CNT_CREDIT_PROLONG', 'avg_DAYS_CREDIT_ENDDATE', 'avg_DAYS_CREDIT_UPDATE', 'avg_min_bureau_balance_MONTHS_BALANCE', 'avg_sum_bureau_balance_MONTHS_BALANCE', 'avg_AMT_CREDIT_MAX_OVERDUE', 'avg_bureau_AMT_ANNUITY', 'avg_DAYS_ENDDATE_FACT', 'avg_AMT_CREDIT_SUM_LIMIT', 'avg_DAYS_CREDIT', 'avg_avg_bureau_balance_MONTHS_BALANCE', 'avg_AMT_CREDIT_SUM_DEBT', 'avg_AMT_CREDIT_SUM', 'avg_AMT_CREDIT_SUM_OVERDUE', 'sum_min_AMT_CREDIT_LIMIT_ACTUAL', 'sum_max_DAYS_ENTRY_PAYMENT', 
'sum_sum_NUM_INSTALMENT_VERSION', 'sum_sum_AMT_PAYMENT', 'sum_avg_AMT_PAYMENT_TOTAL_CURRENT', 'sum_max_CNT_DRAWINGS_POS_CURRENT', 'sum_max_AMT_BALANCE', 'sum_min_DAYS_INSTALMENT', 'sum_min_AMT_INSTALMENT', 'sum_min_AMT_RECEIVABLE_PRINCIPAL', 'sum_max_AMT_RECIVABLE', 'sum_DAYS_LAST_DUE_1ST_VERSION', 'sum_avg_AMT_INST_MIN_REGULARITY', 'sum_avg_CNT_DRAWINGS_OTHER_CURRENT', 'sum_max_AMT_TOTAL_RECEIVABLE', 'sum_min_AMT_DRAWINGS_OTHER_CURRENT', 'sum_sum_AMT_PAYMENT_TOTAL_CURRENT', 'sum_min_AMT_PAYMENT', 'sum_sum_CNT_INSTALMENT', 'sum_min_AMT_PAYMENT_TOTAL_CURRENT', 'sum_DAYS_FIRST_DRAWING', 'sum_DAYS_TERMINATION', 'sum_sum_cash_MONTHS_BALANCE', 'sum_sum_credit_SK_DPD', 'sum_min_AMT_TOTAL_RECEIVABLE', 'sum_avg_AMT_DRAWINGS_POS_CURRENT', 'sum_max_NUM_INSTALMENT_NUMBER', 'sum_avg_AMT_DRAWINGS_CURRENT', 'sum_sum_cash_SK_DPD_DEF', 'sum_avg_AMT_PAYMENT_CURRENT', 'sum_avg_AMT_RECIVABLE', 'sum_min_cash_SK_DPD_DEF', 'sum_min_CNT_INSTALMENT_FUTURE', 'sum_max_NUM_INSTALMENT_VERSION', 'sum_sum_DAYS_ENTRY_PAYMENT', 'sum_max_CNT_INSTALMENT_MATURE_CUM', 'sum_avg_DAYS_INSTALMENT', 'sum_min_CNT_INSTALMENT', 'sum_max_AMT_PAYMENT_CURRENT', 'sum_avg_CNT_INSTALMENT', 'sum_avg_cash_SK_DPD_DEF', 'sum_sum_AMT_TOTAL_RECEIVABLE', 'sum_avg_AMT_TOTAL_RECEIVABLE', 'sum_min_AMT_PAYMENT_CURRENT', 'sum_avg_CNT_DRAWINGS_POS_CURRENT', 'sum_avg_AMT_PAYMENT', 'sum_min_DAYS_ENTRY_PAYMENT', 'sum_max_CNT_DRAWINGS_OTHER_CURRENT', 'sum_avg_AMT_RECEIVABLE_PRINCIPAL', 'sum_CNT_PAYMENT', 'sum_sum_CNT_DRAWINGS_ATM_CURRENT', 'sum_DAYS_FIRST_DUE', 'sum_sum_AMT_INST_MIN_REGULARITY', 'sum_min_CNT_INSTALMENT_MATURE_CUM', 'sum_sum_AMT_DRAWINGS_OTHER_CURRENT', 'sum_previous_AMT_CREDIT', 'sum_min_AMT_DRAWINGS_CURRENT', 'sum_avg_MONTHS_BALANCE', 'sum_DAYS_DECISION', 'sum_min_CNT_DRAWINGS_OTHER_CURRENT', 'sum_sum_credit_SK_DPD_DEF', 'sum_max_MONTHS_BALANCE', 'sum_RATE_INTEREST_PRIMARY', 'sum_max_CNT_DRAWINGS_CURRENT', 'sum_avg_credit_SK_DPD', 'sum_sum_AMT_BALANCE', 'sum_min_AMT_BALANCE', 'sum_avg_AMT_DRAWINGS_ATM_CURRENT', 
'sum_sum_CNT_DRAWINGS_OTHER_CURRENT', 'sum_max_CNT_INSTALMENT_FUTURE', 'sum_max_AMT_DRAWINGS_POS_CURRENT', 'sum_max_credit_SK_DPD', 'sum_avg_AMT_BALANCE', 'sum_AMT_DOWN_PAYMENT', 'sum_sum_CNT_DRAWINGS_POS_CURRENT', 'sum_min_credit_SK_DPD_DEF', 'sum_min_CNT_DRAWINGS_POS_CURRENT', 'sum_max_cash_SK_DPD_DEF', 'sum_avg_cash_MONTHS_BALANCE', 'sum_avg_CNT_DRAWINGS_ATM_CURRENT', 'sum_max_credit_SK_DPD_DEF', 'sum_sum_AMT_DRAWINGS_CURRENT', 'sum_max_AMT_DRAWINGS_CURRENT', 'sum_min_AMT_DRAWINGS_ATM_CURRENT', 'sum_sum_AMT_DRAWINGS_POS_CURRENT', 'sum_sum_AMT_RECEIVABLE_PRINCIPAL', 'sum_sum_CNT_DRAWINGS_CURRENT', 'sum_max_DAYS_INSTALMENT', 'sum_max_AMT_CREDIT_LIMIT_ACTUAL', 'sum_avg_credit_SK_DPD_DEF', 'sum_AMT_ANNUITY', 'sum_min_CNT_DRAWINGS_CURRENT', 'sum_sum_NUM_INSTALMENT_NUMBER', 'sum_avg_DAYS_ENTRY_PAYMENT', 'sum_min_AMT_INST_MIN_REGULARITY', 'sum_sum_cash_SK_DPD', 'sum_min_MONTHS_BALANCE', 'sum_avg_NUM_INSTALMENT_NUMBER', 'sum_min_cash_MONTHS_BALANCE', 'sum_max_AMT_PAYMENT_TOTAL_CURRENT', 'sum_min_AMT_RECIVABLE', 'sum_sum_CNT_INSTALMENT_FUTURE', 'sum_avg_cash_SK_DPD', 'sum_previous_AMT_GOODS_PRICE', 'sum_min_NUM_INSTALMENT_NUMBER', 'sum_sum_AMT_INSTALMENT', 'sum_max_cash_SK_DPD', 'sum_avg_AMT_INSTALMENT', 'sum_max_AMT_RECEIVABLE_PRINCIPAL', 'sum_RATE_DOWN_PAYMENT', 'sum_sum_AMT_RECIVABLE', 'sum_sum_MONTHS_BALANCE', 'sum_avg_AMT_CREDIT_LIMIT_ACTUAL', 'sum_max_AMT_INST_MIN_REGULARITY', 'sum_min_NUM_INSTALMENT_VERSION', 'sum_avg_CNT_DRAWINGS_CURRENT', 'sum_max_AMT_DRAWINGS_OTHER_CURRENT', 'sum_sum_AMT_CREDIT_LIMIT_ACTUAL', 'sum_max_CNT_INSTALMENT', 'sum_max_AMT_PAYMENT', 'sum_RATE_INTEREST_PRIVILEGED', 'sum_max_AMT_INSTALMENT', 'sum_max_AMT_DRAWINGS_ATM_CURRENT', 'sum_NFLAG_LAST_APPL_IN_DAY', 'sum_NFLAG_INSURED_ON_APPROVAL', 'sum_min_cash_SK_DPD', 'sum_avg_CNT_INSTALMENT_FUTURE', 'sum_sum_AMT_DRAWINGS_ATM_CURRENT', 'sum_SELLERPLACE_AREA', 'sum_sum_AMT_PAYMENT_CURRENT', 'sum_avg_NUM_INSTALMENT_VERSION', 'sum_max_cash_MONTHS_BALANCE', 'sum_min_AMT_DRAWINGS_POS_CURRENT', 
'sum_sum_CNT_INSTALMENT_MATURE_CUM', 'sum_AMT_APPLICATION', 'sum_DAYS_LAST_DUE', 'sum_avg_CNT_INSTALMENT_MATURE_CUM', 'sum_max_CNT_DRAWINGS_ATM_CURRENT', 'sum_previous_HOUR_APPR_PROCESS_START', 'sum_avg_AMT_DRAWINGS_OTHER_CURRENT', 'sum_min_credit_SK_DPD', 'sum_min_CNT_DRAWINGS_ATM_CURRENT', 'sum_sum_DAYS_INSTALMENT', 'min_min_AMT_CREDIT_LIMIT_ACTUAL', 'min_max_DAYS_ENTRY_PAYMENT', 'min_sum_NUM_INSTALMENT_VERSION', 'min_sum_AMT_PAYMENT', 'min_avg_AMT_PAYMENT_TOTAL_CURRENT', 'min_max_CNT_DRAWINGS_POS_CURRENT', 'min_max_AMT_BALANCE', 'min_min_DAYS_INSTALMENT', 'min_min_AMT_INSTALMENT', 'min_min_AMT_RECEIVABLE_PRINCIPAL', 'min_max_AMT_RECIVABLE', 'min_DAYS_LAST_DUE_1ST_VERSION', 'min_avg_AMT_INST_MIN_REGULARITY', 'min_avg_CNT_DRAWINGS_OTHER_CURRENT', 'min_max_AMT_TOTAL_RECEIVABLE', 'min_min_AMT_DRAWINGS_OTHER_CURRENT', 'min_sum_AMT_PAYMENT_TOTAL_CURRENT', 'min_min_AMT_PAYMENT', 'min_sum_CNT_INSTALMENT', 'min_min_AMT_PAYMENT_TOTAL_CURRENT', 'min_DAYS_FIRST_DRAWING', 'min_DAYS_TERMINATION', 'min_sum_cash_MONTHS_BALANCE', 'min_sum_credit_SK_DPD', 'min_min_AMT_TOTAL_RECEIVABLE', 'min_avg_AMT_DRAWINGS_POS_CURRENT', 'min_max_NUM_INSTALMENT_NUMBER', 'min_avg_AMT_DRAWINGS_CURRENT', 'min_sum_cash_SK_DPD_DEF', 'min_avg_AMT_PAYMENT_CURRENT', 'min_avg_AMT_RECIVABLE', 'min_min_cash_SK_DPD_DEF', 'min_min_CNT_INSTALMENT_FUTURE', 'min_max_NUM_INSTALMENT_VERSION', 'min_sum_DAYS_ENTRY_PAYMENT', 'min_max_CNT_INSTALMENT_MATURE_CUM', 'min_avg_DAYS_INSTALMENT', 'min_min_CNT_INSTALMENT', 'min_max_AMT_PAYMENT_CURRENT', 'min_avg_CNT_INSTALMENT', 'min_avg_cash_SK_DPD_DEF', 'min_sum_AMT_TOTAL_RECEIVABLE', 'min_avg_AMT_TOTAL_RECEIVABLE', 'min_min_AMT_PAYMENT_CURRENT', 'min_avg_CNT_DRAWINGS_POS_CURRENT', 'min_avg_AMT_PAYMENT', 'min_min_DAYS_ENTRY_PAYMENT', 'min_max_CNT_DRAWINGS_OTHER_CURRENT', 'min_avg_AMT_RECEIVABLE_PRINCIPAL', 'min_CNT_PAYMENT', 'min_sum_CNT_DRAWINGS_ATM_CURRENT', 'min_DAYS_FIRST_DUE', 'min_sum_AMT_INST_MIN_REGULARITY', 'min_min_CNT_INSTALMENT_MATURE_CUM', 
'min_sum_AMT_DRAWINGS_OTHER_CURRENT', 'min_previous_AMT_CREDIT', 'min_min_AMT_DRAWINGS_CURRENT', 'min_avg_MONTHS_BALANCE', 'min_DAYS_DECISION', 'min_min_CNT_DRAWINGS_OTHER_CURRENT', 'min_sum_credit_SK_DPD_DEF', 'min_max_MONTHS_BALANCE', 'min_RATE_INTEREST_PRIMARY', 'min_max_CNT_DRAWINGS_CURRENT', 'min_avg_credit_SK_DPD', 'min_sum_AMT_BALANCE', 'min_min_AMT_BALANCE', 'min_avg_AMT_DRAWINGS_ATM_CURRENT', 'min_sum_CNT_DRAWINGS_OTHER_CURRENT', 'min_max_CNT_INSTALMENT_FUTURE', 'min_max_AMT_DRAWINGS_POS_CURRENT', 'min_max_credit_SK_DPD', 'min_avg_AMT_BALANCE', 'min_AMT_DOWN_PAYMENT', 'min_sum_CNT_DRAWINGS_POS_CURRENT', 'min_min_credit_SK_DPD_DEF', 'min_min_CNT_DRAWINGS_POS_CURRENT', 'min_max_cash_SK_DPD_DEF', 'min_avg_cash_MONTHS_BALANCE', 'min_avg_CNT_DRAWINGS_ATM_CURRENT', 'min_max_credit_SK_DPD_DEF', 'min_sum_AMT_DRAWINGS_CURRENT', 'min_max_AMT_DRAWINGS_CURRENT', 'min_min_AMT_DRAWINGS_ATM_CURRENT', 'min_sum_AMT_DRAWINGS_POS_CURRENT', 'min_sum_AMT_RECEIVABLE_PRINCIPAL', 'min_sum_CNT_DRAWINGS_CURRENT', 'min_max_DAYS_INSTALMENT', 'min_max_AMT_CREDIT_LIMIT_ACTUAL', 'min_avg_credit_SK_DPD_DEF', 'min_AMT_ANNUITY', 'min_min_CNT_DRAWINGS_CURRENT', 'min_sum_NUM_INSTALMENT_NUMBER', 'min_avg_DAYS_ENTRY_PAYMENT', 'min_min_AMT_INST_MIN_REGULARITY', 'min_sum_cash_SK_DPD', 'min_min_MONTHS_BALANCE', 'min_avg_NUM_INSTALMENT_NUMBER', 'min_min_cash_MONTHS_BALANCE', 'min_max_AMT_PAYMENT_TOTAL_CURRENT', 'min_min_AMT_RECIVABLE', 'min_sum_CNT_INSTALMENT_FUTURE', 'min_avg_cash_SK_DPD', 'min_previous_AMT_GOODS_PRICE', 'min_min_NUM_INSTALMENT_NUMBER', 'min_sum_AMT_INSTALMENT', 'min_max_cash_SK_DPD', 'min_avg_AMT_INSTALMENT', 'min_max_AMT_RECEIVABLE_PRINCIPAL', 'min_RATE_DOWN_PAYMENT', 'min_sum_AMT_RECIVABLE', 'min_sum_MONTHS_BALANCE', 'min_avg_AMT_CREDIT_LIMIT_ACTUAL', 'min_max_AMT_INST_MIN_REGULARITY', 'min_min_NUM_INSTALMENT_VERSION', 'min_avg_CNT_DRAWINGS_CURRENT', 'min_max_AMT_DRAWINGS_OTHER_CURRENT', 'min_sum_AMT_CREDIT_LIMIT_ACTUAL', 'min_max_CNT_INSTALMENT', 'min_max_AMT_PAYMENT', 
'min_RATE_INTEREST_PRIVILEGED', 'min_max_AMT_INSTALMENT', 'min_max_AMT_DRAWINGS_ATM_CURRENT', 'min_NFLAG_LAST_APPL_IN_DAY', 'min_NFLAG_INSURED_ON_APPROVAL', 'min_min_cash_SK_DPD', 'min_avg_CNT_INSTALMENT_FUTURE', 'min_sum_AMT_DRAWINGS_ATM_CURRENT', 'min_SELLERPLACE_AREA', 'min_sum_AMT_PAYMENT_CURRENT', 'min_avg_NUM_INSTALMENT_VERSION', 'min_max_cash_MONTHS_BALANCE', 'min_min_AMT_DRAWINGS_POS_CURRENT', 'min_sum_CNT_INSTALMENT_MATURE_CUM', 'min_AMT_APPLICATION', 'min_DAYS_LAST_DUE', 'min_avg_CNT_INSTALMENT_MATURE_CUM', 'min_max_CNT_DRAWINGS_ATM_CURRENT', 'min_previous_HOUR_APPR_PROCESS_START', 'min_avg_AMT_DRAWINGS_OTHER_CURRENT', 'min_min_credit_SK_DPD', 'min_min_CNT_DRAWINGS_ATM_CURRENT', 'min_sum_DAYS_INSTALMENT', 'max_min_AMT_CREDIT_LIMIT_ACTUAL', 'max_max_DAYS_ENTRY_PAYMENT', 'max_sum_NUM_INSTALMENT_VERSION', 'max_sum_AMT_PAYMENT', 'max_avg_AMT_PAYMENT_TOTAL_CURRENT', 'max_max_CNT_DRAWINGS_POS_CURRENT', 'max_max_AMT_BALANCE', 'max_min_DAYS_INSTALMENT', 'max_min_AMT_INSTALMENT', 'max_min_AMT_RECEIVABLE_PRINCIPAL', 'max_max_AMT_RECIVABLE', 'max_DAYS_LAST_DUE_1ST_VERSION', 'max_avg_AMT_INST_MIN_REGULARITY', 'max_avg_CNT_DRAWINGS_OTHER_CURRENT', 'max_max_AMT_TOTAL_RECEIVABLE', 'max_min_AMT_DRAWINGS_OTHER_CURRENT', 'max_sum_AMT_PAYMENT_TOTAL_CURRENT', 'max_min_AMT_PAYMENT', 'max_sum_CNT_INSTALMENT', 'max_min_AMT_PAYMENT_TOTAL_CURRENT', 'max_DAYS_FIRST_DRAWING', 'max_DAYS_TERMINATION', 'max_sum_cash_MONTHS_BALANCE', 'max_sum_credit_SK_DPD', 'max_min_AMT_TOTAL_RECEIVABLE', 'max_avg_AMT_DRAWINGS_POS_CURRENT', 'max_max_NUM_INSTALMENT_NUMBER', 'max_avg_AMT_DRAWINGS_CURRENT', 'max_sum_cash_SK_DPD_DEF', 'max_avg_AMT_PAYMENT_CURRENT', 'max_avg_AMT_RECIVABLE', 'max_min_cash_SK_DPD_DEF', 'max_min_CNT_INSTALMENT_FUTURE', 'max_max_NUM_INSTALMENT_VERSION', 'max_sum_DAYS_ENTRY_PAYMENT', 'max_max_CNT_INSTALMENT_MATURE_CUM', 'max_avg_DAYS_INSTALMENT', 'max_min_CNT_INSTALMENT', 'max_max_AMT_PAYMENT_CURRENT', 'max_avg_CNT_INSTALMENT', 'max_avg_cash_SK_DPD_DEF', 
'max_sum_AMT_TOTAL_RECEIVABLE', 'max_avg_AMT_TOTAL_RECEIVABLE', 'max_min_AMT_PAYMENT_CURRENT', 'max_avg_CNT_DRAWINGS_POS_CURRENT', 'max_avg_AMT_PAYMENT', 'max_min_DAYS_ENTRY_PAYMENT', 'max_max_CNT_DRAWINGS_OTHER_CURRENT', 'max_avg_AMT_RECEIVABLE_PRINCIPAL', 'max_CNT_PAYMENT', 'max_sum_CNT_DRAWINGS_ATM_CURRENT', 'max_DAYS_FIRST_DUE', 'max_sum_AMT_INST_MIN_REGULARITY', 'max_min_CNT_INSTALMENT_MATURE_CUM', 'max_sum_AMT_DRAWINGS_OTHER_CURRENT', 'max_previous_AMT_CREDIT', 'max_min_AMT_DRAWINGS_CURRENT', 'max_avg_MONTHS_BALANCE', 'max_DAYS_DECISION', 'max_min_CNT_DRAWINGS_OTHER_CURRENT', 'max_sum_credit_SK_DPD_DEF', 'max_max_MONTHS_BALANCE', 'max_RATE_INTEREST_PRIMARY', 'max_max_CNT_DRAWINGS_CURRENT', 'max_avg_credit_SK_DPD', 'max_sum_AMT_BALANCE', 'max_min_AMT_BALANCE', 'max_avg_AMT_DRAWINGS_ATM_CURRENT', 'max_sum_CNT_DRAWINGS_OTHER_CURRENT', 'max_max_CNT_INSTALMENT_FUTURE', 'max_max_AMT_DRAWINGS_POS_CURRENT', 'max_max_credit_SK_DPD', 'max_avg_AMT_BALANCE', 'max_AMT_DOWN_PAYMENT', 'max_sum_CNT_DRAWINGS_POS_CURRENT', 'max_min_credit_SK_DPD_DEF', 'max_min_CNT_DRAWINGS_POS_CURRENT', 'max_max_cash_SK_DPD_DEF', 'max_avg_cash_MONTHS_BALANCE', 'max_avg_CNT_DRAWINGS_ATM_CURRENT', 'max_max_credit_SK_DPD_DEF', 'max_sum_AMT_DRAWINGS_CURRENT', 'max_max_AMT_DRAWINGS_CURRENT', 'max_min_AMT_DRAWINGS_ATM_CURRENT', 'max_sum_AMT_DRAWINGS_POS_CURRENT', 'max_sum_AMT_RECEIVABLE_PRINCIPAL', 'max_sum_CNT_DRAWINGS_CURRENT', 'max_max_DAYS_INSTALMENT', 'max_max_AMT_CREDIT_LIMIT_ACTUAL', 'max_avg_credit_SK_DPD_DEF', 'max_AMT_ANNUITY', 'max_min_CNT_DRAWINGS_CURRENT', 'max_sum_NUM_INSTALMENT_NUMBER', 'max_avg_DAYS_ENTRY_PAYMENT', 'max_min_AMT_INST_MIN_REGULARITY', 'max_sum_cash_SK_DPD', 'max_min_MONTHS_BALANCE', 'max_avg_NUM_INSTALMENT_NUMBER', 'max_min_cash_MONTHS_BALANCE', 'max_max_AMT_PAYMENT_TOTAL_CURRENT', 'max_min_AMT_RECIVABLE', 'max_sum_CNT_INSTALMENT_FUTURE', 'max_avg_cash_SK_DPD', 'max_previous_AMT_GOODS_PRICE', 'max_min_NUM_INSTALMENT_NUMBER', 'max_sum_AMT_INSTALMENT', 
'max_max_cash_SK_DPD', 'max_avg_AMT_INSTALMENT', 'max_max_AMT_RECEIVABLE_PRINCIPAL', 'max_RATE_DOWN_PAYMENT', 'max_sum_AMT_RECIVABLE', 'max_sum_MONTHS_BALANCE', 'max_avg_AMT_CREDIT_LIMIT_ACTUAL', 'max_max_AMT_INST_MIN_REGULARITY', 'max_min_NUM_INSTALMENT_VERSION', 'max_avg_CNT_DRAWINGS_CURRENT', 'max_max_AMT_DRAWINGS_OTHER_CURRENT', 'max_sum_AMT_CREDIT_LIMIT_ACTUAL', 'max_max_CNT_INSTALMENT', 'max_max_AMT_PAYMENT', 'max_RATE_INTEREST_PRIVILEGED', 'max_max_AMT_INSTALMENT', 'max_max_AMT_DRAWINGS_ATM_CURRENT', 'max_NFLAG_LAST_APPL_IN_DAY', 'max_NFLAG_INSURED_ON_APPROVAL', 'max_min_cash_SK_DPD', 'max_avg_CNT_INSTALMENT_FUTURE', 'max_sum_AMT_DRAWINGS_ATM_CURRENT', 'max_SELLERPLACE_AREA', 'max_sum_AMT_PAYMENT_CURRENT', 'max_avg_NUM_INSTALMENT_VERSION', 'max_max_cash_MONTHS_BALANCE', 'max_min_AMT_DRAWINGS_POS_CURRENT', 'max_sum_CNT_INSTALMENT_MATURE_CUM', 'max_AMT_APPLICATION', 'max_DAYS_LAST_DUE', 'max_avg_CNT_INSTALMENT_MATURE_CUM', 'max_max_CNT_DRAWINGS_ATM_CURRENT', 'max_previous_HOUR_APPR_PROCESS_START', 'max_avg_AMT_DRAWINGS_OTHER_CURRENT', 'max_min_credit_SK_DPD', 'max_min_CNT_DRAWINGS_ATM_CURRENT', 'max_sum_DAYS_INSTALMENT', 'avg_min_AMT_CREDIT_LIMIT_ACTUAL', 'avg_max_DAYS_ENTRY_PAYMENT', 'avg_sum_NUM_INSTALMENT_VERSION', 'avg_sum_AMT_PAYMENT', 'avg_avg_AMT_PAYMENT_TOTAL_CURRENT', 'avg_max_CNT_DRAWINGS_POS_CURRENT', 'avg_max_AMT_BALANCE', 'avg_min_DAYS_INSTALMENT', 'avg_min_AMT_INSTALMENT', 'avg_min_AMT_RECEIVABLE_PRINCIPAL', 'avg_max_AMT_RECIVABLE', 'avg_DAYS_LAST_DUE_1ST_VERSION', 'avg_avg_AMT_INST_MIN_REGULARITY', 'avg_avg_CNT_DRAWINGS_OTHER_CURRENT', 'avg_max_AMT_TOTAL_RECEIVABLE', 'avg_min_AMT_DRAWINGS_OTHER_CURRENT', 'avg_sum_AMT_PAYMENT_TOTAL_CURRENT', 'avg_min_AMT_PAYMENT', 'avg_sum_CNT_INSTALMENT', 'avg_min_AMT_PAYMENT_TOTAL_CURRENT', 'avg_DAYS_FIRST_DRAWING', 'avg_DAYS_TERMINATION', 'avg_sum_cash_MONTHS_BALANCE', 'avg_sum_credit_SK_DPD', 'avg_min_AMT_TOTAL_RECEIVABLE', 'avg_avg_AMT_DRAWINGS_POS_CURRENT', 'avg_max_NUM_INSTALMENT_NUMBER', 
'avg_avg_AMT_DRAWINGS_CURRENT', 'avg_sum_cash_SK_DPD_DEF', 'avg_avg_AMT_PAYMENT_CURRENT', 'avg_avg_AMT_RECIVABLE', 'avg_min_cash_SK_DPD_DEF', 'avg_min_CNT_INSTALMENT_FUTURE', 'avg_max_NUM_INSTALMENT_VERSION', 'avg_sum_DAYS_ENTRY_PAYMENT', 'avg_max_CNT_INSTALMENT_MATURE_CUM', 'avg_avg_DAYS_INSTALMENT', 'avg_min_CNT_INSTALMENT', 'avg_max_AMT_PAYMENT_CURRENT', 'avg_avg_CNT_INSTALMENT', 'avg_avg_cash_SK_DPD_DEF', 'avg_sum_AMT_TOTAL_RECEIVABLE', 'avg_avg_AMT_TOTAL_RECEIVABLE', 'avg_min_AMT_PAYMENT_CURRENT', 'avg_avg_CNT_DRAWINGS_POS_CURRENT', 'avg_avg_AMT_PAYMENT', 'avg_min_DAYS_ENTRY_PAYMENT', 'avg_max_CNT_DRAWINGS_OTHER_CURRENT', 'avg_avg_AMT_RECEIVABLE_PRINCIPAL', 'avg_CNT_PAYMENT', 'avg_sum_CNT_DRAWINGS_ATM_CURRENT', 'avg_DAYS_FIRST_DUE', 'avg_sum_AMT_INST_MIN_REGULARITY', 'avg_min_CNT_INSTALMENT_MATURE_CUM', 'avg_sum_AMT_DRAWINGS_OTHER_CURRENT', 'avg_previous_AMT_CREDIT', 'avg_min_AMT_DRAWINGS_CURRENT', 'avg_avg_MONTHS_BALANCE', 'avg_DAYS_DECISION', 'avg_min_CNT_DRAWINGS_OTHER_CURRENT', 'avg_sum_credit_SK_DPD_DEF', 'avg_max_MONTHS_BALANCE', 'avg_RATE_INTEREST_PRIMARY', 'avg_max_CNT_DRAWINGS_CURRENT', 'avg_avg_credit_SK_DPD', 'avg_sum_AMT_BALANCE', 'avg_min_AMT_BALANCE', 'avg_avg_AMT_DRAWINGS_ATM_CURRENT', 'avg_sum_CNT_DRAWINGS_OTHER_CURRENT', 'avg_max_CNT_INSTALMENT_FUTURE', 'avg_max_AMT_DRAWINGS_POS_CURRENT', 'avg_max_credit_SK_DPD', 'avg_avg_AMT_BALANCE', 'avg_AMT_DOWN_PAYMENT', 'avg_sum_CNT_DRAWINGS_POS_CURRENT', 'avg_min_credit_SK_DPD_DEF', 'avg_min_CNT_DRAWINGS_POS_CURRENT', 'avg_max_cash_SK_DPD_DEF', 'avg_avg_cash_MONTHS_BALANCE', 'avg_avg_CNT_DRAWINGS_ATM_CURRENT', 'avg_max_credit_SK_DPD_DEF', 'avg_sum_AMT_DRAWINGS_CURRENT', 'avg_max_AMT_DRAWINGS_CURRENT', 'avg_min_AMT_DRAWINGS_ATM_CURRENT', 'avg_sum_AMT_DRAWINGS_POS_CURRENT', 'avg_sum_AMT_RECEIVABLE_PRINCIPAL', 'avg_sum_CNT_DRAWINGS_CURRENT', 'avg_max_DAYS_INSTALMENT', 'avg_max_AMT_CREDIT_LIMIT_ACTUAL', 'avg_avg_credit_SK_DPD_DEF', 'avg_AMT_ANNUITY', 'avg_min_CNT_DRAWINGS_CURRENT', 
'avg_sum_NUM_INSTALMENT_NUMBER', 'avg_avg_DAYS_ENTRY_PAYMENT', 'avg_min_AMT_INST_MIN_REGULARITY', 'avg_sum_cash_SK_DPD', 'avg_min_MONTHS_BALANCE', 'avg_avg_NUM_INSTALMENT_NUMBER', 'avg_min_cash_MONTHS_BALANCE', 'avg_max_AMT_PAYMENT_TOTAL_CURRENT', 'avg_min_AMT_RECIVABLE', 'avg_sum_CNT_INSTALMENT_FUTURE', 'avg_avg_cash_SK_DPD', 'avg_previous_AMT_GOODS_PRICE', 'avg_min_NUM_INSTALMENT_NUMBER', 'avg_sum_AMT_INSTALMENT', 'avg_max_cash_SK_DPD', 'avg_avg_AMT_INSTALMENT', 'avg_max_AMT_RECEIVABLE_PRINCIPAL', 'avg_RATE_DOWN_PAYMENT', 'avg_sum_AMT_RECIVABLE', 'avg_sum_MONTHS_BALANCE', 'avg_avg_AMT_CREDIT_LIMIT_ACTUAL', 'avg_max_AMT_INST_MIN_REGULARITY', 'avg_min_NUM_INSTALMENT_VERSION', 'avg_avg_CNT_DRAWINGS_CURRENT', 'avg_max_AMT_DRAWINGS_OTHER_CURRENT', 'avg_sum_AMT_CREDIT_LIMIT_ACTUAL', 'avg_max_CNT_INSTALMENT', 'avg_max_AMT_PAYMENT', 'avg_RATE_INTEREST_PRIVILEGED', 'avg_max_AMT_INSTALMENT', 'avg_max_AMT_DRAWINGS_ATM_CURRENT', 'avg_NFLAG_LAST_APPL_IN_DAY', 'avg_NFLAG_INSURED_ON_APPROVAL', 'avg_min_cash_SK_DPD', 'avg_avg_CNT_INSTALMENT_FUTURE', 'avg_sum_AMT_DRAWINGS_ATM_CURRENT', 'avg_SELLERPLACE_AREA', 'avg_sum_AMT_PAYMENT_CURRENT', 'avg_avg_NUM_INSTALMENT_VERSION', 'avg_max_cash_MONTHS_BALANCE', 'avg_min_AMT_DRAWINGS_POS_CURRENT', 'avg_sum_CNT_INSTALMENT_MATURE_CUM', 'avg_AMT_APPLICATION', 'avg_DAYS_LAST_DUE', 'avg_avg_CNT_INSTALMENT_MATURE_CUM', 'avg_max_CNT_DRAWINGS_ATM_CURRENT', 'avg_previous_HOUR_APPR_PROCESS_START', 'avg_avg_AMT_DRAWINGS_OTHER_CURRENT', 'avg_min_credit_SK_DPD', 'avg_min_CNT_DRAWINGS_ATM_CURRENT', 'avg_sum_DAYS_INSTALMENT']
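
With several hundred generated columns, it can help to group them by the outermost primitive before inspecting them. A short sketch follows, using a handful of names copied from the output above. `group_by_primitive` is a hypothetical helper, and the prefix rule is inferred from the printed names rather than from documented behaviour.

```python
from collections import defaultdict

AGG_PRIMITIVES = ("sum", "min", "max", "avg")


def group_by_primitive(columns, primitives=AGG_PRIMITIVES):
    """Group feature names by the primitive prefix they start with."""
    groups = defaultdict(list)
    for column in columns:
        prefix, _, rest = column.partition("_")
        if prefix in primitives and rest:
            groups[prefix].append(column)
    return dict(groups)


sample = [
    "sum_max_bureau_balance_MONTHS_BALANCE",
    "sum_CREDIT_DAY_OVERDUE",
    "min_DAYS_CREDIT",
    "max_AMT_CREDIT_SUM",
    "avg_avg_bureau_balance_MONTHS_BALANCE",
]

groups = group_by_primitive(sample)
print({name: len(cols) for name, cols in groups.items()})
```

On the real result, the equivalent call would be `group_by_primitive(new_app_train.columns[old_len:])`.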

Download files

Source Distribution

featuretoolsOnSpark-0.1.1.tar.gz (17.1 kB)

Built Distribution

featuretoolsOnSpark-0.1.1-py3-none-any.whl (19.3 kB, Python 3)