A simplified version of featuretools for Spark

featuretoolsOnSpark

Featuretools is a Python library for automated feature engineering.

This repo is a simplified version of featuretools that reuses its automatic feature generation framework. Instead of the complex back-end architecture of featuretools, it relies mainly on Spark DataFrames to make feature generation faster (a speedup of 10x or more).

Installation

Install with pip

pip install featuretoolsOnSpark

Install from source

git clone https://github.com/giantcroc/featuretoolsOnSpark.git
cd featuretoolsOnSpark
python setup.py install

Example

Below is an example of how to use the APIs of this repo. We chose the dataset from Kaggle's Home Credit Default Risk competition. The relationships between the tables are shown in the picture below.

[Image: relationships between the Home Credit Default Risk tables]

First, make sure that all the required CSV files have been saved as Spark DataFrames.

1. Create Spark Session

>> from pyspark.sql import SparkSession

>> spark = SparkSession \
       .builder \
       .appName("home-credit") \
       .enableHiveSupport() \
       .getOrCreate()

2. Get Spark DataFrame

>> app_train = spark.sql(''' select * from home_credit.app_train ''')

>> bureau = spark.sql(''' select * from home_credit.bureau ''')

>> bureau_balance = spark.sql(''' select * from home_credit.bureau_balance ''')

>> cash = spark.sql(''' select * from home_credit.cash ''')

>> credit = spark.sql(''' select * from home_credit.credit ''')

>> installments = spark.sql(''' select * from home_credit.installments ''')

>> previous = spark.sql(''' select * from home_credit.previous ''')
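If the CSV files have not yet been loaded into Spark, a sketch along the following lines could read them first. The helper names `csv_paths` and `load_tables`, the directory layout (one `<name>.csv` file per table), and the choice of `inferSchema=True` are assumptions made for illustration and are not part of featuretoolsOnSpark; only `spark.read.csv` is a standard PySpark call.

```python
from pathlib import Path


def csv_paths(data_dir, names):
    """Map each table name to the CSV file expected under data_dir (assumed layout)."""
    return {name: str(Path(data_dir) / f"{name}.csv") for name in names}


def load_tables(spark, data_dir, names):
    """Read each CSV into a Spark DataFrame, treating the first row as the header."""
    return {
        name: spark.read.csv(path, header=True, inferSchema=True)
        for name, path in csv_paths(data_dir, names).items()
    }
```

With the `spark` session from step 1, a call such as `load_tables(spark, data_dir, ["bureau", "cash"])` would return a dict of DataFrames keyed by table name, where `data_dir` is whatever directory holds the CSV files.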

3. Create TableSet

>> import featuretoolsOnSpark as fts

>> ts = fts.TableSet("home_credit",verbose=False)

4. Create Tables From Spark DataFrame

>> ts.table_from_dataframe(table_id="bureau_balance",dataframe=bureau_balance,index='bureau_balance_id',make_index = True)

>> ts.table_from_dataframe(table_id="app_train",dataframe=app_train,index='SK_ID_CURR')

>> ts.table_from_dataframe(table_id="bureau",dataframe=bureau,index='SK_ID_BUREAU')

>> ts.table_from_dataframe(table_id="cash",dataframe=cash,index='cash_id',make_index = True)

>> ts.table_from_dataframe(table_id="credit",dataframe=credit,index='credit_id',make_index = True)

>> ts.table_from_dataframe(table_id="installments",dataframe=installments,index='installments_id',make_index = True)

>> ts.table_from_dataframe(table_id="previous",dataframe=previous,index='SK_ID_PREV')
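The tables created with `make_index = True` have no natural key column. Assuming the flag behaves as it does in featuretools (a new column holding a unique integer per row is added under the name given by `index`), its effect can be pictured with plain Python rows. The function `add_row_index` and the sample values are hypothetical; in Spark itself a comparable column of unique (though not consecutive) ids could be derived with `pyspark.sql.functions.monotonically_increasing_id`.

```python
def add_row_index(rows, index_name):
    """Return copies of `rows` with a unique integer id stored under `index_name`."""
    if any(index_name in row for row in rows):
        raise ValueError(f"column {index_name!r} already exists")
    # enumerate() yields 0, 1, 2, ... so every row gets a distinct id
    return [{index_name: i, **row} for i, row in enumerate(rows)]


# Made-up rows shaped like bureau_balance, which has no single-column key
bureau_balance_rows = [
    {"SK_ID_BUREAU": 10, "MONTHS_BALANCE": -1},
    {"SK_ID_BUREAU": 10, "MONTHS_BALANCE": -2},
    {"SK_ID_BUREAU": 11, "MONTHS_BALANCE": -1},
]
indexed = add_row_index(bureau_balance_rows, "bureau_balance_id")
print([row["bureau_balance_id"] for row in indexed])  # [0, 1, 2]
```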

5. Add Relationships of Tables

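>> from featuretoolsOnSpark import Relationship  # assumed import location; the original does not show this import

>> re1 = Relationship(ts["app_train"]["SK_ID_CURR"],ts["bureau"]["SK_ID_CURR"])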

>> re2 = Relationship(ts["bureau"]["SK_ID_BUREAU"],ts["bureau_balance"]["SK_ID_BUREAU"])

>> re3 = Relationship(ts["app_train"]["SK_ID_CURR"],ts["previous"]["SK_ID_CURR"])

>> re4 = Relationship(ts["previous"]["SK_ID_PREV"],ts["cash"]["SK_ID_PREV"])

>> re5 = Relationship(ts["previous"]["SK_ID_PREV"],ts["credit"]["SK_ID_PREV"])

>> re6 = Relationship(ts["previous"]["SK_ID_PREV"],ts["installments"]["SK_ID_PREV"])

>> ts.add_relationships([re1,re2,re3,re4,re5,re6])
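The six relationships form a tree rooted at `app_train`. The standalone sketch below (plain Python, independent of the library; `table_depths` is a hypothetical helper) computes how many parent-to-child hops separate each table from the target. This appears to be what the `max_depth` argument in step 7 limits, although that reading is an assumption based on the generated feature names.

```python
from collections import defaultdict, deque

# (parent table, child table) pairs mirroring re1..re6
RELATIONSHIPS = [
    ("app_train", "bureau"),
    ("bureau", "bureau_balance"),
    ("app_train", "previous"),
    ("previous", "cash"),
    ("previous", "credit"),
    ("previous", "installments"),
]


def table_depths(relationships, target):
    """Breadth-first walk from `target` down to its descendants, recording hop counts."""
    children = defaultdict(list)
    for parent, child in relationships:
        children[parent].append(child)
    depths = {target: 0}
    queue = deque([target])
    while queue:
        table = queue.popleft()
        for child in children[table]:
            if child not in depths:
                depths[child] = depths[table] + 1
                queue.append(child)
    return depths


depths = table_depths(RELATIONSHIPS, "app_train")
print(depths)
# {'app_train': 0, 'bureau': 1, 'previous': 1, 'bureau_balance': 2,
#  'cash': 2, 'credit': 2, 'installments': 2}
```

Every table is at most two hops from `app_train`, so a depth of 2 is enough to reach all of them.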

6. Print Available Aggregation Primitives

>> fts.print_agg_prims()
['avg', 'count', 'kurtosis', 'skewness', 'stddev', 'min', 'max', 'sum']

7. Run DFS To Generate Features

>> fts.dfs(tableset=ts, agg_primitives=["sum",'min','max','avg'],target_table='app_train',max_depth=2,verbose=False)
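Conceptually, each generated feature aggregates the rows of a child table per parent key. The plain-Python sketch below mimics one level of that process for the four primitives used above and follows the `{primitive}_{table}_{column}` naming that appears in the output of step 8. It is an illustration only, not the library's Spark implementation; `aggregate_child` and the sample rows are made up.

```python
from collections import defaultdict
from statistics import mean

PRIMITIVES = {"sum": sum, "min": min, "max": max, "avg": mean}


def aggregate_child(rows, child_name, key, columns, primitives=PRIMITIVES):
    """Group child rows by `key` and apply every primitive to every column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    features = {}
    for parent_id, members in groups.items():
        feats = {}
        for prim_name, func in primitives.items():
            for col in columns:
                # Skip missing values; an all-missing group yields None
                values = [m[col] for m in members if m[col] is not None]
                feats[f"{prim_name}_{child_name}_{col}"] = func(values) if values else None
        features[parent_id] = feats
    return features


# Made-up rows shaped like the bureau table
bureau_rows = [
    {"SK_ID_CURR": 1, "AMT_CREDIT_SUM": 100.0},
    {"SK_ID_CURR": 1, "AMT_CREDIT_SUM": 300.0},
    {"SK_ID_CURR": 2, "AMT_CREDIT_SUM": 50.0},
]
features = aggregate_child(bureau_rows, "bureau", "SK_ID_CURR", ["AMT_CREDIT_SUM"])
print(features[1])
# {'sum_bureau_AMT_CREDIT_SUM': 400.0, 'min_bureau_AMT_CREDIT_SUM': 100.0,
#  'max_bureau_AMT_CREDIT_SUM': 300.0, 'avg_bureau_AMT_CREDIT_SUM': 200.0}
```

Applying the same step again to the aggregated columns of a child table would give stacked names such as `sum_bureau_max_bureau_balance_MONTHS_BALANCE`.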

8. Get Features

>> new_app_train = ts["app_train"].df

>> old_len = ts["app_train"].old_len

>> print('new_generate_feature_len:{}'.format(len(new_app_train.columns)-old_len))
new_generate_feature_len:636

>> print(new_app_train.columns[old_len:])
['sum_bureau_max_bureau_balance_MONTHS_BALANCE', 'sum_bureau_CREDIT_DAY_OVERDUE', 'sum_bureau_CNT_CREDIT_PROLONG', 'sum_bureau_DAYS_CREDIT_ENDDATE', 'sum_bureau_DAYS_CREDIT_UPDATE', 'sum_bureau_min_bureau_balance_MONTHS_BALANCE', 'sum_bureau_sum_bureau_balance_MONTHS_BALANCE', 'sum_bureau_AMT_CREDIT_MAX_OVERDUE', 'sum_bureau_DAYS_ENDDATE_FACT', 'sum_bureau_AMT_CREDIT_SUM_LIMIT', 'sum_bureau_DAYS_CREDIT', 'sum_bureau_avg_bureau_balance_MONTHS_BALANCE', 'sum_bureau_AMT_CREDIT_SUM_DEBT', 'sum_bureau_AMT_CREDIT_SUM', 'sum_bureau_AMT_ANNUITY', 'sum_bureau_AMT_CREDIT_SUM_OVERDUE', 'min_bureau_max_bureau_balance_MONTHS_BALANCE', 'min_bureau_CREDIT_DAY_OVERDUE', 'min_bureau_CNT_CREDIT_PROLONG', 'min_bureau_DAYS_CREDIT_ENDDATE', 'min_bureau_DAYS_CREDIT_UPDATE', 'min_bureau_min_bureau_balance_MONTHS_BALANCE', 'min_bureau_sum_bureau_balance_MONTHS_BALANCE', 'min_bureau_AMT_CREDIT_MAX_OVERDUE', 'min_bureau_DAYS_ENDDATE_FACT', 'min_bureau_AMT_CREDIT_SUM_LIMIT', 'min_bureau_DAYS_CREDIT', 'min_bureau_avg_bureau_balance_MONTHS_BALANCE', 'min_bureau_AMT_CREDIT_SUM_DEBT', 'min_bureau_AMT_CREDIT_SUM', 'min_bureau_AMT_ANNUITY', 'min_bureau_AMT_CREDIT_SUM_OVERDUE', 'max_bureau_max_bureau_balance_MONTHS_BALANCE', 'max_bureau_CREDIT_DAY_OVERDUE', 'max_bureau_CNT_CREDIT_PROLONG', 'max_bureau_DAYS_CREDIT_ENDDATE', 'max_bureau_DAYS_CREDIT_UPDATE', 'max_bureau_min_bureau_balance_MONTHS_BALANCE', 'max_bureau_sum_bureau_balance_MONTHS_BALANCE', 'max_bureau_AMT_CREDIT_MAX_OVERDUE', 'max_bureau_DAYS_ENDDATE_FACT', 'max_bureau_AMT_CREDIT_SUM_LIMIT', 'max_bureau_DAYS_CREDIT', 'max_bureau_avg_bureau_balance_MONTHS_BALANCE', 'max_bureau_AMT_CREDIT_SUM_DEBT', 'max_bureau_AMT_CREDIT_SUM', 'max_bureau_AMT_ANNUITY', 'max_bureau_AMT_CREDIT_SUM_OVERDUE', 'avg_bureau_max_bureau_balance_MONTHS_BALANCE', 'avg_bureau_CREDIT_DAY_OVERDUE', 'avg_bureau_CNT_CREDIT_PROLONG', 'avg_bureau_DAYS_CREDIT_ENDDATE', 'avg_bureau_DAYS_CREDIT_UPDATE', 'avg_bureau_min_bureau_balance_MONTHS_BALANCE', 
'avg_bureau_sum_bureau_balance_MONTHS_BALANCE', 'avg_bureau_AMT_CREDIT_MAX_OVERDUE', 'avg_bureau_DAYS_ENDDATE_FACT', 'avg_bureau_AMT_CREDIT_SUM_LIMIT', 'avg_bureau_DAYS_CREDIT', 'avg_bureau_avg_bureau_balance_MONTHS_BALANCE', 'avg_bureau_AMT_CREDIT_SUM_DEBT', 'avg_bureau_AMT_CREDIT_SUM', 'avg_bureau_AMT_ANNUITY', 'avg_bureau_AMT_CREDIT_SUM_OVERDUE', 'sum_previous_sum_credit_AMT_INST_MIN_REGULARITY', 'sum_previous_min_credit_AMT_RECIVABLE', 'sum_previous_sum_credit_AMT_PAYMENT_CURRENT', 'sum_previous_min_credit_CNT_DRAWINGS_POS_CURRENT', 'sum_previous_HOUR_APPR_PROCESS_START', 'sum_previous_min_credit_AMT_CREDIT_LIMIT_ACTUAL', 'sum_previous_min_installments_AMT_PAYMENT', 'sum_previous_sum_credit_CNT_DRAWINGS_OTHER_CURRENT', 'sum_previous_min_credit_AMT_RECEIVABLE_PRINCIPAL', 'sum_previous_avg_credit_AMT_DRAWINGS_OTHER_CURRENT', 'sum_previous_sum_installments_DAYS_INSTALMENT', 'sum_previous_DAYS_LAST_DUE_1ST_VERSION', 'sum_previous_min_credit_CNT_DRAWINGS_CURRENT', 'sum_previous_avg_credit_AMT_TOTAL_RECEIVABLE', 'sum_previous_max_credit_AMT_PAYMENT_CURRENT', 'sum_previous_sum_credit_CNT_DRAWINGS_CURRENT', 'sum_previous_min_cash_CNT_INSTALMENT', 'sum_previous_sum_credit_AMT_CREDIT_LIMIT_ACTUAL', 'sum_previous_DAYS_FIRST_DRAWING', 'sum_previous_avg_credit_AMT_BALANCE', 'sum_previous_sum_credit_AMT_TOTAL_RECEIVABLE', 'sum_previous_sum_credit_SK_DPD', 'sum_previous_max_installments_NUM_INSTALMENT_VERSION', 'sum_previous_sum_cash_CNT_INSTALMENT_FUTURE', 'sum_previous_max_cash_MONTHS_BALANCE', 'sum_previous_sum_credit_CNT_INSTALMENT_MATURE_CUM', 'sum_previous_AMT_APPLICATION', 'sum_previous_avg_credit_AMT_DRAWINGS_POS_CURRENT', 'sum_previous_avg_credit_AMT_PAYMENT_TOTAL_CURRENT', 'sum_previous_sum_cash_SK_DPD_DEF', 'sum_previous_max_credit_CNT_INSTALMENT_MATURE_CUM', 'sum_previous_avg_cash_CNT_INSTALMENT_FUTURE', 'sum_previous_min_cash_SK_DPD_DEF', 'sum_previous_sum_credit_MONTHS_BALANCE', 'sum_previous_max_credit_CNT_DRAWINGS_CURRENT', 'sum_previous_DAYS_DECISION', 
'sum_previous_NFLAG_LAST_APPL_IN_DAY', 'sum_previous_avg_credit_AMT_RECEIVABLE_PRINCIPAL', 'sum_previous_avg_cash_CNT_INSTALMENT', 'sum_previous_sum_credit_AMT_DRAWINGS_POS_CURRENT', 'sum_previous_avg_cash_SK_DPD_DEF', 'sum_previous_min_credit_MONTHS_BALANCE', 'sum_previous_sum_credit_SK_DPD_DEF', 'sum_previous_max_credit_AMT_PAYMENT_TOTAL_CURRENT', 'sum_previous_max_credit_CNT_DRAWINGS_OTHER_CURRENT', 'sum_previous_avg_credit_AMT_CREDIT_LIMIT_ACTUAL', 'sum_previous_sum_installments_NUM_INSTALMENT_NUMBER', 'sum_previous_avg_credit_MONTHS_BALANCE', 'sum_previous_CNT_PAYMENT', 'sum_previous_DAYS_FIRST_DUE', 'sum_previous_sum_credit_CNT_DRAWINGS_ATM_CURRENT', 'sum_previous_max_cash_CNT_INSTALMENT_FUTURE', 'sum_previous_max_credit_CNT_DRAWINGS_POS_CURRENT', 'sum_previous_max_credit_AMT_CREDIT_LIMIT_ACTUAL', 'sum_previous_max_credit_AMT_DRAWINGS_ATM_CURRENT', 'sum_previous_DAYS_TERMINATION', 'sum_previous_max_credit_AMT_DRAWINGS_CURRENT', 'sum_previous_max_credit_AMT_TOTAL_RECEIVABLE', 'sum_previous_min_credit_CNT_DRAWINGS_OTHER_CURRENT', 'sum_previous_avg_installments_NUM_INSTALMENT_VERSION', 'sum_previous_min_installments_DAYS_INSTALMENT', 'sum_previous_min_installments_NUM_INSTALMENT_VERSION', 'sum_previous_min_credit_CNT_INSTALMENT_MATURE_CUM', 'sum_previous_RATE_INTEREST_PRIMARY', 'sum_previous_avg_credit_SK_DPD', 'sum_previous_min_installments_AMT_INSTALMENT', 'sum_previous_max_credit_CNT_DRAWINGS_ATM_CURRENT', 'sum_previous_min_credit_AMT_DRAWINGS_POS_CURRENT', 'sum_previous_avg_credit_CNT_INSTALMENT_MATURE_CUM', 'sum_previous_max_credit_SK_DPD_DEF', 'sum_previous_avg_credit_AMT_RECIVABLE', 'sum_previous_avg_credit_AMT_DRAWINGS_ATM_CURRENT', 'sum_previous_max_credit_AMT_INST_MIN_REGULARITY', 'sum_previous_avg_installments_NUM_INSTALMENT_NUMBER', 'sum_previous_max_credit_AMT_DRAWINGS_OTHER_CURRENT', 'sum_previous_max_credit_SK_DPD', 'sum_previous_AMT_DOWN_PAYMENT', 'sum_previous_avg_credit_CNT_DRAWINGS_POS_CURRENT', 'sum_previous_min_credit_SK_DPD_DEF', 
'sum_previous_max_credit_AMT_BALANCE', 'sum_previous_sum_installments_AMT_PAYMENT', 'sum_previous_max_installments_AMT_PAYMENT', 'sum_previous_min_credit_AMT_PAYMENT_CURRENT', 'sum_previous_max_cash_SK_DPD_DEF', 'sum_previous_min_installments_DAYS_ENTRY_PAYMENT', 'sum_previous_max_credit_AMT_DRAWINGS_POS_CURRENT', 'sum_previous_min_cash_CNT_INSTALMENT_FUTURE', 'sum_previous_min_credit_AMT_PAYMENT_TOTAL_CURRENT', 'sum_previous_max_credit_AMT_RECEIVABLE_PRINCIPAL', 'sum_previous_sum_cash_CNT_INSTALMENT', 'sum_previous_min_credit_AMT_DRAWINGS_OTHER_CURRENT', 'sum_previous_sum_credit_AMT_BALANCE', 'sum_previous_avg_credit_AMT_PAYMENT_CURRENT', 'sum_previous_avg_credit_SK_DPD_DEF', 'sum_previous_AMT_ANNUITY', 'sum_previous_min_credit_AMT_TOTAL_RECEIVABLE', 'sum_previous_max_installments_DAYS_ENTRY_PAYMENT', 'sum_previous_sum_cash_SK_DPD', 'sum_previous_sum_credit_CNT_DRAWINGS_POS_CURRENT', 'sum_previous_min_cash_MONTHS_BALANCE', 'sum_previous_sum_installments_AMT_INSTALMENT', 'sum_previous_avg_cash_MONTHS_BALANCE', 'sum_previous_min_credit_AMT_DRAWINGS_CURRENT', 'sum_previous_avg_cash_SK_DPD', 'sum_previous_sum_installments_DAYS_ENTRY_PAYMENT', 'sum_previous_sum_credit_AMT_RECEIVABLE_PRINCIPAL', 'sum_previous_min_credit_AMT_INST_MIN_REGULARITY', 'sum_previous_avg_credit_AMT_INST_MIN_REGULARITY', 'sum_previous_max_cash_SK_DPD', 'sum_previous_avg_credit_CNT_DRAWINGS_OTHER_CURRENT', 'sum_previous_max_cash_CNT_INSTALMENT', 'sum_previous_avg_installments_DAYS_INSTALMENT', 'sum_previous_sum_cash_MONTHS_BALANCE', 'sum_previous_min_credit_AMT_DRAWINGS_ATM_CURRENT', 'sum_previous_AMT_CREDIT', 'sum_previous_RATE_INTEREST_PRIVILEGED', 'sum_previous_max_installments_AMT_INSTALMENT', 'sum_previous_avg_credit_AMT_DRAWINGS_CURRENT', 'sum_previous_NFLAG_INSURED_ON_APPROVAL', 'sum_previous_avg_installments_DAYS_ENTRY_PAYMENT', 'sum_previous_min_credit_AMT_BALANCE', 'sum_previous_sum_credit_AMT_PAYMENT_TOTAL_CURRENT', 'sum_previous_min_cash_SK_DPD', 
'sum_previous_sum_credit_AMT_DRAWINGS_ATM_CURRENT', 'sum_previous_avg_installments_AMT_INSTALMENT', 'sum_previous_sum_credit_AMT_RECIVABLE', 'sum_previous_sum_installments_NUM_INSTALMENT_VERSION', 'sum_previous_SELLERPLACE_AREA', 'sum_previous_max_credit_MONTHS_BALANCE', 'sum_previous_sum_credit_AMT_DRAWINGS_CURRENT', 'sum_previous_avg_installments_AMT_PAYMENT', 'sum_previous_avg_credit_CNT_DRAWINGS_ATM_CURRENT', 'sum_previous_max_installments_NUM_INSTALMENT_NUMBER', 'sum_previous_DAYS_LAST_DUE', 'sum_previous_max_installments_DAYS_INSTALMENT', 'sum_previous_avg_credit_CNT_DRAWINGS_CURRENT', 'sum_previous_sum_credit_AMT_DRAWINGS_OTHER_CURRENT', 'sum_previous_min_installments_NUM_INSTALMENT_NUMBER', 'sum_previous_AMT_GOODS_PRICE', 'sum_previous_max_credit_AMT_RECIVABLE', 'sum_previous_RATE_DOWN_PAYMENT', 'sum_previous_min_credit_SK_DPD', 'sum_previous_min_credit_CNT_DRAWINGS_ATM_CURRENT', 'min_previous_sum_credit_AMT_INST_MIN_REGULARITY', 'min_previous_min_credit_AMT_RECIVABLE', 'min_previous_sum_credit_AMT_PAYMENT_CURRENT', 'min_previous_min_credit_CNT_DRAWINGS_POS_CURRENT', 'min_previous_HOUR_APPR_PROCESS_START', 'min_previous_min_credit_AMT_CREDIT_LIMIT_ACTUAL', 'min_previous_min_installments_AMT_PAYMENT', 'min_previous_sum_credit_CNT_DRAWINGS_OTHER_CURRENT', 'min_previous_min_credit_AMT_RECEIVABLE_PRINCIPAL', 'min_previous_avg_credit_AMT_DRAWINGS_OTHER_CURRENT', 'min_previous_sum_installments_DAYS_INSTALMENT', 'min_previous_DAYS_LAST_DUE_1ST_VERSION', 'min_previous_min_credit_CNT_DRAWINGS_CURRENT', 'min_previous_avg_credit_AMT_TOTAL_RECEIVABLE', 'min_previous_max_credit_AMT_PAYMENT_CURRENT', 'min_previous_sum_credit_CNT_DRAWINGS_CURRENT', 'min_previous_min_cash_CNT_INSTALMENT', 'min_previous_sum_credit_AMT_CREDIT_LIMIT_ACTUAL', 'min_previous_DAYS_FIRST_DRAWING', 'min_previous_avg_credit_AMT_BALANCE', 'min_previous_sum_credit_AMT_TOTAL_RECEIVABLE', 'min_previous_sum_credit_SK_DPD', 'min_previous_max_installments_NUM_INSTALMENT_VERSION', 
'min_previous_sum_cash_CNT_INSTALMENT_FUTURE', 'min_previous_max_cash_MONTHS_BALANCE', 'min_previous_sum_credit_CNT_INSTALMENT_MATURE_CUM', 'min_previous_AMT_APPLICATION', 'min_previous_avg_credit_AMT_DRAWINGS_POS_CURRENT', 'min_previous_avg_credit_AMT_PAYMENT_TOTAL_CURRENT', 'min_previous_sum_cash_SK_DPD_DEF', 'min_previous_max_credit_CNT_INSTALMENT_MATURE_CUM', 'min_previous_avg_cash_CNT_INSTALMENT_FUTURE', 'min_previous_min_cash_SK_DPD_DEF', 'min_previous_sum_credit_MONTHS_BALANCE', 'min_previous_max_credit_CNT_DRAWINGS_CURRENT', 'min_previous_DAYS_DECISION', 'min_previous_NFLAG_LAST_APPL_IN_DAY', 'min_previous_avg_credit_AMT_RECEIVABLE_PRINCIPAL', 'min_previous_avg_cash_CNT_INSTALMENT', 'min_previous_sum_credit_AMT_DRAWINGS_POS_CURRENT', 'min_previous_avg_cash_SK_DPD_DEF', 'min_previous_min_credit_MONTHS_BALANCE', 'min_previous_sum_credit_SK_DPD_DEF', 'min_previous_max_credit_AMT_PAYMENT_TOTAL_CURRENT', 'min_previous_max_credit_CNT_DRAWINGS_OTHER_CURRENT', 'min_previous_avg_credit_AMT_CREDIT_LIMIT_ACTUAL', 'min_previous_sum_installments_NUM_INSTALMENT_NUMBER', 'min_previous_avg_credit_MONTHS_BALANCE', 'min_previous_CNT_PAYMENT', 'min_previous_DAYS_FIRST_DUE', 'min_previous_sum_credit_CNT_DRAWINGS_ATM_CURRENT', 'min_previous_max_cash_CNT_INSTALMENT_FUTURE', 'min_previous_max_credit_CNT_DRAWINGS_POS_CURRENT', 'min_previous_max_credit_AMT_CREDIT_LIMIT_ACTUAL', 'min_previous_max_credit_AMT_DRAWINGS_ATM_CURRENT', 'min_previous_DAYS_TERMINATION', 'min_previous_max_credit_AMT_DRAWINGS_CURRENT', 'min_previous_max_credit_AMT_TOTAL_RECEIVABLE', 'min_previous_min_credit_CNT_DRAWINGS_OTHER_CURRENT', 'min_previous_avg_installments_NUM_INSTALMENT_VERSION', 'min_previous_min_installments_DAYS_INSTALMENT', 'min_previous_min_installments_NUM_INSTALMENT_VERSION', 'min_previous_min_credit_CNT_INSTALMENT_MATURE_CUM', 'min_previous_RATE_INTEREST_PRIMARY', 'min_previous_avg_credit_SK_DPD', 'min_previous_min_installments_AMT_INSTALMENT', 
'min_previous_max_credit_CNT_DRAWINGS_ATM_CURRENT', 'min_previous_min_credit_AMT_DRAWINGS_POS_CURRENT', 'min_previous_avg_credit_CNT_INSTALMENT_MATURE_CUM', 'min_previous_max_credit_SK_DPD_DEF', 'min_previous_avg_credit_AMT_RECIVABLE', 'min_previous_avg_credit_AMT_DRAWINGS_ATM_CURRENT', 'min_previous_max_credit_AMT_INST_MIN_REGULARITY', 'min_previous_avg_installments_NUM_INSTALMENT_NUMBER', 'min_previous_max_credit_AMT_DRAWINGS_OTHER_CURRENT', 'min_previous_max_credit_SK_DPD', 'min_previous_AMT_DOWN_PAYMENT', 'min_previous_avg_credit_CNT_DRAWINGS_POS_CURRENT', 'min_previous_min_credit_SK_DPD_DEF', 'min_previous_max_credit_AMT_BALANCE', 'min_previous_sum_installments_AMT_PAYMENT', 'min_previous_max_installments_AMT_PAYMENT', 'min_previous_min_credit_AMT_PAYMENT_CURRENT', 'min_previous_max_cash_SK_DPD_DEF', 'min_previous_min_installments_DAYS_ENTRY_PAYMENT', 'min_previous_max_credit_AMT_DRAWINGS_POS_CURRENT', 'min_previous_min_cash_CNT_INSTALMENT_FUTURE', 'min_previous_min_credit_AMT_PAYMENT_TOTAL_CURRENT', 'min_previous_max_credit_AMT_RECEIVABLE_PRINCIPAL', 'min_previous_sum_cash_CNT_INSTALMENT', 'min_previous_min_credit_AMT_DRAWINGS_OTHER_CURRENT', 'min_previous_sum_credit_AMT_BALANCE', 'min_previous_avg_credit_AMT_PAYMENT_CURRENT', 'min_previous_avg_credit_SK_DPD_DEF', 'min_previous_AMT_ANNUITY', 'min_previous_min_credit_AMT_TOTAL_RECEIVABLE', 'min_previous_max_installments_DAYS_ENTRY_PAYMENT', 'min_previous_sum_cash_SK_DPD', 'min_previous_sum_credit_CNT_DRAWINGS_POS_CURRENT', 'min_previous_min_cash_MONTHS_BALANCE', 'min_previous_sum_installments_AMT_INSTALMENT', 'min_previous_avg_cash_MONTHS_BALANCE', 'min_previous_min_credit_AMT_DRAWINGS_CURRENT', 'min_previous_avg_cash_SK_DPD', 'min_previous_sum_installments_DAYS_ENTRY_PAYMENT', 'min_previous_sum_credit_AMT_RECEIVABLE_PRINCIPAL', 'min_previous_min_credit_AMT_INST_MIN_REGULARITY', 'min_previous_avg_credit_AMT_INST_MIN_REGULARITY', 'min_previous_max_cash_SK_DPD', 
'min_previous_avg_credit_CNT_DRAWINGS_OTHER_CURRENT', 'min_previous_max_cash_CNT_INSTALMENT', 'min_previous_avg_installments_DAYS_INSTALMENT', 'min_previous_sum_cash_MONTHS_BALANCE', 'min_previous_min_credit_AMT_DRAWINGS_ATM_CURRENT', 'min_previous_AMT_CREDIT', 'min_previous_RATE_INTEREST_PRIVILEGED', 'min_previous_max_installments_AMT_INSTALMENT', 'min_previous_avg_credit_AMT_DRAWINGS_CURRENT', 'min_previous_NFLAG_INSURED_ON_APPROVAL', 'min_previous_avg_installments_DAYS_ENTRY_PAYMENT', 'min_previous_min_credit_AMT_BALANCE', 'min_previous_sum_credit_AMT_PAYMENT_TOTAL_CURRENT', 'min_previous_min_cash_SK_DPD', 'min_previous_sum_credit_AMT_DRAWINGS_ATM_CURRENT', 'min_previous_avg_installments_AMT_INSTALMENT', 'min_previous_sum_credit_AMT_RECIVABLE', 'min_previous_sum_installments_NUM_INSTALMENT_VERSION', 'min_previous_SELLERPLACE_AREA', 'min_previous_max_credit_MONTHS_BALANCE', 'min_previous_sum_credit_AMT_DRAWINGS_CURRENT', 'min_previous_avg_installments_AMT_PAYMENT', 'min_previous_avg_credit_CNT_DRAWINGS_ATM_CURRENT', 'min_previous_max_installments_NUM_INSTALMENT_NUMBER', 'min_previous_DAYS_LAST_DUE', 'min_previous_max_installments_DAYS_INSTALMENT', 'min_previous_avg_credit_CNT_DRAWINGS_CURRENT', 'min_previous_sum_credit_AMT_DRAWINGS_OTHER_CURRENT', 'min_previous_min_installments_NUM_INSTALMENT_NUMBER', 'min_previous_AMT_GOODS_PRICE', 'min_previous_max_credit_AMT_RECIVABLE', 'min_previous_RATE_DOWN_PAYMENT', 'min_previous_min_credit_SK_DPD', 'min_previous_min_credit_CNT_DRAWINGS_ATM_CURRENT', 'max_previous_sum_credit_AMT_INST_MIN_REGULARITY', 'max_previous_min_credit_AMT_RECIVABLE', 'max_previous_sum_credit_AMT_PAYMENT_CURRENT', 'max_previous_min_credit_CNT_DRAWINGS_POS_CURRENT', 'max_previous_HOUR_APPR_PROCESS_START', 'max_previous_min_credit_AMT_CREDIT_LIMIT_ACTUAL', 'max_previous_min_installments_AMT_PAYMENT', 'max_previous_sum_credit_CNT_DRAWINGS_OTHER_CURRENT', 'max_previous_min_credit_AMT_RECEIVABLE_PRINCIPAL', 
'max_previous_avg_credit_AMT_DRAWINGS_OTHER_CURRENT', 'max_previous_sum_installments_DAYS_INSTALMENT', 'max_previous_DAYS_LAST_DUE_1ST_VERSION', 'max_previous_min_credit_CNT_DRAWINGS_CURRENT', 'max_previous_avg_credit_AMT_TOTAL_RECEIVABLE', 'max_previous_max_credit_AMT_PAYMENT_CURRENT', 'max_previous_sum_credit_CNT_DRAWINGS_CURRENT', 'max_previous_min_cash_CNT_INSTALMENT', 'max_previous_sum_credit_AMT_CREDIT_LIMIT_ACTUAL', 'max_previous_DAYS_FIRST_DRAWING', 'max_previous_avg_credit_AMT_BALANCE', 'max_previous_sum_credit_AMT_TOTAL_RECEIVABLE', 'max_previous_sum_credit_SK_DPD', 'max_previous_max_installments_NUM_INSTALMENT_VERSION', 'max_previous_sum_cash_CNT_INSTALMENT_FUTURE', 'max_previous_max_cash_MONTHS_BALANCE', 'max_previous_sum_credit_CNT_INSTALMENT_MATURE_CUM', 'max_previous_AMT_APPLICATION', 'max_previous_avg_credit_AMT_DRAWINGS_POS_CURRENT', 'max_previous_avg_credit_AMT_PAYMENT_TOTAL_CURRENT', 'max_previous_sum_cash_SK_DPD_DEF', 'max_previous_max_credit_CNT_INSTALMENT_MATURE_CUM', 'max_previous_avg_cash_CNT_INSTALMENT_FUTURE', 'max_previous_min_cash_SK_DPD_DEF', 'max_previous_sum_credit_MONTHS_BALANCE', 'max_previous_max_credit_CNT_DRAWINGS_CURRENT', 'max_previous_DAYS_DECISION', 'max_previous_NFLAG_LAST_APPL_IN_DAY', 'max_previous_avg_credit_AMT_RECEIVABLE_PRINCIPAL', 'max_previous_avg_cash_CNT_INSTALMENT', 'max_previous_sum_credit_AMT_DRAWINGS_POS_CURRENT', 'max_previous_avg_cash_SK_DPD_DEF', 'max_previous_min_credit_MONTHS_BALANCE', 'max_previous_sum_credit_SK_DPD_DEF', 'max_previous_max_credit_AMT_PAYMENT_TOTAL_CURRENT', 'max_previous_max_credit_CNT_DRAWINGS_OTHER_CURRENT', 'max_previous_avg_credit_AMT_CREDIT_LIMIT_ACTUAL', 'max_previous_sum_installments_NUM_INSTALMENT_NUMBER', 'max_previous_avg_credit_MONTHS_BALANCE', 'max_previous_CNT_PAYMENT', 'max_previous_DAYS_FIRST_DUE', 'max_previous_sum_credit_CNT_DRAWINGS_ATM_CURRENT', 'max_previous_max_cash_CNT_INSTALMENT_FUTURE', 'max_previous_max_credit_CNT_DRAWINGS_POS_CURRENT', 
'max_previous_max_credit_AMT_CREDIT_LIMIT_ACTUAL', 'max_previous_max_credit_AMT_DRAWINGS_ATM_CURRENT', 'max_previous_DAYS_TERMINATION', 'max_previous_max_credit_AMT_DRAWINGS_CURRENT', 'max_previous_max_credit_AMT_TOTAL_RECEIVABLE', 'max_previous_min_credit_CNT_DRAWINGS_OTHER_CURRENT', 'max_previous_avg_installments_NUM_INSTALMENT_VERSION', 'max_previous_min_installments_DAYS_INSTALMENT', 'max_previous_min_installments_NUM_INSTALMENT_VERSION', 'max_previous_min_credit_CNT_INSTALMENT_MATURE_CUM', 'max_previous_RATE_INTEREST_PRIMARY', 'max_previous_avg_credit_SK_DPD', 'max_previous_min_installments_AMT_INSTALMENT', 'max_previous_max_credit_CNT_DRAWINGS_ATM_CURRENT', 'max_previous_min_credit_AMT_DRAWINGS_POS_CURRENT', 'max_previous_avg_credit_CNT_INSTALMENT_MATURE_CUM', 'max_previous_max_credit_SK_DPD_DEF', 'max_previous_avg_credit_AMT_RECIVABLE', 'max_previous_avg_credit_AMT_DRAWINGS_ATM_CURRENT', 'max_previous_max_credit_AMT_INST_MIN_REGULARITY', 'max_previous_avg_installments_NUM_INSTALMENT_NUMBER', 'max_previous_max_credit_AMT_DRAWINGS_OTHER_CURRENT', 'max_previous_max_credit_SK_DPD', 'max_previous_AMT_DOWN_PAYMENT', 'max_previous_avg_credit_CNT_DRAWINGS_POS_CURRENT', 'max_previous_min_credit_SK_DPD_DEF', 'max_previous_max_credit_AMT_BALANCE', 'max_previous_sum_installments_AMT_PAYMENT', 'max_previous_max_installments_AMT_PAYMENT', 'max_previous_min_credit_AMT_PAYMENT_CURRENT', 'max_previous_max_cash_SK_DPD_DEF', 'max_previous_min_installments_DAYS_ENTRY_PAYMENT', 'max_previous_max_credit_AMT_DRAWINGS_POS_CURRENT', 'max_previous_min_cash_CNT_INSTALMENT_FUTURE', 'max_previous_min_credit_AMT_PAYMENT_TOTAL_CURRENT', 'max_previous_max_credit_AMT_RECEIVABLE_PRINCIPAL', 'max_previous_sum_cash_CNT_INSTALMENT', 'max_previous_min_credit_AMT_DRAWINGS_OTHER_CURRENT', 'max_previous_sum_credit_AMT_BALANCE', 'max_previous_avg_credit_AMT_PAYMENT_CURRENT', 'max_previous_avg_credit_SK_DPD_DEF', 'max_previous_AMT_ANNUITY', 'max_previous_min_credit_AMT_TOTAL_RECEIVABLE', 
'max_previous_max_installments_DAYS_ENTRY_PAYMENT', 'max_previous_sum_cash_SK_DPD', 'max_previous_sum_credit_CNT_DRAWINGS_POS_CURRENT', 'max_previous_min_cash_MONTHS_BALANCE', 'max_previous_sum_installments_AMT_INSTALMENT', 'max_previous_avg_cash_MONTHS_BALANCE', 'max_previous_min_credit_AMT_DRAWINGS_CURRENT', 'max_previous_avg_cash_SK_DPD', 'max_previous_sum_installments_DAYS_ENTRY_PAYMENT', 'max_previous_sum_credit_AMT_RECEIVABLE_PRINCIPAL', 'max_previous_min_credit_AMT_INST_MIN_REGULARITY', 'max_previous_avg_credit_AMT_INST_MIN_REGULARITY', 'max_previous_max_cash_SK_DPD', 'max_previous_avg_credit_CNT_DRAWINGS_OTHER_CURRENT', 'max_previous_max_cash_CNT_INSTALMENT', 'max_previous_avg_installments_DAYS_INSTALMENT', 'max_previous_sum_cash_MONTHS_BALANCE', 'max_previous_min_credit_AMT_DRAWINGS_ATM_CURRENT', 'max_previous_AMT_CREDIT', 'max_previous_RATE_INTEREST_PRIVILEGED', 'max_previous_max_installments_AMT_INSTALMENT', 'max_previous_avg_credit_AMT_DRAWINGS_CURRENT', 'max_previous_NFLAG_INSURED_ON_APPROVAL', 'max_previous_avg_installments_DAYS_ENTRY_PAYMENT', 'max_previous_min_credit_AMT_BALANCE', 'max_previous_sum_credit_AMT_PAYMENT_TOTAL_CURRENT', 'max_previous_min_cash_SK_DPD', 'max_previous_sum_credit_AMT_DRAWINGS_ATM_CURRENT', 'max_previous_avg_installments_AMT_INSTALMENT', 'max_previous_sum_credit_AMT_RECIVABLE', 'max_previous_sum_installments_NUM_INSTALMENT_VERSION', 'max_previous_SELLERPLACE_AREA', 'max_previous_max_credit_MONTHS_BALANCE', 'max_previous_sum_credit_AMT_DRAWINGS_CURRENT', 'max_previous_avg_installments_AMT_PAYMENT', 'max_previous_avg_credit_CNT_DRAWINGS_ATM_CURRENT', 'max_previous_max_installments_NUM_INSTALMENT_NUMBER', 'max_previous_DAYS_LAST_DUE', 'max_previous_max_installments_DAYS_INSTALMENT', 'max_previous_avg_credit_CNT_DRAWINGS_CURRENT', 'max_previous_sum_credit_AMT_DRAWINGS_OTHER_CURRENT', 'max_previous_min_installments_NUM_INSTALMENT_NUMBER', 'max_previous_AMT_GOODS_PRICE', 'max_previous_max_credit_AMT_RECIVABLE', 
'max_previous_RATE_DOWN_PAYMENT', 'max_previous_min_credit_SK_DPD', 'max_previous_min_credit_CNT_DRAWINGS_ATM_CURRENT', 'avg_previous_sum_credit_AMT_INST_MIN_REGULARITY', 'avg_previous_min_credit_AMT_RECIVABLE', 'avg_previous_sum_credit_AMT_PAYMENT_CURRENT', 'avg_previous_min_credit_CNT_DRAWINGS_POS_CURRENT', 'avg_previous_HOUR_APPR_PROCESS_START', 'avg_previous_min_credit_AMT_CREDIT_LIMIT_ACTUAL', 'avg_previous_min_installments_AMT_PAYMENT', 'avg_previous_sum_credit_CNT_DRAWINGS_OTHER_CURRENT', 'avg_previous_min_credit_AMT_RECEIVABLE_PRINCIPAL', 'avg_previous_avg_credit_AMT_DRAWINGS_OTHER_CURRENT', 'avg_previous_sum_installments_DAYS_INSTALMENT', 'avg_previous_DAYS_LAST_DUE_1ST_VERSION', 'avg_previous_min_credit_CNT_DRAWINGS_CURRENT', 'avg_previous_avg_credit_AMT_TOTAL_RECEIVABLE', 'avg_previous_max_credit_AMT_PAYMENT_CURRENT', 'avg_previous_sum_credit_CNT_DRAWINGS_CURRENT', 'avg_previous_min_cash_CNT_INSTALMENT', 'avg_previous_sum_credit_AMT_CREDIT_LIMIT_ACTUAL', 'avg_previous_DAYS_FIRST_DRAWING', 'avg_previous_avg_credit_AMT_BALANCE', 'avg_previous_sum_credit_AMT_TOTAL_RECEIVABLE', 'avg_previous_sum_credit_SK_DPD', 'avg_previous_max_installments_NUM_INSTALMENT_VERSION', 'avg_previous_sum_cash_CNT_INSTALMENT_FUTURE', 'avg_previous_max_cash_MONTHS_BALANCE', 'avg_previous_sum_credit_CNT_INSTALMENT_MATURE_CUM', 'avg_previous_AMT_APPLICATION', 'avg_previous_avg_credit_AMT_DRAWINGS_POS_CURRENT', 'avg_previous_avg_credit_AMT_PAYMENT_TOTAL_CURRENT', 'avg_previous_sum_cash_SK_DPD_DEF', 'avg_previous_max_credit_CNT_INSTALMENT_MATURE_CUM', 'avg_previous_avg_cash_CNT_INSTALMENT_FUTURE', 'avg_previous_min_cash_SK_DPD_DEF', 'avg_previous_sum_credit_MONTHS_BALANCE', 'avg_previous_max_credit_CNT_DRAWINGS_CURRENT', 'avg_previous_DAYS_DECISION', 'avg_previous_NFLAG_LAST_APPL_IN_DAY', 'avg_previous_avg_credit_AMT_RECEIVABLE_PRINCIPAL', 'avg_previous_avg_cash_CNT_INSTALMENT', 'avg_previous_sum_credit_AMT_DRAWINGS_POS_CURRENT', 'avg_previous_avg_cash_SK_DPD_DEF', 
'avg_previous_min_credit_MONTHS_BALANCE', 'avg_previous_sum_credit_SK_DPD_DEF', 'avg_previous_max_credit_AMT_PAYMENT_TOTAL_CURRENT', 'avg_previous_max_credit_CNT_DRAWINGS_OTHER_CURRENT', 'avg_previous_avg_credit_AMT_CREDIT_LIMIT_ACTUAL', 'avg_previous_sum_installments_NUM_INSTALMENT_NUMBER', 'avg_previous_avg_credit_MONTHS_BALANCE', 'avg_previous_CNT_PAYMENT', 'avg_previous_DAYS_FIRST_DUE', 'avg_previous_sum_credit_CNT_DRAWINGS_ATM_CURRENT', 'avg_previous_max_cash_CNT_INSTALMENT_FUTURE', 'avg_previous_max_credit_CNT_DRAWINGS_POS_CURRENT', 'avg_previous_max_credit_AMT_CREDIT_LIMIT_ACTUAL', 'avg_previous_max_credit_AMT_DRAWINGS_ATM_CURRENT', 'avg_previous_DAYS_TERMINATION', 'avg_previous_max_credit_AMT_DRAWINGS_CURRENT', 'avg_previous_max_credit_AMT_TOTAL_RECEIVABLE', 'avg_previous_min_credit_CNT_DRAWINGS_OTHER_CURRENT', 'avg_previous_avg_installments_NUM_INSTALMENT_VERSION', 'avg_previous_min_installments_DAYS_INSTALMENT', 'avg_previous_min_installments_NUM_INSTALMENT_VERSION', 'avg_previous_min_credit_CNT_INSTALMENT_MATURE_CUM', 'avg_previous_RATE_INTEREST_PRIMARY', 'avg_previous_avg_credit_SK_DPD', 'avg_previous_min_installments_AMT_INSTALMENT', 'avg_previous_max_credit_CNT_DRAWINGS_ATM_CURRENT', 'avg_previous_min_credit_AMT_DRAWINGS_POS_CURRENT', 'avg_previous_avg_credit_CNT_INSTALMENT_MATURE_CUM', 'avg_previous_max_credit_SK_DPD_DEF', 'avg_previous_avg_credit_AMT_RECIVABLE', 'avg_previous_avg_credit_AMT_DRAWINGS_ATM_CURRENT', 'avg_previous_max_credit_AMT_INST_MIN_REGULARITY', 'avg_previous_avg_installments_NUM_INSTALMENT_NUMBER', 'avg_previous_max_credit_AMT_DRAWINGS_OTHER_CURRENT', 'avg_previous_max_credit_SK_DPD', 'avg_previous_AMT_DOWN_PAYMENT', 'avg_previous_avg_credit_CNT_DRAWINGS_POS_CURRENT', 'avg_previous_min_credit_SK_DPD_DEF', 'avg_previous_max_credit_AMT_BALANCE', 'avg_previous_sum_installments_AMT_PAYMENT', 'avg_previous_max_installments_AMT_PAYMENT', 'avg_previous_min_credit_AMT_PAYMENT_CURRENT', 'avg_previous_max_cash_SK_DPD_DEF', 
'avg_previous_min_installments_DAYS_ENTRY_PAYMENT', 'avg_previous_max_credit_AMT_DRAWINGS_POS_CURRENT', 'avg_previous_min_cash_CNT_INSTALMENT_FUTURE', 'avg_previous_min_credit_AMT_PAYMENT_TOTAL_CURRENT', 'avg_previous_max_credit_AMT_RECEIVABLE_PRINCIPAL', 'avg_previous_sum_cash_CNT_INSTALMENT', 'avg_previous_min_credit_AMT_DRAWINGS_OTHER_CURRENT', 'avg_previous_sum_credit_AMT_BALANCE', 'avg_previous_avg_credit_AMT_PAYMENT_CURRENT', 'avg_previous_avg_credit_SK_DPD_DEF', 'avg_previous_AMT_ANNUITY', 'avg_previous_min_credit_AMT_TOTAL_RECEIVABLE', 'avg_previous_max_installments_DAYS_ENTRY_PAYMENT', 'avg_previous_sum_cash_SK_DPD', 'avg_previous_sum_credit_CNT_DRAWINGS_POS_CURRENT', 'avg_previous_min_cash_MONTHS_BALANCE', 'avg_previous_sum_installments_AMT_INSTALMENT', 'avg_previous_avg_cash_MONTHS_BALANCE', 'avg_previous_min_credit_AMT_DRAWINGS_CURRENT', 'avg_previous_avg_cash_SK_DPD', 'avg_previous_sum_installments_DAYS_ENTRY_PAYMENT', 'avg_previous_sum_credit_AMT_RECEIVABLE_PRINCIPAL', 'avg_previous_min_credit_AMT_INST_MIN_REGULARITY', 'avg_previous_avg_credit_AMT_INST_MIN_REGULARITY', 'avg_previous_max_cash_SK_DPD', 'avg_previous_avg_credit_CNT_DRAWINGS_OTHER_CURRENT', 'avg_previous_max_cash_CNT_INSTALMENT', 'avg_previous_avg_installments_DAYS_INSTALMENT', 'avg_previous_sum_cash_MONTHS_BALANCE', 'avg_previous_min_credit_AMT_DRAWINGS_ATM_CURRENT', 'avg_previous_AMT_CREDIT', 'avg_previous_RATE_INTEREST_PRIVILEGED', 'avg_previous_max_installments_AMT_INSTALMENT', 'avg_previous_avg_credit_AMT_DRAWINGS_CURRENT', 'avg_previous_NFLAG_INSURED_ON_APPROVAL', 'avg_previous_avg_installments_DAYS_ENTRY_PAYMENT', 'avg_previous_min_credit_AMT_BALANCE', 'avg_previous_sum_credit_AMT_PAYMENT_TOTAL_CURRENT', 'avg_previous_min_cash_SK_DPD', 'avg_previous_sum_credit_AMT_DRAWINGS_ATM_CURRENT', 'avg_previous_avg_installments_AMT_INSTALMENT', 'avg_previous_sum_credit_AMT_RECIVABLE', 'avg_previous_sum_installments_NUM_INSTALMENT_VERSION', 'avg_previous_SELLERPLACE_AREA', 
'avg_previous_max_credit_MONTHS_BALANCE', 'avg_previous_sum_credit_AMT_DRAWINGS_CURRENT', 'avg_previous_avg_installments_AMT_PAYMENT', 'avg_previous_avg_credit_CNT_DRAWINGS_ATM_CURRENT', 'avg_previous_max_installments_NUM_INSTALMENT_NUMBER', 'avg_previous_DAYS_LAST_DUE', 'avg_previous_max_installments_DAYS_INSTALMENT', 'avg_previous_avg_credit_CNT_DRAWINGS_CURRENT', 'avg_previous_sum_credit_AMT_DRAWINGS_OTHER_CURRENT', 'avg_previous_min_installments_NUM_INSTALMENT_NUMBER', 'avg_previous_AMT_GOODS_PRICE', 'avg_previous_max_credit_AMT_RECIVABLE', 'avg_previous_RATE_DOWN_PAYMENT', 'avg_previous_min_credit_SK_DPD', 'avg_previous_min_credit_CNT_DRAWINGS_ATM_CURRENT']
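Because every generated name starts with the primitive applied at the outermost level, the new columns can be summarized with a few lines of plain Python. The sketch below uses a handful of names copied from the output above; `outer_primitive` is a hypothetical helper, not a library function.

```python
from collections import Counter

# A few feature names taken from the printed column list
sample_names = [
    "sum_bureau_CREDIT_DAY_OVERDUE",
    "min_bureau_DAYS_CREDIT",
    "max_previous_AMT_CREDIT",
    "avg_previous_sum_cash_SK_DPD",
    "sum_bureau_max_bureau_balance_MONTHS_BALANCE",
]


def outer_primitive(name):
    """Return the leading primitive, i.e. the text before the first underscore."""
    return name.split("_", 1)[0]


counts = Counter(outer_primitive(name) for name in sample_names)
print(dict(counts))  # {'sum': 2, 'min': 1, 'max': 1, 'avg': 1}
```

On the real result, the same function could be applied to `new_app_train.columns[old_len:]` to count the features produced by each primitive.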

