A simplified version of featuretools for Spark
featuretoolsOnSpark
Featuretools is a Python library for automated feature engineering.
This repo is a simplified version of featuretools that reuses its automatic feature generation framework. Instead of the complex back-end architecture of featuretools, we mainly use Spark DataFrames, which makes the feature generation process faster (a speedup of 10x or more).
Installation
Install with pip
pip install featuretoolsOnSpark
Install from source
git clone https://github.com/giantcroc/featuretoolsOnSpark.git
cd featuretoolsOnSpark
python setup.py install
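After either install route, a quick check that the interpreter can locate the package may save some debugging later. The sketch below uses only the standard library. The helper name is_installed is made up for illustration, and checking for pyspark as well is an assumption based on the example below importing it.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # pyspark is checked too because the example in this README imports it.
    for name in ("pyspark", "featuretoolsOnSpark"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

A "missing" result usually means the install went into a different Python environment than the one running the check.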
Example
Below is an example of how to use the APIs of this repo. We chose the dataset from Kaggle's Home Credit Default Risk competition. The relationships between the tables are shown in the picture below.
First, make sure that all the required CSV files have been saved as Spark DataFrames (the code below reads them as tables in the home_credit database).
1. Create Spark Session
>> from pyspark.sql import SparkSession
>> spark = SparkSession \
       .builder \
       .appName("home-credit") \
       .enableHiveSupport() \
       .getOrCreate()
2. Get Spark DataFrame
>> app_train = spark.sql(''' select * from home_credit.app_train ''')
>> bureau = spark.sql(''' select * from home_credit.bureau ''')
>> bureau_balance = spark.sql(''' select * from home_credit.bureau_balance ''')
>> cash = spark.sql(''' select * from home_credit.cash ''')
>> credit = spark.sql(''' select * from home_credit.credit ''')
>> installments = spark.sql(''' select * from home_credit.installments ''')
>> previous = spark.sql(''' select * from home_credit.previous ''')
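The seven queries in step 2 follow one pattern, so they could also be generated from a list of table names. This is only a sketch: build_queries is a hypothetical helper, and the spark.sql call is left as a comment because it needs the live session from step 1.

```python
TABLES = ["app_train", "bureau", "bureau_balance", "cash",
          "credit", "installments", "previous"]


def build_queries(database, tables):
    """Map each table name to the `select *` query used in step 2."""
    return {name: f"select * from {database}.{name}" for name in tables}


queries = build_queries("home_credit", TABLES)

# With a live SparkSession named `spark` (created in step 1), the frames
# could then be loaded as:
#   frames = {name: spark.sql(q) for name, q in queries.items()}
for name, query in queries.items():
    print(name, "->", query)
```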
3. Create TableSet
>> import featuretoolsOnSpark as fts
>> ts = fts.TableSet("home_credit", no_change_columns=["SK_ID_PREV", "SK_ID_CURR", "SK_ID_BUREAU"], verbose=False)
4. Create Tables From Spark DataFrame
>> ts.table_from_dataframe(table_id="bureau_balance", dataframe=bureau_balance, index='bureau_balance_id', make_index=True)
>> ts.table_from_dataframe(table_id="app_train", dataframe=app_train, index='SK_ID_CURR')
>> ts.table_from_dataframe(table_id="bureau", dataframe=bureau, index='SK_ID_BUREAU')
>> ts.table_from_dataframe(table_id="cash", dataframe=cash, index='cash_id', make_index=True)
>> ts.table_from_dataframe(table_id="credit", dataframe=credit, index='credit_id', make_index=True)
>> ts.table_from_dataframe(table_id="installments", dataframe=installments, index='installments_id', make_index=True)
>> ts.table_from_dataframe(table_id="previous", dataframe=previous, index='SK_ID_PREV')
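Four of the tables in step 4 have no natural key, which is why those calls pass make_index together with a new column name. By analogy with featuretools, that flag presumably appends a column of unique row ids. The plain-Python sketch below illustrates the idea only; add_index and the sample rows are hypothetical and are not the library's implementation.

```python
def add_index(rows, index_name):
    """Return copies of `rows` with a unique integer id stored under `index_name`."""
    if any(index_name in row for row in rows):
        raise ValueError(f"column {index_name!r} already exists")
    return [{index_name: i, **row} for i, row in enumerate(rows)]


# Made-up rows standing in for the cash table, which has no unique key.
cash_rows = [
    {"SK_ID_PREV": 10, "MONTHS_BALANCE": -1},
    {"SK_ID_PREV": 10, "MONTHS_BALANCE": -2},
    {"SK_ID_PREV": 11, "MONTHS_BALANCE": -1},
]
indexed = add_index(cash_rows, "cash_id")
print(indexed[0])
```

The foreign key SK_ID_PREV stays untouched, so the table can still be linked to its parent in step 5.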
5. Add Relationships of Tables
>> re1 = fts.Relationship(ts["app_train"]["SK_ID_CURR"], ts["bureau"]["SK_ID_CURR"])
>> re2 = fts.Relationship(ts["bureau"]["SK_ID_BUREAU"], ts["bureau_balance"]["SK_ID_BUREAU"])
>> re3 = fts.Relationship(ts["app_train"]["SK_ID_CURR"], ts["previous"]["SK_ID_CURR"])
>> re4 = fts.Relationship(ts["previous"]["SK_ID_PREV"], ts["cash"]["SK_ID_PREV"])
>> re5 = fts.Relationship(ts["previous"]["SK_ID_PREV"], ts["credit"]["SK_ID_PREV"])
>> re6 = fts.Relationship(ts["previous"]["SK_ID_PREV"], ts["installments"]["SK_ID_PREV"])
>> ts.add_relationships([re1,re2,re3,re4,re5,re6])
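The six relationships form a small tree rooted at app_train. The sketch below writes them as plain tuples and computes how many hops each table is from the target, which helps when choosing max_depth in the next step. It assumes the first argument of each Relationship is the parent side and the second the child side, as in featuretools; depths_from is a hypothetical helper.

```python
from collections import deque

# (parent table, join key, child table), mirroring re1..re6 above.
RELATIONSHIPS = [
    ("app_train", "SK_ID_CURR", "bureau"),
    ("bureau", "SK_ID_BUREAU", "bureau_balance"),
    ("app_train", "SK_ID_CURR", "previous"),
    ("previous", "SK_ID_PREV", "cash"),
    ("previous", "SK_ID_PREV", "credit"),
    ("previous", "SK_ID_PREV", "installments"),
]


def depths_from(target, relationships):
    """Breadth-first search from `target` down the parent -> child edges."""
    children = {}
    for parent, _key, child in relationships:
        children.setdefault(parent, []).append(child)
    depth = {target: 0}
    queue = deque([target])
    while queue:
        table = queue.popleft()
        for child in children.get(table, []):
            if child not in depth:
                depth[child] = depth[table] + 1
                queue.append(child)
    return depth


depths = depths_from("app_train", RELATIONSHIPS)
print(depths)
```

Under that assumption, bureau and previous sit one hop from app_train and the remaining four tables sit two hops away.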
6. Run DFS To Generate Features
>> fts.dfs(tableset=ts, agg_primitives=["sum", "min", "max", "avg"], target_table='app_train', max_depth=2, verbose=False)
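To give a feel for what the aggregation primitives produce, here is a plain-Python sketch of stacking two levels of aggregation on made-up rows. It is not the library's Spark implementation. The aggregate helper, the sample data, and the rule for prefixing the table name are assumptions chosen so that the result resembles names such as sum_max_bureau_balance_MONTHS_BALANCE in the output of step 7.

```python
from collections import defaultdict

PRIMITIVES = {
    "sum": sum,
    "min": min,
    "max": max,
    "avg": lambda values: sum(values) / len(values),
}


def aggregate(child_rows, key, column, prefix=""):
    """Group `child_rows` by `key` and apply every primitive to `column`.

    Returns {key value: {"<primitive>_<prefix><column>": aggregated value}}.
    """
    groups = defaultdict(list)
    for row in child_rows:
        groups[row[key]].append(row[column])
    return {
        k: {f"{name}_{prefix}{column}": fn(values) for name, fn in PRIMITIVES.items()}
        for k, values in groups.items()
    }


# Made-up sample rows, not taken from the Kaggle data.
bureau_balance = [
    {"SK_ID_BUREAU": 1, "MONTHS_BALANCE": -1},
    {"SK_ID_BUREAU": 1, "MONTHS_BALANCE": -3},
    {"SK_ID_BUREAU": 2, "MONTHS_BALANCE": -2},
]
bureau = [
    {"SK_ID_BUREAU": 1, "SK_ID_CURR": 100},
    {"SK_ID_BUREAU": 2, "SK_ID_CURR": 100},
]

# Depth 1: aggregate bureau_balance into bureau.
level1 = aggregate(bureau_balance, "SK_ID_BUREAU", "MONTHS_BALANCE",
                   prefix="bureau_balance_")
bureau = [{**row, **level1.get(row["SK_ID_BUREAU"], {})} for row in bureau]

# Depth 2: aggregate bureau into app_train, stacking on a depth-1 feature.
level2 = aggregate(bureau, "SK_ID_CURR", "max_bureau_balance_MONTHS_BALANCE")
print(level2[100])
```

Each extra level of depth multiplies the number of candidate features by the number of primitives, which is why a small max_depth already yields hundreds of columns.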
7. Get Features
>> new_app_train = ts["app_train"].df
>> old_len = ts["app_train"].old_len
>> print('new_generate_feature_len:{}'.format(len(new_app_train.columns)-old_len))
>> print(new_app_train.columns[old_len:])
new_generate_feature_len:636
['sum_max_bureau_balance_MONTHS_BALANCE', 'sum_CREDIT_DAY_OVERDUE', 'sum_CNT_CREDIT_PROLONG', 'sum_DAYS_CREDIT_ENDDATE', 'sum_DAYS_CREDIT_UPDATE', 'sum_min_bureau_balance_MONTHS_BALANCE', 'sum_sum_bureau_balance_MONTHS_BALANCE', 'sum_AMT_CREDIT_MAX_OVERDUE', 'sum_bureau_AMT_ANNUITY', 'sum_DAYS_ENDDATE_FACT', 'sum_AMT_CREDIT_SUM_LIMIT', 'sum_DAYS_CREDIT', 'sum_avg_bureau_balance_MONTHS_BALANCE', 'sum_AMT_CREDIT_SUM_DEBT', 'sum_AMT_CREDIT_SUM', 'sum_AMT_CREDIT_SUM_OVERDUE', 'min_max_bureau_balance_MONTHS_BALANCE', 'min_CREDIT_DAY_OVERDUE', 'min_CNT_CREDIT_PROLONG', 'min_DAYS_CREDIT_ENDDATE', 'min_DAYS_CREDIT_UPDATE', 'min_min_bureau_balance_MONTHS_BALANCE', 'min_sum_bureau_balance_MONTHS_BALANCE', 'min_AMT_CREDIT_MAX_OVERDUE', 'min_bureau_AMT_ANNUITY', 'min_DAYS_ENDDATE_FACT', 'min_AMT_CREDIT_SUM_LIMIT', 'min_DAYS_CREDIT', 'min_avg_bureau_balance_MONTHS_BALANCE', 'min_AMT_CREDIT_SUM_DEBT', 'min_AMT_CREDIT_SUM', 'min_AMT_CREDIT_SUM_OVERDUE', 'max_max_bureau_balance_MONTHS_BALANCE', 'max_CREDIT_DAY_OVERDUE', 'max_CNT_CREDIT_PROLONG', 'max_DAYS_CREDIT_ENDDATE', 'max_DAYS_CREDIT_UPDATE', 'max_min_bureau_balance_MONTHS_BALANCE', 'max_sum_bureau_balance_MONTHS_BALANCE', 'max_AMT_CREDIT_MAX_OVERDUE', 'max_bureau_AMT_ANNUITY', 'max_DAYS_ENDDATE_FACT', 'max_AMT_CREDIT_SUM_LIMIT', 'max_DAYS_CREDIT', 'max_avg_bureau_balance_MONTHS_BALANCE', 'max_AMT_CREDIT_SUM_DEBT', 'max_AMT_CREDIT_SUM', 'max_AMT_CREDIT_SUM_OVERDUE', 'avg_max_bureau_balance_MONTHS_BALANCE', 'avg_CREDIT_DAY_OVERDUE', 'avg_CNT_CREDIT_PROLONG', 'avg_DAYS_CREDIT_ENDDATE', 'avg_DAYS_CREDIT_UPDATE', 'avg_min_bureau_balance_MONTHS_BALANCE', 'avg_sum_bureau_balance_MONTHS_BALANCE', 'avg_AMT_CREDIT_MAX_OVERDUE', 'avg_bureau_AMT_ANNUITY', 'avg_DAYS_ENDDATE_FACT', 'avg_AMT_CREDIT_SUM_LIMIT', 'avg_DAYS_CREDIT', 'avg_avg_bureau_balance_MONTHS_BALANCE', 'avg_AMT_CREDIT_SUM_DEBT', 'avg_AMT_CREDIT_SUM', 'avg_AMT_CREDIT_SUM_OVERDUE', 'sum_min_AMT_CREDIT_LIMIT_ACTUAL', 'sum_max_DAYS_ENTRY_PAYMENT', 
'sum_sum_NUM_INSTALMENT_VERSION', 'sum_sum_AMT_PAYMENT', 'sum_avg_AMT_PAYMENT_TOTAL_CURRENT', 'sum_max_CNT_DRAWINGS_POS_CURRENT', 'sum_max_AMT_BALANCE', 'sum_min_DAYS_INSTALMENT', 'sum_min_AMT_INSTALMENT', 'sum_min_AMT_RECEIVABLE_PRINCIPAL', 'sum_max_AMT_RECIVABLE', 'sum_DAYS_LAST_DUE_1ST_VERSION', 'sum_avg_AMT_INST_MIN_REGULARITY', 'sum_avg_CNT_DRAWINGS_OTHER_CURRENT', 'sum_max_AMT_TOTAL_RECEIVABLE', 'sum_min_AMT_DRAWINGS_OTHER_CURRENT', 'sum_sum_AMT_PAYMENT_TOTAL_CURRENT', 'sum_min_AMT_PAYMENT', 'sum_sum_CNT_INSTALMENT', 'sum_min_AMT_PAYMENT_TOTAL_CURRENT', 'sum_DAYS_FIRST_DRAWING', 'sum_DAYS_TERMINATION', 'sum_sum_cash_MONTHS_BALANCE', 'sum_sum_credit_SK_DPD', 'sum_min_AMT_TOTAL_RECEIVABLE', 'sum_avg_AMT_DRAWINGS_POS_CURRENT', 'sum_max_NUM_INSTALMENT_NUMBER', 'sum_avg_AMT_DRAWINGS_CURRENT', 'sum_sum_cash_SK_DPD_DEF', 'sum_avg_AMT_PAYMENT_CURRENT', 'sum_avg_AMT_RECIVABLE', 'sum_min_cash_SK_DPD_DEF', 'sum_min_CNT_INSTALMENT_FUTURE', 'sum_max_NUM_INSTALMENT_VERSION', 'sum_sum_DAYS_ENTRY_PAYMENT', 'sum_max_CNT_INSTALMENT_MATURE_CUM', 'sum_avg_DAYS_INSTALMENT', 'sum_min_CNT_INSTALMENT', 'sum_max_AMT_PAYMENT_CURRENT', 'sum_avg_CNT_INSTALMENT', 'sum_avg_cash_SK_DPD_DEF', 'sum_sum_AMT_TOTAL_RECEIVABLE', 'sum_avg_AMT_TOTAL_RECEIVABLE', 'sum_min_AMT_PAYMENT_CURRENT', 'sum_avg_CNT_DRAWINGS_POS_CURRENT', 'sum_avg_AMT_PAYMENT', 'sum_min_DAYS_ENTRY_PAYMENT', 'sum_max_CNT_DRAWINGS_OTHER_CURRENT', 'sum_avg_AMT_RECEIVABLE_PRINCIPAL', 'sum_CNT_PAYMENT', 'sum_sum_CNT_DRAWINGS_ATM_CURRENT', 'sum_DAYS_FIRST_DUE', 'sum_sum_AMT_INST_MIN_REGULARITY', 'sum_min_CNT_INSTALMENT_MATURE_CUM', 'sum_sum_AMT_DRAWINGS_OTHER_CURRENT', 'sum_previous_AMT_CREDIT', 'sum_min_AMT_DRAWINGS_CURRENT', 'sum_avg_MONTHS_BALANCE', 'sum_DAYS_DECISION', 'sum_min_CNT_DRAWINGS_OTHER_CURRENT', 'sum_sum_credit_SK_DPD_DEF', 'sum_max_MONTHS_BALANCE', 'sum_RATE_INTEREST_PRIMARY', 'sum_max_CNT_DRAWINGS_CURRENT', 'sum_avg_credit_SK_DPD', 'sum_sum_AMT_BALANCE', 'sum_min_AMT_BALANCE', 'sum_avg_AMT_DRAWINGS_ATM_CURRENT', 
'sum_sum_CNT_DRAWINGS_OTHER_CURRENT', 'sum_max_CNT_INSTALMENT_FUTURE', 'sum_max_AMT_DRAWINGS_POS_CURRENT', 'sum_max_credit_SK_DPD', 'sum_avg_AMT_BALANCE', 'sum_AMT_DOWN_PAYMENT', 'sum_sum_CNT_DRAWINGS_POS_CURRENT', 'sum_min_credit_SK_DPD_DEF', 'sum_min_CNT_DRAWINGS_POS_CURRENT', 'sum_max_cash_SK_DPD_DEF', 'sum_avg_cash_MONTHS_BALANCE', 'sum_avg_CNT_DRAWINGS_ATM_CURRENT', 'sum_max_credit_SK_DPD_DEF', 'sum_sum_AMT_DRAWINGS_CURRENT', 'sum_max_AMT_DRAWINGS_CURRENT', 'sum_min_AMT_DRAWINGS_ATM_CURRENT', 'sum_sum_AMT_DRAWINGS_POS_CURRENT', 'sum_sum_AMT_RECEIVABLE_PRINCIPAL', 'sum_sum_CNT_DRAWINGS_CURRENT', 'sum_max_DAYS_INSTALMENT', 'sum_max_AMT_CREDIT_LIMIT_ACTUAL', 'sum_avg_credit_SK_DPD_DEF', 'sum_AMT_ANNUITY', 'sum_min_CNT_DRAWINGS_CURRENT', 'sum_sum_NUM_INSTALMENT_NUMBER', 'sum_avg_DAYS_ENTRY_PAYMENT', 'sum_min_AMT_INST_MIN_REGULARITY', 'sum_sum_cash_SK_DPD', 'sum_min_MONTHS_BALANCE', 'sum_avg_NUM_INSTALMENT_NUMBER', 'sum_min_cash_MONTHS_BALANCE', 'sum_max_AMT_PAYMENT_TOTAL_CURRENT', 'sum_min_AMT_RECIVABLE', 'sum_sum_CNT_INSTALMENT_FUTURE', 'sum_avg_cash_SK_DPD', 'sum_previous_AMT_GOODS_PRICE', 'sum_min_NUM_INSTALMENT_NUMBER', 'sum_sum_AMT_INSTALMENT', 'sum_max_cash_SK_DPD', 'sum_avg_AMT_INSTALMENT', 'sum_max_AMT_RECEIVABLE_PRINCIPAL', 'sum_RATE_DOWN_PAYMENT', 'sum_sum_AMT_RECIVABLE', 'sum_sum_MONTHS_BALANCE', 'sum_avg_AMT_CREDIT_LIMIT_ACTUAL', 'sum_max_AMT_INST_MIN_REGULARITY', 'sum_min_NUM_INSTALMENT_VERSION', 'sum_avg_CNT_DRAWINGS_CURRENT', 'sum_max_AMT_DRAWINGS_OTHER_CURRENT', 'sum_sum_AMT_CREDIT_LIMIT_ACTUAL', 'sum_max_CNT_INSTALMENT', 'sum_max_AMT_PAYMENT', 'sum_RATE_INTEREST_PRIVILEGED', 'sum_max_AMT_INSTALMENT', 'sum_max_AMT_DRAWINGS_ATM_CURRENT', 'sum_NFLAG_LAST_APPL_IN_DAY', 'sum_NFLAG_INSURED_ON_APPROVAL', 'sum_min_cash_SK_DPD', 'sum_avg_CNT_INSTALMENT_FUTURE', 'sum_sum_AMT_DRAWINGS_ATM_CURRENT', 'sum_SELLERPLACE_AREA', 'sum_sum_AMT_PAYMENT_CURRENT', 'sum_avg_NUM_INSTALMENT_VERSION', 'sum_max_cash_MONTHS_BALANCE', 'sum_min_AMT_DRAWINGS_POS_CURRENT', 
'sum_sum_CNT_INSTALMENT_MATURE_CUM', 'sum_AMT_APPLICATION', 'sum_DAYS_LAST_DUE', 'sum_avg_CNT_INSTALMENT_MATURE_CUM', 'sum_max_CNT_DRAWINGS_ATM_CURRENT', 'sum_previous_HOUR_APPR_PROCESS_START', 'sum_avg_AMT_DRAWINGS_OTHER_CURRENT', 'sum_min_credit_SK_DPD', 'sum_min_CNT_DRAWINGS_ATM_CURRENT', 'sum_sum_DAYS_INSTALMENT', 'min_min_AMT_CREDIT_LIMIT_ACTUAL', 'min_max_DAYS_ENTRY_PAYMENT', 'min_sum_NUM_INSTALMENT_VERSION', 'min_sum_AMT_PAYMENT', 'min_avg_AMT_PAYMENT_TOTAL_CURRENT', 'min_max_CNT_DRAWINGS_POS_CURRENT', 'min_max_AMT_BALANCE', 'min_min_DAYS_INSTALMENT', 'min_min_AMT_INSTALMENT', 'min_min_AMT_RECEIVABLE_PRINCIPAL', 'min_max_AMT_RECIVABLE', 'min_DAYS_LAST_DUE_1ST_VERSION', 'min_avg_AMT_INST_MIN_REGULARITY', 'min_avg_CNT_DRAWINGS_OTHER_CURRENT', 'min_max_AMT_TOTAL_RECEIVABLE', 'min_min_AMT_DRAWINGS_OTHER_CURRENT', 'min_sum_AMT_PAYMENT_TOTAL_CURRENT', 'min_min_AMT_PAYMENT', 'min_sum_CNT_INSTALMENT', 'min_min_AMT_PAYMENT_TOTAL_CURRENT', 'min_DAYS_FIRST_DRAWING', 'min_DAYS_TERMINATION', 'min_sum_cash_MONTHS_BALANCE', 'min_sum_credit_SK_DPD', 'min_min_AMT_TOTAL_RECEIVABLE', 'min_avg_AMT_DRAWINGS_POS_CURRENT', 'min_max_NUM_INSTALMENT_NUMBER', 'min_avg_AMT_DRAWINGS_CURRENT', 'min_sum_cash_SK_DPD_DEF', 'min_avg_AMT_PAYMENT_CURRENT', 'min_avg_AMT_RECIVABLE', 'min_min_cash_SK_DPD_DEF', 'min_min_CNT_INSTALMENT_FUTURE', 'min_max_NUM_INSTALMENT_VERSION', 'min_sum_DAYS_ENTRY_PAYMENT', 'min_max_CNT_INSTALMENT_MATURE_CUM', 'min_avg_DAYS_INSTALMENT', 'min_min_CNT_INSTALMENT', 'min_max_AMT_PAYMENT_CURRENT', 'min_avg_CNT_INSTALMENT', 'min_avg_cash_SK_DPD_DEF', 'min_sum_AMT_TOTAL_RECEIVABLE', 'min_avg_AMT_TOTAL_RECEIVABLE', 'min_min_AMT_PAYMENT_CURRENT', 'min_avg_CNT_DRAWINGS_POS_CURRENT', 'min_avg_AMT_PAYMENT', 'min_min_DAYS_ENTRY_PAYMENT', 'min_max_CNT_DRAWINGS_OTHER_CURRENT', 'min_avg_AMT_RECEIVABLE_PRINCIPAL', 'min_CNT_PAYMENT', 'min_sum_CNT_DRAWINGS_ATM_CURRENT', 'min_DAYS_FIRST_DUE', 'min_sum_AMT_INST_MIN_REGULARITY', 'min_min_CNT_INSTALMENT_MATURE_CUM', 
'min_sum_AMT_DRAWINGS_OTHER_CURRENT', 'min_previous_AMT_CREDIT', 'min_min_AMT_DRAWINGS_CURRENT', 'min_avg_MONTHS_BALANCE', 'min_DAYS_DECISION', 'min_min_CNT_DRAWINGS_OTHER_CURRENT', 'min_sum_credit_SK_DPD_DEF', 'min_max_MONTHS_BALANCE', 'min_RATE_INTEREST_PRIMARY', 'min_max_CNT_DRAWINGS_CURRENT', 'min_avg_credit_SK_DPD', 'min_sum_AMT_BALANCE', 'min_min_AMT_BALANCE', 'min_avg_AMT_DRAWINGS_ATM_CURRENT', 'min_sum_CNT_DRAWINGS_OTHER_CURRENT', 'min_max_CNT_INSTALMENT_FUTURE', 'min_max_AMT_DRAWINGS_POS_CURRENT', 'min_max_credit_SK_DPD', 'min_avg_AMT_BALANCE', 'min_AMT_DOWN_PAYMENT', 'min_sum_CNT_DRAWINGS_POS_CURRENT', 'min_min_credit_SK_DPD_DEF', 'min_min_CNT_DRAWINGS_POS_CURRENT', 'min_max_cash_SK_DPD_DEF', 'min_avg_cash_MONTHS_BALANCE', 'min_avg_CNT_DRAWINGS_ATM_CURRENT', 'min_max_credit_SK_DPD_DEF', 'min_sum_AMT_DRAWINGS_CURRENT', 'min_max_AMT_DRAWINGS_CURRENT', 'min_min_AMT_DRAWINGS_ATM_CURRENT', 'min_sum_AMT_DRAWINGS_POS_CURRENT', 'min_sum_AMT_RECEIVABLE_PRINCIPAL', 'min_sum_CNT_DRAWINGS_CURRENT', 'min_max_DAYS_INSTALMENT', 'min_max_AMT_CREDIT_LIMIT_ACTUAL', 'min_avg_credit_SK_DPD_DEF', 'min_AMT_ANNUITY', 'min_min_CNT_DRAWINGS_CURRENT', 'min_sum_NUM_INSTALMENT_NUMBER', 'min_avg_DAYS_ENTRY_PAYMENT', 'min_min_AMT_INST_MIN_REGULARITY', 'min_sum_cash_SK_DPD', 'min_min_MONTHS_BALANCE', 'min_avg_NUM_INSTALMENT_NUMBER', 'min_min_cash_MONTHS_BALANCE', 'min_max_AMT_PAYMENT_TOTAL_CURRENT', 'min_min_AMT_RECIVABLE', 'min_sum_CNT_INSTALMENT_FUTURE', 'min_avg_cash_SK_DPD', 'min_previous_AMT_GOODS_PRICE', 'min_min_NUM_INSTALMENT_NUMBER', 'min_sum_AMT_INSTALMENT', 'min_max_cash_SK_DPD', 'min_avg_AMT_INSTALMENT', 'min_max_AMT_RECEIVABLE_PRINCIPAL', 'min_RATE_DOWN_PAYMENT', 'min_sum_AMT_RECIVABLE', 'min_sum_MONTHS_BALANCE', 'min_avg_AMT_CREDIT_LIMIT_ACTUAL', 'min_max_AMT_INST_MIN_REGULARITY', 'min_min_NUM_INSTALMENT_VERSION', 'min_avg_CNT_DRAWINGS_CURRENT', 'min_max_AMT_DRAWINGS_OTHER_CURRENT', 'min_sum_AMT_CREDIT_LIMIT_ACTUAL', 'min_max_CNT_INSTALMENT', 'min_max_AMT_PAYMENT', 
'min_RATE_INTEREST_PRIVILEGED', 'min_max_AMT_INSTALMENT', 'min_max_AMT_DRAWINGS_ATM_CURRENT', 'min_NFLAG_LAST_APPL_IN_DAY', 'min_NFLAG_INSURED_ON_APPROVAL', 'min_min_cash_SK_DPD', 'min_avg_CNT_INSTALMENT_FUTURE', 'min_sum_AMT_DRAWINGS_ATM_CURRENT', 'min_SELLERPLACE_AREA', 'min_sum_AMT_PAYMENT_CURRENT', 'min_avg_NUM_INSTALMENT_VERSION', 'min_max_cash_MONTHS_BALANCE', 'min_min_AMT_DRAWINGS_POS_CURRENT', 'min_sum_CNT_INSTALMENT_MATURE_CUM', 'min_AMT_APPLICATION', 'min_DAYS_LAST_DUE', 'min_avg_CNT_INSTALMENT_MATURE_CUM', 'min_max_CNT_DRAWINGS_ATM_CURRENT', 'min_previous_HOUR_APPR_PROCESS_START', 'min_avg_AMT_DRAWINGS_OTHER_CURRENT', 'min_min_credit_SK_DPD', 'min_min_CNT_DRAWINGS_ATM_CURRENT', 'min_sum_DAYS_INSTALMENT', 'max_min_AMT_CREDIT_LIMIT_ACTUAL', 'max_max_DAYS_ENTRY_PAYMENT', 'max_sum_NUM_INSTALMENT_VERSION', 'max_sum_AMT_PAYMENT', 'max_avg_AMT_PAYMENT_TOTAL_CURRENT', 'max_max_CNT_DRAWINGS_POS_CURRENT', 'max_max_AMT_BALANCE', 'max_min_DAYS_INSTALMENT', 'max_min_AMT_INSTALMENT', 'max_min_AMT_RECEIVABLE_PRINCIPAL', 'max_max_AMT_RECIVABLE', 'max_DAYS_LAST_DUE_1ST_VERSION', 'max_avg_AMT_INST_MIN_REGULARITY', 'max_avg_CNT_DRAWINGS_OTHER_CURRENT', 'max_max_AMT_TOTAL_RECEIVABLE', 'max_min_AMT_DRAWINGS_OTHER_CURRENT', 'max_sum_AMT_PAYMENT_TOTAL_CURRENT', 'max_min_AMT_PAYMENT', 'max_sum_CNT_INSTALMENT', 'max_min_AMT_PAYMENT_TOTAL_CURRENT', 'max_DAYS_FIRST_DRAWING', 'max_DAYS_TERMINATION', 'max_sum_cash_MONTHS_BALANCE', 'max_sum_credit_SK_DPD', 'max_min_AMT_TOTAL_RECEIVABLE', 'max_avg_AMT_DRAWINGS_POS_CURRENT', 'max_max_NUM_INSTALMENT_NUMBER', 'max_avg_AMT_DRAWINGS_CURRENT', 'max_sum_cash_SK_DPD_DEF', 'max_avg_AMT_PAYMENT_CURRENT', 'max_avg_AMT_RECIVABLE', 'max_min_cash_SK_DPD_DEF', 'max_min_CNT_INSTALMENT_FUTURE', 'max_max_NUM_INSTALMENT_VERSION', 'max_sum_DAYS_ENTRY_PAYMENT', 'max_max_CNT_INSTALMENT_MATURE_CUM', 'max_avg_DAYS_INSTALMENT', 'max_min_CNT_INSTALMENT', 'max_max_AMT_PAYMENT_CURRENT', 'max_avg_CNT_INSTALMENT', 'max_avg_cash_SK_DPD_DEF', 
'max_sum_AMT_TOTAL_RECEIVABLE', 'max_avg_AMT_TOTAL_RECEIVABLE', 'max_min_AMT_PAYMENT_CURRENT', 'max_avg_CNT_DRAWINGS_POS_CURRENT', 'max_avg_AMT_PAYMENT', 'max_min_DAYS_ENTRY_PAYMENT', 'max_max_CNT_DRAWINGS_OTHER_CURRENT', 'max_avg_AMT_RECEIVABLE_PRINCIPAL', 'max_CNT_PAYMENT', 'max_sum_CNT_DRAWINGS_ATM_CURRENT', 'max_DAYS_FIRST_DUE', 'max_sum_AMT_INST_MIN_REGULARITY', 'max_min_CNT_INSTALMENT_MATURE_CUM', 'max_sum_AMT_DRAWINGS_OTHER_CURRENT', 'max_previous_AMT_CREDIT', 'max_min_AMT_DRAWINGS_CURRENT', 'max_avg_MONTHS_BALANCE', 'max_DAYS_DECISION', 'max_min_CNT_DRAWINGS_OTHER_CURRENT', 'max_sum_credit_SK_DPD_DEF', 'max_max_MONTHS_BALANCE', 'max_RATE_INTEREST_PRIMARY', 'max_max_CNT_DRAWINGS_CURRENT', 'max_avg_credit_SK_DPD', 'max_sum_AMT_BALANCE', 'max_min_AMT_BALANCE', 'max_avg_AMT_DRAWINGS_ATM_CURRENT', 'max_sum_CNT_DRAWINGS_OTHER_CURRENT', 'max_max_CNT_INSTALMENT_FUTURE', 'max_max_AMT_DRAWINGS_POS_CURRENT', 'max_max_credit_SK_DPD', 'max_avg_AMT_BALANCE', 'max_AMT_DOWN_PAYMENT', 'max_sum_CNT_DRAWINGS_POS_CURRENT', 'max_min_credit_SK_DPD_DEF', 'max_min_CNT_DRAWINGS_POS_CURRENT', 'max_max_cash_SK_DPD_DEF', 'max_avg_cash_MONTHS_BALANCE', 'max_avg_CNT_DRAWINGS_ATM_CURRENT', 'max_max_credit_SK_DPD_DEF', 'max_sum_AMT_DRAWINGS_CURRENT', 'max_max_AMT_DRAWINGS_CURRENT', 'max_min_AMT_DRAWINGS_ATM_CURRENT', 'max_sum_AMT_DRAWINGS_POS_CURRENT', 'max_sum_AMT_RECEIVABLE_PRINCIPAL', 'max_sum_CNT_DRAWINGS_CURRENT', 'max_max_DAYS_INSTALMENT', 'max_max_AMT_CREDIT_LIMIT_ACTUAL', 'max_avg_credit_SK_DPD_DEF', 'max_AMT_ANNUITY', 'max_min_CNT_DRAWINGS_CURRENT', 'max_sum_NUM_INSTALMENT_NUMBER', 'max_avg_DAYS_ENTRY_PAYMENT', 'max_min_AMT_INST_MIN_REGULARITY', 'max_sum_cash_SK_DPD', 'max_min_MONTHS_BALANCE', 'max_avg_NUM_INSTALMENT_NUMBER', 'max_min_cash_MONTHS_BALANCE', 'max_max_AMT_PAYMENT_TOTAL_CURRENT', 'max_min_AMT_RECIVABLE', 'max_sum_CNT_INSTALMENT_FUTURE', 'max_avg_cash_SK_DPD', 'max_previous_AMT_GOODS_PRICE', 'max_min_NUM_INSTALMENT_NUMBER', 'max_sum_AMT_INSTALMENT', 
'max_max_cash_SK_DPD', 'max_avg_AMT_INSTALMENT', 'max_max_AMT_RECEIVABLE_PRINCIPAL', 'max_RATE_DOWN_PAYMENT', 'max_sum_AMT_RECIVABLE', 'max_sum_MONTHS_BALANCE', 'max_avg_AMT_CREDIT_LIMIT_ACTUAL', 'max_max_AMT_INST_MIN_REGULARITY', 'max_min_NUM_INSTALMENT_VERSION', 'max_avg_CNT_DRAWINGS_CURRENT', 'max_max_AMT_DRAWINGS_OTHER_CURRENT', 'max_sum_AMT_CREDIT_LIMIT_ACTUAL', 'max_max_CNT_INSTALMENT', 'max_max_AMT_PAYMENT', 'max_RATE_INTEREST_PRIVILEGED', 'max_max_AMT_INSTALMENT', 'max_max_AMT_DRAWINGS_ATM_CURRENT', 'max_NFLAG_LAST_APPL_IN_DAY', 'max_NFLAG_INSURED_ON_APPROVAL', 'max_min_cash_SK_DPD', 'max_avg_CNT_INSTALMENT_FUTURE', 'max_sum_AMT_DRAWINGS_ATM_CURRENT', 'max_SELLERPLACE_AREA', 'max_sum_AMT_PAYMENT_CURRENT', 'max_avg_NUM_INSTALMENT_VERSION', 'max_max_cash_MONTHS_BALANCE', 'max_min_AMT_DRAWINGS_POS_CURRENT', 'max_sum_CNT_INSTALMENT_MATURE_CUM', 'max_AMT_APPLICATION', 'max_DAYS_LAST_DUE', 'max_avg_CNT_INSTALMENT_MATURE_CUM', 'max_max_CNT_DRAWINGS_ATM_CURRENT', 'max_previous_HOUR_APPR_PROCESS_START', 'max_avg_AMT_DRAWINGS_OTHER_CURRENT', 'max_min_credit_SK_DPD', 'max_min_CNT_DRAWINGS_ATM_CURRENT', 'max_sum_DAYS_INSTALMENT', 'avg_min_AMT_CREDIT_LIMIT_ACTUAL', 'avg_max_DAYS_ENTRY_PAYMENT', 'avg_sum_NUM_INSTALMENT_VERSION', 'avg_sum_AMT_PAYMENT', 'avg_avg_AMT_PAYMENT_TOTAL_CURRENT', 'avg_max_CNT_DRAWINGS_POS_CURRENT', 'avg_max_AMT_BALANCE', 'avg_min_DAYS_INSTALMENT', 'avg_min_AMT_INSTALMENT', 'avg_min_AMT_RECEIVABLE_PRINCIPAL', 'avg_max_AMT_RECIVABLE', 'avg_DAYS_LAST_DUE_1ST_VERSION', 'avg_avg_AMT_INST_MIN_REGULARITY', 'avg_avg_CNT_DRAWINGS_OTHER_CURRENT', 'avg_max_AMT_TOTAL_RECEIVABLE', 'avg_min_AMT_DRAWINGS_OTHER_CURRENT', 'avg_sum_AMT_PAYMENT_TOTAL_CURRENT', 'avg_min_AMT_PAYMENT', 'avg_sum_CNT_INSTALMENT', 'avg_min_AMT_PAYMENT_TOTAL_CURRENT', 'avg_DAYS_FIRST_DRAWING', 'avg_DAYS_TERMINATION', 'avg_sum_cash_MONTHS_BALANCE', 'avg_sum_credit_SK_DPD', 'avg_min_AMT_TOTAL_RECEIVABLE', 'avg_avg_AMT_DRAWINGS_POS_CURRENT', 'avg_max_NUM_INSTALMENT_NUMBER', 
'avg_avg_AMT_DRAWINGS_CURRENT', 'avg_sum_cash_SK_DPD_DEF', 'avg_avg_AMT_PAYMENT_CURRENT', 'avg_avg_AMT_RECIVABLE', 'avg_min_cash_SK_DPD_DEF', 'avg_min_CNT_INSTALMENT_FUTURE', 'avg_max_NUM_INSTALMENT_VERSION', 'avg_sum_DAYS_ENTRY_PAYMENT', 'avg_max_CNT_INSTALMENT_MATURE_CUM', 'avg_avg_DAYS_INSTALMENT', 'avg_min_CNT_INSTALMENT', 'avg_max_AMT_PAYMENT_CURRENT', 'avg_avg_CNT_INSTALMENT', 'avg_avg_cash_SK_DPD_DEF', 'avg_sum_AMT_TOTAL_RECEIVABLE', 'avg_avg_AMT_TOTAL_RECEIVABLE', 'avg_min_AMT_PAYMENT_CURRENT', 'avg_avg_CNT_DRAWINGS_POS_CURRENT', 'avg_avg_AMT_PAYMENT', 'avg_min_DAYS_ENTRY_PAYMENT', 'avg_max_CNT_DRAWINGS_OTHER_CURRENT', 'avg_avg_AMT_RECEIVABLE_PRINCIPAL', 'avg_CNT_PAYMENT', 'avg_sum_CNT_DRAWINGS_ATM_CURRENT', 'avg_DAYS_FIRST_DUE', 'avg_sum_AMT_INST_MIN_REGULARITY', 'avg_min_CNT_INSTALMENT_MATURE_CUM', 'avg_sum_AMT_DRAWINGS_OTHER_CURRENT', 'avg_previous_AMT_CREDIT', 'avg_min_AMT_DRAWINGS_CURRENT', 'avg_avg_MONTHS_BALANCE', 'avg_DAYS_DECISION', 'avg_min_CNT_DRAWINGS_OTHER_CURRENT', 'avg_sum_credit_SK_DPD_DEF', 'avg_max_MONTHS_BALANCE', 'avg_RATE_INTEREST_PRIMARY', 'avg_max_CNT_DRAWINGS_CURRENT', 'avg_avg_credit_SK_DPD', 'avg_sum_AMT_BALANCE', 'avg_min_AMT_BALANCE', 'avg_avg_AMT_DRAWINGS_ATM_CURRENT', 'avg_sum_CNT_DRAWINGS_OTHER_CURRENT', 'avg_max_CNT_INSTALMENT_FUTURE', 'avg_max_AMT_DRAWINGS_POS_CURRENT', 'avg_max_credit_SK_DPD', 'avg_avg_AMT_BALANCE', 'avg_AMT_DOWN_PAYMENT', 'avg_sum_CNT_DRAWINGS_POS_CURRENT', 'avg_min_credit_SK_DPD_DEF', 'avg_min_CNT_DRAWINGS_POS_CURRENT', 'avg_max_cash_SK_DPD_DEF', 'avg_avg_cash_MONTHS_BALANCE', 'avg_avg_CNT_DRAWINGS_ATM_CURRENT', 'avg_max_credit_SK_DPD_DEF', 'avg_sum_AMT_DRAWINGS_CURRENT', 'avg_max_AMT_DRAWINGS_CURRENT', 'avg_min_AMT_DRAWINGS_ATM_CURRENT', 'avg_sum_AMT_DRAWINGS_POS_CURRENT', 'avg_sum_AMT_RECEIVABLE_PRINCIPAL', 'avg_sum_CNT_DRAWINGS_CURRENT', 'avg_max_DAYS_INSTALMENT', 'avg_max_AMT_CREDIT_LIMIT_ACTUAL', 'avg_avg_credit_SK_DPD_DEF', 'avg_AMT_ANNUITY', 'avg_min_CNT_DRAWINGS_CURRENT', 
'avg_sum_NUM_INSTALMENT_NUMBER', 'avg_avg_DAYS_ENTRY_PAYMENT', 'avg_min_AMT_INST_MIN_REGULARITY', 'avg_sum_cash_SK_DPD', 'avg_min_MONTHS_BALANCE', 'avg_avg_NUM_INSTALMENT_NUMBER', 'avg_min_cash_MONTHS_BALANCE', 'avg_max_AMT_PAYMENT_TOTAL_CURRENT', 'avg_min_AMT_RECIVABLE', 'avg_sum_CNT_INSTALMENT_FUTURE', 'avg_avg_cash_SK_DPD', 'avg_previous_AMT_GOODS_PRICE', 'avg_min_NUM_INSTALMENT_NUMBER', 'avg_sum_AMT_INSTALMENT', 'avg_max_cash_SK_DPD', 'avg_avg_AMT_INSTALMENT', 'avg_max_AMT_RECEIVABLE_PRINCIPAL', 'avg_RATE_DOWN_PAYMENT', 'avg_sum_AMT_RECIVABLE', 'avg_sum_MONTHS_BALANCE', 'avg_avg_AMT_CREDIT_LIMIT_ACTUAL', 'avg_max_AMT_INST_MIN_REGULARITY', 'avg_min_NUM_INSTALMENT_VERSION', 'avg_avg_CNT_DRAWINGS_CURRENT', 'avg_max_AMT_DRAWINGS_OTHER_CURRENT', 'avg_sum_AMT_CREDIT_LIMIT_ACTUAL', 'avg_max_CNT_INSTALMENT', 'avg_max_AMT_PAYMENT', 'avg_RATE_INTEREST_PRIVILEGED', 'avg_max_AMT_INSTALMENT', 'avg_max_AMT_DRAWINGS_ATM_CURRENT', 'avg_NFLAG_LAST_APPL_IN_DAY', 'avg_NFLAG_INSURED_ON_APPROVAL', 'avg_min_cash_SK_DPD', 'avg_avg_CNT_INSTALMENT_FUTURE', 'avg_sum_AMT_DRAWINGS_ATM_CURRENT', 'avg_SELLERPLACE_AREA', 'avg_sum_AMT_PAYMENT_CURRENT', 'avg_avg_NUM_INSTALMENT_VERSION', 'avg_max_cash_MONTHS_BALANCE', 'avg_min_AMT_DRAWINGS_POS_CURRENT', 'avg_sum_CNT_INSTALMENT_MATURE_CUM', 'avg_AMT_APPLICATION', 'avg_DAYS_LAST_DUE', 'avg_avg_CNT_INSTALMENT_MATURE_CUM', 'avg_max_CNT_DRAWINGS_ATM_CURRENT', 'avg_previous_HOUR_APPR_PROCESS_START', 'avg_avg_AMT_DRAWINGS_OTHER_CURRENT', 'avg_min_credit_SK_DPD', 'avg_min_CNT_DRAWINGS_ATM_CURRENT', 'avg_sum_DAYS_INSTALMENT']
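A list this long is easier to inspect once it is grouped. The sketch below splits each generated name into its outer primitive, an optional inner (stacked) primitive, and the remainder. It assumes that every generated name starts with one of the four primitives used in step 6 and that no source column itself begins with a lowercase primitive name; split_feature is a hypothetical helper, and the sample holds four names copied from the output above.

```python
from collections import Counter

PRIMITIVES = ("sum", "min", "max", "avg")


def split_feature(name):
    """Split a generated name into (outer primitive, inner primitive or None, rest)."""
    outer, _, rest = name.partition("_")
    if outer not in PRIMITIVES:
        raise ValueError(f"unexpected feature name: {name}")
    inner, _, tail = rest.partition("_")
    if inner in PRIMITIVES:
        return outer, inner, tail
    return outer, None, rest


sample = [
    "sum_max_bureau_balance_MONTHS_BALANCE",
    "sum_CREDIT_DAY_OVERDUE",
    "min_bureau_AMT_ANNUITY",
    "avg_avg_AMT_BALANCE",
]
by_outer = Counter(split_feature(name)[0] for name in sample)
stacked = [name for name in sample if split_feature(name)[1] is not None]
print(by_outer)
print(stacked)
```

To run it on the real result, replace sample with new_app_train.columns[old_len:] from step 7.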
Hashes for featuretoolsOnSpark-0.1.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | b55152c50d43c62016a47fa52972897e48da618bff1e393764bb3eb801a9e4f2
MD5 | 058550fe1374247adbafd60a13cb7024
BLAKE2b-256 | 1a8bbe74757437f254328d9f83bb10855024f49e102c296d56b7f93967560475
Hashes for featuretoolsOnSpark-0.1.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 837ca1eca829674b945fe77e84d3630c9fb2e66de7780386ba525314b31b1053
MD5 | d91c5c588fed9f39186afc27cf66798a
BLAKE2b-256 | 1d3b2a44e17d934e78c313ad74b1d1ec9c6e9d6e61506a912d2fbc7c2cf3e4cf