Time Series Feature Engineering

Time series feature generator.

Install

pip install timeseries_feature_engineering

How to use

Add Date Parts

import pandas as pd

df = pd.DataFrame({'date': ['2019-12-04', None, '2019-11-15', '2019-10-24']})
df = add_datepart(df, 'date')
df.head()
Year Month Week Day Dayofweek Dayofyear Is_month_end Is_month_start Is_quarter_end Is_quarter_start Is_year_end Is_year_start Elapsed
0 2019.0 12.0 49.0 4.0 2.0 338.0 False False False False False False 1575417600
1 NaN NaN NaN NaN NaN NaN False False False False False False None
2 2019.0 11.0 46.0 15.0 4.0 319.0 False False False False False False 1573776000
3 2019.0 10.0 43.0 24.0 3.0 297.0 False False False False False False 1571875200
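
To make the columns above concrete, here is a minimal sketch of how the numeric date parts can be derived with only the Python standard library. The helper name `date_parts` is illustrative and not part of this package, and treating `Elapsed` as seconds since the Unix epoch at midnight UTC is an assumption that happens to match the values in the table.

```python
from datetime import date, datetime, timezone


def date_parts(iso_day: str) -> dict:
    """Derive calendar features from a 'YYYY-MM-DD' string (illustrative helper)."""
    d = date.fromisoformat(iso_day)
    # Assumption: Elapsed is measured at midnight UTC of the given day.
    midnight_utc = datetime(d.year, d.month, d.day, tzinfo=timezone.utc)
    return {
        "Year": d.year,
        "Month": d.month,
        "Week": d.isocalendar()[1],          # ISO week number
        "Day": d.day,
        "Dayofweek": d.weekday(),            # Monday = 0
        "Dayofyear": d.timetuple().tm_yday,
        "Elapsed": int(midnight_utc.timestamp()),  # seconds since the Unix epoch
    }


parts = date_parts("2019-12-04")
print(parts)
```

For the first row of the table this gives week 49, day of week 2 (a Wednesday) and day of year 338.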

Add Moving Average Features

With weighted average.

Recency is an important factor in a time series: values closer to the current date tend to carry more information, so the weighted average gives recent values more weight.

import numpy as np

df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'),
    'sales': np.random.randint(100, 500, size=10)
})
df = add_moving_average_features(df, 'sales', windows=[3,5], weighted=True)
df.head(10)
date sales sales_3p_MA sales_5p_MA
0 2019-12-01 155 NaN NaN
1 2019-12-02 437 NaN NaN
2 2019-12-03 361 352.000000 NaN
3 2019-12-04 356 371.166667 NaN
4 2019-12-05 490 423.833333 399.066667
5 2019-12-06 222 333.666667 353.133333
6 2019-12-07 197 254.166667 294.400000
7 2019-12-08 390 297.666667 316.000000
8 2019-12-09 159 242.333333 258.666667
9 2019-12-10 470 353.000000 318.133333
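
The numbers above are consistent with linearly increasing weights (1 for the oldest value in the window up to n for the newest). That is an inference from the output, not a statement about how the package is implemented. A minimal sketch of that calculation in plain Python, with the illustrative helper name `weighted_moving_average`:

```python
def weighted_moving_average(values, window):
    """Linearly weighted moving average; None until a full window is available."""
    weights = range(1, window + 1)   # oldest value gets weight 1, newest gets `window`
    total_weight = sum(weights)
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)         # not enough history yet
            continue
        chunk = values[i + 1 - window : i + 1]
        out.append(sum(w * v for w, v in zip(weights, chunk)) / total_weight)
    return out


sales = [155, 437, 361, 356, 490]    # first five sales values from the table above
wma3 = weighted_moving_average(sales, 3)
wma5 = weighted_moving_average(sales, 5)
print(wma3)
print(wma5)
```

For example, the first 3-period value is (1*155 + 2*437 + 3*361) / 6 = 352.0, matching row 2 of the table.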

Without weighted average.

df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'), 
    'sales': np.random.randint(100, 500, size=10)
})
df = add_moving_average_features(df, 'sales', windows=[3,5], weighted=False)
df.head(10)
date sales sales_3p_MA sales_5p_MA
0 2019-12-01 167 NaN NaN
1 2019-12-02 458 NaN NaN
2 2019-12-03 260 295.000000 NaN
3 2019-12-04 174 297.333333 NaN
4 2019-12-05 392 275.333333 290.2
5 2019-12-06 401 322.333333 337.0
6 2019-12-07 460 417.666667 337.4
7 2019-12-08 381 414.000000 361.6
8 2019-12-09 349 396.666667 396.6
9 2019-12-10 365 365.000000 391.2
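
In the unweighted case every value in the window counts equally, so the feature reduces to a plain rolling mean. A minimal sketch in plain Python, using the first five sales values listed above; the helper name `simple_moving_average` is illustrative and not part of this package.

```python
def simple_moving_average(values, window):
    """Unweighted moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)         # not enough history yet
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


sales = [167, 458, 260, 174, 392]
sma3 = simple_moving_average(sales, 3)
sma5 = simple_moving_average(sales, 5)
print(sma3)
print(sma5)
```

In pandas, `Series.rolling(window).mean()` computes the same quantity.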

Add Expanding Features

df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'), 
    'sales': np.random.randint(100, 500, size=10)
})
df = add_expanding_features(df, 'sales', period=3)
df.head()
date sales sales_3p_expanding
0 2019-12-01 178 NaN
1 2019-12-02 398 NaN
2 2019-12-03 399 325.0
3 2019-12-04 385 340.0
4 2019-12-05 136 299.2
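
The values above match a running mean over all rows seen so far, reported only once at least `period` rows are available. That reading is inferred from the output rather than from the package source. A minimal sketch in plain Python, with the illustrative helper name `expanding_mean`:

```python
def expanding_mean(values, min_periods):
    """Mean of all values seen so far; None until `min_periods` values exist."""
    out = []
    running_total = 0
    for count, value in enumerate(values, start=1):
        running_total += value
        out.append(running_total / count if count >= min_periods else None)
    return out


sales = [178, 398, 399, 385, 136]    # sales values from the table above
result = expanding_mean(sales, 3)
print(result)
```

In pandas, `Series.expanding(min_periods=3).mean()` computes the same quantity.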

Add Trend Features

df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'), 
    'sales': np.random.randint(100, 500, size=10)
})
df = add_trend_features(df, 'sales', windows=[3,7])
df.head(10)
date sales sales_3p_trend sales_7p_trend
0 2019-12-01 237 0.000000 0.000000
1 2019-12-02 388 0.000000 0.000000
2 2019-12-03 384 0.000000 0.000000
3 2019-12-04 498 87.000000 0.000000
4 2019-12-05 275 -37.666667 0.000000
5 2019-12-06 382 -0.666667 0.000000
6 2019-12-07 132 -122.000000 0.000000
7 2019-12-08 337 20.666667 14.285714
8 2019-12-09 496 38.000000 15.428571
9 2019-12-10 216 28.000000 -24.000000
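
The trend values above are consistent with the change over the window divided by the window length, (x[t] - x[t - w]) / w, with 0 where there is not enough history. This formula is inferred from the output and may not be how the package computes it. A minimal sketch in plain Python, with the illustrative helper name `trend`:

```python
def trend(values, window):
    """Average change per step over `window` steps; 0.0 where history is too short."""
    out = []
    for i, value in enumerate(values):
        if i < window:
            out.append(0.0)          # no value `window` steps back yet
        else:
            out.append((value - values[i - window]) / window)
    return out


sales = [237, 388, 384, 498, 275, 382, 132, 337, 496, 216]  # from the table above
trend3 = trend(sales, 3)
trend7 = trend(sales, 7)
print(trend3)
print(trend7)
```

For example, row 3 of the 3-period trend is (498 - 237) / 3 = 87.0.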

Source Distribution

timeseries_feature_engineering-0.0.1.tar.gz (13.4 kB)
