Time Series Feature Engineering
Time series feature generator.
Install
pip install timeseries_feature_engineering
How to use
Add Date Parts
import pandas as pd

df = pd.DataFrame({'date': ['2019-12-04', None, '2019-11-15', '2019-10-24']})
df = add_datepart(df, 'date')
df.head()
| | Year | Month | Week | Day | Dayofweek | Dayofyear | Is_month_end | Is_month_start | Is_quarter_end | Is_quarter_start | Is_year_end | Is_year_start | Elapsed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2019.0 | 12.0 | 49.0 | 4.0 | 2.0 | 338.0 | False | False | False | False | False | False | 1575417600 |
| 1 | NaN | NaN | NaN | NaN | NaN | NaN | False | False | False | False | False | False | None |
| 2 | 2019.0 | 11.0 | 46.0 | 15.0 | 4.0 | 319.0 | False | False | False | False | False | False | 1573776000 |
| 3 | 2019.0 | 10.0 | 43.0 | 24.0 | 3.0 | 297.0 | False | False | False | False | False | False | 1571875200 |
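To make the columns above concrete, here is a minimal sketch of how similar date parts can be derived with plain pandas. It is not the package's implementation: `date_parts_sketch` is a hypothetical helper, the `Week` column is left out, and the `Elapsed` column is assumed to be seconds since the Unix epoch, which matches the values shown in the table.

```python
import pandas as pd

# Hypothetical helper for illustration only; not part of the package.
def date_parts_sketch(df, col):
    """Return a frame of calendar features derived from df[col]."""
    dates = pd.to_datetime(df[col])  # None becomes NaT
    parts = pd.DataFrame(index=df.index)
    fields = [
        "year", "month", "day", "dayofweek", "dayofyear",
        "is_month_end", "is_month_start",
        "is_quarter_end", "is_quarter_start",
        "is_year_end", "is_year_start",
    ]
    for name in fields:
        # e.g. "dayofweek" -> column "Dayofweek"
        parts[name.capitalize()] = getattr(dates.dt, name)
    # Assumption: Elapsed is seconds since 1970-01-01; NaT yields NaN.
    parts["Elapsed"] = (dates - pd.Timestamp("1970-01-01")).dt.total_seconds()
    return parts

example = pd.DataFrame({"date": ["2019-12-04", None, "2019-11-15", "2019-10-24"]})
parts = date_parts_sketch(example, "date")
print(parts[["Year", "Month", "Day", "Dayofweek", "Dayofyear", "Elapsed"]])
```

Missing dates propagate as NaN in the numeric columns, which is why those columns are floats rather than integers.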
Add Moving Average Features
With weighted average.
Recency is an important factor in a time series: values closer to the current date tend to carry more information, so a weighted moving average gives the most recent observations in each window more weight.
import numpy as np

df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'),
    'sales': np.random.randint(100, 500, size=10)
})
df = add_moving_average_features(df, 'sales', windows=[3,5], weighted=True)
df.head(10)
| | date | sales | sales_3p_MA | sales_5p_MA |
|---|---|---|---|---|
| 0 | 2019-12-01 | 155 | NaN | NaN |
| 1 | 2019-12-02 | 437 | NaN | NaN |
| 2 | 2019-12-03 | 361 | 352.000000 | NaN |
| 3 | 2019-12-04 | 356 | 371.166667 | NaN |
| 4 | 2019-12-05 | 490 | 423.833333 | 399.066667 |
| 5 | 2019-12-06 | 222 | 333.666667 | 353.133333 |
| 6 | 2019-12-07 | 197 | 254.166667 | 294.400000 |
| 7 | 2019-12-08 | 390 | 297.666667 | 316.000000 |
| 8 | 2019-12-09 | 159 | 242.333333 | 258.666667 |
| 9 | 2019-12-10 | 470 | 353.000000 | 318.133333 |
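The weighted values above are consistent with linearly increasing weights 1, 2, ..., n inside each window, for example (155·1 + 437·2 + 361·3) / 6 = 352. The following sketch reproduces that with plain pandas and NumPy. It is an illustration under that assumption, not the package's source: `weighted_ma` is a hypothetical helper, and the sales figures are copied from the table.

```python
import numpy as np
import pandas as pd

# Hypothetical helper: linearly weighted moving average (weights 1..window).
def weighted_ma(series, window):
    weights = np.arange(1, window + 1)
    return series.rolling(window).apply(
        lambda x: np.dot(x, weights) / weights.sum(), raw=True
    )

# Sales values taken from the table above.
sales = pd.Series([155, 437, 361, 356, 490, 222, 197, 390, 159, 470])

wma3 = weighted_ma(sales, 3)      # newest value counts three times as much as the oldest
wma5 = weighted_ma(sales, 5)
sma3 = sales.rolling(3).mean()    # plain (unweighted) mean, for comparison

print(pd.DataFrame({"sales": sales, "wma3": wma3, "wma5": wma5, "sma3": sma3}))
```

The first `window - 1` rows are NaN in both variants because a full window is not yet available.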
Without weighted average.
df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'),
    'sales': np.random.randint(100, 500, size=10)
})
df = add_moving_average_features(df, 'sales', windows=[3,5], weighted=False)
df.head(10)
| | date | sales | sales_3p_MA | sales_5p_MA |
|---|---|---|---|---|
| 0 | 2019-12-01 | 167 | NaN | NaN |
| 1 | 2019-12-02 | 458 | NaN | NaN |
| 2 | 2019-12-03 | 260 | 310.500000 | NaN |
| 3 | 2019-12-04 | 174 | 250.000000 | NaN |
| 4 | 2019-12-05 | 392 | 297.333333 | 301.266667 |
| 5 | 2019-12-06 | 401 | 360.166667 | 338.200000 |
| 6 | 2019-12-07 | 460 | 429.000000 | 379.200000 |
| 7 | 2019-12-08 | 381 | 410.666667 | 393.733333 |
| 8 | 2019-12-09 | 349 | 378.166667 | 389.533333 |
| 9 | 2019-12-10 | 365 | 362.333333 | 379.000000 |
Add Expanding Features
df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'),
    'sales': np.random.randint(100, 500, size=10)
})
df = add_expanding_features(df, 'sales', period=3)
df.head()
| | date | sales | sales_3p_expanding |
|---|---|---|---|
| 0 | 2019-12-01 | 178 | NaN |
| 1 | 2019-12-02 | 398 | NaN |
| 2 | 2019-12-03 | 399 | 325.0 |
| 3 | 2019-12-04 | 385 | 340.0 |
| 4 | 2019-12-05 | 136 | 299.2 |
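The expanding column above matches a running mean over all rows seen so far, reported only once at least `period` observations are available, for example (178 + 398 + 399) / 3 = 325. A minimal pandas sketch of that reading follows. It assumes `period` maps to `min_periods`, which is inferred from the table rather than from the package's source.

```python
import pandas as pd

# Sales values taken from the table above.
sales = pd.Series([178, 398, 399, 385, 136])

# Assumption: period=3 behaves like min_periods=3 on an expanding mean.
expanding_mean = sales.expanding(min_periods=3).mean()

print(expanding_mean)
```

Unlike a moving average, the window never drops old rows, so each new value has a progressively smaller effect on the result.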
Add Trend Features
df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'),
    'sales': np.random.randint(100, 500, size=10)
})
df = add_trend_features(df, 'sales', windows=[3,7])
df.head(10)
| | date | sales | sales_3p_trend | sales_7p_trend |
|---|---|---|---|---|
| 0 | 2019-12-01 | 237 | 0.000000 | 0.000000 |
| 1 | 2019-12-02 | 388 | 0.000000 | 0.000000 |
| 2 | 2019-12-03 | 384 | 0.000000 | 0.000000 |
| 3 | 2019-12-04 | 498 | 87.000000 | 0.000000 |
| 4 | 2019-12-05 | 275 | -37.666667 | 0.000000 |
| 5 | 2019-12-06 | 382 | -0.666667 | 0.000000 |
| 6 | 2019-12-07 | 132 | -122.000000 | 0.000000 |
| 7 | 2019-12-08 | 337 | 20.666667 | 14.285714 |
| 8 | 2019-12-09 | 496 | 38.000000 | 15.428571 |
| 9 | 2019-12-10 | 216 | 28.000000 | -24.000000 |
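The trend values above are consistent with the row-to-row change in the moving average, with missing values filled with 0. For window 3 at row 3, that is (498 - 237) / 3 = 87. The sketch below reproduces the table under that assumption. `trend_sketch` is a hypothetical helper, and the package may compute the feature differently.

```python
import pandas as pd

# Hypothetical helper: change in the rolling mean between consecutive rows.
def trend_sketch(series, window):
    rolling_mean = series.rolling(window).mean()
    # diff() is NaN until two full windows exist; the table shows 0 there.
    return rolling_mean.diff().fillna(0)

# Sales values taken from the table above.
sales = pd.Series([237, 388, 384, 498, 275, 382, 132, 337, 496, 216])

trend3 = trend_sketch(sales, 3)
trend7 = trend_sketch(sales, 7)

print(pd.DataFrame({"sales": sales, "trend3": trend3, "trend7": trend7}))
```

A positive value means the moving average is rising at that row, and a negative value means it is falling.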