Time Series Feature Engineering

Time series feature generator.

Install

pip install timeseries_feature_engineering

How to use

Add Date Parts

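import pandas as pd
# add_datepart comes from this package; its import line is not shown in the original example.
df = pd.DataFrame({'date': ['2019-12-04', None, '2019-11-15', '2019-10-24']})
df = add_datepart(df, 'date')
df.head()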
     Year  Month  Week   Day  Dayofweek  Dayofyear  Is_month_end  Is_month_start  Is_quarter_end  Is_quarter_start  Is_year_end  Is_year_start     Elapsed
0  2019.0   12.0  49.0   4.0        2.0      338.0         False           False           False             False        False          False  1575417600
1     NaN    NaN   NaN   NaN        NaN        NaN         False           False           False             False        False          False        None
2  2019.0   11.0  46.0  15.0        4.0      319.0         False           False           False             False        False          False  1573776000
3  2019.0   10.0  43.0  24.0        3.0      297.0         False           False           False             False        False          False  1571875200
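
To make the derived columns concrete, here is a minimal sketch of how several of them could be computed with plain pandas `.dt` accessors. `date_parts_sketch` is a hypothetical helper written for illustration. It is not the package's `add_datepart` implementation, and it covers only a subset of the columns shown above.

```python
import pandas as pd

def date_parts_sketch(df: pd.DataFrame, col: str) -> pd.DataFrame:
    """Hypothetical helper: expand a date column into a few calendar features."""
    dates = pd.to_datetime(df[col])          # missing dates become NaT
    out = df.drop(columns=[col]).copy()
    out["Year"] = dates.dt.year              # NaN where the date is missing
    out["Month"] = dates.dt.month
    out["Day"] = dates.dt.day
    out["Dayofweek"] = dates.dt.dayofweek    # Monday == 0
    out["Dayofyear"] = dates.dt.dayofyear
    out["Is_month_end"] = dates.dt.is_month_end
    out["Is_month_start"] = dates.dt.is_month_start
    return out

df = pd.DataFrame({"date": ["2019-12-04", None, "2019-11-15", "2019-10-24"]})
parts = date_parts_sketch(df, "date")
print(parts)
```

The numeric columns come out as floats because the missing date in row 1 forces NaN values, which matches the float-valued columns in the output above.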

Add Moving Average Features

With weighted average.

Recency is an important factor in a time series: values closer to the current date usually carry more information, so the weighted moving average gives recent observations more weight.

import numpy as np
df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'),
    'sales': np.random.randint(100, 500, size=10)  # random values, so the output differs on every run
})
df = add_moving_average_features(df, 'sales', windows=[3,5], weighted=True)
df.head(10)
        date  sales  sales_3p_MA  sales_5p_MA
0 2019-12-01    155          NaN          NaN
1 2019-12-02    437          NaN          NaN
2 2019-12-03    361   352.000000          NaN
3 2019-12-04    356   371.166667          NaN
4 2019-12-05    490   423.833333   399.066667
5 2019-12-06    222   333.666667   353.133333
6 2019-12-07    197   254.166667   294.400000
7 2019-12-08    390   297.666667   316.000000
8 2019-12-09    159   242.333333   258.666667
9 2019-12-10    470   353.000000   318.133333
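
The weighted values above are consistent with linearly increasing weights: 1 for the oldest value in the window, up to the window size for the newest. For example, row 2 of the 3-period column is (1*155 + 2*437 + 3*361) / 6 = 352.0. The sketch below reproduces that calculation under this assumed weighting scheme. `weighted_ma_sketch` is a hypothetical helper, not the package's own code.

```python
import numpy as np
import pandas as pd

def weighted_ma_sketch(series: pd.Series, window: int) -> pd.Series:
    """Hypothetical helper: linearly weighted moving average, newest value weighted most."""
    weights = np.arange(1, window + 1)  # 1 for the oldest value ... window for the newest
    return series.rolling(window).apply(
        lambda values: np.dot(values, weights) / weights.sum(), raw=True
    )

# Sales values copied from the table above.
sales = pd.Series([155, 437, 361, 356, 490, 222, 197, 390, 159, 470])
wma3 = weighted_ma_sketch(sales, 3)
wma5 = weighted_ma_sketch(sales, 5)
print(pd.DataFrame({"sales": sales, "wma3": wma3, "wma5": wma5}))
```

The first `window - 1` rows stay NaN because `rolling(window)` needs a full window before it produces a value.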

Without weighted average.

Pass weighted=False for a moving average without weights. The values below assume that this yields a plain rolling mean over each window.

df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'),
    'sales': np.random.randint(100, 500, size=10)
})
df = add_moving_average_features(df, 'sales', windows=[3,5], weighted=False)
df.head(10)
        date  sales  sales_3p_MA  sales_5p_MA
0 2019-12-01    167          NaN          NaN
1 2019-12-02    458          NaN          NaN
2 2019-12-03    260   295.000000          NaN
3 2019-12-04    174   297.333333          NaN
4 2019-12-05    392   275.333333        290.2
5 2019-12-06    401   322.333333        337.0
6 2019-12-07    460   417.666667        337.4
7 2019-12-08    381   414.000000        361.6
8 2019-12-09    349   396.666667        396.6
9 2019-12-10    365   365.000000        391.2

Add Expanding Features

df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'), 
    'sales': np.random.randint(100, 500, size=10)
})
df = add_expanding_features(df, 'sales', period=3)
df.head()
        date  sales  sales_3p_expanding
0 2019-12-01    178                 NaN
1 2019-12-02    398                 NaN
2 2019-12-03    399               325.0
3 2019-12-04    385               340.0
4 2019-12-05    136               299.2
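
The expanding column above matches a running mean over all rows seen so far, starting once three observations are available: (178 + 398 + 399) / 3 = 325.0. A minimal sketch with plain pandas, assuming `period` acts as the minimum number of observations (an inference from the output, not something the source states):

```python
import pandas as pd

# Sales values copied from the table above.
sales = pd.Series([178, 398, 399, 385, 136])

# Mean of every value from the first row up to the current one;
# rows with fewer than 3 observations are left as NaN.
expanding_mean = sales.expanding(min_periods=3).mean()
print(expanding_mean)
```

Unlike a moving average, the window here never drops old values, so each new row changes the mean less than the previous one did.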

Add Trend Features

df = pd.DataFrame({
    'date': pd.date_range('2019-12-01', '2019-12-10'), 
    'sales': np.random.randint(100, 500, size=10)
})
df = add_trend_features(df, 'sales', windows=[3,7])
df.head(10)
        date  sales  sales_3p_trend  sales_7p_trend
0 2019-12-01    237        0.000000        0.000000
1 2019-12-02    388        0.000000        0.000000
2 2019-12-03    384        0.000000        0.000000
3 2019-12-04    498       87.000000        0.000000
4 2019-12-05    275      -37.666667        0.000000
5 2019-12-06    382       -0.666667        0.000000
6 2019-12-07    132     -122.000000        0.000000
7 2019-12-08    337       20.666667       14.285714
8 2019-12-09    496       38.000000       15.428571
9 2019-12-10    216       28.000000      -24.000000
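
The trend values above match the change over the window divided by the window length, with 0 where fewer than `window` earlier rows exist. For example, row 3 of the 3-period column is (498 - 237) / 3 = 87.0. The sketch below reproduces the columns under that assumption. `trend_sketch` is a hypothetical helper, and the formula is inferred from the output rather than taken from the package's source.

```python
import pandas as pd

def trend_sketch(series: pd.Series, window: int) -> pd.Series:
    """Hypothetical helper: average change per step over the last `window` rows."""
    # diff(window) is x[t] - x[t - window]; rows without enough history become 0.
    return series.diff(window).div(window).fillna(0)

# Sales values copied from the table above.
sales = pd.Series([237, 388, 384, 498, 275, 382, 132, 337, 496, 216])
trend3 = trend_sketch(sales, 3)
trend7 = trend_sketch(sales, 7)
print(pd.DataFrame({"sales": sales, "trend3": trend3, "trend7": trend7}))
```

Filling the early rows with 0 keeps the column free of NaN values, at the cost of making "no history" indistinguishable from "no change".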
