An autocorrelation function (ACF) based automatic finder of seasonality periods for univariate time series.
auto-period-finder
About auto-period-finder
auto-period-finder is an autocorrelation function (ACF) based automatic finder of seasonality periods for univariate time series.
Installation
To install the latest version of auto-period-finder, simply run:
pip install auto-period-finder
Example
Start by loading a time series dataset with a frequency. We can use the co2 sample dataset (atmospheric CO2 measurements from Mauna Loa) from statsmodels:
from statsmodels.datasets import co2
data = co2.load().data
You can resample the data to whatever frequency you want.
data = data.resample("ME").mean().ffill()
Use AutoPeriodFinder to find the list of seasonality periods based on ACF.
from auto_period_finder import AutoPeriodFinder
period_finder = AutoPeriodFinder(data)
periods = period_finder.fit()
You can also find the most prominent period either ACF-wise:
strongest_period_acf = period_finder.fit_find_strongest_acf()
or variance-wise:
strongest_period_var = period_finder.fit_find_strongest_var()
You can learn more about measuring the seasonality component through variance here.
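As a rough illustration of what a variance-based measure can look like, the sketch below computes the strength of seasonality described in Hyndman & Athanasopoulos [1], $F_S = \max(0, 1 - \mathrm{Var}(R) / \mathrm{Var}(S + R))$, from a seasonal component $S$ and a remainder $R$ that have already been obtained from some decomposition. This is an assumption about the kind of measure meant here, not a description of what fit_find_strongest_var does internally; the function name seasonal_strength is hypothetical.

```python
from statistics import pvariance
from typing import Sequence


def seasonal_strength(seasonal: Sequence[float], residual: Sequence[float]) -> float:
    """Strength of seasonality: max(0, 1 - Var(R) / Var(S + R)).

    Hypothetical helper for illustration only; `seasonal` and `residual`
    are assumed to come from a prior decomposition of the series.
    """
    if len(seasonal) != len(residual):
        raise ValueError("seasonal and residual must have the same length")
    detrended = [s + r for s, r in zip(seasonal, residual)]
    var_detrended = pvariance(detrended)
    if var_detrended == 0:
        # A constant detrended series carries no seasonal signal.
        return 0.0
    return max(0.0, 1.0 - pvariance(residual) / var_detrended)


# Toy components: a strong alternating seasonal pattern plus a small remainder.
seasonal = [1.0, -1.0] * 10
residual = [0.1, 0.1, -0.1, -0.1] * 5
strength = seasonal_strength(seasonal, residual)
print(round(strength, 3))
```

A value close to 1 means the seasonal component dominates the remainder; a value close to 0 means there is little seasonality left to explain.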
How to Get Started
This project is built and published using Poetry. To set up a development environment for this project, follow these steps:
- First, install one of the compatible Python versions indicated above.
- Install Poetry. You can follow this guide and use their official installer.
- Navigate to the root folder and install dependencies in a virtual environment:
poetry install
- If everything worked properly, you should have an environment named something like auto-period-finder-geinoPPi-py3.10 activated (the suffix depends on your machine and Python version). You can verify this by running:
poetry env list
- You can run tests using the command:
poetry run pytest
- To export the detailed list of dependencies, run the following commands:
poetry self add poetry-plugin-export
poetry export --output requirements.txt
ACF-Based Seasonality Period Detection Explained
An easy and quick way to find the seasonality periods of a univariate time series is to check its autocorrelation function (ACF) and look for specific characteristics in lag values, which we detail below. You can read more about time series ACF here, but intuitively, an autocorrelation coefficient $r_k$ measures the linear relationship between $k$-lagged values of a given time series. In simpler terms, $r_k$ measures how similar or dissimilar time series values are when they are $k$ steps apart from each other. The set of $r_k$ values for each lag $k$ makes up the ACF. Equipped with this information, I developed a package for finding time series seasonality periods automatically using ACF information.
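To make the definition of $r_k$ concrete, here is a minimal pure-Python sketch of the sample autocorrelation coefficient as defined in [1]: the sum of products of mean-centered values $k$ steps apart, divided by the sum of squared deviations. The function name acf is chosen for illustration; it is not the package's internal code.

```python
from typing import List, Sequence


def acf(series: Sequence[float], max_lag: int) -> List[float]:
    """Return [r_0, r_1, ..., r_max_lag] for a univariate series."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(d * d for d in deviations)
    coefficients = []
    for k in range(max_lag + 1):
        # Products of values that are k steps apart.
        numerator = sum(deviations[t] * deviations[t - k] for t in range(k, n))
        coefficients.append(numerator / denominator)
    return coefficients


# A toy series that repeats every 4 steps.
series = [1.0, 3.0, 2.0, 0.0] * 6
r = acf(series, max_lag=8)
print([round(value, 3) for value in r])
```

In this toy series, the coefficients at lags 4 and 8 stand out above their neighbors, which is the pattern the detection relies on.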
Simply put, given a univariate time series $T$, the algorithm finds, iteratively, lag values $k$ such that:
- $1 \lt k \leq \frac{\lvert T \rvert}{2}$
- Autocorrelation coefficients $r_q$ are local maxima where $q \in \{k, 2k, 3k, \ldots\}$
- $\forall p \in P, \forall n \in \mathbb{N}, k \neq n \times p$, where $P$ is the list of already found periods.
The list of such $k$ values constitutes the set of found seasonality periods $P$. To understand this further, consider a hypothetical time series of hourly frequency that has a clear weekly seasonality.
Now let's look at the corresponding ACF for the time series above:
You can see that the autocorrelation coefficient for the lag value of 168 hours (i.e. one week) is a local maximum (red-border square). Similarly, the autocorrelation coefficients for lag values that are multiples of 168 are local maxima (gray-border squares). We can therefore conclude that this time series has a weekly seasonality period.
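The same observation can be reproduced numerically. The sketch below builds a synthetic hourly series (an assumption made for illustration: a daily cycle plus a weekend drop, repeated for 8 weeks), computes its ACF in pure Python, and checks whether lag 168 and its multiples are local maxima. All names and constants here are hypothetical and not part of the package.

```python
import math
from typing import List, Sequence


def acf(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelation coefficients r_0..r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denominator = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denominator
        for k in range(max_lag + 1)
    ]


HOURS_PER_WEEK = 168
N_WEEKS = 8


def hourly_value(t: int) -> float:
    """Synthetic signal: daily sine cycle with lower values on days 5 and 6."""
    hour_of_day = t % 24
    day_of_week = (t // 24) % 7
    daily_cycle = 3.0 * math.sin(2 * math.pi * hour_of_day / 24)
    weekend_drop = 4.0 if day_of_week >= 5 else 0.0
    return 10.0 + daily_cycle - weekend_drop


series = [hourly_value(t) for t in range(HOURS_PER_WEEK * N_WEEKS)]
r = acf(series, max_lag=len(series) // 2)


def is_local_max(values: Sequence[float], q: int) -> bool:
    """True when values[q] is strictly greater than both neighbors."""
    return values[q - 1] < values[q] > values[q + 1]


weekly_lags = [168, 336, 504]
checks = [is_local_max(r, q) for q in weekly_lags]
print(dict(zip(weekly_lags, checks)))
```

Because the synthetic signal also contains a daily cycle, lag 24 shows a peak as well; the weekly lags are the ones where the full pattern, including the weekend drop, lines up with itself.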
Notes
- The first condition is needed because a seasonality period can be neither 1 (a trivial case) nor greater than half the length of the target time series (by definition, a seasonality has to manifest itself at least twice in a given time series).
- The third condition eliminates redundant seasonality periods that are multiples of each other. The algorithm does allow, however, finding seasonality periods that divide already found seasonality periods.
- The period detection uses argmax on the ACF to select seasonality period candidates before checking that they satisfy the conditions discussed above. Therefore, the list of seasonality periods is returned in descending order of their corresponding ACF coefficients.
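Putting the three conditions and the notes together, the procedure can be sketched roughly as follows. This is a simplified pure-Python illustration under stated assumptions (strict local maxima, candidates visited in descending order of their ACF coefficient), not the package's actual implementation; find_periods and its helpers are hypothetical names.

```python
from typing import List, Sequence


def acf(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelation coefficients r_0..r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denominator = sum(d * d for d in dev)
    if denominator == 0:
        # Constant series: no correlation structure to exploit.
        return [0.0] * (max_lag + 1)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denominator
        for k in range(max_lag + 1)
    ]


def find_periods(series: Sequence[float]) -> List[int]:
    """Return candidate seasonality periods, strongest ACF first."""
    max_lag = len(series) // 2          # condition 1: k <= |T| / 2
    if max_lag < 2:
        return []
    # One extra lag so that max_lag itself has a right-hand neighbor.
    r = acf(series, max_lag + 1)

    def is_local_max(q: int) -> bool:
        return r[q - 1] < r[q] > r[q + 1]

    periods: List[int] = []
    # Visiting lags by descending r_k mimics repeated argmax selection;
    # starting at 2 enforces the other half of condition 1 (k > 1).
    for k in sorted(range(2, max_lag + 1), key=lambda lag: r[lag], reverse=True):
        # Condition 3: skip multiples of periods that were already found.
        if any(k % p == 0 for p in periods):
            continue
        # Condition 2: r_k, r_2k, r_3k, ... must all be local maxima.
        if all(is_local_max(q) for q in range(k, max_lag + 1, k)):
            periods.append(k)
    return periods


toy_series = [1.0, 3.0, 2.0, 0.0] * 6   # repeats every 4 steps
found = find_periods(toy_series)
print(found)
```

Note that the multiple check only runs in one direction: a later candidate that divides an already found period is still accepted, which matches the second note above.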
References
- [1] Hyndman, R.J., & Athanasopoulos, G. (2021) Forecasting: principles and practice, 3rd edition, OTexts: Melbourne, Australia. OTexts.com/fpp3. Accessed on 12-25-2023.
Hashes for auto_period_finder-0.0.11.tar.gz
Algorithm | Hash digest
---|---
SHA256 | e96cef737bfae3bbfb292724daaa74d905368488c3d1b6da208a0c8b5197e100
MD5 | 4cab6a0ea43778957059e336fdba715a
BLAKE2b-256 | d8a7fb9de46970cdce39370422745e0ac1b144fe3dea12006e448edb08010b11
Hashes for auto_period_finder-0.0.11-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | d18bec069485ee00ea312cfaebc80b0278ef2da3b084fd08c70ebcfdea7abb41
MD5 | 64a63cf410bcff636cf28046d0759395
BLAKE2b-256 | e584d2f06bd12cc8a1d9989632b8a3002b0b99ab1b98e6b1120ad5a9d52bad7b