Comprehensive time series stationarity analysis with rigorous statistical testing across trend, variance, and seasonality

StationarityToolkit

A comprehensive Python library for time series stationarity analysis. Provides 11 statistical tests (4 trend, 4 variance, 3 seasonal) with clear diagnostics and actionable transformation guidance.

What This Does

StationarityToolkit runs comprehensive stationarity checks across three dimensions: trend (unit roots, deterministic trends, structural breaks), variance (heteroskedasticity, volatility clustering), and seasonality (periodic patterns, seasonal unit roots). You get 4 trend tests, 4 variance tests, and 3 seasonal tests—all in one run. Each test returns not just pass/fail, but actionable notes indicating the appropriate transformation: differencing for unit roots, detrending for deterministic trends, Box-Cox for variance instability, seasonal differencing for seasonality, or GARCH modeling for volatility clustering.

Why Use This

Most toolkits provide limited testing and require manual interpretation of results. This toolkit runs a comprehensive suite of tests, identifies the specific type of non-stationarity (unit root vs deterministic trend vs variance instability), and suggests potential transformations. More importantly, it reveals what's actually happening in your data at each step, enabling iterative test-transform-retest workflows where you make informed decisions based on your specific use case.

Installation

pip install stationarity-toolkit

Quick Start

import pandas as pd
from stationarity_toolkit import StationarityToolkit

# Load your time series data
df = pd.read_csv('your_data.csv')
ts = pd.Series(df['value'].values, index=pd.DatetimeIndex(df['date'], freq='D'))

# Initialize toolkit
toolkit = StationarityToolkit(alpha=0.05)

# Run detection
result = toolkit.detect(ts, verbosity='detailed')

# View summary
print(result.summary)

# View detailed results as DataFrame
result.report()

# Save report to file
result.report(filepath='stationarity_report.md')
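
If no CSV file is at hand, a synthetic series can stand in for `your_data.csv`. The sketch below is only an illustration and is not part of the toolkit: it uses NumPy and pandas to build a daily series with a linear trend, a weekly cycle, and noise, so all values and parameters are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 365
index = pd.date_range("2023-01-01", periods=n, freq="D")
t = np.arange(n)

# Linear trend + weekly cycle + Gaussian noise (all values are illustrative).
values = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 7) + rng.normal(0.0, 1.0, n)
ts = pd.Series(values, index=index, name="value")

print(ts.head())
```

The resulting `ts` has an explicit daily frequency and can be passed to `toolkit.detect(ts)` exactly as in the Quick Start.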

Examples

  • Basic Usage: Simple example showing core functionality
  • Detailed Usage: Comprehensive notebook demonstrating toolkit validation, transformation workflows, and real-world applications

Statistical Tests

The toolkit runs 11 tests across three categories. Here's what each test does:

Trend Tests (4 tests) check for constant mean. ADF (Augmented Dickey-Fuller) tests for unit roots vs deterministic trends using 4-case logic. KPSS (Kwiatkowski-Phillips-Schmidt-Shin) complements ADF by reversing the null hypothesis: stationarity is the null and a unit root is the alternative. Phillips-Perron is a unit root test with a non-parametric correction that makes it robust to serial correlation and heteroskedasticity in the errors. Zivot-Andrews is a unit root test that allows for a single structural break (a discrete regime change, not a smooth trend) and pinpoints its location.

Variance Tests (4 tests) check for constant variance. Levene's Test detects variance changes across segments of the series. Bartlett's Test does the same but assumes normality. White's Test detects time-dependent variance patterns (heteroskedasticity). The ARCH Test detects volatility clustering (time-varying conditional variance); it may also trigger on trend or seasonality if either is present.
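
The segment-based idea behind Levene's and Bartlett's tests can be sketched with SciPy: split the series into consecutive chunks and compare their variances. This is a minimal illustration, not the toolkit's implementation; the helper name `segment_variance_pvalues` and the choice of four segments are assumptions.

```python
import numpy as np
from scipy import stats


def segment_variance_pvalues(values, n_segments=4):
    """Split a series into consecutive segments and test for equal variances."""
    segments = np.array_split(np.asarray(values, dtype=float), n_segments)
    levene_p = stats.levene(*segments, center="median").pvalue
    bartlett_p = stats.bartlett(*segments).pvalue
    return {"levene_p": float(levene_p), "bartlett_p": float(bartlett_p)}


rng = np.random.default_rng(1)
calm = rng.normal(0.0, 1.0, size=200)      # low-variance regime
volatile = rng.normal(0.0, 5.0, size=200)  # high-variance regime
series = np.concatenate([calm, volatile])

print(segment_variance_pvalues(series))
```

Small p-values indicate that the segments do not share a common variance, which is the situation where a power transformation is worth considering.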

Seasonal Tests (3 tests) check for periodic patterns, using candidate lag periods chosen from the time series frequency (see Contextual Period Detection). ACF/PACF Peak Detection identifies seasonality through autocorrelation peaks at those periods (it may also trigger on trend or variance effects). STL Decomposition tests the significance of the seasonal component via variance comparison (detects deterministic seasonality). OCSB is a regression-based seasonal unit root test (detects stochastic seasonality, not fixed patterns; use STL for deterministic seasonality).
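
To make the autocorrelation-peak idea concrete, here is a small NumPy sketch that checks the sample autocorrelation at a few candidate periods. The helper `autocorr` and the simple white-noise threshold are assumptions chosen for illustration; the toolkit's own peak detection may use different rules.

```python
import numpy as np


def autocorr(values, lag):
    """Sample autocorrelation of `values` at a positive integer `lag`."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))


rng = np.random.default_rng(3)
t = np.arange(280)  # 40 full weeks of daily observations
weekly = np.sin(2 * np.pi * t / 7) + 0.1 * rng.normal(size=t.size)

candidate_periods = [7, 30]
# Approximate 95% band for white noise, used here as a simple cutoff.
threshold = 1.96 / np.sqrt(t.size)

for period in candidate_periods:
    r = autocorr(weekly, period)
    print(period, round(r, 3), "peak" if r > threshold else "no peak")
```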

Workflow: How to Use the Toolkit

Step 1: Run Initial Detection

Start by running the toolkit on your time series. Use verbosity='detailed' to get the full report with all test results and actionable notes.

result = toolkit.detect(ts, verbosity='detailed')

Check the summary. If all three categories (trend, variance, seasonality) are stationary, you can proceed with modeling. If any are non-stationary, follow the transformation workflow below.

Step 2: Address Non-Stationarity (Order Matters!)

When transforming non-stationary data, always address issues in this order: Variance → Trend → Seasonality. This sequence matters because unstable variance affects the reliability of trend and seasonal tests, and unremoved trends can obscure seasonal patterns. See Why Order Matters for the reasoning behind this sequence.

2.1 Stabilize Variance First

If Levene's, Bartlett's, or White's tests detect variance instability, consider power transformations like Box-Cox (for positive data) or Yeo-Johnson (handles negative values). If the ARCH test detects volatility clustering, you may need GARCH modeling after addressing trend and seasonality. The toolkit's notes will guide you on which issue was detected.
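
One possible way to apply these power transformations is through SciPy, sketched below under the assumption that the data are held in a NumPy array. `scipy.stats.boxcox` requires strictly positive input, while `scipy.stats.yeojohnson` also accepts zero and negative values; both estimate the transformation parameter by maximum likelihood when none is given. The synthetic data are illustrative only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
t = np.linspace(1.0, 4.0, 300)

# Strictly positive series whose spread grows with its level.
positive = np.exp(t + 0.1 * rng.normal(size=t.size))

# Box-Cox: positive data only; returns the transformed values and lambda.
boxcox_values, boxcox_lambda = stats.boxcox(positive)

# Yeo-Johnson: also works after centering, when values become negative.
centered = positive - positive.mean()
yj_values, yj_lambda = stats.yeojohnson(centered)

print("Box-Cox lambda:", boxcox_lambda)
print("Yeo-Johnson lambda:", yj_lambda)
```

After transforming, wrap the values back into a `pd.Series` with the original index and re-run the toolkit to check whether the variance tests now pass.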

2.2 Address Trend (After Variance is Stable)

If ADF, KPSS, or Phillips-Perron detect a unit root, differencing is typically needed. If they detect a deterministic trend, detrending (removing the fitted trend) is appropriate. If Zivot-Andrews detects structural breaks, consider approaches like dummy variables at break points, splitting the series into segments, or piecewise regression. The 4-case logic in the test notes will tell you which situation applies (see 4-Case Logic for Trend Tests for details).
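
The two most common trend fixes can be sketched with pandas and NumPy. The example assumes a linear deterministic trend for the detrending case; other trend shapes would need a different fit. The series are synthetic and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n = 200
index = pd.date_range("2023-01-01", periods=n, freq="D")
t = np.arange(n)
noise = rng.normal(size=n)

# Case 1: unit root (random walk) -> take the first difference.
random_walk = pd.Series(np.cumsum(noise), index=index)
differenced = random_walk.diff().dropna()

# Case 2: deterministic linear trend -> subtract a fitted line.
trending = pd.Series(2.0 + 0.5 * t + noise, index=index)
slope, intercept = np.polyfit(t, trending.to_numpy(), deg=1)
detrended = trending - (intercept + slope * t)

print("fitted slope:", round(slope, 3))
print("detrended mean:", round(detrended.mean(), 6))
```

Differencing shortens the series by one observation, while detrending keeps its length; both results can be passed back to `toolkit.detect` for a re-test.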

2.3 Address Seasonality (Last)

If ACF, STL, or OCSB tests detect seasonal patterns, seasonal differencing at the detected period is a common approach. Alternatively, STL decomposition can separate the seasonal component, allowing you to model the residuals. The toolkit reports the detected period (e.g., 52 for yearly seasonality in weekly data). Note: OCSB detects stochastic (random-walk) seasonality, while STL detects deterministic (fixed) patterns.
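
Seasonal differencing subtracts the value one full period earlier. A minimal pandas sketch, assuming a detected period of 7 on daily data and a noise-free synthetic series so the effect is easy to see:

```python
import numpy as np
import pandas as pd

period = 7  # e.g. the weekly period reported for daily data
n = 70
index = pd.date_range("2023-01-02", periods=n, freq="D")
t = np.arange(n)

# Fixed weekly pattern on top of a gentle linear trend (no noise).
weekly_pattern = np.tile([3.0, 1.0, 0.0, -1.0, -2.0, 0.5, 4.0], n // period)
ts_seasonal = pd.Series(0.1 * t + weekly_pattern, index=index)

# y[t] - y[t - period]; the first `period` values are lost.
seasonal_diff = ts_seasonal.diff(period).dropna()

print(seasonal_diff.head())
```

Here the weekly pattern cancels and only the trend's change over one period remains as a constant level, so the differenced series has neither a seasonal cycle nor a growing mean.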

Step 3: Iterate Until Stationary

After each transformation, re-run the toolkit to check if stationarity has been achieved:

result = toolkit.detect(ts_transformed, verbosity='detailed')

Continue transforming until all three categories (trend, variance, seasonality) are stationary. Sometimes multiple iterations are needed—for example, differencing might introduce new variance patterns that require stabilization.
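
The test-transform-retest loop can be sketched as a small driver that reads the `variance_stationary`, `trend_stationary`, and `seasonal_stationary` attributes listed in the API Reference. The functions `first_failing_category` and `iterate_until_stationary`, the `fixes` mapping, and the stub detector are all hypothetical; in real use `detect` would be `toolkit.detect` and each fix would be a transformation you choose after reading the report.

```python
from types import SimpleNamespace


def first_failing_category(result):
    """Return the first failing category in Variance -> Trend -> Seasonality order."""
    for name in ("variance", "trend", "seasonal"):
        if not getattr(result, f"{name}_stationary"):
            return name
    return None


def iterate_until_stationary(series, detect, fixes, max_rounds=5):
    """Apply one fix per round until every category passes or rounds run out."""
    applied = []
    for _ in range(max_rounds):
        failing = first_failing_category(detect(series))
        if failing is None:
            break
        series = fixes[failing](series)
        applied.append(failing)
    return series, applied


# Stand-in for toolkit.detect: the "series" is just a tuple of applied steps.
def stub_detect(steps):
    return SimpleNamespace(
        variance_stationary="boxcox" in steps,
        trend_stationary="diff" in steps,
        seasonal_stationary=True,
    )


stub_fixes = {
    "variance": lambda steps: steps + ("boxcox",),
    "trend": lambda steps: steps + ("diff",),
    "seasonal": lambda steps: steps + ("seasonal_diff",),
}

final_steps, applied = iterate_until_stationary((), stub_detect, stub_fixes)
print(final_steps, applied)
```

The `max_rounds` cap is a design choice to avoid looping forever when a transformation does not resolve the detected issue; in that case the report should be inspected manually.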

Step 4: Model the Stationary Series

Once your series is stationary across all three dimensions, proceed with time series modeling. Common approaches include ARIMA/SARIMA for univariate forecasting, VAR for multivariate analysis, or linear regression with time series features. Note that some machine learning models (like tree-based methods and neural networks) can handle non-stationary data directly, but stationary data may still improve generalization and reduce distribution shift.

Understanding Test Results

Each test returns four key pieces of information: a pass/fail result (✅ Stationary or ❌ Non-stationary), an actionable note indicating what transformation to apply if non-stationary, a brief technical interpretation with the null hypothesis (H0), and the test statistic and p-value for statistical significance.

Example Output

### Augmented Dickey-Fuller (ADF) Test
- Result: ✅ Stationary
- Note: Stationary around constant mean
- Interpretation: H0: Unit root present. ADF-c p=0.0234, ADF-ct p=0.0456 < 0.05. Reject H0.
- Statistic: -3.2145
- P-value: 0.0234

Key Concepts

Why Order Matters: Variance → Trend → Seasonality

When transforming non-stationary data, the sequence matters. Start with variance stabilization because unstable variance throws off the reliability of trend and seasonal tests—you might miss a unit root or misidentify seasonality if variance is jumping around. Once variance is stable, address trend next. Trend affects the mean level of your series, and if you try to detect seasonality while a strong trend is present, you'll get confused signals. Remove the trend first, then seasonal patterns become clear and easy to identify. Think of it like cleaning a dataset: fix the noise (variance), remove the drift (trend), then handle the cycles (seasonality).

Comprehensive Testing, Informed Decisions

This toolkit runs every relevant test and reports all results—not just a single pass/fail verdict. You get detailed diagnostics across trend (constant mean), variance (constant variance), and seasonality (no periodic patterns). The goal isn't to force a rigid definition of stationarity, but to give you complete information so you can make informed decisions. Maybe your use case can tolerate mild heteroskedasticity but absolutely needs no trend. Maybe you're okay with deterministic seasonality but need to eliminate stochastic trends. The toolkit shows you what's happening in your data, and you decide what matters for your specific analysis or model. Flexibility through transparency.

4-Case Logic for Trend Tests

When testing for trend non-stationarity, the trend tests (ADF, KPSS, Phillips-Perron) face a challenge: they need to distinguish between unit roots (random walk behavior requiring differencing) and deterministic trends (predictable drift requiring detrending). The toolkit runs each test twice: once with just a constant ('c') and once with constant plus trend ('ct'). A specification passes when the test indicates stationarity, which means rejecting H0 for ADF and Phillips-Perron and failing to reject H0 for KPSS, whose null is reversed. If both pass, you're stationary. If only 'c' passes, you're stationary around a constant. If only 'ct' passes, you have a deterministic trend. If both fail, you have a unit root.

The toolkit's report notes automatically identify which case applies and tell you the exact transformation needed:

Example: Unit root detected (both 'c' and 'ct' fail)

- Note: Unit root detected - requires differencing
- Interpretation: H0: Unit root. ADF-c p=0.2341, ADF-ct p=0.3456 >= 0.05. Fail to reject H0.

Example: Deterministic trend detected ('c' fails, 'ct' passes)

- Note: Deterministic trend detected - stationary after detrending
- Interpretation: H0: Unit root. ADF-c p=0.1234 >= 0.05, ADF-ct p=0.0123 < 0.05. Deterministic trend.

No guesswork, just clear guidance based on the 4-case analysis.
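
The four cases can be written down as a small decision function. This is a sketch of the logic as described in this section for tests whose null hypothesis is a unit root (ADF, Phillips-Perron), where a specification passes when its p-value is below alpha; the function name `classify_trend` and the returned labels are illustrative, not the toolkit's API.

```python
def classify_trend(p_c, p_ct, alpha=0.05):
    """Map the 'c' and 'ct' p-values of a unit root test to one of four cases."""
    c_passes = p_c < alpha    # unit root rejected with constant only
    ct_passes = p_ct < alpha  # unit root rejected with constant + trend
    if c_passes and ct_passes:
        return "stationary"
    if c_passes:
        return "stationary around a constant"
    if ct_passes:
        return "deterministic trend: detrend"
    return "unit root: difference"


# P-values taken from the example reports in this README.
print(classify_trend(0.0234, 0.0456))
print(classify_trend(0.1234, 0.0123))
print(classify_trend(0.2341, 0.3456))
```

For KPSS the comparison would be flipped, since its null hypothesis is stationarity and a specification passes when the p-value is at or above alpha.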

API Reference

StationarityToolkit

toolkit = StationarityToolkit(alpha=0.05)

Parameters:

  • alpha (float): Significance level for all tests (default: 0.05)

Methods:

  • detect(timeseries, verbosity='detailed'): Run all tests and return results
    • verbosity: 'minimal' or 'detailed' (default: 'detailed')
    • Returns: DetectionResult object

DetectionResult

Properties:

  • summary (str): 3-line summary of stationarity status (Trend/Variance/Seasonal)

Attributes:

  • trend_stationary (bool): True if all trend tests pass
  • variance_stationary (bool): True if all variance tests pass
  • seasonal_stationary (bool): True if all seasonal tests pass
  • tests (dict): Dictionary of test results by category

Methods:

  • report(filepath=None): Generate report as DataFrame and optionally save markdown to file
    • If filepath provided, writes markdown report to file
    • Returns: pandas DataFrame with all test results

Contextual Period Detection

The toolkit automatically detects expected seasonal periods to test for seasonal non-stationarity based on your time series frequency:

  • Daily ('D'): Tests for weekly (7), monthly (30), and yearly (365) seasonality
  • Hourly ('H'): Tests for daily (24) and weekly (168) seasonality
  • Monthly ('M'): Tests for quarterly (3), semi-annual (6), and yearly (12) seasonality
  • Weekly ('W'): Tests for yearly (52) seasonality
  • Quarterly ('Q'): Tests for yearly (4) seasonality

The toolkit can infer the time series frequency automatically, but setting it explicitly when loading the data is recommended:

ts = pd.Series(values, index=pd.DatetimeIndex(dates, freq='W-MON'))
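
The frequency-to-period mapping listed above can be expressed as a plain lookup. The sketch below is an assumption about how such a lookup might be organized, not the toolkit's code: `candidate_periods` is a hypothetical helper, and the alias table covers pandas frequency strings such as 'W-MON', 'MS', and the newer 'h', 'ME', and 'QE' spellings.

```python
# Candidate seasonal periods per base frequency, as listed in this README.
CANDIDATE_PERIODS = {
    "D": [7, 30, 365],
    "H": [24, 168],
    "M": [3, 6, 12],
    "W": [52],
    "Q": [4],
}

# pandas frequency strings mapped to the base keys above.
FREQ_ALIASES = {
    "D": "D",
    "H": "H", "h": "H",
    "W": "W",
    "M": "M", "MS": "M", "ME": "M",
    "Q": "Q", "QS": "Q", "QE": "Q",
}


def candidate_periods(freqstr):
    """Return the seasonal periods to test for a pandas frequency string."""
    base = freqstr.split("-")[0]  # 'W-MON' -> 'W', 'QE-DEC' -> 'QE'
    key = FREQ_ALIASES.get(base)
    return list(CANDIDATE_PERIODS.get(key, []))


print(candidate_periods("W-MON"))
print(candidate_periods("D"))
```

With a pandas series, the frequency string is available as `ts.index.freqstr` when the index was built with an explicit `freq`.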

Why Stationarity Matters

Stationarity isn't just a theoretical requirement; it's fundamental to how many time series methods work. Classical models like ARIMA and VAR are built on the assumption that your data's statistical properties don't change over time. Granger causality tests require stationary inputs to produce valid results, and cointegration analysis depends on knowing each series' order of integration, which stationarity tests establish. Machine learning models do not strictly require stationarity, and their needs vary greatly by use case, but they often generalize better and suffer less from distribution shift when trained on stationary data.

When you ignore non-stationarity, things go wrong in predictable ways. You get spurious regressions where variables appear related but aren't. Your forecasts become unreliable because the model learned patterns that don't hold in the future. Statistical inferences become invalid because the assumptions underlying your tests are violated. And your models become unstable, performing well on training data but failing on new data. Stationarity testing isn't about following rules—it's about ensuring your analysis reflects reality.

License

MIT License

Contributing

Contributions welcome! Please open an issue or submit a pull request.

Citation

If you use this toolkit in your research, please cite:

@software{stationarity_toolkit,
  title={StationarityToolkit: Comprehensive Time Series Stationarity Analysis},
  author={Malla, Bhanu Suraj},
  year={2024},
  url={https://github.com/mbsuraj/stationarityToolkit},
  doi={10.5281/zenodo.18818474}
}

Or use the "Cite this repository" button on GitHub for automatic citation generation.
