Safe, delay-aware SARIMAX with rolling evaluation and AIC-based lag selection

🧭 dynamic-sarimax

Delay-aware SARIMAX wrapper that fixes the common pitfalls of statsmodels.SARIMAX: proper lag alignment for exogenous variables, train-only scaling, and safe rolling-origin evaluation — all built-in.

✨ Why this exists

Plain SARIMAX requires you to hand-align exogenous regressors (e.g. lagged mobility, weather), risking leakage or off-by-one bugs. dynamic-sarimax makes this safe by construction.
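To make the alignment problem concrete, here is a minimal pandas sketch of pairing y_t with x_{t-b} by hand. The helper name `align_lagged` is hypothetical and not part of dynamic-sarimax; it only illustrates dropping rows whose lagged value is unavailable instead of filling them.

```python
import pandas as pd

def align_lagged(y: pd.Series, x: pd.Series, b: int) -> pd.DataFrame:
    """Pair y_t with x_{t-b}; rows without a lagged value are dropped."""
    if b < 0:
        raise ValueError("b must be non-negative")
    pairs = pd.DataFrame({"y": y, "x_lag": x.shift(b)})
    # shift(b) leaves NaN in the first b positions; drop them, never impute
    return pairs.dropna(subset=["x_lag"])

# Toy values for illustration only
y = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])
x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

aligned = align_lagged(y, x, b=2)
print(aligned)  # rows 2..4 remain: y_2 pairs with x_0, y_3 with x_1, y_4 with x_2
```

Shifting in the wrong direction (`x.shift(-b)`) would silently pair y_t with a future x, which is the kind of off-by-one leakage the paragraph above warns about.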

Key guarantees

✅ For delay b, trains only on valid pairs (y_t, x_{t-b}) — never imputes missing lags.
✅ Scalers are fit only on training windows during CV.
✅ Forecasting refuses to run if required future exogenous rows are missing.
✅ Rolling-origin evaluation and AIC-based delay selection included.
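The train-only scaling guarantee can be illustrated with a small standard-library sketch. The functions `fit_scaler` and `transform` are hypothetical names, and z-score standardization is an assumption; the README does not state which scaler the package uses.

```python
from statistics import mean, pstdev

def fit_scaler(train_values):
    """Compute mean and standard deviation from the training window only."""
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # fall back to 1.0 on constant input
    return mu, sigma

def transform(values, mu, sigma):
    """Apply previously fitted statistics; nothing is re-estimated here."""
    return [(v - mu) / sigma for v in values]

x = [2.0, 4.0, 6.0, 8.0, 100.0]   # toy data; the last value is in the test window
train, test = x[:4], x[4:]

mu, sigma = fit_scaler(train)      # the outlier at 100.0 never influences mu or sigma
train_scaled = transform(train, mu, sigma)
test_scaled = transform(test, mu, sigma)
```

Fitting the scaler on all five values instead would let the test-window outlier shift the training features, which is a form of leakage during cross-validation.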

🚀 Quickstart

# create venv and install deps
poetry install

# run example (uses example CSV under examples/)
poetry run python examples/ili_quickstart.py

from dynamic_sarimax import (
    SarimaxConfig,
    select_delay_by_aic,
    rolling_evaluate,
)

# y, x: target series and exogenous regressor(s); y_train, x_train: their training portion
cfg = SarimaxConfig(order=(5, 0, 2), seasonal_order=(1, 0, 0, 52))
best_b, best_aic = select_delay_by_aic(y_train, x_train, delays=[1, 2, 3], cfg=cfg)
print(f"Best lag = {best_b}  |  AIC = {best_aic:.2f}")

# No-peek default: at most min(horizons, best_b) steps are scored per origin
res = rolling_evaluate(y, x, cfg, delay=best_b, horizons=24, train_frac=0.8)
print(res.head())
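The idea behind AIC-based delay selection can be sketched without the package. The example below is not the implementation of `select_delay_by_aic`: it swaps SARIMAX for a plain least-squares line so that it runs with the standard library alone, and the names `ols_aic` and `pick_delay` are hypothetical. It also assumes every candidate delay is scored on the same rows so the AIC values are comparable; whether the package does this is not stated here.

```python
import math

def ols_aic(y, x):
    """Gaussian AIC (up to an additive constant) of the fit y = a + c*x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    c = sxy / sxx
    a = my - c * mx
    rss = sum((yi - a - c * xi) ** 2 for xi, yi in zip(x, y))
    k = 3  # intercept, slope, noise variance
    return n * math.log(rss / n) + 2 * k

def pick_delay(y, x, delays):
    """Score each delay b on pairs (y_t, x_{t-b}) and return the lowest-AIC delay."""
    start = max(delays)  # common sample across all candidates
    scores = {}
    for b in delays:
        y_part = y[start:]
        x_part = [x[t - b] for t in range(start, len(y))]
        scores[b] = ols_aic(y_part, x_part)
    best = min(scores, key=scores.get)
    return best, scores[best]

# Synthetic data: y depends on x with a true delay of 2, plus a small disturbance
n = 60
x = [math.sin(1.7 * t) + 0.5 * math.cos(0.3 * t) for t in range(n)]
y = [0.0, 0.0] + [2.0 * x[t - 2] + 0.1 * math.sin(5.3 * t) for t in range(2, n)]

best_b, best_aic = pick_delay(y, x, delays=[1, 2, 3])
print(f"Best lag = {best_b}  |  AIC = {best_aic:.2f}")
```

With a real SARIMAX fit, the score for each delay would come from the fitted model (statsmodels results expose it as `.aic`) rather than from this closed-form regression.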

📈 Example output

Chosen delay b (on 80% train): 2 | Train AIC: 1234.56

Per-horizon scores (rolling validation on last 20%):
 h  n_origins     MSE  sMAPE
 1         52   0.103   8.12
 2         51   0.109   8.54
 ...

Average MSE   = 0.124
Average sMAPE = 8.77 %
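For reference, the two scores in the table can be computed as in the sketch below. sMAPE has several definitions in the literature; the variant here (mean of |y - ŷ| divided by the average of |y| and |ŷ|, in percent) is an assumption and may differ from what `metrics.py` implements.

```python
def mse(y_true, y_hat):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(y_true, y_hat)) / len(y_true)

def smape(y_true, y_hat):
    """Symmetric MAPE in percent; terms with a zero denominator count as 0."""
    total = 0.0
    for a, f in zip(y_true, y_hat):
        denom = (abs(a) + abs(f)) / 2
        total += abs(a - f) / denom if denom else 0.0
    return 100 * total / len(y_true)

# Toy values for illustration only
y_true = [100.0, 200.0]
y_hat = [110.0, 180.0]

print(mse(y_true, y_hat))    # (10**2 + 20**2) / 2 = 250.0
print(smape(y_true, y_hat))  # roughly 10.03 percent
```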

⚙️ Installation

pip install dynamic-sarimax
# or
poetry add dynamic-sarimax

Python ≥ 3.10, tested on 3.10–3.12.

🧩 Components

Module	Purpose
`config.py`	Parameter dataclasses for SARIMAX and lag spec
`features.py`	Safe lagging + scaling transformer
`model.py`	Wrapper around `statsmodels.SARIMAX`
`selection.py`	Delay (lag) selection via AIC
`evaluation.py`	Rolling-origin cross-validation (new v1.2)
`metrics.py`	MSE & sMAPE helpers

🔁 Rolling validation — strategies & knobs

rolling_evaluate is the batteries-included, safe rolling-origin evaluator.

Signature

def rolling_evaluate(
    y, X, cfg,
    delay,                # int or None
    horizons,             # int > 0
    train_frac=0.8,
    min_train=30,
    *,
    # exogenous policy
    allow_future_exog=False,
    X_future_manual=None,
    # window strategy
    strategy="expanding",         # "expanding" | "sliding"
    window=None,                  # required if strategy="sliding"
    refit_every=1,                # >1 = refit every k origins
    return_details=False,         # if True returns (agg, details)
): ...

🧱 Strategies

Strategy	Description
`"expanding"`	Default. Train on `[0..o-1]` for origin `o`. The training window grows over time.
`"sliding"`	Train on last `window` observations `[o-window..o-1]`. `window` must be ≥ `min_train`.
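The two window strategies differ only in where the training slice starts. A minimal sketch, with the hypothetical helper `train_bounds` returning a half-open range; clamping the sliding start at 0 is an assumption for origins smaller than `window`.

```python
def train_bounds(origin, strategy="expanding", window=None):
    """Return (start, end) with end exclusive, i.e. training rows start..end-1."""
    if strategy == "expanding":
        return 0, origin
    if strategy == "sliding":
        if window is None:
            raise ValueError("window must be provided when strategy='sliding'")
        return max(0, origin - window), origin
    raise ValueError(f"unknown strategy: {strategy!r}")

print(train_bounds(100))                 # (0, 100): rows 0..99
print(train_bounds(100, "sliding", 30))  # (70, 100): rows 70..99
```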

🔁 Refitting cadence

`refit_every`	Behavior
`1` (default)	Refit at every origin (fully independent fits).
`k>1`	Refit every `k` origins; reuse parameters between refits. (Faster)
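One way to picture the cadence is as a flag per origin. The helper `refit_flags` is hypothetical, and counting from the first origin (so that origin 0 always refits) is an assumption about how the package schedules refits.

```python
def refit_flags(n_origins, refit_every=1):
    """True where a full refit would happen, False where parameters are reused."""
    if refit_every < 1:
        raise ValueError("refit_every must be >= 1")
    return [i % refit_every == 0 for i in range(n_origins)]

print(refit_flags(6, refit_every=4))   # [True, False, False, False, True, False]
print(sum(refit_flags(52, 4)))         # 13 full fits instead of 52
```

Under this assumption the number of full fits drops by roughly a factor of `refit_every`, which is where the runtime saving comes from.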

Future v2 roadmap: optional state reconditioning for partial re-use without full re-fit.

⚖️ Exogenous policy (no-peek by default)

Case	Behavior
`delay=None`	Univariate SARIMAX; forecasts all `horizons`.
`delay=int`, `allow_future_exog=False`	Evaluate at most `steps_eff = min(horizons, delay)` per origin — prevents future X leakage.
`delay=int`, `allow_future_exog=True`	Requires passing `X_future_manual` with the same columns as `X`. Allows full-horizon forecasting.

If delay=0 and allow_future_exog=False, then steps_eff = min(horizons, 0) = 0, so no valid horizon exists and rolling_evaluate raises RuntimeError (deliberately, to prevent silent misuse).
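The limit `min(horizons, delay)` follows from index arithmetic. Assuming origin `o` means training on rows `0..o-1` and step `h` targets row `o + h - 1`, the forecast needs the exogenous row `o + h - 1 - delay`, which has been observed only when `h <= delay`. The sketch below spells this out; both helper names are hypothetical.

```python
def effective_steps(horizons, delay, allow_future_exog=False):
    """Number of horizons that can be scored per origin under the stated policy."""
    if delay is None or allow_future_exog:
        return horizons
    return min(horizons, delay)

def needed_x_index(origin, h, delay):
    """Row of X used to forecast step h from an origin trained on rows 0..origin-1."""
    return origin + h - 1 - delay

origin, delay = 100, 2
last_observed = origin - 1
safe = [h for h in range(1, 13) if needed_x_index(origin, h, delay) <= last_observed]

print(safe)                                # [1, 2]
print(effective_steps(12, delay))          # 2
print(effective_steps(12, 0))              # 0, i.e. nothing can be evaluated
```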

📤 Return values

Mode	Description
Default	Returns aggregate DataFrame (`agg`) with columns `["h", "n_origins", "MSE", "sMAPE"]`.
With `return_details=True`	Returns tuple `(agg, details)`, where `details` has `["origin", "h", "y_true", "y_hat"]`.

agg.attrs always contains:

{
    "macro_MSE": float,
    "macro_sMAPE": float
}
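The relationship between `details`, `agg`, and the macro averages can be sketched with pandas on a stand-in frame. The values below are toy numbers, and treating the macro score as the unweighted mean over horizons is an assumption; only MSE is shown to keep the example short.

```python
import pandas as pd

# Stand-in for the `details` frame; toy values for illustration only
details = pd.DataFrame({
    "origin": [100, 100, 101, 101],
    "h":      [1, 2, 1, 2],
    "y_true": [10.0, 12.0, 11.0, 13.0],
    "y_hat":  [9.0, 14.0, 11.0, 12.0],
})

sq_err = (details["y_true"] - details["y_hat"]) ** 2
agg = (
    details.assign(sq_err=sq_err)
    .groupby("h")
    .agg(n_origins=("origin", "nunique"), MSE=("sq_err", "mean"))
    .reset_index()
)
# DataFrame.attrs is a plain dict for frame-level metadata
agg.attrs["macro_MSE"] = float(agg["MSE"].mean())

print(agg)
print(agg.attrs["macro_MSE"])  # (0.5 + 2.5) / 2 = 1.5
```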

🧪 Usage patterns

1️⃣ Univariate (default expanding window)

cfg = SarimaxConfig(order=(2,0,1), seasonal_order=(0,0,0,0))
agg = rolling_evaluate(y, X=None, cfg=cfg, delay=None, horizons=12, train_frac=0.8)

2️⃣ With exogenous (no-peek, delay-limited)

cfg = SarimaxConfig(order=(1,0,1), seasonal_order=(0,0,0,0))
agg = rolling_evaluate(y, X, cfg, delay=2, horizons=12, allow_future_exog=False)
# => Evaluates only h=1..2 per origin

3️⃣ With exogenous (opt-in future X)

import pandas as pd

X_future_manual = pd.DataFrame({...})  # Future exogenous block, same columns as X
agg = rolling_evaluate(
    y, X, cfg,
    delay=2, horizons=12,
    allow_future_exog=True,
    X_future_manual=X_future_manual,
)

4️⃣ Sliding window with refit cadence

agg = rolling_evaluate(
    y, X, cfg,
    delay=1, horizons=6,
    strategy="sliding",
    window=96,
    refit_every=4,
)

5️⃣ Detailed results for plotting

agg, details = rolling_evaluate(
    y, X=None, cfg=cfg,
    delay=None, horizons=8,
    return_details=True,
)
# details has origin, h, y_true, y_hat

⚠️ Common errors (by design)

Error	Reason
`ValueError("horizons must be positive")`	Invalid `horizons`.
`ValueError("window must be provided when strategy='sliding'")`	Missing window for sliding mode.
`ValueError("allow_future_exog=True but X_future_manual was not provided.")`	Required future exog missing.
`ValueError("Exogenous columns mismatch...")`	Column mismatch between X and X_future_manual.
`RuntimeError("No evaluations produced...")`	All origins skipped (e.g., delay=0 with no-peek).
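The validation errors in the table can be mirrored in a few lines, which is handy when wrapping `rolling_evaluate` in your own code. This is a hypothetical re-creation (`check_arguments` is not a package function), the order of the checks is an assumption, and the column-mismatch message is kept truncated exactly as it appears in the table.

```python
def check_arguments(horizons, strategy="expanding", window=None,
                    allow_future_exog=False, x_columns=(), x_future_columns=None):
    """Raise the documented ValueErrors for invalid argument combinations."""
    if horizons <= 0:
        raise ValueError("horizons must be positive")
    if strategy == "sliding" and window is None:
        raise ValueError("window must be provided when strategy='sliding'")
    if allow_future_exog:
        if x_future_columns is None:
            raise ValueError("allow_future_exog=True but X_future_manual was not provided.")
        if list(x_columns) != list(x_future_columns):
            # Full message text is elided in the table above
            raise ValueError("Exogenous columns mismatch...")

try:
    check_arguments(horizons=6, strategy="sliding")
except ValueError as err:
    message = str(err)
    print(message)
```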

📊 Example: Comparing rolling strategies

cfg = SarimaxConfig(order=(2,0,1), seasonal_order=(0,0,0,0))

agg1 = rolling_evaluate(y, X, cfg, delay=1, horizons=6, strategy="expanding")
agg2 = rolling_evaluate(y, X, cfg, delay=1, horizons=6, strategy="sliding", window=80)
agg3 = rolling_evaluate(y, X, cfg, delay=1, horizons=6, strategy="expanding", refit_every=4)

Plot macro averages or per-horizon curves to compare trade-offs between accuracy and runtime.

🧯 Testing

poetry run pytest -q

Comprehensive tests cover:

expanding vs sliding windows
refit cadence (refit_every)
no-peek & future-exog modes
input validation and error cases
optional return-details branch

🗺️ Roadmap (v2)

State reconditioning between refits (partial parameter reuse).
Parallel rolling origins for large datasets.
Custom metric hooks and progress callbacks.

🪞 Project links

📜 License

Project details

Release history

1.0.0 (this version): Oct 10, 2025
0.1.1: Oct 9, 2025
0.1.0: Oct 9, 2025
