A time-series forecasting package with an intuitive API capable of modeling short time series with prior knowledge derived from a similar long time series.
vangja
A Bayesian time series forecasting package that extends Facebook Prophet with hierarchical modeling and transfer learning capabilities. Vangja lets practitioners model short time series using prior knowledge derived from similar long time series, and it is particularly suited to forecast horizons longer than the available data.
The package has been inspired by:
- Facebook Prophet
- Facebook Prophet implementation in PyMC3
- TimeSeers
- Modeling short time series with prior knowledge
- Modeling short time series with prior knowledge - PyMC
Key Features
- 🚀 Vectorized Multi-Series Fitting — Fit multiple time series simultaneously with vectorized computations, significantly faster than fitting sequentially with Facebook Prophet
- 📊 Hierarchical Bayesian Modeling — Model multiple related time series with flexible pooling strategies (complete, partial, or individual) for each component
- 🔄 Bayesian Transfer Learning — Learn from long time series and transfer knowledge to short time series, enabling accurate long-horizon forecasts from limited data
- ↔️ Bidirectional Changepoints — Interpret trend changepoints from right-to-left (in addition to left-to-right), essential for hierarchical modeling of time series with different lengths
- 🎯 Component-Level Flexibility — Independently configure pooling strategies and transfer learning methods for each model component (trend, seasonalities, etc.)
Installation
You need to create a conda PyMC environment before installing vangja. The recommended way of installing PyMC is by running:
conda create -c conda-forge -n pymc_env python=3.13 "pymc>=5.27.1"
Activate the environment and install vangja with pip:
conda activate pymc_env
pip install vangja
Usage
The data used for fitting is expected to be in the same format that Facebook Prophet uses: a pandas DataFrame with the timestamp stored in column ds and the value stored in column y.
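As a minimal sketch of that layout, the snippet below builds a small synthetic daily series. The dates and values are made up for illustration; only the column names ds and y come from the description above.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series: 90 days of synthetic values (illustrative data only).
dates = pd.date_range("2024-01-01", periods=90, freq="D")
values = 100.0 + 0.5 * np.arange(90)

# Prophet-style layout: timestamps in `ds`, observations in `y`.
data = pd.DataFrame({"ds": dates, "y": values})
```

A frame shaped like this is what the fit call in the next example is assumed to receive.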
The API is heavily inspired by TimeSeers. A simple model consisting of a linear trend, a yearly seasonality and a weekly seasonality can be fitted like this:
from vangja import LinearTrend, FourierSeasonality
model = LinearTrend() + FourierSeasonality(365.25, 10) + FourierSeasonality(7, 3)
model.fit(data)
predictions = model.predict(horizon=365)
Vectorized Multi-Series Fitting
Unlike Facebook Prophet, which fits time series one at a time, vangja can fit multiple time series simultaneously using vectorized computations. This is significantly faster when you have many related time series:
import pandas as pd
# Data must have a 'series' column identifying each time series
# Example: sales data from multiple stores
multi_series_data = pd.DataFrame({
'ds': [...], # timestamps
'y': [...], # values
'series': ['store_A', 'store_A', ..., 'store_B', 'store_B', ...]
})
# Fit all series at once with independent parameters (no pooling)
model = LinearTrend(pool_type="individual") + FourierSeasonality(365.25, 10, pool_type="individual")
model.fit(multi_series_data)
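The long-format frame described above can be assembled from per-series frames with plain pandas. This is a sketch only: the helper name to_long_format and the store data are hypothetical, and nothing here calls vangja itself.

```python
import pandas as pd

def to_long_format(frames):
    """Stack per-series frames (columns ds, y) and label each row with its series name."""
    labelled = [df.assign(series=name) for name, df in frames.items()]
    return pd.concat(labelled, ignore_index=True)

# Two made-up stores with different lengths.
store_a = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=3, freq="D"),
    "y": [10.0, 11.0, 12.0],
})
store_b = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=2, freq="D"),
    "y": [5.0, 6.0],
})

multi_series_data = to_long_format({"store_A": store_a, "store_B": store_b})
```

The series do not need to have the same length; each row only needs a label in the series column.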
Multiplicative operators
Vangja supports two multiplicative operators. The first combines components $g(t)$ and $s(t)$ as $y(t) = g(t) \cdot (1 + s(t))$. In vangja, this is written with the ** operator (__pow__):
model = LinearTrend() ** FourierSeasonality(365.25, 10)
The second combines components $g(t)$ and $s(t)$ as $y(t) = g(t) \cdot s(t)$. In vangja, this is written with the * operator (__mul__):
model = LinearTrend() * FourierSeasonality(365.25, 10)
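To make the difference between the two forms concrete, here is a small numeric sketch that evaluates both formulas by hand. The functions linear_trend and yearly_seasonality are hypothetical stand-ins for $g(t)$ and $s(t)$, not vangja components.

```python
import math

def linear_trend(t, slope=0.1, intercept=10.0):
    """Stand-in for g(t): a straight line."""
    return intercept + slope * t

def yearly_seasonality(t, amplitude=0.2, period=365.25):
    """Stand-in for s(t): a single zero-centred sine wave."""
    return amplitude * math.sin(2.0 * math.pi * t / period)

def pow_style(t):
    """y(t) = g(t) * (1 + s(t)), the form written with ** above."""
    return linear_trend(t) * (1.0 + yearly_seasonality(t))

def mul_style(t):
    """y(t) = g(t) * s(t), the form written with * above."""
    return linear_trend(t) * yearly_seasonality(t)
```

With a zero-centred $s(t)$, the first form leaves the trend unchanged wherever $s(t) = 0$, whereas the second form collapses to zero there. The second form is therefore the natural choice when $s(t)$ is a scaling factor centred around 1 rather than around 0.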
Components
Currently, vangja supports the following components:
LinearTrend
A piecewise linear trend with changepoints. Vangja extends Prophet's trend component with bidirectional changepoint interpretation.
LinearTrend(
n_changepoints=25, # Number of potential changepoints
changepoint_range=0.8, # Proportion of data for changepoint placement
slope_mean=0, # Prior mean for initial slope
slope_sd=5, # Prior std for initial slope
intercept_mean=0, # Prior mean for intercept
intercept_sd=5, # Prior std for intercept
delta_mean=0, # Prior mean for changepoint adjustments
delta_sd=0.05, # Prior std for changepoint adjustments
delta_side="left", # "left" or "right" - direction for changepoint interpretation
pool_type="complete", # Pooling: "complete", "partial", or "individual"
delta_pool_type="complete", # Separate pooling strategy for changepoints
tune_method=None # Transfer learning: "parametric" or "prior_from_idata"
)
Bidirectional Changepoints (delta_side)
By default (delta_side="left"), the slope parameter controls the trend slope at the earliest timestamp, and changepoints modify the slope going forward in time.
Setting delta_side="right" reverses this: the slope parameter controls the trend slope at the latest timestamp, and changepoints modify the slope going backward in time. This is essential for hierarchical modeling when you have:
- A long time series spanning many years
- Multiple short time series that only cover the recent period
With delta_side="right", the slope parameter is informed by both the long and short time series (since they overlap at the end), rather than only the long time series (which alone covers the beginning).
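The two interpretations can be sketched with a hinge-function form of the piecewise linear trend. This is an illustration under assumptions, not vangja's implementation: the function name, the argument names, and the use of unscaled time are all hypothetical, and the "right" branch is one straightforward way to mirror the Prophet-style "left" formulation.

```python
import numpy as np

def piecewise_linear_trend(t, slope, intercept, changepoints, deltas, delta_side="left"):
    """Continuous piecewise linear trend with slope changes at the changepoints."""
    t = np.asarray(t, dtype=float)
    s = np.asarray(changepoints, dtype=float)
    d = np.asarray(deltas, dtype=float)
    offsets = t[:, None] - s[None, :]  # shape (len(t), n_changepoints)
    if delta_side == "left":
        # `slope` holds before the first changepoint; deltas act on later times.
        hinge = np.maximum(offsets, 0.0)
    elif delta_side == "right":
        # `slope` holds after the last changepoint; deltas act on earlier times.
        hinge = np.minimum(offsets, 0.0)
    else:
        raise ValueError(f"unknown delta_side: {delta_side!r}")
    return intercept + slope * t + hinge @ d

t = np.arange(5.0)
left = piecewise_linear_trend(t, 1.0, 0.0, [2.0], [0.5], delta_side="left")
right = piecewise_linear_trend(t, 1.0, 0.0, [2.0], [0.5], delta_side="right")
```

In this sketch, left has slope 1.0 at the start and 1.5 after the changepoint, while right has slope 1.0 at the end and 1.5 before the changepoint. In both cases the slope parameter describes the end of the series that the name suggests.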
FourierSeasonality
Seasonal patterns modeled using Fourier series.
FourierSeasonality(
period, # Period in days (e.g., 365.25 for yearly)
series_order, # Number of Fourier terms (higher = more flexible)
beta_mean=0, # Prior mean for Fourier coefficients
beta_sd=10, # Prior std for Fourier coefficients
pool_type="complete", # Pooling: "complete", "partial", or "individual"
tune_method=None # Transfer learning: "parametric" or "prior_from_idata"
)
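For intuition about what period and series_order control, the sketch below builds a Prophet-style Fourier feature matrix with NumPy. The function names and the column ordering (all cosines, then all sines) are assumptions made for illustration; vangja's internal layout may differ.

```python
import numpy as np

def fourier_features(t, period, series_order):
    """Return a (len(t), 2 * series_order) matrix of cosine and sine terms."""
    t = np.asarray(t, dtype=float)
    n = np.arange(1, series_order + 1)
    angles = 2.0 * np.pi * t[:, None] * n[None, :] / period
    return np.concatenate([np.cos(angles), np.sin(angles)], axis=1)

def seasonality(t, period, beta):
    """Seasonal effect: Fourier features weighted by the coefficients beta."""
    beta = np.asarray(beta, dtype=float)
    return fourier_features(t, period, beta.size // 2) @ beta

# Weekly seasonality with 3 Fourier terms, evaluated over two weeks of days.
X = fourier_features(np.arange(14), period=7, series_order=3)
```

Each additional Fourier term adds two columns, and therefore two coefficients, which is why a higher series_order gives a more flexible seasonal shape.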
NormalConstant
A constant term with a Normal prior, useful for baseline offsets.
NormalConstant(
mu=0, # Prior mean
sigma=1, # Prior standard deviation
pool_type="complete", # Pooling: "complete", "partial", or "individual"
tune_method=None # Transfer learning: "parametric" or "prior_from_idata"
)
UniformConstant
A constant term with a Uniform prior.
UniformConstant(
lower=0, # Lower bound
upper=1, # Upper bound
pool_type="complete", # Pooling: "complete", "partial", or "individual"
tune_method=None # Transfer learning: "parametric" or "prior_from_idata"
)
FlatTrend
A constant-level baseline (intercept only, no slope, no changepoints). Useful when the time series has no discernible trend or is too short to estimate one reliably.
FlatTrend(
intercept_mean=0, # Prior mean for intercept
intercept_sd=5, # Prior std for intercept
pool_type="complete", # Pooling: "complete", "partial", or "individual"
tune_method=None # Transfer learning: "parametric" or "prior_from_idata"
)
BetaConstant
A constant term with a scaled Beta prior, bounded to the interval [lower, upper].
BetaConstant(
lower=0, # Lower bound for scaling
upper=1, # Upper bound for scaling
alpha=2, # Beta distribution alpha parameter
beta=2, # Beta distribution beta parameter
pool_type="complete", # Pooling: "complete", "partial", or "individual"
tune_method=None # Transfer learning: "parametric" or "prior_from_idata"
)
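A scaled Beta prior can be read as a Beta draw on [0, 1] stretched to [lower, upper]. The sketch below shows that reading with the standard library only; the helper name sample_scaled_beta is hypothetical, and the exact parameterization inside vangja is an assumption.

```python
import random

def sample_scaled_beta(lower, upper, alpha, beta, rng):
    """Draw from Beta(alpha, beta) and rescale the draw from [0, 1] to [lower, upper]."""
    return lower + (upper - lower) * rng.betavariate(alpha, beta)

rng = random.Random(0)
draws = [sample_scaled_beta(0.5, 1.5, 2.0, 2.0, rng) for _ in range(1000)]
```

With alpha = beta = 2, the draws are symmetric around the midpoint of the interval and never leave it.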
Pooling Types (Hierarchical Modeling)
When modeling multiple time series together, you can control how parameters are shared using hierarchical Bayesian modeling. This is inspired by TimeSeers but offers greater flexibility: vangja allows different pooling strategies for each component, and even for different parameters within the same component.
- "complete": All series share the same parameters. Best when series are very similar.
- "partial": Hierarchical pooling with shared hyperpriors. Parameters are drawn from a common distribution but can differ between series, which balances information sharing with individual variation.
- "individual": Each series has completely independent parameters. Equivalent to fitting each series separately, but vectorized for speed.
Note: The pandas dataframe must have a series column that identifies which rows belong to which time series.
# Example: Hierarchical model with different pooling per component
# - Trend slope: partial pooling (similar but not identical across series)
# - Changepoints: complete pooling (shared across all series)
# - Yearly seasonality: complete pooling (same seasonal pattern)
# - Weekly seasonality: partial pooling (similar weekly patterns)
model = (
LinearTrend(pool_type="partial", delta_pool_type="complete")
+ FourierSeasonality(365.25, 10, pool_type="complete")
+ FourierSeasonality(7, 3, pool_type="partial")
)
model.fit(multi_series_data)
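The three pooling types can be illustrated by how a single parameter, say the trend slope, would be drawn for each series. This is a generative sketch with made-up prior scales and a hypothetical function name. It shows the structure of the priors only, not how vangja builds its PyMC model.

```python
import numpy as np

def draw_series_slopes(pool_type, n_series, rng):
    """Draw one slope per series under the given pooling strategy."""
    if pool_type == "complete":
        # One shared draw, reused by every series.
        return np.full(n_series, rng.normal(0.0, 5.0))
    if pool_type == "partial":
        # Shared hyperparameters, then one draw per series around them.
        mu = rng.normal(0.0, 5.0)
        sigma = abs(rng.normal(0.0, 1.0))  # half-normal scale
        return rng.normal(mu, sigma, size=n_series)
    if pool_type == "individual":
        # Independent draws from the same fixed prior, nothing shared.
        return rng.normal(0.0, 5.0, size=n_series)
    raise ValueError(f"unknown pool_type: {pool_type!r}")

rng = np.random.default_rng(0)
complete = draw_series_slopes("complete", 4, rng)
partial = draw_series_slopes("partial", 4, rng)
individual = draw_series_slopes("individual", 4, rng)
```

Under partial pooling, the per-series values cluster around a common mean whose location and spread are themselves learned, which is what lets short series borrow strength from the others.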
Why Different Pooling Strategies?
Unlike TimeSeers, which applies the same pooling to all components, vangja lets you choose a strategy per component based on domain knowledge:
- Yearly seasonality: Often similar across related series → use "complete"
- Weekly seasonality: May vary by series (e.g., different stores have different weekly patterns) → use "partial"
- Trend slope: Usually similar for related series → use "partial"
- Changepoints: When dealing with a long context series and short target series, changepoints are only observable in the long series → use "complete"
Model Tuning (Bayesian Transfer Learning)
A core feature of vangja is the ability to transfer knowledge from a long time series to multiple short time series. This is particularly useful when:
- You have only a few months of data but need to model yearly seasonality
- You want to forecast a horizon longer than your available short time series
- You have a "context" time series (e.g., market index) and want to use it to inform forecasts for related series (e.g., individual stocks)
Forecasting short time series is challenging because:
- Long-period seasonalities (e.g., yearly) cannot be estimated from short data
- Overfitting is likely when the forecast horizon exceeds the data length
- Standard methods like Facebook Prophet tend to produce unreliable forecasts in this setting
Vangja implements Bayesian transfer learning: fit a model on a long time series, extract the posterior distributions of parameters, and use them as informed priors when fitting short time series.
Transfer Learning Methods
There are two tuning methods available:
1. Parametric Transfer ("parametric")
Uses the posterior mean (or the mode, or any other summary value you choose) and the posterior standard deviation from the fitted model to set new priors, while keeping the same distribution form:
# Step 1: Fit on long time series
base_model = (
LinearTrend(tune_method="parametric")
+ FourierSeasonality(365.25, 10, tune_method="parametric")
)
base_model.fit(long_time_series, method="nuts", samples=1000, chains=4)
# Step 2: Transfer to short time series
# The posterior from step 1 becomes the prior for step 2
target_model = (
LinearTrend(tune_method="parametric")
+ FourierSeasonality(365.25, 10, tune_method="parametric")
)
target_model.fit(short_time_series, idata=base_model.trace)
# Step 3: Forecast with confidence
predictions = target_model.predict(horizon=365) # Can forecast beyond the short series length!
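The idea behind the parametric method can be sketched without any modeling library: summarize posterior samples of a parameter by their mean and standard deviation, then reuse those two numbers as the parameters of a new Normal prior. The helper name and the simulated posterior below are hypothetical; vangja performs this step internally when tune_method="parametric" is set.

```python
import random
import statistics

def normal_prior_from_posterior(samples):
    """Summarize posterior samples as (mean, standard deviation) for a new Normal prior."""
    return statistics.fmean(samples), statistics.stdev(samples)

# Simulated posterior samples of a trend slope from a long-series fit.
rng = random.Random(42)
posterior_slope = [rng.gauss(0.8, 0.1) for _ in range(4000)]

prior_mu, prior_sd = normal_prior_from_posterior(posterior_slope)
```

Because each parameter is summarized separately, any correlation between parameters in the posterior is lost in this approach.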
2. Prior from InferenceData ("prior_from_idata")
Uses the full posterior samples via multivariate normal approximation, preserving correlations between parameters:
base_model = (
LinearTrend(tune_method="prior_from_idata")
+ FourierSeasonality(365.25, 10, tune_method="prior_from_idata")
)
base_model.fit(long_time_series, method="nuts", samples=1000, chains=4)
target_model = (
LinearTrend(tune_method="prior_from_idata")
+ FourierSeasonality(365.25, 10, tune_method="prior_from_idata")
)
target_model.fit(short_time_series, idata=base_model.trace)
This method captures parameter dependencies (e.g., correlation between trend slope and seasonality amplitude) that the parametric method ignores.
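A multivariate normal approximation of this kind can be sketched with NumPy: stack the posterior samples into a matrix, then estimate a mean vector and a full covariance matrix. The function name and the simulated two-parameter posterior are assumptions for illustration, not vangja's code.

```python
import numpy as np

def mvn_approximation(posterior_samples):
    """Fit a multivariate normal to samples of shape (n_samples, n_params)."""
    samples = np.asarray(posterior_samples, dtype=float)
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)  # full covariance, keeps correlations
    return mean, cov

# Simulated posterior of two correlated parameters.
rng = np.random.default_rng(0)
true_cov = [[1.0, 0.8], [0.8, 1.0]]
posterior = rng.multivariate_normal([0.5, -0.2], true_cov, size=5000)

mean, cov = mvn_approximation(posterior)

# A per-parameter summary keeps only the diagonal, discarding the correlation.
independent_cov = np.diag(np.diag(cov))
```

The off-diagonal entries of cov are exactly the information that a per-parameter mean and standard deviation cannot carry.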
Combining Hierarchical Modeling with Transfer Learning
Vangja uniquely allows you to combine both approaches:
# Step 1: Fit base model on long "context" time series
base_model = (
LinearTrend(tune_method="parametric")
+ FourierSeasonality(365.25, 10, tune_method="parametric")
)
base_model.fit(long_context_series, method="nuts", samples=1000, chains=4)
# Step 2: Combine short target time series
target_data = pd.concat([
short_series_1.assign(series='target_1'),
short_series_2.assign(series='target_2'),
])
# Step 3: Hierarchical model with transfer learning on targets
target_model = (
LinearTrend(
pool_type="partial", # Hierarchical pooling
delta_side="right", # Slope parameter informed by all series
tune_method="parametric" # Transfer from context to targets
)
+ FourierSeasonality(365.25, 10, pool_type="complete", tune_method="parametric")
)
# Fit targets using posterior from context as priors
target_model.fit(target_data, idata=base_model.trace)
predictions = target_model.predict(horizon=365)
Regularization for Transfer Learning
To prevent overfitting when transferring knowledge, vangja supports regularization via the loss_factor_for_tune parameter. This adds a penalty term that constrains parameters to stay close to the values learned from the long time series:
FourierSeasonality(
365.25, 10,
tune_method="parametric",
loss_factor_for_tune=1.0 # Higher = stronger regularization toward context series
)
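The exact form of the penalty is not spelled out here. As one plausible illustration only, a quadratic penalty scaled by the loss factor would look like the sketch below. The function name and the quadratic form are assumptions, not a description of what vangja computes.

```python
def quadratic_penalty(params, context_params, loss_factor):
    """Hypothetical penalty: loss_factor times the squared distance to the context values."""
    return loss_factor * sum((p - c) ** 2 for p, c in zip(params, context_params))

# Coefficients learned from the long series versus candidate values for a short series.
context_beta = [1.0, 2.0]
no_drift = quadratic_penalty([1.0, 2.0], context_beta, loss_factor=1.0)
some_drift = quadratic_penalty([2.0, 0.0], context_beta, loss_factor=0.5)
```

Whatever the exact form, the qualitative behavior described above is the same: the penalty is zero when the parameters match the context values, and it grows with both the distance and the factor.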
Plotting
After fitting, you can visualize the model components:
# Make predictions
predictions = model.predict(horizon=365)
# Plot the overall model fit and forecast
model.plot(predictions)
# Plot with actuals overlaid
model.plot(predictions, y_true=test_data)
Metrics
Evaluate forecast accuracy using built-in metrics:
from vangja.utils import metrics
# Compare actual vs predicted values
results = metrics(test_data, predictions, pool_type="complete")
# Returns MSE, RMSE, MAE, MAPE per series
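For reference, the four metrics named above follow their standard definitions, sketched here in plain Python. The function name forecast_metrics is hypothetical and independent of vangja.utils.metrics. MAPE is returned as a fraction rather than a percentage, and which scaling vangja uses is an assumption.

```python
import math

def forecast_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and MAPE for two equal-length sequences."""
    errors = [p - a for a, p in zip(y_true, y_pred)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when an actual value is zero.
    mape = sum(abs(e) / abs(a) for a, e in zip(y_true, errors)) / n
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "mape": mape}

scores = forecast_metrics([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
```

RMSE and MAE are in the units of the series, while MAPE is scale-free, which makes it easier to compare across series of different magnitude.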
Vangja vs Facebook Prophet vs TimeSeers
| Feature | Facebook Prophet | TimeSeers | Vangja |
|---|---|---|---|
| Single time series | ✅ | ✅ | ✅ |
| Vectorized multi-series | ❌ | ✅ | ✅ |
| Hierarchical Bayesian | ❌ | ✅ | ✅ |
| Per-component pooling | ❌ | ❌ | ✅ |
| Bidirectional changepoints | ❌ | ❌ | ✅ |
| Transfer learning | ❌ | ❌ | ✅ |
| Parametric prior transfer | ❌ | ❌ | ✅ |
| Multivariate Gaussian prior | ❌ | ❌ | ✅ |
| Regularization for transfer | ❌ | ❌ | ✅ |
| Modern PyMC (5.x) | ❌ | ❌ | ✅ |
Inference Methods
Vangja supports multiple inference methods:
# MAP estimation (fast, recommended for quick results)
model.fit(data, method="mapx") # Uses JAX backend via pymc-extras
# Full Bayesian inference with MCMC
model.fit(data, method="nuts", samples=1000, chains=4)
# Variational inference
model.fit(data, method="advi", samples=1000)
Contributing
Pull requests and suggestions are always welcome. Please open an issue before submitting a pull request to avoid unnecessary work.