Residual-Based Fully Modified Vector Autoregression for mixtures of I(0), I(1), and I(2) processes
RBFM-VAR: Residual-Based Fully Modified Vector Autoregression
A Python package for estimating and testing Vector Autoregression (VAR) models with unknown mixtures of I(0), I(1), and I(2) components.
Overview
This package implements the Residual-Based Fully Modified Vector Autoregression (RBFM-VAR) estimator proposed by:
Chang, Y. (2000). "Vector Autoregressions with Unknown Mixtures of I(0), I(1), and I(2) Components." Econometric Theory, 16(6), 905-926.
Key Features
- ✅ No Pretesting Required: No need to determine the exact order of integration or cointegration relationships beforehand
- ✅ Mixed Integration Orders: Handles I(0), I(1), and I(2) processes simultaneously
- ✅ Flexible Cointegration: Allows for various cointegration forms including multicointegration
- ✅ Optimal Inference: Provides optimal inference in the sense of Phillips (1991)
- ✅ Modified Wald Tests: Implements modified Wald tests with better finite-sample size properties than the standard Wald test
- ✅ Granger Causality: Direct testing of Granger causality in nonstationary systems
Why RBFM-VAR?
Traditional VAR estimation methods require:
- Pretesting for unit roots
- Determining cointegration ranks
- Specifying error correction models
RBFM-VAR eliminates these steps while maintaining optimal asymptotic properties!
Installation
From Source
git clone https://github.com/merwanroudane/RBFMVAR.git
cd RBFMVAR
pip install -e .
Requirements
- Python >= 3.7
- NumPy >= 1.20.0
- SciPy >= 1.7.0
- Pandas >= 1.3.0
- Matplotlib >= 3.3.0 (optional, for plotting)
Quick Start
import numpy as np
from rbfmvar import RBFMVAREstimator, RBFMWaldTest, format_summary_table, format_test_results
# Load your data (T x n matrix)
data = np.loadtxt('your_data.csv', delimiter=',')
# Fit RBFM-VAR model with lag order p=2
model = RBFMVAREstimator(data, p=2, kernel='bartlett')
model.fit()
# View model summary
summary = model.summary()
print(format_summary_table(summary))
# Test Granger causality: Does variable 0 cause variables 1 and 2?
test = RBFMWaldTest(model)
result = test.test_granger_causality(
    causing_vars=[0],
    caused_vars=[1, 2],
    alpha=0.05
)
print(format_test_results(result))
# Generate forecasts
forecasts = model.predict(steps=10)
print(f"10-step ahead forecasts:\n{forecasts}")
Detailed Examples
Example 1: Basic VAR Estimation
import numpy as np
from rbfmvar import RBFMVAREstimator
# Simulate I(1) VAR data
np.random.seed(42)
T = 200
n = 3
errors = np.random.normal(0, 1, (T, n))
data = np.cumsum(errors, axis=0) # I(1) process
# Estimate RBFM-VAR
model = RBFMVAREstimator(data, p=2)
model.fit()
# Check coefficients
print("Phi (stationary component):")
print(model.Phi_plus)
print("\nA (nonstationary component):")
print(model.A_plus)
Example 2: Granger Causality Testing
from rbfmvar import RBFMVAREstimator, RBFMWaldTest
# Fit model
model = RBFMVAREstimator(data, p=2)
model.fit()
# Create test object
test = RBFMWaldTest(model)
# Test if variable 0 Granger-causes variable 1
result = test.test_granger_causality(
    causing_vars=[0],
    caused_vars=[1]
)
if result['reject']:
    print(f"Variable 0 Granger-causes variable 1 (p={result['p_value']:.4f})")
else:
    print(f"No Granger causality detected (p={result['p_value']:.4f})")
Example 3: Model Selection and Diagnostics
from rbfmvar import select_lag_order, portmanteau_test, arch_test
# Select optimal lag order
optimal_p = select_lag_order(data, max_lag=10, criterion='bic')
print(f"Optimal lag order: {optimal_p}")
# Fit model with optimal lag
model = RBFMVAREstimator(data, p=optimal_p)
model.fit()
# Check residual autocorrelation
Q_stat, p_value = portmanteau_test(model.residuals, lags=10)
print(f"Portmanteau test: Q={Q_stat:.2f}, p-value={p_value:.4f}")
# Check for ARCH effects
arch_stat, arch_p = arch_test(model.residuals, lags=4)
print(f"ARCH test: LM={arch_stat:.2f}, p-value={arch_p:.4f}")
Example 4: Forecasting
# Fit model
model = RBFMVAREstimator(data, p=2)
model.fit()
# Generate multi-step forecasts
forecast_horizon = 20
forecasts = model.predict(steps=forecast_horizon)
# Plot forecasts (requires matplotlib)
import matplotlib.pyplot as plt
fig, axes = plt.subplots(3, 1, figsize=(12, 8))
for i in range(3):
    axes[i].plot(data[-50:, i], label='Actual', color='blue')
    axes[i].plot(range(len(data), len(data) + forecast_horizon),
                 forecasts[:, i], label='Forecast', color='red', linestyle='--')
    axes[i].set_title(f'Variable {i+1}')
    axes[i].legend()
    axes[i].grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Methodology
The Model
Consider a p-th order VAR:
$$y_t = A_1 y_{t-1} + \cdots + A_p y_{t-p} + \varepsilon_t$$
where $y_t$ is an n-dimensional vector that may contain a mixture of I(0), I(1), and I(2) processes.
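For intuition, such a mixture can be simulated by integrating white noise zero, one, and two times; a minimal sketch (variable names are illustrative, not part of the package API):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 300
e = rng.standard_normal((T, 3))

y_i0 = e[:, 0]                        # I(0): stationary white noise
y_i1 = np.cumsum(e[:, 1])             # I(1): single unit root (random walk)
y_i2 = np.cumsum(np.cumsum(e[:, 2]))  # I(2): doubly integrated noise

data = np.column_stack([y_i0, y_i1, y_i2])  # T x n mixed-order system
```

Differencing recovers the underlying noise: first differences of the I(1) column and second differences of the I(2) column are white noise again.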
RBFM-VAR Estimation
The RBFM-VAR estimator reformulates the model as:
$$y_t = \Phi z_t + A w_t + \varepsilon_t$$
where:
- $z_t = (\Delta^2 y_{t-1}, \ldots, \Delta^2 y_{t-p+2})'$ are known stationary regressors
- $w_t = (\Delta y_{t-1}, y_{t-1})'$ are potentially nonstationary regressors
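The split above can be sketched directly with NumPy; `build_regressors` below is an illustrative helper (not the package's internal routine), stacking the stationary block $z_t$ and the potentially nonstationary block $w_t$ for each usable observation:

```python
import numpy as np

def build_regressors(y, p):
    """Illustrative z_t / w_t split for a VAR(p), following the
    reformulation above. The package's internal layout may differ."""
    T, n = y.shape
    d1 = np.diff(y, axis=0)       # d1[k] = Δy_{k+1}
    d2 = np.diff(y, n=2, axis=0)  # d2[k] = Δ²y_{k+2}
    Z, W = [], []
    for t in range(p, T):  # first usable observation: all lags exist
        # z_t = (Δ²y_{t-1}, ..., Δ²y_{t-p+2})'  -- empty when p = 2
        if p > 2:
            z = np.concatenate([d2[t - j - 2] for j in range(1, p - 1)])
        else:
            z = np.empty(0)
        # w_t = (Δy_{t-1}, y_{t-1})'
        w = np.concatenate([d1[t - 2], y[t - 1]])
        Z.append(z)
        W.append(w)
    return np.asarray(Z), np.asarray(W)
```

For a VAR(p) in n variables this yields T − p rows with n(p − 2) stationary regressors and 2n potentially nonstationary ones.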
The estimator applies corrections for:
- Endogeneity between errors and regressors
- Serial correlation induced by differencing
Asymptotic Properties
Theorem 1 (Chang 2000):
- Stationary component: $\sqrt{T}(\hat{\Phi}^+ - \Phi) \rightarrow_d N(0, \Sigma_{\varepsilon\varepsilon} \otimes \Sigma_{x_1 x_1}^{-1})$
- Nonstationary component: Has mixed normal limit distribution
Theorem 2 (Modified Wald Test):
For certain linear restrictions, the modified Wald statistic converges to:
$$W_F^+ \rightarrow_d \chi^2_{q_1(q_\Phi + q_{A_1})} + \sum_{i=1}^{q_1} d_i \chi^2_{q_{A_b}(i)}$$
where $0 \leq d_i \leq 1$ are eigenvalues depending on long-run covariances.
Key advantage: The limit distribution is bounded above by $\chi^2$ with known degrees of freedom, enabling conservative tests without nuisance parameter dependence!
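Since each $d_i \leq 1$, a conservative test simply compares $W_F^+$ to the $\chi^2$ critical value at the upper-bound degrees of freedom. A minimal sketch, assuming the bound has already been computed (the function name and signature are illustrative):

```python
from scipy.stats import chi2

def conservative_wald_test(W_plus, dof_bound, alpha=0.05):
    # Conservative decision rule: reject only if the modified Wald
    # statistic exceeds the chi-square critical value with the
    # upper-bound degrees of freedom
    # dof_bound = q1*(q_Phi + q_A1) + sum_i q_Ab(i).
    crit = chi2.ppf(1 - alpha, dof_bound)
    return W_plus > crit, crit
```

Because the true limit distribution is stochastically dominated by this $\chi^2$, rejection at level $\alpha$ under the bound implies rejection at level at most $\alpha$ under the exact distribution.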
API Reference
Main Classes
RBFMVAREstimator
RBFMVAREstimator(data, p, kernel='bartlett', bandwidth=None)
Parameters:
- `data` (np.ndarray): (T x n) data matrix
- `p` (int): VAR lag order
- `kernel` (str): Kernel for long-run covariance estimation. Options: 'bartlett', 'parzen', 'quadratic_spectral', 'tukey_hanning'
- `bandwidth` (int or None): Bandwidth parameter (None for automatic selection)
Methods:
- `fit()`: Estimate the model
- `predict(steps)`: Generate forecasts
- `summary()`: Get model summary statistics
RBFMWaldTest
RBFMWaldTest(estimator)
Methods:
- `test_granger_causality(causing_vars, caused_vars, alpha)`: Test Granger causality
- `test_linear_restriction(R1, R2, r, alpha)`: Test general linear restrictions
- `test_coefficient_restriction(equation_idx, variable_idx, lag, value, alpha)`: Test individual coefficients
Utility Functions
- `select_lag_order(data, max_lag, criterion)`: Select optimal VAR lag order
- `portmanteau_test(residuals, lags)`: Test for residual autocorrelation
- `arch_test(residuals, lags)`: Test for ARCH effects
- `stability_check(Phi, A, p)`: Check VAR stability
- `plot_residual_diagnostics(residuals)`: Create diagnostic plots
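Stability checks for a VAR typically inspect the eigenvalues of the companion matrix; a generic sketch of the idea (not the package's `stability_check` implementation):

```python
import numpy as np

def is_stable(coef_matrices):
    """coef_matrices: list of (n x n) lag matrices A_1, ..., A_p.
    The VAR is stable (stationary) iff every eigenvalue of the
    companion matrix lies strictly inside the unit circle."""
    p = len(coef_matrices)
    n = coef_matrices[0].shape[0]
    top = np.hstack(coef_matrices)
    if p == 1:
        companion = top
    else:
        bottom = np.hstack([np.eye(n * (p - 1)),
                            np.zeros((n * (p - 1), n))])
        companion = np.vstack([top, bottom])
    return bool(np.all(np.abs(np.linalg.eigvals(companion)) < 1))
```

A unit eigenvalue signals an I(1) component, and in the mixed I(0)/I(1)/I(2) setting of this package the check will generally fail by design; it is still useful for the estimated stationary block.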
Advanced Topics
Custom Kernel Functions
The package supports multiple kernel functions for long-run covariance estimation:
- Bartlett (Newey-West): Triangular kernel, good general-purpose choice
- Parzen: Higher-order kernel with better bias properties
- Quadratic Spectral: Optimal rate of convergence (Andrews 1991)
- Tukey-Hanning: Popular in spectral analysis
# Use Quadratic Spectral kernel with automatic bandwidth
model = RBFMVAREstimator(data, p=2, kernel='quadratic_spectral')
model.fit()
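For reference, the four kernel weight functions can be written down in a few lines each; this is a textbook sketch of the standard formulas, not the package's internal code:

```python
import numpy as np

def bartlett(x):
    # Triangular (Newey-West) kernel
    return np.where(np.abs(x) <= 1, 1 - np.abs(x), 0.0)

def parzen(x):
    a = np.abs(x)
    return np.where(a <= 0.5, 1 - 6 * a**2 + 6 * a**3,
                    np.where(a <= 1, 2 * (1 - a)**3, 0.0))

def quadratic_spectral(x):
    # Andrews (1991); equals 1 at x = 0 by continuity
    z = 6 * np.pi * np.asarray(x, dtype=float) / 5
    out = np.ones_like(z)
    nz = z != 0
    out[nz] = 3 / z[nz]**2 * (np.sin(z[nz]) / z[nz] - np.cos(z[nz]))
    return out

def tukey_hanning(x):
    return np.where(np.abs(x) <= 1, (1 + np.cos(np.pi * x)) / 2, 0.0)
```

All four integrate the weighted sample autocovariances into a long-run covariance estimate; they differ in smoothness and bias/variance trade-offs, as summarized in the list above.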
Bandwidth Selection
The package implements Andrews (1991) automatic bandwidth selection:
from rbfmvar.kernel_estimators import KernelCovarianceEstimator
estimator = KernelCovarianceEstimator(kernel='bartlett')
bandwidth = estimator.select_bandwidth_andrews(residuals)
print(f"Selected bandwidth: {bandwidth}")
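The idea behind the Andrews (1991) plug-in rule can be sketched for a single residual series and the Bartlett kernel: fit an AR(1) to the series, form the data-dependent constant $\alpha(1) = 4\hat\rho^2 / ((1-\hat\rho)^2(1+\hat\rho)^2)$, and scale by $T^{1/3}$. This univariate sketch is illustrative only; the package's multivariate selector weights across equations:

```python
import numpy as np

def andrews_bandwidth_bartlett(u):
    """Univariate AR(1) plug-in bandwidth for the Bartlett kernel,
    S_T = 1.1447 * (alpha(1) * T)^(1/3), following Andrews (1991)."""
    u = np.asarray(u, dtype=float)
    T = len(u)
    rho = (u[1:] @ u[:-1]) / (u[:-1] @ u[:-1])  # AR(1) coefficient
    alpha1 = 4 * rho**2 / ((1 - rho)**2 * (1 + rho)**2)
    return 1.1447 * (alpha1 * T) ** (1 / 3)
```

More persistent residuals (larger $\hat\rho$) yield a larger bandwidth, so more autocovariances enter the long-run variance estimate.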
Simulation Studies
The examples/simulations.py file contains Monte Carlo simulations replicating the results from Chang (2000), Section 5.
Key findings:
- RBFM-VAR has lower bias and variance than OLS-VAR
- Modified Wald test has better size properties than standard Wald test
- Performance improves with sample size as predicted by theory
Testing
Run the test suite:
pytest tests/ -v --cov=rbfmvar
Citation
If you use this package in your research, please cite:
@article{chang2000vector,
  title     = {Vector Autoregressions with Unknown Mixtures of I(0), I(1), and I(2) Components},
  author    = {Chang, Yoosoon},
  journal   = {Econometric Theory},
  volume    = {16},
  number    = {6},
  pages     = {905--926},
  year      = {2000},
  publisher = {Cambridge University Press},
  doi       = {10.1017/S0266466600166046}
}
For the Python implementation:
@software{roudane2024rbfmvar,
  author = {Roudane, Merwan},
  title  = {RBFM-VAR: Python Implementation of Chang (2000)},
  year   = {2024},
  url    = {https://github.com/merwanroudane/RBFMVAR}
}
References
- Chang, Y. (2000). Vector Autoregressions with Unknown Mixtures of I(0), I(1), and I(2) Components. Econometric Theory, 16(6), 905-926.
- Phillips, P.C.B. (1995). Fully Modified Least Squares and Vector Autoregression. Econometrica, 63(5), 1023-1078.
- Phillips, P.C.B. (1991). Optimal Inference in Cointegrated Systems. Econometrica, 59(2), 283-306.
- Johansen, S. (1995). A Statistical Analysis of Cointegration for I(2) Variables. Econometric Theory, 11(1), 25-59.
- Andrews, D.W.K. (1991). Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation. Econometrica, 59(3), 817-858.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contact
Dr. Merwan Roudane
- Email: merwanroudane920@gmail.com
- GitHub: @merwanroudane
Acknowledgments
This implementation is based on the groundbreaking work of Professor Yoosoon Chang (Rice University). The author thanks Professor Chang for developing this elegant methodology.
Disclaimer
This package is provided "as is" without warranty of any kind. Users are responsible for verifying results and ensuring appropriate use for their specific applications.
Note: This is an independent implementation and is not officially affiliated with or endorsed by the original paper's author.
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file rbfmvar-1.0.1.tar.gz.
File metadata
- Download URL: rbfmvar-1.0.1.tar.gz
- Upload date:
- Size: 51.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b06af37ade2e751d890a6be07796039d82cb3f210e7bc684eceb8f6c7d427570 |
| MD5 | 9d1c1f94715ff83d616400309a9202a8 |
| BLAKE2b-256 | f8b160b1c50dc5ee7e32d64f4e7a5f37ad88387c0aaa2685a0dd94893ac1d066 |
File details
Details for the file rbfmvar-1.0.1-py3-none-any.whl.
File metadata
- Download URL: rbfmvar-1.0.1-py3-none-any.whl
- Upload date:
- Size: 32.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 469c01d1be95ff32f26c7bdb2dd7bcff7e0b6be255c1bf00990d2821c499f870 |
| MD5 | dc55642bc7c4b192103176ce665d9655 |
| BLAKE2b-256 | b869f629466b549228618d98f2f78555e918721d203071f610d322ae3e706732 |