PMCT: Python Module for Cointegration Tests with Two Endogenous Structural Breaks

A Python package implementing three residual-based cointegration tests that account for two unknown regime shifts, following the methodology of Hatemi-J (2008).

Overview

Testing for long-run relationships between time series variables while accounting for structural breaks is crucial in econometric analysis. This package implements cointegration tests with two endogenous structural breaks: the timing of each break is estimated from the data by the testing procedure rather than specified in advance.

Key Features

  • Three residual-based tests: Modified ADF, Phillips Za, and Phillips Zt
  • Endogenous break detection: Automatically identifies the timing of two structural breaks
  • Multiple model specifications: Support for level shifts, level shifts with trend, and regime shifts
  • Flexible lag selection: Multiple criteria (AIC, BIC, downward-t, or pre-specified)
  • Easy-to-use API: Simple function calls with comprehensive output
  • Pandas integration: Works seamlessly with pandas DataFrames and Series

Installation

From PyPI

pip install pmct

From Source

git clone https://github.com/merwanroudane/pmct.git
cd pmct
pip install -e .

Requirements

  • Python >= 3.7
  • NumPy >= 1.19.0
  • Pandas >= 1.1.0

Quick Start

import numpy as np
from pmct import cointegration_test_2breaks

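# Simulated example data; replace with your own series
# y: dependent variable (n x 1)
# x: independent variable(s) (n x k)
rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal((200, 1)), axis=0)  # random-walk regressor
y = 1.0 + 2.0 * x + rng.standard_normal((200, 1))     # cointegrated with x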

# Run the cointegration test
results = cointegration_test_2breaks(
    y=y, 
    x=x, 
    model=4,        # Regime shift model (C/S)
    max_lag=2,      # Maximum lag for ADF test
    lag_selection=2 # Use AIC for lag selection
)

# Display results
print(results)

# Access specific results
print(f"ADF statistic: {results.adf_statistic:.4f}")
print(f"First break point: {results.adf_break1:.4f}")
print(f"Second break point: {results.adf_break2:.4f}")

Model Specifications

The package supports three model specifications:

Model 2: Level Shift (C)

y_t = α_0 + α_1·D1_t + α_2·D2_t + β·x_t + u_t

Model 3: Level Shift with Trend (C/T)

y_t = α_0 + α_1·D1_t + α_2·D2_t + γ·t + β·x_t + u_t

Model 4: Regime Shift (C/S)

y_t = α_0 + α_1·D1_t + α_2·D2_t + β_0·x_t + β_1·D1_t·x_t + β_2·D2_t·x_t + u_t

Where:

  • D1_t and D2_t are step dummies for the two structural breaks: each equals 0 up to its break date and 1 afterwards
  • t is a time trend
  • α, β, and γ are parameters to be estimated
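
The three specifications differ only in which columns enter the regression. The sketch below builds those columns for given break positions in plain Python, so the mapping from equation terms to regressors is visible. The names step_dummy and design_rows are hypothetical and not part of pmct, and the convention that a dummy switches on strictly after its break index is an assumption here.

```python
def step_dummy(n_obs, break_index):
    """Return D_t for t = 1..n_obs: 0.0 up to break_index, 1.0 afterwards."""
    return [1.0 if t > break_index else 0.0 for t in range(1, n_obs + 1)]


def design_rows(x_rows, tb1, tb2, model=4):
    """Build one regressor row per observation for Model 2, 3, or 4.

    x_rows is a list of length-k lists; tb1 < tb2 are break indices.
    Hypothetical helper for illustration, not the pmct implementation.
    """
    if model not in (2, 3, 4):
        raise ValueError("model must be 2, 3, or 4")
    n_obs = len(x_rows)
    d1 = step_dummy(n_obs, tb1)
    d2 = step_dummy(n_obs, tb2)
    rows = []
    for i, x_t in enumerate(x_rows):
        row = [1.0, d1[i], d2[i]]                 # intercept, D1_t, D2_t
        if model == 3:
            row.append(float(i + 1))              # time trend t
        row.extend(x_t)                           # x_t
        if model == 4:
            row.extend(d1[i] * v for v in x_t)    # D1_t * x_t
            row.extend(d2[i] * v for v in x_t)    # D2_t * x_t
        rows.append(row)
    return rows


x_rows = [[float(t)] for t in range(1, 11)]       # toy regressor: x_t = t
rows = design_rows(x_rows, tb1=3, tb2=7, model=4)
print(rows[0])   # observation before both breaks
print(rows[9])   # observation after both breaks
```

With one regressor, a Model 4 row has six entries; before the first break the interaction columns are zero, so only α_0 and β_0 are active.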

Usage Examples

Example 1: Basic Usage

import pandas as pd
from pmct import cointegration_test_2breaks

# Load data from CSV
data = pd.read_csv('your_data.csv')
y = data['dependent_var'].values.reshape(-1, 1)
x = data['independent_var'].values.reshape(-1, 1)

# Run test with default settings
results = cointegration_test_2breaks(y, x)
print(results.summary())

Example 2: Multiple Independent Variables

import pandas as pd
from pmct import cointegration_test_2breaks

# Load a dataset with one dependent and several independent variables
data = pd.read_csv('your_data.csv')
y = data[['y']].values
x = data[['x1', 'x2', 'x3']].values

# Run test with BIC lag selection
results = cointegration_test_2breaks(
    y=y,
    x=x,
    model=4,
    max_lag=4,
    lag_selection=3  # BIC
)

# Access detailed results
print(f"ADF test statistic: {results.adf_statistic:.4f}")
print(f"Za test statistic: {results.za_statistic:.4f}")
print(f"Zt test statistic: {results.zt_statistic:.4f}")

Example 3: Using Helper Function for CSV Files

from pmct import load_data_from_csv, cointegration_test_2breaks

# Load data directly from CSV
y, x = load_data_from_csv(
    'data.csv',
    y_col=0,      # First column is dependent variable
    x_cols=[1, 2] # Columns 1 and 2 are independent variables
)

results = cointegration_test_2breaks(y, x, model=4)
print(results)

Example 4: Interpreting Break Points

from pmct import cointegration_test_2breaks
import pandas as pd

# y and x prepared as in Example 1; assuming daily observations starting 2010-01-01
dates = pd.date_range('2010-01-01', periods=len(y), freq='D')

results = cointegration_test_2breaks(y, x)

# Convert break points to actual dates
n_obs = len(y)
break1_idx = int(results.adf_break1 * n_obs)
break2_idx = int(results.adf_break2 * n_obs)

print(f"First structural break: {dates[break1_idx]}")
print(f"Second structural break: {dates[break2_idx]}")
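
Because the break points are reported as fractions of the sample, the conversion to a row index is worth isolating. The helper below is a minimal sketch (break_fraction_to_index is a hypothetical name, not a pmct function) that truncates to an integer and clamps the result so it can never index past the last observation.

```python
from datetime import date, timedelta


def break_fraction_to_index(fraction, n_obs):
    """Map a break fraction in [0, 1] to a valid 0-based row index."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    # Clamp so that a fraction of exactly 1.0 maps to the last row
    return min(int(fraction * n_obs), n_obs - 1)


n_obs = 200
dates = [date(2010, 1, 1) + timedelta(days=i) for i in range(n_obs)]
idx = break_fraction_to_index(0.25, n_obs)
print(idx, dates[idx])
```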

API Reference

Main Function

cointegration_test_2breaks(y, x, model=4, max_lag=2, lag_selection=2, trim=0.15)

Conduct cointegration tests with two endogenous structural breaks.

Parameters:

  • y : array-like, shape (n, 1)
    • Dependent variable
  • x : array-like, shape (n, k)
    • Independent variable(s)
  • model : int, default=4
    • Model specification (2, 3, or 4)
  • max_lag : int, default=2
    • Maximum lag order for ADF test
  • lag_selection : int, default=2
    • Lag selection criterion:
      • 1: Pre-specified (uses max_lag)
      • 2: AIC (Akaike Information Criterion)
      • 3: BIC (Bayesian Information Criterion)
      • 4: Downward-t selection
  • trim : float, default=0.15
    • Trimming fraction for the break point search (0.15 corresponds to 15% of the sample)
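
The documented value ranges can be checked before a potentially slow grid search starts. The following is a sketch of such a pre-flight check; check_arguments is a hypothetical helper, and the upper bound of 0.5 on trim is a loose sanity limit assumed here, not a rule taken from pmct.

```python
def check_arguments(model=4, max_lag=2, lag_selection=2, trim=0.15):
    """Raise ValueError if an argument falls outside its documented range."""
    if model not in (2, 3, 4):
        raise ValueError("model must be 2 (C), 3 (C/T), or 4 (C/S)")
    if not isinstance(max_lag, int) or max_lag < 0:
        raise ValueError("max_lag must be a non-negative integer")
    if lag_selection not in (1, 2, 3, 4):
        raise ValueError("lag_selection must be 1, 2, 3, or 4")
    if not 0.0 < trim < 0.5:  # assumed sanity bound
        raise ValueError("trim must lie strictly between 0 and 0.5")


check_arguments(model=4, max_lag=2, lag_selection=2, trim=0.15)  # passes silently
try:
    check_arguments(model=1)
except ValueError as err:
    print(err)
```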

Returns:

  • CointegrationResults object containing:
    • Test statistics (ADF, Za, Zt)
    • Break points for each test
    • Parameter estimates
    • Standard errors
    • t-statistics

CointegrationResults Class

The results object provides:

Attributes:

  • adf_statistic, za_statistic, zt_statistic: Test statistics
  • adf_break1, adf_break2: Break points (as fraction of sample)
  • za_break1, za_break2: Break points from Za test
  • zt_break1, zt_break2: Break points from Zt test
  • coefficients: Estimated parameters
  • standard_errors: Standard errors of parameters
  • t_statistics: t-statistics for parameters
  • adf_lag: Optimal lag length

Methods:

  • summary(): Returns formatted summary string
  • __str__(): Returns the formatted results string shown by print(results)

Critical Values

To determine statistical significance, compare the test statistics with critical values from Hatemi-J (2008, Table 1, page 501). The critical values depend on:

  1. The number of independent variables (k)
  2. The significance level (1%, 5%, or 10%)
  3. The test being used (ADF, Za, or Zt)

Example Critical Values (k=1)

Test         1%        5%       10%
ADF*     -6.503    -6.015    -5.653
Zt*      -6.503    -6.015    -5.653
Za*     -90.794   -76.003   -52.232

Note: For complete critical value tables, please refer to the original paper.
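
The comparison against critical values can be wrapped in a small lookup. This sketch covers only the ADF* values for k=1 quoted above; the dictionary and function names are hypothetical, and other values of k require the full table from the paper.

```python
# ADF* critical values for k=1, as quoted in the table above
ADF_CRITICAL_VALUES_K1 = {0.01: -6.503, 0.05: -6.015, 0.10: -5.653}


def rejects_no_cointegration(adf_statistic, level=0.05):
    """True if the statistic is below (more negative than) the critical value."""
    return adf_statistic < ADF_CRITICAL_VALUES_K1[level]


def strongest_rejection_level(adf_statistic):
    """Smallest significance level at which the null is rejected, else None."""
    for level in sorted(ADF_CRITICAL_VALUES_K1):
        if rejects_no_cointegration(adf_statistic, level):
            return level
    return None


print(rejects_no_cointegration(-6.2, level=0.05))
print(strongest_rejection_level(-6.2))
```

The tests are left-tailed: the null hypothesis of no cointegration is rejected only when the statistic is smaller than the critical value.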

Methodology

This package implements the methodology developed by Hatemi-J (2008), which extends the cointegration testing framework to account for two structural breaks. The key innovation is the endogenous determination of break points through a grid search procedure that minimizes the test statistics.

Testing Procedure

  1. Grid Search: The algorithm searches over all possible combinations of two break points within the specified trimming range.

  2. For each combination:

    • Construct dummy variables
    • Build the regression matrix according to the model specification
    • Compute the three test statistics (ADF, Za, Zt)

  3. Optimal breaks: The break points that minimize each test statistic are selected as the optimal breaks.

  4. Final estimation: Parameters are estimated using the optimal break points.
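
The procedure above can be sketched as a double loop over admissible break positions. The code below is an illustration, not the pmct implementation: the search region (first break after the trimmed start, second break at least one trimming width later and before the trimmed end) is an assumption, and the statistic is passed in as a callable so a toy function can stand in for ADF, Za, or Zt.

```python
def candidate_break_pairs(n_obs, trim=0.15):
    """Yield (tb1, tb2) index pairs inside the trimmed sample (assumed region)."""
    width = round(trim * n_obs)            # observations trimmed at each end
    last = round((1.0 - trim) * n_obs)     # latest admissible break index
    for tb1 in range(width, last - width + 1):
        for tb2 in range(tb1 + width, last + 1):
            yield tb1, tb2


def grid_search(n_obs, statistic, trim=0.15):
    """Return (minimum value, tb1, tb2) over all candidate break pairs."""
    best = None
    for tb1, tb2 in candidate_break_pairs(n_obs, trim):
        value = statistic(tb1, tb2)
        if best is None or value < best[0]:
            best = (value, tb1, tb2)
    return best


def toy_statistic(tb1, tb2):
    """Stand-in for a test statistic; smallest at breaks (5, 12)."""
    return (tb1 - 5) ** 2 + (tb2 - 12) ** 2 - 7.0


n_pairs = sum(1 for _ in candidate_break_pairs(20))
print(n_pairs, grid_search(20, toy_statistic))
```

The number of pairs grows roughly with the square of the sample size, and each pair requires a full regression, which is why the search dominates the run time for long series.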

Test Statistics

Modified ADF Test:

ADF* = inf_{(τ1,τ2)∈T} ADF(τ1, τ2)

Modified Phillips Tests:

Za* = inf_{(τ1,τ2)∈T} Za(τ1, τ2)
Zt* = inf_{(τ1,τ2)∈T} Zt(τ1, τ2)

Where τ1 and τ2 are the relative timing of the breaks (as fractions of the sample), and T is the set of admissible break-date pairs left after trimming.
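
Each ADF(τ1, τ2) is a unit-root t-statistic computed on the residuals u_t of the break regression. As a simplified illustration, the sketch below covers only the case without lagged differences, regressing Δu_t on u_{t-1} with no constant; df_t_statistic is a hypothetical name and the package's own computation may differ in its details.

```python
import math
import random


def df_t_statistic(u):
    """t-statistic on rho in  Δu_t = rho * u_{t-1} + e_t  (lag 0, no constant)."""
    lagged = u[:-1]
    diffs = [curr - prev for prev, curr in zip(u[:-1], u[1:])]
    sxx = sum(v * v for v in lagged)
    rho = sum(v * d for v, d in zip(lagged, diffs)) / sxx    # OLS slope
    ssr = sum((d - rho * v) ** 2 for v, d in zip(lagged, diffs))
    sigma2 = ssr / (len(diffs) - 1)                          # residual variance
    return rho / math.sqrt(sigma2 / sxx)


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]   # stationary residuals
walk, level = [], 0.0
for shock in noise:                                 # unit-root residuals
    level += shock
    walk.append(level)

stat_noise = df_t_statistic(noise)
stat_walk = df_t_statistic(walk)
print(f"stationary: {stat_noise:.2f}, random walk: {stat_walk:.2f}")
```

Stationary residuals produce a strongly negative statistic, while random-walk residuals do not, which is why the infimum over break pairs is compared with a left-tail critical value.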

Practical Considerations

Sample Size

The tests require sufficient observations to reliably detect structural breaks. A minimum of 100 observations is recommended, though more observations improve power.

Trimming

The trim parameter (default 0.15) ensures breaks are not searched for too close to the sample endpoints. This is important for:

  • Maintaining adequate observations in each regime
  • Ensuring reliable parameter estimation
  • Avoiding spurious break detection

Lag Selection

Proper lag selection is crucial for the ADF test:

  • AIC (lag_selection=2): Tends to select more lags, better for capturing dynamics
  • BIC (lag_selection=3): More parsimonious, penalizes additional lags more heavily
  • Downward-t (lag_selection=4): Sequential testing approach
  • Pre-specified (lag_selection=1): When you have prior knowledge about the appropriate lag length
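
The difference between the two information criteria is easiest to see numerically. The sketch below uses the common Gaussian forms AIC = n·ln(SSR/n) + 2p and BIC = n·ln(SSR/n) + p·ln(n); whether pmct uses exactly these formulas is an assumption, the SSR values are made up for illustration, and the sample size is held fixed across lags.

```python
import math


def aic(ssr, n_obs, n_params):
    """Akaike criterion for a Gaussian regression."""
    return n_obs * math.log(ssr / n_obs) + 2 * n_params


def bic(ssr, n_obs, n_params):
    """Bayesian (Schwarz) criterion for a Gaussian regression."""
    return n_obs * math.log(ssr / n_obs) + n_params * math.log(n_obs)


def select_lag(ssr_by_lag, n_obs, criterion):
    """Pick the lag with the smallest criterion value.

    A regression with p lagged differences has p + 1 parameters
    (the coefficient on u_{t-1} plus p lag coefficients).
    """
    return min(ssr_by_lag, key=lambda p: criterion(ssr_by_lag[p], n_obs, p + 1))


# Illustrative sums of squared residuals for lags 0..3 (made-up numbers)
ssr_by_lag = {0: 120.0, 1: 100.0, 2: 97.0, 3: 96.5}
lag_aic = select_lag(ssr_by_lag, n_obs=100, criterion=aic)
lag_bic = select_lag(ssr_by_lag, n_obs=100, criterion=bic)
print(lag_aic, lag_bic)
```

Here the small fit improvement from the second lag is enough to offset the AIC penalty of 2 per parameter but not the BIC penalty of ln(100) ≈ 4.6, so AIC selects the longer lag.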

Model Selection

  • Model 2 (C): When you expect shifts in the intercept only
  • Model 3 (C/T): When there's a deterministic trend in addition to level shifts
  • Model 4 (C/S): When you expect the relationship between variables to change (regime shifts)

Model 4 is the most flexible and is typically recommended as the default choice.
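
That flexibility has a cost in estimated parameters, which follows directly from the model equations above. The short sketch below counts the coefficients for k regressors (n_regression_params is a hypothetical helper, not part of pmct).

```python
def n_regression_params(model, k):
    """Number of coefficients in the break regression with k regressors."""
    if model == 2:
        return 3 + k          # α_0, α_1, α_2 and β
    if model == 3:
        return 4 + k          # Model 2 plus the trend coefficient γ
    if model == 4:
        return 3 + 3 * k      # α_0, α_1, α_2 and β_0, β_1, β_2
    raise ValueError("model must be 2, 3, or 4")


for k in (1, 3):
    print(k, [n_regression_params(m, k) for m in (2, 3, 4)])
```

With three regressors, Model 4 estimates twelve coefficients against six for Model 2, so it benefits most from longer samples.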

Real-World Application Example

Financial Market Integration Study

import pandas as pd
import numpy as np
from pmct import cointegration_test_2breaks

# Load financial data
data = pd.read_csv('financial_data.csv', parse_dates=['Date'])
data.set_index('Date', inplace=True)

# Convert to log prices
gold_log = np.log(data['Gold_Price'].values).reshape(-1, 1)
stock_log = np.log(data['World_Stock_Index'].values).reshape(-1, 1)

# Test for cointegration with structural breaks
results = cointegration_test_2breaks(
    y=gold_log,
    x=stock_log,
    model=4,  # Regime shift model
    max_lag=4,
    lag_selection=2  # AIC
)

# Display results
print(results)

# Interpret breaks
n = len(gold_log)
break1_idx = int(results.adf_break1 * n)
break2_idx = int(results.adf_break2 * n)

print(f"\nFirst structural break: {data.index[break1_idx]}")
print(f"Second structural break: {data.index[break2_idx]}")

# Compare with critical values
if results.adf_statistic < -6.015:  # 5% critical value for k=1
    print("\nReject null hypothesis: Evidence of cointegration with structural breaks")
else:
    print("\nFail to reject null hypothesis: No evidence of cointegration")

Testing

Run the test suite:

pytest tests/

Run with coverage:

pytest --cov=pmct tests/

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Development Setup

git clone https://github.com/merwanroudane/pmct.git
cd pmct
pip install -e ".[dev]"

Citation

If you use this package in your research, please cite:

Software:

@software{pmct2024,
  author = {Roudane, Merwan},
  title = {PMCT: Python Module for Cointegration Tests with Two Endogenous Structural Breaks},
  year = {2024},
  url = {https://github.com/merwanroudane/pmct},
  version = {1.0.0}
}

Methodology:

@article{hatemi2008tests,
  title={Tests for cointegration with two unknown regime shifts with an application to financial market integration},
  author={Hatemi-J, Abdulnasser},
  journal={Empirical Economics},
  volume={35},
  number={3},
  pages={497--505},
  year={2008},
  publisher={Springer},
  doi={10.1007/s00181-007-0175-9}
}

License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.

Acknowledgments

  • Prof. Abdulnasser Hatemi-J for developing the original methodology
  • Dr. Alan Mustafa for the initial Python/GAUSS implementation
  • Original paper: Hatemi-J, A. (2008). Tests for cointegration with two unknown regime shifts with an application to financial market integration. Empirical Economics, 35(3), 497-505.

References

  1. Hatemi-J, A. (2008). Tests for cointegration with two unknown regime shifts with an application to financial market integration. Empirical Economics, 35(3), 497-505. https://doi.org/10.1007/s00181-007-0175-9

  2. Gregory, A. W., & Hansen, B. E. (1996). Residual-based tests for cointegration in models with regime shifts. Journal of Econometrics, 70(1), 99-126.

  3. Phillips, P. C., & Ouliaris, S. (1990). Asymptotic properties of residual based tests for cointegration. Econometrica, 58(1), 165-193.

  4. Dickey, D. A., & Fuller, W. A. (1979). Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association, 74(366a), 427-431.

Contact

Dr. Merwan Roudane

Support

If you encounter any problems or have questions:

  1. Check the documentation
  2. Search existing issues
  3. Create a new issue if needed

Note: This package implements rigorous econometric tests. Ensure you understand the underlying methodology and assumptions before interpreting results for research or policy decisions.

