Narayan-Popp ADF Unit Root Test with Two Structural Breaks

narayanpop

License: MIT

A Python implementation of the unit root test with two structural breaks proposed by Narayan and Popp (2010).

Reference

Narayan, P. K. and Popp, S. (2010), "A new unit root test with two structural breaks in level and slope at unknown time", Journal of Applied Statistics, 37(9), 1425-1438.

DOI: 10.1080/02664760903039883

Features

  • Exact replication of the original GAUSS code and paper methodology
  • Two model specifications:
    • Model 1 (Model A): Two breaks in level
    • Model 2 (Model C): Two breaks in level and trend
  • Sequential break date selection procedure
  • Innovational Outlier (IO) model for gradual breaks
  • Flexible lag selection: AIC, SIC, or t-statistic criterion
  • Formatted summary output suitable for inclusion in papers
  • Critical values for various sample sizes (T ≤ 50, 50 < T ≤ 200, 200 < T ≤ 400, T > 400)
  • Panel data support for testing multiple series

Installation

pip install narayanpop

Or install from source:

git clone https://github.com/merwanroudane/narayanpop.git
cd narayanpop
pip install -e .

Quick Start

Basic Usage

import numpy as np
import pandas as pd
from narayanpop import adf_2breaks

# Generate sample data
np.random.seed(42)
y = np.cumsum(np.random.randn(100))

# Run test with Model 1 (breaks in level only)
result = adf_2breaks(y, model=1)

# Print formatted results
print(result.summary())

Working with Time Series Data

import numpy as np
import pandas as pd
from narayanpop import adf_2breaks

# Build an annual series with a DatetimeIndex (replace with your own data)
dates = pd.date_range('1960', periods=100, freq='YS')  # 'YS' = year start
y = pd.Series(np.cumsum(np.random.randn(100)), index=dates)

# Run test with Model 2 (breaks in level and trend)
result = adf_2breaks(y, model=2, pmax=8, ic=3, trimm=0.10)

# Access results
print(f"Test Statistic: {result.test_statistic:.4f}")
print(f"First Break: {result.break1}")
print(f"Second Break: {result.break2}")
print(f"Optimal Lag: {result.optimal_lag}")
print(f"Critical Values: {result.critical_values}")

Panel Data Analysis

import numpy as np
import pandas as pd
from narayanpop import adf_2breaks_panel

# Build a DataFrame with one series per column (replace with your own data)
data = pd.DataFrame({
    'GDP': np.cumsum(np.random.randn(100)),
    'CPI': np.cumsum(np.random.randn(100)),
    'Unemployment': np.cumsum(np.random.randn(100))
})

# Test all series
results_df = adf_2breaks_panel(data, model=1)
print(results_df)

Methodology

Models

Model 1 (Model A): Break in Level

Δy_t = ρy_{t-1} + α_1 + β*t + θ_1*D(TB)_{1,t} + θ_2*D(TB)_{2,t} 
       + δ_1*DU_{1,t-1} + δ_2*DU_{2,t-1} + Σβ_j*Δy_{t-j} + ε_t

Model 2 (Model C): Break in Level and Trend

Δy_t = ρy_{t-1} + α* + β*t + κ_1*D(TB)_{1,t} + κ_2*D(TB)_{2,t}
       + δ*_1*DU_{1,t-1} + δ*_2*DU_{2,t-1} + γ*_1*DT_{1,t-1} + γ*_2*DT_{2,t-1}
       + Σβ_j*Δy_{t-j} + ε_t

Where:

  • DU_{i,t} = 1 if t > TB_i, 0 otherwise (level shift dummy)
  • DT_{i,t} = (t - TB_i) if t > TB_i, 0 otherwise (trend shift dummy)
  • D(TB)_{i,t} = 1 if t = TB_i + 1, 0 otherwise (impulse dummy)
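
The three dummies can be built directly from these definitions. The following is a minimal sketch, not the package's internal code; the helper name break_dummies is hypothetical, and time is assumed to run over t = 1, ..., T.

```python
import numpy as np

def break_dummies(T, tb):
    """Build DU, DT and D(TB) for one break date tb (hypothetical helper)."""
    t = np.arange(1, T + 1)              # time index t = 1, ..., T
    du = (t > tb).astype(int)            # level shift: 1 after the break
    dt = np.where(t > tb, t - tb, 0)     # trend shift: counts periods since the break
    d_tb = (t == tb + 1).astype(int)     # impulse: 1 only in the period after the break
    return du, dt, d_tb

du, dt, d_tb = break_dummies(T=8, tb=3)
print(du)    # [0 0 0 1 1 1 1 1]
print(dt)    # [0 0 0 1 2 3 4 5]
print(d_tb)  # [0 0 0 1 0 0 0 0]
```

Note that the test regressions use DU and DT lagged by one period, so the impulse dummy and the lagged level dummy switch on in different periods.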

Break Date Selection

The test uses a sequential procedure:

  1. First Break: Maximize |t_θ1| (Model 1) or |t_κ1| (Model 2)
  2. Second Break: Conditional on the first, maximize |t_θ2| or |t_κ2|

This approach is computationally efficient: it requires on the order of 2T candidate regressions, compared with roughly T² for a full grid search over both break dates.
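
To make the two-step search concrete, here is a simplified sketch for Model 1 using plain NumPy. It is an illustration under stated assumptions, not the package's implementation: the lag length k is fixed instead of being selected by an information criterion, the function names (ols_tstats, model1_design, sequential_breaks) are hypothetical, and the candidate window is assumed to be int(trimm*T) to int((1-trimm)*T).

```python
import numpy as np

def ols_tstats(X, z):
    """Return OLS t-statistics for every column of X in the regression z ~ X."""
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    cov = sigma2 * np.linalg.pinv(X.T @ X)
    return beta / np.sqrt(np.diag(cov))

def model1_design(y, breaks, k):
    """Model 1 regressors for break dates `breaks` (1-based TB) and k lagged differences."""
    T = len(y)
    dy = np.diff(y, prepend=np.nan)       # dy[i] = y[i] - y[i-1]; dy[0] is unused
    rows = np.arange(k + 1, T)            # observations with all lags available
    t = rows + 1                          # 1-based time index of those observations
    cols = [y[rows - 1], np.ones(len(rows)), t.astype(float)]
    cols += [(t == tb + 1).astype(float) for tb in breaks]   # D(TB)_{i,t}
    cols += [(t - 1 > tb).astype(float) for tb in breaks]    # DU_{i,t-1}
    cols += [dy[rows - j] for j in range(1, k + 1)]          # lagged differences
    return np.column_stack(cols), dy[rows]

def sequential_breaks(y, k=1, trimm=0.10):
    """Pick TB1, then TB2 given TB1, by maximizing |t| on the impulse dummy."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    candidates = range(max(int(trimm * T), k + 2), int((1 - trimm) * T))

    def impulse_t(breaks):
        X, z = model1_design(y, breaks, k)
        return abs(ols_tstats(X, z)[2 + len(breaks)])  # most recently added impulse dummy

    tb1 = max(candidates, key=lambda tb: impulse_t([tb]))
    rest = [tb for tb in candidates if abs(tb - tb1) >= 2]  # adjacent dates give collinear dummies
    tb2 = max(rest, key=lambda tb: impulse_t([tb1, tb]))
    X, z = model1_design(y, sorted([tb1, tb2]), k)
    return tb1, tb2, ols_tstats(X, z)[0]  # last value: t-statistic on y_{t-1}
```

Each step fits one regression per candidate date, which is where the 2T figure comes from. On a simulated random walk with two large level shifts, sequential_breaks(y) should recover both shift dates.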

Null and Alternative Hypotheses

  • H₀: Unit root with structural breaks (y_t is I(1) with breaks)
  • H₁: Trend stationary with structural breaks (y_t is I(0) around a broken trend)

Parameters

adf_2breaks(y, model, pmax=8, ic=3, trimm=0.10)

Parameter | Type       | Description                                                        | Default
y         | array-like | Data series (1D array or pandas Series)                            | Required
model     | int        | Model specification: 1 (level breaks) or 2 (level & trend breaks)  | Required
pmax      | int        | Maximum number of lags for Δy                                      | 8
ic        | int        | Information criterion: 1 (AIC), 2 (SIC), 3 (t-stat)                | 3
trimm     | float      | Trimming rate for break search (0 < trimm < 0.5)                   | 0.10

Output

ADF2BreaksResult Object

Attribute       | Type     | Description
test_statistic  | float    | ADF test statistic
break1          | int/date | First break location
break2          | int/date | Second break location
optimal_lag     | int      | Selected lag length
critical_values | dict     | Critical values at 1%, 5%, 10% levels
model           | int      | Model specification used
nobs            | int      | Number of observations

Methods

  • summary(): Returns formatted output suitable for journal publication

Critical Values

Critical values from Narayan & Popp (2010), Table 3:

Model 1 (Break in Level)

Sample Size   | 1%     | 5%     | 10%
T ≤ 50        | -5.259 | -4.514 | -4.143
50 < T ≤ 200  | -4.958 | -4.316 | -3.980
200 < T ≤ 400 | -4.731 | -4.136 | -3.825
T > 400       | -4.672 | -4.081 | -3.772

Model 2 (Break in Level and Trend)

Sample Size   | 1%     | 5%     | 10%
T ≤ 50        | -5.949 | -5.181 | -4.789
50 < T ≤ 200  | -5.576 | -4.937 | -4.596
200 < T ≤ 400 | -5.318 | -4.741 | -4.430
T > 400       | -5.287 | -4.692 | -4.396
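
For readers who want the tabulated values outside the package, the lookup below is one possible sketch. The function name critical_values and the dictionary layout are assumptions; only the '5%' key is confirmed by the usage examples in this document, and the sample-size bins are taken exactly as listed in the two tables above.

```python
# Upper bound of each sample-size bin, followed by the (1%, 5%, 10%) critical values.
CRITICAL_VALUES = {
    1: [(50, (-5.259, -4.514, -4.143)),
        (200, (-4.958, -4.316, -3.980)),
        (400, (-4.731, -4.136, -3.825)),
        (float("inf"), (-4.672, -4.081, -3.772))],
    2: [(50, (-5.949, -5.181, -4.789)),
        (200, (-5.576, -4.937, -4.596)),
        (400, (-5.318, -4.741, -4.430)),
        (float("inf"), (-5.287, -4.692, -4.396))],
}

def critical_values(model, T):
    """Return the tabulated critical values for a model (1 or 2) and sample size T."""
    for upper, (cv1, cv5, cv10) in CRITICAL_VALUES[model]:
        if T <= upper:
            return {"1%": cv1, "5%": cv5, "10%": cv10}

print(critical_values(1, 100))  # {'1%': -4.958, '5%': -4.316, '10%': -3.98}
```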

Examples

Example 1: Nelson-Plosser Data

import pandas as pd
from narayanpop import adf_2breaks

# Real GNP data (1909-1970)
data = pd.read_csv('nelson_plosser.csv', index_col=0)
y = data['Real_GNP']

# Test with Model 2
result = adf_2breaks(y, model=2, pmax=8, ic=3)
print(result.summary())

Output:

======================================================================
Narayan-Popp ADF Unit Root Test with Two Structural Breaks
======================================================================
Model: Model C (Break in Level and Trend)
Number of observations: 62
Optimal lag length: 2

Test Results:
----------------------------------------------------------------------
ADF test statistic:    -5.5970

Critical Values:
  1% level:     -5.9490
  5% level:     -5.1810
 10% level:     -4.7890

Structural Breaks:
  First break:  1921 (19.35%)
  Second break: 1938 (46.77%)

Conclusion: Reject H0 at 5% level: Evidence AGAINST unit root **
======================================================================
Note: *** 1%, ** 5%, * 10% significance levels
H0: Unit root with structural breaks
H1: Trend stationary with structural breaks
======================================================================

Example 2: US Macroeconomic Data

from narayanpop import adf_2breaks

# cpi_data: a 1D array or pandas Series of CPI observations (1948-2007),
# assumed to be loaded beforehand
result = adf_2breaks(cpi_data, model=1, pmax=8, ic=3, trimm=0.10)

# More negative than the critical value means rejection of the unit root null
if result.test_statistic < result.critical_values['5%']:
    print("Reject unit root at 5% level")
    print(f"Breaks detected at: {result.break1}, {result.break2}")
else:
    print("Cannot reject unit root hypothesis")

Example 3: Monte Carlo Simulation

import numpy as np
from narayanpop import adf_2breaks

# Simulation parameters
T = 100
n_sims = 1000
rejections = 0

for i in range(n_sims):
    # Generate I(1) data with no breaks
    y = np.cumsum(np.random.randn(T))
    
    # Run test
    result = adf_2breaks(y, model=1, pmax=8, ic=3)
    
    # Check rejection at 5% level
    if result.test_statistic < result.critical_values['5%']:
        rejections += 1

print(f"Empirical size at 5% level: {rejections/n_sims:.3f}")
# Should be close to 0.05
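
How close is "close to 0.05"? With a finite number of replications the simulated rejection rate is itself noisy. A rough guide, assuming independent replications and a normal approximation to the binomial, is the band computed below; the helper name mc_band is hypothetical.

```python
import math

def mc_band(p=0.05, n_sims=1000, z=1.96):
    """Approximate 95% band for a simulated rejection rate when the true size is p."""
    se = math.sqrt(p * (1 - p) / n_sims)   # binomial standard error
    return p - z * se, p + z * se

lo, hi = mc_band()
print(f"{lo:.3f} to {hi:.3f}")  # 0.036 to 0.064
```

An empirical size outside this band with 1000 replications suggests a genuine size distortion, not just simulation noise.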

Comparison with Related Tests

Test                    | Breaks | Breaks under H₀ | Breaks under H₁ | Type
Narayan-Popp (2010)     | 2      | Yes             | Yes             | ADF-IO
Lee-Strazicich (2003)   | 2      | Yes             | Yes             | LM
Lumsdaine-Papell (1997) | 2      | No              | Yes             | ADF-IO
Perron (1989)           | 1      | Yes             | Yes             | ADF-IO
Zivot-Andrews (1992)    | 1      | No              | Yes             | ADF-IO

Key Advantage: Narayan-Popp allows for breaks under both null and alternative hypotheses, avoiding spurious rejections that occur with tests that only allow breaks under H₁.

Testing Strategy

Step 1: Choose Model

  • Use Model 1 if only level shifts are expected
  • Use Model 2 if both level and trend changes are possible

Step 2: Set Parameters

  • pmax: Schwert's rule of thumb, int(12*(T/100)^{1/4}), gives 12 for T = 100; the package default is 8
  • ic: Use 3 (t-stat) for general-to-specific approach
  • trimm: Keep at 0.10 (following Zivot-Andrews, Lumsdaine-Papell)
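
The pmax rule of thumb is easy to evaluate for a given sample size. A short sketch follows; the function name schwert_pmax is hypothetical and not part of the package.

```python
def schwert_pmax(T):
    """Maximum lag length from the rule int(12 * (T / 100) ** (1/4))."""
    return int(12 * (T / 100) ** 0.25)

for T in (62, 100, 400):
    print(T, schwert_pmax(T))
# 62 10
# 100 12
# 400 16
```

The result can be passed as pmax to adf_2breaks in place of the default.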

Step 3: Interpret Results

  1. Compare test statistic to critical values
  2. If reject H₀: evidence of trend stationarity with breaks
  3. Check break dates for economic/historical relevance
  4. Verify optimal lag is reasonable
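
The comparison in step 1 can be written as a small helper. This is a sketch: the function name rejection_level is hypothetical, and the '1%' and '10%' dictionary keys are assumed to follow the same pattern as the '5%' key used in the examples.

```python
def rejection_level(stat, cvs):
    """Return the strictest level at which H0 is rejected, or None if it is not rejected."""
    for level in ("1%", "5%", "10%"):
        if stat < cvs[level]:      # left-tailed test: reject when more negative
            return level
    return None

# Values taken from the Example 1 output
cvs = {"1%": -5.949, "5%": -5.181, "10%": -4.789}
print(rejection_level(-5.597, cvs))  # 5%
```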

Step 4: Robustness Checks

  • Try both models
  • Vary pmax
  • Check sensitivity to trimming rate
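
These checks can be run in one loop. In the sketch below, robustness_grid is a hypothetical helper; it takes the test function as an argument and assumes the adf_2breaks signature and result attributes documented above. The grid values are only examples.

```python
from itertools import product

def robustness_grid(y, test_fn, models=(1, 2), pmaxes=(4, 8, 12), trimms=(0.10, 0.15)):
    """Run test_fn over every combination of model, pmax and trimming rate."""
    rows = []
    for model, pmax, trimm in product(models, pmaxes, trimms):
        res = test_fn(y, model=model, pmax=pmax, ic=3, trimm=trimm)
        rows.append({
            "model": model,
            "pmax": pmax,
            "trimm": trimm,
            "stat": res.test_statistic,
            "break1": res.break1,
            "break2": res.break2,
            "reject_5%": res.test_statistic < res.critical_values["5%"],
        })
    return rows

# Usage (requires the package):
#   from narayanpop import adf_2breaks
#   rows = robustness_grid(y, adf_2breaks)
```

Break dates and the rejection decision that stay stable across the grid are more credible than results that hold for only one setting.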

Validation

This implementation has been validated against:

  1. Original GAUSS code (Saban Nazlioglu)
  2. Critical values from Narayan & Popp (2010), Table 3
  3. Nelson-Plosser dataset results
  4. Monte Carlo simulations for size and power properties

Technical Notes

Innovational Outlier (IO) Model

The IO model assumes breaks occur gradually rather than instantaneously:

  • More realistic for economic time series
  • Breaks affect the series through the same dynamic process as innovations
  • Specified through the inclusion of Ψ*(L) in the deterministic component

Computational Efficiency

  • Sequential procedure: ~2T regressions
  • Grid search: ~T² regressions
  • For T=100: about 200 versus 10,000 regressions, so the sequential search is ~50x faster

Trimming

Default trimming (0.10) excludes:

  • First 10% of observations from break1 search
  • Last 10% of observations from break2 search
  • Ensures sufficient observations on each side of breaks

Troubleshooting

Issue: "Data contains missing values"

Solution: Remove or interpolate NaN values before testing

y = y.dropna()  # or y.ffill() to forward-fill instead
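
Dropping missing values in the middle of a series shifts every later observation one position earlier, which can move the estimated break dates. One alternative, sketched below with a hypothetical helper prepare_series, trims missing values only at the ends and linearly interpolates interior gaps.

```python
import pandas as pd

def prepare_series(y):
    """Trim leading/trailing NaN and interpolate interior gaps; also report how many were filled."""
    s = pd.Series(y, dtype=float)
    s = s.loc[s.first_valid_index():s.last_valid_index()]  # keep first to last valid value
    n_filled = int(s.isna().sum())                         # interior gaps remaining
    return s.interpolate(), n_filled

clean, n_filled = prepare_series([float("nan"), 1.0, float("nan"), 3.0, float("nan")])
print(clean.tolist(), n_filled)  # [1.0, 2.0, 3.0] 1
```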

Issue: "Optimal lag is 0"

Solution: This is normal if the differenced series has no significant serial correlation, or if pmax is too small to reach the relevant lags. Consider:

  • Increasing pmax
  • Using different ic criterion
  • Checking data quality

Issue: "No clear breaks detected"

Solution:

  • Try different model specification
  • Check if breaks actually exist in data
  • Consider single-break tests first

Citation

If you use this package in your research, please cite:

@article{narayan2010unit,
  title={A new unit root test with two structural breaks in level and slope at unknown time},
  author={Narayan, Paresh Kumar and Popp, Stephan},
  journal={Journal of Applied Statistics},
  volume={37},
  number={9},
  pages={1425--1438},
  year={2010},
  publisher={Taylor \& Francis}
}

@software{narayanpop2024,
  author = {Roudane, Merwan},
  title = {narayanpop: Python implementation of Narayan-Popp unit root test},
  year = {2024},
  url = {https://github.com/merwanroudane/narayanpop},
  version = {0.0.1}
}

Related Packages

  • statsmodels: General econometrics (ADF, KPSS, etc.)
  • arch: ARCH/GARCH models and unit root tests
  • linearmodels: Panel data econometrics

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/NewFeature)
  3. Commit your changes (git commit -am 'Add NewFeature')
  4. Push to the branch (git push origin feature/NewFeature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Dr Merwan Roudane

Acknowledgments

  • Original GAUSS code by Saban Nazlioglu
  • Narayan & Popp (2010) for the methodology
  • The econometrics community for valuable feedback

Changelog

Version 0.0.1 (2024)

  • Initial release
  • Full implementation of Narayan-Popp test
  • Model 1 and Model 2 support
  • Panel data functionality
  • Publication-ready output formatting

Note: This is an independent implementation for research purposes. For commercial applications, please verify results with the original paper and code.
