narayanpop
Narayan-Popp ADF Unit Root Test with Two Structural Breaks
A Python implementation of the unit root test with two structural breaks proposed by Narayan and Popp (2010).
Reference
Narayan, P. K. and Popp, S. (2010), "A new unit root test with two structural breaks in level and slope at unknown time", Journal of Applied Statistics, 37(9), 1425-1438.
DOI: 10.1080/02664760903039883
Features
- ✅ Exact replication of the original GAUSS code and paper methodology
- ✅ Two model specifications:
- Model 1 (Model A): Two breaks in level
- Model 2 (Model C): Two breaks in level and trend
- ✅ Sequential break date selection procedure
- ✅ Innovational Outlier (IO) model for gradual breaks
- ✅ Flexible lag selection: AIC, SIC, or t-statistic criterion
- ✅ Publication-ready output formatted for top-tier journals
- ✅ Critical values for various sample sizes (T ≤ 50, 50 < T ≤ 200, 200 < T ≤ 400, T > 400)
- ✅ Panel data support for testing multiple series
Installation
pip install narayanpop
Or install from source:
git clone https://github.com/merwanroudane/narayanpop.git
cd narayanpop
pip install -e .
Quick Start
Basic Usage
import numpy as np
import pandas as pd
from narayanpop import adf_2breaks
# Generate sample data
np.random.seed(42)
y = np.cumsum(np.random.randn(100))
# Run test with Model 1 (breaks in level only)
result = adf_2breaks(y, model=1)
# Print formatted results
print(result.summary())
Working with Time Series Data
import numpy as np
import pandas as pd
from narayanpop import adf_2breaks
# Load your data (simulated annual observations for illustration)
dates = pd.date_range('1960', periods=100, freq='YE')
y = pd.Series(np.cumsum(np.random.randn(100)), index=dates)
# Run test with Model 2 (breaks in level and trend)
result = adf_2breaks(y, model=2, pmax=8, ic=3, trimm=0.10)
# Access results
print(f"Test Statistic: {result.test_statistic:.4f}")
print(f"First Break: {result.break1}")
print(f"Second Break: {result.break2}")
print(f"Optimal Lag: {result.optimal_lag}")
print(f"Critical Values: {result.critical_values}")
Panel Data Analysis
import numpy as np
import pandas as pd
from narayanpop import adf_2breaks_panel
# Load panel data
data = pd.DataFrame({
'GDP': np.cumsum(np.random.randn(100)),
'CPI': np.cumsum(np.random.randn(100)),
'Unemployment': np.cumsum(np.random.randn(100))
})
# Test all series
results_df = adf_2breaks_panel(data, model=1)
print(results_df)
Methodology
Models
Model 1 (Model A): Break in Level
Δy_t = ρy_{t-1} + α_1 + β*t + θ_1*D(TB)_{1,t} + θ_2*D(TB)_{2,t}
+ δ_1*DU_{1,t-1} + δ_2*DU_{2,t-1} + Σβ_j*Δy_{t-j} + ε_t
Model 2 (Model C): Break in Level and Trend
Δy_t = ρy_{t-1} + α* + β*t + κ_1*D(TB)_{1,t} + κ_2*D(TB)_{2,t}
+ δ*_1*DU_{1,t-1} + δ*_2*DU_{2,t-1} + γ*_1*DT_{1,t-1} + γ*_2*DT_{2,t-1}
+ Σβ_j*Δy_{t-j} + ε_t
Where:
- DU_{i,t} = 1 if t > TB_i, 0 otherwise (level-shift dummy)
- DT_{i,t} = t - TB_i if t > TB_i, 0 otherwise (trend-shift dummy)
- D(TB)_{i,t} = 1 if t = TB_i + 1, 0 otherwise (impulse dummy)
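These dummies can be constructed directly in NumPy. A minimal sketch, using a hypothetical break date TB1 = 7 for a single break:

```python
import numpy as np

T = 20
t = np.arange(1, T + 1)            # time index t = 1, ..., T
TB1 = 7                            # hypothetical break date

DU1 = (t > TB1).astype(int)            # level-shift dummy: 1 after the break
DT1 = np.where(t > TB1, t - TB1, 0)    # trend-shift dummy: ramps up after the break
DTB1 = (t == TB1 + 1).astype(int)      # impulse dummy: 1 only at t = TB1 + 1

print(DU1[5:9])    # [0 0 1 1] -- switches on at t = TB1 + 1 = 8
```

The second break's dummies are built the same way from TB2.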
Break Date Selection
The test uses a sequential procedure:
- First Break: Maximize |t_θ1| (Model 1) or |t_κ1| (Model 2)
- Second Break: Conditional on the first, maximize |t_θ2| or |t_κ2|
This approach is computationally efficient (2T operations vs T² for grid search).
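The first step of this procedure can be sketched in plain NumPy: scan the candidate dates, regress on a constant, trend, and level-shift dummy (a deliberately simplified stand-in for the full test regression, which also includes the lagged level, lagged differences, and the impulse dummy), and keep the date with the largest absolute dummy t-statistic. The simulated series, break location, and trimming below are illustrative:

```python
import numpy as np

def level_break_tstat(y, tb):
    """t-statistic of the level-shift dummy DU_t = 1{t > tb} in a
    regression of y on [constant, trend, DU] -- a simplified version
    of the full test regression (no lags, no impulse dummy)."""
    T = len(y)
    t = np.arange(T)
    X = np.column_stack([np.ones(T), t, (t > tb).astype(float)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (T - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[2, 2])
    return beta[2] / se

rng = np.random.default_rng(42)
T = 100
y = rng.standard_normal(T)
y[60:] += 5.0                                  # simulated level break

lo, hi = int(0.10 * T), int(0.90 * T)          # 10% trimming on both ends
tb1 = max(range(lo, hi), key=lambda tb: abs(level_break_tstat(y, tb)))
print(tb1)                                     # lands at/near the true break (t = 59)
```

The second break date is then chosen the same way with the first break's dummies held fixed in the regression.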
Null and Alternative Hypotheses
- H₀: Unit root with structural breaks (y_t is I(1) with breaks)
- H₁: Trend stationary with structural breaks (y_t is I(0) around a broken trend)
Parameters
adf_2breaks(y, model, pmax=8, ic=3, trimm=0.10)
| Parameter | Type | Description | Default |
|---|---|---|---|
| y | array-like | Data series (1D array or pandas Series) | Required |
| model | int | Model specification: 1 (level breaks) or 2 (level & trend breaks) | Required |
| pmax | int | Maximum number of lags for Δy | 8 |
| ic | int | Information criterion: 1 (AIC), 2 (SIC), 3 (t-stat) | 3 |
| trimm | float | Trimming rate for break search (0 < trimm < 0.5) | 0.10 |
Output
ADF2BreaksResult Object
| Attribute | Type | Description |
|---|---|---|
| test_statistic | float | ADF test statistic |
| break1 | int/date | First break location |
| break2 | int/date | Second break location |
| optimal_lag | int | Selected lag length |
| critical_values | dict | Critical values at 1%, 5%, 10% levels |
| model | int | Model specification used |
| nobs | int | Number of observations |
Methods
summary(): Returns formatted output suitable for journal publication
Critical Values
Critical values from Narayan & Popp (2010), Table 3:
Model 1 (Break in Level)
| Sample Size | 1% | 5% | 10% |
|---|---|---|---|
| T ≤ 50 | -5.259 | -4.514 | -4.143 |
| 50 < T ≤ 200 | -4.958 | -4.316 | -3.980 |
| 200 < T ≤ 400 | -4.731 | -4.136 | -3.825 |
| T > 400 | -4.672 | -4.081 | -3.772 |
Model 2 (Break in Level and Trend)
| Sample Size | 1% | 5% | 10% |
|---|---|---|---|
| T ≤ 50 | -5.949 | -5.181 | -4.789 |
| 50 < T ≤ 200 | -5.576 | -4.937 | -4.596 |
| 200 < T ≤ 400 | -5.318 | -4.741 | -4.430 |
| T > 400 | -5.287 | -4.692 | -4.396 |
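The four sample-size bands in the tables above lend themselves to a simple lookup. A sketch using the Table 3 values (the function and dictionary names here are ours, not part of the package API):

```python
# Narayan & Popp (2010), Table 3: (upper sample-size bound, (1%, 5%, 10%))
CRITICAL_VALUES = {
    1: [(50, (-5.259, -4.514, -4.143)),
        (200, (-4.958, -4.316, -3.980)),
        (400, (-4.731, -4.136, -3.825)),
        (float("inf"), (-4.672, -4.081, -3.772))],
    2: [(50, (-5.949, -5.181, -4.789)),
        (200, (-5.576, -4.937, -4.596)),
        (400, (-5.318, -4.741, -4.430)),
        (float("inf"), (-5.287, -4.692, -4.396))],
}

def lookup_critical_values(model, T):
    """Return the 1%/5%/10% critical values for a sample of size T."""
    for upper, (cv1, cv5, cv10) in CRITICAL_VALUES[model]:
        if T <= upper:
            return {"1%": cv1, "5%": cv5, "10%": cv10}

print(lookup_critical_values(1, 100))   # {'1%': -4.958, '5%': -4.316, '10%': -3.98}
```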
Examples
Example 1: Nelson-Plosser Data
import pandas as pd
from narayanpop import adf_2breaks
# Real GNP data (1909-1970)
data = pd.read_csv('nelson_plosser.csv', index_col=0)
y = data['Real_GNP']
# Test with Model 2
result = adf_2breaks(y, model=2, pmax=8, ic=3)
print(result.summary())
Output:
======================================================================
Narayan-Popp ADF Unit Root Test with Two Structural Breaks
======================================================================
Model: Model C (Break in Level and Trend)
Number of observations: 62
Optimal lag length: 2
Test Results:
----------------------------------------------------------------------
ADF test statistic: -5.5970
Critical Values:
1% level: -5.9490
5% level: -5.1810
10% level: -4.7890
Structural Breaks:
First break: 1921 (19.35%)
Second break: 1938 (46.77%)
Conclusion: Reject H0 at 5% level: Evidence AGAINST unit root **
======================================================================
Note: *** 1%, ** 5%, * 10% significance levels
H0: Unit root with structural breaks
H1: Trend stationary with structural breaks
======================================================================
Example 2: US Macroeconomic Data
from narayanpop import adf_2breaks
# CPI data (1948-2007), assumed already loaded as a pandas Series
result = adf_2breaks(cpi_data, model=1, pmax=8, ic=3, trimm=0.10)
if result.test_statistic < result.critical_values['5%']:
print(f"Reject unit root at 5% level")
print(f"Breaks detected at: {result.break1}, {result.break2}")
else:
print("Cannot reject unit root hypothesis")
Example 3: Monte Carlo Simulation
import numpy as np
from narayanpop import adf_2breaks
# Simulation parameters
T = 100
n_sims = 1000
rejections = 0
for i in range(n_sims):
# Generate I(1) data with no breaks
y = np.cumsum(np.random.randn(T))
# Run test
result = adf_2breaks(y, model=1, pmax=8, ic=3)
# Check rejection at 5% level
if result.test_statistic < result.critical_values['5%']:
rejections += 1
print(f"Empirical size at 5% level: {rejections/n_sims:.3f}")
# Should be close to 0.05
Comparison with Related Tests
| Test | Breaks | Under H₀ | Under H₁ | Type |
|---|---|---|---|---|
| Narayan-Popp (2010) | 2 | Yes | Yes | ADF-IO |
| Lee-Strazicich (2003) | 2 | Yes | Yes | LM |
| Lumsdaine-Papell (1997) | 2 | No | Yes | ADF-IO |
| Perron (1989) | 1 | Yes | Yes | ADF-IO |
| Zivot-Andrews (1992) | 1 | No | Yes | ADF-IO |
Key Advantage: Narayan-Popp allows for breaks under both null and alternative hypotheses, avoiding spurious rejections that occur with tests that only allow breaks under H₁.
Testing Strategy
Step 1: Choose Model
- Use Model 1 if only level shifts are expected
- Use Model 2 if both level and trend changes are possible
Step 2: Set Parameters
- pmax: rule of thumb int(12*(T/100)^{1/4}), or 8 for T ≈ 100
- ic: use 3 (t-stat) for a general-to-specific approach
- trimm: keep at 0.10 (following Zivot-Andrews and Lumsdaine-Papell)
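The lag rule of thumb above is a one-liner (this formula is commonly attributed to Schwert, 1989):

```python
def rule_of_thumb_pmax(T):
    """Rule-of-thumb maximum lag length: int(12 * (T/100)^(1/4))."""
    return int(12 * (T / 100) ** 0.25)

print(rule_of_thumb_pmax(100))   # 12
print(rule_of_thumb_pmax(50))    # 10
```

Note the rule gives 12 for T = 100, while the package default is the more conservative pmax=8.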
Step 3: Interpret Results
- Compare test statistic to critical values
- If reject H₀: evidence of trend stationarity with breaks
- Check break dates for economic/historical relevance
- Verify optimal lag is reasonable
Step 4: Robustness Checks
- Try both models
- Vary pmax
- Check sensitivity to trimming rate
Validation
This implementation has been validated against:
- ✅ Original GAUSS code (Saban Nazlioglu)
- ✅ Critical values from Narayan & Popp (2010), Table 3
- ✅ Nelson-Plosser dataset results
- ✅ Monte Carlo simulations for size and power properties
Technical Notes
Innovational Outlier (IO) Model
The IO model assumes breaks occur gradually rather than instantaneously:
- More realistic for economic time series
- Breaks affect the series through the same dynamic process as innovations
- Specified through the inclusion of Ψ*(L) in the deterministic component
Computational Efficiency
- Sequential procedure: ~2T operations
- Grid search: ~T² operations
- For T=100: Sequential is ~50x faster
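The operation counts above amount to simple arithmetic:

```python
T = 100
sequential_ops = 2 * T      # two one-dimensional searches over candidate dates
grid_ops = T ** 2           # joint grid over all (TB1, TB2) pairs
print(grid_ops // sequential_ops)   # 50 -- the ~50x speed-up for T = 100
```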
Trimming
Default trimming (0.10) excludes:
- First 10% of observations from break1 search
- Last 10% of observations from break2 search
- Ensures sufficient observations on each side of breaks
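As a sketch of the admissible search window implied by symmetric trimming (the exact rounding convention used by the package is an assumption here):

```python
import math

def break_search_window(T, trimm=0.10):
    """First and last admissible break dates under symmetric trimming."""
    lo = math.ceil(trimm * T)           # skip the first trimm*T observations
    hi = math.floor((1 - trimm) * T)    # skip the last trimm*T observations
    return lo, hi

print(break_search_window(100))   # (10, 90)
```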
Troubleshooting
Issue: "Data contains missing values"
Solution: Remove or interpolate NaN values before testing
y = y.dropna()  # or y.ffill() to carry the last observation forward
Issue: "Optimal lag is 0"
Solution: Normal if series is white noise or pmax too small. Consider:
- Increasing pmax
- Using different ic criterion
- Checking data quality
Issue: "No clear breaks detected"
Solution:
- Try different model specification
- Check if breaks actually exist in data
- Consider single-break tests first
Citation
If you use this package in your research, please cite:
@article{narayan2010unit,
title={A new unit root test with two structural breaks in level and slope at unknown time},
author={Narayan, Paresh Kumar and Popp, Stephan},
journal={Journal of Applied Statistics},
volume={37},
number={9},
pages={1425--1438},
year={2010},
publisher={Taylor \& Francis}
}
@software{narayanpop2024,
author = {Roudane, Merwan},
title = {narayanpop: Python implementation of Narayan-Popp unit root test},
year = {2024},
url = {https://github.com/merwanroudane/narayanpop},
version = {0.0.1}
}
Related Packages
- statsmodels: General econometrics (ADF, KPSS, etc.)
- arch: ARCH/GARCH models and unit root tests
- linearmodels: Panel data econometrics
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/NewFeature)
- Commit your changes (git commit -am 'Add NewFeature')
- Push to the branch (git push origin feature/NewFeature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Author
Dr Merwan Roudane
- Email: merwanroudane920@gmail.com
- GitHub: @merwanroudane
Acknowledgments
- Original GAUSS code by Saban Nazlioglu
- Narayan & Popp (2010) for the methodology
- The econometrics community for valuable feedback
Changelog
Version 0.0.1 (2024)
- Initial release
- Full implementation of Narayan-Popp test
- Model 1 and Model 2 support
- Panel data functionality
- Publication-ready output formatting
Note: This is an independent implementation for research purposes. For commercial applications, please verify results with the original paper and code.