Enterprise-grade pre-training identifiability diagnostics for regime-switching models
Identifiability Diagnostic Framework
Overview
A production-ready, unsupervised geometric diagnostic that runs before training to locate practical identifiability boundaries in two-regime Switching State-Space Models (S-SSMs).
This framework implements a novel algorithm that constructs an Observable Moment Matrix from interventional data and applies Singular Value Decomposition (SVD) to determine whether regime parameters are identifiable from observations alone.
Key Features
- SVD-based Identifiability Analysis: Automatically detects parameter identifiability via spectral metrics
- Regime-Switching Support: Full support for multi-regime autoregressive processes with exogenous interventions
- Comprehensive Metrics: Effective rank, condition numbers, singular values, deployment decisions
- Production-Grade: Full type hints, error handling, logging, and extensive test coverage
- CI/CD Ready: GitHub Actions workflows, automated testing, code quality checks
- Easy Installation: PyPI-ready package with full dependency management
- Cross-Platform: Runs on Linux, macOS, and Windows
Installation
From PyPI (Coming Soon)
pip install identifiability-diagnostic
From Source
git clone https://github.com/yourorg/identifiability-diagnostic.git
cd identifiability-diagnostic
pip install -e .
Development Installation
pip install -e ".[dev]"
Quick Start
Basic Usage
from identifiability_diagnostic import (
RegimeSwitchingGenerator,
PretrainingDiagnosticPipeline
)
from identifiability_diagnostic.utils.config import load_config
# Load configuration
config = load_config('config/parameters.yaml')
# Generate synthetic data
generator = RegimeSwitchingGenerator(config)
y, u = generator.generate(epsilon=1e-3, symmetric=False)
# Run diagnostic pipeline
pipeline = PretrainingDiagnosticPipeline(config)
metrics = pipeline.run(y, u)
# Interpret results
print(f"σ_2 = {metrics['sigma_2']:.4e}")
print(f"Decision: {metrics['deploy_decision']}")
print(f"Identifiable: {metrics['identifiable']}")
Command-Line Interface
# Run parametric sweep
python main.py --config config/parameters.yaml --log-level INFO
# Run with custom epsilon values
python main.py --epsilon 1e-5 1e-4 1e-3 1e-2 1e-1
# Test mode (smaller dataset)
python main.py --test --verbose
# Symmetric control only
python main.py --symmetric-only
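The flags used in the commands above could be declared with `argparse` roughly as follows. This is only a sketch: the flag names come from the examples, while the defaults, help strings, and the `build_parser` helper are assumptions, and the real `main.py` may differ.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the CLI examples."""
    parser = argparse.ArgumentParser(description="Identifiability diagnostic sweep")
    parser.add_argument("--config", default="config/parameters.yaml",
                        help="Path to the YAML configuration file")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"])
    parser.add_argument("--epsilon", type=float, nargs="+", default=None,
                        help="One or more epsilon values to sweep")
    parser.add_argument("--test", action="store_true",
                        help="Use a smaller dataset")
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("--symmetric-only", action="store_true",
                        help="Run only the symmetric control")
    return parser

# Parse an explicit argument list instead of sys.argv for illustration.
args = build_parser().parse_args(["--epsilon", "1e-5", "1e-3", "--test"])
print(args.epsilon, args.test, args.symmetric_only)
```

`nargs="+"` is what lets `--epsilon` accept several space-separated values in a single invocation.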
Algorithm Overview
The framework implements a 7-step diagnostic pipeline:
- Compute First-Order Innovations: dy_t = y_t - y_{t-1}
- Lift to Feature Space: v_t = [dy_t, dy_{t-1}]^T
- Align Interventional Context: Match intervention levels to innovation timeline
- Construct Moment Matrix: M ∈ ℝ^{|U| × 4} with conditional covariances
- Execute SVD: Singular Value Decomposition of M
- Compute Metrics: Effective rank, singular values, condition numbers
- Deploy Gate: σ_2 > τ determines identifiability
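The steps above can be sketched in NumPy as follows. This is a minimal illustration, not the package's implementation: the function name `moment_matrix`, the row layout (each row is the flattened 2×2 covariance of v_t conditioned on one intervention level), and the toy data are all assumptions.

```python
import numpy as np

def moment_matrix(y, u, u_levels):
    """Hypothetical sketch of steps 1-4; the package's layout may differ."""
    y, u = np.asarray(y, dtype=float), np.asarray(u)
    dy = np.diff(y)                            # step 1: dy_t = y_t - y_{t-1}
    v = np.column_stack([dy[1:], dy[:-1]])     # step 2: v_t = [dy_t, dy_{t-1}]
    u_aligned = u[2:]                          # step 3: u_t for t = 2..T-1
    rows = []
    for level in u_levels:
        v_level = v[u_aligned == level]        # samples observed at this level
        cov = np.cov(v_level, rowvar=False)    # 2x2 conditional covariance
        rows.append(cov.ravel())               # step 4: one row of 4 entries
    return np.vstack(rows)                     # shape (|U|, 4)

# Toy data (assumption): noise whose scale depends on the intervention level.
rng = np.random.default_rng(0)
u_levels = [-2, -1, 0, 1, 2]
u = rng.choice(u_levels, size=5000)
y = rng.standard_normal(5000) * (1.0 + 0.1 * np.abs(u))

M = moment_matrix(y, u, u_levels)
singular_values = np.linalg.svd(M, compute_uv=False)   # step 5, descending order
sigma_2 = singular_values[1]                            # step 6
identifiable = bool(sigma_2 > 5.12e-17)                 # step 7: deploy gate
print(M.shape, singular_values, identifiable)
```

Because each covariance block is symmetric, the second and third columns of M are identical under this particular layout, so at most three singular values can be nonzero.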
Mathematical Foundation
For the regime-switching model:
y_t = a_{s_t} y_{t-1} + b_{s_t} u_t + c_{s_t} u_t y_{t-1} + η_t
The Observable Moment Matrix captures the covariance structure of the lifted features, stratified by intervention level. Its singular value spectrum indicates whether the two regimes can be distinguished from the observed data.
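To make the model equation concrete, here is a small simulation sketch. The regime process is taken to be a two-state Markov chain with a stay probability `p_stay`, and u_t is drawn uniformly from the intervention grid; both choices, and the `simulate` helper itself, are assumptions, since the source does not specify them.

```python
import numpy as np

def simulate(T, params, u_levels, sigma_n=0.5, p_stay=0.95, seed=42):
    """Simulate y_t = a_s y_{t-1} + b_s u_t + c_s u_t y_{t-1} + eta_t.

    p_stay and the i.i.d. uniform draw of u_t are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    u = rng.choice(u_levels, size=T)
    s = np.zeros(T, dtype=int)
    y = np.zeros(T)
    for t in range(1, T):
        # Two-state Markov chain: stay with probability p_stay, else switch.
        s[t] = s[t - 1] if rng.random() < p_stay else 1 - s[t - 1]
        a, b, c = params[s[t]]
        eta = sigma_n * rng.standard_normal()
        y[t] = a * y[t - 1] + b * u[t] + c * u[t] * y[t - 1] + eta
    return y, u, s

# (a, b, c) per regime, using the default configuration values.
params = [(0.90, 1.0, 0.0), (0.90, 1.0, 0.2)]
y, u, s = simulate(2000, params, [-2, -1, 0, 1, 2])
print(y[:5], np.bincount(s, minlength=2))
```

With these defaults the two regimes differ only in the bilinear coefficient c, so any difference between them shows up only through the interaction of u_t with y_{t-1}.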
Key Decision Rule:
- If σ_2 > threshold (τ = 5.12e-17): Identifiable → Deploy estimator
- If σ_2 ≤ threshold: Non-identifiable → Abort training
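The decision rule reduces to a single comparison. The sketch below returns a dictionary with the same keys that the Quick Start prints; the `deploy_gate` name and the `"DEPLOY"`/`"ABORT"` labels are illustrative assumptions rather than the package's actual values.

```python
from typing import Dict, Union

TAU = 5.12e-17  # gating threshold quoted in the decision rule

def deploy_gate(sigma_2: float, tau: float = TAU) -> Dict[str, Union[float, str, bool]]:
    """Hypothetical gate; the 'DEPLOY'/'ABORT' labels are illustrative."""
    identifiable = sigma_2 > tau
    return {
        "sigma_2": sigma_2,
        "deploy_decision": "DEPLOY" if identifiable else "ABORT",
        "identifiable": identifiable,
    }

print(deploy_gate(3.2e-4))
print(deploy_gate(5.12e-17))   # equality falls on the non-identifiable side
```

The comparison is strict, so a second singular value exactly equal to the threshold is treated as non-identifiable, matching the rule as stated.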
Project Structure
identifiability-diagnostic/
├── .github/workflows/          # CI/CD configurations
├── src/
│   └── identifiability_diagnostic/
│       ├── __init__.py
│       ├── core/               # Core algorithms
│       │   ├── data_generator.py
│       │   ├── pipeline.py
│       │   └── metrics.py
│       ├── utils/              # Utilities
│       │   ├── config.py
│       │   └── logging.py
│       └── exceptions.py
├── tests/
│   ├── unit/                   # Unit tests (20+ tests)
│   └── integration/            # Integration tests (10+ tests)
├── examples/                   # Usage examples
├── config/
│   └── parameters.yaml         # Default configuration
├── main.py                     # CLI entry point
├── setup.py                    # Package setup
├── pyproject.toml              # Modern packaging config
├── requirements.txt            # Dependencies
└── README.md                   # This file
Testing
Run All Tests
pytest tests/ -v --cov=src/identifiability_diagnostic
Unit Tests Only
pytest tests/unit -v
Integration Tests Only
pytest tests/integration -v
With Coverage Report
pytest tests/ --cov=src/identifiability_diagnostic --cov-report=html
Test suite includes:
- Data generation tests
- Pipeline execution tests
- Metrics computation tests
- Configuration validation tests
- End-to-end workflow tests
- Edge case handling
- Numerical stability tests
Configuration
Default configuration in config/parameters.yaml:
simulation:
  T: 100000        # Time series length
  seed: 42         # Random seed
  sigma_n: 0.5     # Process noise std dev
regime_0:
  a: 0.90          # AR coefficient regime 0
  b: 1.0           # Intervention coupling
  c: 0.0           # Bilinear term
regime_1:
  a: 0.90          # AR coefficient regime 1
  b: 1.0
  c: 0.2           # Asymmetric bilinear!
diagnostic:
  u_levels: [-2, -1, 0, 1, 2]   # Intervention grid
  threshold_tau: 5.12e-17       # Gating threshold
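A configuration like this is worth sanity-checking before a long run. The sketch below validates a plain dictionary of the kind a YAML loader would return. It assumes `simulation` and `diagnostic` are top-level sections; the `validate_config` helper and its specific checks are hypothetical and are not the package's own validation logic.

```python
from typing import Any, Dict

def validate_config(cfg: Dict[str, Any]) -> None:
    """Raise ValueError on obviously invalid settings (hypothetical checks)."""
    sim, diag = cfg["simulation"], cfg["diagnostic"]
    if sim["T"] < 3:
        raise ValueError("T must be at least 3 to form lagged differences")
    if sim["sigma_n"] <= 0:
        raise ValueError("sigma_n must be positive")
    if len(set(diag["u_levels"])) < 2:
        raise ValueError("u_levels needs at least two distinct levels")
    if diag["threshold_tau"] <= 0:
        raise ValueError("threshold_tau must be positive")

cfg = {
    "simulation": {"T": 100000, "seed": 42, "sigma_n": 0.5},
    "diagnostic": {"u_levels": [-2, -1, 0, 1, 2], "threshold_tau": 5.12e-17},
}
validate_config(cfg)   # passes silently for the default values
```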
Examples
See examples/basic_usage.py for detailed examples:
python examples/basic_usage.py
Examples include:
- Basic workflow (single epsilon)
- Parametric sweep (multiple epsilon values)
- Symmetric control test
- Custom configuration
API Reference
RegimeSwitchingGenerator
gen = RegimeSwitchingGenerator(config)
y, u = gen.generate(epsilon=1e-3, symmetric=False, verbose=True)
results = gen.batch_generate([1e-5, 1e-4, 1e-3])
PretrainingDiagnosticPipeline
pipeline = PretrainingDiagnosticPipeline(config)
metrics = pipeline.run(y, u, verbose=True)
batch_results = pipeline.batch_run(data_dict)
MetricsCalculator
calc = MetricsCalculator(threshold_tau=5.12e-17)
metrics = calc.compute_metrics(moment_matrix)
eff_rank = calc.compute_effective_rank(singular_values)
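The source does not define how effective rank or the condition number are computed. As one possible reading, the sketch below counts singular values strictly above a tolerance and takes the ratio of the largest to the smallest singular value. Both helpers are hypothetical stand-ins and are not the `MetricsCalculator` methods themselves.

```python
from typing import Sequence

def effective_rank(singular_values: Sequence[float], tol: float = 5.12e-17) -> int:
    """Count singular values strictly above tol (one possible definition)."""
    return sum(1 for s in singular_values if s > tol)

def condition_number(singular_values: Sequence[float]) -> float:
    """Largest over smallest singular value; inf if the smallest is zero."""
    largest, smallest = max(singular_values), min(singular_values)
    return largest / smallest if smallest > 0 else float("inf")

s = [2.0, 0.5, 1e-20, 0.0]
print(effective_rank(s))          # 2
print(condition_number(s))        # inf
print(condition_number(s[:2]))    # 4.0
```

Other definitions exist, for example entropy-based effective rank, so results from this sketch need not match the package's reported metric.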
Performance
Benchmark results on Intel i7 @ 3.6GHz:
| Task | Data Size | Time |
|---|---|---|
| Data Generation (T=100k) | 100,000 obs | ~50 ms |
| Pipeline Execution | 100,000 obs | ~80 ms |
| Full Diagnostic Cycle | 100,000 obs | ~130 ms |
| Parametric Sweep (5 ε values) | 500,000 obs | ~650 ms |
Code Quality
- Coverage: >85% test coverage
- Type Hints: Full type annotations throughout
- Linting: flake8, black, isort, pylint
- Static Analysis: mypy type checking
- Documentation: Comprehensive docstrings and comments
CI/CD Pipeline
Automated workflows for:
- Unit & integration tests (Python 3.8-3.11)
- Multi-platform testing (Ubuntu, macOS, Windows)
- Code quality checks (flake8, black, mypy, pylint)
- Security scanning (bandit, safety)
- Coverage reporting (Codecov)
- Package building and verification
Contributing
See CONTRIBUTING.md for guidelines.
License
MIT License - see LICENSE file for details.
Citation
If you use this framework in research, please cite:
@software{identifiability_diagnostic_2024,
title={Identifiability Diagnostic Framework for Regime-Switching Models},
author={MLOps Team},
year={2024},
url={https://github.com/yourorg/identifiability-diagnostic}
}
Support
- Documentation
- Issue Tracker
- Discussions
Authors
MLOps Team
Acknowledgments
This framework implements novel identifiability diagnostics for regime-switching state-space models using Observable Moment Matrix analysis and SVD-based metrics.
Last Updated: 2024
Status: Production Ready