Institutional-grade hierarchical portfolio optimization (HRP, HERC, NCO) with data loading, metrics, and backtesting — by Anagatam Technologies
🌳 Canopy
The Institutional Hierarchical Portfolio Optimization Engine
HRP · HERC · NCO — Three algorithms. One facade. Zero matrix inversions.
Documentation · PyPI · Wiki · Release Notes · Disclaimer
Canopy is an open-source, institutional-grade Python library for hierarchical portfolio allocation. It implements three algorithms — HRP, HERC, and NCO — with four covariance estimators, four risk measures, walk-forward backtesting, and a full compliance audit trail.
One facade. One import. One line to optimal weights.
from canopy.MasterCanopy import MasterCanopy
weights = MasterCanopy(method='HRP', cov_estimator='ledoit_wolf').cluster(returns).allocate()
[!NOTE] Canopy Pro — featuring next-generation hierarchical algorithms (HRCP, HERC-DRL, Spectral NCO, Bayesian HRP), 12+ risk measures, real-time streaming covariance, and enterprise support — is under active development. 📩 Sign up for early access →
Table of Contents
- Why Canopy?
- Quick Start
- Examples
- Algorithms
- Covariance Estimators
- Risk Measures & Portfolio Modes
- New in v3.0
- Performance Benchmarks
- Architecture
- Installation
- Documentation
- Canopy Pro
- License & Disclaimer
Why Canopy?
| | What | Why it matters |
|---|---|---|
| 🏗️ | Three algorithms, one facade | HRP, HERC, NCO — each with distinct risk-return properties. Switch with one parameter. |
| 📐 | Four covariance estimators | Ledoit-Wolf, Marchenko-Pastur denoising, EWMA, detoning. The covariance is the portfolio. |
| 📊 | Four risk measures | Variance, CVaR, CDaR, MAD — HERC allocates across clusters using the measure you choose. |
| 🔍 | Full audit trail | ISO 8601 timestamped. Export as JSON. MiFID II / SEC Rule 15c3-5 compliant traceability. |
| 🧪 | Zero matrix inversion | HRP never inverts Σ. Stable even when condition number > 10⁸. |
| ⚡ | Fast | HRP: 11ms, HERC: 17ms, NCO: 46ms on 20 assets. Pure NumPy/SciPy. |
Quick Start
pip install canopy-optimizer
import yfinance as yf
from canopy.MasterCanopy import MasterCanopy
# Fetch → Optimize → Allocate
data = yf.download(['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'JPM'], start='2020-01-01')
returns = data['Close'].pct_change().dropna()
opt = MasterCanopy(method='HRP', cov_estimator='ledoit_wolf')
weights = opt.cluster(returns).allocate()
print(weights)
AAPL 0.1824
MSFT 0.2016
GOOGL 0.1953
AMZN 0.1892
JPM 0.2315
Examples
📊 DataLoader — Zero-Boilerplate Data Pipeline
from canopy.data import DataLoader
from canopy.MasterCanopy import MasterCanopy
# One-liner: fetch Nifty stocks + benchmark
returns, nifty = DataLoader.yfinance(
['RELIANCE.NS', 'TCS.NS', 'HDFCBANK.NS', 'INFY.NS', 'ICICIBANK.NS',
'SBIN.NS', 'BHARTIARTL.NS', 'KOTAKBANK.NS', 'LT.NS', 'ITC.NS'],
start='2021-01-01',
benchmark='^NSEI' # Auto-fetches Nifty 50
)
opt = MasterCanopy(method='HRP')
weights = opt.cluster(returns).allocate()
print(weights)
🎯 HERC — Tail-Risk-Aware Allocation with CVaR
opt = MasterCanopy(
method='HERC',
cov_estimator='denoised', # Marchenko-Pastur denoising
risk_measure='cvar', # CVaR for tail-risk-aware allocation
detone=True, # Remove market mode for better clustering
min_weight=0.01, # Per-asset weight floor
max_weight=0.10 # UCITS-compliant ceiling
)
weights = opt.cluster(returns).allocate()
🔬 NCO — Full Audit Trail for Compliance
opt = MasterCanopy(
method='NCO',
cov_estimator='ledoit_wolf',
max_k=8, # Up to 8 clusters
)
weights = opt.cluster(returns).allocate()
# Institutional-grade audit
print(opt.summary()) # Human-readable report
audit = opt.tojson() # Machine-readable JSON for compliance
diag = opt.diagnostics() # Eigenvalue + condition number analysis
print(f"Condition Number: {diag['covariance']['condition_number']:.0f}")
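The condition number printed above is the ratio of the largest to the smallest eigenvalue of the covariance matrix. The README does not show how `diagnostics()` computes it, so the following is only a NumPy sketch of the idea, together with the Tikhonov-regularized solve that NCO is described as using. The helper names `condition_number` and `tikhonov_min_variance` and the value of `lam` are assumptions for illustration.

```python
import numpy as np

def condition_number(cov: np.ndarray) -> float:
    """Ratio of largest to smallest eigenvalue of a symmetric matrix."""
    eigvals = np.linalg.eigvalsh(cov)  # returned in ascending order
    return float(eigvals[-1] / eigvals[0])

def tikhonov_min_variance(cov: np.ndarray, lam: float) -> np.ndarray:
    """Weights proportional to (cov + lam * I)^-1 @ 1, normalized to sum to 1."""
    n = cov.shape[0]
    raw = np.linalg.solve(cov + lam * np.eye(n), np.ones(n))
    return raw / raw.sum()

# Two almost perfectly correlated assets plus one independent asset.
cov = np.array([
    [0.04, 0.0399999, 0.0],
    [0.0399999, 0.04, 0.0],
    [0.0, 0.0, 0.01],
])
ridge = 1e-4  # hypothetical regularization strength

print(f"condition number, raw        : {condition_number(cov):.3g}")
print(f"condition number, regularized: {condition_number(cov + ridge * np.eye(3)):.3g}")
print("regularized weights:", tikhonov_min_variance(cov, ridge).round(4))
```

Adding `lam` to every eigenvalue lifts the smallest one away from zero, which is why the regularized matrix is far better conditioned than the raw one.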
📈 Full Backtest + Performance Metrics
from canopy.data import DataLoader
from canopy.MasterCanopy import MasterCanopy
from canopy.metrics import PortfolioMetrics
from canopy.backtest import BacktestEngine
# Load data
returns, sp500 = DataLoader.yfinance(
['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'NVDA', 'META',
'JPM', 'JNJ', 'PG', 'KO'],
start='2020-01-01',
benchmark='^GSPC'
)
# Walk-forward backtest with monthly rebalancing
engine = BacktestEngine(
optimizer=MasterCanopy(method='HRP', cov_estimator='ledoit_wolf'),
frequency='monthly',
lookback=252,
)
result = engine.run(returns)
print(result.summary())
# Performance analytics
opt = MasterCanopy(method='HRP', cov_estimator='ledoit_wolf')
weights = opt.cluster(returns).allocate()
pm = PortfolioMetrics(returns, weights, benchmark=sp500)
print(f"Sharpe Ratio : {pm.sharpe():.3f}")
print(f"Sortino Ratio : {pm.sortino():.3f}")
print(f"Max Drawdown : {pm.maxdrawdown():.2%}")
print(f"CVaR (5%) : {pm.cvar():.4f}")
print(f"Information Ratio : {pm.informationratio():.3f}")
print(pm.report()) # Full formatted report
Algorithms
Canopy implements three hierarchical allocation algorithms. Each solves the portfolio construction problem differently:
| Algorithm | Method | Key Property | Speed |
|---|---|---|---|
| HRP | Recursive bisection under inverse-variance risk parity. No Σ⁻¹ required. | Maximum stability | 11 ms |
| HERC | Two-stage: inter-cluster risk parity + intra-cluster inverse-variance. 4 risk measures. | Cluster-aware diversification | 17 ms |
| NCO | Tikhonov-regularized nested optimization: (Σ_k + λI)⁻¹ · 1 | Lowest tail risk | 46 ms |
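To make the HRP row concrete, here is a minimal NumPy/SciPy sketch of recursive bisection in the style of López de Prado's original formulation. It is not Canopy's implementation: the names `hrp_weights` and `cluster_variance`, the single-linkage choice, and the correlation distance are assumptions made for illustration. The only reciprocals taken are of diagonal variances, so Σ itself is never inverted.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform

def cluster_variance(cov, idx):
    """Variance of the inverse-variance portfolio restricted to the assets in idx."""
    sub = cov[np.ix_(idx, idx)]
    ivp = 1.0 / np.diag(sub)  # reciprocal of a diagonal, not a matrix inverse
    ivp /= ivp.sum()
    return float(ivp @ sub @ ivp)

def hrp_weights(cov):
    """Recursive bisection over the leaf order of a hierarchical clustering."""
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    dist = np.sqrt(np.clip(0.5 * (1.0 - corr), 0.0, None))  # assumed distance metric
    np.fill_diagonal(dist, 0.0)
    link = linkage(squareform(dist, checks=False), method="single")
    order = leaves_list(link)  # places similar assets next to each other

    weights = np.ones(cov.shape[0])
    stack = [order]
    while stack:
        items = stack.pop()
        if len(items) < 2:
            continue
        half = len(items) // 2
        left, right = items[:half], items[half:]
        var_left = cluster_variance(cov, left)
        var_right = cluster_variance(cov, right)
        alpha = 1.0 - var_left / (var_left + var_right)  # lower variance gets more
        weights[left] *= alpha
        weights[right] *= 1.0 - alpha
        stack += [left, right]
    return weights

# Synthetic returns: six independent assets with volatilities in ratio 1:1:2:2:3:3.
rng = np.random.default_rng(42)
returns = rng.normal(0.0, 0.01, size=(500, 6)) * np.array([1, 1, 2, 2, 3, 3])
cov = np.cov(returns, rowvar=False)
w = hrp_weights(cov)
print(w.round(4), w.sum())
```

Each split hands the two halves complementary shares, so the weights sum to one once every cluster has been bisected down to single assets.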
Cumulative Returns: India — Canopy vs Nifty 50
Cumulative Returns: US — Canopy vs S&P 500
Covariance Estimators
The covariance matrix is the single most important input. Canopy provides four institutional-grade estimators:
| Estimator | Formula | When to Use |
|---|---|---|
| Sample | Σ̂ = (1/T)·Rᵀ·R | Baseline. T/N ratio > 10× |
| Ledoit-Wolf | Σ_LW = α·F + (1−α)·Σ̂ | Default. Reduces estimation error ~40% |
| Denoised | Marchenko-Pastur RMT: clip eigenvalues below λ₊ = σ²(1+√(N/T))² | High-noise. N/T > 0.5 |
| EWMA | Σ_t = λ·Σ_{t-1} + (1−λ)·rₜ·rₜᵀ | Regime-adaptive |
Detoning (Lopez de Prado, 2020): Removes the market mode (first eigenvector) before clustering for more discriminative sector-level grouping.
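The denoising and detoning steps can be sketched on a correlation matrix with plain NumPy. This is an illustration under stated assumptions, not Canopy's `CovarianceEngine`: it takes σ² = 1 in the Marchenko-Pastur edge (reasonable for a correlation matrix), replaces the eigenvalues below λ₊ with their average, and detones by zeroing the largest eigenvalue. The function name `denoise_and_detone` is hypothetical.

```python
import numpy as np

def denoise_and_detone(corr, n_obs, detone=False):
    """Flatten eigenvalues below the Marchenko-Pastur edge; optionally drop the market mode."""
    n_assets = corr.shape[0]
    eigvals, eigvecs = np.linalg.eigh(corr)  # ascending eigenvalues
    lam_plus = (1.0 + np.sqrt(n_assets / n_obs)) ** 2  # assumes sigma^2 = 1
    noise = eigvals < lam_plus
    cleaned = eigvals.copy()
    if noise.any():
        cleaned[noise] = eigvals[noise].mean()  # keeps the trace unchanged
    if detone:
        cleaned[-1] = 0.0  # remove the largest eigenvalue (market mode)
    rebuilt = (eigvecs * cleaned) @ eigvecs.T
    scale = np.sqrt(np.diag(rebuilt))
    return rebuilt / np.outer(scale, scale)  # rescale back to a unit diagonal

# Synthetic universe: ten assets driven by one common market factor plus noise.
rng = np.random.default_rng(0)
market = rng.normal(size=(500, 1))
returns = market + rng.normal(size=(500, 10))
corr = np.corrcoef(returns, rowvar=False)

denoised = denoise_and_detone(corr, n_obs=500)
detoned = denoise_and_detone(corr, n_obs=500, detone=True)

off_diag = ~np.eye(10, dtype=bool)
print("mean correlation, raw    :", corr[off_diag].mean().round(3))
print("mean correlation, detoned:", detoned[off_diag].mean().round(3))
```

The detoned matrix is singular by construction, so in this sketch it is meant only as an input to clustering, not to any step that needs an inverse.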
Risk Measures & Portfolio Modes
HERC Inter-Cluster Risk Allocation
| Measure | Use Case |
|---|---|
| Variance | Classic Raffinot (2017). Symmetric risk budgeting. |
| CVaR | Tail risk. Allocates away from crash-prone clusters. |
| CDaR | Drawdown risk. Penalizes deep underwater periods. |
| MAD | Robust to outliers. No squared deviations. |
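The four measures in the table can be written as small functions of a return series. The sketch below also shows one plausible way to turn them into cluster weights, by budgeting inversely to each cluster's risk. That inverse-risk rule, the tail level `alpha`, and all function names are assumptions for illustration; the README does not spell out Canopy's exact formulas.

```python
import numpy as np

def variance(r):
    return float(np.var(r, ddof=1))

def mad(r):
    """Mean absolute deviation around the mean."""
    return float(np.mean(np.abs(r - np.mean(r))))

def _tail_mean(values, alpha):
    """Average of the largest ceil(alpha * n) entries."""
    k = max(1, int(np.ceil(alpha * len(values))))
    return float(np.sort(values)[-k:].mean())

def cvar(r, alpha=0.05):
    """Mean loss over the worst alpha fraction of periods, as a positive number."""
    return _tail_mean(-np.asarray(r), alpha)

def cdar(r, alpha=0.05):
    """Mean of the worst alpha fraction of drawdowns of the compounded wealth curve."""
    wealth = np.concatenate([[1.0], np.cumprod(1.0 + np.asarray(r))])
    drawdown = 1.0 - wealth / np.maximum.accumulate(wealth)
    return _tail_mean(drawdown, alpha)

def inverse_risk_weights(cluster_returns, measure):
    """Give each cluster a share proportional to 1 / risk."""
    risks = np.array([measure(r) for r in cluster_returns])
    inv = 1.0 / risks
    return inv / inv.sum()

calm = np.array([0.01, -0.01, 0.02, -0.02, 0.01])
crash = np.array([0.01, -0.02, 0.03, -0.10, 0.02])
for measure in (variance, cvar, cdar, mad):
    print(measure.__name__, inverse_risk_weights([calm, crash], measure).round(3))
```

Every measure gives the calm cluster the larger share here, but by different margins, which is the practical effect of switching `risk_measure`.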
Portfolio Modes
| Mode | Constraint | Use Case |
|---|---|---|
| long_only | wᵢ ≥ 0 | Mutual funds, ETFs, pensions, UCITS |
| long_short | Σwᵢ = 1 | Hedge funds, 130/30 strategies |
| market_neutral | Σwᵢ = 0 | Statistical arbitrage |
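The constraints in the table are easy to verify on any weight vector. The checker below is a hypothetical helper, not part of Canopy's API, and it tests only the constraint listed for each mode.

```python
import numpy as np

def satisfies_mode(weights, mode, tol=1e-8):
    """Check a weight vector against the constraint listed for a portfolio mode."""
    w = np.asarray(weights, dtype=float)
    if mode == "long_only":
        return bool(np.all(w >= -tol))        # no short positions
    if mode == "long_short":
        return bool(abs(w.sum() - 1.0) <= tol)  # fully invested net of shorts
    if mode == "market_neutral":
        return bool(abs(w.sum()) <= tol)      # longs and shorts cancel
    raise ValueError(f"unknown mode: {mode!r}")

print(satisfies_mode([0.5, 0.3, 0.2], "long_only"))        # True
print(satisfies_mode([0.8, 0.5, -0.3], "long_short"))      # True
print(satisfies_mode([0.5, -0.2, -0.3], "market_neutral")) # True
print(satisfies_mode([0.8, 0.5, -0.3], "long_only"))       # False
```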
Dendrogram & Cluster Analysis
Canopy builds a full hierarchical clustering tree using 7 linkage methods (Ward, Single, Complete, Average, Weighted, Centroid, Median) with optional optimal leaf ordering (Bar-Joseph et al., 2001).
The dendrogram reveals the correlation structure of the asset universe. Strongly correlated assets cluster together at low distances, while uncorrelated assets are separated at higher distances.
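The same kind of tree can be built directly with SciPy, which exposes both the linkage methods named above and optimal leaf ordering. The sketch assumes the correlation distance d = √(0.5·(1 − ρ)), a common choice in the HRP literature; the README does not say which distance Canopy uses by default.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, leaves_list, linkage
from scipy.spatial.distance import squareform

# Synthetic universe: two groups of three assets, each driven by its own factor.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 500))
columns = [f1 + 0.3 * rng.normal(size=500) for _ in range(3)]
columns += [f2 + 0.3 * rng.normal(size=500) for _ in range(3)]
returns = np.column_stack(columns)

corr = np.corrcoef(returns, rowvar=False)
dist = np.sqrt(np.clip(0.5 * (1.0 - corr), 0.0, None))  # assumed distance metric
np.fill_diagonal(dist, 0.0)

# linkage expects the condensed (upper-triangle) form of the distance matrix.
link = linkage(squareform(dist, checks=False), method="ward", optimal_ordering=True)

labels = fcluster(link, t=2, criterion="maxclust")  # cut the tree into two clusters
order = leaves_list(link)                           # left-to-right dendrogram order

print("cluster labels:", labels)
print("leaf order    :", order)
```

Assets that share a factor merge at low distance and end up with the same label, while the two groups only join at the top of the tree.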
Risk Decomposition
Canopy decomposes portfolio risk to show each asset's marginal contribution to total variance.
Equal Risk Contribution (the gold dashed line at 5% for N=20) is the theoretical target. HRP with denoised covariance achieves near-equal risk contribution without any explicit optimization constraint.
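The decomposition follows from Euler's theorem: asset i contributes wᵢ·(Σw)ᵢ to the portfolio variance wᵀΣw, so the shares sum to one. A minimal sketch, with `risk_contributions` as a hypothetical helper name:

```python
import numpy as np

def risk_contributions(weights, cov):
    """Fraction of portfolio variance attributed to each asset."""
    w = np.asarray(weights, dtype=float)
    marginal = cov @ w            # derivative of variance with respect to w, up to a factor of 2
    total_variance = w @ marginal
    return w * marginal / total_variance

# Uncorrelated assets with volatilities 20%, 10%, 5%.
vols = np.array([0.20, 0.10, 0.05])
cov = np.diag(vols ** 2)

equal_weight = np.full(3, 1.0 / 3.0)
inverse_vol = (1.0 / vols) / (1.0 / vols).sum()

print("equal weights :", risk_contributions(equal_weight, cov).round(3))
print("inverse vol   :", risk_contributions(inverse_vol, cov).round(3))
```

With uncorrelated assets, inverse-volatility weights hit the equal-contribution target exactly, while equal weights leave most of the risk in the most volatile asset.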
📦 New in v3.0
DataLoader
Zero-boilerplate data pipeline. Fetch from Yahoo Finance, CSV, Parquet, or DataFrame.
from canopy.data import DataLoader
returns, nifty = DataLoader.yfinance(
['RELIANCE.NS', 'TCS.NS', 'HDFCBANK.NS', 'INFY.NS'],
start='2021-01-01',
benchmark='^NSEI'
)
returns = DataLoader.csv('prices.csv')
returns = DataLoader.parquet('bloomberg.parquet')
PortfolioMetrics
Math separated from logic. Pure functions + comprehensive reporting class.
from canopy.metrics import PortfolioMetrics
pm = PortfolioMetrics(returns, weights, benchmark=nifty)
print(pm.sharpe()) # Annualized Sharpe Ratio
print(pm.maxdrawdown()) # Maximum peak-to-trough decline
print(pm.cvar()) # Conditional Value-at-Risk
print(pm.report()) # Full formatted report
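For readers who want to see what such pure functions might look like, here is a NumPy sketch of three of the metrics. The annualization factor of 252, the risk-free handling, and the function names are assumptions; Canopy's own definitions may differ in detail.

```python
import numpy as np

TRADING_DAYS = 252  # assumed annualization factor for daily data

def sharpe(r, rf=0.0, periods=TRADING_DAYS):
    """Annualized mean excess return over annualized volatility."""
    excess = np.asarray(r) - rf / periods
    return float(excess.mean() / excess.std(ddof=1) * np.sqrt(periods))

def sortino(r, rf=0.0, periods=TRADING_DAYS):
    """Like Sharpe, but divides by downside deviation only."""
    excess = np.asarray(r) - rf / periods
    downside = np.sqrt(np.mean(np.minimum(excess, 0.0) ** 2))
    return float(excess.mean() / downside * np.sqrt(periods))

def max_drawdown(r):
    """Largest peak-to-trough decline of the compounded wealth curve (negative number)."""
    wealth = np.concatenate([[1.0], np.cumprod(1.0 + np.asarray(r))])
    peak = np.maximum.accumulate(wealth)
    return float((wealth / peak - 1.0).min())

# Portfolio returns are the weighted sum of asset returns in each period.
rng = np.random.default_rng(1)
asset_returns = rng.normal(0.001, 0.01, size=(1000, 3))
weights = np.array([0.5, 0.3, 0.2])
portfolio = asset_returns @ weights

print(f"Sharpe       : {sharpe(portfolio):.3f}")
print(f"Sortino      : {sortino(portfolio):.3f}")
print(f"Max drawdown : {max_drawdown(portfolio):.2%}")
```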
BacktestEngine
Walk-forward rebalancing. Daily, weekly, monthly, quarterly, annual frequencies.
from canopy.MasterCanopy import MasterCanopy
from canopy.backtest import BacktestEngine
engine = BacktestEngine(
optimizer=MasterCanopy(method='HRP', cov_estimator='ledoit_wolf'),
frequency='monthly',
lookback=252,
)
result = engine.run(returns)
print(result.summary())
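The walk-forward idea behind this engine can be sketched in a few lines of pandas: at each rebalance date, weights are estimated from the trailing `lookback` window only and then applied to the following days. This is not `BacktestEngine`'s code. The `walk_forward` function is hypothetical, `inverse_variance` stands in for a real optimizer, and the sketch holds target weights fixed between rebalances while ignoring drift and transaction costs.

```python
import numpy as np
import pandas as pd

def inverse_variance(window: pd.DataFrame) -> pd.Series:
    """Stand-in allocator: weights proportional to 1 / variance."""
    inv = 1.0 / window.var()
    return inv / inv.sum()

def walk_forward(returns: pd.DataFrame, allocate, lookback: int = 252) -> pd.Series:
    """Rebalance on the first trading day of each month using only prior data."""
    idx = returns.index
    month_code = np.asarray(idx.year * 12 + idx.month)
    month_start = np.diff(month_code, prepend=month_code[0]) != 0
    weights, realized = None, {}
    for i, date in enumerate(idx):
        if month_start[i] and i >= lookback:
            # The window ends at i - 1, so day i is never used to set its own weights.
            weights = allocate(returns.iloc[i - lookback:i])
        if weights is not None:
            realized[date] = float(returns.iloc[i] @ weights)
    return pd.Series(realized, name="portfolio_return")

rng = np.random.default_rng(7)
dates = pd.bdate_range("2021-01-01", periods=400)
returns = pd.DataFrame(
    rng.normal(0.0005, 0.01, size=(400, 4)), index=dates, columns=list("ABCD")
)
oos = walk_forward(returns, inverse_variance)
print(oos.head())
print("out-of-sample days:", len(oos))
```

Any callable that maps a window of returns to a weight Series can replace `inverse_variance` in this loop.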
Performance Benchmarks
Benchmarked on 20 global assets (US + India), 5 years of daily data (2020–2025):
| Method | Cov Estimator | Sharpe | Sortino | CVaR 95% | Max DD | Speed |
|---|---|---|---|---|---|---|
| HRP | Denoised | 0.83 | 0.95 | -2.27% | -30.5% | 11 ms |
| HRP | Ledoit-Wolf | 0.79 | 0.91 | -2.29% | -31.0% | 11 ms |
| HERC | LW + CVaR | 0.70 | 0.81 | -2.35% | -31.8% | 17 ms |
| NCO | Ledoit-Wolf | 0.68 | 0.79 | -2.19% | -23.2% | 46 ms |
Feature Matrix
| Feature | Status |
|---|---|
| HRP, HERC, NCO Allocation | ✅ |
| 4 Covariance Estimators (Sample, Ledoit-Wolf, Denoised, EWMA) | ✅ |
| 4 Risk Measures (Variance, CVaR, CDaR, MAD) | ✅ |
| Correlation Matrix Detoning | ✅ |
| Weight Constraints (min/max) | ✅ |
| 3 Portfolio Modes | ✅ |
| DataLoader (yfinance, CSV, Parquet) | ✅ |
| PortfolioMetrics (Sharpe, Sortino, CVaR, Calmar, IR) | ✅ |
| Walk-Forward BacktestEngine | ✅ |
| ISO 8601 Audit Trail + JSON Export | ✅ |
| 9 Interactive Plotly Charts | ✅ |
| 7 Linkage Methods + Optimal Leaf Ordering | ✅ |
| Block Bootstrap Confidence Intervals | ✅ |
| 29 Tests Passing (0.84s) | ✅ |
Architecture
canopy/
├── MasterCanopy.py ← Facade (v3.0.0)
├── core/
│ ├── CovarianceEngine.py ← Ledoit-Wolf, Denoised, EWMA, Detoning
│ └── ClusterEngine.py ← 7 Linkage Methods, 4 Distance Metrics
├── optimizers/
│ ├── HRP.py ← Vectorized Recursive Bisection
│ ├── HERC.py ← 4 Risk Measures (Var, CVaR, CDaR, MAD)
│ └── NCO.py ← Tikhonov-Regularized Nested Optimization
├── data/
│ └── loader.py ← DataLoader (.yfinance, .csv, .parquet)
├── metrics/
│ └── performance.py ← Sharpe, Sortino, MaxDD, CVaR, Calmar, IR
├── backtest/
│ └── engine.py ← Walk-Forward BacktestEngine
├── viz/ChartEngine.py ← 9 Interactive Plotly Charts
├── tests/test_canopy.py ← 29 Tests (all passing)
└── docs/ ← Sphinx + ReadTheDocs
Design Principles
- Fail fast, fail loud. Inputs validated at construction time, not compute time.
- Zero matrix inversion for HRP. Stable even for near-singular covariance matrices.
- Audit everything. Every step timestamped. Export JSON for compliance.
- Modular kernel. core/ (math), optimizers/ (allocation), viz/ (charts), data/ (loading), metrics/ (analytics), backtest/ (simulation).
- Fluent API. opt.cluster(returns).allocate(): one chain, readable, Pythonic.
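Two of these principles, failing fast and the fluent chain, are general patterns that fit in a toy class. `TinyAllocator` below is a hypothetical stand-in written for this illustration and shares no code with `MasterCanopy`.

```python
import numpy as np

class TinyAllocator:
    """Toy allocator showing constructor-time validation and method chaining."""

    METHODS = {"inverse_variance", "equal"}

    def __init__(self, method: str = "inverse_variance"):
        # Fail fast: reject bad configuration before any data is touched.
        if method not in self.METHODS:
            raise ValueError(
                f"unknown method {method!r}; expected one of {sorted(self.METHODS)}"
            )
        self.method = method
        self._cov = None

    def cluster(self, returns):
        # A real engine would build the tree here; the sketch only stores the covariance.
        self._cov = np.cov(np.asarray(returns, dtype=float), rowvar=False)
        return self  # returning self is what makes the chain possible

    def allocate(self) -> np.ndarray:
        if self._cov is None:
            raise RuntimeError("call cluster() before allocate()")
        n = self._cov.shape[0]
        raw = np.ones(n) if self.method == "equal" else 1.0 / np.diag(self._cov)
        return raw / raw.sum()

rng = np.random.default_rng(3)
returns = rng.normal(0.0, 0.01, size=(250, 4)) * np.array([1, 2, 3, 4])
weights = TinyAllocator(method="inverse_variance").cluster(returns).allocate()
print(weights.round(4))
```

A misspelled `method` raises at construction, long before any returns are loaded, which is the behavior the first principle describes.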
Installation
pip install canopy-optimizer
With data loading:
pip install canopy-optimizer[data]
From source:
git clone https://github.com/Anagatam/Canopy.git
cd Canopy && pip install -e .[dev]
Requirements: Python ≥ 3.10 · NumPy · Pandas · SciPy · scikit-learn · Plotly · NetworkX
Testing
pytest tests/test_canopy.py -v # Run tests
pytest tests/test_canopy.py -v --cov # With coverage
29/29 tests passing in 0.84 seconds.
📚 Documentation
| Resource | Link |
|---|---|
| ReadTheDocs | canopy-institutional-hierarchical-optimization-engine.readthedocs.io |
| PyPI | pypi.org/project/canopy-optimizer |
| GitHub Wiki | github.com/Anagatam/Canopy/wiki |
| API Reference | docs/api_reference.md |
| Algorithms | docs/algorithms.md |
| Linkage Methods | docs/linkage_methods.md |
| Diagnostics | docs/diagnostics.md |
🔮 Canopy Pro
Canopy Pro is our advanced premium engine designed for institutional portfolio managers, financial analysts, and investment advisors who need cutting-edge capabilities beyond the open-source edition. It builds on Canopy's proven foundation with next-generation algorithms, expanded risk analytics, and enterprise-grade integrations.
🧬 Next-Generation Algorithms
| Algorithm | What It Does |
|---|---|
| HRCP (Hierarchical Risk Contribution Parity) | Achieves exact risk budgets (< 0.01% tolerance) through iterative scaling within the hierarchical tree. Designed for Basel III/IV risk parity mandates. |
| HERC-DRL (Deep Reinforcement Learning) | A policy gradient agent dynamically reweights clusters based on rolling covariance features. Trained on 20+ years of crisis data (GFC, COVID, SVB). |
| Spectral NCO | Combines spectral graph theory with persistent homology (topological data analysis) to capture higher-order asset dependencies beyond pairwise correlations. |
| Bayesian HRP | Integrates Black-Litterman posterior returns into the hierarchical tree, enabling view-consistent allocation for portfolio managers with conviction ideas. |
📐 Advanced Covariance
| Estimator | What It Delivers |
|---|---|
| DCC-GARCH | Time-varying correlations that capture regime shifts during market stress — critical for VaR models under Basel III. |
| Factor Models (Barra-style) | Handles universes of 500+ assets via factor-based decomposition (style + industry), reducing dimensionality from N² to K². |
| Realized Kernels | Reconstructs covariance from intraday tick data with microstructure noise removal. Purpose-built for HFT and intraday rebalancing. |
📊 12+ Risk Measures
| Measure | What It Captures |
|---|---|
| EVaR (Entropic VaR) | Coherent + convex. Tighter tail bound than CVaR. The preferred measure for robust optimization. |
| RLVaR (Relativistic VaR) | Handles heavy-tailed (non-Gaussian) return distributions. Based on Kaniadakis entropy. |
| EDaR (Entropic Drawdown-at-Risk) | Drawdown-aware tail risk for multi-horizon institutional portfolios. |
| Tail Gini | Measures inequality in the return tail — a more nuanced view of concentration risk than VaR alone. |
| Omega Ratio | Uses the full return distribution, not just the left tail, for a complete risk-return picture. |
🔌 Enterprise Integration
| Integration | Use Case |
|---|---|
| Bloomberg B-PIPE / SAPI | Direct market data ingestion from Bloomberg Terminal |
| Refinitiv Eikon | Alternative data source for firms on Refinitiv |
| MOSEK Solver | Convex optimization backend for constrained Pro algorithms |
| Real-Time Streaming | WebSocket-based covariance updates for intraday rebalancing |
Feature Comparison
| Capability | Canopy (Open Source) | Canopy Pro |
|---|---|---|
| Algorithms | 3 (HRP, HERC, NCO) | 7+ (HRCP, HERC-DRL, Spectral NCO, Bayesian HRP) |
| Covariance Estimators | 4 | 8+ (DCC-GARCH, Factor Models, Realized Kernels) |
| Risk Measures | 4 | 12+ (EVaR, RLVaR, EDaR, Tail Gini, Omega) |
| Portfolio Modes | 3 | 6+ (Risk Budgeting, Black-Litterman, Regime-Aware) |
| Data Sources | yfinance, CSV, Parquet, DataFrame | Bloomberg, Refinitiv, streaming |
| Backtesting | Walk-forward | Walk-forward + Monte Carlo + Stress Testing |
| Support | Community | Priority SLA + Dedicated Engineering |
Interested in Canopy Pro? 📩 Sign up for early access →
Built specifically for institutional portfolio managers, financial analysts, and investment advisors.
⚖️ License & Disclaimer
Apache License 2.0 — Copyright © 2026 Anagatam Technologies. All rights reserved.
[!CAUTION] Not investment advice. Canopy is a mathematical software library for educational and research purposes only. It does not provide financial recommendations or trading signals. Consult a licensed financial professional before making investment decisions. See DISCLAIMER.md for SEC, SEBI, and global regulatory compliance.
Built with precision for the institutional quantitative finance community.
File details
Details for the file canopy_optimizer-3.0.2.tar.gz.
File metadata
- Download URL: canopy_optimizer-3.0.2.tar.gz
- Upload date:
- Size: 64.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 66c6860a98beba5543e74eabada3b8d8e0cdae770c2b5b9199387235b3b01d03 |
| MD5 | e6b8efa497d2017cbd333753a1e71f54 |
| BLAKE2b-256 | f97ba8d3c6f86f1185f7a765910289cad8fc1155622a7d5e437c69cd20aad689 |
File details
Details for the file canopy_optimizer-3.0.2-py3-none-any.whl.
File metadata
- Download URL: canopy_optimizer-3.0.2-py3-none-any.whl
- Upload date:
- Size: 62.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 31f6c130ed88599e97859d8e6614a4b708d26a24bbf303675ed8f776704715e0 |
| MD5 | 40539f3b5120da0336d5ec33eb36c708 |
| BLAKE2b-256 | d05986660ff0db05f6b9288db9ccacca57b44f83674cc12c7d51ea24ee333832 |