Institutional-grade hierarchical portfolio optimization (HRP, HERC, NCO) with data loading, metrics, and backtesting — by Anagatam Technologies
Canopy: The Institutional Hierarchical Portfolio Optimization Engine
Documentation · PyPI · Release Notes · Disclaimer
Welcome to Canopy. Canopy is an open-source, institutional-grade library implementing three hierarchical portfolio allocation algorithms (HRP, HERC, and NCO) with advanced covariance estimation, configurable risk measures, and a comprehensive audit trail. It is designed for production deployment at hedge funds, asset managers, and quantitative research desks.
Canopy wraps what would otherwise be disjointed mathematical scripts in a single execution facade: the MasterCanopy.
[!NOTE] Canopy Pro — Our advanced, top-grade premium model featuring next-generation hierarchical allocation algorithms — is currently under development. Canopy Pro extends the open-source edition with proprietary hierarchical methods (HRCP, HERC-DRL, Spectral NCO), 12+ risk measures, real-time streaming covariance, enterprise backtesting, and dedicated support. Stay tuned — we will notify you when it launches.
Table of Contents
- 📚 Official Documentation
- Why Canopy?
- Getting Started
- Features & Mathematical Architecture
- 📦 New in v3.0
- Performance Benchmarks
- Project Principles & Design Decisions
- 🚀 Installation
- Testing & Developer Setup
- Canopy Pro (Coming Soon)
- ⚖️ License & Disclaimer
📚 Official Documentation
Canopy is built with the rigor and scale of Tier-1 quantitative infrastructure. Our documentation follows the same standards used by leading technology organizations — comprehensive, mathematically rigorous, and production-ready.
📖 Read the Full Documentation on ReadTheDocs ➔
The documentation covers:
| Section | Description |
|---|---|
| Getting Started | Installation, quickstart, and first portfolio in 30 seconds |
| API Reference | Complete API for MasterCanopy, CovarianceEngine, ClusterEngine, and all optimizers |
| Algorithms Deep Dive | Mathematical derivations for HRP, HERC, NCO with proofs and complexity analysis |
| Linkage Methods | Ward, Single, Complete, Average, Weighted — when to use each with dendrograms |
| Diagnostics & Audit | ISO 8601 audit trail, JSON export, compliance logging |
| Covariance Theory | Ledoit-Wolf shrinkage derivation, Marchenko-Pastur denoising, detoning mathematics |
Why Canopy?
Canopy was explicitly engineered for absolute mathematical precision and institutional scalability.
- Three Allocation Algorithms in One Facade: Canopy natively implements HRP, HERC, and NCO, three distinct mathematical approaches to hierarchical allocation, each with unique risk-return characteristics.
- Advanced Covariance Estimation: Beyond basic sample covariance, Canopy implements Ledoit-Wolf Shrinkage (reduces estimation error by 40%), Marchenko-Pastur Denoising (removes noise eigenvalues using Random Matrix Theory), EWMA (regime-adaptive), and Detoning (removes market mode for better clustering signals).
- Configurable Risk Measures: HERC inter-cluster allocation supports four risk measures: Variance, CVaR (Conditional Value-at-Risk), CDaR (Conditional Drawdown-at-Risk), and MAD (Mean Absolute Deviation). This enables institutional-grade tail risk management.
- Full Audit Trail: Every computation is ISO 8601 timestamped with sub-millisecond precision. Export full audit logs as JSON for compliance and reproducibility.
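To make the audit-trail idea concrete, here is a minimal sketch of an ISO 8601 timestamped log with JSON export. The `AuditTrail` class and its `record` method are hypothetical names used for illustration only; they are not Canopy's internals, and only the standard library is used.

```python
import json
from datetime import datetime, timezone


class AuditTrail:
    """Hypothetical minimal audit log (illustration only, not Canopy's implementation)."""

    def __init__(self):
        self.entries = []

    def record(self, step, **details):
        # ISO 8601 UTC timestamp with microsecond (sub-millisecond) resolution
        stamp = datetime.now(timezone.utc).isoformat(timespec="microseconds")
        self.entries.append({"timestamp": stamp, "step": step, "details": details})

    def to_json(self):
        # Serialize the full log for compliance archives
        return json.dumps(self.entries, indent=2)


trail = AuditTrail()
trail.record("covariance", estimator="ledoit_wolf", n_assets=5)
trail.record("allocate", method="HRP")
audit_json = trail.to_json()
print(audit_json)
```

Because each entry is plain JSON, the exported log can be parsed back and diffed between runs to check reproducibility.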
Getting Started
Gone are the days of importing disjointed functions. Canopy abstracts the entire mathematical realm into a single MasterCanopy object:
import numpy as np
import yfinance as yf
from canopy.MasterCanopy import MasterCanopy
# 1. Effortless Market Ingestion
data = yf.download(['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'JPM'], start='2020-01-01')
returns = data['Close'].pct_change().dropna()
# 2. One-Line Optimal Allocation
opt = MasterCanopy(method='HRP', cov_estimator='ledoit_wolf')
weights = opt.cluster(returns).allocate()
print(weights)
# 3. Institutional Risk Report
print(opt.summary())
print(opt.to_json()) # Full audit trail as JSON
The Output
AAPL 0.1824
MSFT 0.2016
GOOGL 0.1953
AMZN 0.1892
JPM 0.2315
Advanced Usage: HERC with CVaR Risk Measure
opt = MasterCanopy(
    method='HERC',
    cov_estimator='denoised',   # Marchenko-Pastur denoising
    risk_measure='cvar',        # CVaR for tail-risk-aware allocation
    detone=True,                # Remove market mode
    min_weight=0.01,            # UCITS-compliant floor
    max_weight=0.10             # UCITS-compliant ceiling; a fully invested portfolio needs at least 10 assets under a 10% cap
)
weights = opt.cluster(returns).allocate()
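Weight bounds like these only admit a fully invested long-only portfolio when N·min_weight ≤ 1 ≤ N·max_weight. The check below is a small sketch of that arithmetic; `bounds_are_feasible` is a hypothetical helper, not part of Canopy's API.

```python
def bounds_are_feasible(n_assets, min_weight, max_weight, tol=1e-12):
    """Return True if weights in [min_weight, max_weight] can sum to 1.

    Hypothetical helper for illustration; assumes a long-only, fully
    invested portfolio with the same bounds on every asset.
    """
    if not 0.0 <= min_weight <= max_weight:
        return False
    lowest_total = n_assets * min_weight    # sum if every asset sits at the floor
    highest_total = n_assets * max_weight   # sum if every asset sits at the ceiling
    return lowest_total <= 1.0 + tol and highest_total >= 1.0 - tol


# A 10% cap on 5 assets tops out at 50% invested, so it is infeasible;
# the same bounds on 20 assets leave room to reach 100%.
five_asset_ok = bounds_are_feasible(5, 0.01, 0.10)
twenty_asset_ok = bounds_are_feasible(20, 0.01, 0.10)
print(five_asset_ok, twenty_asset_ok)
```

Running a check of this kind before allocation catches infeasible bound and universe combinations early.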
Features & Mathematical Architecture
Hierarchical Allocation Algorithms
Canopy implements three distinct hierarchical allocation algorithms, each targeting different portfolio construction objectives:
| Algorithm | Mathematical Foundation | Key Property | Speed (20 assets) |
|---|---|---|---|
| HRP | Recursive bisection under inverse-variance naive risk parity. w = recursive_bisect(tree, Σ); no matrix inversion required. | Maximum stability. Avoids Σ⁻¹ entirely. | ~11 ms |
| HERC | Two-stage allocation: inter-cluster risk parity + intra-cluster inverse-variance with configurable risk measures (Variance, CVaR, CDaR, MAD). | Cluster-aware diversification. | ~17 ms |
| NCO | Nested Clustered Optimization with Tikhonov regularization: (Σ_k + λI)⁻¹ · 1 for intra-cluster min-variance. | Lowest tail risk & drawdown. | ~46 ms |
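The HRP row can be illustrated with a compact sketch of the textbook recursive bisection, written with NumPy and SciPy only. This is not Canopy's `HRP.py`; the function names, the single-linkage choice, and the synthetic data are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform


def cluster_variance(cov, idx):
    """Variance of the inverse-variance portfolio restricted to assets idx."""
    sub = cov[np.ix_(idx, idx)]
    ivp = 1.0 / np.diag(sub)
    ivp /= ivp.sum()
    return float(ivp @ sub @ ivp)


def hrp_weights(cov):
    """Textbook HRP sketch: cluster, order leaves, then bisect recursively."""
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    dist = np.sqrt(np.clip(0.5 * (1.0 - corr), 0.0, None))  # correlation distance
    np.fill_diagonal(dist, 0.0)
    link = linkage(squareform(dist, checks=False), method="single")
    order = leaves_list(link)  # quasi-diagonal ordering of the assets

    weights = np.ones(cov.shape[0])
    stack = [order]
    while stack:
        items = stack.pop()
        if len(items) < 2:
            continue
        left, right = items[: len(items) // 2], items[len(items) // 2:]
        v_left = cluster_variance(cov, left)
        v_right = cluster_variance(cov, right)
        alpha = 1.0 - v_left / (v_left + v_right)  # lower-variance half gets more
        weights[left] *= alpha
        weights[right] *= 1.0 - alpha
        stack += [left, right]
    return weights


# Synthetic, nearly uncorrelated returns with different volatilities
rng = np.random.default_rng(0)
vols = np.array([0.010, 0.012, 0.020, 0.025, 0.015, 0.030])
sample_returns = rng.normal(size=(500, 6)) * vols
cov = np.cov(sample_returns, rowvar=False)
weights = hrp_weights(cov)
print(weights.round(4), weights.sum())
```

Note that the covariance matrix is only sliced and multiplied, never inverted, which is the stability property the table refers to.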
Figure: Cumulative Returns, India (Canopy vs Nifty 50)
Figure: Cumulative Returns, US (Canopy vs S&P 500)
Covariance Estimation Engine
The quality of the covariance matrix is the single most important factor in portfolio optimization. Canopy's CovarianceEngine provides four institutional-grade estimators:
| Estimator | Mathematical Basis | When to Use |
|---|---|---|
| Sample | Σ̂ = (1/T)·Rᵀ·R. Maximum likelihood under Gaussian assumptions | Baseline. Large T/N ratio (>10×) |
| Ledoit-Wolf | Σ_LW = α·F + (1−α)·Σ̂. Optimal shrinkage toward scaled identity | Standard institutional default |
| Denoised | Marchenko-Pastur RMT: clip noise eigenvalues below λ₊ = σ²(1+√(N/T))² | High-noise environments (N/T > 0.5) |
| EWMA | Σ_EWMA = Σ wₜ · rₜ · rₜᵀ, decay halflife λ | Regime-adaptive risk management |
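As an illustration of the Denoised row, the sketch below applies the Marchenko-Pastur upper edge λ₊ = (1+√(N/T))² to a correlation matrix (σ² = 1) and replaces every eigenvalue below it with the average of those eigenvalues. The averaging rule is one common variant, chosen here because it preserves the trace; it is an assumption, not necessarily what Canopy's CovarianceEngine does, and `denoise_correlation` is a hypothetical name.

```python
import numpy as np


def denoise_correlation(corr, n_obs):
    """Flatten eigenvalues below the Marchenko-Pastur upper edge (sketch)."""
    n_assets = corr.shape[0]
    lambda_plus = (1.0 + np.sqrt(n_assets / n_obs)) ** 2  # sigma^2 = 1 for correlations
    eigvals, eigvecs = np.linalg.eigh(corr)               # ascending eigenvalues
    noise = eigvals < lambda_plus
    cleaned = eigvals.copy()
    if noise.any():
        cleaned[noise] = eigvals[noise].mean()            # trace-preserving replacement
    denoised = (eigvecs * cleaned) @ eigvecs.T            # rebuild from the spectrum
    scale = np.sqrt(np.diag(denoised))
    return denoised / np.outer(scale, scale)              # restore unit diagonal


# Synthetic returns: one common factor plus idiosyncratic noise
rng = np.random.default_rng(1)
n_obs, n_assets = 250, 20
market = rng.normal(size=(n_obs, 1))
sample_returns = 0.5 * market + rng.normal(size=(n_obs, n_assets))
corr = np.corrcoef(sample_returns, rowvar=False)
denoised = denoise_correlation(corr, n_obs)
print(np.linalg.eigvalsh(corr)[:3].round(3), np.linalg.eigvalsh(denoised)[:3].round(3))
```

The smallest eigenvalues are lifted away from zero, which improves the conditioning of the matrix while leaving the large (signal) eigenvalues in place.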
Detoning (Lopez de Prado, 2020): Optionally removes the market mode (first eigenvalue) from the correlation matrix before clustering. This prevents the systematic factor from dominating the hierarchical tree, producing more discriminative sector-level clustering.
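A minimal detoning sketch, assuming the market mode is the eigenvector with the largest eigenvalue: subtract that component from the correlation matrix and rescale to a unit diagonal. `detone_correlation` is a hypothetical name for this example, not Canopy's API.

```python
import numpy as np


def detone_correlation(corr, n_modes=1):
    """Remove the top n_modes eigen-components and rescale to unit diagonal."""
    eigvals, eigvecs = np.linalg.eigh(corr)         # ascending order
    top_vals = eigvals[-n_modes:]
    top_vecs = eigvecs[:, -n_modes:]
    market = (top_vecs * top_vals) @ top_vecs.T     # market-mode component
    residual = corr - market
    scale = np.sqrt(np.diag(residual))
    return residual / np.outer(scale, scale)


# Five assets sharing one common factor: every pairwise correlation is 0.6
corr = np.full((5, 5), 0.6)
np.fill_diagonal(corr, 1.0)
detoned = detone_correlation(corr)
print(detoned.round(2))  # off-diagonal entries become -0.25
```

In this toy case the common factor is the only structure, so nothing is left to cluster on after detoning. The detoned matrix is singular by construction, so it is suited to clustering rather than to direct inversion.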
Risk Measures & Portfolio Modes
HERC Inter-Cluster Risk Measures
| Risk Measure | Formula | Institutional Use Case |
|---|---|---|
| Variance | V_k = wᵀ · Σ_k · w | Classic Raffinot (2017). Symmetric risk |
| CVaR | E[R_k \| R_k ≤ VaR₅%] | Tail risk. Allocates AWAY from crash-prone clusters |
| CDaR | E[DD_k \| DD_k ≥ DDaR₉₅%] | Drawdown risk. Penalizes deep underwater periods |
| MAD | E[\|R_k - E[R_k]\|] | Robust to outliers. No squared deviations |
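The three non-variance measures can be estimated from a return series in a few lines of NumPy. The sketch below follows the formulas in the table with empirical quantiles. The function names and the convention of measuring drawdowns against a starting wealth of 1 are assumptions for this example.

```python
import numpy as np


def cvar(returns, alpha=0.05):
    """Mean return in the worst alpha tail: E[R | R <= VaR_alpha]."""
    var = np.quantile(returns, alpha)
    return float(returns[returns <= var].mean())


def cdar(returns, alpha=0.95):
    """Mean drawdown at or beyond the alpha-quantile drawdown."""
    wealth = np.concatenate([[1.0], np.cumprod(1.0 + returns)])
    peak = np.maximum.accumulate(wealth)
    drawdown = 1.0 - wealth / peak                 # 0 at a new high, positive underwater
    threshold = np.quantile(drawdown, alpha)
    return float(drawdown[drawdown >= threshold].mean())


def mad(returns):
    """Mean absolute deviation around the mean return."""
    return float(np.abs(returns - returns.mean()).mean())


rng = np.random.default_rng(7)
daily = rng.normal(0.0005, 0.01, size=1000)        # synthetic daily returns
print(cvar(daily), cdar(daily), mad(daily))
```

In a HERC-style scheme, each cluster's risk figure would be computed this way on the cluster's return series, and clusters with larger risk would receive smaller allocations.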
Portfolio Modes
| Mode | Constraint | Use Case |
|---|---|---|
| long_only | wᵢ ≥ 0 ∀i | Mutual funds, ETFs, pensions, UCITS |
| long_short | wᵢ ∈ ℝ, Σwᵢ = 1 | Hedge funds, 130/30 strategies |
| market_neutral | Σwᵢ = 0 | Statistical arbitrage desks |
Dendrogram & Cluster Analysis
Canopy builds a full hierarchical clustering tree using 7 linkage methods (Ward, Single, Complete, Average, Weighted, Centroid, Median) with optional optimal leaf ordering (Bar-Joseph et al., 2001).
The dendrogram reveals the correlation structure of the asset universe. Strongly correlated assets (e.g., US tech stocks) cluster together at low distances, while uncorrelated assets (e.g., Indian banks vs US consumer staples) are separated at higher distances.
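The clustering step can be sketched directly with SciPy, whose `linkage` function accepts an `optimal_ordering` flag. The two-block synthetic universe below is an assumption made to keep the example deterministic; it does not use Canopy's ClusterEngine.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, leaves_list, linkage
from scipy.spatial.distance import squareform

# Two synthetic "sectors": three assets driven by factor A, three by factor B
rng = np.random.default_rng(42)
n_obs = 750
factor_a = rng.normal(size=(n_obs, 1))
factor_b = rng.normal(size=(n_obs, 1))
block_a = factor_a + 0.5 * rng.normal(size=(n_obs, 3))
block_b = factor_b + 0.5 * rng.normal(size=(n_obs, 3))
sample_returns = np.hstack([block_a, block_b])

# Correlation distance: 0 for perfectly correlated assets, 1 for perfectly anti-correlated
corr = np.corrcoef(sample_returns, rowvar=False)
dist = np.sqrt(np.clip(0.5 * (1.0 - corr), 0.0, None))
np.fill_diagonal(dist, 0.0)
condensed = squareform(dist, checks=False)

# Ward linkage with optimal leaf ordering
link = linkage(condensed, method="ward", optimal_ordering=True)
labels = fcluster(link, t=2, criterion="maxclust")  # cut the tree into two clusters
order = leaves_list(link)                           # left-to-right dendrogram order
print(labels, order)
```

Assets that share a factor merge at low distance and end up with the same label, which is the pattern the dendrogram makes visible.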
Risk Decomposition
Canopy decomposes portfolio risk to show each asset's percentage contribution to total portfolio variance.
Equal Risk Contribution (the gold dashed line at 5% for N=20) is the theoretical target. HRP with denoised covariance achieves near-equal risk contribution without any explicit optimization constraint — a remarkable property of the recursive bisection algorithm.
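For reference, the percentage risk contribution of asset i is RCᵢ = wᵢ·(Σw)ᵢ / (wᵀΣw), and the contributions sum to 1. The snippet below is a minimal NumPy sketch of that formula with a hypothetical function name and a toy two-asset covariance matrix.

```python
import numpy as np


def risk_contributions(weights, cov):
    """Fraction of portfolio variance attributable to each asset (sums to 1)."""
    marginal = cov @ weights           # marginal risk, up to a common scale factor
    total_var = weights @ marginal     # portfolio variance w' Σ w
    return weights * marginal / total_var


# Two uncorrelated assets with 20% and 10% volatility
cov = np.diag([0.04, 0.01])
inv_vol = 1.0 / np.sqrt(np.diag(cov))
weights = inv_vol / inv_vol.sum()      # inverse-volatility weights: [1/3, 2/3]
rc = risk_contributions(weights, cov)
print(rc)                              # each asset contributes half of the variance
```

With N assets the equal-risk target is 1/N per asset, which is where the 5% line for N=20 comes from.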
📦 New in v3.0
DataLoader — Zero-Boilerplate Data Pipeline
No more yfinance boilerplate. One-line data loading with automatic cleaning, returns computation, and benchmark alignment.
from canopy.data import DataLoader
# Fetch Indian equities with Nifty 50 benchmark
returns, nifty = DataLoader.yfinance(
    ['RELIANCE.NS', 'TCS.NS', 'HDFCBANK.NS', 'INFY.NS'],
    start='2021-01-01',
    benchmark='^NSEI'
)
# Or from local files
returns = DataLoader.csv('institutional_prices.csv')
returns = DataLoader.parquet('bloomberg_feed.parquet')
PortfolioMetrics — Institutional Analytics
The math is kept separate from the reporting logic: pure functions compute the individual metrics, and the PortfolioMetrics class assembles them into a comprehensive report.
from canopy.metrics import PortfolioMetrics
pm = PortfolioMetrics(returns, weights, benchmark=nifty)
print(pm.sharpe()) # Annualized Sharpe Ratio
print(pm.sortino()) # Sortino Ratio (downside-only vol)
print(pm.maxdrawdown()) # Maximum peak-to-trough decline
print(pm.calmar()) # Calmar Ratio (return / drawdown)
print(pm.cvar()) # Conditional Value-at-Risk
print(pm.informationratio()) # IR vs benchmark
print(pm.report()) # Full formatted report
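For readers who want to see what such metrics compute, here is a standalone NumPy sketch of four of them. These are common textbook definitions, assuming 252 trading days per year and a zero risk-free rate. Canopy's own conventions may differ, and the function names are hypothetical.

```python
import numpy as np

TRADING_DAYS = 252  # assumption: daily data, 252 trading days per year


def annualized_sharpe(returns, risk_free=0.0):
    excess = returns - risk_free / TRADING_DAYS
    return float(excess.mean() / excess.std(ddof=1) * np.sqrt(TRADING_DAYS))


def annualized_sortino(returns, target=0.0):
    downside = np.minimum(returns - target, 0.0)      # keep only shortfalls
    downside_dev = np.sqrt(np.mean(downside ** 2))
    return float((returns.mean() - target) / downside_dev * np.sqrt(TRADING_DAYS))


def max_drawdown(returns):
    wealth = np.concatenate([[1.0], np.cumprod(1.0 + returns)])
    peak = np.maximum.accumulate(wealth)
    return float((wealth / peak - 1.0).min())         # most negative peak-to-trough move


def calmar(returns):
    growth = np.prod(1.0 + returns)
    annual_return = growth ** (TRADING_DAYS / len(returns)) - 1.0
    return float(annual_return / abs(max_drawdown(returns)))


# Portfolio returns from synthetic asset returns and fixed weights
rng = np.random.default_rng(3)
asset_returns = rng.normal(0.0005, 0.01, size=(1000, 4))
portfolio_weights = np.array([0.4, 0.3, 0.2, 0.1])
portfolio = asset_returns @ portfolio_weights
print(annualized_sharpe(portfolio), annualized_sortino(portfolio),
      max_drawdown(portfolio), calmar(portfolio))
```

Keeping each metric a pure function of a return array makes it easy to unit test in isolation.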
BacktestEngine — Walk-Forward Backtesting
Production-grade rolling-window rebalancing engine. Supports daily, weekly, monthly, quarterly, and annual rebalance frequencies.
from canopy.backtest import BacktestEngine
from canopy.MasterCanopy import MasterCanopy
engine = BacktestEngine(
    optimizer=MasterCanopy(method='HRP', cov_estimator='ledoit_wolf'),
    frequency='monthly',
    lookback=252,  # 1 year estimation window
)
result = engine.run(returns)
print(result.summary()) # Sharpe, MaxDD, Turnover
print(result.equity) # NAV equity curve
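The walk-forward idea itself fits in a short loop: estimate weights on a trailing window, hold them out of sample, and re-estimate on a schedule. The sketch below uses inverse-variance weights as a stand-in allocator and a 21-day rebalance interval as a proxy for "monthly". Both are assumptions, weights are held constant between rebalances (no drift), and none of this is BacktestEngine's actual code.

```python
import numpy as np


def inverse_variance_weights(window):
    """Stand-in allocator: weights proportional to 1 / variance."""
    inv_var = 1.0 / window.var(axis=0, ddof=1)
    return inv_var / inv_var.sum()


def walk_forward(returns, lookback=252, rebalance_every=21,
                 allocator=inverse_variance_weights):
    """Rolling-window backtest that never uses data from day t to trade day t."""
    n_obs = returns.shape[0]
    weights = None
    portfolio_returns = []
    for t in range(lookback, n_obs):
        if weights is None or (t - lookback) % rebalance_every == 0:
            weights = allocator(returns[t - lookback:t])  # strictly past data
        portfolio_returns.append(float(returns[t] @ weights))
    portfolio_returns = np.array(portfolio_returns)
    equity = np.cumprod(1.0 + portfolio_returns)          # NAV curve starting near 1
    return portfolio_returns, equity


rng = np.random.default_rng(11)
sample_returns = rng.normal(0.0005, 0.01, size=(600, 4))
oos_returns, equity = walk_forward(sample_returns)
print(len(oos_returns), equity[-1])
```

Slicing the estimation window as `returns[t - lookback:t]` excludes day t, which is the property that guards against look-ahead bias.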
Performance Benchmarks
Canopy has been extensively benchmarked on 20 global assets (US + India) across 5 years of daily data (2020-2025):
| Method | Cov Estimator | Sharpe | Sortino | CVaR 95% | Max DD | Eff N | Speed |
|---|---|---|---|---|---|---|---|
| HRP | Denoised | 0.83 | 0.95 | -2.27% | -30.5% | 16.9 | 11 ms |
| HRP | Ledoit-Wolf | 0.79 | 0.91 | -2.29% | -31.0% | 16.5 | 11 ms |
| HERC | LW + CVaR | 0.70 | 0.81 | -2.35% | -31.8% | 15.5 | 17 ms |
| HERC | LW + Variance | 0.72 | 0.84 | -2.25% | -30.1% | 15.2 | 17 ms |
| NCO | Ledoit-Wolf | 0.68 | 0.79 | -2.19% | -23.2% | 8.4 | 46 ms |
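The "Eff N" column is not defined in this README. A common definition of the effective number of holdings is the inverse Herfindahl index, 1/Σwᵢ², and the sketch below assumes that definition. Under it, equal weights over 20 assets give exactly 20, and concentration lowers the figure.

```python
import numpy as np


def effective_n(weights):
    """Inverse Herfindahl index (assumed definition of 'Eff N')."""
    weights = np.asarray(weights, dtype=float)
    return float(1.0 / np.sum(weights ** 2))


equal = np.full(20, 1.0 / 20)
concentrated = np.array([0.5, 0.3] + [0.2 / 18] * 18)  # two dominant positions
eff_equal = effective_n(equal)
eff_concentrated = effective_n(concentrated)
print(eff_equal, eff_concentrated)
```

Read this way, NCO's 8.4 indicates a noticeably more concentrated portfolio than HRP's 16.9 on the same 20-asset universe.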
Feature Summary
| Feature | Supported |
|---|---|
| HRP, HERC, NCO Allocation | ✅ |
| 4 Covariance Estimators (Sample, Ledoit-Wolf, Denoised, EWMA) | ✅ |
| 4 Risk Measures (Variance, CVaR, CDaR, MAD) | ✅ |
| Correlation Matrix Detoning | ✅ |
| Weight Constraints (min/max bounds) | ✅ |
| 3 Portfolio Modes (long_only, long_short, market_neutral) | ✅ |
| Block Bootstrap Confidence Intervals | ✅ |
| ISO 8601 Audit Trail + JSON Export | ✅ |
| 9 Interactive Plotly Dark-Theme Charts | ✅ |
| 7 Linkage Methods + Optimal Leaf Ordering | ✅ |
Project Principles & Design Decisions
- Fail Fast, Fail Loud: All inputs are validated at construction time. Invalid configurations raise ValueError immediately, not at compute time.
- Zero Matrix Inversion for HRP: HRP never inverts the covariance matrix. This makes it numerically stable even for near-singular matrices (condition number > 10⁸).
- Audit Everything: Every computation step is timestamped and logged. Export as JSON for compliance and reproducibility.
- Modular by Design: Clean separation between core/ (mathematical kernel), optimizers/ (allocation algorithms), and viz/ (visualization engine).
- Method Chaining: Fluent API design. opt.cluster(returns).allocate() is clean, readable, and Pythonic.
canopy/
├── MasterCanopy.py ← Facade (v2.3.0)
├── core/
│ ├── CovarianceEngine.py ← Ledoit-Wolf, Denoised, EWMA, Detoning
│ └── ClusterEngine.py ← 7 Linkage Methods, 4 Distance Metrics
├── optimizers/
│ ├── HRP.py ← Vectorized Recursive Bisection
│ ├── HERC.py ← 4 Risk Measures (Var, CVaR, CDaR, MAD)
│ └── NCO.py ← Tikhonov-Regularized Nested Optimization
├── viz/ChartEngine.py ← 9 Interactive Plotly Charts
├── tests/test_canopy.py ← 29 Tests (all passing)
└── docs/ ← Sphinx + ReadTheDocs
🚀 Installation
Using pip
pip install canopy-optimizer
From source
git clone https://github.com/Anagatam/Canopy.git
cd Canopy
pip install -e .
Dependencies
numpy>=1.24
pandas>=2.0
scipy>=1.10
scikit-learn>=1.3
plotly>=5.18
Testing & Developer Setup
# Run the full test suite
python -m pytest tests/test_canopy.py -v
# Run with coverage
python -m pytest tests/test_canopy.py -v --cov=canopy
# Generate charts
make charts
# Full validation
make all
Current: 29/29 tests passing in 0.84 seconds.
🔮 Canopy Pro
Canopy (this repository) is our open-source edition, freely available under the Apache License 2.0.
Canopy Pro is our advanced, top-grade premium model featuring next-generation hierarchical allocation algorithms currently under active development. It extends the open-source core with proprietary mathematical methods designed for the most demanding institutional portfolios:
🧬 Advanced Hierarchical Allocation Algorithms
| Algorithm | Description | Advantage over Open Source |
|---|---|---|
| HRCP (Hierarchical Risk Contribution Parity) | Exact risk budgeting within the hierarchical tree | True equal risk contribution, not approximate |
| HERC-DRL (Deep Reinforcement Learning HERC) | Dynamic cluster rebalancing via policy gradient | Adapts to regime changes in real-time |
| Spectral NCO | Spectral graph theory + NCO with persistent homology | Captures higher-order asset relationships |
| Bayesian HRP | Posterior-weighted hierarchical allocation | Incorporates prior views (Black-Litterman compatible) |
Feature Comparison
| Feature | Canopy (Open Source) | Canopy Pro (Coming Soon) |
|---|---|---|
| Hierarchical Algorithms | 3 (HRP, HERC, NCO) | 7+ (HRCP, HERC-DRL, Spectral NCO, Bayesian HRP) |
| Covariance Estimators | 4 | 8+ (DCC-GARCH, Factor Models, Realized Kernels) |
| Risk Measures | 4 (Var, CVaR, CDaR, MAD) | 12+ (EVaR, RLVaR, EDaR, Tail Gini) |
| Portfolio Modes | 3 | 6+ (Risk Budgeting, Black-Litterman) |
| Real-Time Streaming | ❌ | ✅ |
| Enterprise Backtesting | ❌ | ✅ (Walk-forward, Monte Carlo) |
| Dedicated Support | Community | Priority SLA |
| Custom Integrations | ❌ | ✅ (Bloomberg, Refinitiv, MOSEK) |
Interested in Canopy Pro? 📩 Sign up for early access →
We will notify you as soon as Canopy Pro is available.
⚖️ License & Disclaimer
Apache License 2.0 — Copyright © 2026 Anagatam Technologies. All rights reserved.
Apache 2.0 provides patent protection for contributors and users. See the full LICENSE file.
[!CAUTION] This library is NOT investment advice. Canopy is a mathematical software library for educational and research purposes only. It does not provide financial recommendations, trading signals, or portfolio management services. Before making any investment decisions, consult a qualified, licensed financial professional. See our full DISCLAIMER for details on SEC, SEBI, and global regulatory compliance.
Built with precision for the institutional quantitative finance community.
Links: 📖 Documentation · 📦 PyPI · 🐛 Issues · 📋 Changelog · ⚖️ Disclaimer