
Institutional-grade hierarchical portfolio optimization (HRP, HERC, NCO) by Anagatam Technologies

Canopy: The Institutional Hierarchical Portfolio Optimization Engine


Canopy is an open-source, institutional-grade library implementing three hierarchical portfolio allocation algorithms: HRP (Hierarchical Risk Parity), HERC (Hierarchical Equal Risk Contribution), and NCO (Nested Clustered Optimization). It adds advanced covariance estimation, configurable risk measures, and a comprehensive audit trail, and is designed for production deployment at hedge funds, asset managers, and quantitative research desks.

Canopy replaces a collection of separate mathematical scripts with a single execution facade: the MasterCanopy class.

Note: Canopy Pro, the premium edition, is currently under development and will be available soon. It extends the open-source edition with proprietary allocation algorithms, real-time streaming covariance, enterprise-grade backtesting, and dedicated support. We will notify you when it launches.

📩 Sign up for Canopy Pro early access →



📚 Official Documentation

Canopy's documentation is written to be comprehensive, mathematically rigorous, and usable as a production reference.

📖 Read the Full Documentation on ReadTheDocs ➔

The documentation covers:

| Section | Description |
| --- | --- |
| Getting Started | Installation, quickstart, and first portfolio in 30 seconds |
| API Reference | Complete API for MasterCanopy, CovarianceEngine, ClusterEngine, and all optimizers |
| Algorithms Deep Dive | Mathematical derivations for HRP, HERC, NCO with proofs and complexity analysis |
| Linkage Methods | Ward, Single, Complete, Average, Weighted: when to use each, with dendrograms |
| Diagnostics & Audit | ISO 8601 audit trail, JSON export, compliance logging |
| Covariance Theory | Ledoit-Wolf shrinkage derivation, Marchenko-Pastur denoising, detoning mathematics |

Why Canopy?

Canopy was engineered for numerical precision and institutional scale. Its main features:

  1. Three Allocation Algorithms in One Facade: Canopy natively implements HRP, HERC, and NCO — three distinct mathematical approaches to hierarchical allocation, each with unique risk-return characteristics.

  2. Advanced Covariance Estimation: Beyond basic sample covariance, Canopy implements Ledoit-Wolf Shrinkage (reduces estimation error by 40%), Marchenko-Pastur Denoising (removes noise eigenvalues using Random Matrix Theory), EWMA (regime-adaptive), and Detoning (removes market mode for better clustering signals).

  3. Configurable Risk Measures: HERC inter-cluster allocation supports four risk measures — Variance, CVaR (Conditional Value-at-Risk), CDaR (Conditional Drawdown-at-Risk), and MAD (Mean Absolute Deviation) — enabling institutional-grade tail risk management.

  4. Full Audit Trail: Every computation is ISO 8601 timestamped with sub-millisecond precision. Export full audit logs as JSON for compliance and reproducibility.
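
To make the audit-trail idea concrete, here is a minimal sketch of an ISO 8601 timestamped log with JSON export, using only the standard library. The `AuditLog` class and its `record` method are hypothetical names for illustration and are not taken from Canopy's code; only the `to_json()` name mirrors the method shown in this README.

```python
import json
from datetime import datetime, timezone


class AuditLog:
    """Hypothetical in-memory audit log (illustration only, not Canopy's implementation)."""

    def __init__(self):
        self.entries = []

    def record(self, step, **details):
        # UTC timestamp in ISO 8601 with microsecond (sub-millisecond) precision
        stamp = datetime.now(timezone.utc).isoformat(timespec="microseconds")
        self.entries.append({"timestamp": stamp, "step": step, "details": details})

    def to_json(self):
        # Serialize every recorded step for compliance or reproducibility checks
        return json.dumps(self.entries, indent=2)


log = AuditLog()
log.record("covariance", estimator="ledoit_wolf", n_assets=5)
log.record("allocate", method="HRP")
audit_json = log.to_json()
print(audit_json)
```

Each entry carries its own timestamp, so the exported JSON can be replayed in order or compared across runs.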


Getting Started

Instead of importing separate functions for each step, the whole workflow runs through a single MasterCanopy object:

import yfinance as yf  # not in Canopy's dependency list; install separately
from canopy.MasterCanopy import MasterCanopy

# 1. Download daily prices and convert them to simple returns
data = yf.download(['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'JPM'], start='2020-01-01')
returns = data['Close'].pct_change().dropna()

# 2. Build the hierarchical tree, then allocate (HRP with Ledoit-Wolf covariance)
opt = MasterCanopy(method='HRP', cov_estimator='ledoit_wolf')
weights = opt.cluster(returns).allocate()
print(weights)

# 3. Risk report and audit trail
print(opt.summary())
print(opt.to_json())    # Full audit trail as JSON

Example output (exact weights depend on the date range downloaded):

AAPL     0.1824
MSFT     0.2016
GOOGL    0.1953
AMZN     0.1892
JPM      0.2315

Advanced Usage: HERC with CVaR Risk Measure

# Note: for a fully invested portfolio, a 10% cap needs at least 10 assets,
# so `returns` here must cover a wider universe than the 5-ticker quickstart.
opt = MasterCanopy(
    method='HERC',
    cov_estimator='denoised',    # Marchenko-Pastur denoising
    risk_measure='cvar',         # CVaR for tail-risk-aware allocation
    detone=True,                 # Remove market mode
    min_weight=0.01,             # Per-asset floor
    max_weight=0.10              # UCITS-compliant ceiling
)
weights = opt.cluster(returns).allocate()
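
Weight bounds only make sense if some fully invested portfolio can satisfy them. The sketch below is a small feasibility check one might run before allocating. The helper `bounds_are_feasible` is hypothetical and not part of Canopy, and how Canopy itself treats infeasible bounds is not stated here.

```python
def bounds_are_feasible(n_assets, min_weight, max_weight):
    """Return True if weights in [min_weight, max_weight] can sum to 1."""
    if min_weight > max_weight:
        return False
    # All assets at the floor must not exceed 100%; all at the cap must reach it.
    # A tiny tolerance absorbs floating-point error in the products.
    tol = 1e-12
    return n_assets * min_weight <= 1.0 + tol and n_assets * max_weight >= 1.0 - tol


print(bounds_are_feasible(5, 0.01, 0.10))   # 5 assets capped at 10% reach only 50%
print(bounds_are_feasible(20, 0.01, 0.10))  # 20 assets leave room between 20% and 200%
```

With a 1% floor and a 10% ceiling, the universe must contain between 10 and 100 assets.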

Features & Mathematical Architecture

Hierarchical Allocation Algorithms

Canopy implements three distinct hierarchical allocation algorithms, each targeting different portfolio construction objectives:

Allocation Comparison

| Algorithm | Mathematical Foundation | Key Property | Speed (20 assets) |
| --- | --- | --- | --- |
| HRP | Recursive bisection under inverse-variance naive risk parity. w = recursive_bisect(tree, Σ). NO matrix inversion required. | Maximum stability. Avoids Σ⁻¹ entirely. | ~11 ms |
| HERC | Two-stage allocation: inter-cluster risk parity + intra-cluster inverse-variance with configurable risk measures (Variance, CVaR, CDaR, MAD). | Cluster-aware diversification. | ~17 ms |
| NCO | Nested Clustered Optimization with Tikhonov regularization: (Σ_k + λI)⁻¹ · 1 for intra-cluster min-variance. | Lowest tail risk & drawdown. | ~46 ms |
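
The recursive bisection step of HRP can be sketched in a few lines of NumPy. This is an illustrative version of the textbook procedure, not Canopy's vectorized implementation. The function names are hypothetical, and the leaf order is assumed to come from the hierarchical tree.

```python
import numpy as np


def inverse_variance_weights(cov):
    """Naive risk parity inside one cluster: weights proportional to 1 / variance."""
    ivp = 1.0 / np.diag(cov)
    return ivp / ivp.sum()


def cluster_variance(cov, idx):
    """Variance of the inverse-variance portfolio built on the assets in idx."""
    sub = cov[np.ix_(idx, idx)]
    w = inverse_variance_weights(sub)
    return float(w @ sub @ w)


def recursive_bisection(cov, order):
    """Halve the ordered asset list repeatedly, scaling each half by the
    other half's share of variance. No matrix inverse is computed."""
    weights = np.ones(cov.shape[0])
    clusters = [list(order)]
    while clusters:
        next_clusters = []
        for c in clusters:
            if len(c) < 2:
                continue
            left, right = c[: len(c) // 2], c[len(c) // 2:]
            var_l, var_r = cluster_variance(cov, left), cluster_variance(cov, right)
            alpha = 1.0 - var_l / (var_l + var_r)  # lower-variance half gets more
            weights[left] *= alpha
            weights[right] *= 1.0 - alpha
            next_clusters += [left, right]
        clusters = next_clusters
    return weights


# Toy covariance: two low-volatility assets and two high-volatility assets.
cov = np.diag([0.01, 0.01, 0.04, 0.04])
order = [0, 1, 2, 3]  # assumed leaf order from the clustering step
w = recursive_bisection(cov, order)
print(w, w.sum())
```

Only diagonal entries and quadratic forms are used, which is why the method stays usable when the covariance matrix is close to singular.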

Cumulative Returns: Strategy Comparison

[Figure: Cumulative Returns]


Covariance Estimation Engine

The quality of the covariance matrix is the single most important factor in portfolio optimization. Canopy's CovarianceEngine provides four institutional-grade estimators:

| Estimator | Mathematical Basis | When to Use |
| --- | --- | --- |
| Sample | Σ̂ = (1/T)·Rᵀ·R. Maximum likelihood under Gaussian assumptions | Baseline. Large T/N ratio (>10×) |
| Ledoit-Wolf | Σ_LW = α·F + (1−α)·Σ̂. Optimal shrinkage toward scaled identity | Standard institutional default |
| Denoised | Marchenko-Pastur RMT: clip noise eigenvalues below λ₊ = σ²(1+√(N/T))² | High-noise environments (N/T > 0.5) |
| EWMA | Σ_EWMA = Σₜ wₜ · rₜ · rₜᵀ, decay halflife λ | Regime-adaptive risk management |
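
As an illustration of the Denoised row, the sketch below clips eigenvalues of a correlation matrix at the Marchenko-Pastur upper edge. It assumes σ² = 1 because it works on correlations, and it replaces the noise eigenvalues with their average so the trace is preserved. Canopy's own estimator may fit σ² or treat the noise band differently, and `denoise_correlation` is a hypothetical name.

```python
import numpy as np


def denoise_correlation(corr, n_obs):
    """Replace eigenvalues below the Marchenko-Pastur upper edge with their mean."""
    n_assets = corr.shape[0]
    # Upper edge for a correlation matrix (sigma^2 = 1): (1 + sqrt(N/T))^2
    lambda_plus = (1.0 + np.sqrt(n_assets / n_obs)) ** 2
    eigvals, eigvecs = np.linalg.eigh(corr)
    noise = eigvals < lambda_plus
    cleaned = eigvals.copy()
    if noise.any():
        cleaned[noise] = eigvals[noise].mean()  # keeps the trace unchanged
    denoised = eigvecs @ np.diag(cleaned) @ eigvecs.T
    # Rescale so the diagonal is exactly 1 again
    scale = np.sqrt(np.diag(denoised))
    return denoised / np.outer(scale, scale), lambda_plus


rng = np.random.default_rng(0)
T, N = 500, 20
market = rng.normal(size=(T, 1))                   # one common factor
returns = 0.5 * market + rng.normal(size=(T, N))   # plus idiosyncratic noise
corr = np.corrcoef(returns, rowvar=False)
clean, lambda_plus = denoise_correlation(corr, T)
print(round(lambda_plus, 3), np.linalg.eigvalsh(clean)[-3:])
```

With N = 20 and T = 500 the edge is (1 + 0.2)² = 1.44, so only eigenvalues above 1.44 are treated as signal.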

Detoning (Lopez de Prado, 2020): Optionally removes the market mode (the largest eigenvalue and its eigenvector) from the correlation matrix before clustering. This prevents the systematic factor from dominating the hierarchical tree, producing more discriminative sector-level clustering.
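
A small worked example shows why detoning sharpens the clustering signal. The sketch below drops the largest eigen-component of a toy correlation matrix and rescales the remainder to unit diagonal. The `detone` function is a hypothetical illustration, not Canopy's implementation.

```python
import numpy as np


def detone(corr, n_market=1):
    """Remove the n_market largest eigen-components, then rescale to unit diagonal."""
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigenvalues in ascending order
    keep_vals = eigvals[:-n_market]
    keep_vecs = eigvecs[:, :-n_market]
    residual = keep_vecs @ np.diag(keep_vals) @ keep_vecs.T
    scale = np.sqrt(np.diag(residual))
    return residual / np.outer(scale, scale)


# Two "sectors" of two assets each, all sharing a common market correlation.
corr = np.array([
    [1.0, 0.8, 0.4, 0.4],
    [0.8, 1.0, 0.4, 0.4],
    [0.4, 0.4, 1.0, 0.8],
    [0.4, 0.4, 0.8, 1.0],
])
detoned = detone(corr)
print(np.round(detoned, 2))
```

Before detoning every pair is positively correlated. Afterwards, same-sector pairs stay positive while cross-sector pairs turn negative, so the two sectors separate more cleanly. The detoned matrix is singular, so it is suited to clustering rather than to direct use in an optimizer.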


Risk Measures & Portfolio Modes

HERC Inter-Cluster Risk Measures

| Risk Measure | Formula | Institutional Use Case |
| --- | --- | --- |
| Variance | V_k = wᵀ · Σ_k · w | Classic Raffinot (2017). Symmetric risk |
| CVaR | E[R_k \| R_k ≤ VaR₅%] | Tail risk. Allocates AWAY from crash-prone clusters |
| CDaR | E[DD_k \| DD_k ≥ DDaR₉₅%] | Drawdown risk. Penalizes deep underwater periods |
| MAD | E[\|R_k - E[R_k]\|] | Robust to outliers. No squared deviations |
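
The three non-variance measures in the table can be estimated from a return series as follows. This is a sketch using empirical quantiles with hypothetical function names. Canopy's estimators may differ in quantile convention or sign.

```python
import numpy as np


def cvar(returns, alpha=0.05):
    """Mean return over the worst alpha fraction of observations."""
    var = np.quantile(returns, alpha)
    return float(returns[returns <= var].mean())


def drawdowns(returns):
    """Drawdown path from compounded returns, as non-negative fractions."""
    wealth = np.concatenate(([1.0], np.cumprod(1.0 + returns)))
    peak = np.maximum.accumulate(wealth)
    return 1.0 - wealth / peak


def cdar(returns, alpha=0.05):
    """Mean of the worst alpha fraction of drawdowns."""
    dd = drawdowns(returns)
    threshold = np.quantile(dd, 1.0 - alpha)
    return float(dd[dd >= threshold].mean())


def mad(returns):
    """Mean absolute deviation around the mean return."""
    return float(np.mean(np.abs(returns - returns.mean())))


r = np.array([0.01, -0.02, 0.015, -0.05, 0.02, 0.005, -0.01, 0.03, -0.03, 0.01])
print(cvar(r, 0.10), cdar(r, 0.10), mad(r))
```

CVaR here is a return, so more negative means riskier, while CDaR and MAD are non-negative magnitudes.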

Portfolio Modes

| Mode | Constraint | Use Case |
| --- | --- | --- |
| long_only | wᵢ ≥ 0 ∀i | Mutual funds, ETFs, pensions, UCITS |
| long_short | wᵢ ∈ ℝ, Σwᵢ = 1 | Hedge funds, 130/30 strategies |
| market_neutral | Σwᵢ = 0 | Statistical arbitrage desks |
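
The constraints in the table translate directly into a checker for a weight vector. The sketch below is a hypothetical helper, not a Canopy function. For long_only it additionally assumes full investment (weights summing to 1), which the table does not state.

```python
import math


def satisfies_mode(weights, mode, tol=1e-9):
    """Check a weight vector against the constraint listed for each mode."""
    total = math.fsum(weights)
    if mode == "long_only":
        # Table lists w_i >= 0; the sum-to-1 check is an added assumption.
        return all(w >= -tol for w in weights) and abs(total - 1.0) <= tol
    if mode == "long_short":
        return abs(total - 1.0) <= tol
    if mode == "market_neutral":
        return abs(total) <= tol
    raise ValueError(f"unknown mode: {mode!r}")


print(satisfies_mode([0.5, 0.3, 0.2], "long_only"))
print(satisfies_mode([0.8, 0.5, -0.3], "long_short"))   # 130% long, 30% short
print(satisfies_mode([0.5, -0.5], "market_neutral"))
```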

Dendrogram & Cluster Analysis

Canopy builds a full hierarchical clustering tree using 7 linkage methods (Ward, Single, Complete, Average, Weighted, Centroid, Median) with optional optimal leaf ordering (Bar-Joseph et al., 2001):

[Figure: Dendrogram]

The dendrogram reveals the correlation structure of the asset universe. Strongly correlated assets (e.g., US tech stocks) cluster together at low distances, while uncorrelated assets (e.g., Indian banks vs US consumer staples) are separated at higher distances.
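
For readers who want to reproduce this kind of tree outside Canopy, the sketch below builds one with SciPy from a toy correlation matrix. The correlation distance d = sqrt((1 − ρ)/2) is a common choice and is an assumption here. The distance metrics Canopy actually uses are not specified in this README.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, leaves_list, linkage
from scipy.spatial.distance import squareform

# Toy universe: assets 0 and 1 move together, as do assets 2 and 3.
corr = np.array([
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.0, 0.1],
    [0.1, 0.0, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
])

# Correlation distance: 0 for rho = 1, 1 for rho = -1
dist = np.sqrt(0.5 * (1.0 - corr))
np.fill_diagonal(dist, 0.0)
condensed = squareform(dist, checks=False)

# Ward linkage with optimal leaf ordering (Bar-Joseph et al., 2001)
tree = linkage(condensed, method="ward", optimal_ordering=True)
order = leaves_list(tree)                            # leaf order for quasi-diagonalization
labels = fcluster(tree, t=2, criterion="maxclust")   # cut the tree into two clusters
print(order, labels)
```

The highly correlated pairs merge first at small distances, and the two pairs join only at the top of the tree.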


Risk Decomposition

Canopy decomposes portfolio risk to show each asset's percentage contribution to total portfolio variance:

[Figure: Risk Contribution]

Equal Risk Contribution (the gold dashed line at 5% for N=20) is the theoretical target. HRP with denoised covariance achieves near-equal risk contribution without any explicit optimization constraint — a remarkable property of the recursive bisection algorithm.
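
The decomposition itself is short. Asset i contributes wᵢ·(Σw)ᵢ to the portfolio variance wᵀΣw, and dividing by the total gives shares that sum to 1. The sketch below uses a hypothetical function name and a toy case where equal risk contribution holds exactly.

```python
import numpy as np


def risk_contributions(weights, cov):
    """Share of total portfolio variance attributable to each asset (sums to 1)."""
    weights = np.asarray(weights, dtype=float)
    marginal = cov @ weights          # gradient of variance with respect to w, up to a factor of 2
    contrib = weights * marginal      # w_i * (Sigma w)_i
    return contrib / contrib.sum()    # contrib.sum() equals w' Sigma w


# Uncorrelated assets: inverse-volatility weights give exactly equal contributions.
vols = np.array([0.10, 0.20, 0.40])
cov = np.diag(vols ** 2)
w = (1.0 / vols) / (1.0 / vols).sum()
rc = risk_contributions(w, cov)
print(rc)
```

For N assets the equal-risk target is 1/N per asset, which is where the 5% line for N = 20 comes from.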


Performance Benchmarks

Canopy has been extensively benchmarked on 20 global assets (US + India) across 5 years of daily data (2020-2025):

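| Method | Cov Estimator | Sharpe | Sortino | CVaR 95% | Max DD | Eff N | Speed |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HRP | Denoised | 0.83 | 0.95 | -2.27% | -30.5% | 16.9 | 11 ms |
| HRP | Ledoit-Wolf | 0.79 | 0.91 | -2.29% | -31.0% | 16.5 | 11 ms |
| HERC | LW + CVaR | 0.70 | 0.81 | -2.35% | -31.8% | 15.5 | 17 ms |
| HERC | LW + Variance | 0.72 | 0.84 | -2.25% | -30.1% | 15.2 | 17 ms |
| NCO | Ledoit-Wolf | 0.68 | 0.79 | -2.19% | -23.2% | 8.4 | 46 ms |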
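
The column headings can be read with the following conventional definitions. These are assumptions for illustration: Eff N is taken to be the inverse Herfindahl index 1/Σwᵢ², and Sharpe and Sortino are annualized with 252 trading days and a zero risk-free rate. The exact conventions behind the table are not stated here.

```python
import numpy as np


def effective_n(weights):
    """Inverse Herfindahl index: N for equal weights, 1 for a single-asset portfolio."""
    w = np.asarray(weights, dtype=float)
    return float(1.0 / np.sum(w ** 2))


def sharpe(returns, periods=252):
    """Annualized mean return over annualized volatility (risk-free rate taken as zero)."""
    return float(np.mean(returns) / np.std(returns, ddof=1) * np.sqrt(periods))


def sortino(returns, periods=252):
    """Like Sharpe, but divides by the root-mean-square of negative returns only."""
    downside = np.sqrt(np.mean(np.minimum(returns, 0.0) ** 2))
    return float(np.mean(returns) / downside * np.sqrt(periods))


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded path, as a negative fraction."""
    wealth = np.concatenate(([1.0], np.cumprod(1.0 + returns)))
    return float(np.min(wealth / np.maximum.accumulate(wealth) - 1.0))


daily = np.array([0.01, -0.02, 0.015, -0.05, 0.02, 0.005, -0.01, 0.04])
print(effective_n([0.25] * 4), sharpe(daily), sortino(daily), max_drawdown(daily))
```

Under this reading, an Eff N of 16.9 out of 20 assets indicates weights close to evenly spread, while 8.4 indicates a more concentrated portfolio.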

Feature Summary

Supported features:

  • HRP, HERC, NCO Allocation
  • 4 Covariance Estimators (Sample, Ledoit-Wolf, Denoised, EWMA)
  • 4 Risk Measures (Variance, CVaR, CDaR, MAD)
  • Correlation Matrix Detoning
  • Weight Constraints (min/max bounds)
  • 3 Portfolio Modes (long_only, long_short, market_neutral)
  • Block Bootstrap Confidence Intervals
  • ISO 8601 Audit Trail + JSON Export
  • 9 Interactive Plotly Dark-Theme Charts
  • 7 Linkage Methods + Optimal Leaf Ordering

Project Principles & Design Decisions

  1. Fail Fast, Fail Loud: All inputs are validated at construction time. Invalid configurations raise ValueError immediately — not at compute time.

  2. Zero Matrix Inversion for HRP: HRP never inverts the covariance matrix. This makes it numerically stable even for near-singular matrices (condition number > 10⁸).

  3. Audit Everything: Every computation step is timestamped and logged. Export as JSON for compliance and reproducibility.

  4. Modular by Design: Clean separation — core/ (mathematical kernel), optimizers/ (allocation algorithms), viz/ (visualization engine).

  5. Method Chaining: Fluent API design: opt.cluster(returns).allocate() — clean, readable, Pythonic.

canopy/
├── MasterCanopy.py              ← Facade (v2.3.0)
├── core/
│   ├── CovarianceEngine.py      ← Ledoit-Wolf, Denoised, EWMA, Detoning
│   └── ClusterEngine.py         ← 7 Linkage Methods, 4 Distance Metrics
├── optimizers/
│   ├── HRP.py                   ← Vectorized Recursive Bisection
│   ├── HERC.py                  ← 4 Risk Measures (Var, CVaR, CDaR, MAD)
│   └── NCO.py                   ← Tikhonov-Regularized Nested Optimization
├── viz/ChartEngine.py           ← 9 Interactive Plotly Charts
├── tests/test_canopy.py         ← 29 Tests (all passing)
└── docs/                        ← Sphinx + ReadTheDocs
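
The "Fail Fast, Fail Loud" principle can be illustrated with a configuration object that validates itself on construction. This is a hypothetical sketch, not Canopy's code. `AllocatorConfig` is an invented name, and the estimator strings 'sample' and 'ewma' are assumed spellings (only 'ledoit_wolf' and 'denoised' appear in the examples in this README).

```python
from dataclasses import dataclass

VALID_METHODS = {"HRP", "HERC", "NCO"}
VALID_ESTIMATORS = {"sample", "ledoit_wolf", "denoised", "ewma"}  # partly assumed names


@dataclass(frozen=True)
class AllocatorConfig:
    """Hypothetical config; validation runs in __post_init__, before any data is seen."""
    method: str = "HRP"
    cov_estimator: str = "ledoit_wolf"
    min_weight: float = 0.0
    max_weight: float = 1.0

    def __post_init__(self):
        if self.method not in VALID_METHODS:
            raise ValueError(f"method must be one of {sorted(VALID_METHODS)}, got {self.method!r}")
        if self.cov_estimator not in VALID_ESTIMATORS:
            raise ValueError(f"unknown cov_estimator: {self.cov_estimator!r}")
        if not 0.0 <= self.min_weight <= self.max_weight <= 1.0:
            raise ValueError("need 0 <= min_weight <= max_weight <= 1")


try:
    AllocatorConfig(method="MVO")  # rejected immediately, before any computation
except ValueError as exc:
    message = str(exc)
    print(message)
```

Rejecting a bad configuration at construction keeps the error next to the line that caused it, rather than surfacing later inside a numerical routine.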

🚀 Installation

Using pip

pip install canopy-optimizer

From source

git clone https://github.com/Anagatam/Canopy.git
cd Canopy
pip install -e .

Dependencies

numpy>=1.24
pandas>=2.0
scipy>=1.10
scikit-learn>=1.3
plotly>=5.18

Testing & Developer Setup

# Run the full test suite
python -m pytest tests/test_canopy.py -v

# Run with coverage (requires the pytest-cov plugin)
python -m pytest tests/test_canopy.py -v --cov=canopy

# Generate charts
make charts

# Full validation
make all

Current: 29/29 tests passing in 0.84 seconds.


🔮 Canopy Pro

Canopy (this repository) is our open-source edition, freely available under the MIT License.

Canopy Pro is the premium edition, currently under active development. It extends the open-source core with:

| Feature | Canopy (Open Source) | Canopy Pro (Coming Soon) |
| --- | --- | --- |
| HRP / HERC / NCO | | |
| 4 Covariance Estimators | | ✅ + DCC-GARCH, Factor Models |
| Risk Measures | 4 (Var, CVaR, CDaR, MAD) | 12+ (EVaR, RLVaR, EDaR, Tail Gini) |
| Portfolio Modes | 3 | 6+ (Risk Budgeting, Black-Litterman) |
| Real-Time Streaming | | |
| Enterprise Backtesting | | ✅ (Walk-forward, Monte Carlo) |
| Dedicated Support | Community | Priority SLA |
| Custom Integrations | | ✅ (Bloomberg, Refinitiv, MOSEK) |

Interested in Canopy Pro? Sign up for early access →

We will notify you as soon as Canopy Pro is available.


License

MIT License. Copyright © 2026 Anagatam Technologies. All rights reserved.

Built with precision for the institutional quantitative finance community.
