Institutional-grade hierarchical portfolio optimization (HRP, HERC, NCO) with data loading, metrics, and backtesting — by Anagatam Technologies

Canopy: The Institutional Hierarchical Portfolio Optimization Engine

Documentation · PyPI · Release Notes · Disclaimer
License: Apache 2.0

Canopy is an open-source, institutional-grade library implementing three hierarchical portfolio allocation algorithms (HRP, HERC, and NCO) with advanced covariance estimation, configurable risk measures, and a comprehensive audit trail. It is designed for production deployment at hedge funds, asset managers, and quantitative research desks.

Canopy wraps what would otherwise be separate mathematical scripts behind a single facade class: MasterCanopy.

[!NOTE] Canopy Pro — Our advanced, top-grade premium model featuring next-generation hierarchical allocation algorithms — is currently under development. Canopy Pro extends the open-source edition with proprietary hierarchical methods (HRCP, HERC-DRL, Spectral NCO), 12+ risk measures, real-time streaming covariance, enterprise backtesting, and dedicated support. Stay tuned — we will notify you when it launches.

📩 Sign up for Canopy Pro early access →



📚 Official Documentation

Canopy is built with the rigor and scale of Tier-1 quantitative infrastructure. Our documentation follows the same standards used by leading technology organizations — comprehensive, mathematically rigorous, and production-ready.

📖 Read the Full Documentation on ReadTheDocs ➔

The documentation covers:

| Section | Description |
| --- | --- |
| Getting Started | Installation, quickstart, and first portfolio in 30 seconds |
| API Reference | Complete API for MasterCanopy, CovarianceEngine, ClusterEngine, and all optimizers |
| Algorithms Deep Dive | Mathematical derivations for HRP, HERC, NCO with proofs and complexity analysis |
| Linkage Methods | Ward, Single, Complete, Average, Weighted: when to use each, with dendrograms |
| Diagnostics & Audit | ISO 8601 audit trail, JSON export, compliance logging |
| Covariance Theory | Ledoit-Wolf shrinkage derivation, Marchenko-Pastur denoising, detoning mathematics |

Why Canopy?

Canopy was explicitly engineered for absolute mathematical precision and institutional scalability.

  1. Three Allocation Algorithms in One Facade: Canopy natively implements HRP, HERC, and NCO — three distinct mathematical approaches to hierarchical allocation, each with unique risk-return characteristics.

  2. Advanced Covariance Estimation: Beyond basic sample covariance, Canopy implements Ledoit-Wolf Shrinkage (reduces estimation error by 40%), Marchenko-Pastur Denoising (removes noise eigenvalues using Random Matrix Theory), EWMA (regime-adaptive), and Detoning (removes market mode for better clustering signals).

  3. Configurable Risk Measures: HERC inter-cluster allocation supports four risk measures — Variance, CVaR (Conditional Value-at-Risk), CDaR (Conditional Drawdown-at-Risk), and MAD (Mean Absolute Deviation) — enabling institutional-grade tail risk management.

  4. Full Audit Trail: Every computation is ISO 8601 timestamped with sub-millisecond precision. Export full audit logs as JSON for compliance and reproducibility.
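To make the audit-trail idea concrete, here is a minimal sketch of a step logger that stamps each event with an ISO 8601 UTC timestamp and exports the log as JSON. The `AuditTrail` class and its method names are hypothetical and only illustrate the pattern; Canopy's own audit structure and fields may differ.

```python
import json
from datetime import datetime, timezone


class AuditTrail:
    """Hypothetical step logger; not Canopy's actual audit class."""

    def __init__(self):
        self.events = []

    def record(self, step, **details):
        # ISO 8601 timestamp in UTC with microsecond precision.
        stamp = datetime.now(timezone.utc).isoformat(timespec="microseconds")
        self.events.append({"timestamp": stamp, "step": step, "details": details})

    def to_json(self):
        # Serialize the full event list for compliance archives.
        return json.dumps(self.events, indent=2)


trail = AuditTrail()
trail.record("covariance", estimator="ledoit_wolf", n_assets=5)
trail.record("allocate", method="HRP")
print(trail.to_json())
```

Because each entry is plain JSON, the log can be diffed between runs to check reproducibility.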


Getting Started

Instead of importing and wiring together separate functions, you drive the whole pipeline through a single MasterCanopy object:

import numpy as np
import yfinance as yf
from canopy.MasterCanopy import MasterCanopy

# 1. Effortless Market Ingestion
data = yf.download(['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'JPM'], start='2020-01-01')
returns = data['Close'].pct_change().dropna()

# 2. One-Line Optimal Allocation
opt = MasterCanopy(method='HRP', cov_estimator='ledoit_wolf')
weights = opt.cluster(returns).allocate()
print(weights)

# 3. Institutional Risk Report
print(opt.summary())
print(opt.to_json())    # Full audit trail as JSON

The Output

AAPL     0.1824
MSFT     0.2016
GOOGL    0.1953
AMZN     0.1892
JPM      0.2315

Advanced Usage: HERC with CVaR Risk Measure

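# Note: with fully invested weights, a 10% cap is only feasible for 10 or more
# assets, so pass returns for a wider universe than the 5-ticker quickstart.
opt = MasterCanopy(
    method='HERC',
    cov_estimator='denoised',    # Marchenko-Pastur denoising
    risk_measure='cvar',         # CVaR for tail-risk-aware allocation
    detone=True,                 # Remove market mode
    min_weight=0.01,             # Per-asset floor
    max_weight=0.10              # Per-asset ceiling (UCITS-style 10% issuer cap)
)
weights = opt.cluster(returns).allocate()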

Features & Mathematical Architecture

Hierarchical Allocation Algorithms

Canopy implements three distinct hierarchical allocation algorithms, each targeting different portfolio construction objectives:

Allocation Comparison

| Algorithm | Mathematical Foundation | Key Property | Speed (20 assets) |
| --- | --- | --- | --- |
| HRP | Recursive bisection under inverse-variance naive risk parity: w = recursive_bisect(tree, Σ). No matrix inversion required. | Maximum stability. Avoids Σ⁻¹ entirely. | ~11 ms |
| HERC | Two-stage allocation: inter-cluster risk parity plus intra-cluster inverse-variance, with configurable risk measures (Variance, CVaR, CDaR, MAD). | Cluster-aware diversification. | ~17 ms |
| NCO | Nested Clustered Optimization with Tikhonov regularization: (Σ_k + λI)⁻¹ · 1 for intra-cluster min-variance. | Lowest tail risk & drawdown. | ~46 ms |
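The HRP row above can be illustrated with a compact sketch of the textbook procedure: cluster on a correlation distance, order the leaves, then split weights recursively by inverse cluster variance. This is not Canopy's internal implementation; the function names, the distance d = sqrt(0.5·(1 − ρ)), and the single-linkage default are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform


def inverse_variance_weights(cov):
    """Weights proportional to 1 / variance; uses only the diagonal."""
    ivp = 1.0 / np.diag(cov)
    return ivp / ivp.sum()


def cluster_variance(cov, idx):
    """Variance of an inverse-variance portfolio restricted to `idx`."""
    sub = cov[np.ix_(idx, idx)]
    w = inverse_variance_weights(sub)
    return float(w @ sub @ w)


def hrp_weights(cov, method="single"):
    """Textbook HRP sketch: cluster, order leaves, bisect recursively."""
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    # Correlation distance; clip guards against tiny negative round-off.
    dist = np.sqrt(np.clip(0.5 * (1.0 - corr), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    order = leaves_list(linkage(squareform(dist, checks=False), method=method))

    weights = np.ones(cov.shape[0])
    stack = [list(order)]
    while stack:
        items = stack.pop()
        if len(items) < 2:
            continue
        half = len(items) // 2
        left, right = items[:half], items[half:]
        var_l, var_r = cluster_variance(cov, left), cluster_variance(cov, right)
        alpha = 1.0 - var_l / (var_l + var_r)  # more weight to the calmer half
        weights[left] *= alpha
        weights[right] *= 1.0 - alpha
        stack += [left, right]
    return weights


rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(500, 6))  # synthetic daily returns
cov = np.cov(returns, rowvar=False)
w = hrp_weights(cov)
print(np.round(w, 4), w.sum())
```

Note that the covariance matrix is only sliced and multiplied, never inverted, which is the property the table highlights.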

[Figure: Cumulative returns, India: Canopy vs Nifty 50]

[Figure: Cumulative returns, US: Canopy vs S&P 500]


Covariance Estimation Engine

The quality of the covariance matrix is the single most important factor in portfolio optimization. Canopy's CovarianceEngine provides four institutional-grade estimators:

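| Estimator | Mathematical Basis | When to Use |
| --- | --- | --- |
| Sample | Σ̂ = (1/T)·Rᵀ·R, with R the T×N matrix of demeaned returns. Maximum likelihood under Gaussian assumptions. | Baseline. Large T/N ratio (>10×) |
| Ledoit-Wolf | Σ_LW = α·F + (1−α)·Σ̂, where F is the scaled-identity target and α ∈ [0, 1] is the shrinkage intensity. Optimal shrinkage toward scaled identity. | Standard institutional default |
| Denoised | Marchenko-Pastur RMT: clip noise eigenvalues below λ₊ = σ²(1+√(N/T))² | High-noise environments (N/T > 0.5) |
| EWMA | Σ_EWMA = ∑ₜ wₜ · rₜ · rₜᵀ, with weights wₜ that decay exponentially at a configurable half-life | Regime-adaptive risk management |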
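As an illustration of the Denoised row, the sketch below applies the Marchenko-Pastur threshold λ₊ to a correlation matrix (so σ² = 1). The function name is hypothetical, and replacing the noise eigenvalues by their average is one common clipping variant assumed here; Canopy's CovarianceEngine may treat them differently.

```python
import numpy as np


def mp_denoise_corr(corr, n_obs):
    """Flatten eigenvalues below the Marchenko-Pastur upper edge (sigma^2 = 1)."""
    n_assets = corr.shape[0]
    lambda_plus = (1.0 + np.sqrt(n_assets / n_obs)) ** 2
    eigvals, eigvecs = np.linalg.eigh(corr)
    noise = eigvals < lambda_plus
    cleaned = eigvals.copy()
    if noise.any():
        # Replace noise eigenvalues by their mean so the trace is preserved.
        cleaned[noise] = eigvals[noise].mean()
    denoised = (eigvecs * cleaned) @ eigvecs.T
    # Rescale to a unit diagonal so the result is again a correlation matrix.
    scale = np.sqrt(np.diag(denoised))
    return denoised / np.outer(scale, scale)


rng = np.random.default_rng(1)
T, N = 250, 50
market = rng.normal(size=(T, 1))
returns = 0.5 * market + rng.normal(size=(T, N))  # one common factor plus noise
corr = np.corrcoef(returns, rowvar=False)
clean = mp_denoise_corr(corr, n_obs=T)
print(np.linalg.cond(corr), np.linalg.cond(clean))
```

The printed condition numbers give a quick read on how much the cleaning stabilizes the matrix for this synthetic sample.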

Detoning (Lopez de Prado, 2020): Optionally removes the market mode (first eigenvalue) from the correlation matrix before clustering. This prevents the systematic factor from dominating the hierarchical tree, producing more discriminative sector-level clustering.
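A minimal sketch of detoning, assuming the market mode is the component of the largest eigenvalue: subtract that component from the correlation matrix and rescale the remainder to a unit diagonal. The function name and the `n_market` parameter are illustrative, not Canopy's API.

```python
import numpy as np


def detone_corr(corr, n_market=1):
    """Remove the top `n_market` eigen-components and renormalize the diagonal."""
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    top_vals = eigvals[-n_market:]
    top_vecs = eigvecs[:, -n_market:]
    market = (top_vecs * top_vals) @ top_vecs.T
    residual = corr - market
    scale = np.sqrt(np.diag(residual))
    return residual / np.outer(scale, scale)


rng = np.random.default_rng(4)
T, N = 500, 8
market_factor = rng.normal(size=(T, 1))
returns = 0.5 * market_factor + rng.normal(size=(T, N))
corr = np.corrcoef(returns, rowvar=False)
detoned = detone_corr(corr)

off_diag = ~np.eye(N, dtype=bool)
print(corr[off_diag].mean(), detoned[off_diag].mean())
```

The detoned matrix is rank-deficient by construction, so it suits distance calculations for clustering rather than any step that needs an inverse.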


Risk Measures & Portfolio Modes

HERC Inter-Cluster Risk Measures

| Risk Measure | Formula | Institutional Use Case |
| --- | --- | --- |
| Variance | V_k = wᵀ · Σ_k · w | Classic Raffinot (2017). Symmetric risk |
| CVaR | E[R_k \| R_k ≤ VaR₅%] | Tail risk. Allocates AWAY from crash-prone clusters |
| CDaR | E[DD_k \| DD_k ≥ DDaR₉₅%] | Drawdown risk. Penalizes deep underwater periods |
| MAD | E[\|R_k - E[R_k]\|] | Robust to outliers. No squared deviations |
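The formulas in this table can be written out with the standard library alone. The sketch below uses empirical (historical) estimators and a simple inverse-risk rule for splitting capital between clusters; the function names, the empirical quantile convention, and the inverse-risk rule are assumptions for illustration rather than Canopy's exact definitions.

```python
import math
from statistics import fmean


def cvar(returns, alpha=0.05):
    """Mean of the worst `alpha` share of returns (negative for losses)."""
    ordered = sorted(returns)
    k = max(1, math.ceil(alpha * len(ordered)))
    return fmean(ordered[:k])


def drawdowns(returns):
    """Drawdowns from the running peak of the compounded curve, as positive fractions."""
    equity, peak, out = 1.0, 1.0, []
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        out.append(1.0 - equity / peak)
    return out


def cdar(returns, alpha=0.05):
    """Mean of the worst `alpha` share of drawdowns."""
    ordered = sorted(drawdowns(returns), reverse=True)
    k = max(1, math.ceil(alpha * len(ordered)))
    return fmean(ordered[:k])


def mad(returns):
    """Mean absolute deviation around the mean return."""
    mu = fmean(returns)
    return fmean([abs(r - mu) for r in returns])


def inverse_risk_weights(risks):
    """Split capital in proportion to 1 / |risk| (illustrative rule)."""
    inv = [1.0 / abs(r) for r in risks]
    total = sum(inv)
    return [x / total for x in inv]


calm = [0.01, -0.01, 0.005, -0.005] * 5     # small symmetric moves
crashy = [0.02, 0.02, 0.02, -0.10] * 5      # steady gains, occasional crash
cluster_weights = inverse_risk_weights([cvar(calm), cvar(crashy)])
print(cluster_weights)
```

With CVaR as the risk measure, the crash-prone cluster receives the smaller share, which matches the use case listed in the table.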

Portfolio Modes

| Mode | Constraint | Use Case |
| --- | --- | --- |
| long_only | wᵢ ≥ 0 ∀i | Mutual funds, ETFs, pensions, UCITS |
| long_short | wᵢ ∈ ℝ, Σwᵢ = 1 | Hedge funds, 130/30 strategies |
| market_neutral | Σwᵢ = 0 | Statistical arbitrage desks |
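The three constraints can be expressed as a small checker. This is a sketch only: `satisfies_mode` is a hypothetical helper, and it additionally assumes that long_only portfolios are fully invested (weights sum to 1), which the table does not state explicitly.

```python
import math


def satisfies_mode(weights, mode, tol=1e-9):
    """Check a weight vector against the constraint of a portfolio mode."""
    total = math.fsum(weights)
    if mode == "long_only":
        # Assumption: long-only portfolios are also fully invested.
        return all(w >= -tol for w in weights) and abs(total - 1.0) <= tol
    if mode == "long_short":
        return abs(total - 1.0) <= tol
    if mode == "market_neutral":
        return abs(total) <= tol
    raise ValueError(f"unknown mode: {mode!r}")


book_130_30 = [0.65, 0.65, -0.30]   # 130% long, 30% short, net 100%
print(satisfies_mode(book_130_30, "long_short"))
print(satisfies_mode(book_130_30, "long_only"))
print(satisfies_mode([0.5, -0.5], "market_neutral"))
```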

Dendrogram & Cluster Analysis

Canopy builds a full hierarchical clustering tree using 7 linkage methods (Ward, Single, Complete, Average, Weighted, Centroid, Median) with optional optimal leaf ordering (Bar-Joseph et al., 2001):

[Figure: Dendrogram]

The dendrogram reveals the correlation structure of the asset universe. Strongly correlated assets (e.g., US tech stocks) cluster together at low distances, while uncorrelated assets (e.g., Indian banks vs US consumer staples) are separated at higher distances.
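The clustering step can be reproduced directly with SciPy, which offers the same seven linkage names and an `optimal_ordering` flag. The sketch below builds two synthetic "sectors" and confirms that the tree separates them; the correlation distance d = sqrt(0.5·(1 − ρ)) and the Ward choice are assumptions for the example, not necessarily Canopy's defaults.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(2)
T = 500
factor_a, factor_b = rng.normal(size=(2, T))
# Assets 0-2 load on factor A, assets 3-5 on factor B.
returns = np.column_stack(
    [factor_a + 0.5 * rng.normal(size=T) for _ in range(3)]
    + [factor_b + 0.5 * rng.normal(size=T) for _ in range(3)]
)

corr = np.corrcoef(returns, rowvar=False)
dist = np.sqrt(np.clip(0.5 * (1.0 - corr), 0.0, None))
np.fill_diagonal(dist, 0.0)

# Condensed distances in, (n - 1) x 4 linkage matrix out.
tree = linkage(squareform(dist, checks=False), method="ward", optimal_ordering=True)
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

Passing `tree` to `scipy.cluster.hierarchy.dendrogram` would draw the corresponding plot.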


Risk Decomposition

Canopy decomposes portfolio risk to show each asset's contribution to total portfolio variance:

[Figure: Risk Contribution]

Equal Risk Contribution (the gold dashed line at 5% for N=20) is the theoretical target. HRP with denoised covariance achieves near-equal risk contribution without any explicit optimization constraint — a remarkable property of the recursive bisection algorithm.
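For reference, the standard decomposition behind such a chart is RCᵢ = wᵢ·(Σw)ᵢ / (wᵀΣw), which sums to 1 across assets. The sketch below is a generic NumPy version with a hypothetical function name, checked on a two-asset case where inverse-volatility weights give exactly equal contributions.

```python
import numpy as np


def risk_contributions(weights, cov):
    """Fraction of portfolio variance attributed to each asset."""
    weights = np.asarray(weights, dtype=float)
    marginal = cov @ weights                 # d(variance)/dw, up to a factor of 2
    total_var = float(weights @ marginal)
    return weights * marginal / total_var


# Two uncorrelated assets with volatilities 20% and 10%.
cov = np.diag([0.04, 0.01])
inv_vol = 1.0 / np.sqrt(np.diag(cov))
w = inv_vol / inv_vol.sum()                  # weights [1/3, 2/3]
rc = risk_contributions(w, cov)
print(rc)
```

For N assets, equal risk contribution corresponds to every entry being 1/N, which is the 5% line for N=20.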


📦 New in v3.0

DataLoader — Zero-Boilerplate Data Pipeline

No more yfinance boilerplate. One-line data loading with automatic cleaning, returns computation, and benchmark alignment.

from canopy.data import DataLoader

# Fetch Indian equities with Nifty 50 benchmark
returns, nifty = DataLoader.yfinance(
    ['RELIANCE.NS', 'TCS.NS', 'HDFCBANK.NS', 'INFY.NS'],
    start='2021-01-01',
    benchmark='^NSEI'
)

# Or from local files
returns = DataLoader.csv('institutional_prices.csv')
returns = DataLoader.parquet('bloomberg_feed.parquet')

📖 Full DataLoader API Reference →

PortfolioMetrics — Institutional Analytics

The metric calculations are kept separate from the reporting layer: standalone functions compute individual metrics, and the PortfolioMetrics class combines them into a comprehensive report.

from canopy.metrics import PortfolioMetrics

pm = PortfolioMetrics(returns, weights, benchmark=nifty)
print(pm.sharpe())           # Annualized Sharpe Ratio
print(pm.sortino())          # Sortino Ratio (downside-only vol)
print(pm.maxdrawdown())      # Maximum peak-to-trough decline
print(pm.calmar())           # Calmar Ratio (return / drawdown)
print(pm.cvar())             # Conditional Value-at-Risk
print(pm.informationratio()) # IR vs benchmark
print(pm.report())           # Full formatted report

📖 Full Metrics API Reference →
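For readers who want to see what these metrics compute, here is a standard-library sketch of three of them under common conventions: daily returns, 252 trading days per year, and a zero target for the downside deviation. These conventions and function names are assumptions; the values returned by PortfolioMetrics may use different ones.

```python
import math
from statistics import fmean, stdev

TRADING_DAYS = 252  # assumed annualization factor for daily data


def sharpe_ratio(returns, risk_free=0.0):
    """Annualized mean excess return over annualized volatility."""
    excess = [r - risk_free / TRADING_DAYS for r in returns]
    return fmean(excess) / stdev(excess) * math.sqrt(TRADING_DAYS)


def sortino_ratio(returns, target=0.0):
    """Like Sharpe, but penalizes only returns below `target`.

    Requires at least one return below the target.
    """
    downside = math.sqrt(fmean([min(r - target, 0.0) ** 2 for r in returns]))
    return (fmean(returns) - target) / downside * math.sqrt(TRADING_DAYS)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded curve (negative number)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


daily = [0.01, -0.01, 0.02, 0.0]
print(sharpe_ratio(daily), sortino_ratio(daily), max_drawdown(daily))
```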

BacktestEngine — Walk-Forward Backtesting

Production-grade rolling-window rebalancing engine. Supports daily, weekly, monthly, quarterly, and annual rebalance frequencies.

from canopy.backtest import BacktestEngine
from canopy.MasterCanopy import MasterCanopy

engine = BacktestEngine(
    optimizer=MasterCanopy(method='HRP', cov_estimator='ledoit_wolf'),
    frequency='monthly',
    lookback=252,  # 1 year estimation window
)
result = engine.run(returns)
print(result.summary())      # Sharpe, MaxDD, Turnover
print(result.equity)          # NAV equity curve

📖 Full Backtest API Reference →
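The walk-forward idea itself is simple to sketch: at each rebalance date, estimate weights from the trailing lookback window only, then apply them to the next out-of-sample block. The code below is a simplified stand-in, not BacktestEngine: it uses inverse-variance weights as a placeholder allocator, a fixed 21-row step as a proxy for monthly rebalancing, holds target weights constant within each block, and ignores transaction costs.

```python
import numpy as np


def inverse_variance_weights(window):
    """Placeholder allocator: weights proportional to 1 / sample variance."""
    inv = 1.0 / window.var(axis=0, ddof=1)
    return inv / inv.sum()


def walk_forward(returns, lookback=252, step=21, allocator=inverse_variance_weights):
    """Out-of-sample portfolio returns from rolling re-estimation."""
    if returns.shape[0] <= lookback:
        raise ValueError("need more rows than the lookback window")
    pnl = []
    for start in range(lookback, returns.shape[0], step):
        weights = allocator(returns[start - lookback:start])  # past data only
        block = returns[start:start + step]                   # unseen data
        pnl.append(block @ weights)
    return np.concatenate(pnl)


rng = np.random.default_rng(3)
returns = rng.normal(loc=0.0004, scale=0.01, size=(600, 4))  # synthetic daily returns
oos = walk_forward(returns, lookback=252, step=21)
equity = np.cumprod(1.0 + oos)
print(len(oos), round(float(equity[-1]), 4))
```

Because each block is scored with weights fitted strictly on earlier rows, changing future data cannot alter past out-of-sample results.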


Performance Benchmarks

Canopy has been extensively benchmarked on 20 global assets (US + India) across 5 years of daily data (2020-2025):

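| Method | Cov Estimator | Sharpe | Sortino | CVaR 95% | Max DD | Eff N | Speed |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HRP | Denoised | 0.83 | 0.95 | -2.27% | -30.5% | 16.9 | 11 ms |
| HRP | Ledoit-Wolf | 0.79 | 0.91 | -2.29% | -31.0% | 16.5 | 11 ms |
| HERC | LW + CVaR | 0.70 | 0.81 | -2.35% | -31.8% | 15.5 | 17 ms |
| HERC | LW + Variance | 0.72 | 0.84 | -2.25% | -30.1% | 15.2 | 17 ms |
| NCO | Ledoit-Wolf | 0.68 | 0.79 | -2.19% | -23.2% | 8.4 | 46 ms |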
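The Eff N column is not defined in this README. A common definition, assumed here, is the inverse Herfindahl index of the weights, 1 / Σwᵢ², which equals N for an equally weighted portfolio and 1 for a single-asset portfolio.

```python
def effective_n(weights):
    """Effective number of holdings: inverse of the sum of squared weights."""
    return 1.0 / sum(w * w for w in weights)


equal_20 = [1.0 / 20] * 20
concentrated = [0.70, 0.10, 0.10, 0.10]
print(effective_n(equal_20), effective_n(concentrated))
```

Under this reading, a lower Eff N for NCO than for HRP indicates a more concentrated allocation across the same 20 assets.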

Feature Summary

Feature Supported
HRP, HERC, NCO Allocation
4 Covariance Estimators (Sample, Ledoit-Wolf, Denoised, EWMA)
4 Risk Measures (Variance, CVaR, CDaR, MAD)
Correlation Matrix Detoning
Weight Constraints (min/max bounds)
3 Portfolio Modes (long_only, long_short, market_neutral)
Block Bootstrap Confidence Intervals
ISO 8601 Audit Trail + JSON Export
9 Interactive Plotly Dark-Theme Charts
7 Linkage Methods + Optimal Leaf Ordering

Project Principles & Design Decisions

  1. Fail Fast, Fail Loud: All inputs are validated at construction time. Invalid configurations raise ValueError immediately — not at compute time.

  2. Zero Matrix Inversion for HRP: HRP never inverts the covariance matrix. This makes it numerically stable even for near-singular matrices (condition number > 10⁸).

  3. Audit Everything: Every computation step is timestamped and logged. Export as JSON for compliance and reproducibility.

  4. Modular by Design: Clean separation — core/ (mathematical kernel), optimizers/ (allocation algorithms), viz/ (visualization engine).

  5. Method Chaining: Fluent API design: opt.cluster(returns).allocate() — clean, readable, Pythonic.
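Principle 2 is easy to see on a deliberately singular covariance matrix. In the sketch below, inverse-variance weights (the building block of HRP) need only the diagonal, while a minimum-variance solve needs the Tikhonov term λI, as in the NCO formula (Σ_k + λI)⁻¹ · 1, before it becomes well defined. The function names and the ridge value 1e-4 are illustrative assumptions.

```python
import numpy as np


def inverse_variance_weights(cov):
    """Uses only the diagonal, so singular matrices are not a problem."""
    inv = 1.0 / np.diag(cov)
    return inv / inv.sum()


def min_variance_weights(cov, ridge=0.0):
    """Normalized solution of (cov + ridge * I) x = 1."""
    n = cov.shape[0]
    raw = np.linalg.solve(cov + ridge * np.eye(n), np.ones(n))
    return raw / raw.sum()


# Two perfectly correlated assets plus one independent asset: rank 2, not 3.
cov = np.array([
    [0.04, 0.04, 0.00],
    [0.04, 0.04, 0.00],
    [0.00, 0.00, 0.01],
])
print("rank:", np.linalg.matrix_rank(cov))

try:
    np.linalg.solve(cov, np.ones(3))
    print("plain solve returned a value")
except np.linalg.LinAlgError as exc:
    print("plain solve failed:", exc)

w_ivp = inverse_variance_weights(cov)
w_ridge = min_variance_weights(cov, ridge=1e-4)
print(np.round(w_ivp, 4), np.round(w_ridge, 4))
```

The ridge solution splits the duplicated exposure evenly between the two identical assets instead of failing or producing arbitrary offsetting positions.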

canopy/
├── MasterCanopy.py              ← Facade (v2.3.0)
├── core/
│   ├── CovarianceEngine.py      ← Ledoit-Wolf, Denoised, EWMA, Detoning
│   └── ClusterEngine.py         ← 7 Linkage Methods, 4 Distance Metrics
├── optimizers/
│   ├── HRP.py                   ← Vectorized Recursive Bisection
│   ├── HERC.py                  ← 4 Risk Measures (Var, CVaR, CDaR, MAD)
│   └── NCO.py                   ← Tikhonov-Regularized Nested Optimization
├── viz/ChartEngine.py           ← 9 Interactive Plotly Charts
├── tests/test_canopy.py         ← 29 Tests (all passing)
└── docs/                        ← Sphinx + ReadTheDocs

🚀 Installation

Using pip

pip install canopy-optimizer

From source

git clone https://github.com/Anagatam/Canopy.git
cd Canopy
pip install -e .

Dependencies

numpy>=1.24
pandas>=2.0
scipy>=1.10
scikit-learn>=1.3
plotly>=5.18

Testing & Developer Setup

# Run the full test suite
python -m pytest tests/test_canopy.py -v

# Run with coverage
python -m pytest tests/test_canopy.py -v --cov=canopy

# Generate charts
make charts

# Full validation
make all

Current: 29/29 tests passing in 0.84 seconds.


🔮 Canopy Pro

Canopy (this repository) is our open-source edition, freely available under the Apache License 2.0.

Canopy Pro is our advanced, top-grade premium model featuring next-generation hierarchical allocation algorithms currently under active development. It extends the open-source core with proprietary mathematical methods designed for the most demanding institutional portfolios:

🧬 Advanced Hierarchical Allocation Algorithms

| Algorithm | Description | Advantage over Open Source |
| --- | --- | --- |
| HRCP (Hierarchical Risk Contribution Parity) | Exact risk budgeting within the hierarchical tree | True equal risk contribution, not approximate |
| HERC-DRL (Deep Reinforcement Learning HERC) | Dynamic cluster rebalancing via policy gradient | Adapts to regime changes in real-time |
| Spectral NCO | Spectral graph theory + NCO with persistent homology | Captures higher-order asset relationships |
| Bayesian HRP | Posterior-weighted hierarchical allocation | Incorporates prior views (Black-Litterman compatible) |

Feature Comparison

| Feature | Canopy (Open Source) | Canopy Pro (Coming Soon) |
| --- | --- | --- |
| Hierarchical Algorithms | 3 (HRP, HERC, NCO) | 7+ (HRCP, HERC-DRL, Spectral NCO, Bayesian HRP) |
| Covariance Estimators | 4 | 8+ (DCC-GARCH, Factor Models, Realized Kernels) |
| Risk Measures | 4 (Var, CVaR, CDaR, MAD) | 12+ (EVaR, RLVaR, EDaR, Tail Gini) |
| Portfolio Modes | 3 | 6+ (Risk Budgeting, Black-Litterman) |
| Real-Time Streaming | | |
| Enterprise Backtesting | | ✅ (Walk-forward, Monte Carlo) |
| Dedicated Support | Community | Priority SLA |
| Custom Integrations | | ✅ (Bloomberg, Refinitiv, MOSEK) |

Interested in Canopy Pro? 📩 Sign up for early access →

We will notify you as soon as Canopy Pro is available.


⚖️ License & Disclaimer

Apache License 2.0 — Copyright © 2026 Anagatam Technologies. All rights reserved.

Apache 2.0 provides patent protection for contributors and users. See the full LICENSE file.

[!CAUTION] This library is NOT investment advice. Canopy is a mathematical software library for educational and research purposes only. It does not provide financial recommendations, trading signals, or portfolio management services. Before making any investment decisions, consult a qualified, licensed financial professional. See our full DISCLAIMER for details on SEC, SEBI, and global regulatory compliance.

Built with precision for the institutional quantitative finance community.


Links: 📖 Documentation · 📦 PyPI · 🐛 Issues · 📋 Changelog · ⚖️ Disclaimer
