Parallel compressed cover tree (PCCT) library targeting JAX-based Vecchia GP workloads.
Project description
Covertreex
High-performance cover tree library for k-NN queries, optimized for Vecchia-style Gaussian process pipelines.
Features
- ~170x faster than GPBoost for residual correlation k-NN queries
- Hybrid Python/Numba + Rust implementation
- AVX2 SIMD optimized dot products in Rust backend
- Residual correlation metric with RBF and Matérn 5/2 kernels
- Hilbert curve ordering for cache-efficient tree construction
Installation
pip install covertreex
Quick Start
Basic Euclidean k-NN
import numpy as np
from covertreex import CoverTree
# Build tree from points
points = np.random.randn(10000, 3)
tree = CoverTree().fit(points)
# Query k nearest neighbors
query_points = np.random.randn(100, 3)
neighbors = tree.knn(query_points, k=10)
# With distances
neighbors, distances = tree.knn(query_points, k=10, return_distances=True)
Residual Correlation Metric (Vecchia GP)
For Gaussian process applications with Vecchia approximations:
import numpy as np
from covertreex import CoverTree, Runtime
from covertreex.metrics.residual import build_residual_backend, configure_residual_correlation
# Your spatial coordinates
coords = np.random.randn(10000, 3).astype(np.float32)
# Build residual backend (computes V-matrix from inducing points)
# V[i] = L_mm^{-1} @ K(x_i, inducing_points)
# p_diag = diag(K) - ||V||^2 (residual variance)
backend = build_residual_backend(
coords,
seed=42,
inducing_count=512, # Number of inducing points
variance=1.0, # Kernel variance
lengthscale=1.0, # Kernel lengthscale
kernel_type=0, # 0=RBF, 1=Matérn 5/2
)
# Configure and build tree
runtime = Runtime(metric="residual_correlation", engine="rust-hilbert")
ctx = runtime.activate()
configure_residual_correlation(backend, context=ctx)
# Query indices (residual metric uses point indices, not coordinates)
query_indices = np.arange(1000, dtype=np.int64)
tree = CoverTree(runtime).fit(query_indices.reshape(-1, 1))
neighbors = tree.knn(query_indices.reshape(-1, 1), k=50)
Engine Selection
from covertreex import CoverTree, Runtime
# Python/Numba reference implementation (full telemetry)
runtime = Runtime(engine="python-numba")
# Rust backend, natural order
runtime = Runtime(engine="rust-natural")
# Rust backend with Hilbert ordering (fastest)
runtime = Runtime(engine="rust-hilbert")
tree = CoverTree(runtime).fit(points)
Profile Presets
from covertreex import Runtime
# Load predefined configuration
runtime = Runtime.from_profile("residual-gold")
# With overrides
runtime = Runtime.from_profile("residual-gold", overrides=["seeds.global_seed=42"])
Available profiles: default, residual-gold, residual-fast, residual-audit, cpu-debug
Performance
Benchmark on AMD Ryzen 9 9950X (N=32k points, D=3, k=50 neighbors):
| Engine | Build Time | Query Throughput | vs GPBoost |
|---|---|---|---|
| python-numba | 7.2s | 42,000 q/s | 154x faster |
| rust-hilbert | 0.85s | 47,000 q/s | 170x faster |
API Reference
CoverTree
Main interface for building trees and running k-NN queries.
CoverTree(runtime: Runtime = Runtime())
.fit(points) -> tree # Build tree from points
.knn(queries, k=10) -> indices # Find k nearest neighbors
.knn(queries, k=10, return_distances=True) -> (indices, distances)
Runtime
Configuration for backend, metric, and engine selection.
Runtime(
engine="rust-hilbert", # Execution engine
metric="residual_correlation", # Distance metric
backend="numpy", # Array backend
precision="float32", # Float precision
)
Residual Backend
For Vecchia GP residual correlation:
from covertreex.metrics.residual import build_residual_backend
backend = build_residual_backend(
coords, # (n, d) spatial coordinates
seed=42, # Random seed
inducing_count=512, # Number of inducing points
variance=1.0, # Kernel variance σ²
lengthscale=1.0, # Kernel lengthscale ℓ
kernel_type=0, # 0=RBF, 1=Matérn 5/2
)
Development
# Install in development mode
pip install -e ".[dev]"
# Build Rust backend
maturin develop --release
# Run tests
pytest
# Lint
ruff check covertreex
CLI (Testing)
A CLI is included for benchmarking and testing:
python -m cli.pcct --help
python -m cli.pcct query --engine rust-hilbert --tree-points 32768 --k 50
License
Apache 2.0 — See LICENSE for details.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file covertreex-0.2.0-cp313-cp313-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: covertreex-0.2.0-cp313-cp313-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 785.3 kB
- Tags: CPython 3.13, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.1
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ad4230fcf9a7f651182d822efa0d10abc271c73fc55fd431ab6d69100659f82b
|
|
| MD5 |
e0d2d1bc7ae6e1525730494116dd8c3a
|
|
| BLAKE2b-256 |
e341de77535f392c62f2a3216c43899b70e4859a40e311ff5ad2f519a6e022e9
|