A Cython wrapper for Bregman Kd-trees
Project description
Bregman Approximate Nearest Neighbours for Python
This is a python wrapper for the (Decomposable) Bregman Approximate Nearest Neighbour(Bregman ANN, or BANN) package adapted by Hubert Wagner and Tuyen Pham. The original ANN package was written by David Mount and Sunil Arya. The original package uses Kd-trees to search for nearest neighbours in Euclidean space equipped with an $L^{p}$-norm. See David Mount's page for documention for the original ANN package.
For a Python wrapped version of the original ANN, see:
Bregman Divergences
Bregman divergences are measurements of generalized distances in a space. Unlike metrics, they are often assymmetric and do not globally satisfy the triangle inequality. Recently, these divergences have been useful in machine learning, with the most prominent example being the Kullback--Leibler divergence.
About
The BANN package currently uses Kd-trees for two primary computations:
- (approximate) $k$-nearest neighbour searches with decomposable Bregman divergences
- Bregman--Hausdorff divergence for decomposable Bregman divergences
Currently, this package supports the following divergences:
- Kullback--Leibler divergence (primal and dual)
- Itakura--Saito divergence (primal and dual)
- Squared Euclidean divergence
Additional decomposable divergences can be simply added to the source code, and passing a divergence from Python is a planned implementation.
Details
Let $D_F:\Omega\times\Omega\to [0,\infty]$ be a decomposable Bregman divergence and let $P = \{p_n\}^N_{n=1}$, $Q = \{q_m\}^M_{m=1}$ be subsets of $\Omega$.
Bregman $k$-nn search
For $q\in Q$, the Bregman $k$-nearest neighbour search returns the ordered list of indices $(x_1,x_2,\dots,x_k)$, such that $D_F(q\|p_{x_1})\leq D_F(q\|p_{x_2})\leq\cdots\leq D_F(q\|p_{x_k})$ and $D_F(q\|p_{x_k})\leq D_F(q\|p_{\ell})$ for all $\ell\notin\{x_{1},x_{2},\dots,x_{k}\}$. As Bregman divergences are rarely symmetric, we can reverse the arguments as necessary.
This package also supports $\epsilon$-approximate nearest neighbour searches, where the divergence to the reported nearest neighbour is at most $(1+\epsilon)$-times the divergence to the true nearest neighbour.
Further details of using Kd-trees with Bregman Divergences are discussed here.
Bregman—Hausdorff divergence
The Bregman—Hausdorff divergence generalizes the Bregman divergence between two vectors to the Bregman divergence between to sets of vectors. The Bregman—Hausdorff divergence was introduced by Pham, Dal Poz Kouřimská, and Wagner, where they also provide algorithms for its computation. Specifically, we compute $$H_{D_F}(P|Q) = \inf \{r\geq0 : P\subseteq\bigcup_{q\in Q}B_F(q;r)\}$$ and $$H_{D_F}' = \inf \{r\geq0 : P\subseteq\bigcup_{q\in Q}B'_F(q;r)\}$$ via the shell algorithm, where $B_F(q;r)=\{x\in \Omega,:, D_F(q|x)\leq r\}$ and $B'_F(q;r) = \{x\in \Omega,:,D_F(x|q)\leq r\}$. Note that the directions of computations for the Bregman--Hausdorff divergences are reversed compared to the directions for the nearest neighbour searches.
The Bregman—Hausdorff divergence and shell algorithm for computation are introduced here.
Further details
For further details and example uses, see the documentation.
Requirements
Python Version
BANN requires Python >=3.11.
Dependencies
Installation
Feedback
Bug reports, pull requests after forking, and other questions may be sent to the maintainer: tuyen.pham@ufl.edu
Copyright and License
See Copyright and License for copyright and license information.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file bann-0.0.3.tar.gz.
File metadata
- Download URL: bann-0.0.3.tar.gz
- Upload date:
- Size: 383.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e80fd881bc192e2529ff19cbf6e72431ae148724da0dd6c45280c0ff93c4f5d9
|
|
| MD5 |
ed07c82c5da8662fb0895711b0b020ac
|
|
| BLAKE2b-256 |
c053c79d7df10d20f66f859d8aa9dcf46fe634a2818d610392e53ea05ad5b023
|