Skip to main content

A Cython wrapper for Bregman ANN searches

Project description

Bregman Approximate Nearest Neighbours for Python

This is a python wrapper for the (Decomposable) Bregman Approximate Nearest Neighbour(Bregman ANN, or BANN) package adapted by Hubert Wagner and Tuyen Pham. The original ANN package was written by David Mount and Sunil Arya. The original package uses Kd-trees to search for nearest neighbours in Euclidean space equipped with an $L^{p}$-norm. See David Mount's page for documention for the original ANN package.

For a Python wrapped version of the original ANN, see:

Bregman Divergences

Bregman divergences are measurements of generalized distances in a space. Unlike metrics, they are often assymmetric and do not globally satisfy the triangle inequality. Recently, these divergences have been useful in machine learning, with the most prominent example being the Kullback--Leibler divergence.

About

The BANN package currently uses Kd-trees for two primary computations:

  • (approximate) $k$-nearest neighbour searches with decomposable Bregman divergences
  • Bregman--Hausdorff divergence for decomposable Bregman divergences

Currently, this package supports the following divergences:

  • Kullback--Leibler divergence (primal and dual)
  • Itakura--Saito divergence (primal and dual)
  • Squared Euclidean divergence

Additional decomposable divergences can be simply added to the source code, and passing a divergence from Python is a planned implementation.

Details

Let $D_F:\Omega\times\Omega\to [0,\infty]$ be a decomposable Bregman divergence and let $P = \{p_n\}^N_{n=1}$, $Q = \{q_m\}^M_{m=1}$ be subsets of $\Omega$.

Bregman $k$-nn search

For $q\in Q$, the Bregman $k$-nearest neighbour search returns the ordered list of indices $(x_1,x_2,\dots,x_k)$, such that $D_F(q\|p_{x_1})\leq D_F(q\|p_{x_2})\leq\cdots\leq D_F(q\|p_{x_k})$ and $D_F(q\|p_{x_k})\leq D_F(q\|p_{\ell})$ for all $\ell\notin\{x_{1},x_{2},\dots,x_{k}\}$. As Bregman divergences are rarely symmetric, we can reverse the arguments as necessary.

This package also supports $\epsilon$-approximate nearest neighbour searches, where the divergence to the reported nearest neighbour is at most $(1+\epsilon)$-times the divergence to the true nearest neighbour.

Further details of using Kd-trees with Bregman Divergences are discussed here.

Bregman—Hausdorff divergence

The Bregman—Hausdorff divergence generalizes the Bregman divergence between two vectors to the Bregman divergence between to sets of vectors. The Bregman—Hausdorff divergence was introduced by Pham, Dal Poz Kouřimská, and Wagner, where they also provide algorithms for its computation. Specifically, we compute $$H_{D_F}(P|Q) = \inf \{r\geq0 : P\subseteq\bigcup_{q\in Q}B_F(q;r)\}$$ and $$H_{D_F}' = \inf \{r\geq0 : P\subseteq\bigcup_{q\in Q}B'_F(q;r)\}$$ via the shell algorithm, where $B_F(q;r)=\{x\in \Omega,:, D_F(q|x)\leq r\}$ and $B'_F(q;r) = \{x\in \Omega,:,D_F(x|q)\leq r\}$. Note that the directions of computations for the Bregman--Hausdorff divergences are reversed compared to the directions for the nearest neighbour searches.

The Bregman—Hausdorff divergence and shell algorithm for computation are introduced here.

Further details

For further details and example uses, see the documentation.

Requirements

Python Version

BANN requires Python >=3.11.

Dependencies

Installation

Feedback

Bug reports, pull requests after forking, and other questions may be sent to the maintainer: tuyen.pham@ufl.edu

Copyright and License

See Copyright and License for copyright and license information.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bann-0.0.1.tar.gz (164.1 kB view details)

Uploaded Source

File details

Details for the file bann-0.0.1.tar.gz.

File metadata

  • Download URL: bann-0.0.1.tar.gz
  • Upload date:
  • Size: 164.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for bann-0.0.1.tar.gz
Algorithm Hash digest
SHA256 84d890f2a930b29ae382bec578d645be089ea20fe19cea261d23b113db5f62b9
MD5 a5f2fe47eca4f950ca4d91d5bbcd5089
BLAKE2b-256 c6ad411b97282ee420ad017cd2ab53d11569cc4c255a41eeb01b6a16161900fb

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page