Various integrations for ANN (Approximate Nearest Neighbours) libraries into scikit-learn.

Project description

sklearn-ann eases integration of approximate nearest neighbours libraries such as annoy, nmslib and faiss into your sklearn pipelines. It consists of:

  • Transformers conforming to the same interface as KNeighborsTransformer, which transform feature matrices into sparse distance matrices for use by any estimator that accepts sparse distance matrices. Many, but not all, of scikit-learn’s clustering and manifold learning algorithms can work with this kind of input.

  • RNN-DBSCAN: a variant of DBSCAN based on reverse nearest neighbours.
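The transformer-then-estimator pattern described in the first item can be sketched as follows. This is a minimal illustration rather than sklearn-ann's own code: it uses scikit-learn's built-in KNeighborsTransformer as a stand-in, on the assumption that an ANN-backed transformer with the same interface would be dropped into the same pipeline step. The toy data and the values of n_neighbors, eps and min_samples are arbitrary choices made for the example.

```python
from scipy import sparse
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsTransformer
from sklearn.pipeline import make_pipeline

# Toy data (an assumption for this example): three blobs in 10 dimensions.
X, _ = make_blobs(
    n_samples=300, centers=3, n_features=10, cluster_std=0.5, random_state=0
)

# Step 1: turn the feature matrix into a sparse neighbour-distance matrix.
# Stand-in: scikit-learn's exact KNeighborsTransformer. A transformer that
# follows the same interface would take its place here.
knn = KNeighborsTransformer(n_neighbors=10, mode="distance")

# Step 2: an estimator that accepts a precomputed sparse distance matrix.
dbscan = DBSCAN(eps=3.0, min_samples=5, metric="precomputed")

# Chain both steps; the pipeline passes the sparse matrix from step 1 to step 2.
pipeline = make_pipeline(knn, dbscan)
labels = pipeline.fit_predict(X)

# Inspect the intermediate representation produced by step 1 on its own.
graph = KNeighborsTransformer(n_neighbors=10, mode="distance").fit_transform(X)
print(type(graph).__name__, graph.shape, labels.shape)
```

Because the estimator only ever sees the sparse distance matrix, changing how neighbours are found means changing only the first pipeline step; the clustering step stays as it is.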

Installation

To install the latest release from PyPI, run:

pip install sklearn-ann

To install the latest development version from GitHub, run:

pip install git+https://github.com/scikit-learn-contrib/sklearn-ann.git#egg=sklearn-ann

Why? When do I want this?

The main scenario in which this is needed is performing clustering or manifold learning on high-dimensional data. The reason is that the only neighbourhood algorithms currently built into scikit-learn are essentially the standard tree approaches to space partitioning: the ball tree and the K-D tree. These do not perform competitively in high-dimensional spaces.

Development

This project is managed using Hatch and pre-commit. To get started, run pre-commit install and hatch env create. Run all commands using hatch run python <command>, which ensures the environment is kept up to date. Once installed, pre-commit runs on every git commit.

Consult pyproject.toml for which dependency groups and extras exist, and the Hatch help or user guide for more info on what they are.

Download files

Source Distribution

sklearn_ann-0.1.4.tar.gz (18.6 kB view details)

Uploaded Source

Built Distribution

sklearn_ann-0.1.4-py3-none-any.whl (12.7 kB view details)

Uploaded Python 3

File details

Details for the file sklearn_ann-0.1.4.tar.gz.

File metadata

  • Download URL: sklearn_ann-0.1.4.tar.gz
  • Size: 18.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sklearn_ann-0.1.4.tar.gz
Algorithm Hash digest
SHA256 60e7c91de7a6bda1afe5a9082d4bc441c641291052d48242e5fb991d0acf81c0
MD5 877e0a8c0c502c1f259179c5e0f7a1e2
BLAKE2b-256 8e4107c7d40542513ac25014bb1bafe3f137adba717bf886603cbeca6521c007

Provenance

The following attestation bundles were made for sklearn_ann-0.1.4.tar.gz:

Publisher: publish.yml on scikit-learn-contrib/sklearn-ann

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file sklearn_ann-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: sklearn_ann-0.1.4-py3-none-any.whl
  • Size: 12.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sklearn_ann-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 c449454d1472f835ba223fcc2936a72d77e31ae31c5c1da4aca21c4e13dcad0f
MD5 b315c092f1795e1751905e09073c77b7
BLAKE2b-256 ba91b2f217d3570990d95c2483a8748658c5a22ac40609264e0ca32552dde0eb

Provenance

The following attestation bundles were made for sklearn_ann-0.1.4-py3-none-any.whl:

Publisher: publish.yml on scikit-learn-contrib/sklearn-ann

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
