A clean and easy interface for nearest-neighbors lookup

Simple Neighbors

Simple Neighbors is a clean and easy interface for performing nearest-neighbor lookups on items from a corpus. For example, here’s how to find the colors most similar to a given color in the xkcd colors list:

>>> from simpleneighbors import SimpleNeighbors
>>> import json
>>> color_data = json.load(open('xkcd.json'))['colors']
>>> hex2int = lambda s: [int(s[n:n+2], 16) for n in range(1,7,2)]
>>> colors = [(item['color'], hex2int(item['hex'])) for item in color_data]
>>> sim = SimpleNeighbors(3)
>>> sim.feed(colors)
>>> sim.build()
>>> list(sim.neighbors('pink', 5))
['pink', 'bubblegum pink', 'pale magenta', 'dark mauve', 'light plum']

Read the documentation here: https://simpleneighbors.readthedocs.org.

Approximate nearest-neighbor lookups are a quick way to find the items in your data set that are closest to (or most similar to) any other item in your data, or to an arbitrary point in the space that your data defines. Your data items might be colors in an (R, G, B) space, sprites in an (X, Y) space, or word vectors in a 300-dimensional space.

You could always perform pairwise distance calculations to find nearest neighbors in your data, but for data of any appreciable size and complexity, this kind of calculation is unbearably slow. This library uses Annoy behind the scenes for approximate nearest-neighbor lookups, which are ultimately a little less accurate than pairwise calculations but much, much faster.
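For comparison, the exact approach mentioned above can be sketched in a few lines of plain Python. This is only an illustration of brute-force distance calculation, not how Simple Neighbors or Annoy work internally; the function name brute_force_neighbors and the sample colors are made up for this example, and math.dist assumes Python 3.8 or later.

```python
import math

def brute_force_neighbors(items, query, n=5):
    """Return the labels of the n items whose vectors are closest to query.

    items is a list of (label, vector) pairs. Every vector is compared
    against the query, so each lookup costs one distance calculation
    per item in the data set.
    """
    ranked = sorted(items, key=lambda pair: math.dist(pair[1], query))
    return [label for label, _ in ranked[:n]]

# Hypothetical sample data: a few named colors as (R, G, B) vectors.
colors = [
    ("red", (255, 0, 0)),
    ("green", (0, 255, 0)),
    ("blue", (0, 0, 255)),
    ("pink", (255, 192, 203)),
    ("maroon", (128, 0, 0)),
]

print(brute_force_neighbors(colors, (250, 10, 10), n=2))
# prints ['red', 'maroon']
```

Each lookup here scans the whole list, so finding neighbors for every item means comparing every pair. That is fine for five colors but is the cost that an approximate index is meant to avoid on larger data.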

The library also keeps track of your data, sparing you the extra step of mapping each item in your data to its integer index in Annoy (at the potential cost of some redundancy in data storage, depending on your application).
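To make that bookkeeping concrete, here is a rough sketch of what tracking the item-to-index mapping involves. Both classes are hypothetical and written for this example: IntegerIndex is a brute-force stand-in for an index that only understands integer ids, and LabeledIndex is not the actual Simple Neighbors implementation. The sketch also assumes Python 3.8 or later for math.dist.

```python
import math

class IntegerIndex:
    """Hypothetical stand-in for an index that only knows integer ids."""

    def __init__(self):
        self.vectors = {}  # integer id -> vector

    def add_item(self, i, vector):
        self.vectors[i] = vector

    def nearest_ids(self, vector, n):
        # Brute-force ranking; a real index would be approximate and faster.
        ids = sorted(self.vectors,
                     key=lambda i: math.dist(self.vectors[i], vector))
        return ids[:n]

class LabeledIndex:
    """Hypothetical wrapper that maps items to integer ids and back."""

    def __init__(self):
        self.index = IntegerIndex()
        self.corpus = []  # integer id -> item
        self.ids = {}     # item -> integer id

    def add_one(self, item, vector):
        i = len(self.corpus)  # next free integer id
        self.corpus.append(item)
        self.ids[item] = i
        self.index.add_item(i, vector)

    def neighbors(self, item, n):
        vector = self.index.vectors[self.ids[item]]
        return [self.corpus[i] for i in self.index.nearest_ids(vector, n)]

# Hypothetical sprites in an (X, Y) space.
sprites = LabeledIndex()
sprites.add_one("player", (0, 0))
sprites.add_one("coin", (1, 1))
sprites.add_one("enemy", (5, 5))

print(sprites.neighbors("player", 2))
# prints ['player', 'coin']
```

The corpus list and the ids dictionary are the redundancy mentioned above: the items are stored alongside the index so that lookups can accept and return items rather than integers.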

I made Simple Neighbors because I use Annoy all the time and found myself writing and rewriting the same bits of wrapper code over and over again. I wanted to hide a little bit of the complexity of using Annoy to make it easier to build small prototypes and teach workshops using nearest-neighbor lookups.

Installation

Install with pip like so:

pip install simpleneighbors

You can also download the source code and install manually:

python setup.py install

History

0.0.1 (2018-07-13)

  • Initial release.
