closest_pairs finds the closest pairs of points in a dataset

Closest Pairs 📐

Find the closest pairs in an array.

Getting Started

pip install closest_pairs

or install from source:

git clone https://github.com/justinshenk/closest-pairs
cd closest-pairs
pip install .

How to use

import closest_pairs

# X is an n x m numpy array
pairs, distances = closest_pairs.solve(X, n=1)

You can specify how many pairs you want to identify with n.
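To make the meaning of n concrete, here is a brute-force sketch that uses only the standard library. It is not the algorithm closest_pairs uses: `brute_force_pairs` is a hypothetical helper written for illustration, and it returns index pairs as tuples, which may differ from the library's return format. It can serve as a cross-check on small inputs.

```python
from itertools import combinations
from math import dist  # available in Python 3.8+


def brute_force_pairs(points, n=1):
    """Return the n closest (i, j) index pairs and their distances.

    Hypothetical helper for illustration: checks every pair, so it is
    O(len(points) ** 2) and only suitable for small inputs.
    """
    scored = sorted(
        (dist(points[i], points[j]), (i, j))
        for i, j in combinations(range(len(points)), 2)
    )
    pairs = [pair for _, pair in scored[:n]]
    distances = [d for d, _ in scored[:n]]
    return pairs, distances


points = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0), (10.0, 10.0)]
pairs, distances = brute_force_pairs(points, n=2)
print(pairs)      # index pairs, closest first
print(distances)  # matching distances
```

With n=2 the sketch returns the two smallest pairwise distances and the index pairs that produce them, ordered from closest to farthest.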

Example

import closest_pairs
import numpy as np
import matplotlib.pyplot as plt

# Create dataset
X = np.random.random((100,2))
pairs, distance = closest_pairs.solve(X, n=1)

# Plot points
x, y = X[:, 0], X[:, 1]
fig, ax = plt.subplots()
ax.scatter(x, y)

# Label every point with its index; indices that appear in pairs are red
for i in range(len(X)):
    if i in pairs:
        ax.annotate(str(i), (x[i], y[i]), color='red')
    else:
        ax.annotate(str(i), (x[i], y[i]))

plt.show()

Check pairs:

In [10]: pairs
Out[10]:
array([[[ 7],
        [16]],

       [[96],
        [50]]])

Output: example_plot

Caveats

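closest_pairs uses PCA to reduce your data to two dimensions for faster processing, so the pairs it reports are the closest in the projected space and may differ from the closest pairs in the original space.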

When n > 1, it also removes the first point of each pair it finds. In rare cases, when the data is highly overlapping, this leads to false negatives.
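The false-negative caveat can be illustrated with a small standard-library sketch. It assumes that "removes the first point" means the first-listed point of a found pair is dropped before the next search; the library's actual procedure may differ. `exact_pairs` and `greedy_pairs` are hypothetical helpers, not part of closest_pairs.

```python
from itertools import combinations
from math import dist  # available in Python 3.8+


def pair_distance(points, pair):
    """Euclidean distance between the two points indexed by pair."""
    i, j = pair
    return dist(points[i], points[j])


def exact_pairs(points, n):
    """Rank every pair by distance and keep the n closest (brute force)."""
    all_pairs = combinations(range(len(points)), 2)
    return sorted(all_pairs, key=lambda p: pair_distance(points, p))[:n]


def greedy_pairs(points, n):
    """Assumed behavior: find the closest pair, drop its first point, repeat."""
    remaining = list(range(len(points)))
    found = []
    for _ in range(n):
        if len(remaining) < 2:
            break
        best = min(combinations(remaining, 2),
                   key=lambda p: pair_distance(points, p))
        found.append(best)
        remaining.remove(best[0])  # the first point of the pair is removed
    return found


# Point 1 belongs to both of the two closest pairs: (1, 2) and (0, 1).
points = [(0.0, 0.0), (1.0, 0.0), (1.1, 0.0), (5.0, 0.0)]
exact = exact_pairs(points, n=2)
greedy = greedy_pairs(points, n=2)
print(exact)
print(greedy)
```

Here the true second-closest pair (0, 1) shares point 1 with the closest pair (1, 2). Once point 1 is removed, the greedy search can no longer find (0, 1) and reports (0, 2) instead, which is the kind of false negative described above.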

Credit and Explanation

Python code modified from Andriy Lazorenko, packaged and made useful for >2 features by Justin Shenk.
