FNNTW: Fastest Nearest Neighbor (in the) West. A fast kdtree/kNN library.

FNNTW: Fastest Nearest Neighbor (in the) West

Fastest Nearest Neighbor (in the) West (FNNTW) is an in-development kD-tree library that aims to be one of the most performant parallel kNN libraries available, if not the most performant.

Basic Usage

import numpy as np
import pyfnntw
from scipy.spatial import cKDTree as Tree

ND = 10**5
NQ = 10**6

# Get some data and queries
data = np.random.uniform(size=(ND, 3))
query = np.random.uniform(size=(NQ, 3))

# Build and query the pyfnntw tree
tree1 = pyfnntw.Tree(data, 32, 1)
(_, ids1) = tree1.query(query)

# Build and query the scipy tree
tree2 = Tree(data, 32)
(_, ids2) = tree2.query(query)

if np.all(ids1 == ids2):
    print("Success")
else:
    print("Failure")

Several key components of the kD-tree build and query process account for its performance.

1. Building The Tree

Many kD-Tree build implementations are not parallel despite being very easy to parallelize. Here we discuss parallelization strategies for individual subtrees and across subtrees.

a. Using Quickselect

The split point is found with quickselect, in the form of select_nth_unstable_by from Rust's core::slice, rather than with an average or some other median algorithm. Quickselect partitions the slice as it runs, so the left and right subsets come for free from the same O(N) pass instead of costing another O(N) comparisons to fill the left and right bins. The build is therefore still O(N log(N)), but each split avoids a whole O(N) operation that many other libraries perform. For trees with >=100,000 nodes, a homemade parallel approximate median finder is used to find the split point for the subtrees. This speeds up the building of large trees tremendously, as ≈95% of a sequential tree build is spent finding medians.
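
The partitioning side effect described above can be illustrated with a small pure-Python quickselect. This is only a sketch of the idea, not FNNTW's Rust code: the function name select_nth and the list-of-lists point layout are assumptions made for illustration, and the only property it is meant to share with select_nth_unstable_by is that the slice ends up partitioned around the selected element.

```python
import random


def select_nth(points, k, axis):
    """Reorder `points` in place so that points[k] is the element that would
    sit at index k if the list were sorted along `axis`. Afterwards every
    point before index k is <= it along `axis`, and every point after is >=."""
    lo, hi = 0, len(points)
    while hi - lo > 1:
        # Move a random pivot to the end of the active range.
        p = random.randrange(lo, hi)
        points[p], points[hi - 1] = points[hi - 1], points[p]
        pivot = points[hi - 1][axis]
        # Lomuto partition: smaller elements are swapped to the front.
        store = lo
        for i in range(lo, hi - 1):
            if points[i][axis] < pivot:
                points[i], points[store] = points[store], points[i]
                store += 1
        points[store], points[hi - 1] = points[hi - 1], points[store]
        if store == k:
            return
        # Keep only the side of the partition that contains index k.
        if k < store:
            hi = store
        else:
            lo = store + 1


random.seed(0)
pts = [[random.random() for _ in range(3)] for _ in range(101)]
mid = len(pts) // 2
select_nth(pts, mid, axis=0)

# The two bins fall out of the selection itself; no second pass is needed.
left, median, right = pts[:mid], pts[mid], pts[mid + 1:]
```

After the call, left and right are exactly the two subsets a kD-tree split needs, which is the saving the paragraph above describes.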

b. Parallel build

Every subtree of a kD-tree is an independent kD-tree, so at each splitting one could build each subtree in parallel up to some minimum subtree size. This is done here with the user-specified par_split_level, which is the tree depth at which the parallelism begins. See note in Benchmark Against Other Codes about recommended values.
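
The independence of subtrees can be sketched in a few lines of Python. Everything here is hypothetical and chosen for illustration: the build function, its nested-dict tree layout, the leafsize default, and in particular the reading of par_split_level as "hand one child to a new thread at every split shallower than this depth". It sorts at each split for brevity rather than using quickselect, and Python threads will not give a real speedup for this workload; the point is only the structure of the recursion.

```python
import random
import threading


def build(points, depth=0, par_split_level=2, leafsize=4, dim=3):
    """Build a toy kD-tree as nested dicts. While depth < par_split_level,
    the left subtree is built in a new thread and the right subtree in the
    current one (an assumed interpretation of par_split_level)."""
    if len(points) <= leafsize:
        return {"leaf": list(points)}
    axis = depth % dim
    points = sorted(points, key=lambda p: p[axis])  # sort for brevity only
    mid = len(points) // 2
    left_pts, right_pts = points[:mid], points[mid:]
    node = {"axis": axis, "split": points[mid][axis]}
    kwargs = dict(par_split_level=par_split_level, leafsize=leafsize, dim=dim)

    if depth < par_split_level:
        result = {}

        def work():
            result["left"] = build(left_pts, depth + 1, **kwargs)

        worker = threading.Thread(target=work)
        worker.start()
        node["right"] = build(right_pts, depth + 1, **kwargs)
        worker.join()
        node["left"] = result["left"]
    else:
        node["left"] = build(left_pts, depth + 1, **kwargs)
        node["right"] = build(right_pts, depth + 1, **kwargs)
    return node


def count_points(node):
    """Total number of points stored in the leaves below `node`."""
    if "leaf" in node:
        return len(node["leaf"])
    return count_points(node["left"]) + count_points(node["right"])


random.seed(1)
pts = [tuple(random.random() for _ in range(3)) for _ in range(1000)]
serial = build(pts, par_split_level=0)
parallel = build(pts, par_split_level=2)
```

Under this reading, a par_split_level of L gives up to 2**L subtrees being built concurrently, and the serial and parallel builds produce the same tree because the subtrees never share data.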

2. Unsafe Accesses

Because we know the shape of all arrays (i.e. the dimension of the tree) at compile time, and we know the tree size and topology post-build at run time, the unsafe methods get_unchecked and get_unchecked_mut are used liberally throughout the code. This means virtually no bounds checks are done.

3. Allocators

We tested many allocators, including the default allocator, jemalloc, tcmalloc, snmalloc, mimalloc, and rpmalloc, and found the best performance with tcmalloc.

Benchmark Against Other Codes

This library is intended to be used by the author to calculate summary statistics in cosmology. As such, the benchmark parameters are close to those that would be used in analyzing the output of a cosmological simulation. Such analyses often use many subsamples or simulation boxes, so the combined build + query time matters because many different trees may be constructed. We use:

A mock dataset of 100,000 uniform random points in the unit cube.
A query set of 1,000,000 uniform random points in the unit cube.

Over 100 realizations of the datasets and query points, the following results are obtained for the average build (serial) and 1NN query times on an AMD EPYC 7502P using 48 threads. The results are sorted by the combined build and query time.

Code	Build (ms)	Query (ms)	Total (ms)
FNNTW	12	28	39
pykdtree (python)	12	36	48
nabo-rs (rust)	25	30	55
Scipy's cKDTree (python)	31	47	78
kiddo (rust)	26	84	110
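
For readers who want to time their own setup, a build-plus-query measurement of the kind tabulated above might look like the sketch below. It is not the harness used to produce these numbers: the function time_build_and_query, its trials and warmup parameters, and the sorted-list stand-in workload are all assumptions, and it reuses one dataset instead of drawing a fresh realization per trial.

```python
import bisect
from time import perf_counter


def time_build_and_query(build_fn, query_fn, trials=10, warmup=5):
    """Return mean build, query, and total wall-clock times in milliseconds.
    `build_fn()` must return a tree-like object; `query_fn(tree)` queries it."""
    build_ms, query_ms = [], []
    for i in range(warmup + trials):
        t0 = perf_counter()
        tree = build_fn()
        t1 = perf_counter()
        query_fn(tree)
        t2 = perf_counter()
        if i >= warmup:  # discard warm-up iterations
            build_ms.append((t1 - t0) * 1e3)
            query_ms.append((t2 - t1) * 1e3)
    mean_build = sum(build_ms) / trials
    mean_query = sum(query_ms) / trials
    return {"build": mean_build, "query": mean_query,
            "total": mean_build + mean_query}


# Stand-in workload: "build" sorts a list, "query" does one bisect lookup.
data = [((i * 7919) % 1000) / 1000 for i in range(1000)]
stats = time_build_and_query(
    build_fn=lambda: sorted(data),
    query_fn=lambda tree: bisect.bisect_left(tree, 0.5),
)
```

To compare libraries, build_fn and query_fn would wrap each library's own tree constructor and query call on the same data and query arrays.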

With FNNTW's parallel build capability, the build time can go as low as 5 ms on the AMD EPYC 7502P (at par_split_level = 2), and under 5 ms at single precision. Since the overhead of the parallelism and atomic operations slows down the build when the number of datapoints is small, both a non-parallel build and a parallel build are available, via Tree::new(..) and Tree::new_parallel(..) respectively. The latter takes the aforementioned parameter par_split_level, which is the split depth after which the parallelism stops. Although for our applications of O(1e5) points we see the biggest improvement for par_split_level = 2, we expect that the optimal par_split_level will increase with the size of the dataset. For tree sizes of O(1e8) points, for example, we see peak performance at par_split_level = 4 (16 threads).

Release history

0.4.1 (Mar 6, 2023, this version)
0.4.0 (Mar 4, 2023)
0.2.2 (Dec 18, 2022)
0.1.3 (Aug 21, 2022)

