Clustered Learning of Approximate Manifolds

CLAM: Clustered Learning of Approximate Manifolds (v0.22.3)

CLAM is a Rust/Python library for learning approximate manifolds from data. It is designed to be fast, memory-efficient, easy to use, and scalable for big data applications.

CLAM provides utilities for fast search (CAKES) and anomaly detection (CHAODA).

As of writing this document, the project is still in a pre-1.0 state. This means that the API is not yet stable and breaking changes may occur frequently.

Installation

> python3 -m pip install "abd_clam==0.22.3"

Usage

from abd_clam.search import CAKES
from abd_clam.utils import synthetic_data

# Get the data.
data, _ = synthetic_data.bullseye()
# data is a numpy.ndarray in this case, but it could just as easily be a
# numpy.memmap if your data do not fit in RAM. We used numpy memmaps for the
# research, though they impose file-IO costs.

model = CAKES(data, 'euclidean')
# The CAKES class provides the functionality described in our
# [CHESS paper](https://arxiv.org/abs/1908.08551).

model.build(max_depth=50)
# Build the search tree to a depth of 50.
# This method can be called again with a higher depth, if needed.

query, radius, k = data[0], 0.5, 10

rnn_results = model.rnn_search(query, radius)
# This is how we perform ranged nearest neighbors search with radius 0.5 around
# the query.

knn_results = model.knn_search(query, k)
# This is how we perform k-nearest neighbors search for the 10 nearest neighbors
# of the query.

# The results are returned as a dictionary whose keys are indices into the data
# array and whose values are the distances from those points to the query.
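
When trying out the search methods, it can help to have a linear-scan baseline to compare against. The following is a minimal sketch of one, using only numpy. It assumes Euclidean distance and returns results in the same index-to-distance dictionary shape described above. The helpers `brute_force_rnn` and `brute_force_knn` are hypothetical names for illustration and are not part of abd_clam, and the random data stands in for `synthetic_data.bullseye()`.

```python
from typing import Dict

import numpy as np


def brute_force_rnn(data: np.ndarray, query: np.ndarray, radius: float) -> Dict[int, float]:
    """Return {index: distance} for every point within `radius` of `query`."""
    distances = np.linalg.norm(data - query, axis=1)
    hits = np.flatnonzero(distances <= radius)
    return {int(i): float(distances[i]) for i in hits}


def brute_force_knn(data: np.ndarray, query: np.ndarray, k: int) -> Dict[int, float]:
    """Return {index: distance} for the `k` points closest to `query`."""
    distances = np.linalg.norm(data - query, axis=1)
    k = min(k, len(data))
    # A stable sort keeps the lower index first when two distances tie.
    nearest = np.argsort(distances, kind="stable")[:k]
    return {int(i): float(distances[i]) for i in nearest}


# Stand-in data (an assumption; any 2-D float array works here).
rng = np.random.default_rng(42)
data = rng.normal(size=(1000, 2))
query, radius, k = data[0], 0.5, 10

rnn_expected = brute_force_rnn(data, query, radius)
knn_expected = brute_force_knn(data, query, k)

# Rank the hits from nearest to farthest.
ranked = sorted(knn_expected.items(), key=lambda item: item[1])
for index, distance in ranked[:3]:
    print(f"index={index} distance={distance:.4f}")
```

The dictionaries produced here can be compared key-by-key with `rnn_results` and `knn_results` from the usage example. The linear scan costs one distance computation per point, so it is only practical as a check on small datasets.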

License

MIT

Citation

TODO
