Clustered Hierarchical Entropy-Scaling Search

Clustered Hierarchical Entropy-Scaling Search of Astrophysical and Biological Data

CHESS is a search algorithm for large data sets whose data exhibit certain geometric properties. The paper is available on the arXiv.

We have extended CHESS to perform Manifold Learning and Anomaly Detection. We are working on adding Dimensionality Reduction and Visualization abilities, and on 3-d Object Recognition from point clouds. One of the major problems with this extension is that we need a new name. Stay tuned.

Installation

python3 -m pip install CHESS-python

Usage

import numpy as np

from chess.datasets import bullseye
from chess.manifold import Manifold
from chess import criterion

# Get the data.
data, _ = bullseye()
# data is a numpy.ndarray here, but it could just as easily be a numpy.memmap if your data cannot fit in RAM.
# We used memmaps for the research, though they do impose file I/O costs.

manifold = Manifold(data=data, metric='euclidean')
# Any metric allowed by scipy's cdist function is allowed in Manifold.
# You can also define your own distance function. It will work so long as scipy allows it.

manifold.build(criterion.MaxDepth(20), criterion.MinRadius(0.25))
# Manifold.build can optionally take any number of early stopping criteria.
# chess.criterion defines some criteria that we have used in research.
# You are free to define your own.
# Take a look at chess/criterion.py for hints of how to define custom criteria.

# A sample rho-nearest neighbors search query
query, radius = data[0], 0.05
results = manifold.find_points(point=query, radius=radius)
# results is a dictionary that maps the index of each hit in data to the distance from the query to that hit.

# A sample k-nearest neighbors search query
results = manifold.find_knn(point=query, k=25)
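
When trying out the two queries above, it can help to have a brute-force baseline to compare results against. The sketch below does not use CHESS at all: it computes every distance with scipy's cdist and returns a dictionary mapping index to distance, which is the result shape described above. The function names brute_force_rnn and brute_force_knn and the random stand-in data are made up for illustration, and treating points exactly at the radius as hits is an assumption, not something the CHESS docs confirm.

```python
import numpy as np
from scipy.spatial.distance import cdist


def brute_force_rnn(data, query, radius, metric='euclidean'):
    """Return {index: distance} for every point within radius of query."""
    distances = cdist(query[np.newaxis, :], data, metric=metric)[0]
    # Assumption: points exactly at the radius count as hits.
    hits = np.flatnonzero(distances <= radius)
    return {int(i): float(distances[i]) for i in hits}


def brute_force_knn(data, query, k, metric='euclidean'):
    """Return {index: distance} for the k points closest to query."""
    distances = cdist(query[np.newaxis, :], data, metric=metric)[0]
    nearest = np.argsort(distances)[:k]
    return {int(i): float(distances[i]) for i in nearest}


rng = np.random.default_rng(42)
points = rng.random((1000, 2))  # stand-in data, not the bullseye set

rnn_hits = brute_force_rnn(points, points[0], radius=0.05)
knn_hits = brute_force_knn(points, points[0], k=25)
```

This scans every point for every query, so it is only practical on small samples, but on such samples it gives a reference answer to check find_points and find_knn against.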

chess.Manifold relies on the Graph and Cluster classes. You can import these and work with them directly if you so choose. We have written good docs for each class and method. Go crazy.
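
The usage notes above say that a custom distance function works as long as scipy allows it. As a rough illustration of what scipy's cdist accepts, here is a minimal sketch of a callable metric: a function that takes two 1-d vectors and returns a single number. The function scaled_manhattan and the sample arrays are hypothetical, and the sketch only exercises cdist itself; passing the function as metric to Manifold is not shown or verified here.

```python
import numpy as np
from scipy.spatial.distance import cdist


def scaled_manhattan(u, v):
    """Hypothetical custom distance: L1 distance divided by the dimension."""
    return float(np.abs(u - v).sum() / len(u))


left = np.array([[0.0, 0.0], [1.0, 1.0]])
right = np.array([[1.0, 0.0], [3.0, 5.0]])

# cdist calls the function once for every (row of left, row of right) pair.
custom = cdist(left, right, metric=scaled_manhattan)

# The same values via the built-in 'cityblock' metric, for comparison.
builtin = cdist(left, right, metric='cityblock') / left.shape[1]
```

A Python callable is invoked once per pair of points, so it will generally be much slower than scipy's built-in metric names; prefer a built-in name when one fits.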

Contributing

Pull requests and bug reports are welcome. For major changes, please first open an issue to discuss what you would like to change.

License

MIT
