Modular methods for stochastic clustering

Project description

stoclust is a package of modularized methods for stochastic and ensemble clustering techniques.

By modular, I mean that few methods in this package act as a single pipeline for clustering a dataset; rather, each method forms a unit of what might be a larger clustering routine.

These modular units are designed to be compatible with general clustering methods from other packages, like scipy.cluster or sklearn.cluster. However, we also provide specific methods for implementing clustering algorithms whose underlying mathematics is rooted in stochastic analysis and dynamics. Additionally, one can add a stochastic twist to any clustering method by using ensemble clustering, which applies the method to many randomly perturbed copies of the data to probe the stability and robustness of the clustering results.
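To make the idea of ensemble clustering concrete, here is a minimal sketch written with numpy only. It does not use the stoclust API: the helper names two_means and co_association, the noise level and the toy data are all assumptions made for illustration. The sketch perturbs a dataset with Gaussian noise many times, clusters each noisy copy, and records how often each pair of points lands in the same cluster.

```python
import numpy as np

def two_means(points, n_iter=20):
    """Tiny 2-means, used only as a stand-in base clusterer (hypothetical helper)."""
    # Deterministic start: the point farthest from the mean, then the point farthest from it.
    first = points[np.argmax(np.linalg.norm(points - points.mean(axis=0), axis=1))]
    second = points[np.argmax(np.linalg.norm(points - first, axis=1))]
    centers = np.stack([first, second])
    for _ in range(n_iter):
        # Assign every point to its nearest center, then move each center to its cluster mean.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = points[labels == k].mean(axis=0)
    return labels

def co_association(data, clusterer, n_samples=50, noise=0.1, seed=0):
    """Fraction of noisy replicas in which each pair of points shares a cluster."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    counts = np.zeros((n, n))
    for _ in range(n_samples):
        noisy = data + rng.normal(scale=noise, size=data.shape)
        labels = clusterer(noisy)
        counts += labels[:, None] == labels[None, :]
    return counts / n_samples

# Toy data (assumed for illustration): two well-separated groups of three points.
data = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
                 [5.0, 5.0], [5.2, 4.9], [4.9, 5.1]])
stability = co_association(data, two_means)
```

Entries of stability close to 1 mark pairs that are grouped together regardless of the noise, while entries close to 0 mark pairs that are reliably separated. Intermediate values flag assignments that are sensitive to perturbation.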

The core of our package is currently:

  1. The two classes Aggregation and Hierarchy, which respectively formalize a single clustering (partition) of a set and a hierarchical clustering of a set, each in a manner that is amenable to numpy and pandas indexing and allows cross-referencing between subsets and supersets;

  2. The ensemble module, which can be used to generate noisy ensembles from a base dataset and to apply clustering methods to already-generated ensembles;

  3. The clustering module, which contains functions implementing selected stochastic clustering techniques;

  4. The simulation and regulators modules, which currently allow the generation of regulated Markov random walks.
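As a rough illustration of the kind of process mentioned in item 4, the following sketch simulates a plain Markov random walk from a transition matrix using numpy. It is not the stoclust simulation API, and it does not model any regulation of the walk. The function name random_walk and the example matrix are assumptions made for this sketch.

```python
import numpy as np

def random_walk(transition, start, n_steps, seed=0):
    """Sample a path of n_steps transitions from a row-stochastic matrix (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    transition = np.asarray(transition, dtype=float)
    if not np.allclose(transition.sum(axis=1), 1.0):
        raise ValueError("each row of the transition matrix must sum to 1")
    path = np.empty(n_steps + 1, dtype=int)
    path[0] = start
    for t in range(n_steps):
        # The next state is drawn from the row belonging to the current state.
        path[t + 1] = rng.choice(transition.shape[0], p=transition[path[t]])
    return path

# Example chain (assumed): three states in a line, with no direct jumps between the ends.
T = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.1, 0.9]])
path = random_walk(T, start=0, n_steps=100)
```

Because the walker tends to linger among strongly connected states, the statistics of such paths carry information about cluster structure, which is the general intuition behind clustering methods based on random walks.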

In addition to these are several auxiliary modules such as distance, which contains methods for calculating simple distance metrics from data; visualization, which contains methods for easily generating Plotly visualizations of data and clusters; and utils, which contains useful miscellaneous functions.
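For readers unfamiliar with distance matrices, which are a common input to clustering routines, here is a small numpy sketch that computes pairwise Euclidean distances. It is an illustration only and does not reproduce the functions of the distance module. The name euclidean_distance_matrix is an assumption.

```python
import numpy as np

def euclidean_distance_matrix(points):
    """Return the matrix of pairwise Euclidean distances between rows (hypothetical helper)."""
    points = np.asarray(points, dtype=float)
    # Broadcasting produces an (n, n, d) array holding every pairwise difference vector.
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
dist = euclidean_distance_matrix(points)
```

The result is symmetric with a zero diagonal. For large datasets the intermediate (n, n, d) array can be memory-hungry, so this broadcasting approach is best suited to small examples.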

Check out our site for documentation, examples and further info!

Installation

To install from pip, run

$ pip install stoclust

To build from source, you can either download the zip or tarball directly, or clone the GitHub repository via

$ git clone https://github.com/samlikesphysics/stoclust.git

Then, in the same folder as setup.py, run:

$ python setup.py build
$ python -m pip install .

Dependencies

stoclust depends on the following packages:

Package   Recommended version
numpy     1.15.0
scipy     1.1.0
plotly    4.12.0
pandas    0.25.0
tqdm      4.41.1
