Adaptive Density Peak Tree Clustering

Project description

Introduction

Spatiotemporal raster (STR) data use an array of grids (i.e., a raster cube) to represent information that varies over time and space, and are commonly used to record environmental variables and socioeconomic indices. The Clustering by Fast Search and Find of Density Peaks (CFSFDP) algorithm is considered effective and suitable for revealing the geographic patterns embedded in STR data, but it has limitations in three areas: the selection of cluster centers, the support for large volumes of data, and the measurement of a distance that couples space, time, and attributes. To address these, we propose an improved method, Spatial Temporal - Adaptive Density Peak Tree Clustering (ST-ADPTC). Cluster centers are selected automatically by adaptive segmentation of the density peak tree. The k-nearest neighbors (kNN) method is used to reduce memory usage on large datasets. In addition, a neighborhood that couples spatial, temporal, and attribute dimensions is constructed to calculate the local density and to find clusters together with their time-varying behavior. Based on the proposed method, we developed an open-source Python package (Geo_ADPTC) to help users conduct clustering analysis of STR data. Experiments on benchmark datasets show improved automatic selection of cluster centers. A case study on sea surface temperature data shows that the proposed method is feasible and effective for exploring spatial and temporal distribution patterns.
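The density peak idea summarized above can be illustrated with a minimal sketch. It is not the Geo_ADPTC implementation or its API: the function names (`knn_density`, `density_peak_tree`, `cut_tree`), the Gaussian-style kNN density, and the fixed `max_edge` threshold are all assumptions made for illustration. In particular, the fixed threshold is a simplified stand-in for the adaptive tree segmentation, and plain Euclidean distance stands in for the coupled spatial-temporal-attribute distance.

```python
import heapq
import math


def euclidean(p, q):
    """Plain Euclidean distance (stand-in for the coupled distance)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_density(points, k):
    """Local density from the k nearest neighbours only (illustrative form)."""
    density = []
    for i, p in enumerate(points):
        # Distances are generated lazily, so no n x n matrix is stored.
        dists = (euclidean(p, q) for j, q in enumerate(points) if j != i)
        nearest = heapq.nsmallest(k, dists)
        density.append(sum(math.exp(-d * d) for d in nearest))
    return density


def density_peak_tree(points, density):
    """Link each point to its nearest neighbour of higher density."""
    n = len(points)
    # Decreasing density; ties are broken by index to keep the order total.
    order = sorted(range(n), key=lambda i: (-density[i], i))
    parent = [-1] * n
    delta = [0.0] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Root of the tree: by convention, its largest distance to any point.
            delta[i] = max(euclidean(points[i], q) for q in points)
            continue
        j = min(order[:rank], key=lambda h: euclidean(points[i], points[h]))
        parent[i] = j
        delta[i] = euclidean(points[i], points[j])
    return parent, delta, order


def cut_tree(parent, delta, order, max_edge):
    """Split the tree at edges longer than max_edge (hypothetical fixed rule)."""
    labels = [-1] * len(parent)
    next_label = 0
    for i in order:  # parents always precede their children in this order
        if parent[i] == -1 or delta[i] > max_edge:
            labels[i] = next_label
            next_label += 1
        else:
            labels[i] = labels[parent[i]]
    return labels


# Two small, well-separated blobs as toy data.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
density = knn_density(points, k=3)
parent, delta, order = density_peak_tree(points, density)
labels = cut_tree(parent, delta, order, max_edge=5.0)
print(labels)
```

In this toy case the central point of each blob has the highest density, so it becomes the local peak, and the only long edge in the tree is the one joining the two peaks. Cutting that edge yields one label per blob.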

Composition of GEO_ADPTC

To support clustering analysis of STR data, the ST-ADPTC method is implemented in Python together with functions for data exploration (data visualization, clustering tendency analysis, evaluation of clustering results, etc.). The open-source package, named Geo_ADPTC, has four major modules: the cluster algorithm module, the auxiliary tools module, the visualization module, and the validation module.

  1. Cluster Algorithm Module: Provides interfaces for the clustering algorithms described in this paper, namely the ADPTC clustering algorithm and the ST-ADPTC clustering algorithm.

  2. Auxiliary Tools Module: The main functions include preprocessing of the dataset, clustering tendency analysis, similarity measure functions (Euclidean distance, Manhattan distance, Mahalanobis distance, cosine similarity, etc.), local density functions (cutoff density and Gaussian density), and the kNN algorithm.

  3. Visualization Module: Provides a variety of visualization methods to help users analyze the clustering results, including two-dimensional plots, three-dimensional scatter plots, and statistical charts such as box plots and violin plots. It also includes the auxiliary decision graph of the classical density peak algorithm.

  4. Validation Module: Provides a series of methods to quantitatively evaluate the clustering results, including the Davies-Bouldin (DB) index, the Calinski-Harabasz (CH) index, the silhouette coefficient, etc.
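As an illustration of the kind of quantitative evaluation listed for the last module, the following is a minimal pure-Python sketch of the silhouette coefficient. The function name `silhouette_coefficient` and its signature are assumptions for this example, not the Geo_ADPTC interface. For real workloads, scikit-learn's `sklearn.metrics` module offers `silhouette_score`, `davies_bouldin_score`, and `calinski_harabasz_score`.

```python
import math


def euclidean(p, q):
    """Plain Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_coefficient(points, labels):
    """Mean silhouette value over all points (higher means better separated)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(euclidean(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data: two compact groups, scored under a sensible and a poor labelling.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
good = silhouette_coefficient(points, [0, 0, 0, 0, 1, 1, 1, 1])
bad = silhouette_coefficient(points, [0, 1, 0, 1, 0, 1, 0, 1])
print(round(good, 3), round(bad, 3))
```

A labelling that matches the two groups scores close to 1, while a labelling that mixes them scores much lower, which is the behavior one would rely on when comparing clustering results.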

Download files

Source Distribution

GEO_ADPTC-0.0.5.tar.gz (41.4 kB)

Uploaded Source

Built Distribution

GEO_ADPTC-0.0.5-py3-none-any.whl (42.9 kB)

Uploaded Python 3

File details

Details for the file GEO_ADPTC-0.0.5.tar.gz.

File metadata

  • Download URL: GEO_ADPTC-0.0.5.tar.gz
  • Upload date:
  • Size: 41.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.10.0 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/1.0.0 urllib3/1.26.15 tqdm/4.64.1 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.5 CPython/3.6.13

File hashes

Hashes for GEO_ADPTC-0.0.5.tar.gz
Algorithm Hash digest
SHA256 a36b21cac7be2dc0f8e46c93770bc3f71149a4006505983afcc2c3a3f18e1885
MD5 928c3b4e5b2f0074d85a4bebc61d734a
BLAKE2b-256 576521325aa467f6bc6aa8b3c8e9f1f617a27c3bfcb77bf4ad6dfb335f984290

File details

Details for the file GEO_ADPTC-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: GEO_ADPTC-0.0.5-py3-none-any.whl
  • Upload date:
  • Size: 42.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.10.0 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/1.0.0 urllib3/1.26.15 tqdm/4.64.1 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.5 CPython/3.6.13

File hashes

Hashes for GEO_ADPTC-0.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 4d76521332194c772e085322edbc3764086578ab7d46f872e922164dd52f761c
MD5 176b658e8a400618b6475d7c9de54412
BLAKE2b-256 9892a1ec1336041c796eb1a50063472c7fb4530bd29c9c09827584daacb9b8c1
