User-friendly thresholded subspace-constrained mean shift for geospatial data

DREDGE

DREDGE, short for Density Ridge Estimation Describing Geospatial Evidence (arguably an unnecessarily forced acronym), is a tool for finding density ridges in latitude-longitude coordinates, based on the subspace-constrained mean shift (SCMS) algorithm introduced by Ozertem and Erdogmus (2011). The tool approximates principal curves for a given set of coordinates and adds several improvements and alterations that make the algorithm applicable to geospatial data. Thresholding, as described in cosmological research by Chen et al. (2015) and Chen et al. (2015), avoids dominant density ridges in sparsely populated areas of the dataset. In addition, the haversine formula is used as the distance metric to calculate great-circle distances; because this accounts for the Earth's curvature, the tool is applicable not only to city-scale data but also to datasets spanning multiple countries.
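To illustrate the distance metric mentioned above, here is a minimal sketch of the haversine formula in plain Python. It is not taken from the DREDGE source: the function name, the kilometre unit, and the mean Earth radius of 6371 km are assumptions made for this example.

```python
import math

# Mean Earth radius in kilometres (an assumed constant for this sketch).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)

    # Haversine term; clamped to [0, 1] to guard against rounding errors.
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    a = min(1.0, max(0.0, a))

    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude along the equator.
one_degree = haversine_km(0.0, 0.0, 0.0, 1.0)
```

Unlike a Euclidean distance on raw latitude and longitude values, this distance shrinks correctly for east-west separations at higher latitudes, which matters once a dataset spans more than a single city.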

In essence, DREDGE provides points along density-based lines that optimize the distance to the coordinates in a dataset. Larger bandwidths decrease the summed line length and increase the average distance to the nearest line. Because DREDGE was initially developed for crime incident data, the default bandwidth calculation follows a best-practice approach that is well accepted in quantitative criminology: the mean distance to a given number of nearest neighbors (Williamson et al., 1999). Practitioners in that field are often interested in the highest-density regions of a dataset, so the tool also lets users specify a top-percentage level of a kernel density estimate within which the ridge points should fall.
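The nearest-neighbor bandwidth heuristic can be sketched as follows. This is an illustration under stated assumptions rather than DREDGE's actual implementation: the function names are hypothetical, latitude is assumed to be in the first column, distances are computed with the haversine formula in kilometres, and the mean is taken over all points and their k nearest neighbors.

```python
import numpy as np

# Mean Earth radius in kilometres (an assumed constant for this sketch).
EARTH_RADIUS_KM = 6371.0


def pairwise_haversine_km(coordinates):
    """Matrix of great-circle distances (km) between all rows of a
    two-column array of (latitude, longitude) values in degrees."""
    lat = np.radians(coordinates[:, 0])
    lon = np.radians(coordinates[:, 1])
    d_lat = lat[:, None] - lat[None, :]
    d_lon = lon[:, None] - lon[None, :]
    a = (np.sin(d_lat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(d_lon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def knn_bandwidth(coordinates, neighbors=10):
    """Mean distance from each point to its `neighbors` nearest neighbors."""
    if len(coordinates) < 2:
        raise ValueError("at least two points are needed")
    distances = pairwise_haversine_km(coordinates)
    distances.sort(axis=1)  # after sorting, column 0 holds the zero self-distance
    k = min(neighbors, len(coordinates) - 1)
    return float(distances[:, 1:k + 1].mean())


# Three points spaced one degree apart along the equator.
points = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 2.0]])
bandwidth = knn_bandwidth(points, neighbors=1)
```

The full distance matrix needs memory quadratic in the number of points, so for large datasets a spatial index would be a more practical choice than this sketch.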

Installation

DREDGE can be installed via PyPI, with a single command in the terminal:

pip install dredge

Alternatively, the file dredge.py can be downloaded from the folder dredge in this repository and placed in the working directory of a given project for local use. Installing via the terminal is, however, highly recommended, because the installation process checks the package requirements and automatically installs or updates any missing dependencies, sparing the user the effort of troubleshooting and installing them manually.

Quickstart guide

DREDGE requires only a two-column NumPy array as its primary input (coordinates), with one data point per row and latitude and longitude values in the columns. Four optional parameters can also be set. The number of nearest neighbors (neighbors) used to calculate the bandwidth automatically can be changed, the bandwidth (bandwidth) itself can be forced to a specific value, and the threshold used to check for convergence between iterations can be set (convergence). The fourth parameter (percentage) unlocks additional functionality, since practitioners are often interested only in high-density areas. For a user-provided percentage value p, the tool's internal kernel density estimate is used to retain only ridge points above the (100 - p)th percentile of the provided dataset's density landscape. This allows, for example, route matching to be focused on these areas.



| Variable    | Explanation                                                    | Default |
|-------------|----------------------------------------------------------------|---------|
| coordinates | The spatial data as latitude-longitude coordinates             |         |
| neighbors   | (optional) The number of nearest neighbors to get a bandwidth  | 10      |
| bandwidth   | (optional) The bandwidth used for kernel density estimates     | None    |
| convergence | (optional) The threshold used for inter-iteration convergence  | 0.01    |
| percentage  | (optional) The aimed-for percentage of highest-density ridges  | None    |



After installing via PyPI, or with the dredge.py file placed locally, the usage looks like this:

from dredge import filaments

filaments(coordinates=your_coordinates,
          percentage=5)
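As a sketch of how the input for the call above might be prepared, the following builds the two-column NumPy array from separate latitude and longitude sequences and runs basic sanity checks. The helper names and the sample values are made up for illustration, and placing latitude in the first column is an assumption based on the "latitude-longitude" wording. The call to filaments uses only the parameters documented above and is wrapped in a function so that the snippet runs even when dredge is not installed.

```python
import numpy as np


def prepare_coordinates(latitudes, longitudes):
    """Stack latitude and longitude sequences into an (n, 2) float array,
    one data point per row, and check that the values are valid degrees."""
    coords = np.column_stack([latitudes, longitudes]).astype(float)
    if np.any(np.abs(coords[:, 0]) > 90.0):
        raise ValueError("latitudes must lie within [-90, 90] degrees")
    if np.any(np.abs(coords[:, 1]) > 180.0):
        raise ValueError("longitudes must lie within [-180, 180] degrees")
    return coords


def run_dredge(coords, percentage=5):
    """Call DREDGE on prepared coordinates (requires `pip install dredge`)."""
    from dredge import filaments  # imported lazily; not needed for the checks above
    return filaments(coordinates=coords, percentage=percentage)


# Made-up sample points for illustration only.
your_coordinates = prepare_coordinates(
    [41.88, 41.90, 41.87],
    [-87.63, -87.65, -87.62],
)
```

Validating the ranges up front catches swapped latitude and longitude columns in many cases, since longitudes beyond 90 degrees in the first column raise an error.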

Download files

Source Distribution

dredge-1.0.0.tar.gz (9.6 kB)

Uploaded Source

Built Distribution

dredge-1.0.0-py3-none-any.whl (11.4 kB)

Uploaded Python 3

File details

Details for the file dredge-1.0.0.tar.gz.

File metadata

  • Download URL: dredge-1.0.0.tar.gz
  • Upload date:
  • Size: 9.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for dredge-1.0.0.tar.gz
Algorithm Hash digest
SHA256 d1dc9c199b681f81915730fa47bef78b8ab519781bbc109c19e94848ebe2c620
MD5 9fec5b37438cb9e47462390e0b8e8b36
BLAKE2b-256 1513d5b2d000961ca09498dca5a58aebeb735c324de2c25c6ace9240754a449b

File details

Details for the file dredge-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: dredge-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 11.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.6

File hashes

Hashes for dredge-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6d421dcb255fea0ba08b9e0a549bbe0d021fbb0d85cc5cc140b0c0249bb28c28
MD5 882cf431f1a81630d4d7e7dd2bab8d26
BLAKE2b-256 a022c12a68cecd1a88b2bef3d75b9f343699841f0be71c7cbe0d0c1ead80c205
