geoclustering
📍 Command-line tool for clustering geolocations.
Features
- Uses DBSCAN or OPTICS to perform clustering.
- Outputs clustering results as `json`, `txt` and `geojson`.
- Creates a kepler.gl visualization of clusters.
Clustering Method
A cluster is created when a certain number of points (defined with `--size`) are each within a given distance (defined with `--distance`) of at least one other point in the cluster.
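The rule above can be illustrated with a small, self-contained Python sketch. It is not the tool's implementation (which relies on DBSCAN or OPTICS); it assumes great-circle (haversine) distance and simply chains neighbouring points together. The names `haversine_km` and `naive_clusters` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def naive_clusters(points, distance_km, size):
    """Group points linked by chains of neighbours within distance_km,
    keeping only groups that contain at least `size` points."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the groups of every pair of points that are close enough.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if haversine_km(points[i], points[j]) <= distance_km:
                parent[find(i)] = find(j)

    # Collect point indices per group and drop groups that are too small.
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return [g for g in groups.values() if len(g) >= size]


points = [(0.0, 0.0), (0.0, 0.001), (0.0, 0.002), (10.0, 10.0)]
clusters = naive_clusters(points, distance_km=0.2, size=3)
```

In this example the first three points are about 111 m apart from their nearest neighbour, so they end up in one group, while the fourth point is left out. The pairwise loop is quadratic in the number of points, which is fine for illustration but not for large inputs.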
Install
Install with pip:
# with kepler.gl visualization support
pip install geoclustering[full]
# only text-based output
pip install geoclustering
If the `full` install fails, you might need to install kepler.gl build dependencies:
# macOS
brew install proj gdal
Usage
Usage: geoclustering [OPTIONS] FILENAME

  Tool to cluster geolocations. A cluster is created when a certain number of
  points (defined with --size) each are within a given distance (defined with
  --distance) of at least one other point in the cluster. Input is supplied as
  a csv file. At a minimum, each row needs to have a 'lat' and a 'lon' column.
  Other rows are reflected to the output.

Options:
  -d, --distance FLOAT            (in km) Max. distance between two points in
                                  a cluster.  [required]
  -s, --size INTEGER              Min. number of points in a cluster.
                                  [required]
  -o, --output PATH               Output directory for results. Default:
                                  ./output
  -a, --algorithm [dbscan|optics]
                                  Clustering algorithm to be used. `optics`
                                  produces tighter clusters but is slower.
                                  Default: dbscan
  --open                          Open the generated visualization in the
                                  default browser automatically.
  --debug                         Print debug output.
  --help                          Show this message and exit.
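
As an illustration of how the documented options fit together, the following Python sketch assembles an argument list for the tool. The helper `build_command`, the file name `locations.csv` and the chosen values are hypothetical; only the option names come from the help text above.

```python
import shlex


def build_command(filename, distance_km, size, output=None, algorithm=None):
    """Assemble a geoclustering argument list from the documented options."""
    cmd = ["geoclustering", "--distance", str(distance_km), "--size", str(size)]
    if output is not None:
        cmd += ["--output", str(output)]
    if algorithm is not None:
        # The help text lists exactly these two choices.
        if algorithm not in ("dbscan", "optics"):
            raise ValueError("algorithm must be 'dbscan' or 'optics'")
        cmd += ["--algorithm", algorithm]
    cmd.append(str(filename))
    return cmd


cmd = build_command("locations.csv", distance_km=0.5, size=3, algorithm="optics")
print(shlex.join(cmd))
```

With the tool installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`.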
Input
Input is supplied as a `.csv` file. At a minimum, each row needs a `lat` and a `lon` column. Other columns are carried over to the output.
id,name,lat,lon
1,Bonnibelle Mathwen,40.1324085,64.4911086
...
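
To check an input file before running the tool, a sketch like the following could be used. It only assumes the `lat` and `lon` column requirement stated above; the function name `read_points` is hypothetical and this is not how the tool itself parses its input.

```python
import csv
import io


def read_points(fileobj):
    """Read rows from a CSV file object, requiring 'lat' and 'lon' columns."""
    reader = csv.DictReader(fileobj)
    missing = {"lat", "lon"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required column(s): {sorted(missing)}")
    rows = []
    for row in reader:
        # Convert coordinates to floats; all other columns stay as strings.
        row["lat"] = float(row["lat"])
        row["lon"] = float(row["lon"])
        rows.append(row)
    return rows


sample = io.StringIO("id,name,lat,lon\n1,Bonnibelle Mathwen,40.1324085,64.4911086\n")
points = read_points(sample)
```

For a file on disk, the same function can be called with `open(path, newline="")`.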
Output
If at least one cluster was found, the tool writes a folder containing the results as `json`, `geojson`, `txt` and `csv` files. A kepler.gl `html` file is generated as well.
JSON
Encodes an array of clusters, each containing an array of points.
[
  {
    "cluster_id": 0,
    "points": [
      {
        "id": 9,
        "name": "Rosanna Foggo",
        "lat": -6.2074293,
        "lon": 106.8915948
      }
    ]
  }
]
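
A minimal sketch of consuming this structure in Python, here counting the points per cluster. The JSON is embedded as a string because the name of the generated file is not stated here.

```python
import json

raw = """
[
  {
    "cluster_id": 0,
    "points": [
      {"id": 9, "name": "Rosanna Foggo", "lat": -6.2074293, "lon": 106.8915948}
    ]
  }
]
"""

clusters = json.loads(raw)

# Map each cluster_id to the number of points it contains.
sizes = {cluster["cluster_id"]: len(cluster["points"]) for cluster in clusters}
```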
GeoJSON
Encodes a single `FeatureCollection`, containing all points as `Feature` objects.
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          106.891595,
          -6.207429
        ]
      },
      "properties": {
        "id": 9,
        "name": "Rosanna Foggo",
        "cluster_id": 0
      }
    }
  ]
}
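
The relationship between the JSON and GeoJSON outputs can be sketched as follows. Note that GeoJSON coordinates are ordered longitude first, then latitude. The function name `to_feature_collection` is hypothetical, and rounding to six decimal places is an assumption inferred from the sample above, not a documented behaviour of the tool.

```python
def to_feature_collection(clusters, ndigits=6):
    """Convert the cluster/points structure into a GeoJSON FeatureCollection."""
    features = []
    for cluster in clusters:
        for point in cluster["points"]:
            # Everything except the coordinates becomes a feature property.
            props = {k: v for k, v in point.items() if k not in ("lat", "lon")}
            props["cluster_id"] = cluster["cluster_id"]
            features.append({
                "type": "Feature",
                "geometry": {
                    "type": "Point",
                    # GeoJSON order is [longitude, latitude].
                    "coordinates": [
                        round(point["lon"], ndigits),
                        round(point["lat"], ndigits),
                    ],
                },
                "properties": props,
            })
    return {"type": "FeatureCollection", "features": features}


clusters = [
    {
        "cluster_id": 0,
        "points": [
            {"id": 9, "name": "Rosanna Foggo", "lat": -6.2074293, "lon": 106.8915948}
        ],
    }
]
collection = to_feature_collection(clusters)
```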
Text
Encodes clusters as blocks separated by a newline, where each line in a cluster block contains one point.
Cluster 0
id 9, name Rosanna Foggo, lat -6.2074293, lon 106.8915948
// ...
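
The text layout can be reproduced from the cluster structure with a short sketch like this one. The function name `format_text` is hypothetical, and the blank line between blocks is an assumption about what "separated by a newline" means.

```python
def format_text(clusters):
    """Render clusters as text blocks with one point per line."""
    blocks = []
    for cluster in clusters:
        lines = [f"Cluster {cluster['cluster_id']}"]
        for point in cluster["points"]:
            # Each point becomes "key value" pairs joined by commas.
            lines.append(", ".join(f"{key} {value}" for key, value in point.items()))
        blocks.append("\n".join(lines))
    # Assumption: blocks are separated by one blank line.
    return "\n\n".join(blocks)


clusters = [
    {
        "cluster_id": 0,
        "points": [
            {"id": 9, "name": "Rosanna Foggo", "lat": -6.2074293, "lon": 106.8915948}
        ],
    }
]
text = format_text(clusters)
```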
CSV
Encodes each point on one line, together with its `cluster_id`.
cluster_id,name,lat,lon
9,Rosanna Foggo,-6.2074293,106.8915948
...
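
Rows of this file can be grouped by cluster with the standard `csv` module. This is a sketch only: the function name `group_by_cluster` is hypothetical, and the inline data simply repeats the sample above.

```python
import csv
import io
from collections import defaultdict


def group_by_cluster(fileobj):
    """Group CSV rows by the value of their 'cluster_id' column."""
    groups = defaultdict(list)
    for row in csv.DictReader(fileobj):
        # DictReader yields strings, so the keys here are strings too.
        groups[row["cluster_id"]].append(row)
    return dict(groups)


sample = io.StringIO(
    "cluster_id,name,lat,lon\n"
    "9,Rosanna Foggo,-6.2074293,106.8915948\n"
)
groups = group_by_cluster(sample)
```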
kepler.gl
Develop
Development assumes Python 3.9+. Setting up a virtualenv for development is encouraged.
# install dependencies & dev-dependencies
# PIP
pip install -e .[dev,full]
# PIPENV
pipenv install --dev -e .
# install a git hook that runs the code formatter before each commit.
pre-commit install
We use Black as our code formatter. If you don't want to use the pre-commit
hook, you can run the formatter manually or via an editor plugin.
Release
- Update `version.py`
- Run `scripts/release.sh`
- Confirm that the GitHub Action completed successfully