Skip to main content

A Library to convert Unsupervised Clustering Results into Geographical Maps

Project description

GitHub release (latest by date) PyPI GitHub Workflow Status Downloads DOI

GeoZ

Geographic Decision Zones (GeoZ)

GeoZ is a Python library integrating several methods and machine learning modules to create Geographic Maps based on the output of Unsupervised Machine Learning techniques. The library is geared mainly toward delineating the output from Clustering algorithms, but it can be used for other Machine Learning algorithms. GeoZ is distributed under the 3-Clause BSD license.

Installation

To install GeoZ using pip:

pip install geoz

Usage Details

The library has matured substantially since the early release in 2022, but it is still under development. As such, the user will have to provide the data in a certain format as the library is working with a fixed structure and wont fix or tolerate any deviation from the expected format.

Dataset shape and format Example

The data provided needs to have two variables, one containing the latitude and longitude (eg. latlong) and another variable that contains the predicted classes of the the points (eg. y_pred). please check the below table for illustration:

LATITUDE LONGITUDE y_pred
30 -104 2
32 -103 1
35 -105 2
33 -104 2
35 -102 3

Please make sure to write (LATITUDE, LONGITUDE) in CAPITAL LETTER, otherwise the algorithm will fail.
The algorithm has been enhanced to follow GIS conventions using positional indexing. It now expects latitude data in the first column and longitude data in the second column, regardless of the column headers names. Please note that adhering to this order is crucial; reversing it will lead to inaccurate results.

Code Example

The examples illustrated here are used to demonstrate how each method plots the data. There are several examples and detailed implementations in the accompanying jupyter notebooks mentioned below. The user is advised to read them for greater details.

In all the examples shown we use a synthetic dataset called "GeoZD" that was generated using the awesome drawdata library, as well as a simplified map of United Arab Emirates (UAE) for demonstration (both displayed below).

Points
Points Distribution
UAE Shapefile
UAE Shapefile

In the below examples, we import geoz and then load the synthetically generated dataset into a variable named "latlong". The variable should contain the latitude, longitude and the y_pred, but it can also contain only the latitude and longitude without the class. In that case you will need to provide another variable (eg. y_pred) to store the class predictions and use it in the functions calling.

import geoz
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt

state = r'\miscellaneous\shapefile\UAE Simplified.shp'
latlong = pd.read_csv('GeoZD.csv')                      # This is supposed to be the dataset that you have, it must contain the Latitude and the longitude as well as the class information, check the accompanying "GeoZD.cvs" file for further details.

# This part of the code will create a basemap so the mapping functions will draw ontop of it.
stateFile = gpd.read_file(state)
stateFile=stateFile.to_crs("EPSG:4326")
stategeo = stateFile.geometry.union_all(method='unary')
stateplot=stateFile.plot( color='none',  edgecolor='black',linewidth=5)

convex_hull Function

map1 = geoz.convex_hull_plot(latlong[['y','x']], latlong['label'])              # This Function will return a Convex Hull map of the classes

stateplot = stateFile.plot( color='none',  edgecolor='black',linewidth=5, ax=map1)  # This line will create a basemap ontop of the mapping function results.
CH For further details, the user can refer to the Convex_hull.ipynb

sklearn_plot Function

stateplot=stateFile.plot( color='none',  edgecolor='black',linewidth=5) # This line will create a basemap so the mapping function will draw ontop of it.

map2 = geoz.sklearn_plot(latlong[['y','x']], latlong[['label']], C=1000, gamma=50.0, grid_resolution=100,show_points=True, bazel=True, ax=stateplot)                # This Function will return a map drawn using Scikit-Learn "DecisionBoundaryDisplay"
SKL For further details, the user can refer to the Sklearn_plot.ipynb

mlx_plot Function

stateplot=stateFile.plot( color='none',  edgecolor='black',linewidth=5) # This line will create a basemap so the mapping function will draw ontop of it.


map3 = geoz.mlx_plot(latlong[['y','x']], latlong[['label']], bazel=True, ax=stateplot, n_jobs=-1)                    # This Function will return a map drawn using MLextend  "decision_regions"
MLX For further details, the user can refer to the Mlx_plot Notebook

voronoi_regions_plot Function

This new function was developed to address one of the main weaknesses in GeoZ library, which was its inadequacy to map small datasets. The function was developed based on the work done in Fang et al., 2024 (https://doi.org/10.1016/j.ejrh.2024.101938), if you use this function please cite them along with GeoZ please. The function works simply by creating voronoi diagrams and then color the regions for each class with the same color, resulting in different regions highlighting the different clusters.

stateplot=stateFile.plot( color='none',  edgecolor='black',linewidth=5) # This line will create a basemap so the mapping function will draw ontop of it.

map4 = geoz.voronoi_regions_plot(latlong, 'x', 'y', 'label', show_points=True, ax= stateplot, mask=stategeo)                    # This Function will return a map drawn using the new method based on Voronoi diagram "voronoi_regions_plot"
CH For further details, the user can refer to the Voronoi_regions_plot.ipynb, Voronoi_regions_plot_Minimum.ipynb Notebooks.

For further infromation or the functions other parameters, please check the accompanying Jupyter Notebooks as well as functions DocStrings as they contain more details and information.

License information

See the file (LICENSE) for information on the terms & conditions for usage, and a DISCLAIMER OF ALL WARRANTIES.

Contact

You can ask me any questions via my Twitter Account Ne-oL. and in case you encountered any bugs, please create an issue in GitHub's issue tracker and I will try my best to address it as soon as possible.

Citation

If you use GeoZ as part of your workflow in a scientific publication, please consider citing GeoZ with the following DOI:

@article{ElHaj2023,
author = {ElHaj, Khalid and Alshamsi, Dalal and Aldahan, Ala},
doi = {10.1007/s41651-023-00146-0},
issn = {2509-8829},
journal = {Journal of Geovisualization and Spatial Analysis},
number = {1},
pages = {15},
title = {{GeoZ: a Region-Based Visualization of Clustering Algorithms}},
url = {https://doi.org/10.1007/s41651-023-00146-0},
volume = {7},
year = {2023}
}

If you use the new voronoi_regions_plot function as part of your workflow in a scientific publication, please consider citing the following work in addition to GeoZ:

@article{FANG2024101938,
author = {Jinzhu Fang and Yibo Yang and Peng Yi and Ling Xiong and Jijie Shen and A. Ahmed and K. ElHaj and D. Alshamsi and A. Murad and S. Hussein and A. Aldahan}
title = {Geospatial stable isotopes signatures of groundwater in United Arab Emirates using machine learning},
journal = {Journal of Hydrology: Regional Studies},
volume = {55},
pages = {101938},
year = {2024},
issn = {2214-5818},
doi = {https://doi.org/10.1016/j.ejrh.2024.101938},
url = {https://www.sciencedirect.com/science/article/pii/S2214581824002878},
}
  • Khalid ElHaj, Dalal Alshamsi, and Ala Aldahan. 2023. “GeoZ: A Region-Based Visualization of Clustering Algorithms.” Journal of Geovisualization and Spatial Analysis 7 (1): 15. https://doi.org/10.1007/s41651-023-00146-0.
  • Fang, J.; Yang, Y.; Yi, P.; Xiong, L.; Shen, J.; Ahmed, A.; ElHaj, K.; Alshamsi, D.; Murad, A.; Hussein, S.; et al. Geospatial Stable Isotopes Signatures of Groundwater in United Arab Emirates Using Machine Learning. J. Hydrol. Reg. Stud. 2024, 55, 101938, https://doi.org/10.1016/j.ejrh.2024.101938.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

geoz-2.0.1.tar.gz (16.1 kB view details)

Uploaded Source

Built Distribution

geoz-2.0.1-py3-none-any.whl (16.0 kB view details)

Uploaded Python 3

File details

Details for the file geoz-2.0.1.tar.gz.

File metadata

  • Download URL: geoz-2.0.1.tar.gz
  • Upload date:
  • Size: 16.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for geoz-2.0.1.tar.gz
Algorithm Hash digest
SHA256 4d83b2d86565b767c2eec564c506e5656a3d610100b916767edf0bffe75dd17f
MD5 5961e7742b88aa87d330d12f34e75c12
BLAKE2b-256 20a389948e1d562fe75f4627e69bf93f87eed23790f5aa52353b03c911916a3d

See more details on using hashes here.

File details

Details for the file geoz-2.0.1-py3-none-any.whl.

File metadata

  • Download URL: geoz-2.0.1-py3-none-any.whl
  • Upload date:
  • Size: 16.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for geoz-2.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 ce086a94d2a83718508993120913a07dd785bde485526f699b3f5555479ad4fa
MD5 d0b4c705d7dcffbdca29fb71044cced3
BLAKE2b-256 127964b44fb91ab3967edf9b79d8f962a9f46b215c099c9bb12391171e721328

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page