A Python package for computing geometric/spatial entropy metrics for data in matrix format.

GeoEntropy: A Python Package for Computing Spatial/Geometric Entropy

GeoEntropy is at a very early stage of development. The accuracy and correctness of its results are not guaranteed.

GeoEntropy is a Python package for computing various entropy measures on spatial data represented as matrices (NumPy arrays). It is inspired by the R package SpatEntropy by L. Altieri, D. Cocchi, and G. Roli and offers tools for analyzing the entropy of spatial data.

With GeoEntropy, you can easily partition spatial data and compute entropy measures such as Batty's entropy, Shannon's entropy, and more.

Installation

You can install GeoEntropy using pip:

pip install geoentropy

Usage

Convert CSV Files to a 2D NumPy Array

The csv_to_matrix function converts CSV files containing coordinate data into a 2D NumPy array (data matrix). Each CSV file represents a different category, and the function maps each category to a unique value in the data matrix. It includes options for prioritizing a file and for plotting the resulting matrix. If two points from different CSV files have the same coordinates, the point from the prioritized file stays in place, while the point from the other file is moved to a randomly chosen neighboring cell in the von Neumann neighborhood.

from geoentropy import csv_to_matrix
import numpy as np

file_paths = ['coordinates_category_1.csv', 'coordinates_category_2.csv']

data_matrix = csv_to_matrix(file_paths, coordinate_columns=[0, 1], cell_size=1, plot_output=False, priority=0)

print(data_matrix)

Output:

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 2.]
 [0. 0. 0. 0. 2. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 2. 1. 0. 0. 1. 0.]
 [0. 0. 1. 0. 0. 0. 0. 2. 0. 0.]
 [1. 2. 0. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0. 0. 2. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 2. 1.]
 [0. 0. 2. 0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 1. 2. 0. 0. 0. 0. 0.]
 [2. 1. 0. 0. 0. 0. 0. 0. 0. 0.]]
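
The collision rule described above can be sketched as follows. This is a minimal illustration, not GeoEntropy's implementation: the helper place_point, the use of 0 for empty cells, and the restriction to free, in-bounds neighbors are all assumptions made for the example.

```python
import numpy as np

# Offsets of the four von Neumann neighbors (up, down, left, right).
VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def place_point(grid, row, col, value, rng):
    """Hypothetical helper: put `value` at (row, col), or, if that cell is
    taken, in a randomly chosen free von Neumann neighbor.
    Returns the cell that was used, or None if no neighbor is free."""
    if grid[row, col] == 0:
        grid[row, col] = value
        return (row, col)
    candidates = [
        (row + dr, col + dc)
        for dr, dc in VON_NEUMANN
        if 0 <= row + dr < grid.shape[0]
        and 0 <= col + dc < grid.shape[1]
        and grid[row + dr, col + dc] == 0
    ]
    if not candidates:
        return None
    r, c = candidates[int(rng.integers(len(candidates)))]
    grid[r, c] = value
    return (r, c)


rng = np.random.default_rng(0)
grid = np.zeros((3, 3))
kept = place_point(grid, 1, 1, 1, rng)   # prioritized category goes first
moved = place_point(grid, 1, 1, 2, rng)  # same coordinates, so it is displaced
print(kept, moved)
```

Placing the prioritized category first is one simple way to make sure it keeps its cell; the package may order the work differently.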

Spatial Partitioning

The spatial_partition function partitions a 2D data matrix into spatial regions using Voronoi tessellation, which assigns each cell in the grid to the nearest partition center. This process helps analyze spatial data by grouping cells based on their proximity to these centers.

from geoentropy import spatial_partition
import numpy as np

data_matrix = np.array([
    [1, 2, 1, 3],
    [2, 1, 3, 3],
    [1, 1, 2, 2],
    [3, 3, 1, 1]
])

result = spatial_partition(data_matrix, partitions=5, cell_size=1, window=None, plot_output=False)

print("Partition Coordinates:\n", result['partition_coordinates'])
print("Data with Partitions:\n", result['data_with_partitions'].head())

Output:

Partition Coordinates:
 [[3.2190276  2.7650245 ]
 [0.54426262 0.16163646]
 [2.66600046 2.83171161]
 [3.93007506 2.1637025 ]
 [0.84704577 3.96334437]]
Data with Partitions:
      x    y  category  partition
0  0.5  0.5         1          2
1  0.5  1.5         2          2
2  0.5  2.5         1          5
3  0.5  3.5         3          5
4  1.5  0.5         2          2

Batty Entropy

The batty function calculates the Batty entropy, a measure of spatial segregation, for a given 2D data matrix. It validates the input, dichotomizes the data, partitions the space, and then calculates the entropy and its range.

from geoentropy import batty
import numpy as np

data_matrix = np.array([
    [1, 2, 1, 1],
    [1, 1, 2, 2],
    [2, 2, 1, 1],
    [1, 1, 2, 2]
])

result = batty(data_matrix, category=1, cell_size=1, partitions=4, window=None, rescale=True, plot_output=False)

print("Batty Entropy:", result['batty_entropy'])
print("Entropy Range:", result['entropy_range'])
print("Relative Batty Entropy:", result['relative_batty_entropy'])

Output:

Batty Entropy: 2.7656685561977836
Entropy Range: {'minimum': np.float64(0.6931471805599453), 'maximum': np.float64(2.772588722239781)}
Relative Batty Entropy: 0.9975040776922705
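
To make the calculation concrete, here is a minimal sketch of Batty's formula, H_B = sum over areas g of p_g * log(T_g / p_g), where p_g is the share of the category of interest in area g and T_g is the size of that area. It does not call GeoEntropy: the four fixed 2x2 quadrants stand in for the package's Voronoi partitions and are an assumption for illustration, so the numbers differ from the output above. Dividing by log(sum of T_g) matches the ratio of the entropy to the maximum shown in that output.

```python
import numpy as np

data_matrix = np.array([
    [1, 2, 1, 1],
    [1, 1, 2, 2],
    [2, 2, 1, 1],
    [1, 1, 2, 2],
])

# Hypothetical partition: four 2x2 quadrants instead of Voronoi regions.
quadrants = [data_matrix[r:r + 2, c:c + 2] for r in (0, 2) for c in (0, 2)]

# Number of cells of category 1 in each area, and the size of each area.
counts = np.array([(q == 1).sum() for q in quadrants], dtype=float)
areas = np.array([q.size for q in quadrants], dtype=float)

p = counts / counts.sum()
nonzero = p > 0  # areas with p_g = 0 contribute nothing to the sum

batty_entropy = float(np.sum(p[nonzero] * np.log(areas[nonzero] / p[nonzero])))
max_entropy = float(np.log(areas.sum()))
relative_entropy = batty_entropy / max_entropy

print(batty_entropy, max_entropy, relative_entropy)
```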

Karlström Entropy

The karlstrom function calculates the Karlström entropy, a measure of spatial segregation, for a given 2D data matrix. The function processes the data through validation, dichotomization, spatial partitioning, and entropy calculation while considering neighbor relationships.

from geoentropy import karlstrom
import numpy as np

data_matrix = np.array([
    [1, 2, 1, 1],
    [1, 1, 2, 2],
    [2, 2, 1, 1],
    [1, 1, 2, 2]
])

result = karlstrom(data_matrix, category=1, cell_size=1, partition=4, observation_window=None, neighbors=4,
                   method="number", plot_output=False)

print("Karlström Entropy:", result['karlstrom_entropy'])
print("Entropy Range:", result['entropy_range'])
print("Relative Karlström Entropy:", result['relative_karlstrom_entropy'])

Output:

Karlström Entropy: 1.324293923495886
Entropy Range: {'minimum': 0, 'maximum': np.float64(2.772588722239781)}
Relative Karlström Entropy: 0.47763806902672573
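
The role of the neighbor relationships can be sketched with one common formulation of Karlström and Ceccato's entropy, H_K = -sum over areas g of p_g * log(p~_g), where p~_g is the average of p over area g and its neighbors. The example below is an assumption-laden illustration and not the package's code: the area probabilities and the adjacency matrix are hypothetical, and GeoEntropy's neighbors and method parameters may define the neighborhood differently.

```python
import numpy as np

# Hypothetical share of the category of interest in each of four areas.
p = np.array([3, 2, 2, 2], dtype=float) / 9.0

# Hypothetical adjacency matrix: entry (g, h) is 1 if area h belongs to the
# neighborhood of area g. Each area is counted as its own neighbor.
adjacency = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
], dtype=float)

# Neighborhood-averaged probabilities p~_g.
p_tilde = adjacency @ p / adjacency.sum(axis=1)

karlstrom_entropy = float(-np.sum(p * np.log(p_tilde)))
print(karlstrom_entropy)
```

Averaging over the neighborhood is what distinguishes this measure from a plain Shannon entropy of p: areas surrounded by similar areas are smoothed toward their neighbors before the logarithm is taken.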

Leibovici Entropy

The leibovici function calculates the Leibovici entropy, a measure of spatial organization, for a given 2D data matrix. This function validates inputs, counts adjacent pairs within a specified critical distance, computes entropy, and optionally plots the data matrix.

from geoentropy import leibovici
import numpy as np

data_matrix = np.array([
    [1, 2, 1, np.nan],
    [2, 1, np.nan, 2],
    [1, 1, 2, 1],
    [np.nan, 2, 1, 2]
])

result = leibovici(data_matrix, cell_size=1, critical_distance=2, plot_output=False)

print("Leibovici Entropy:", result['leibovici_entropy'])
print("Entropy Range:", result['entropy_range'])
print("Relative Leibovici Entropy:", result['relative_leibovici_entropy'])
print("Probability Distribution:\n", result['probability_distribution'])

Output:

Leibovici Entropy: 1.3521103558155638
Entropy Range: {'minimum': 0, 'maximum': 1.3862943611198906}
Relative Leibovici Entropy: 0.9753414525348629
Probability Distribution:
       pair  absolute_frequency  relative_frequency
0  1.0-2.0                  13            0.333333
1  1.0-1.0                  10            0.256410
2  2.0-1.0                  10            0.256410
3  2.0-2.0                   6            0.153846
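
The reported entropy is consistent with the Shannon entropy of the pair frequencies in the table above. The sketch below recomputes it from the printed counts only; it does not reproduce the pair counting itself, and taking the maximum as the logarithm of the number of possible ordered pairs is an assumption that happens to match the printed range.

```python
import numpy as np

# Absolute pair frequencies copied from the output above.
pair_counts = {"1.0-2.0": 13, "1.0-1.0": 10, "2.0-1.0": 10, "2.0-2.0": 6}

counts = np.array(list(pair_counts.values()), dtype=float)
p = counts / counts.sum()

entropy = float(-np.sum(p * np.log(p)))
max_entropy = float(np.log(len(pair_counts)))  # 2 categories, 4 ordered pairs
relative_entropy = entropy / max_entropy

print(round(entropy, 4), round(relative_entropy, 4))  # 1.3521 0.9753
```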

O'Neill Entropy

The oneill function calculates the O'Neill entropy, a measure of spatial organization, for a given 2D data matrix. The function processes the data by validating inputs, collecting adjacent pairs, computing entropy, and optionally plotting the data matrix.

from geoentropy import oneill
import numpy as np

data_matrix = np.array([
    [1, 2, 1, np.nan],
    [2, 1, np.nan, 2],
    [1, 1, 2, 1],
    [np.nan, 2, 1, 2]
])

result = oneill(data_matrix, plot_output=False)

print("O'Neill Entropy:", result['oneill_entropy'])
print("Entropy Range:", result['entropy_range'])
print("Relative O'Neill Entropy:", result['relative_oneill_entropy'])
print("Probability Distribution:\n", result['probability_distribution'])

Output:

O'Neill Entropy: 0.9743147528693494
Entropy Range: {'minimum': 0, 'maximum': 1.3862943611198906}
Relative O'Neill Entropy: 0.7028195311147832
Probability Distribution:
       pair  absolute_frequency  relative_frequency
0  2.0-1.0                   8               0.500
1  1.0-2.0                   6               0.375
2  1.0-1.0                   2               0.125
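
The pair counts above can be reproduced by hand. The sketch below assumes that each cell is paired with its right and lower neighbor, that pairs are ordered (cell first, neighbor second), and that pairs involving NaN are skipped. These assumptions are inferred from the printed counts, not taken from the package source, so the actual implementation may differ.

```python
from collections import Counter

import numpy as np

data_matrix = np.array([
    [1, 2, 1, np.nan],
    [2, 1, np.nan, 2],
    [1, 1, 2, 1],
    [np.nan, 2, 1, 2],
])

pairs = Counter()
rows, cols = data_matrix.shape
for r in range(rows):
    for c in range(cols):
        # Look only right and down so that each adjacency is visited once.
        for dr, dc in ((0, 1), (1, 0)):
            rr, cc = r + dr, c + dc
            if rr < rows and cc < cols:
                a, b = data_matrix[r, c], data_matrix[rr, cc]
                if not (np.isnan(a) or np.isnan(b)):
                    pairs[(int(a), int(b))] += 1

counts = np.array(list(pairs.values()), dtype=float)
p = counts / counts.sum()
entropy = float(-np.sum(p * np.log(p)))

# Two categories give 2**2 possible ordered pairs.
n_categories = len(np.unique(data_matrix[~np.isnan(data_matrix)]))
relative_entropy = entropy / float(np.log(n_categories ** 2))

print(dict(pairs))
print(round(entropy, 4), round(relative_entropy, 4))  # 0.9743 0.7028
```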

Shannon Entropy

The shannon function calculates the Shannon entropy, which quantifies the uncertainty or diversity within a 2D data matrix. It processes the data by validating inputs, calculating probabilities of different categories, computing the entropy, and determining the variance.

from geoentropy import shannon
import numpy as np

data_matrix = np.array([
    [1, 2, 1, 3],
    [2, 1, 3, 3],
    [1, 1, 2, 2],
    [3, 3, 1, 1]
])

result = shannon(data_matrix)

print("Shannon Entropy:", result['shannon_entropy'])
print("Entropy Range:", result['shannon_entropy_range'])
print("Relative Shannon Entropy:", result['relative_shannon_entropy'])
print("Probability Distribution:\n", result['probability_distribution'])
print("Variance:", result['variance'])

Output:

Shannon Entropy: 1.0717300941124526
Entropy Range: {'minimum': 0, 'maximum': 1.0986122886681098}
Relative Shannon Entropy: 0.9755307720176264
Probability Distribution:
 [{'category': np.int64(1), 'absolute_frequency': 7, 'relative_frequency': 0.4375}, {'category': np.int64(2), 'absolute_frequency': 4, 'relative_frequency': 0.25}, {'category': np.int64(3), 'absolute_frequency': 5, 'relative_frequency': 0.3125}]
Variance: 0.05362144899780308
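
The entropy and its relative value can be verified with a few lines of NumPy. This sketch follows the standard definition H = -sum of p_i * log(p_i) with the natural logarithm and a maximum of log(number of categories); it does not attempt to reproduce the variance.

```python
import numpy as np

data_matrix = np.array([
    [1, 2, 1, 3],
    [2, 1, 3, 3],
    [1, 1, 2, 2],
    [3, 3, 1, 1],
])

# Category frequencies: 1 occurs 7 times, 2 occurs 4 times, 3 occurs 5 times.
categories, counts = np.unique(data_matrix, return_counts=True)
p = counts / counts.sum()

entropy = float(-np.sum(p * np.log(p)))
max_entropy = float(np.log(len(categories)))
relative_entropy = entropy / max_entropy

print(round(entropy, 4), round(relative_entropy, 4))  # 1.0717 0.9755
```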

Shannon Z Entropy

The shannon_z function calculates the Shannon entropy for pairs of categories in a 2D data matrix. This entropy quantifies the uncertainty or diversity of category pairs within the data. Note that Shannon entropy Z is not a spatial entropy: it ignores the spatial arrangement of elements and depends only on the frequency distribution of category pairs.

from geoentropy import shannon_z
import numpy as np

data_matrix = np.array([
    [1, 2, 1, 3],
    [2, 1, 3, 3],
    [1, 1, 2, 2],
    [3, 3, 1, 1]
])

result = shannon_z(data_matrix)

print("Shannon Entropy Z:", result['shannon_entropy_z'])
print("Entropy Z Range:", result['shannon_entropy_z_range'])
print("Relative Shannon Entropy Z:", result['relative_entropy_z'])
print("Variance:", result['variance'])
print("Pair Probabilities:\n", result['pair_probabilities'])

Output:

Shannon Entropy Z: 1.6594506357352485
Entropy Z Range: {'minimum': 0, 'maximum': 1.791759469228055}
Relative Shannon Entropy Z: 0.9261570340410652
Variance: 0.21318393262341617
Pair Probabilities:
 [{'pair': '1-1', 'absolute_frequency': 21, 'relative_frequency': np.float64(0.175)}, {'pair': '1-2', 'absolute_frequency': 28, 'relative_frequency': np.float64(0.23333333333333334)}, {'pair': '1-3', 'absolute_frequency': 35, 'relative_frequency': np.float64(0.2916666666666667)}, {'pair': '2-2', 'absolute_frequency': 6, 'relative_frequency': np.float64(0.05)}, {'pair': '2-3', 'absolute_frequency': 20, 'relative_frequency': np.float64(0.16666666666666666)}, {'pair': '3-3', 'absolute_frequency': 10, 'relative_frequency': np.float64(0.08333333333333333)}]
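
Because the pairs are unordered and ignore position, their frequencies follow directly from the category counts: n * (n - 1) / 2 pairs within a category of size n, and n * m pairs between categories of sizes n and m. The sketch below, written with the standard library and NumPy rather than GeoEntropy, reproduces the pair frequencies and the entropy printed above under that assumption.

```python
from itertools import combinations_with_replacement
from math import comb, log

import numpy as np

data_matrix = np.array([
    [1, 2, 1, 3],
    [2, 1, 3, 3],
    [1, 1, 2, 2],
    [3, 3, 1, 1],
])

values, freqs = np.unique(data_matrix, return_counts=True)
counts = dict(zip(values.tolist(), freqs.tolist()))  # {1: 7, 2: 4, 3: 5}

# Number of unordered cell pairs for every unordered pair of categories.
pair_counts = {}
for a, b in combinations_with_replacement(sorted(counts), 2):
    n_pairs = comb(counts[a], 2) if a == b else counts[a] * counts[b]
    pair_counts[f"{a}-{b}"] = n_pairs

total = sum(pair_counts.values())  # equals comb(16, 2)
entropy = -sum((n / total) * log(n / total) for n in pair_counts.values())
relative_entropy = entropy / log(len(pair_counts))

print(pair_counts)
print(round(entropy, 4), round(relative_entropy, 4))  # 1.6595 0.9262
```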
