autoESDA
A Python package that automates the exploratory spatial data analysis (ESDA) process by summarising the results into an HTML report.
Table of Contents
- Introduction
- Key features
- Installation
- Dependencies
- Usage
- Examples
- Contributing
- License
- References
- Credits
1. Introduction
Exploratory spatial data analysis (ESDA) describes a variety of functions used to gain a surface-level understanding of a spatial dataset. Currently the ESDA process is repetitive, as each of these functions needs to be calculated individually. This makes it time-consuming and leaves a large margin for human error. Additionally, results are often not easily viewed side by side for comparison, or shared with people who may not have the technical skills to produce them.
autoesda addresses this by letting the user generate an information-rich HTML report with a single line of code; the report can then easily be shared with others.
2. Key features
- HTML output report
- Extent map
- Dataset overview (coordinate system, number of rows/columns, which rows/columns have been included/excluded in the report)
- Descriptive statistics (count, mean, standard deviation, minimum/maximum, 25th/50th/75th percentiles)
- Sample of dataset
- Boxplot
- Histogram
- Moran's I simulation (Moran's I, number of features, p-value, z-score, number of permutations)
- Local Indicator of Spatial Autocorrelation (local scatterplot, LISA cluster map)
- Choropleth maps (quantiles, equal intervals, natural breaks, and percentiles classification schemes)
- Correlation (correlation matrix/heatmap, pairwise plot)
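To make the Moran's I entry above more concrete, here is a minimal pure-Python sketch of the statistic and a permutation-based simulation. It is an illustration of the standard technique only, not autoesda's own implementation: the function names `morans_i` and `permutation_test`, the dense list-of-lists weights matrix, and the one-sided (upper-tail) pseudo p-value are all assumptions made for this example.

```python
import random
import statistics


def morans_i(values, weights):
    """Global Moran's I for `values`, given a dense spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


def permutation_test(values, weights, permutations=999, seed=0):
    """Return (Moran's I, pseudo p-value, z-score) from random relabelling."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    simulated = []
    for _ in range(permutations):
        rng.shuffle(shuffled)  # reassign values to locations at random
        simulated.append(morans_i(shuffled, weights))
    # Upper-tail pseudo p-value: share of simulations at least as large.
    extreme = sum(1 for s in simulated if s >= observed)
    p_value = (extreme + 1) / (permutations + 1)
    z_score = (observed - statistics.mean(simulated)) / statistics.pstdev(simulated)
    return observed, p_value, z_score


# Four features in a row; each is a neighbour of the feature next to it.
values = [1.0, 2.0, 3.0, 4.0]
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_stat, p_value, z_score = permutation_test(values, weights)
print(f"Moran's I = {i_stat:.3f}, p = {p_value:.3f}, z = {z_score:.2f}")
```

Similar neighbouring values push the statistic above its expected value under random relabelling, which is what the p-value and z-score in the report summarise.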
3. Installation
autoesda is available on PyPI. To install it, run this command in your terminal:
pip install autoesda
geopandas is a primary dependency of autoesda, and there are known challenges associated with installing geopandas using pip. The recommended strategy is therefore to use autoesda in a conda environment.
Advanced users can follow this documentation, which guides you through the geopandas installation by downloading the unofficial binary files of some of the geopandas dependencies.
autoesda is also available on conda-forge. If you have Anaconda or Miniconda installed on your computer you can use this command in your Anaconda/Miniconda prompt:
conda install autoesda
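After installing, it can be useful to confirm that both packages are visible to the Python interpreter in the active environment. The following is a small sketch using only the standard library; `missing_packages` is a hypothetical helper name introduced here, not part of autoesda.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Check the two packages this README imports in the Usage section.
missing = missing_packages(["geopandas", "autoesda"])
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("geopandas and autoesda are both importable.")
```

This only checks that the packages can be located; it does not verify that their own compiled dependencies load correctly.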
4. Dependencies
5. Usage
First, import both geopandas and autoesda.
import geopandas as gpd
import autoesda
Once both libraries have been successfully imported, you can load your dataset as a GeoDataFrame using geopandas. To read more about compatible file types, see the geopandas documentation. In this example, a shapefile is imported.
gdf = gpd.read_file(r'example-file-path\example-shapefile.shp')
Once your data is stored in a GeoDataFrame, you can generate the report.
autoesda.generate_report(gdf)
The report will be saved to your current working directory.
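The steps above can be wrapped in a single function, sketched below under a few assumptions. `report_from_file` is a hypothetical helper name, not part of autoesda; the only autoesda call used is `generate_report(gdf)` as shown above. The helper checks that the file exists before loading geopandas, so a mistyped path fails quickly with a clear message.

```python
from pathlib import Path


def report_from_file(path):
    """Read a vector file with geopandas and pass it to autoesda.

    Hypothetical convenience wrapper; not part of the autoesda package.
    """
    source = Path(path)
    if not source.exists():
        # Fail early, before the heavier imports below are attempted.
        raise FileNotFoundError(f"No such dataset: {source}")

    # Imported here so the path check above runs first.
    import geopandas as gpd
    import autoesda

    gdf = gpd.read_file(source)
    autoesda.generate_report(gdf)  # same call as in the usage steps
    # The README states the report is saved to the working directory.
    return Path.cwd()
```

Calling `report_from_file(r'example-file-path\example-shapefile.shp')` would then return the directory in which to look for the generated report.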
6. Example Reports
Vector Reports | Raster Reports |
---|---|
Old COJ Demographic Data | Global Terrestrial Precipitation: Band 1, Band 2, Band 3, Band 4, Stacked |
Airbnb Chicago 2015 | EU NOx Concentration: Band 1, Band 2, Band 3, Band 4, Stacked |
Grid 100 | South African Population: Band 1, Band 2, Band 3, Band 4 |
South African 2011 Census | |
Natural Earth Country Boundaries | |
Malaria in Colombia | |
USA Election Results | |
7. Contributing
Click here to report bugs
Click here to request a new feature
If you would like to assist with fixing bugs, further development, or writing documentation, you are most welcome to do so. Use the issues page to guide what you can assist with.
In order to make a contribution you will need to:
- Fork the autoesda repository on GitHub.
- Clone your fork locally.
- Commit your changes to your branch on GitHub.
- Once you are satisfied that your work is suitable, submit a pull request through the GitHub website.
8. License
This software is available under the BSD-3-Clause license.
For more information, see the LICENSE file which contains details on the history of this software, terms & conditions for usage, and a disclaimer of all warranties.
9. References
When citing this library, please reference the following:
de Kock, N., Rautenbach, V., and Fabris-Rotelli, I.: TOWARDS AN OPEN SOURCE PYTHON LIBRARY FOR AUTOMATED EXPLORATORY SPATIAL DATA ANALYSIS, Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLIII-B4-2022, 91–98, https://doi.org/10.5194/isprs-archives-XLIII-B4-2022-91-2022, 2022.
10. Credits
This package was created with Cookiecutter and the giswqs/pypackage project template.