🌏 A Python package for wrangling geospatial datasets

Geowrangler

Overview

License: MIT

Geowrangler is a Python package for geodata wrangling. It helps you build data transformation workflows for which other geospatial libraries offer no out-of-the-box solution.

We surveyed our past geospatial projects to extract these solutions for our work and hope that these will be useful for others as well.

Our intended audience is researchers, analysts, and engineers delivering geospatial projects.

We welcome your comments, suggestions, bug reports, and code contributions to make Geowrangler better.

Context

Geowrangler was born out of our efforts to reduce the amount of boilerplate code in wrangling geospatial data. It builds on top of existing geospatial libraries such as geopandas, rasterio, rasterstats, morecantile, and others. Our goals are centered on the following tasks:

  • Extracting area of interest zonal statistics from vector and raster data
  • Gridding areas of interest
  • Validating geospatial datasets
  • Downloading publicly available geospatial datasets (e.g., OSM, Ookla, and Nightlights)
  • Other geospatial vector and raster data processing tasks
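
To make the first task concrete, here is a minimal pure-Python sketch of what a zonal statistic computes: values are grouped by the area of interest they fall in and then summarized. This is an illustration only, not Geowrangler's implementation or API. The names `point_in_box` and `zonal_stats` are hypothetical, and zones are simplified to bounding boxes instead of real polygons.

```python
from statistics import mean


def point_in_box(point, box):
    """Return True if (x, y) lies inside (min_x, min_y, max_x, max_y), edges included."""
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y


def zonal_stats(zones, samples):
    """Count the samples in each zone and average their values.

    zones:   dict mapping a zone name to its bounding box (hypothetical, simplified zones)
    samples: list of ((x, y), value) pairs
    """
    results = {}
    for name, box in zones.items():
        values = [value for point, value in samples if point_in_box(point, box)]
        results[name] = {
            "count": len(values),
            "mean": mean(values) if values else None,  # no samples means no mean
        }
    return results


# Two non-overlapping zones and four samples; the last sample lies outside both zones.
zones = {"west": (0, 0, 10, 10), "east": (10.5, 0, 20, 10)}
samples = [((2, 3), 4.0), ((8, 9), 6.0), ((15, 5), 10.0), ((30, 30), 99.0)]
stats = zonal_stats(zones, samples)
print(stats)
```

In a real workflow, the zones would be polygons in a GeoDataFrame and the containment test a spatial join; the grouping-then-aggregating structure stays the same.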

To make it easy to document, maintain, and extend the package, we opted to maintain the source code, tests, and documentation in Jupyter notebooks. We use nbdev to generate the Python package and documentation from the notebooks. See this document to learn more about our development workflow.

By doing this, we hope to make it easy for geospatial analysts, scientists, and engineers to learn, explore, and extend this package for their geospatial processing needs.

Aside from providing reference documentation for each module, we have included extensive tutorials and use case examples to make the package easy to learn and use.

Modules

  • Grid Tile Generation
  • Geometry Validation
  • Vector Zonal Stats
  • Raster Zonal Stats
  • Area Zonal Stats
  • Distance Zonal Stats
  • Vector to Raster Mask
  • Raster to Dataframe
  • Raster Processing
  • Demographic and Health Survey (DHS) Processing Utils
  • Geofabrik (OSM) Data Download
  • Ookla Data Download
  • Night Lights
  • Dataset Utils
  • Tile Clustering
  • Spatial Join Highest Intersection

Check this page for more details about our Roadmap.

Installation

pip install geowrangler
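
To confirm that the installation succeeded, one option is to query the installed version through the standard library. This is a small sketch using `importlib.metadata` (Python 3.8+); the helper name `installed_version` is hypothetical and not part of Geowrangler.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


geowrangler_version = installed_version("geowrangler")
print(geowrangler_version or "geowrangler is not installed in this environment")
```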

Exploring the Documentation

We develop the package modules alongside their documentation. Each page comes with an Open in Colab button that will open the Jupyter notebook in Colab for exploration (including this page).

Click on the Open in Colab button below to open this page as a Google Colab notebook.

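# view the source of a grid component (the ?? syntax requires IPython or Jupyter)
import geowrangler.grids

# the first argument is cell_size: height and width of a square cell in meters
grid = geowrangler.grids.SquareGridGenerator(cell_size=1000)
grid??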
Type:        SquareGridGenerator
String form: <geowrangler.grids.SquareGridGenerator object>
File:        ~/work/unicef-ai4d/geowrangler-1/geowrangler/grids.py
Source:     
class SquareGridGenerator:
    def __init__(
        self,
        cell_size: float,  # height and width of a square cell in meters
        grid_projection: str = "EPSG:3857",  # projection of grid output
        boundary: Union[SquareGridBoundary, List[float]] = None,  # original boundary
    ):
        self.cell_size = cell_size
        self.grid_projection = grid_projection
        self.boundary = boundary
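
For intuition about what a square grid generator produces, the following pure-Python sketch tiles a bounding box with square cells of a fixed size. It is not the `SquareGridGenerator` implementation: the function name `square_grid` is hypothetical, and it assumes coordinates are already in a projected system measured in meters (for example EPSG:3857, the default `grid_projection` shown above).

```python
import math


def square_grid(bounds, cell_size):
    """Return square cells (min_x, min_y, max_x, max_y) that cover the given bounds.

    bounds:    (min_x, min_y, max_x, max_y) in projected units, assumed to be meters
    cell_size: height and width of each square cell, in the same units
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    min_x, min_y, max_x, max_y = bounds
    # Round up so that cells on the far edges still cover the whole extent.
    n_cols = math.ceil((max_x - min_x) / cell_size)
    n_rows = math.ceil((max_y - min_y) / cell_size)
    cells = []
    for row in range(n_rows):
        for col in range(n_cols):
            x0 = min_x + col * cell_size
            y0 = min_y + row * cell_size
            cells.append((x0, y0, x0 + cell_size, y0 + cell_size))
    return cells


# A 2500 m x 1000 m extent with 1000 m cells needs 3 columns and 1 row.
cells = square_grid((0, 0, 2500, 1000), cell_size=1000)
print(len(cells), cells[0], cells[-1])
```

A production grid generator would additionally reproject the area of interest and keep only the cells that intersect its geometry; this sketch covers only the tiling step.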

Tutorials

Reference

Note: All the documentation pages (including the references) are executable Jupyter notebooks.
