An interface to ArcGIS RESTful-, WFS-, and WMS-based services.
Description |
---|
Navigate and subset NHDPlus (MR and HR) using web services |
Access topographic data through National Map’s 3DEP web service |
Access NWIS, NID, HCDN 2009, NLCD, and SSEBop databases |
Access Daymet for daily climate data both single pixel and gridded |
High-level API for asynchronous requests with persistent caching |
Send queries to any ArcGIS RESTful-, WMS-, and WFS-based services |
Convert responses from PyGeoOGC’s supported web services to datasets |
PyGeoOGC: Retrieve Data from RESTful, WMS, and WFS Services
Features
PyGeoOGC is part of the HyRiver software stack, which is designed to aid in watershed analysis through web services. This package provides general interfaces to web services that are based on ArcGIS RESTful, WMS, and WFS. Although all these web services limit the number of features per request (e.g., 1000 object IDs for a RESTful request or 8 million pixels for a WMS request), PyGeoOGC divides large requests into smaller chunks under the hood and then merges the results. It also uses requests-cache for persistent caching, which can improve performance significantly.
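The chunk-and-merge idea described above can be illustrated with a minimal sketch. This is not PyGeoOGC's actual implementation: `chunked`, `fetch_all`, and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real web request so the example runs offline.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(ids: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of ``ids`` with at most ``size`` items each."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(ids), size):
        yield ids[start : start + size]


def fetch_all(
    ids: Sequence[int],
    fetch: Callable[[Sequence[int]], List[dict]],
    max_per_request: int = 1000,
) -> List[dict]:
    """Call ``fetch`` once per chunk and merge the per-chunk results."""
    merged: List[dict] = []
    for chunk in chunked(ids, max_per_request):
        merged.extend(fetch(chunk))
    return merged


def fake_fetch(chunk: Sequence[int]) -> List[dict]:
    """Hypothetical stand-in for a web request: one record per object ID."""
    return [{"objectid": i} for i in chunk]


object_ids = list(range(2500))
chunk_sizes = [len(chunk) for chunk in chunked(object_ids, 1000)]
features = fetch_all(object_ids, fake_fetch)
print(chunk_sizes, len(features))
```

With a limit of 1000 IDs per request, 2500 IDs are split into three requests, and the caller only ever sees the merged list.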
There is also an inventory of URLs for some of these web services in the form of a class called ServiceURL. These URLs fall into four categories: ServiceURL().restful, ServiceURL().wms, ServiceURL().wfs, and ServiceURL().http. They provide examples of the services that PyGeoOGC supports. All the URLs are read from a YAML file located at pygeoogc/static/urls.yml. If you have success using PyGeoOGC with a web service, please consider submitting a request to add it to this URL inventory.
PyGeoOGC has three main classes:
ArcGISRESTful: This class can be instantiated by providing the target layer URL. For example, to get Watershed Boundary Data we can use ServiceURL().restful.wbd. The web service’s website shows that there are nine layers, for example, 1 for 2-digit HU (Region), 6 for 12-digit HU (Subwatershed), and so on. We can pass the URL to the target layer directly, like this f"{ServiceURL().restful.wbd}/6", or pass the layer number as a separate argument via layer.
Afterward, we request the data in two steps. First, we get the target object IDs using the oids_bygeom (within a geometry), oids_byfield (specific field IDs), or oids_bysql (any valid SQL 92 WHERE clause) class methods. Then, we get the target features using the get_features class method. The returned response can be converted into a GeoDataFrame using the json2geodf function from PyGeoUtils.
WMS: Instantiation of this class requires at least three arguments: service URL, layer name(s), and output format. Additionally, the target CRS and the web service version can be provided. Upon instantiation, we can use the getmap_bybox class method to get the target raster data within a bounding box. The box can be in any valid CRS, and if it is different from the default CRS, EPSG:4326, it should be passed using the box_crs argument. The service response can be converted into an xarray.Dataset using the gtiff2xarray function from PyGeoUtils.
WFS: Instantiation of this class is similar to WMS. The only difference is that only one layer name can be passed. Upon instantiation there are three ways to get the data:
getfeature_bybox: Get all the target features within a bounding box in any valid CRS.
getfeature_byid: Get all the target features based on the IDs. Note that two arguments should be provided: featurename and featureids. You can get a list of valid feature names using the get_validnames class method.
getfeature_byfilter: Get the data based on any valid CQL filter.
You can convert the response returned by any of these methods to a GeoDataFrame using the json2geodf function from the PyGeoUtils package.
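The two-step ArcGISRESTful workflow described above (object IDs first, then features) maps naturally onto two calls to a layer's `query` endpoint in the ArcGIS REST API. The sketch below only builds the request URLs and sends nothing. The layer URL is a hypothetical placeholder and the helper names are illustrative; `where`, `returnIdsOnly`, `objectIds`, `outFields`, and `f` are standard query parameters of the ArcGIS REST API. Whether PyGeoOGC composes its requests exactly this way is an assumption.

```python
from urllib.parse import urlencode

# Hypothetical layer URL; substitute a real ArcGIS REST layer endpoint.
LAYER_URL = "https://example.com/arcgis/rest/services/wbd/MapServer/6"


def ids_request(where: str) -> str:
    """Step 1: ask the layer only for the object IDs matching ``where``."""
    params = {"where": where, "returnIdsOnly": "true", "f": "json"}
    return f"{LAYER_URL}/query?{urlencode(params)}"


def features_request(object_ids, out_fields: str = "*") -> str:
    """Step 2: ask for the full features that belong to the given object IDs."""
    params = {
        "objectIds": ",".join(str(i) for i in object_ids),
        "outFields": out_fields,
        "f": "json",
    }
    return f"{LAYER_URL}/query?{urlencode(params)}"


step1 = ids_request("1=1")
step2 = features_request([11, 12, 13])
print(step1)
print(step2)
```

Separating the two steps keeps the first response small (IDs only), which makes it straightforward to split the second step into several requests when the ID list exceeds the service limit.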
You can find some example notebooks here.
You can even try using PyGeoOGC without installing it on your system by clicking on the binder badge below the PyGeoOGC banner. A Jupyter notebook instance with the software stack pre-installed will be launched in your web browser and you can start coding!
Please note that since this project is in early development stages, while the provided functionalities should be stable, changes in APIs are possible in new releases. But we appreciate it if you give this project a try and provide feedback. Contributions are most welcome.
Moreover, requests for additional functionality can be submitted via the issue tracker.
Installation
You can install PyGeoOGC using pip:
$ pip install pygeoogc
Alternatively, PyGeoOGC can be installed from the conda-forge repository using Conda:
$ conda install -c conda-forge pygeoogc
Quick start
We can access NHDPlus HR via a RESTful service, the National Wetlands Inventory via WMS, and the FEMA National Flood Hazard Layer via WFS. The outputs of these functions are of type requests.Response and can be converted to a GeoDataFrame or an xarray.Dataset using PyGeoUtils.
Let’s start with the National Map’s NHDPlus HR web service. We can query the flowlines that are within a geometry as follows:
from pygeoogc import ArcGISRESTful, WFS, WMS, ServiceURL
import pygeoutils as geoutils
from pynhd import NLDI
basin_geom = NLDI().get_basins("01031500").geometry[0]
hr = ArcGISRESTful(ServiceURL().restful.nhdplushr, 2, outformat="json")
hr.oids_bygeom(basin_geom, "epsg:4326")
resp = hr.get_features()
flowlines = geoutils.json2geodf(resp)
Note that oids_bygeom has three additional arguments: sql_clause, spatial_relation, and distance. We can use sql_clause to pass any valid SQL WHERE clause and spatial_relation to specify the target predicate, such as intersect, contain, or cross. The default predicate is intersect (esriSpatialRelIntersects). We can use distance to specify the buffer distance from the input geometry for getting features.
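To make the role of these three arguments more concrete, here is a minimal sketch of how a predicate, a WHERE clause, and a buffer distance could be expressed as ArcGIS REST query parameters. The short predicate keys (`intersect`, `contain`, `cross`, ...) and the `geometry_query` helper are hypothetical and chosen for illustration; the `esriSpatialRel*` constants and the `geometry`, `geometryType`, `inSR`, `spatialRel`, `distance`, and `units` parameters come from the ArcGIS REST API. The sketch uses a bounding envelope rather than a full polygon to stay short.

```python
from typing import Dict, Optional, Tuple

# Spatial relationship constants defined by the ArcGIS REST API.
# The short keys on the left are illustrative names, not PyGeoOGC arguments.
SPATIAL_REL = {
    "intersect": "esriSpatialRelIntersects",
    "contain": "esriSpatialRelContains",
    "cross": "esriSpatialRelCrosses",
    "overlap": "esriSpatialRelOverlaps",
    "touch": "esriSpatialRelTouches",
    "within": "esriSpatialRelWithin",
}


def geometry_query(
    bounds: Tuple[float, float, float, float],
    wkid: int,
    predicate: str = "intersect",
    where: Optional[str] = None,
    distance: Optional[float] = None,
) -> Dict[str, str]:
    """Build query parameters for an envelope-based object ID request."""
    if predicate not in SPATIAL_REL:
        raise ValueError(f"predicate must be one of {sorted(SPATIAL_REL)}")
    xmin, ymin, xmax, ymax = bounds
    params = {
        "geometry": f"{xmin},{ymin},{xmax},{ymax}",
        "geometryType": "esriGeometryEnvelope",
        "inSR": str(wkid),
        "spatialRel": SPATIAL_REL[predicate],
        "returnIdsOnly": "true",
        "f": "json",
    }
    if where is not None:
        params["where"] = where
    if distance is not None:
        # Buffer distance around the input geometry, expressed in meters here.
        params["distance"] = str(distance)
        params["units"] = "esriSRUnit_Meter"
    return params


params = geometry_query(
    (-69.8, 45.0, -69.0, 45.6), 4326, "contain", where="AREASQKM > 0.5", distance=100
)
print(params)
```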
We can also submit a query based on IDs of any valid field in the database. If the measure property is desired you can pass return_m as True to the get_features class method:
hr.oids_byfield("PERMANENT_IDENTIFIER", ["103455178", "103454362", "103453218"])
resp = hr.get_features(return_m=True)
flowlines = geoutils.json2geodf(resp)
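A field-based query like the one above is conceptually equivalent to a SQL `IN` clause over the chosen field. The sketch below shows one way such a clause could be assembled; `in_clause` is a hypothetical helper, and whether PyGeoOGC builds its query this way internally is an assumption.

```python
from typing import Iterable


def in_clause(field: str, values: Iterable[str]) -> str:
    """Build ``field IN ('a', 'b', ...)``, doubling any embedded single quotes."""
    quoted = ["'" + str(v).replace("'", "''") + "'" for v in values]
    if not quoted:
        raise ValueError("at least one value is required")
    return f"{field} IN ({', '.join(quoted)})"


where = in_clause("PERMANENT_IDENTIFIER", ["103455178", "103454362", "103453218"])
print(where)
```

The resulting string is an ordinary WHERE clause, so it could equally be passed to a method that accepts SQL directly.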
Additionally, any valid SQL 92 WHERE clause can be used. For more details look here. For example, let’s limit our first request to only include catchments with areas larger than 0.5 sqkm.
hr.oids_bygeom(basin_geom, geo_crs="epsg:4326", sql_clause="AREASQKM > 0.5")
resp = hr.get_features()
catchments = geoutils.json2geodf(resp)
A WMS-based example is shown below:
wms = WMS(
    ServiceURL().wms.fws,
    layers="0",
    outformat="image/tiff",
    crs="epsg:3857",
)
r_dict = wms.getmap_bybox(
    basin_geom.bounds,
    1e3,
    box_crs="epsg:4326",
)
wetlands = geoutils.gtiff2xarray(r_dict, basin_geom, "epsg:4326")
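A WMS request such as the one above is subject to the per-request pixel limit mentioned in the Features section (e.g., 8 million pixels). One simple way to stay under such a limit is to cut the bounding box into vertical strips and request each strip separately. The following sketch illustrates that idea only: `split_bbox` is a hypothetical helper, and PyGeoOGC's actual tiling strategy may differ.

```python
import math
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north)


def split_bbox(
    bbox: BBox, width: int, height: int, max_pixels: int = 8_000_000
) -> List[Tuple[BBox, int, int]]:
    """Split ``bbox`` into vertical strips of at most ``max_pixels`` pixels each.

    Returns a list of ``(strip_bbox, strip_width, strip_height)`` tuples.
    """
    if width < 1 or height < 1:
        raise ValueError("width and height must be positive")
    west, south, east, north = bbox
    max_cols = max_pixels // height  # widest strip that fits in the budget
    if max_cols < 1:
        raise ValueError("a single pixel column already exceeds max_pixels")
    n_strips = math.ceil(width / max_cols)
    base, extra = divmod(width, n_strips)  # spread leftover columns evenly
    tiles = []
    col = 0
    for i in range(n_strips):
        strip_width = base + (1 if i < extra else 0)
        x0 = west + (east - west) * col / width
        col += strip_width
        x1 = west + (east - west) * col / width
        tiles.append(((x0, south, x1, north), strip_width, height))
    return tiles


# A 10000 x 5000 image (50 million pixels) needs several requests.
tiles = split_bbox((0.0, 0.0, 100.0, 50.0), 10_000, 5_000)
print(len(tiles), [w for _, w, _ in tiles])
```

Because strip edges are computed from whole pixel columns, adjacent strips share an edge exactly and can be concatenated along the x-axis after download.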
Query from a WFS-based web service can be done either within a bounding box or using any valid CQL filter.
wfs = WFS(
    ServiceURL().wfs.fema,
    layer="public_NFHL:Base_Flood_Elevations",
    outformat="esrigeojson",
    crs="epsg:4269",
)
r = wfs.getfeature_bybox(basin_geom.bounds, box_crs="epsg:4326")
flood = geoutils.json2geodf(r.json(), "epsg:4269", "epsg:4326")
# Alternatively, query a WFS layer with a CQL filter instead of a bounding box
layer = "wmadata:huc08"
wfs = WFS(
    ServiceURL().wfs.waterdata,
    layer=layer,
    outformat="application/json",
    version="2.0.0",
    crs="epsg:4269",
)
r = wfs.getfeature_byfilter("huc8 LIKE '13030%'")
huc8 = geoutils.json2geodf(r.json(), "epsg:4269", "epsg:4326")
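For readers curious about what such WFS queries look like on the wire, the sketch below assembles the key-value parameters of a GetFeature request. It is an illustration under stated assumptions, not PyGeoOGC's code: the base URL and the `getfeature_params` helper are hypothetical, `cql_filter` is a GeoServer vendor parameter rather than part of the WFS standard, and axis order and CRS handling for `bbox` are deliberately left out.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode

# Hypothetical endpoint; substitute a real WFS service URL.
BASE_URL = "https://example.com/geoserver/wfs"


def getfeature_params(
    layer: str,
    version: str = "2.0.0",
    outformat: str = "application/json",
    crs: str = "epsg:4269",
    bbox: Optional[Tuple[float, float, float, float]] = None,
    cql_filter: Optional[str] = None,
) -> Dict[str, str]:
    """Build key-value parameters for a WFS GetFeature request."""
    if bbox is not None and cql_filter is not None:
        # GeoServer treats bbox and cql_filter as mutually exclusive.
        raise ValueError("pass either bbox or cql_filter, not both")
    # WFS 2.0.0 renamed the layer parameter from typeName to typeNames.
    type_key = "typeNames" if version.startswith("2.") else "typeName"
    params = {
        "service": "WFS",
        "version": version,
        "request": "GetFeature",
        type_key: layer,
        "outputFormat": outformat,
        "srsName": crs,
    }
    if bbox is not None:
        params["bbox"] = ",".join(str(v) for v in bbox)
    if cql_filter is not None:
        params["cql_filter"] = cql_filter
    return params


params = getfeature_params("wmadata:huc08", cql_filter="huc8 LIKE '13030%'")
url = f"{BASE_URL}?{urlencode(params)}"
print(url)
```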
Contributing
Contributions are appreciated and very welcome. Please read CONTRIBUTING.rst for instructions.