Python erddap API client

ERDDAP python library

About

ERDDAP is a data server that gives you a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps.

erddap-python is a Python client for the ERDDAP RESTful API. It can obtain server status metrics, provides search methods, and offers tabledap and griddap class objects for metadata and data access.

This library was initially built for the CICESE, CIGOM, OORCO, and CEMIEOceano projects to automate reports, interactive custom visualizations, and data analysis. Most of the functionality was inspired by the work of the erddapy library, but it is designed with backend service construction in mind.

Full API reference can be found here.

Projects using erddap-python

Requirements

  • Python 3
  • Python libraries: numpy, pandas, xarray, netCDF4

Installation

Using pip:

$ pip install erddap-python

Usage

Explore an ERDDAP Server

Connect to an ERDDAP Server

>>> from erddapClient import ERDDAP_Server
>>> 
>>> remoteServer = ERDDAP_Server('https://coastwatch.pfeg.noaa.gov/erddap')
>>> remoteServer
<erddapClient.ERDDAP_Server>
Server version:  ERDDAP_version=2.11

The search and advancedSearch methods are available. They build the search request URL and can also make the request to the ERDDAP RESTful services to obtain results.

>>> searchRequest = remoteServer.advancedSearch(searchFor="gliders")
>>> searchRequest
<erddapClient.ERDDAP_SearchResults>
Results:  1
[
  0 - <erddapClient.ERDDAP_Tabledap> scrippsGliders , "Gliders, Scripps Institution of Oceanography, 2014-present"
]

These methods return an object with a list of the ERDDAP Tabledap or Griddap objects that matched the search filters.
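As a rough illustration of the URL-building half of a search, the sketch below assembles an ERDDAP advanced-search request using only the standard library. The function name advanced_search_url and the default page and itemsPerPage values are assumptions made for this example; they are not part of erddap-python, which may build its URLs differently.

```python
from urllib.parse import urlencode


def advanced_search_url(server_url, file_type="json", **filters):
    """Build an ERDDAP advanced-search URL (hypothetical helper, not the library API)."""
    # page and itemsPerPage are illustrative defaults chosen for this sketch.
    params = {"page": 1, "itemsPerPage": 1000, **filters}
    return f"{server_url.rstrip('/')}/search/advanced.{file_type}?{urlencode(params)}"


search_url = advanced_search_url(
    "https://coastwatch.pfeg.noaa.gov/erddap", searchFor="gliders"
)
print(search_url)
```

Keeping URL construction separate from the HTTP request makes it possible to log or inspect the URL before any network call is made.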

Tabledap datasets

Using the ERDDAP_Tabledap class to build ERDDAP data request URLs

>>> from erddapClient import ERDDAP_Tabledap
>>> 
>>> url = 'https://coastwatch.pfeg.noaa.gov/erddap'
>>> datasetid = 'cwwcNDBCMet'
>>> remote = ERDDAP_Tabledap(url, datasetid)
>>> 
>>> remote.setResultVariables(['station','time','atmp'])
>>> print(remote.getURL('htmlTable'))

https://coastwatch.pfeg.noaa.gov/erddap/tabledap/cwwcNDBCMet.htmlTable?station%2Ctime%2Catmp

You can continue adding constraints, server-side operations, or the distinct operation to the generated URL.

>>> import datetime as dt 
>>> 
>>> remote.addConstraint('time>=2020-12-29T00:00:00Z') \
          .addConstraint({ 'time<=' : dt.datetime(2020,12,31) })
>>> remote.getURL()

'https://coastwatch.pfeg.noaa.gov/erddap/tabledap/cwwcNDBCMet.csvp?station%2Ctime%2Catmp&time%3E=2020-12-29T00%3A00%3A00Z&time%3C=2020-12-31T00%3A00%3A00Z'

>>>
>>> remote.orderByClosest(['station','time/1day'])
>>> remote.getURL()

'https://coastwatch.pfeg.noaa.gov/erddap/tabledap/cwwcNDBCMet.csvp?station%2Ctime%2Catmp&time%3E=2020-12-29T00%3A00%3A00Z&time%3C=2020-12-31T00%3A00%3A00Z&orderByClosest(%22station%2Ctime/1day%22)'
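To make the structure of these URLs easier to see, here is a minimal standard-library sketch that reproduces the last URL above. The function tabledap_url, the SAFE_CHARS constant, and the choice to treat naive datetimes as UTC are assumptions of this example and do not describe how erddap-python is implemented.

```python
from datetime import datetime
from urllib.parse import quote

# Characters left unescaped, chosen so the output matches the URLs shown above.
SAFE_CHARS = "=()/"


def tabledap_url(server_url, dataset_id, variables,
                 constraints=(), operations=(), file_type="csvp"):
    """Assemble a tabledap request URL (hypothetical helper, not the library API)."""
    # Result variables form the first, comma-separated part of the query.
    query = [quote(",".join(variables), safe=SAFE_CHARS)]
    for constraint in constraints:
        if isinstance(constraint, dict):
            for lhs, value in constraint.items():
                if isinstance(value, datetime):
                    # Assumption: naive datetimes are interpreted as UTC.
                    value = value.strftime("%Y-%m-%dT%H:%M:%SZ")
                query.append(quote(f"{lhs}{value}", safe=SAFE_CHARS))
        else:
            query.append(quote(constraint, safe=SAFE_CHARS))
    # Server-side operations are appended after the constraints.
    query.extend(quote(operation, safe=SAFE_CHARS) for operation in operations)
    return f"{server_url}/tabledap/{dataset_id}.{file_type}?" + "&".join(query)


url = tabledap_url(
    "https://coastwatch.pfeg.noaa.gov/erddap",
    "cwwcNDBCMet",
    ["station", "time", "atmp"],
    constraints=["time>=2020-12-29T00:00:00Z", {"time<=": datetime(2020, 12, 31)}],
    operations=['orderByClosest("station,time/1day")'],
)
print(url)
```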

>>> 

The class has methods to clear the result variables, the constraints, and the server-side operations that have been added to the query: clearConstraints(), clearResultVariable(), clearServerSideFunctions(), and clearQuery().

A user can build the query by chaining the result variables, constraints, and server-side operations, and then request the data in any of the formats that ERDDAP provides.

>>>
>>> remote.clearQuery()
>>>
>>> responseCSV = (
...     remote.setResultVariables(['station','time','atmp'])
...           .addConstraint('time>=2020-12-29T00:00:00Z')
...           .addConstraint('time<=2020-12-31T00:00:00Z')
...           .orderByClosest(['station','time/1day'])
...           .getData('csvp')
... )
>>> 
>>> print(responseCSV)

station,time (UTC),atmp (degree_C)
41001,2020-12-29T00:00:00Z,17.3
41001,2020-12-30T00:00:00Z,13.7
41001,2020-12-31T00:00:00Z,15.9
41004,2020-12-29T00:10:00Z,18.1
41004,2020-12-30T00:00:00Z,17.1
41004,2020-12-31T00:00:00Z,21.2
41008,2020-12-29T00:50:00Z,14.8
...
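If the csvp text needs further processing without pandas, it can be parsed with the standard csv module. The sketch below uses the first rows of the response shown above; the helper name parse_csvp is an assumption of this example.

```python
import csv
import io

# First rows of the csvp response shown above.
response_csv = """station,time (UTC),atmp (degree_C)
41001,2020-12-29T00:00:00Z,17.3
41001,2020-12-30T00:00:00Z,13.7
41001,2020-12-31T00:00:00Z,15.9
41004,2020-12-29T00:10:00Z,18.1
"""


def parse_csvp(text):
    """Return one dict per row, keyed by the 'name (units)' column headers."""
    return list(csv.DictReader(io.StringIO(text)))


rows = parse_csvp(response_csv)
# Skip empty cells before converting the temperature column to floats.
temperatures = [float(row["atmp (degree_C)"]) for row in rows if row["atmp (degree_C)"]]
print(rows[0]["station"], temperatures)
```

Note that in the csvp format the units are part of the column name, so the keys are "time (UTC)" and "atmp (degree_C)" rather than "time" and "atmp".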

>>>
>>> remote.clearQuery()
>>>
>>> responsePandas = (
...     remote.setResultVariables(['station','time','atmp'])
...           .addConstraint('time>=2020-12-29T00:00:00Z')
...           .addConstraint('time<=2020-12-31T00:00:00Z')
...           .orderByClosest(['station','time/1day'])
...           .getDataFrame()
... )
>>>
>>> responsePandas

     station            time (UTC)  atmp (degree_C)
0      41001  2020-12-29T00:00:00Z             17.3
1      41001  2020-12-30T00:00:00Z             13.7
2      41001  2020-12-31T00:00:00Z             15.9
3      41004  2020-12-29T00:00:00Z             18.2
4      41004  2020-12-30T00:00:00Z             17.1
...      ...                   ...              ...
2006   YKRV2  2020-12-30T00:00:00Z              NaN
2007   YKRV2  2020-12-31T00:00:00Z              8.1
2008   YKTV2  2020-12-29T00:00:00Z             11.3
2009   YKTV2  2020-12-30T00:00:00Z              NaN
2010   YKTV2  2020-12-31T00:00:00Z              7.1

[2011 rows x 3 columns]
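The result keeps roughly one row per station per day because of the orderByClosest(['station','time/1day']) operation, which runs on the ERDDAP server. As an approximation of that behaviour, the sketch below keeps, for each station, the row whose time is closest to each multiple of the interval. It is a local illustration written for this README under that assumption, not ERDDAP's or erddap-python's code, and the sample rows are made up.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def order_by_closest(rows, interval):
    """rows: iterable of (station, time, ...) tuples with timezone-aware times."""
    best = {}
    for row in rows:
        station, when = row[0], row[1]
        # Nearest multiple of the interval, counted from the Unix epoch.
        steps = round((when - EPOCH) / interval)
        target = EPOCH + steps * interval
        distance = abs(when - target)
        key = (station, target)
        if key not in best or distance < best[key][0]:
            best[key] = (distance, row)
    # Sort by station, then by the interval each kept row belongs to.
    return [best[key][1] for key in sorted(best)]


def utc(*args):
    return datetime(*args, tzinfo=timezone.utc)


# Made-up sample rows, for illustration only.
sample = [
    ("41004", utc(2020, 12, 29, 0, 50), 18.0),
    ("41004", utc(2020, 12, 29, 0, 10), 18.1),
    ("41004", utc(2020, 12, 30, 0, 0), 17.1),
    ("41001", utc(2020, 12, 29, 0, 0), 17.3),
]
kept = order_by_closest(sample, timedelta(days=1))
```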

Griddap datasets

All the URL-building functions and the data request functionality are also available in the ERDDAP_Griddap class. Data can be requested in any of the available ERDDAP data formats. It is also possible to request a subset of the dataset and get an xarray or netCDF4.Dataset object in return.

This class can parse the griddap query, and detect if the query is malformed before requesting data from the ERDDAP server.
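As a sketch of what such a pre-flight check might look like, the following standard-library code validates a single-variable griddap query against the number of dimensions of the dataset. It covers only a subset of the griddap syntax (integer indexes, last, and parenthesized values, with optional stride) and is an assumption-laden illustration; the function parse_griddap_query is hypothetical and is not the parser used by erddap-python.

```python
import re

# One part of a selector: a parenthesized value, "last", or an integer index.
_PART = r"(?:\([^()\[\]]+\)|last|\d+)"
# A selector is [start], [start:stop] or [start:stride:stop].
_SELECTOR = re.compile(rf"\[{_PART}(?::{_PART}){{0,2}}\]")
_QUERY = re.compile(r"^(?P<name>[A-Za-z_]\w*)(?P<selectors>(?:\[[^\[\]]*\])*)$")


def parse_griddap_query(query, n_dimensions):
    """Return (variable, [selector, ...]) or raise ValueError if malformed."""
    match = _QUERY.match(query)
    if match is None:
        raise ValueError(f"malformed griddap query: {query!r}")
    selectors = re.findall(r"\[[^\[\]]*\]", match["selectors"])
    if len(selectors) != n_dimensions:
        raise ValueError(
            f"expected {n_dimensions} dimension selectors, found {len(selectors)}"
        )
    for selector in selectors:
        if _SELECTOR.fullmatch(selector) is None:
            raise ValueError(f"malformed dimension selector: {selector!r}")
    # Strip the surrounding brackets from each selector.
    return match["name"], [selector[1:-1] for selector in selectors]


variable, selectors = parse_griddap_query(
    "temperature[(2012-01-13)][0:39][(18.09165):(31.96065)][(-98.0):(-76.40002)]",
    n_dimensions=4,
)
```

Rejecting a query locally avoids a round trip to the server for mistakes such as a missing bracket or the wrong number of dimensions.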

Usage sample:

>>> from erddapClient import ERDDAP_Griddap
>>> 
>>> url = 'https://coastwatch.pfeg.noaa.gov/erddap'
>>> datasetid = 'hycom_gom310D'
>>> remote = ERDDAP_Griddap(url, datasetid)
>>> 
>>> print(remote)

<erddapClient.ERDDAP_Griddap>
Title:       NRL HYCOM 1/25 deg model output, Gulf of Mexico, 10.04 Expt 31.0, 2009-2014, At Depths
Server URL:  https://coastwatch.pfeg.noaa.gov/erddap
Dataset ID:  hycom_gom310D
Dimensions: 
  time (double) range=(cftime.DatetimeGregorian(2009, 4, 2, 0, 0, 0, 0), cftime.DatetimeGregorian(2014, 8, 30, 0, 0, 0, 0)) 
    Standard name: time 
    Units:         seconds since 1970-01-01T00:00:00Z 
  depth (float) range=(0.0, 5500.0) 
    Standard name: depth 
    Units:         m 
  latitude (float) range=(18.09165, 31.96065) 
    Standard name: latitude 
    Units:         degrees_north 
  longitude (float) range=(-98.0, -76.40002) 
    Standard name: longitude 
    Units:         degrees_east 
Variables: 
  temperature (float) 
    Standard name: sea_water_potential_temperature 
    Units:         degC 
  salinity (float) 
    Standard name: sea_water_practical_salinity 
    Units:         psu 
  u (float) 
    Standard name: eastward_sea_water_velocity 
    Units:         m/s 
  v (float) 
    Standard name: northward_sea_water_velocity 
    Units:         m/s 
  w_velocity (float) 
    Standard name: upward_sea_water_velocity 
    Units:         m/s 

With ERDDAP_Griddap it is possible to request a subset of the data and get an xarray or a netCDF4.Dataset in return.

>>> xSubset = ( remote.setResultVariables('temperature[(2012-01-13)][0:39][(18.09165):(31.96065)][(-98.0):(-76.40002)]')
                      .getXarray() )
>>>
>>> xSubset 
<xarray.Dataset>
Dimensions:      (depth: 40, latitude: 385, longitude: 541, time: 1)
Dimensions without coordinates: depth, latitude, longitude, time
Data variables:
    temperature  (time, depth, latitude, longitude) float32 ...
Attributes:
    cdm_data_type:              Grid
    Conventions:                COARDS, CF-1.0, ACDD-1.3
    creator_email:              hycomdata@coaps.fsu.edu
    creator_name:               Naval Research Laboratory
    creator_type:               institution
    creator_url:                https://www.hycom.org
    defaultGraphQuery:          temperature[%28last%29][0][0:%28last%29][0:%2...
    Easternmost_Easting:        -76.40002
    experiment:                 31.0
    geospatial_lat_max:         31.96065
    geospatial_lat_min:         18.09165
    geospatial_lat_units:       degrees_north
    geospatial_lon_max:         -76.40002
    geospatial_lon_min:         -98.0
    geospatial_lon_resolution:  0.039999962962962966
    geospatial_lon_units:       degrees_east
    history:                    archv2ncdf3z\n2021-04-26T09:25:51Z https://td...
    infoUrl:                    https://www.hycom.org
    institution:                Naval Research Laboratory
    keywords:                   30.1h, circulation, currents, density, Earth ...
    keywords_vocabulary:        GCMD Science Keywords
    license:                    The data may be used and redistributed for fr...
    Northernmost_Northing:      31.96065
    source:                     HYCOM archive file
    sourceUrl:                  https://tds.hycom.org/thredds/dodsC/GOMl0.04/...
    Southernmost_Northing:      18.09165
    standard_name_vocabulary:   CF Standard Name Table v70
    summary:                    NRL HYCOM 1/25 deg model output, Gulf of Mexi...
    time_coverage_end:          2014-08-30T00:00:00Z
    time_coverage_start:        2009-04-02T00:00:00Z
    title:                      NRL HYCOM 1/25 deg model output, Gulf of Mexi...
    Westernmost_Easting:        -98.0

To make the above request possible, the library parses the query subset and builds an OPeNDAP request with the equivalent integer indexes.

The ERDDAP_Griddap object has a dimensions property that holds the metadata of each dimension. The dimension values are also downloaded, because certain operations need them.
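The translation from coordinate values to integer indexes can be pictured with a nearest-value lookup on a sorted axis. The sketch below assumes nearest-value matching, which is consistent with the subset example in this README where a request for latitude 22.5 returns 22.51696. The latitude axis used here is made up for illustration and is not the real hycom_gom310D axis; nearest_index is a hypothetical helper, not the library's code.

```python
from bisect import bisect_left


def nearest_index(axis_values, target):
    """Index of the value closest to target in a sorted, ascending axis."""
    if not axis_values:
        raise ValueError("axis has no values")
    pos = bisect_left(axis_values, target)
    if pos == 0:
        return 0
    if pos == len(axis_values):
        return len(axis_values) - 1
    before, after = axis_values[pos - 1], axis_values[pos]
    # Prefer the lower index when the target is exactly halfway.
    return pos if (after - target) < (target - before) else pos - 1


# Hypothetical latitude axis, 18.0 to 32.0 in 0.5 degree steps.
latitudes = [18.0 + 0.5 * i for i in range(29)]

start = nearest_index(latitudes, 22.4)
stop = nearest_index(latitudes, 25.1)
# Index-based selector equivalent to the value-based [(22.4):(25.1)].
index_selector = f"[{start}:{stop}]"
print(index_selector)
```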

>>> # Get more information about dimensions
>>> print (remote.dimensions)
<erddapClient.ERDDAP_Griddap_dimensions>
Dimension: time (nValues=1977) 
Dimension: depth (nValues=40) 
Dimension: latitude (nValues=385) 
Dimension: longitude (nValues=541) 

>>> from pprint import pprint
>>> 
>>> pprint(remote.dimensions['time'].metadata)
{'_CoordinateAxisType': 'Time',
 '_averageSpacing': '1 day',
 '_dataType': 'double',
 '_evenlySpaced': True,
 '_nValues': 1977,
 'actual_range': (cftime.DatetimeGregorian(2009, 4, 2, 0, 0, 0, 0),
                  cftime.DatetimeGregorian(2014, 8, 30, 0, 0, 0, 0)),
 'axis': 'T',
 'calendar': 'standard',
 'ioos_category': 'Time',
 'long_name': 'Time',
 'standard_name': 'time',
 'time_origin': '01-JAN-1970 00:00:00',
 'units': 'seconds since 1970-01-01T00:00:00Z'}
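The units attribute shows that the raw time axis is stored as seconds since 1970-01-01T00:00:00Z. A minimal standard-library sketch of converting between those raw values and datetimes, and of locating a date on an evenly spaced daily axis, could look like this. The helper names are assumptions of this example; the library itself reports cftime objects, as shown in actual_range above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def seconds_to_datetime(seconds):
    """Convert 'seconds since 1970-01-01T00:00:00Z' to an aware datetime."""
    return EPOCH + timedelta(seconds=seconds)


def datetime_to_seconds(when):
    """Inverse conversion, for a timezone-aware datetime."""
    return (when - EPOCH).total_seconds()


# Start of the time axis according to actual_range: 2009-04-02T00:00:00Z.
first = seconds_to_datetime(1238630400)

# With _evenlySpaced True and _averageSpacing '1 day', the position of a date
# on the axis is its offset in whole days from the first value.
wanted = datetime(2012, 1, 13, tzinfo=timezone.utc)
time_index = round((wanted - first) / timedelta(days=1))
print(first.isoformat(), time_index)
```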

Make requests for subsets in different formats.

>>> # Request a subset in a pandas dataframe
>>>
>>> remote.clearQuery()
>>> subset = ( remote.setResultVariables(["temperature[0:last][(0.0)][(22.5)][(-95.5)]",
                                           "salinity[0:last][(0.0)][(22.5)][(-95.5)]"])
                     .getDataFrame(header=0, 
                                   names=["time", "depth", "latitude", "longitude", "temperature", "salinity"], 
                                   parse_dates=["time"],
                                   index_col="time")  )
>>>
>>> subset.head()                              

                           depth  latitude  longitude  temperature   salinity
time                                                                         
2009-04-02 00:00:00+00:00    0.0  22.51696  -95.47998    24.801798  36.167076
2009-04-03 00:00:00+00:00    0.0  22.51696  -95.47998    24.605570  36.256450
2009-04-04 00:00:00+00:00    0.0  22.51696  -95.47998    24.477884  36.086346
2009-04-05 00:00:00+00:00    0.0  22.51696  -95.47998    24.552357  36.133224
2009-04-06 00:00:00+00:00    0.0  22.51696  -95.47998    25.761946  36.179676
...                          ...       ...        ...          ...        ...
2014-08-26 00:00:00+00:00    0.0  22.51696  -95.47998    30.277546  36.440037
2014-08-27 00:00:00+00:00    0.0  22.51696  -95.47998    30.258907  36.485844
2014-08-28 00:00:00+00:00    0.0  22.51696  -95.47998    30.298597  36.507530
2014-08-29 00:00:00+00:00    0.0  22.51696  -95.47998    30.246874  36.493400
2014-08-30 00:00:00+00:00    0.0  22.51696  -95.47998    30.387840  36.487934

[1977 rows x 5 columns]

>>> 

Sample notebooks

Check the demonstration notebooks folder for more advanced usage of the library classes.
