Retrieve National Water Model data from Google Cloud Platform.

HydroTools :: GCP Client

This subpackage implements an interface to retrieve National Water Model (NWM) data from Google Cloud Platform. The primary use for this tool is to populate pandas.DataFrame objects with NWM streamflow data. See the GCP Client Documentation for a complete list and description of the currently available methods. To report bugs or request new features, submit an issue through the HydroTools Issue Tracker on GitHub.

Installation

In keeping with common practice in the Python community, we recommend using a virtual environment in any Python workflow. The following installation guide uses Python's built-in venv module to create the virtual environment in which the tool will be installed. This is only a preference; any Python virtual environment manager (conda, pipenv, etc.) should work.

# Create and activate python environment, requires python >= 3.8
$ python3 -m venv env
$ source env/bin/activate
$ python3 -m pip install --upgrade pip wheel

# Install gcp_client
$ python3 -m pip install hydrotools.gcp_client
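
To confirm that the installation succeeded, one option is to query the installed distribution's version from Python. The following is a minimal sketch using only the standard library (importlib.metadata, available in Python 3.8 and later). The helper name installed_version is hypothetical and not part of hydrotools.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if the distribution is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string when the package is installed in the
# active environment, otherwise prints None
print(installed_version("hydrotools.gcp_client"))
```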

Usage

The following example demonstrates how one might use hydrotools.gcp_client to retrieve NWM streamflow forecasts.

Code

# Import the GCP Client
from hydrotools.gcp_client import gcp

# Instantiate model data service
model_data_service = gcp.NWMDataService()

# Retrieve forecast data
#  By default, only retrieves data at USGS gaging sites in
#  CONUS that are used for model assimilation
forecast_data = model_data_service.get(
    configuration = "short_range",
    reference_time = "20210101T01Z"
    )

# Look at the data
print(forecast_data.info(memory_usage='deep'))
print(forecast_data[['value_time', 'value']].head())

Output

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 137628 entries, 0 to 137627
Data columns (total 8 columns):
 #   Column            Non-Null Count   Dtype         
---  ------            --------------   -----         
 0   reference_time    137628 non-null  datetime64[ns]
 1   value_time        137628 non-null  datetime64[ns]
 2   nwm_feature_id    137628 non-null  int64         
 3   value             137628 non-null  float32       
 4   usgs_site_code    137628 non-null  category      
 5   configuration     137628 non-null  category      
 6   measurement_unit  137628 non-null  category      
 7   variable_name     137628 non-null  category      
dtypes: category(4), datetime64[ns](2), float32(1), int64(1)
memory usage: 5.1 MB
None
           value_time  value
0 2021-01-01 02:00:00   5.29
1 2021-01-01 03:00:00   5.25
2 2021-01-01 04:00:00   5.20
3 2021-01-01 05:00:00   5.12
4 2021-01-01 06:00:00   5.03
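
Because the result is an ordinary pandas.DataFrame, the usual pandas operations apply to it. The sketch below is illustrative only: it builds a small stand-in DataFrame by hand, using a subset of the column names from the output above, so that it runs without any calls to Google Cloud Platform. The site codes and flow values are made up.

```python
import pandas as pd

# Hand-built stand-in for the DataFrame returned by the client.
# Column names follow the example output; site codes and values are invented.
forecast_data = pd.DataFrame({
    "reference_time": pd.to_datetime(["2021-01-01 01:00"] * 4),
    "value_time": pd.to_datetime([
        "2021-01-01 02:00", "2021-01-01 03:00",
        "2021-01-01 02:00", "2021-01-01 03:00",
    ]),
    "usgs_site_code": pd.Categorical(
        ["00000001", "00000001", "00000002", "00000002"]
    ),
    "value": pd.Series([5.29, 5.25, 1.10, 1.40], dtype="float32"),
})

# Select a single site and index its forecast by valid time
site = forecast_data[forecast_data["usgs_site_code"] == "00000001"]
site_series = site.set_index("value_time")["value"]

# Peak forecast value per site; observed=True skips unused categories
peak_by_site = forecast_data.groupby(
    "usgs_site_code", observed=True
)["value"].max()

print(site_series)
print(peak_by_site)
```

Grouping on the categorical usgs_site_code column with observed=True keeps the result limited to sites that actually appear in the data.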

System Requirements

We employ several methods to keep the pandas.DataFrame objects produced by gcp_client as efficient and manageable as possible. Nonetheless, this package can use a large amount of memory.

The National Water Model generates multiple forecasts per day at over 3.7 million locations across the United States. A single forecast could be spread across hundreds of files and require repeated calls to Google Cloud Platform. The intermediate steps of retrieving and processing these files into leaner DataFrames may use several GB of memory. As such, the recommended minimum requirements for this package are a 4-core consumer processor and 8 GB of RAM.
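
The dtypes in the example output (category and float32) hint at how a DataFrame can be made leaner. The following sketch is not taken from the package's source. It only illustrates the general pandas technique of converting repeated strings to category and 64-bit floats to float32, then comparing memory use. The column contents, including the unit string, are placeholders.

```python
import pandas as pd

n = 10_000

# Unoptimized frame: repeated strings and float64 values
raw = pd.DataFrame({
    "usgs_site_code": ["00000001", "00000002"] * (n // 2),
    "measurement_unit": ["m3/s"] * n,  # placeholder unit string
    "value": [1.5] * n,                # float64 by default
})

# Leaner frame: categoricals for repeated strings, float32 for values
lean = raw.astype({
    "usgs_site_code": "category",
    "measurement_unit": "category",
    "value": "float32",
})

# deep=True accounts for the memory held by the string values themselves
raw_bytes = raw.memory_usage(deep=True).sum()
lean_bytes = lean.memory_usage(deep=True).sum()

print(f"raw:  {raw_bytes / 1e6:.2f} MB")
print(f"lean: {lean_bytes / 1e6:.2f} MB")
```

The same memory_usage(deep=True) call can be applied to a retrieved DataFrame to check its footprint before further processing.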

Development

This package uses a setup configuration file (setup.cfg) and assumes use of the setuptools backend to build the package. To install the package for development use:

$ python3 -m venv env
$ source env/bin/activate
$ python3 -m pip install -U pip
$ python3 -m pip install -U setuptools
$ python3 -m pip install -e .[develop]

To generate source and wheel distributions:

$ python3 -m pip install -U wheel build
$ python3 -m build

The packages generated in dist/ can be installed directly with pip or uploaded to PyPI using twine.
