High-level API for asynchronous requests with persistent caching.
Package | Description
---|---
PyNHD | Navigate and subset NHDPlus (MR and HR) using web services
Py3DEP | Access topographic data through National Map’s 3DEP web service
PyGeoHydro | Access NWIS, NID, HCDN 2009, NLCD, and SSEBop databases
PyDaymet | Access Daymet for daily climate data both single pixel and gridded
AsyncRetriever | High-level API for asynchronous requests with persistent caching
PyGeoOGC | Send queries to any ArcGIS RESTful-, WMS-, and WFS-based services
PyGeoUtils | Convert responses from PyGeoOGC’s supported web services to datasets
AsyncRetriever: Asynchronous requests with persistent caching
Features
AsyncRetriever is part of the HyRiver software stack, which is designed to aid in watershed analysis through web services. This package has only one purpose: asynchronously sending requests and retrieving responses as text, binary, or JSON objects. It uses persistent caching to speed up retrieval even further. Moreover, thanks to nest_asyncio, you can use this package in Jupyter notebooks as well. Although this package is part of the HyRiver software stack, it is applicable to any HTTP request.
Please note that this project is in its early development stages; while the provided functionality should be stable, API changes are possible in new releases. We appreciate it if you give this project a try and provide feedback. Contributions are most welcome.
Moreover, requests for additional functionality can be submitted via the issue tracker.
Installation
You can install async_retriever using pip:
$ pip install async_retriever
Alternatively, async_retriever can be installed from the conda-forge repository using Conda:
$ conda install -c conda-forge async_retriever
Quick start
AsyncRetriever has one public function: retrieve. By default, this function uses ./cache/aiohttp_cache.sqlite as the cache file; you can use the cache_name argument to customize it. Now, let’s see it in action!
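First, a minimal sketch of the call signature (the httpbin.org URL below is just an illustrative test endpoint, not part of AsyncRetriever):

import async_retriever as ar

# read can be "text", "binary", or "json"; responses are returned
# in the same order as the input URLs
resp = ar.retrieve(["https://httpbin.org/get"], "json")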
As an example of retrieving a binary response, let’s use the DAAC server to get NDVI. The responses can be passed directly to xarray.open_mfdataset to get the data as an xarray Dataset.
import io
from datetime import datetime

import xarray as xr

import async_retriever as ar

# Bounding box of the region of interest
west, south, east, north = (-69.77, 45.07, -69.31, 45.45)
base_url = "https://thredds.daac.ornl.gov/thredds/ncss/ornldaac/1299"
# One (start, end) pair covering January of each year
dates_itr = ((datetime(y, 1, 1), datetime(y, 1, 31)) for y in range(2000, 2005))
# Build one (url, request-keywords) pair per date range
urls, kwds = zip(
    *[
        (
            f"{base_url}/MCD13.A{s.year}.unaccum.nc4",
            {
                "params": {
                    "var": "NDVI",
                    "north": f"{north}",
                    "west": f"{west}",
                    "east": f"{east}",
                    "south": f"{south}",
                    "disableProjSubset": "on",
                    "horizStride": "1",
                    "time_start": s.strftime("%Y-%m-%dT%H:%M:%SZ"),
                    "time_end": e.strftime("%Y-%m-%dT%H:%M:%SZ"),
                    "timeStride": "1",
                    "addLatLon": "true",
                    "accept": "netcdf",
                }
            },
        )
        for s, e in dates_itr
    ]
)
# Retrieve all responses as bytes with up to eight concurrent workers
resp = ar.retrieve(urls, "binary", request_kwds=kwds, max_workers=8)
# Open the in-memory NetCDF responses as a single dataset
data = xr.open_mfdataset(io.BytesIO(r) for r in resp)
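As a quick, hedged follow-up (the variable and dimension names below are assumptions about the returned NetCDF, not guaranteed by AsyncRetriever), a spatial mean over the subset could look like:

# Assumes the NCSS response exposes the requested variable as "NDVI"
# with horizontal dimensions named "lat" and "lon" (from addLatLon=true)
ndvi_mean = data["NDVI"].mean(dim=("lat", "lon"))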
For a JSON response example, let’s get the water level recordings of NOAA’s water level station 8534720 (Atlantic City, NJ) during 2012, using the CO-OPS API. Note that this CO-OPS product has a 31-day limit for a single request, so we have to break the request down accordingly.
import pandas as pd

station_id = "8534720"
start = pd.to_datetime("2012-01-01")
end = pd.to_datetime("2012-12-31")

# Break the year into month-long chunks to respect the 31-day limit
s = start
dates = []
for e in pd.date_range(start, end, freq="M"):
    dates.append((s.date(), e.date()))
    s = e + pd.offsets.MonthBegin()

url = "https://api.tidesandcurrents.noaa.gov/api/prod/datagetter"
urls, kwds = zip(
    *[
        (
            url,
            {
                "params": {
                    "product": "water_level",
                    "application": "web_services",
                    "begin_date": s.strftime("%Y%m%d"),
                    "end_date": e.strftime("%Y%m%d"),
                    "datum": "MSL",
                    "station": station_id,
                    "time_zone": "GMT",
                    "units": "metric",
                    "format": "json",
                }
            },
        )
        for s, e in dates
    ]
)
# cache_name overrides the default cache file location
resp = ar.retrieve(urls, read="json", request_kwds=kwds, cache_name="~/.cache/async.sqlite")
# Stitch the monthly chunks into a single time-indexed frame
wl_list = []
for rjson in resp:
    wl = pd.DataFrame.from_dict(rjson["data"])
    wl["t"] = pd.to_datetime(wl.t)
    wl = wl.set_index(wl.t).drop(columns="t")
    wl["v"] = pd.to_numeric(wl.v, errors="coerce")
    wl_list.append(wl)
water_level = pd.concat(wl_list).sort_index()
# Keep the station metadata from the (last) response
water_level.attrs = rjson["metadata"]
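The combined water_level frame has a datetime index and a numeric v column, so ordinary pandas operations apply. As a small illustrative follow-up (not part of the original example), daily means can be computed with:

# Daily mean of the observed water levels
daily_mean = water_level["v"].resample("D").mean()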
Now, let’s see an example without any payload or headers. Here’s how we can retrieve harmonic constituents of several NOAA stations from CO-OPS:
stations = [
    "8410140",
    "8411060",
    "8413320",
    "8418150",
    "8419317",
    "8419870",
    "8443970",
    "8447386",
]
base_url = "https://api.tidesandcurrents.noaa.gov/mdapi/prod/webapi/stations"
urls = [f"{base_url}/{i}/harcon.json?units=metric" for i in stations]
resp = ar.retrieve(urls, "json")
amp_list = []
phs_list = []
for rjson in resp:
    # The station ID is the second-to-last component of the "self" URL
    sid = rjson["self"].rsplit("/", 2)[1]
    const = pd.DataFrame.from_dict(rjson["HarmonicConstituents"]).set_index("name")
    amp = const.rename(columns={"amplitude": sid})[sid]
    phase = const.rename(columns={"phase_GMT": sid})[sid]
    amp_list.append(amp)
    phs_list.append(phase)
# One column per station, indexed by constituent name
amp = pd.concat(amp_list, axis=1)
phs = pd.concat(phs_list, axis=1)
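As an illustrative follow-up (not part of the original example), the two tables can be combined into one frame with a two-level column index:

# Hypothetical combination of amplitudes and phases per station
harcon = pd.concat({"amplitude": amp, "phase_GMT": phs}, axis=1)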
Contributing
Contributions are appreciated and very welcome. Please read CONTRIBUTING.rst for instructions.