
Geo-Spatial Data Fetching


Fetchez

🐄🌍 [ F E T C H E Z ] 🌍🐄

The Geospatial Logistics & ETL Platform


Fetchez is a lightweight, modular and highly extendable Python library and command-line tool designed to discover and retrieve geospatial data from a wide variety of public repositories. Originally part of the CUDEM project, Fetchez is now a standalone tool capable of retrieving Bathymetry, Topography, Imagery, and Oceanographic data (and more!) from sources like NOAA, USGS, NASA, and the European Space Agency.


❓ Why Fetchez?

Because finding geospatial data is the hardest part of the job.

If you work with geospatial data, you know the pain:

"Where is the latest 1-meter DEM for Seattle?"

"Did the NOAA API endpoint change again?"

"How do I script a download for 5,000 files from a map viewer that only has a 'Download' button?"

Fetchez solves the "Logistics Gap."

It abstracts away the messy reality of 50+ different public repositories (USGS, NOAA, NASA, ESA) into a single, consistent interface. You ask for "Bathymetry in the Gulf of Mexico," and Fetchez handles the API keys, pagination, retries, and file management, then delivers clean, standardized files to your hard drive so you can get back to the actual science.

🌎 Features

  • One command to fetch data from 50+ different modules (SRTM, GMRT, NOAA NOS, USGS 3DEP, Copernicus, etc.).
  • Built-in download management handles retries, resume-on-failure, authentication, and mirror switching automatically.
  • Seamlessly mix disparate data types (e.g., fetch Stream Gauges (JSON), DEMs (GeoTIFF), and Coastlines (Shapefile) in one project).
  • Define automated workflows (e.g., download -> unzip -> reproject -> grid) using Python-based Processing Hooks.
  • Save complex processing chains (Presets) as simple reusable flags (e.g., fetchez ... --run-through-waffles).
  • Includes "FRED" (Fetchez Remote Elevation Datalist) to index and query remote or local files spatially without hitting slow APIs or maintaining a database.
  • Minimal dependencies (requests, tqdm, lxml). Optional shapely support for precise spatial filtering.
  • Supports user-defined Data Modules and Processing Hooks via ~/.fetchez/.

🧩 Where does Fetchez fit?

The geospatial ecosystem is full of powerful processing engines, translators, transformers, and converters, but they all assume you already have the data ready to use. Fetchez fills the gap between the internet, your hard drive, and your workflow.

In short: Use Fetchez to get the data so you can crunch the data.

📦 Installation

From pip/PyPI

pip install fetchez

From Source:

Download and install git (if you have not already): git installation

pip install git+https://github.com/ciresdem/fetchez.git#egg=fetchez

Clone and install from source

git clone https://github.com/ciresdem/fetchez.git
cd fetchez
pip install .

💻 CLI Usage

The primary command is fetchez.

Basic Syntax

fetchez -R <region> <module> [options]

Examples

  • Fetch SRTM+ Data for a Bounding Box
# Region Format: West/East/South/North
fetchez -R -105.5/-104.5/39.5/40.5 srtm_plus
  • Discover Data Sources
# View detailed metadata card for a module
fetchez --info gmrt
  • Fetch Data Using a Place Name
# Automatically resolves "Boulder, CO" to a bounding box region
fetchez -R loc:"Boulder, CO" copernicus --datatype=1
  • Advanced Data Pipelines (Hooks)
# Fetch data, automatically unzip it, and print the final filepath
fetchez -R loc:Miami charts --hook unzip --pipe-path
  • List Available Modules
fetchez --modules
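
The region string in the examples above uses West/East/South/North order, which is easy to get wrong when building commands from a script. The following is a minimal sketch of formatting that string from Python. `region_arg` and `build_command` are hypothetical helper names, not part of Fetchez, and only the `-R <region> <module>` syntax shown above is assumed.

```python
import shlex


def region_arg(west, east, south, north):
    """Format a bounding box as the West/East/South/North string used by -R."""
    # Simple sanity check for this sketch only.
    if south >= north:
        raise ValueError("expected south < north")
    return f"{west}/{east}/{south}/{north}"


def build_command(module, west, east, south, north):
    """Return the argument list for: fetchez -R <region> <module>."""
    return ["fetchez", "-R", region_arg(west, east, south, north), module]


cmd = build_command("srtm_plus", -105.5, -104.5, 39.5, 40.5)
print(shlex.join(cmd))
```

With Fetchez installed, the resulting list can be handed to `subprocess.run(cmd, check=True)`.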

🐍 Python API

Fetchez is designed to be easily integrated into Python workflows.

Simple Fetching

from fetchez.registry import FetchezRegistry

# Define a region (West, East, South, North)
bbox = (-105.5, -104.5, 39.5, 40.5)

# Load a specific fetcher module dynamically through the registry
SRTM = FetchezRegistry.load_module('srtm_plus')

# Configure and Run
fetcher = SRTM(src_region=bbox, verbose=True)
fetcher.run()

# Access Results (Metadata)
for result in fetcher.results:
    print(f"Downloaded: {result['dst_fn']}")
    print(f"Source URL: {result['url']}")
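
After a run it can be useful to check which of the reported files actually landed on disk. The sketch below assumes each result entry is a dictionary with the `dst_fn` and `url` keys used in the loop above. The helper name `split_by_presence` and the sample entries are made up for illustration.

```python
import os
import tempfile


def split_by_presence(results):
    """Separate result entries into files that exist locally and files that do not."""
    present, missing = [], []
    for entry in results:
        target = present if os.path.isfile(entry["dst_fn"]) else missing
        target.append(entry)
    return present, missing


# Hypothetical entries shaped like the ones printed above.
with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "tile_a.tif")
    open(real, "wb").close()  # create an empty stand-in file
    sample = [
        {"dst_fn": real, "url": "https://example.com/tile_a.tif"},
        {"dst_fn": os.path.join(tmp, "tile_b.tif"), "url": "https://example.com/tile_b.tif"},
    ]
    present, missing = split_by_presence(sample)
    print(f"{len(present)} on disk, {len(missing)} missing")
    for entry in missing:
        print(f"Retry candidate: {entry['url']}")
```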

Data Discovery

Query the registry to find datasets that match your criteria programmatically.

from fetchez.registry import FetchezRegistry

# Search for global bathymetry datasets
matches = FetchezRegistry.search_modules('bathymetry')
print(f"Found modules: {matches}")

# Get details for a specific module
meta = FetchezRegistry.get_info('copernicus')
print(f"Resolution: {meta.get('resolution')}")
print(f"License: {meta.get('license')}")

For modules that rely on file lists (like Copernicus or NCEI), you can interact directly with the local index.

from fetchez import fred

# Load the local index
index = fred.FRED(name='copernicus')

# Search for datasets in a region
results = index.search(
    region=(-10, 10, 40, 50),
    where=["DataType = '3'"] # Filter for COP-10 (European) data
)

print(f"Found {len(results)} datasets.")
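
Conceptually, a spatial query like the one above keeps the index entries whose footprint overlaps the requested region. The following sketch shows that overlap test for boxes in (West, East, South, North) order. It is not FRED's implementation, and the footprints listed are invented for the example.

```python
def regions_intersect(a, b):
    """Return True when two (west, east, south, north) boxes overlap or touch."""
    a_w, a_e, a_s, a_n = a
    b_w, b_e, b_s, b_n = b
    return a_w <= b_e and b_w <= a_e and a_s <= b_n and b_s <= a_n


# Hypothetical index entries: (name, footprint)
footprints = [
    ("tile_north_sea", (0.0, 5.0, 52.0, 56.0)),
    ("tile_biscay", (-8.0, -2.0, 43.0, 47.0)),
    ("tile_iceland", (-25.0, -13.0, 63.0, 67.0)),
]

query = (-10, 10, 40, 50)
hits = [name for name, box in footprints if regions_intersect(query, box)]
print(hits)
```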

🪝 Processing Hooks

Fetchez includes a powerful Hook System that allows you to chain actions together. Hooks run in a pipeline, meaning the output of one hook (e.g. unzipping a file) becomes the input for the next (e.g. processing it).

Common Built-in Hooks:

  • unzip: Automatically extracts .zip files.

  • pipe: Prints the final absolute path to stdout (useful for piping to GDAL/PDAL).

Example:

# Download data.zip
# Extract data.tif (via unzip hook)
# Print /path/to/data.tif (via pipe hook)
fetchez charts --hook unzip --hook pipe

You can write your own custom hooks (e.g., to log downloads to a database or trigger a script) and drop them in ~/.fetchez/hooks/. See CONTRIBUTING.md for details.
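
The pipeline idea, where each hook receives the previous hook's output, can be pictured with plain functions. This is a conceptual sketch only: the function names and the list-of-paths interface are assumptions and do not reflect the actual Fetchez hook API, which is documented in CONTRIBUTING.md.

```python
import os


def unzip_hook(paths):
    """Stand-in for an extraction step: pretend each .zip yields a .tif."""
    return [os.path.splitext(p)[0] + ".tif" if p.endswith(".zip") else p for p in paths]


def pipe_hook(paths):
    """Stand-in for a reporting step: print absolute paths and pass them on."""
    absolute = [os.path.abspath(p) for p in paths]
    for p in absolute:
        print(p)
    return absolute


def run_pipeline(paths, hooks):
    """Feed the output of each hook into the next one, in order."""
    for hook in hooks:
        paths = hook(paths)
    return paths


final = run_pipeline(["data.zip"], [unzip_hook, pipe_hook])
```

Because every step takes and returns the same kind of value, steps can be reordered or added without changing the runner.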

🔗 Pipeline Presets (Macros)

Tired of typing the same chain of hooks every time? Presets allow you to define reusable workflow macros.

Instead of running this long command:

fetchez copernicus --hook checksum:algo=sha256 --hook enrich --hook audit:file=log.json

You can simply run:

fetchez copernicus --audit-full

How to use them

Fetchez comes with a few built-in shortcuts (check fetchez --help to see them), but the real power comes from defining your own.

  • Initialize your config: Run this command to generate a starter configuration file at ~/.fetchez/presets.json:
fetchez --init-presets
  • Define your workflow: Edit the JSON file to create a named preset. A preset is just a list of hooks with arguments.
"my-clean-workflow": {
  "help": "Unzip files and immediately remove the zip archive.",
  "hooks": [
    {"name": "unzip", "args": {"remove": "true"}},
    {"name": "pipe"}
  ]
}
  • Run it: Your new preset automatically appears as a CLI flag!
fetchez charts --my-clean-workflow
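
To see how a preset relates to the long-form command, the sketch below expands one preset entry into `--hook` arguments using the `name:key=value` form shown earlier. `preset_to_flags` is a hypothetical helper, and joining several arguments with commas is an assumption, since only single-argument hooks appear above.

```python
import json


def preset_to_flags(preset):
    """Expand one preset entry into a list of --hook CLI arguments."""
    flags = []
    for hook in preset["hooks"]:
        spec = hook["name"]
        args = hook.get("args", {})
        if args:
            # Assumption: several arguments are joined with commas.
            spec += ":" + ",".join(f"{key}={value}" for key, value in args.items())
        flags += ["--hook", spec]
    return flags


preset = json.loads("""
{
  "help": "Unzip files and immediately remove the zip archive.",
  "hooks": [
    {"name": "unzip", "args": {"remove": "true"}},
    {"name": "pipe"}
  ]
}
""")

print(" ".join(["fetchez", "charts"] + preset_to_flags(preset)))
```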

🗺️ Supported Data Sources

Fetchez supports over 50 modules categorized by data type. Run fetchez --modules to see the full list.

Category     | Example Modules
Topography   | srtm_plus, copernicus, nasadem, tnm (USGS), arcticdem
Bathymetry   | gmrt, emodnet, gebco, multibeam, nos_hydro
Oceanography | tides, buoys, mur_sst
Reference    | osm (OpenStreetMap), vdatum
Generic      | http (Direct URL), earthdata (NASA)

🛟 Module-Specific Dependencies

Fetchez is designed to be lightweight. The core installation only includes what is strictly necessary to run the engine.

However, some data modules require extra libraries to function (e.g., boto3 for AWS data, pyshp for Shapefiles). You can install these "Extras" automatically using pip:

# Install support for AWS-based modules (BlueTopo, etc.)
pip install "fetchez[aws]"

# Install support for Vector processing (Shapefiles, etc.)
pip install "fetchez[vector]"

# Install ALL optional dependencies
pip install "fetchez[full]"

If you try to run a module without its required dependency, fetchez will exit with a helpful error message telling you exactly which extra group to install.
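
In your own scripts you can check for an optional library before calling a module that needs it. The sketch below uses the standard library's `importlib.util.find_spec`. The mapping from import name to extras group is an assumption based on the examples above (`shapefile` is the import name of pyshp), and `missing_extra_hint` is a hypothetical helper, not a Fetchez function.

```python
import importlib.util

# Assumed mapping of import name -> extras group, based on the examples above.
EXTRAS = {"boto3": "aws", "shapefile": "vector"}


def missing_extra_hint(import_name):
    """Return an install hint if the module is not importable, else None."""
    if importlib.util.find_spec(import_name) is not None:
        return None
    extra = EXTRAS.get(import_name, "full")
    return f'Missing "{import_name}". Try: pip install "fetchez[{extra}]"'


for name in EXTRAS:
    hint = missing_extra_hint(name)
    print(hint or f"{name} is available")
```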

🐄 Plugins, Hooks & Extensions

Need to fetch data from a specialized local server? Or maybe run a custom script immediately after every download? You don't need to fork the repo!

Fetchez is designed to be extendable in two ways:

  • Data Modules (~/.fetchez/plugins/): Add new data sources or APIs.

  • Processing Hooks (~/.fetchez/hooks/): Add new post-processing steps (unzip, convert, log).

Drop your Python scripts into these configuration folders, and they will be automatically registered as commands.

Quick Start:

  1. Create the folder: mkdir -p ~/.fetchez/plugins
  2. Drop a Python script there (e.g., my_data.py).
  3. Run it: fetchez my_data

See CONTRIBUTING.md for a full code example.
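
For readers curious how "drop a script in a folder" style registration can work in general, here is a minimal sketch using the standard library's `importlib`. It is not Fetchez's loader: the `load_plugins` function, the `DESCRIPTION` attribute, and the temporary folder are all illustrative assumptions, and what a real plugin must define is covered in CONTRIBUTING.md.

```python
import importlib.util
import pathlib
import tempfile


def load_plugins(folder):
    """Import every .py file in a folder and return {module_name: module}."""
    plugins = {}
    for path in sorted(pathlib.Path(folder).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the script's top-level code
        plugins[path.stem] = module
    return plugins


# Demonstrate with a throwaway folder instead of ~/.fetchez/plugins.
with tempfile.TemporaryDirectory() as tmp:
    script = pathlib.Path(tmp) / "my_data.py"
    script.write_text("DESCRIPTION = 'example plugin'\n")
    found = load_plugins(tmp)
    print(sorted(found))
    print(found["my_data"].DESCRIPTION)
```

The file name (without `.py`) becomes the registered name here, which mirrors how `my_data.py` is invoked as `fetchez my_data` in the quick start.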

🛠 Contributing

We welcome contributions! Please see CONTRIBUTING.md for details on how to register new modules or hooks with our metadata schema.

🔱 Disclaimer on Data Persistence

We provide the tools to locate and download data from authoritative public repositories, but we do not host the data ourselves.

Government agencies reorganize websites, migrate APIs (e.g., WCS 1.0 to 2.0), or decommission servers without notice. A module that fetches perfectly today may encounter a 404 tomorrow.

Source datasets are frequently updated, reprocessed, or removed by their custodians. The "best available" data for a region can change overnight.

Remote servers (like NOAA NCEI, USGS, or Copernicus) may experience downtime, throttling, or rate limits that are entirely outside our control.

We strive to keep our modules robust and our index fresh. If you encounter a broken fetch or a changed endpoint, please open an issue. This helps the whole community keep up with the changes!

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Copyright (c) 2010-2026 Regents of the University of Colorado
