Geo-Spatial Data Fetching
Fetchez
🐄🌍 [ F E T C H E Z ] 🌍🐄
The Geospatial Logistics & ETL Platform
Fetchez is a lightweight, modular and highly extendable Python library and command-line tool designed to discover and retrieve geospatial data from a wide variety of public repositories. Originally part of the CUDEM project, Fetchez is now a standalone tool capable of retrieving Bathymetry, Topography, Imagery, and Oceanographic data (and more!) from sources like NOAA, USGS, NASA, and the European Space Agency.
❓ Why Fetchez?
Because finding geospatial data is the hardest part of the job.
If you work with geospatial data, you know the pain:
"Where is the latest 1-meter DEM for Seattle?"
"Did the NOAA API endpoint change again?"
"How do I script a download for 5,000 files from a map viewer that only has a 'Download' button?"
Fetchez solves the "Logistics Gap."
It abstracts away the messy reality of 50+ different public repositories (USGS, NOAA, NASA, ESA) into a single, consistent interface. You ask for "Bathymetry in the Gulf of Mexico," and Fetchez handles the API keys, pagination, retries, and file management—delivering clean, standardized files to your hard drive so you can get back to the actual science.
🌎 Features
- One command to fetch data from 50+ different modules (SRTM, GMRT, NOAA NOS, USGS 3DEP, Copernicus, etc.).
- Built-in download management handles retries, resume-on-failure, authentication, and mirror switching automatically.
- Seamlessly mix disparate data types (e.g., fetch Stream Gauges (JSON), DEMs (GeoTIFF), and Coastlines (Shapefile) in one project).
- Define automated workflows (e.g., download -> unzip -> reproject -> grid) using Python-based Processing Hooks.
- Save complex processing chains (Presets) as simple reusable flags (e.g., fetchez ... --run-through-waffles).
- Includes "FRED" (Fetchez Remote Elevation Datalist) to index and query remote or local files spatially without hitting slow APIs or maintaining a database.
- Minimal dependencies (requests, tqdm, lxml). Optional shapely support for precise spatial filtering.
- Supports user-defined Data Modules and Processing Hooks via ~/.fetchez/.
🧩 Where does Fetchez fit?
The geospatial ecosystem is full of powerful processing engines, translators, transformers, and converters, but they all assume you already have the data ready to use. Fetchez fills the gap between the internet, your hard drive, and your workflow.
In short: Use Fetchez to get the data so you can crunch the data.
📦 Installation
From pip/PyPI
pip install fetchez
From Source:
Install git first if you have not already, then install directly from the repository:
pip install git+https://github.com/ciresdem/fetchez.git#egg=fetchez
Clone and install from source
git clone https://github.com/ciresdem/fetchez.git
cd fetchez
pip install .
💻 CLI Usage
The primary command is fetchez.
Basic Syntax
fetchez -R <region> <module> [options]
Examples
- Fetch SRTM+ Data for a Bounding Box
# Region Format: West/East/South/North
fetchez -R -105.5/-104.5/39.5/40.5 srtm_plus
- Discover Data Sources
# View detailed metadata card for a module
fetchez --info gmrt
- Fetch Data Using a Place Name
# Automatically resolves "Boulder, CO" to a bounding box region
fetchez -R loc:"Boulder, CO" copernicus --datatype=1
- Advanced Data Pipelines (Hooks)
# Fetch data, automatically unzip it, and print the final filepath
fetchez -R loc:Miami charts --hook unzip --pipe-path
- List Available Modules
fetchez --modules
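The West/East/South/North region string passed to -R in the examples above is easy to build or validate in a script before calling the CLI. The following is a minimal sketch, not Fetchez's own parser: `parse_region` is a hypothetical helper, and it assumes a simple region that does not cross the antimeridian.

```python
from typing import Tuple

Region = Tuple[float, float, float, float]


def parse_region(text: str) -> Region:
    """Parse a 'West/East/South/North' string into a tuple of floats.

    Hypothetical helper for illustration; it is not part of Fetchez.
    """
    parts = text.split("/")
    if len(parts) != 4:
        raise ValueError(f"expected 4 values separated by '/', got {len(parts)}")
    west, east, south, north = (float(p) for p in parts)
    # Assumption: the region does not cross the antimeridian.
    if west >= east:
        raise ValueError("west must be less than east")
    if south >= north:
        raise ValueError("south must be less than north")
    return (west, east, south, north)


print(parse_region("-105.5/-104.5/39.5/40.5"))
```

Catching a swapped East/West or a missing value up front avoids sending a malformed region to a remote service.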
🐍 Python API
Fetchez is designed to be easily integrated into Python workflows.
Simple Fetching
import fetchez
# Define a region (West, East, South, North)
bbox = (-105.5, -104.5, 39.5, 40.5)
# Initialize a specific fetcher module
# Use the registry to load modules dynamically
SRTM = fetchez.registry.FetchezRegistry.load_module('srtm_plus')
# Configure and Run
fetcher = SRTM(src_region=bbox, verbose=True)
fetcher.run()
# Access Results (Metadata)
for result in fetcher.results:
    print(f"Downloaded: {result['dst_fn']}")
    print(f"Source URL: {result['url']}")
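Because each result shown above is a dictionary with `dst_fn` and `url` keys, the results can be post-processed with plain Python. As a rough sketch, the hypothetical `build_manifest` function below groups downloaded files by source host. The sample list stands in for `fetcher.results` and its URLs are made up.

```python
import json
from urllib.parse import urlparse


def build_manifest(results):
    """Group downloaded file paths by the host they were fetched from."""
    manifest = {}
    for result in results:
        host = urlparse(result["url"]).netloc
        manifest.setdefault(host, []).append(result["dst_fn"])
    return manifest


# Hypothetical stand-in for `fetcher.results`; only the 'url' and 'dst_fn'
# keys shown in the example above are assumed to exist.
sample_results = [
    {"url": "https://example.com/tiles/a.tif", "dst_fn": "a.tif"},
    {"url": "https://example.com/tiles/b.tif", "dst_fn": "b.tif"},
    {"url": "https://example.org/c.zip", "dst_fn": "c.zip"},
]

print(json.dumps(build_manifest(sample_results), indent=2))
```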
Data Discovery
Query the registry to find datasets that match your criteria programmatically.
from fetchez.registry import FetchezRegistry
# Search for global bathymetry datasets
matches = FetchezRegistry.search_modules('bathymetry')
print(f"Found modules: {matches}")
# Get details for a specific module
meta = FetchezRegistry.get_info('copernicus')
print(f"Resolution: {meta.get('resolution')}")
print(f"License: {meta.get('license')}")
For modules that rely on file lists (like Copernicus or NCEI), you can interact directly with the local index.
from fetchez import fred
# Load the local index
index = fred.FRED(name='copernicus')
# Search for datasets in a region
results = index.search(
    region=(-10, 10, 40, 50),
    where=["DataType = '3'"]  # Filter for COP-10 (European) data
)
print(f"Found {len(results)} datasets.")
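Conceptually, a spatial query like the one above comes down to a bounding-box intersection test against each indexed footprint. The sketch below illustrates that idea only. It is not how FRED is implemented, and the entry layout (`name`, `region`) is an assumption made for the example.

```python
def regions_intersect(a, b):
    """Return True if two (west, east, south, north) boxes overlap or touch."""
    a_w, a_e, a_s, a_n = a
    b_w, b_e, b_s, b_n = b
    return a_w <= b_e and b_w <= a_e and a_s <= b_n and b_s <= a_n


def search(entries, region):
    """Keep only the entries whose footprint intersects the query region."""
    return [e for e in entries if regions_intersect(e["region"], region)]


# Hypothetical index entries; a real index stores more metadata per file.
entries = [
    {"name": "tile_a", "region": (0, 5, 45, 48)},
    {"name": "tile_b", "region": (20, 30, 40, 50)},
    {"name": "tile_c", "region": (-20, -12, 30, 35)},
]

hits = search(entries, (-10, 10, 40, 50))
print([e["name"] for e in hits])
```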
🪝 Processing Hooks
Fetchez includes a powerful Hook System that allows you to chain actions together. Hooks run in a pipeline, meaning the output of one hook (e.g. unzipping a file) becomes the input for the next (e.g. processing it).
Common Built-in Hooks:
- unzip: Automatically extracts .zip files.
- pipe: Prints the final absolute path to stdout (useful for piping to GDAL/PDAL).
Example:
# Download data.zip
# Extract data.tif (via unzip hook)
# Print /path/to/data.tif (via pipe hook)
fetchez charts --hook unzip --hook pipe
You can write your own custom hooks (e.g., to log downloads to a database or trigger a script) and drop them in ~/.fetchez/hooks/. See CONTRIBUTING.md for details.
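To make the pipeline idea concrete, here is a generic sketch of chained hooks in which each step receives the previous step's file list. It does not use the Fetchez hook API (see CONTRIBUTING.md for that). The names `unzip_hook`, `print_path_hook`, and `run_hooks` are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def unzip_hook(paths):
    """Replace each .zip path with the paths of its extracted members."""
    out = []
    for path in paths:
        if path.suffix == ".zip":
            with zipfile.ZipFile(path) as archive:
                archive.extractall(path.parent)
                out.extend(path.parent / name for name in archive.namelist())
        else:
            out.append(path)
    return out


def print_path_hook(paths):
    """Print the absolute path of every file and pass the list through."""
    for path in paths:
        print(path.resolve())
    return paths


def run_hooks(paths, hooks):
    """Feed the output of each hook into the next one."""
    for hook in hooks:
        paths = hook(paths)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    # Build a small archive to stand in for a downloaded data.zip.
    archive_path = Path(tmp) / "data.zip"
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("data.tif", b"placeholder bytes")
    final_paths = run_hooks([archive_path], [unzip_hook, print_path_hook])
    final_names = [p.name for p in final_paths]
```

Keeping every hook to the same "list of paths in, list of paths out" shape is what lets the steps be reordered or combined freely.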
🔗 Pipeline Presets (Macros)
Tired of typing the same chain of hooks every time? Presets allow you to define reusable workflow macros.
Instead of running this long command:
fetchez copernicus --hook checksum:algo=sha256 --hook enrich --hook audit:file=log.json
You can simply run:
fetchez copernicus --audit-full
How to use them
Fetchez comes with a few built-in shortcuts (check fetchez --help to see them), but the real power comes from defining your own.
- Initialize your config: Run this command to generate a starter configuration file at ~/.fetchez/presets.json:
fetchez --init-presets
- Define your workflow: Edit the JSON file to create a named preset. A preset is just a list of hooks with arguments.
"my-clean-workflow": {
    "help": "Unzip files and immediately remove the zip archive.",
    "hooks": [
        {"name": "unzip", "args": {"remove": "true"}},
        {"name": "pipe"}
    ]
}
- Run it: Your new preset automatically appears as a CLI flag!
fetchez charts --my-clean-workflow
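One way to picture a preset is as shorthand for the equivalent chain of --hook flags. The sketch below expands the preset above into that longer form. It rests on two assumptions that the text does not confirm: that preset names sit at the top level of the JSON object, and that multiple hook arguments would be comma-separated after the colon. `expand_preset` is a hypothetical helper, not a Fetchez function.

```python
import json

# Assumption: preset names are top-level keys; the real presets.json
# generated by `fetchez --init-presets` may nest them differently.
PRESETS_JSON = """
{
  "my-clean-workflow": {
    "help": "Unzip files and immediately remove the zip archive.",
    "hooks": [
      {"name": "unzip", "args": {"remove": "true"}},
      {"name": "pipe"}
    ]
  }
}
"""


def expand_preset(presets, name):
    """Turn a preset definition into a list of '--hook name:key=value' args."""
    argv = []
    for hook in presets[name]["hooks"]:
        spec = hook["name"]
        args = hook.get("args", {})
        if args:
            # Assumption: multiple arguments are joined with commas.
            spec += ":" + ",".join(f"{k}={v}" for k, v in args.items())
        argv += ["--hook", spec]
    return argv


presets = json.loads(PRESETS_JSON)
print(expand_preset(presets, "my-clean-workflow"))
```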
🗺️ Supported Data Sources
Fetchez supports over 50 modules categorized by data type. Run fetchez --modules to see the full list.
| Category | Example Modules |
|---|---|
| Topography | srtm_plus, copernicus, nasadem, tnm (USGS), arcticdem |
| Bathymetry | gmrt, emodnet, gebco, multibeam, nos_hydro |
| Oceanography | tides, buoys, mur_sst |
| Reference | osm (OpenStreetMap), vdatum |
| Generic | http (Direct URL), earthdata (NASA) |
🛟 Module-Specific Dependencies
Fetchez is designed to be lightweight. The core installation only includes what is strictly necessary to run the engine.
However, some data modules require extra libraries to function (e.g., boto3 for AWS data, pyshp for Shapefiles). You can install these "Extras" automatically using pip:
# Install support for AWS-based modules (BlueTopo, etc.)
pip install "fetchez[aws]"
# Install support for Vector processing (Shapefiles, etc.)
pip install "fetchez[vector]"
# Install ALL optional dependencies
pip install "fetchez[full]"
If you try to run a module without its required dependency, fetchez will exit with a helpful error message telling you exactly which extra group to install.
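The behaviour described above follows a common optional-import pattern: try the import when the module is first needed, and turn a failure into an actionable message. The following is a generic sketch of that pattern, not Fetchez's actual code. `require_extra` is a hypothetical helper and the message wording is invented.

```python
import importlib


def require_extra(module_name, extra):
    """Import an optional dependency or explain which extra provides it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this module. "
            f'Install it with: pip install "fetchez[{extra}]"'
        ) from exc


# A module that is always available imports normally.
json_module = require_extra("json", "full")

# A missing module produces the install hint instead of a bare traceback.
try:
    require_extra("a_module_that_does_not_exist", "aws")
except ImportError as exc:
    message = str(exc)
    print(message)
```

Deferring the import until a module actually needs it is what keeps the core installation small.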
🐄 Plugins, Hooks & Extensions
Need to fetch data from a specialized local server? Or maybe run a custom script immediately after every download? You don't need to fork the repo!
Fetchez is designed to be extendable in two ways:
- Data Modules (~/.fetchez/plugins/): Add new data sources or APIs.
- Processing Hooks (~/.fetchez/hooks/): Add new post-processing steps (unzip, convert, log).
Drop your Python scripts into these configuration folders, and they will be automatically registered as commands.
Quick Start:
- Create the folder:
mkdir -p ~/.fetchez/plugins
- Drop a Python script there (e.g., my_data.py).
- Run it:
fetchez my_data
See CONTRIBUTING.md for a full code example.
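For readers curious how "drop a script in a folder" registration can work in general, the sketch below loads every .py file in a directory with the standard library's importlib. It illustrates the technique only. Fetchez's own discovery logic and the interface a plugin must implement may differ, and `discover_plugins` is a hypothetical name.

```python
import importlib.util
import tempfile
from pathlib import Path


def discover_plugins(folder):
    """Import every .py file in `folder` and return {file stem: module}."""
    plugins = {}
    for path in sorted(Path(folder).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # Runs the script's top-level code.
        plugins[path.stem] = module
    return plugins


with tempfile.TemporaryDirectory() as tmp:
    # A throwaway script stands in for ~/.fetchez/plugins/my_data.py.
    (Path(tmp) / "my_data.py").write_text("DESCRIPTION = 'hypothetical plugin'\n")
    found = discover_plugins(tmp)
    plugin_names = list(found)
    description = found["my_data"].DESCRIPTION

print(plugin_names, description)
```

Because loading a plugin executes its top-level code, only scripts you trust should be placed in such a folder.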
🛠 Contributing
We welcome contributions! Please see CONTRIBUTING.md for details on how to register new modules or hooks with our metadata schema.
🔱 Disclaimer on Data Persistence
We provide the tools to locate and download data from authoritative public repositories, but we do not host the data ourselves.
Government agencies reorganize websites, migrate APIs (e.g., WCS 1.0 to 2.0), or decommission servers without notice. A module that fetches perfectly today may encounter a 404 tomorrow.
Source datasets are frequently updated, reprocessed, or removed by their custodians. The "best available" data for a region can change overnight.
Remote servers (like NOAA NCEI, USGS, or Copernicus) may experience downtime, throttling, or rate limits that are entirely outside our control.
We strive to keep our modules robust and our index fresh. If you encounter a broken fetch or a changed endpoint, please open an issue. This helps the whole community keep up with the changes!
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
Copyright (c) 2010-2026 Regents of the University of Colorado