Machine-readable URL catalog for AEMO NEMWEB (JSON + JSON Schema + Python SDK)
nem-catalog — Machine-readable URL catalog for AEMO NEMWEB
A versioned JSON catalog + JSON Schema that maps (NEMWEB dataset key, time range) → candidate URLs, covering all four NEMWEB repositories (Reports, MMSDM, NEMDE, FCAS_Causer_Pays). Released under MIT (code) and CC0 (catalog data).
Quick start — no install required
curl -s https://zhipenghe.me/nem-catalog/catalog.json \
| jq '.datasets["Reports:DispatchIS_Reports"].tiers.ARCHIVE'
Output:
{
  "path_template": "/Reports/ARCHIVE/DispatchIS_Reports/",
  "filename_template": "PUBLIC_DISPATCHIS_{date}.zip",
  "filename_regex": "^PUBLIC_DISPATCHIS_\\d{8}\\.zip$",
  "example": "PUBLIC_DISPATCHIS_20250407.zip",
  "cadence": "daily_rollup"
}
Build the full URL: https://nemweb.com.au + path_template + filename_template (with {date} substituted as yyyymmdd). Placeholder vocabulary is in the catalog's top-level placeholders field.
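The concatenation rule above can be sketched with the standard library alone. This is a minimal illustration, not part of the SDK: `build_url` and `BASE_URL` are hypothetical names, and the `tier` dict is simply the jq output copied by hand.

```python
from datetime import date

BASE_URL = "https://nemweb.com.au"

# Tier entry copied from the jq output above (only the fields needed here).
tier = {
    "path_template": "/Reports/ARCHIVE/DispatchIS_Reports/",
    "filename_template": "PUBLIC_DISPATCHIS_{date}.zip",
}


def build_url(tier: dict, day: date) -> str:
    """Join base URL, path template and filename, filling {date} as yyyymmdd."""
    filename = tier["filename_template"].format(date=day.strftime("%Y%m%d"))
    return BASE_URL + tier["path_template"] + filename


print(build_url(tier, date(2025, 4, 7)))
# https://nemweb.com.au/Reports/ARCHIVE/DispatchIS_Reports/PUBLIC_DISPATCHIS_20250407.zip
```

The result is a candidate URL only; whether the file exists on NEMWEB still has to be checked by the caller.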
Stability
v0.1 is experimental, and the API may change before v1.0. For reproducible research, pin the catalog version:
catalog = nem_catalog.fetch_latest(catalog_version="2026.04.18")
Python usage
pip install nem-catalog
import nem_catalog
# Primary (library-pure, deterministic):
catalog = nem_catalog.load("catalog.json")
urls = catalog.resolve(
    "Reports:DispatchIS_Reports",
    from_="2025-04-01",
    to_="2025-04-02",
)
# → list of candidate URLs. Caller is responsible for reachability.
# Convenience (live fetch + cache + fallback):
catalog = nem_catalog.fetch_latest()
# Preview cardinality before materializing:
n = catalog.count("Reports:DispatchIS_Reports", from_="2024-01-01", to_="2024-12-31")
Expected UserWarning: Reports:* datasets with both an ARCHIVE and a rolling CURRENT tier emit a one-line warning when you query historical (ARCHIVE-era) dates. The SDK is telling you the live tier has no data that old, so it routed to ARCHIVE. The returned URLs are correct.
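If the warning is noise in a batch job, Python's standard `warnings` module can silence or record it locally. The sketch below uses a hypothetical stand-in function, `resolve_with_notice`, in place of the real SDK call, and the warning text is invented for illustration. Only the `warnings` calls are the point.

```python
import warnings


def resolve_with_notice() -> list[str]:
    """Hypothetical stand-in for an SDK call that emits the routing warning."""
    warnings.warn("routed to ARCHIVE tier", UserWarning)
    return ["https://nemweb.com.au/Reports/ARCHIVE/DispatchIS_Reports/"
            "PUBLIC_DISPATCHIS_20250407.zip"]


# Option 1: silence UserWarning inside this block only.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    urls = resolve_with_notice()

# Option 2: record warnings so they can be logged or inspected later.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    urls_again = resolve_with_notice()

print(len(urls), [str(w.message) for w in caught])
```

Note that filtering on the `UserWarning` category hides every `UserWarning` raised inside the block, not only this one, so keeping the block small is the safer choice.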
Not every dataset resolves to concrete URLs in v0.1
Coverage in v0.1: roughly 1 in 6 of the 362 dataset keys resolve cleanly today (~16%, mostly Reports:* ARCHIVE tiers). The remaining ~84% raise NonResolvableTemplateError, including almost all MMSDM:* tables (file-sequence suffix {d2}/{nn}) and every live CURRENT tier (16-digit publish ID {aemo_id}).
Per repo: Reports 53/96 (55%), MMSDM 4/259 (~2%), NEMDE 2/6, FCAS_Causer_Pays 0/1. v0.2 will add list_urls() for the non-temporal cases by reading NEMWEB directory listings.
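As a quick arithmetic check, the per-repo counts quoted above add up to the headline figure. The dict below just transcribes those counts; nothing here is read from the catalog itself.

```python
# (resolvable, total) dataset keys per repository, as quoted above.
coverage = {
    "Reports": (53, 96),
    "MMSDM": (4, 259),
    "NEMDE": (2, 6),
    "FCAS_Causer_Pays": (0, 1),
}

resolvable = sum(ok for ok, _ in coverage.values())
total = sum(n for _, n in coverage.values())
share = resolvable / total

print(f"{resolvable}/{total} = {share:.1%}")
# 59/362 = 16.3%
```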
resolve() only returns URLs when the tier's filename template can be built
from a date range alone. AEMO filenames in rolling CURRENT tiers embed a
16-digit publish ID ({aemo_id}), and many MMSDM filenames carry a
file-sequence suffix (e.g. {nn}). The SDK cannot compute either without extra
input, so for those templates resolve() raises NonResolvableTemplateError
rather than returning a broken URL string.
# Raises NonResolvableTemplateError — CURRENT filename has {aemo_id}
catalog.resolve("Reports:DispatchIS_Reports", from_="2026-04-17", to_="2026-04-18")
# Works — ARCHIVE filename is pure temporal
catalog.resolve("Reports:DispatchIS_Reports", from_="2025-04-01", to_="2025-04-02")
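One way to predict which case a template falls into is to list its placeholders and check whether all of them are temporal. This is a sketch of the idea, not the SDK's actual logic: the helper names are hypothetical, treating only `{date}` as temporal is an assumption (the catalog's top-level placeholders field is the authoritative vocabulary), and the second template in the usage lines is invented for illustration.

```python
from string import Formatter

# Assumption: {date} is the only placeholder derivable from a date range.
TEMPORAL_PLACEHOLDERS = {"date"}


def placeholders(template: str) -> set[str]:
    """Return the set of {name} fields that appear in a filename template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def is_date_resolvable(template: str) -> bool:
    """True when every placeholder can be filled from a date alone."""
    return placeholders(template) <= TEMPORAL_PLACEHOLDERS


print(is_date_resolvable("PUBLIC_DISPATCHIS_{date}.zip"))               # True
print(is_date_resolvable("PUBLIC_DISPATCHIS_{date}_{aemo_id}.zip"))     # False
```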
Inspect the raw template for any dataset via catalog.datasets[key]['tiers']
and build the URL yourself, or pin the query to an ARCHIVE-covered date range.
A future release will add an enumeration API for these datasets.
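For an ARCHIVE-covered range, building the URLs by hand might look like the sketch below. It assumes one file per calendar day (suggested by the `daily_rollup` cadence) and treats both endpoints as inclusive, which may differ from how the SDK interprets `from_` and `to_`. `candidate_urls` is a hypothetical helper, and the `tier` dict is copied from the Quick start output with the JSON escapes decoded.

```python
import re
from datetime import date, timedelta

BASE_URL = "https://nemweb.com.au"

tier = {
    "path_template": "/Reports/ARCHIVE/DispatchIS_Reports/",
    "filename_template": "PUBLIC_DISPATCHIS_{date}.zip",
    "filename_regex": r"^PUBLIC_DISPATCHIS_\d{8}\.zip$",
}


def candidate_urls(tier: dict, start: date, end: date):
    """Yield one candidate URL per day in [start, end], both inclusive."""
    pattern = re.compile(tier["filename_regex"])
    day = start
    while day <= end:
        name = tier["filename_template"].format(date=day.strftime("%Y%m%d"))
        # Guard against a template/regex mismatch before emitting a URL.
        if not pattern.match(name):
            raise ValueError(f"{name!r} does not match filename_regex")
        yield BASE_URL + tier["path_template"] + name
        day += timedelta(days=1)


urls = list(candidate_urls(tier, date(2025, 4, 1), date(2025, 4, 2)))
for url in urls:
    print(url)
```

Checking each generated name against `filename_regex` is cheap and catches a wrong date format before any request is sent. As with resolve(), the URLs are candidates and reachability is left to the caller.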
Not for you if...
- You want a pandas DataFrame of NEMWEB data → use NEMOSIS. It's the production-grade Python pipeline for researchers.
- You want forecast data (pre-dispatch, PASA) → use NEMSEER.
- You want emissions data → use NEMED.
nem-catalog serves the layer below these tools: a shared metadata + canonical JSON shape describing NEMWEB's URL grammar. Non-Python consumers (R, Julia, shell) can use the JSON directly without installing anything.
Shell cookbook (R/Julia/shell users)
See docs/cookbook.md for recipes including URL expansion, date iteration, and parallel download with xargs.
How it's built
See docs/architecture.md. Briefly: extract_patterns.py mirrors NEMWEB directory listings and derives URL patterns, then a hybrid auto+curated merge produces catalog.json. A weekly GitHub Actions workflow runs the whole pipeline and opens a PR when the output changes.
Contributing
See CONTRIBUTING.md.
License
- Code: MIT. See LICENSE.
- Catalog JSON: CC0 (public domain).