Python client for the eolas.fyi statistical data API (NZ, Australia, OECD)
eolas-data
Python client for the eolas.fyi statistical data API — 1,400+ official New Zealand statistical & geospatial datasets, plus OECD data for international comparisons, served as tidy pandas DataFrames (or polars / geopandas if you prefer).
Coverage today is New Zealand plus OECD. Australian sources are on the roadmap but not yet available; the OECD data already covers Australia (and other OECD members) for cross-country comparisons.
pip install eolas-data
Quickstart
from eolas_data import Client
client = Client("your_api_key") # or set EOLAS_API_KEY in env
# CPI index (monthly, RBNZ M1) — the usual Treasury/analyst choice
cpi = client.rbnz("rbnz_m1_prices", start="2020-01-01")
# OECD macro indicators (quarterly YoY % — not CPI index levels)
inflation = client.oecd("nz_cpi", start="2020-01-01")
gdp = client.oecd("nz_gdp_growth")
# Discovery
all_datasets = client.list()
nz_only = client.list("Stats NZ")
client.search("cpi") # expands aliases; surfaces rbnz_m1_prices before nz_cpi
meta = client.info("rbnz_m1_prices")
Get an API key at https://eolas.fyi/signup. Free plan is 10 requests/month; Pro ($49/month) is unlimited.
Quick setup (workstation)
Two one-off commands make every future session frictionless:
1. Save your API key to the OS keyring (macOS Keychain / Windows Credential Manager / Linux Secret Service) so Client() finds it automatically — no env var, no pasting:
pip install 'eolas-data[secure]' # adds the keyring package
eolas auth save-key # interactive prompt
from eolas_data import Client
client = Client() # key read from OS keyring automatically
2. Set a library directory so downloaded bulk files land somewhere permanent instead of the transient ~/.cache/eolas/ OS cache:
eolas library set ~/eolas-library # writes to ~/.eolas/config.json
Or set the env var instead (useful for CI / Docker):
export EOLAS_LIBRARY=~/eolas-library
After setting the library, client.get_local("nz_parcels") will use ~/eolas-library/ automatically.
The keyring slot and config file are shared with the R eolas client — a key saved from Python is immediately readable from R and vice versa (see the R client README).
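The way a library directory might be resolved can be pictured with a short sketch. This is illustrative only, not the client's actual code: the precedence (environment variable first, then config file, then the OS cache), the `library` key inside config.json, and the helper name `resolve_library_dir` are all assumptions.

```python
import json
import os
from pathlib import Path


def resolve_library_dir(env=None, config_path=None):
    """Return the directory bulk files would land in (hypothetical helper)."""
    env = os.environ if env is None else env
    config_path = Path(config_path or Path.home() / ".eolas" / "config.json")

    # 1. Assumed: EOLAS_LIBRARY wins when set (handy for CI / Docker).
    if env.get("EOLAS_LIBRARY"):
        return Path(env["EOLAS_LIBRARY"]).expanduser()

    # 2. Assumed: otherwise read a "library" key from the config file.
    if config_path.is_file():
        try:
            library = json.loads(config_path.read_text()).get("library")
        except (json.JSONDecodeError, AttributeError):
            library = None
        if library:
            return Path(library).expanduser()

    # 3. Fall back to the transient OS cache mentioned above.
    return Path.home() / ".cache" / "eolas"
```

Passing `env` and `config_path` explicitly keeps the function easy to exercise in tests without touching the real home directory.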
Command-line interface
pip install 'eolas-data[cli]' adds an eolas command for browsing, fetching, and
scheduling, which is useful for shell scripts, cron jobs, and AI-agent workflows.
Output is Rich tables by default; pass --json for newline-delimited JSON in scripts.
# one-time setup (OS keyring — recommended)
pip install 'eolas-data[secure]'
eolas auth save-key
# or config file (no extra install)
eolas auth set-key
eolas health
# discover
eolas datasets list --source "Stats NZ"
eolas datasets list --search cpi # table + CPI guidance note
eolas datasets list --search cpi --json | jq '.[].name'
eolas datasets info rbnz_m1_prices
eolas datasets preview rbnz_m1_prices --limit 5
# fetch (verb matches the Python lib's client.get())
eolas get rbnz_m1_prices --format csv > cpi.csv
eolas get nz_cpi --start 2020-01-01 --format json | jq '.[].value' # OECD YoY %
eolas get nz_meshblock_2023 --format parquet --out nz_meshblock_2023.parquet
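When the --json output is consumed from Python rather than jq, a tolerant reader is a reasonable starting point. The sketch below assumes the output may arrive either as a single JSON array or as one JSON object per line; `read_records` is a hypothetical helper, not part of eolas-data.

```python
import json


def read_records(text):
    """Parse CLI --json output into a list of records.

    Accepts either a single JSON array or newline-delimited JSON objects.
    """
    text = text.strip()
    if not text:
        return []
    if text.startswith("["):
        # Whole payload is one JSON array.
        return json.loads(text)
    # Otherwise treat every non-empty line as its own JSON document.
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

The input would typically be the captured stdout of a command such as `eolas datasets list --search cpi --json`, for example via `subprocess.run(..., capture_output=True, text=True).stdout`.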
Scheduling
Set up recurring fetches without touching crontab/Task Scheduler syntax. Works on Linux, macOS (cron), and Windows (Task Scheduler).
eolas schedule add nz_cpi --daily --out ~/data/cpi.csv
eolas schedule add nz_gdp_growth --weekly --out ~/data/gdp.csv
eolas schedule add rbnz_b1_exchange_rates_monthly --cron "0 */6 * * *" --out ~/data/fx.csv # POSIX only
eolas schedule list
eolas schedule remove nz_cpi
Daily is the default. A pre-flight check refuses to install a schedule unless your API key is configured; otherwise the job would fail silently forever.
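The --cron value in the example above uses standard five-field cron syntax: `0 */6 * * *` fires at minute 0 of every sixth hour. The sketch below expands a single field to make that concrete. It handles only `*`, `*/n`, single numbers, and comma lists (no ranges or names), and `expand_field` is a hypothetical helper, not something the eolas CLI exposes.

```python
def expand_field(field, low, high):
    """Expand one cron field into the sorted list of values it matches."""
    values = set()
    for part in field.split(","):
        if part == "*":
            values.update(range(low, high + 1))
        elif part.startswith("*/"):
            # "*/n" means every n-th value starting from the lowest one.
            step = int(part[2:])
            values.update(range(low, high + 1, step))
        else:
            values.add(int(part))
    return sorted(values)


minute, hour, *_ = "0 */6 * * *".split()
fire_minutes = expand_field(minute, 0, 59)
fire_hours = expand_field(hour, 0, 23)
print(fire_hours, fire_minutes)  # [0, 6, 12, 18] [0]
```

So the exchange-rate job in the example would run four times a day, at 00:00, 06:00, 12:00, and 18:00 in the scheduler's local time.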
Integrations (Enterprise plan)
Generate ready-to-run connector configs for popular data-pipeline tools — eolas becomes a one-command source for Meltano, Fivetran, or Azure Data Factory.
eolas integrate meltano --datasets nz_cpi,nz_gdp_growth --output ./my-pipeline/
eolas integrate fivetran --datasets nz_cpi
eolas integrate azure-data-factory --datasets nz_cpi,nz_gdp_growth
The generated directory has everything needed to plug into your destination
warehouse: meltano.yml, fivetran.yml, or ADF JSON resources, plus a README.md
walking through the rest of the setup. Non-Enterprise users see a clear
upgrade pointer; the gate is enforced server-side, so it cannot be bypassed from the client.
Exit codes
Distinct exit codes per error class, for shell scripts and agents:
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Generic error |
| 2 | Auth (AuthenticationError, including Enterprise-gate 403) |
| 3 | Rate limit hit |
| 4 | Dataset / resource not found |
| 5 | Other API error |
| 64 | Bad usage (mirrors sysexits.h) |
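A wrapper script can branch on these codes instead of parsing stderr. The sketch below is one possible shape: the code-to-meaning mapping comes from the table above, while the retry policy (retry only on rate limits) and the helper names `classify` and `run_eolas` are assumptions for illustration.

```python
import subprocess

# Exit codes as documented in the table above.
EXIT_MEANINGS = {
    0: "success",
    1: "generic error",
    2: "auth",
    3: "rate limit",
    4: "not found",
    5: "api error",
    64: "bad usage",
}

# Assumption: only a rate-limit failure is worth retrying later.
RETRYABLE = {3}


def classify(returncode):
    """Map an eolas exit code to (meaning, should_retry)."""
    meaning = EXIT_MEANINGS.get(returncode, "unknown")
    return meaning, returncode in RETRYABLE


def run_eolas(args):
    """Run the eolas CLI and classify the result (needs eolas on PATH)."""
    proc = subprocess.run(["eolas", *args], capture_output=True, text=True)
    return proc, classify(proc.returncode)
```

For example, `run_eolas(["get", "nz_cpi", "--format", "csv"])` would return the completed process together with a pair such as `("rate limit", True)` when the quota is exhausted.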
Performance (Arrow)
client.get() transparently negotiates Apache Arrow over the wire — same
DataFrame back, typically 5–10× faster end-to-end on large pulls, with
an automatic JSON fallback. No setup: pyarrow ships with eolas-data, so
this is on by default; format= ("json"/"csv") is only for the rare case
you want the raw text payload.
For a columnar file (CLI), use --format parquet --out FILE; via the REST
API directly, ?format=parquet. Full benchmark: docs.eolas.fyi → Python
reference → Performance.
Bulk downloads — use get_local() for whole datasets
client.get() hits the live /data endpoint (good for slices and small pulls). For whole datasets — especially large or geospatial layers — use get_local(). It syncs a CDN-cached Parquet/GeoParquet file to your library directory and reads from disk on subsequent calls.
# Whole-dataset path: nz_parcels from CDN-cached GeoParquet (seconds, not a 15-min Iceberg scan)
gdf = client.get_local("nz_parcels") # geopandas.GeoDataFrame when [geo] is installed
df = client.get_local("nz_cpi") # tidy DataFrame from cached Parquet
# Live path: date slices, row limits, licence-restricted sources (e.g. OECD)
df = client.get("nz_cpi", start="2020-01-01")
df = client.get("nz_cpi", limit=100)
Pass cache_dir, format, or freshness to get_local() when you need explicit control over where and how the file is cached:
# Default call first, then explicit cache directory, freshness window, and file format
gdf = client.get_local("nz_parcels")
gdf = client.get_local("nz_parcels", cache_dir="/data/eolas", freshness="monthly")
df = client.get_local("nz_cpi", format="csv_gz")
For advanced control over the sync lifecycle (sidecar tracking, atomic replace), use sync_bulk() directly. For one-shot bytes-or-path downloads, use download_bulk():
r = client.sync_bulk("nz_cpi", path="nz_cpi.parquet")
# r.status ∈ {"downloaded", "unchanged", "updated"}; r.bytes_downloaded == 0 when unchanged.
path = client.download_bulk("treasury_fiscal_spending", path="t.parquet")
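The idea behind sidecar tracking and atomic replace can be shown with the standard library alone. This is a sketch of the general technique, not the internals of sync_bulk(): it compares SHA-256 checksums of bytes already in hand, whereas a real client would presumably check a remote version marker before transferring anything. The helper name `sync_bytes` and the `.meta.json` sidecar suffix are assumptions.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def sync_bytes(payload, dest):
    """Write payload to dest only when its content changed.

    Returns "downloaded" (new file), "updated" (content changed) or
    "unchanged". A JSON sidecar next to dest records the last checksum.
    """
    dest = Path(dest)
    sidecar = dest.with_name(dest.name + ".meta.json")
    digest = hashlib.sha256(payload).hexdigest()

    previous = None
    if dest.exists() and sidecar.exists():
        previous = json.loads(sidecar.read_text()).get("sha256")
    if previous == digest:
        return "unchanged"

    existed = dest.exists()
    # Write to a temp file in the same directory, then rename it over the
    # target so readers never observe a half-written file.
    fd, tmp = tempfile.mkstemp(dir=str(dest.parent), suffix=".part")
    with os.fdopen(fd, "wb") as fh:
        fh.write(payload)
    os.replace(tmp, dest)
    sidecar.write_text(json.dumps({"sha256": digest}))
    return "updated" if existed else "downloaded"
```

Keeping the temp file in the destination directory matters: `os.replace` is only atomic when source and target live on the same filesystem.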
Progress bars: get_local() shows two phases in interactive sessions — a download byte bar while fetching from CDN, then a read spinner while Parquet/GeoParquet is loaded (often the slow part on multi-million-row geo datasets). Control with progress=True (both), False (neither), "download", or "read". Set EOLAS_NO_PROGRESS=1 to suppress both in batch scripts. Cached files skip the download bar and print an informative message instead.
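The progress options described above reduce to two switches, one per phase. A minimal sketch of that mapping follows; `resolve_progress` is a hypothetical helper, and the choice to let EOLAS_NO_PROGRESS override an explicit argument is an assumption.

```python
import os


def resolve_progress(progress=True, env=None):
    """Return (show_download_bar, show_read_spinner)."""
    env = os.environ if env is None else env
    # Assumption: the environment variable overrides the argument.
    if env.get("EOLAS_NO_PROGRESS") == "1":
        return (False, False)
    if progress is True:
        return (True, True)
    if progress is False:
        return (False, False)
    if progress == "download":
        return (True, False)
    if progress == "read":
        return (False, True)
    raise ValueError(f"unsupported progress value: {progress!r}")
```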
CLI mirror: eolas download <name> for one-shot, eolas sync <name> [--watch hourly] for an incremental check. Full docs: docs.eolas.fyi/bulk-downloads/.
Geospatial
Datasets with a geometry_wkt column auto-convert to geopandas.GeoDataFrame if geopandas is installed:
pip install 'eolas-data[geo]' # quoted so zsh does not glob the brackets
gdf = client.get("nz_addresses") # GeoDataFrame
df = client.get("nz_addresses", as_geo=False) # plain DataFrame, WKT preserved
Working with large geo datasets
The 5.4M-row linz.nz_parcels table allocates ~10 GB when materialised as a GeoDataFrame. Pass as_arrow=True to skip all shapely allocation and get a zero-copy pyarrow.Table instead — geometry stays as Arrow buffers until you need it:
# Zero-copy Arrow table — no shapely allocation
tbl = client.linz("nz_parcels", as_arrow=True)
# Filter before materialising — dramatically cheaper than loading the full GeoDataFrame
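import duckdb
duckdb.install_extension("spatial") # ST_* functions come from DuckDB's spatial extension
duckdb.load_extension("spatial")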
result = duckdb.sql("""
SELECT parcel_id, geometry_wkt
FROM tbl
WHERE ST_Within(ST_GeomFromText(geometry_wkt),
ST_GeomFromText('POLYGON((174.7 -41.3, 174.8 -41.3, 174.8 -41.4, 174.7 -41.4, 174.7 -41.3))'))
""").df()
as_arrow=True works on all datasets (geo or non-geo), all routing modes (live, cached, auto), and all source helpers. It cannot be combined with as_geo=True.
Polars
pip install 'eolas-data[polars]' # quoted so zsh does not glob the brackets
df = client.get("nz_cpi", engine="polars")
Plotting
Dataset is a pandas.DataFrame subclass — use matplotlib / seaborn / plotly
directly. No bundled plot helper, because there's no universal "right" plot for
a tidy dataset (single-series time series vs. wide multi-measure vs. WKT
geometry all need different code).
import matplotlib.pyplot as plt
df = client.oecd("nz_cpi") # OECD YoY % series, as in the Quickstart
df.plot(x="date", y="value")
plt.show()
Type stubs
Dataset names are exposed as a Literal so IDEs autocomplete the catalog:
from eolas_data import Client
client = Client()
client.get("nz_") # autocomplete shows nz_cpi, nz_gdp_growth, ...
The list is regenerated from the live API at release time. Passing a name not in the snapshot still works at runtime — the type hint just won't autocomplete it. Catalog snapshot date is exposed as eolas_data._dataset_names.CATALOG_SNAPSHOT_DATE.
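The pattern can be illustrated with a tiny stand-in for the generated stub. The three names below are taken from examples elsewhere in this README; the alias `DatasetName`, the function `get`, and the helper `in_snapshot` are assumptions for illustration and may not match the names in the real package. How editors treat a `Literal`-or-`str` union varies between tools.

```python
from typing import Literal, Union, get_args

# Tiny stand-in for the generated catalog snapshot.
DatasetName = Literal["nz_cpi", "nz_gdp_growth", "rbnz_m1_prices"]


def get(name: Union[DatasetName, str]) -> str:
    """Accept any string at runtime; the Literal only guides editors."""
    return name


# The Literal's members are available at runtime via typing.get_args.
KNOWN = set(get_args(DatasetName))


def in_snapshot(name: str) -> bool:
    """True when the name was present when the stubs were generated."""
    return name in KNOWN
```

A check like `in_snapshot` is one way to warn about a possible typo without rejecting datasets that were published after the snapshot was taken.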
Testing
# unit tests (mocked HTTP — no API key needed)
pytest -q -m "not integration"
# live smoke (requires EOLAS_API_KEY)
EOLAS_API_KEY=vs_... pytest -q -m integration tests/test_smoke_live.py
CI runs the unit suite on Python 3.10, 3.12, and 3.13 on every push/PR, with coverage uploaded to Codecov. A weekly workflow optionally runs live smoke tests when EOLAS_API_KEY is configured as a repository secret.
Releasing
See docs/clients.md in the eolas data repo for the tagged-release flow and PyPI token rotation.
Before each release: python -m eolas_data._regen_names to refresh the dataset name stubs from the live API, commit the change, then tag and push.
License
MIT