
Python client for the eolas.fyi statistical data API (NZ, Australia, OECD)


eolas-data

Python client for the eolas.fyi statistical data API — 1,400+ official New Zealand statistical & geospatial datasets, plus OECD data for international comparisons, served as tidy pandas DataFrames (or polars / geopandas if you prefer).

Coverage is New Zealand + OECD today. Australian sources are on the roadmap — not yet available; OECD data already includes Australia (and other OECD members) for cross-country comparisons.

pip install eolas-data

Quickstart

from eolas_data import Client

client = Client("your_api_key")   # or set EOLAS_API_KEY in env

# CPI index (monthly, RBNZ M1) — the usual Treasury/analyst choice
cpi = client.rbnz("rbnz_m1_prices", start="2020-01-01")

# OECD macro indicators (quarterly YoY % — not CPI index levels)
inflation = client.oecd("nz_cpi", start="2020-01-01")
gdp       = client.oecd("nz_gdp_growth")

# Discovery
all_datasets = client.list()
nz_only      = client.list("Stats NZ")
client.search("cpi")   # expands aliases; surfaces rbnz_m1_prices before nz_cpi
meta         = client.info("rbnz_m1_prices")

Get an API key at https://eolas.fyi/signup. Free plan is 10 requests/month; Pro ($49/month) is unlimited.
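The Quickstart comments distinguish CPI index levels (rbnz_m1_prices) from year-on-year percentage changes (nz_cpi). The sketch below shows how the two quantities relate. It uses made-up index values rather than real CPI figures, and yoy_pct and periods_per_year are hypothetical names that are not part of eolas-data.

```python
from typing import List, Optional


def yoy_pct(levels: List[float], periods_per_year: int) -> List[Optional[float]]:
    """Year-on-year % change of an index series; None where no year-ago value exists."""
    out: List[Optional[float]] = []
    for i, level in enumerate(levels):
        if i < periods_per_year:
            out.append(None)
        else:
            base = levels[i - periods_per_year]
            out.append(round((level / base - 1.0) * 100.0, 2))
    return out


# Toy quarterly index levels (illustrative only, not real CPI data).
toy_levels = [100.0, 102.0, 104.0, 106.0, 105.0, 107.1, 108.16, 110.24]
toy_yoy = yoy_pct(toy_levels, periods_per_year=4)
print(toy_yoy)
```

For a monthly index series, the same function would be called with periods_per_year=12.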

Quick setup (workstation)

Two one-off commands make every future session frictionless:

1. Save your API key to the OS keyring (macOS Keychain / Windows Credential Manager / Linux Secret Service) so Client() finds it automatically — no env var, no pasting:

pip install 'eolas-data[secure]'   # adds the keyring package
eolas auth save-key                # interactive prompt

from eolas_data import Client
client = Client()   # key read from OS keyring automatically

2. Set a library directory so downloaded bulk files land somewhere permanent instead of the transient ~/.cache/eolas/ OS cache:

eolas library set ~/eolas-library  # writes to ~/.eolas/config.json

Or set the env var instead (useful for CI / Docker):

export EOLAS_LIBRARY=~/eolas-library

After setting the library, client.get_local("nz_parcels") will use ~/eolas-library/ automatically.

The keyring slot and config file are shared with the R eolas client — a key saved from Python is immediately readable from R and vice versa (see the R client README).
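The library directory can come from the EOLAS_LIBRARY env var, from ~/.eolas/config.json, or from the default OS cache. The sketch below shows one plausible lookup order. It is not the client's actual implementation: the precedence (env var, then config file, then default), the function name resolve_library_dir, and the "library" config key are all assumptions.

```python
import json
import os
from pathlib import Path
from typing import Mapping


def resolve_library_dir(env: Mapping[str, str], config_path: Path, default: Path) -> Path:
    """Pick a library directory: env var first, then config file, then default."""
    from_env = env.get("EOLAS_LIBRARY")
    if from_env:
        return Path(from_env).expanduser()
    if config_path.is_file():
        try:
            config = json.loads(config_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            config = {}
        # "library" is a hypothetical key name; the real config schema may differ.
        configured = config.get("library") if isinstance(config, dict) else None
        if configured:
            return Path(configured).expanduser()
    return default


library_dir = resolve_library_dir(
    env=os.environ,
    config_path=Path.home() / ".eolas" / "config.json",
    default=Path.home() / ".cache" / "eolas",
)
print(library_dir)
```

Passing env and the paths as arguments keeps the lookup easy to test without touching the real home directory.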


Command-line interface

pip install 'eolas-data[cli]' adds an eolas command for browsing, fetching, and scheduling, which is useful for shell scripts, cron jobs, and AI-agent workflows. Output is rich tables by default; pass --json for newline-delimited JSON in scripts.

# one-time setup (OS keyring — recommended)
pip install 'eolas-data[secure]'
eolas auth save-key

# or config file (no extra install)
eolas auth set-key
eolas health

# discover
eolas datasets list --source "Stats NZ"
eolas datasets list --search cpi          # table + CPI guidance note
eolas datasets list --search cpi --json | jq '.[].name'
eolas datasets info rbnz_m1_prices
eolas datasets preview rbnz_m1_prices --limit 5

# fetch (verb matches the Python lib's client.get())
eolas get rbnz_m1_prices --format csv > cpi.csv
eolas get nz_cpi --start 2020-01-01 --format json | jq '.[].value'   # OECD YoY %
eolas get nz_meshblock_2023 --format parquet --out meshblock.parquet
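When calling the CLI from Python rather than from a shell pipeline, the --json output can be parsed with the standard library. This is a sketch only: parse_cli_json and search_datasets are hypothetical helpers, and because the exact output shape is an assumption, the parser accepts either a single JSON array or newline-delimited records.

```python
import json
import subprocess
from typing import Any, Dict, List


def parse_cli_json(text: str) -> List[Dict[str, Any]]:
    """Parse CLI output that is either one JSON array or newline-delimited JSON."""
    text = text.strip()
    if not text:
        return []
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Not a single JSON document: treat each non-empty line as one record.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    return parsed if isinstance(parsed, list) else [parsed]


def search_datasets(term: str) -> List[Dict[str, Any]]:
    """Run `eolas datasets list --search TERM --json` and parse its output."""
    proc = subprocess.run(
        ["eolas", "datasets", "list", "--search", term, "--json"],
        capture_output=True, text=True, check=True,
    )
    return parse_cli_json(proc.stdout)
```

With the CLI installed and a key configured, search_datasets("cpi") would return one dict per matching dataset.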

Scheduling

Set up recurring fetches without touching crontab/Task Scheduler syntax. Works on Linux, macOS (cron), and Windows (Task Scheduler).

eolas schedule add nz_cpi --daily   --out ~/data/cpi.csv
eolas schedule add nz_gdp_growth --weekly  --out ~/data/gdp.csv
eolas schedule add rbnz_b1_exchange_rates_monthly --cron "0 */6 * * *" --out ~/data/fx.csv   # POSIX only

eolas schedule list
eolas schedule remove nz_cpi

Daily is the default. A pre-flight check refuses to install a schedule unless your API key is configured; otherwise the job would fail silently forever.
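The --cron example uses the standard five-field crontab syntax, where "0 */6 * * *" means minute 0 of every sixth hour. The sketch below expands the minute and hour fields to show the resulting fire times. expand_cron_field is a hypothetical helper that covers only a small subset of cron syntax; it is not how eolas parses the expression.

```python
from typing import List


def expand_cron_field(field: str, low: int, high: int) -> List[int]:
    """Expand one cron field (supports *, */n, a-b, a-b/n, single values, comma lists)."""
    values = set()
    for part in field.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            a, b = rng.split("-", 1)
            start, end = int(a), int(b)
        else:
            start = end = int(rng)
        values.update(range(start, end + 1, step))
    return sorted(values)


minute_field, hour_field = "0 */6 * * *".split()[:2]
fire_times = [
    f"{hour:02d}:{minute:02d}"
    for hour in expand_cron_field(hour_field, 0, 23)
    for minute in expand_cron_field(minute_field, 0, 59)
]
print(fire_times)  # ['00:00', '06:00', '12:00', '18:00']
```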

Integrations (Enterprise plan)

Generate ready-to-run connector configs for popular data-pipeline tools — eolas becomes a one-command source for Meltano, Fivetran, or Azure Data Factory.

eolas integrate meltano             --datasets nz_cpi,nz_gdp_growth --output ./my-pipeline/
eolas integrate fivetran            --datasets nz_cpi
eolas integrate azure-data-factory  --datasets nz_cpi,nz_gdp_growth

The generated directory has everything needed to plug into your destination warehouse: meltano.yml, fivetran.yml, or ADF JSON resources, plus a README.md walking through the rest of the setup. Non-Enterprise users see a clear upgrade pointer; the gating lives server-side so the capability is bypass-proof.

Exit codes

Distinct exit codes per error class, for shell scripts and agents:

Code  Meaning
0     Success
1     Generic error
2     Auth (AuthenticationError, including Enterprise-gate 403)
3     Rate limit hit
4     Dataset / resource not found
5     Other API error
64    Bad usage (mirrors sysexits.h)
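A wrapper script or agent can branch on these codes. The sketch below copies the table into a lookup and adds a retry decision. classify_exit, should_retry, and run_eolas are hypothetical helpers, and retrying only on code 3 is an assumption about sensible policy, not documented behaviour.

```python
import subprocess
from typing import List

# Exit codes as listed in the table above.
EXIT_MEANINGS = {
    0: "success",
    1: "generic error",
    2: "auth error",
    3: "rate limit hit",
    4: "dataset or resource not found",
    5: "other API error",
    64: "bad usage",
}

# Assumption: only rate limits are worth retrying automatically.
RETRYABLE = {3}


def classify_exit(code: int) -> str:
    """Map an exit code to a short description."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def should_retry(code: int) -> bool:
    """Return True when a later retry could plausibly succeed."""
    return code in RETRYABLE


def run_eolas(args: List[str]) -> int:
    """Run the eolas CLI and return its exit code without raising."""
    return subprocess.run(["eolas", *args]).returncode
```

For example, code = run_eolas(["get", "nz_cpi", "--format", "csv"]) followed by should_retry(code) lets a cron job back off on a rate limit while failing fast on an auth error.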

Performance (Arrow)

client.get() transparently negotiates Apache Arrow over the wire — same DataFrame back, typically 5–10× faster end-to-end on large pulls, with an automatic JSON fallback. No setup: pyarrow ships with eolas-data, so this is on by default; format= ("json"/"csv") is only for the rare case you want the raw text payload.

For a columnar file (CLI), use --format parquet --out FILE; via the REST API directly, ?format=parquet. Full benchmark: docs.eolas.fyi → Python reference → Performance.
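Negotiation with a fallback generally means asking for the preferred media type and then decoding whatever comes back. The sketch below illustrates that pattern with the registered Arrow IPC stream media type. The header values eolas-data actually sends are not documented here, so accept_header and pick_decoder are hypothetical helpers.

```python
from typing import Dict

# IANA media type for Arrow IPC streams.
ARROW_STREAM = "application/vnd.apache.arrow.stream"


def accept_header(prefer_arrow: bool = True) -> Dict[str, str]:
    """Build an Accept header that prefers Arrow but still allows JSON."""
    if prefer_arrow:
        return {"Accept": f"{ARROW_STREAM}, application/json;q=0.5"}
    return {"Accept": "application/json"}


def pick_decoder(content_type: str) -> str:
    """Decide how to decode a response body from its Content-Type header."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    # Anything that is not an Arrow stream falls back to the JSON path.
    return "arrow" if media_type == ARROW_STREAM else "json"


print(pick_decoder("application/vnd.apache.arrow.stream"))
print(pick_decoder("application/json; charset=utf-8"))
```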

Bulk downloads — use get_local() for whole datasets

client.get() hits the live /data endpoint (good for slices and small pulls). For whole datasets — especially large or geospatial layers — use get_local(). It syncs a CDN-cached Parquet/GeoParquet file to your library directory and reads from disk on subsequent calls.

# Whole-dataset path: nz_parcels from CDN-cached GeoParquet (seconds, not a 15-min Iceberg scan)
gdf = client.get_local("nz_parcels")   # geopandas.GeoDataFrame when [geo] is installed
df  = client.get_local("nz_cpi")       # tidy DataFrame from cached Parquet

# Live path: date slices, row limits, licence-restricted sources (e.g. OECD)
df  = client.get("nz_cpi", start="2020-01-01")
df  = client.get("nz_cpi", limit=100)

get_local() also takes cache_dir, format, and freshness when the defaults are not what you need:

# Explicit cache and sync options
gdf = client.get_local("nz_parcels", cache_dir="/data/eolas", freshness="monthly")
df  = client.get_local("nz_cpi", format="csv_gz")

For advanced control over the sync lifecycle (sidecar tracking, atomic replace), use sync_bulk() directly. For one-shot bytes-or-path downloads, use download_bulk():

r    = client.sync_bulk("nz_cpi", path="nz_cpi.parquet")
# r.status ∈ {"downloaded", "unchanged", "updated"}; r.bytes_downloaded == 0 when unchanged.
path = client.download_bulk("treasury_fiscal_spending", path="t.parquet")
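Sidecar tracking and atomic replace are general file-sync techniques. The sketch below shows the idea with the standard library: a small JSON sidecar records a SHA-256 of the last payload, and new content is written to a temporary file and swapped in with os.replace. sync_bytes, the .meta.json sidecar name, and the hash comparison are assumptions for illustration; how sync_bulk() detects changes internally is not described here.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def sync_bytes(payload: bytes, path: Path) -> str:
    """Write payload to path atomically; return 'downloaded', 'updated', or 'unchanged'."""
    sidecar = path.with_name(path.name + ".meta.json")  # hypothetical sidecar naming
    digest = hashlib.sha256(payload).hexdigest()
    existed = path.exists()

    if existed and sidecar.exists():
        try:
            recorded = json.loads(sidecar.read_text(encoding="utf-8")).get("sha256")
        except (OSError, ValueError, AttributeError):
            recorded = None
        if recorded == digest:
            return "unchanged"

    # Write to a temp file in the same directory, then swap it in with os.replace,
    # so readers never see a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    sidecar.write_text(json.dumps({"sha256": digest}), encoding="utf-8")
    return "updated" if existed else "downloaded"
```

A real client would normally compare a server-side marker before fetching, so that an unchanged file costs no download at all; this sketch compares bytes that are already in hand to keep the example self-contained.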

Progress bars: get_local() shows two phases in interactive sessions — a download byte bar while fetching from CDN, then a read spinner while Parquet/GeoParquet is loaded (often the slow part on multi-million-row geo datasets). Control with progress=True (both), False (neither), "download", or "read". Set EOLAS_NO_PROGRESS=1 to suppress both in batch scripts. Cached files skip the download bar and print an informative message instead.

CLI mirror: eolas download <name> for one-shot, eolas sync <name> [--watch hourly] for an incremental check. Full docs: docs.eolas.fyi/bulk-downloads/.

Geospatial

Datasets with a geometry_wkt column auto-convert to geopandas.GeoDataFrame if geopandas is installed:

pip install 'eolas-data[geo]'

gdf = client.get("nz_addresses")                  # GeoDataFrame
df  = client.get("nz_addresses", as_geo=False)    # plain DataFrame, WKT preserved

Working with large geo datasets

The 5.4M-row linz.nz_parcels table allocates ~10 GB when materialised as a GeoDataFrame. Pass as_arrow=True to skip all shapely allocation and get a zero-copy pyarrow.Table instead — geometry stays as Arrow buffers until you need it:

import duckdb

# Zero-copy Arrow table: no shapely allocation
tbl = client.linz("nz_parcels", as_arrow=True)

# ST_Within and ST_GeomFromText come from DuckDB's spatial extension
duckdb.execute("INSTALL spatial")
duckdb.execute("LOAD spatial")

# Filter before materialising: far cheaper than loading the full GeoDataFrame.
# DuckDB resolves `tbl` to the local Arrow table by variable name.
result = duckdb.sql("""
    SELECT parcel_id, geometry_wkt
    FROM tbl
    WHERE ST_Within(ST_GeomFromText(geometry_wkt),
                    ST_GeomFromText('POLYGON((174.7 -41.3, 174.8 -41.3, 174.8 -41.4, 174.7 -41.4, 174.7 -41.3))'))
""").df()

as_arrow=True works on all datasets (geo or non-geo), all routing modes (live, cached, auto), and all source helpers. It cannot be combined with as_geo=True.

Polars

pip install 'eolas-data[polars]'

df = client.get("nz_cpi", engine="polars")

Plotting

Dataset is a pandas.DataFrame subclass — use matplotlib / seaborn / plotly directly. No bundled plot helper, because there's no universal "right" plot for a tidy dataset (single-series time series vs. wide multi-measure vs. WKT geometry all need different code).

import matplotlib.pyplot as plt

df = client.oecd("nz_cpi")   # nz_cpi is the OECD YoY % series used in the Quickstart
df.plot(x="date", y="value")
plt.show()

Type stubs

Dataset names are exposed as a Literal so IDEs autocomplete the catalog:

from eolas_data import Client

client = Client()
client.get("nz_")    # autocomplete shows nz_cpi, nz_gdp_growth, ...

The list is regenerated from the live API at release time. Passing a name not in the snapshot still works at runtime — the type hint just won't autocomplete it. Catalog snapshot date is exposed as eolas_data._dataset_names.CATALOG_SNAPSHOT_DATE.
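One common way to get autocomplete without rejecting unknown names is to annotate a parameter as a Literal combined with plain str. The sketch below shows that pattern with a two-name stand-in for the generated list. DatasetName, KNOWN_NAMES, and describe are hypothetical names, and whether eolas-data uses exactly this annotation is an assumption.

```python
from typing import Literal, Union, get_args

# Hypothetical two-name snapshot; the real generated list is much longer.
DatasetName = Literal["nz_cpi", "nz_gdp_growth"]
KNOWN_NAMES = frozenset(get_args(DatasetName))


def describe(name: Union[DatasetName, str]) -> str:
    """Accept any string at runtime while letting IDEs suggest the known names."""
    status = "in snapshot" if name in KNOWN_NAMES else "not in snapshot (still allowed)"
    return f"{name}: {status}"


print(describe("nz_cpi"))
print(describe("some_new_dataset"))
```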

Testing

# unit tests (mocked HTTP — no API key needed)
pytest -q -m "not integration"

# live smoke (requires EOLAS_API_KEY)
EOLAS_API_KEY=vs_... pytest -q -m integration tests/test_smoke_live.py

CI runs the unit suite on Python 3.10, 3.12, and 3.13 on every push/PR. A weekly workflow optionally runs live smoke tests when EOLAS_API_KEY is configured as a repository secret.

Releasing

See docs/clients.md in the eolas data repo for the tagged-release flow and PyPI token rotation.

Before each release: python -m eolas_data._regen_names to refresh the dataset name stubs from the live API, commit the change, then tag and push.

License

MIT
