World Bank Open Data helpers — Python library + CLI mirroring the Stata wbopendata surface (discovery, data, country-context, multilingual, linewrap).
wb-api-repo
World Bank Open Data helpers in Python (library + CLI) and Stata
(wbopendata ado package). Two surfaces over the same WB API v2, with a
shared YAML metadata cache so discovery commands stay fast and offline-safe.
Python package: wb-api-tools on PyPI; install it with pip install wb-api-tools.
The Python package follows a v0.x release track that runs parallel to the
upstream wbopendata-dev Stata Journal lineage (v18.x).
What's here
| Surface | Entry point | Reference |
|---|---|---|
| Python library | `wb_api_tools.{discovery,data,text}` (re-exported at package root) | docs/PYTHON_USER_GUIDE.md |
| Python CLI | `wb-api-tools <subcmd>` (after install) or `python -m wb_api_tools <subcmd>` | `--help` on every subcommand |
| Stata package | src/w/wbopendata.ado (v17.4.0) | `help wbopendata` in Stata, or src/w/wbopendata.sthlp |
| YAML metadata cache | `~/.cache/wbopendata/_wbopendata_{indicators,sources,topics}.yaml` (XDG-aware) | populated by `wb-api-tools sync` |
Install
From PyPI (see pyproject.toml for the canonical name):
```shell
pip install wb-api-tools
```
From a git checkout (dev mode):
```shell
git clone https://github.com/jpazvd/wb-api-repo.git
cd wb-api-repo
pip install -e ".[test]"
```
Requires Python 3.11+. The Stata package is loaded by adding src/w/ and
src/_/ to Stata's adopath (or installed via net install once an SSC
release lands).
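A quick way to confirm the install is to ask the interpreter itself. The sketch below uses only the standard library; the distribution name wb-api-tools and the Python 3.11 floor come from this README, while the helper name `check_install` is made up for illustration and is not part of the package.

```python
import sys
from importlib import metadata


def check_install(dist_name: str = "wb-api-tools",
                  min_python: tuple[int, int] = (3, 11)) -> dict:
    """Report whether Python is new enough and which version is installed.

    Hypothetical helper for illustration; not shipped with wb_api_tools.
    """
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        installed = None  # the distribution is not installed in this environment
    return {
        "python_ok": sys.version_info[:2] >= min_python,
        "installed_version": installed,
    }


if __name__ == "__main__":
    print(check_install())
```

An `installed_version` of `None` means pip has not installed the distribution into the interpreter that ran the check, which is a common symptom of mixing virtual environments.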
Quick start
The repo ships with a runnable 7-section walkthrough that exercises the
whole Python surface (discovery, live describe, get_data flag matrix,
enrich_country_context, wb_text formats):
```shell
PYTHONIOENCODING=utf-8 python examples/demo_pr_b_c.py
```
Captured transcript: docs/PYTHON_DEMO.md.
Python CLI
After pip install, use the wb-api-tools console script (or
python -m wb_api_tools if the scripts directory is not on your PATH). Each
subcommand has --help for full flag descriptions.
| Subcommand | Purpose |
|---|---|
| `countries` | Fetch country metadata |
| `indicators` | Fetch indicator metadata (legacy CSV/parquet/yaml dump) |
| `data` | Fetch indicator data; `--no-basic` skips country-context auto-merge, `--geo` adds capital/lat/lon, `--language es` switches the API path |
| `sources` | List WB data sources (`--all` for the full set) |
| `alltopics` | List all WB topic categories |
| `info <id>` | Show full metadata for one indicator (from YAML cache) |
| `describe <id>` | Fetch fresh metadata for one indicator (live API; `--language` supported) |
| `search [term]` | Paginated indicator search; `--source`, `--topic`, `--field`, `--exact` |
| `sync` | Populate / refresh the YAML metadata cache from the live WB API |
Example:
```shell
wb-api-tools data \
  --indicators SP.POP.TOTL,NY.GDP.MKTP.CD \
  --countries "BRA;USA;IND" \
  --date 2010:2020 \
  --geo --long --out _data/wb/pop_gdp_long.csv
```
Output is written to --out (.csv / .parquet / .yaml / .yml) or
printed as a preview if --out is omitted.
Python library
After pip install, the package is importable directly — no sys.path
hacks needed:
```python
import wb_api_tools as wb

wb.search("poverty headcount", limit=5)
df = wb.get_data(["SP.POP.TOTL"], "BRA;USA;IND", date="2020", geo=True)
wb.wrap("long indicator title ...", width=60, fmt="stack")  # for Stata graph title()
```
Full reference: docs/PYTHON_USER_GUIDE.md (library + CLI + Stata-parity table).
Stata package
src/w/wbopendata.ado is the v17.4.0 dispatcher; its current surface
(Phases 0 through 6) mirrors the Python library:
- `wbopendata, sources / allsources / alltopics / info / search / describe`: discovery commands
- `wbopendata, indicator(X) clear`: data fetch with `noBASIC`, `geo`, `language(es)`, `cache(days)`, `sync`
- `linewrap(W) maxlength(N) linewrapformat(stack|newline|lines|smcl)`: graph-title and SMCL formatting
Open src/w/wbopendata.sthlp in Stata's viewer or run help wbopendata
once the package is on the adopath. The Python-side
docs/PYTHON_USER_GUIDE.md §5 has a row-by-row
Stata ↔ Python parity table.
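To make the graph-title use case concrete: Stata's title() option stacks lines when it receives several quoted strings, as in title("line one" "line two"). The sketch below produces that shape with the standard library's textwrap. It is a stand-in written under the assumption that the stack format means quoted segments; it is not the implementation behind wb.wrap or the Stata linewrap() option, and the name `wrap_for_stata_title` is hypothetical.

```python
import textwrap


def wrap_for_stata_title(text: str, width: int = 60) -> str:
    """Wrap text and quote each line so Stata's title() stacks them.

    Stand-in built on textwrap; assumes the text contains no double quotes.
    """
    lines = textwrap.wrap(text, width=width)
    return " ".join(f'"{line}"' for line in lines)


title = wrap_for_stata_title(
    "Poverty headcount ratio at national poverty lines (% of population)",
    width=40,
)
print(f"title({title})")
```

Text that already contains double quotes would need Stata compound quotes, which this sketch does not handle.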
YAML metadata cache
The offline metadata cache lives in a per-user XDG-aware directory
(typically ~/.cache/wbopendata/ on POSIX or ~/AppData/Local/wbopendata/
on Windows; override with $WBOPENDATA_YAML_DIR):
- `_wbopendata_indicators.yaml`: 29,511 indicators (~18 MB)
- `_wbopendata_sources.yaml`: 71 sources
- `_wbopendata_topics.yaml`: 21 topics
Discovery commands (info, search, sources, alltopics) read from
this cache for microsecond lookups. After pip install, populate it once:
```shell
wb-api-tools sync                  # download + write all three YAMLs (~10 min first time)
wb-api-tools sync --commit --tag   # git-commit + tag (dev mode only)
```
A GitHub Action (.github/workflows/wb_metadata_nightly.yml; the file name is
historical) keeps the repo-committed cache fresh. Its cron schedule runs on the
1st and 15th of every month at 02:17 UTC, so consecutive runs fall 14 to 17 days
apart depending on month length. It can also be triggered manually via
workflow_dispatch.
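The 14 to 17 day spacing follows directly from the calendar, and a few lines of Python confirm it. The sketch assumes only the schedule stated above (runs on the 1st and 15th of each month); the function names are illustrative, and the workflow file itself is not reproduced here.

```python
from datetime import date


def run_dates(year: int) -> list[date]:
    """Dates on which a 1st-and-15th schedule fires in `year`, plus the next Jan 1."""
    days = [date(year, month, day) for month in range(1, 13) for day in (1, 15)]
    return days + [date(year + 1, 1, 1)]


def gap_range(year: int) -> tuple[int, int]:
    """Smallest and largest number of days between consecutive runs."""
    runs = run_dates(year)
    gaps = [(later - earlier).days for earlier, later in zip(runs, runs[1:])]
    return min(gaps), max(gaps)


print(gap_range(2023))  # non-leap year
print(gap_range(2024))  # leap year
```

The gap from the 1st to the 15th is always 14 days; the gap from the 15th to the next 1st is 14, 15, 16, or 17 days for months of 28, 29, 30, or 31 days.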
Documentation
- docs/PYTHON_USER_GUIDE.md: Python library + CLI reference (the Stata .sthlp equivalent)
- docs/PYTHON_DEMO.md: captured live-API transcript from the 7-section walkthrough
- docs/EXAMPLES.md: end-to-end workflows (API, Stata, Python)
- docs/AGE_BANDS.md: standard 5-year age band codes for population indicators
- examples/: runnable Python examples
- CHANGELOG.md: per-release change log
- doc/VERSIONING_POLICY.md: semver policy + component-level .ado version headers
Development
```shell
PYTHONIOENCODING=utf-8 python -m pytest tests/   # 62 cases across discovery, wb_text, wb_api_tools
```
Useful Makefile targets:
```shell
make wb-update-metadata   # refresh YAML cache (v0.1.0 pipeline)
make wb-metadata          # legacy YAML builder (pre-Phase-0)
make wb-metadata-csv      # legacy CSV builder
make wb-config            # batch data pulls from config.yaml
```
Branch model: feature work on develop; releases tag from main. See the
v0.1.0 release notes for the full PR list.
Integration
The Python CLI and library plug into:
- Makefiles / pipelines (`make wb-update-metadata`, cron, GitHub Actions)
- Stata workflows (export CSV → `import delimited`, or use the Stata package directly)
- R workflows (`readr::read_csv` or `arrow::read_parquet`)
- Jupyter notebooks for ad-hoc analysis
License
See LICENSE.md. Developed to bridge Stata wbopendata
workflows with modern Python pipelines for reproducible UNICEF / World
Bank style analytics.
File details
Details for the file wb_api_tools-0.2.0rc1.tar.gz.
File metadata
- Download URL: wb_api_tools-0.2.0rc1.tar.gz
- Upload date:
- Size: 52.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1b3cce08a2711dbd62c65fdcb398b9891a3e3c760eef11c09cda75ada970c922 |
| MD5 | e625e0c43fb761c70b21b21cc93f815f |
| BLAKE2b-256 | bffdbedd17132154fccc59c10a47f8851618570ebfd36ad6ebc7dcd3413a4268 |
Provenance
The following attestation bundles were made for wb_api_tools-0.2.0rc1.tar.gz:
Publisher: publish.yml on jpazvd/wb-api-repo
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: wb_api_tools-0.2.0rc1.tar.gz
- Subject digest: 1b3cce08a2711dbd62c65fdcb398b9891a3e3c760eef11c09cda75ada970c922
- Sigstore transparency entry: 1616297408
- Sigstore integration time:
- Permalink: jpazvd/wb-api-repo@d06d6713b073a01008d509789f630f2443d5f14c
- Branch / Tag: refs/tags/v0.2.0-rc1
- Owner: https://github.com/jpazvd
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d06d6713b073a01008d509789f630f2443d5f14c
- Trigger Event: workflow_dispatch
File details
Details for the file wb_api_tools-0.2.0rc1-py3-none-any.whl.
File metadata
- Download URL: wb_api_tools-0.2.0rc1-py3-none-any.whl
- Upload date:
- Size: 44.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fbeedd01a6158a9a8070b1c1c74c4dc9cd9573f25cdd190a0c958b9d36b1acd7 |
| MD5 | c7bb181f817922bf0577524f6eedb10e |
| BLAKE2b-256 | c90168afcda3f02f1de2b58d12aa2d96fb7cda0967f31abda9d487feafd1a487 |
Provenance
The following attestation bundles were made for wb_api_tools-0.2.0rc1-py3-none-any.whl:
Publisher: publish.yml on jpazvd/wb-api-repo
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: wb_api_tools-0.2.0rc1-py3-none-any.whl
- Subject digest: fbeedd01a6158a9a8070b1c1c74c4dc9cd9573f25cdd190a0c958b9d36b1acd7
- Sigstore transparency entry: 1616297430
- Sigstore integration time:
- Permalink: jpazvd/wb-api-repo@d06d6713b073a01008d509789f630f2443d5f14c
- Branch / Tag: refs/tags/v0.2.0-rc1
- Owner: https://github.com/jpazvd
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d06d6713b073a01008d509789f630f2443d5f14c
- Trigger Event: workflow_dispatch