Core data types used by OWID for managing data.

Maintainers

ourworldindata

Project description

owid-catalog

A Pythonic library for working with OWID data.

The owid-catalog library is the foundation of Our World in Data's data management system. It provides:

Data APIs: Access OWID's published data through unified client interfaces
Data Structures: Enhanced pandas DataFrames with rich metadata support

Installation

pip install owid-catalog

Quick Examples

Accessing OWID Data

from owid.catalog import fetch, search

# Search for charts (default)
charts = search("population")
tb = charts[0].fetch()

# Fetch data from OWID Chart at ourworldindata.org/grapher/life-expectancy
tb = fetch("life-expectancy")

# Search for tables
tables = search("population", kind="table", namespace="un")
tb = tables[0].fetch()

# Search indicators (using semantic search)
search("renewable energy", kind="indicator")
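
When scripting against the search interface, it can help to guard against an empty result set before indexing into it. The helper below is a minimal sketch and not part of owid-catalog: `fetch_first` is a hypothetical name, and it assumes the object returned by `search()` supports `len()` and integer indexing, as the examples above suggest. The search function is passed in as an argument so the sketch can run offline against a stand-in.

```python
from typing import Any, Callable, Optional


def fetch_first(search_fn: Callable[..., Any], query: str, **filters: Any) -> Optional[Any]:
    """Run a search and fetch the first hit, or return None when nothing matches.

    Hypothetical helper: `search_fn` is expected to behave like
    `owid.catalog.search`, i.e. return a sequence of results that each
    expose a `.fetch()` method.
    """
    results = search_fn(query, **filters)
    if len(results) == 0:  # assumption: the result container supports len()
        return None
    return results[0].fetch()


# Offline stand-ins so the sketch runs without network access.
class FakeResult:
    def __init__(self, name: str) -> None:
        self.name = name

    def fetch(self) -> str:
        return f"table:{self.name}"


def fake_search(query: str, **filters: Any) -> list:
    catalog = ["population", "population_density", "life_expectancy"]
    return [FakeResult(name) for name in catalog if query in name]


print(fetch_first(fake_search, "population"))  # table:population
print(fetch_first(fake_search, "gdp"))         # None
```

With the real library, the call would presumably look like `fetch_first(search, "population", kind="table", namespace="un")`.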

Working with Data Structures

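import pandas as pd

from owid.catalog import Table
from owid.catalog import processing as pr

# Tables are pandas DataFrames with metadata
# (placeholder values for illustration only, not real data)
df = pd.DataFrame({"country": ["A", "B"], "year": [1990, 2010], "population": [1, 2]})
tb = Table(df, metadata={"short_name": "population"})

# Metadata propagates through operations
tb_filtered = tb[tb["year"] > 2000]  # Keeps metadata
# tb1 and tb2 stand for any two Tables that share a "country" column
tb_merged = pr.merge(tb1, tb2, on="country")  # Merges metadata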
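
To make the idea of metadata propagation concrete, here is a conceptual sketch using only the standard library. It is not how `owid.catalog.Table` is implemented: `MiniTable` and its `filter` method are hypothetical names, and the only point illustrated is that an operation returns a new object that still carries a copy of the metadata.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class MiniTable:
    """Toy stand-in for a metadata-aware table (not owid.catalog.Table)."""

    rows: List[Row]
    metadata: Dict[str, Any] = field(default_factory=dict)

    def filter(self, predicate: Callable[[Row], bool]) -> "MiniTable":
        # The result is a new object, but the metadata travels with it.
        kept = [row for row in self.rows if predicate(row)]
        return MiniTable(kept, deepcopy(self.metadata))


tb = MiniTable(
    rows=[{"country": "A", "year": 1990}, {"country": "A", "year": 2010}],
    metadata={"short_name": "population"},
)
tb_filtered = tb.filter(lambda row: row["year"] > 2000)

print(tb_filtered.rows)      # [{'country': 'A', 'year': 2010}]
print(tb_filtered.metadata)  # {'short_name': 'population'}
```

Copying the metadata rather than sharing it means later edits to the filtered table's metadata do not leak back into the original.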

Documentation

For detailed documentation, see:

API Reference: ChartsAPI, IndicatorsAPI, TablesAPI
Data Structures: Dataset, Table, Variable, metadata handling
Full Documentation: Complete library documentation

Architecture

graph TB
etl -->|reads| snapshot[upstream datasets]
etl -->|generates| s3[data catalog]
catalog[owid-catalog] -->|queries| s3

This library is part of OWID's ETL project, which contains recipes for all datasets we publish.

Development

You need Python 3.10+, uv and make installed. Clone the repo, then you can simply run:

# run all unit tests and CI checks
make test

# watch for changes, then run all checks
make watch

Changelog

`v1.2.0`

Remove legacy Source metadata (origins only)
- Removed Source class from owid.catalog.core.meta
- Removed sources field from VariableMeta and DatasetMeta (use origins instead)
- Removed if_source_exists parameter from Dataset.update_metadata (use if_origins_exist)
- Removed get_unique_sources_from_indicators helper
- Removed sources aggregation from combine_indicators_metadata

`v1.1.0`

Remove processing log feature
- Removed ProcessingLog and processing_log module from owid.catalog.core
- Removed combine_indicators_processing_logs helper
- Removed update_log / amend_log methods on Indicator
- Removed processing-log tracking from Table arithmetic operations (__add__, __sub__, __mul__, etc.)
- Removed processing_log field from VariableMeta

`v1.0.1`

ResponseSet ergonomics
- Remove deprecated ResponseSet.results property (use .items instead)
- Add .to_dict() method for serializing results to plain dicts (useful for AI/LLM context windows)
- Add all_fields parameter to .to_frame() to temporarily override display mode without mutating instance state

`v1.0.0`

New unified Client API
- owid.catalog.Client as single entry point with ChartsAPI, IndicatorsAPI, TablesAPI
- Quick access via search() and fetch() convenience functions
- Rich result types: ChartResult, IndicatorResult, TableResult with ResponseSet container
Charts API
- Fetch chart data by slug, URL, or slug with query params
- Parse chart slugs from grapher/explorer URLs via parse_chart_slug()
- Explorer best-effort fetching with graceful error handling
- set_ui_advanced() / set_ui_basic() for display configuration
Tables API
- Search catalog by table, namespace, version, dataset, and channel
- Fetch tables directly by catalog path
- Embedded catalog index with local caching
Indicators API
- Semantic search via search.owid.io vector embeddings
- Sort by relevance (similarity + popularity blend) or similarity only
- fetch() for single-column indicator or fetch_table() for the full table
Search & discovery
- Fuzzy, exact, contains, and regex matching modes
- .latest() filtering to keep only newest versions
- Popularity scores (0.0-1.0) from analytics views, results sorted by popularity
- refresh_index parameter to force catalog index reload
Data structures integration
- All fetch() methods return owid.catalog.Table with full metadata
- CatalogPath helper for parsing catalog paths
- Lazy loading with load_data=False for deferred data access
Library reorganization
- Restructured into owid.catalog.core (data structures) and owid.catalog.api (remote access)
- catalog.find() deprecated in favor of Client().tables.search() (backwards compat maintained)
- Legacy code moved to owid.catalog.api.legacy
- New dependencies: pydantic v2.0+
Private data support
- Private datasets served from separate R2 bucket
- API can fetch private data from private bucket
Performance
- Vectorized operations replacing iterrows() in TablesAPI
- Embedded catalog index loading (removed ETLCatalog dependency)
- Modularized search into helper methods
Other
- Thumbnail display in ResponseSet for chart results
- JSON output format support
- Comprehensive exception handling: ChartNotFoundError, LicenseError
- API URLs immutable with Pydantic Field(frozen=True)
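
The entry above on parsing chart slugs from grapher and explorer URLs can be illustrated with the standard library. This is a sketch under assumptions, not the library's `parse_chart_slug()`: `slug_from_url` is a hypothetical name, and it assumes chart URLs follow the `ourworldindata.org/grapher/<slug>` pattern shown in the Quick Examples, with explorers under an analogous `/explorers/` prefix.

```python
from urllib.parse import urlparse


def slug_from_url(value: str) -> str:
    """Return the chart slug from a bare slug or a grapher/explorer URL.

    Hypothetical helper; the accepted path prefixes are assumptions.
    """
    if "://" not in value:
        # Already a slug; drop any query string such as "?tab=chart".
        return value.split("?", 1)[0]
    parts = [p for p in urlparse(value).path.split("/") if p]
    if len(parts) == 2 and parts[0] in ("grapher", "explorers"):
        return parts[1]
    raise ValueError(f"Not a recognised chart URL: {value}")


print(slug_from_url("https://ourworldindata.org/grapher/life-expectancy?tab=chart"))
print(slug_from_url("life-expectancy"))
```

Both calls print `life-expectancy`; anything that is neither a bare slug nor a URL with one of the two prefixes raises `ValueError`.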

`v0.4.5`

Allow both table and dataset parameters in find() (they can now be used together)
Migrate from pyright to ty type checker for improved type checking

`v0.4.4`

Enhanced find() with better search capabilities:
- Case-insensitive search by default (use case=True for case-sensitive)
- Regex support enabled by default for table and dataset parameters
- New fuzzy search with fuzzy=True - typo-tolerant matching sorted by relevance
- Configurable fuzzy threshold (0-100) to control match strictness
New dependency: rapidfuzz for fuzzy string matching
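
The threshold-based fuzzy matching described above can be sketched as follows. To stay dependency-free, this example swaps in `difflib.SequenceMatcher` from the standard library instead of rapidfuzz, so the scores will not match what `find()` computes; `fuzzy_find` and its default threshold are hypothetical and only illustrate the idea of a 0-100 cutoff with results sorted by relevance.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def fuzzy_find(query: str, names: List[str], threshold: float = 70.0) -> List[Tuple[str, float]]:
    """Return (name, score) pairs scoring at least `threshold`, best first.

    Scores run from 0 to 100. Matching is case-insensitive.
    """
    scored = []
    for name in names:
        score = 100.0 * SequenceMatcher(None, query.lower(), name.lower()).ratio()
        if score >= threshold:
            scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


tables = ["population", "population_density", "life_expectancy"]
for name, score in fuzzy_find("populaton", tables):
    print(f"{name}: {score:.0f}")
```

A higher threshold makes matching stricter, and lowering it lets looser matches such as longer names containing the query through.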

`v0.4.3`

Fixed minor bugs

`v0.4.0`

Highlights
- Support for Python 3.10-3.13 (was 3.11-3.13)
- Drop support for Python 3.9 (breaking change)
Others
- Deprecate Walden.
- Dependencies: Change rdata for pyreadr.
- Support: indicator dimensions.
- Support: MDIMs.
- Switched from Poetry to UV package manager.
- New decorator @keep_metadata to propagate metadata in pandas functions.
Fixes: Table.apply, groupby.apply, metadata propagation, type hinting, etc.

`v0.3.11`

Add support for Python 3.12 in pyproject.toml

`v0.3.10`

Add experimental chart data API in owid.catalog.charts

`v0.3.9`

Switch from isort, black and flake8 to ruff

`v0.3.8`

Pin dataclasses-json==0.5.8 to fix error with python3.9

`v0.3.7`

Fix bugs.
Improve metadata propagation.
Improve metadata YAML file handling, to have common definitions.
Remove DatasetMeta.origins.

`v0.3.6`

Fixed tons of bugs
processing.py module with pandas-like functions that propagate metadata
Support for Dynamic YAML files
Support for R2 alongside S3

`v0.3.5`

Remove catalog.frames; use owid-repack package instead
Relax dependency constraints
Add optional channel argument to DatasetMeta
Stop supporting metadata in Parquet format, load JSON sidecar instead
Fix errors when creating new Table columns

`v0.3.4`

Bump pyarrow dependency to enable Python 3.11 support

`v0.3.3`

Add more arguments to Table.__init__ that are often used in ETL
Add Dataset.update_metadata function for updating metadata from YAML file
Python 3.11 support via update of pyarrow dependency

`v0.3.2`

Fix a bug in Catalog.__getitem__()
Replace mypy type checker by pyright

`v0.3.1`

Sort imports with isort
Change black line length to 120
Add grapher channel
Support path-based indexing into catalogs

`v0.3.0`

Update OWID_CATALOG_VERSION to 3
Support multiple formats per table
Support reading and writing parquet files with embedded metadata
Optional repack argument when adding tables to dataset
Underscore |
Get version field from DatasetMeta init
Resolve collisions of underscore_table function
Convert version to str and load json dimensions

`v0.2.9`

Allow multiple channels in catalog.find function

`v0.2.8`

Update OWID_CATALOG_VERSION to 2

`v0.2.7`

Split datasets into channels (garden, meadow, open_numbers, ...) and make garden default one
Add .find_latest method to Catalog

`v0.2.6`

Add flag is_public for public/private datasets
Enforce snake_case for table, dataset and variable short names
Add fields published_by and published_at to Source
Add a list of supported and unsupported operations on columns
Update pyarrow
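
As an illustration of what enforcing snake_case on short names might involve, here is a minimal validity check. The exact rules the library applies are not stated here, so the pattern below is an assumption (lowercase letters and digits, single underscores as separators, a leading letter), and `is_snake_case` is a hypothetical name.

```python
import re

# Assumed rule: starts with a lowercase letter, then lowercase letters or
# digits, with single underscores allowed between non-empty segments.
_SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def is_snake_case(name: str) -> bool:
    """Return True if `name` matches the assumed snake_case pattern."""
    return _SNAKE_CASE.fullmatch(name) is not None


for candidate in ["population_density", "PopulationDensity", "gdp per capita", "co2_emissions"]:
    print(candidate, is_snake_case(candidate))
```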

`v0.2.5`

Fix ability to load remote CSV tables

`v0.2.4`

Update the default catalog URL to use a CDN

`v0.2.3`

Fix methods for finding and loading data from a LocalCatalog

`v0.2.2`

Repack frames to compact dtypes on Table.to_feather()

`v0.2.1`

Fix key typo used in version check

`v0.2.0`

Copy dataset metadata into tables, to make tables more traceable
Add API versioning, and a requirement to update if your version of this library is too old

`v0.1.1`

Add support for Python 3.8

`v0.1.0`

Initial release, including searching and fetching data from a remote catalog

Release history

1.2.1: May 18, 2026 (this version)
1.2.0: May 15, 2026
1.1.1: May 13, 2026
1.1.0: Apr 17, 2026
1.0.1: Feb 26, 2026
1.0.0: Feb 24, 2026
1.0.0rc2: Jan 16, 2026 (pre-release)
1.0.0rc1: Dec 29, 2025 (pre-release)
0.4.5: Dec 22, 2025
0.4.4: Dec 18, 2025
0.4.3: Oct 27, 2025
0.4.2: May 27, 2025
0.4.1: May 27, 2025
0.3.11: May 17, 2024
0.3.10: May 17, 2024
0.3.9: Jan 26, 2024
0.3.8: Oct 17, 2023
0.3.7: Oct 16, 2023
0.3.6: Sep 28, 2023
0.3.5: Jul 20, 2023
0.3.4: Dec 22, 2022
0.3.2: Sep 24, 2022
0.3.1: Sep 24, 2022
0.3.0: Aug 10, 2022
0.2.9: May 12, 2022
0.2.8: May 11, 2022
0.2.7: May 4, 2022
0.2.6: Apr 20, 2022
0.2.5: Jan 27, 2022
0.2.4: Dec 13, 2021
0.2.3: Nov 9, 2021
0.2.2: Oct 31, 2021
0.2.1: Oct 26, 2021
0.2.0: Oct 25, 2021
0.1.1: Oct 22, 2021
0.1.0: Oct 22, 2021

Download files

Source Distribution

owid_catalog-1.2.1.tar.gz (343.1 kB)

Uploaded May 18, 2026 Source

Built Distribution

owid_catalog-1.2.1-py3-none-any.whl (123.1 kB)

Uploaded May 18, 2026 Python 3

File details

Details for the file owid_catalog-1.2.1.tar.gz.

File metadata

Download URL: owid_catalog-1.2.1.tar.gz
Upload date: May 18, 2026
Size: 343.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for owid_catalog-1.2.1.tar.gz
Algorithm	Hash digest
SHA256	`1acd90f90d632bab1a061f651cb4e1c01e5490ad229fec915d5614219f5e789c`
MD5	`8d7914dd5cb9f3e44418a28db3430da8`
BLAKE2b-256	`7bc9483ced6dbb08b57287cf1b9bd094ec270874e4b4050b97d0508827cead86`
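
A downloaded file can be checked against a published digest with the standard library's `hashlib`. The sketch below is one way to do it; `sha256_of` and `matches_published_hash` are hypothetical helper names, and the demo uses a small temporary file rather than the real distribution so that it runs offline.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring letter case."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a throwaway file; replace with the downloaded archive in practice.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"hello")
    expected = hashlib.sha256(b"hello").hexdigest()
    print(matches_published_hash(sample, expected))  # True
```

For the source distribution, `expected_hex` would be the SHA256 value listed above for owid_catalog-1.2.1.tar.gz.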

Provenance

The following attestation bundles were made for owid_catalog-1.2.1.tar.gz:

Publisher: publish-owid-packages.yml on owid/etl

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: owid_catalog-1.2.1.tar.gz
- Subject digest: 1acd90f90d632bab1a061f651cb4e1c01e5490ad229fec915d5614219f5e789c
- Sigstore transparency entry: 1569127718
- Sigstore integration time: May 18, 2026
Source repository:
- Permalink: owid/etl@c97ecf004ac49793a60af3a801b47de1ddd4723e
- Branch / Tag: refs/heads/master
- Owner: https://github.com/owid
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-owid-packages.yml@c97ecf004ac49793a60af3a801b47de1ddd4723e
- Trigger Event: push

File details

Details for the file owid_catalog-1.2.1-py3-none-any.whl.

File metadata

Download URL: owid_catalog-1.2.1-py3-none-any.whl
Upload date: May 18, 2026
Size: 123.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for owid_catalog-1.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`46d6bb39f784b59c444a7508f83d08c49173bf2665d84b45d8bc33f4b0084908`
MD5	`8291b1bae782754c7260dfad9a74daba`
BLAKE2b-256	`35b527113cbcdbb401e4042cb9c3f66eeccac1a4f03834b74dad9755d907db2d`

Provenance

The following attestation bundles were made for owid_catalog-1.2.1-py3-none-any.whl:

Publisher: publish-owid-packages.yml on owid/etl

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: owid_catalog-1.2.1-py3-none-any.whl
- Subject digest: 46d6bb39f784b59c444a7508f83d08c49173bf2665d84b45d8bc33f4b0084908
- Sigstore transparency entry: 1569127744
- Sigstore integration time: May 18, 2026
Source repository:
- Permalink: owid/etl@c97ecf004ac49793a60af3a801b47de1ddd4723e
- Branch / Tag: refs/heads/master
- Owner: https://github.com/owid
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-owid-packages.yml@c97ecf004ac49793a60af3a801b47de1ddd4723e
- Trigger Event: push
