
Python interface for the OpenAlex API, built on top of the bibliofabric framework.


Aletheca: Asynchronous Python client for the OpenAlex API

Samuel Mok -- s.mok@utwente.nl -- 2025-2026

Aletheca is an async Python client for the OpenAlex API, built on bibliofabric.

Docs: utsmok.github.io/aletheca -- PyPI: aletheca -- License: MIT

Features

  • Async by design -- built on httpx + asyncio with proper connection pooling
  • Typed throughout -- Pydantic v2 models for all entities, PEP 561 py.typed marker
  • Cursor pagination -- efficient iteration over large result sets via cursor-based auto-pagination
  • Filter serialization -- automatic conversion to OpenAlex filter=key:value syntax with Pydantic filter models
  • Safe types -- SafeList and SafeStr for None-safe traversal of API responses
  • Convenience queries -- high-level functions for common workflows (works_by_author, citing_works, etc.)
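To make the filter serialization feature concrete, here is a minimal sketch of how a mapping could be turned into an OpenAlex filter string. The helper name to_filter_param is hypothetical and is not part of Aletheca's public API; the library's own Pydantic filter models may behave differently. The sketch relies only on the OpenAlex conventions that a comma joins filters with AND and a pipe joins values with OR.

```python
from collections.abc import Mapping


def to_filter_param(filters: Mapping[str, object]) -> str:
    """Serialize a mapping into an OpenAlex-style filter string.

    Hypothetical helper for illustration; not Aletheca's implementation.
    """
    parts = []
    for key, value in filters.items():
        if value is None:
            continue  # skip filters that were left unset
        if isinstance(value, bool):
            text = "true" if value else "false"  # booleans are lowercase
        elif isinstance(value, (list, tuple)):
            text = "|".join(str(v) for v in value)  # "|" means OR within one filter
        else:
            text = str(value)
        parts.append(f"{key}:{text}")
    return ",".join(parts)  # "," means AND between filters


print(to_filter_param({"publication_year": 2024, "is_oa": True}))
# publication_year:2024,is_oa:true
```

The resulting string is what would be sent as the value of the filter query parameter.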

Installation

uv add aletheca

Or with pip: pip install aletheca. Requires Python >=3.12.

Quick Start

import asyncio
from aletheca import AlethecaSession

async def main():
    async with AlethecaSession() as session:
        # Get a work by OpenAlex ID
        work = await session.works.get("W1234567890")
        print(work.title)

        # Search works
        results = await session.works.search(search="machine learning", page_size=10)
        for work in results.results:
            print(f"{work.title} ({work.publication_year})")

        # Iterate all works by an author (cursor-based auto-pagination)
        async for work in session.works.iterate(
            filters={"authorships.author.id": "A1234567890"},
            page_size=200,
        ):
            print(work.title)

asyncio.run(main())

No authentication required -- the OpenAlex API works without it. For higher rate limits, see Authentication.

Examples

All examples in examples/ are dual-purpose -- run as scripts or as interactive marimo notebooks:

# As a script
uv run examples/simple_example.py

# As an interactive notebook
uv run marimo edit examples/simple_example.py

Script                          Description
simple_example.py               Search, iterate, get works
02_filtering_and_search.py      WorksFilters, AuthorsFilters, and other filter models
03_institution_research.py      Works by institution, topic analysis
04_author_discovery.py          Find authors, retrieve their works
05_advanced_queries.py          Cursor pagination, select fields, sort
06_convenience_queries.py       session.queries.* convenience functions
07_iterator_helpers.py          collect(), count(), first() from bibliofabric mixins
08_safe_types_and_helpers.py    SafeList, SafeStr, DOI normalization, abstract reconstruction

Authentication

Aletheca reads the OpenAlex API key from environment variables or a .env file, using names prefixed with ALETHECA_. If nothing is configured, requests are sent without authentication.

ALETHECA_OPENALEX_API_KEY=your_api_key

Or pass explicitly:

async with AlethecaSession(api_key="your_api_key") as session:
    ...

With an API key you get faster responses (dedicated pool). Without one, you use the polite pool (slower).

Basic Usage

Get a single entity

work = await session.works.get("W2741809801")
print(work.title, work.doi, work.publication_year)

Search

results = await session.works.search(search="machine learning", page_size=5)
for work in results.results:
    print(work.title)

Iterate all results

async for work in session.works.iterate(
    filters={"publication_year": 2024, "is_oa": True},
    page_size=200,
):
    print(work.title)
    break  # stop when you want
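For readers curious what cursor-based auto-pagination involves, here is a minimal, library-independent sketch. OpenAlex starts a cursor walk with cursor=* and returns the next cursor in meta.next_cursor until it is null. The names iterate_cursor, fake_fetch, and FAKE_PAGES are hypothetical, and the fetcher is a stand-in for a real HTTP call; Aletheca's iterate() may be implemented differently.

```python
import asyncio
from collections.abc import AsyncIterator, Awaitable, Callable
from typing import Any

Page = dict[str, Any]


async def iterate_cursor(
    fetch_page: Callable[[str], Awaitable[Page]],
) -> AsyncIterator[Any]:
    """Yield items page by page until the API stops returning a cursor."""
    cursor: str | None = "*"  # "*" requests the first page
    while cursor:
        page = await fetch_page(cursor)
        for item in page["results"]:
            yield item
        cursor = page["meta"].get("next_cursor")  # None ends the loop


# Stand-in for the network: two canned pages keyed by cursor.
FAKE_PAGES: dict[str, Page] = {
    "*": {"meta": {"next_cursor": "c1"}, "results": ["W1", "W2"]},
    "c1": {"meta": {"next_cursor": None}, "results": ["W3"]},
}


async def fake_fetch(cursor: str) -> Page:
    return FAKE_PAGES[cursor]


async def collect_all() -> list[Any]:
    return [item async for item in iterate_cursor(fake_fetch)]


print(asyncio.run(collect_all()))
# ['W1', 'W2', 'W3']
```

Because the generator fetches a page only when the consumer asks for more items, breaking out of the loop early avoids requesting the remaining pages.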

Convenience queries

citations = await session.queries.citing_works("W2741809801")
print(f"{len(citations)} citations")
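At the API level, a citing-works lookup presumably corresponds to the OpenAlex cites filter on the works endpoint. The sketch below only builds the request URL. The helper citing_works_url is hypothetical, and whether Aletheca issues exactly this request is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"


def citing_works_url(work_id: str) -> str:
    """Build the raw OpenAlex URL listing works that cite work_id.

    Hypothetical helper for illustration only.
    """
    params = {"filter": f"cites:{work_id}"}
    # Keep ":" unescaped so the filter stays readable in the URL.
    return f"{BASE_URL}?{urlencode(params, safe=':')}"


print(citing_works_url("W2741809801"))
# https://api.openalex.org/works?filter=cites:W2741809801
```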

Known OpenAlex API Issues

Full bug report with reproduction steps: OPENALEX_BUG_REPORT.md.

  • OpenAPI spec is substantially incomplete -- 50+ fields returned by the live API are missing from the spec schemas across all entity types. Several spec fields don't exist in the live API.
  • Wrong field names in spec -- content_url (spec) vs content_urls (live), grants_count (spec) vs awards_count (live)
  • Undocumented fields -- institution_awarded on Awards is not documented anywhere; 15+ nested Award filters are missing from the docs filter table
  • Awards endpoint missing from llms.txt -- the awards endpoint is not listed in the API quick reference
  • per_page max is 200, not 100 -- documented as 100 but the API accepts 200

Development

uv sync --all-groups --all-extras         # install everything
uv run ruff check src/ --fix              # lint
uv run ruff format src/                   # format
uvx ty check src/                         # type check
uv run pytest tests/                      # run tests
uv run pytest --cov=aletheca tests/       # coverage (CI threshold: 95%)
uv build                                  # build package
uv run mkdocs serve                       # local docs

Contributions welcome -- see Contributing.

License

MIT
