Python client library for GDELT (Global Database of Events, Language, and Tone)

gdelt-py

A comprehensive Python client library for the GDELT (Global Database of Events, Language, and Tone) project.

Features

  • Unified Interface: Single client covering all 6 REST APIs, 3 database tables, and NGrams dataset
  • Version Normalization: Transparent handling of GDELT v1/v2 differences with normalized output
  • Resilience: Automatic fallback to BigQuery when APIs fail or are rate limited
  • Modern Python: Python 3.11+, async-first, Pydantic models, type hints throughout
  • Streaming: Generator-based iteration that processes large datasets without loading them fully into memory
  • Developer Experience: Clear errors, progress indicators, comprehensive lookups
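
The fallback behaviour listed under Resilience can be pictured with a small, generic sketch. This is not the library's implementation: `SourceUnavailable`, `with_fallback`, and the two fetch functions are hypothetical names used only to show the pattern of trying a primary source and switching to a secondary one on failure.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


class SourceUnavailable(Exception):
    """Hypothetical error for a failed or rate-limited primary source."""


async def with_fallback(
    primary: Callable[[], Awaitable[T]],
    fallback: Callable[[], Awaitable[T]],
) -> T:
    """Return the primary result, or the fallback result if the primary fails."""
    try:
        return await primary()
    except SourceUnavailable:
        return await fallback()


async def fetch_from_files() -> list[str]:
    # Stand-in for a download that hits a rate limit.
    raise SourceUnavailable("HTTP 429")


async def fetch_from_bigquery() -> list[str]:
    # Stand-in for the equivalent BigQuery query.
    return ["event-1", "event-2"]


rows = asyncio.run(with_fallback(fetch_from_files, fetch_from_bigquery))
print(rows)
```

Catching only a dedicated exception type keeps programming errors (for example a `TypeError`) from silently triggering the fallback.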

Installation

# Basic installation
pip install gdelt-py

# With BigQuery support (quotes keep shells such as zsh from expanding the brackets)
pip install "gdelt-py[bigquery]"

# With all optional dependencies
pip install "gdelt-py[bigquery,pandas]"

Quick Start

import asyncio
from datetime import date, timedelta

from py_gdelt import GDELTClient
from py_gdelt.filters import DateRange, EventFilter


async def main() -> None:
    async with GDELTClient() as client:
        # Query yesterday's events where actor 1 is from the USA
        yesterday = date.today() - timedelta(days=1)
        event_filter = EventFilter(
            date_range=DateRange(start=yesterday, end=yesterday),
            actor1_country="USA",
        )

        result = await client.events.query(event_filter)
        print(f"Found {len(result)} events")


# "async with" must run inside a coroutine, so drive it with asyncio.run
asyncio.run(main())

Data Sources Covered

File-Based Endpoints

  • Events - Structured event data (who, what, when, where)
  • Mentions - Article mentions of events
  • GKG - Global Knowledge Graph (themes, entities, quotations)
  • NGrams - Word and phrase occurrences in articles

REST APIs

  • DOC 2.0 - Article search and discovery
  • GEO 2.0 - Geographic analysis and mapping
  • Context 2.0 - Contextual analysis (themes, entities, sentiment)
  • TV - Television news transcript search
  • TVAI - AI-enhanced TV transcript search

Lookup Tables

  • CAMEO - Event classification codes
  • Themes - GDELT theme taxonomy
  • Countries - Country code conversions (FIPS, ISO2, ISO3)
  • Ethnic/Religious Groups - Group classifications
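
Country code conversion matters because the same two letters can mean different countries in different schemes. The sketch below uses a tiny hand-written mapping, not the library's lookup data, and `to_iso3` is a hypothetical helper rather than part of the gdelt-py API.

```python
# Hand-written excerpt for illustration only; not the library's lookup tables.
FIPS_TO_ISO3 = {
    "US": "USA",
    "UK": "GBR",
    "GM": "DEU",
    "JA": "JPN",
    "CH": "CHN",  # FIPS 10-4 "CH" is China
    "SZ": "CHE",  # FIPS 10-4 "SZ" is Switzerland
}
ISO2_TO_ISO3 = {
    "US": "USA",
    "GB": "GBR",
    "DE": "DEU",
    "JP": "JPN",
    "CN": "CHN",
    "CH": "CHE",  # ISO 3166-1 alpha-2 "CH" is Switzerland
}


def to_iso3(code: str, scheme: str) -> str:
    """Convert a FIPS or ISO2 country code to ISO3 (hypothetical helper)."""
    tables = {"fips": FIPS_TO_ISO3, "iso2": ISO2_TO_ISO3}
    return tables[scheme][code.upper()]


print(to_iso3("CH", "fips"))  # CHN
print(to_iso3("CH", "iso2"))  # CHE
```

Passing the scheme explicitly avoids guessing which standard an ambiguous code such as "CH" belongs to.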

Data Source Matrix

| Data Type | API | BigQuery | Raw Files | Time Constraint | Fallback |
|---|---|---|---|---|---|
| Articles (fulltext) | DOC 2.0 | - | - | Rolling 3 months | No |
| Article geo heatmaps | GEO 2.0 | - | - | Rolling 7 days | No |
| Sentence-level context | Context 2.0 | - | - | Rolling 72 hours | No |
| TV captions | TV 2.0 | - | - | July 2009+ | No |
| Events v2 | - | Yes | Yes | Feb 2015+ | Yes |
| Events v1 | - | Yes | Yes | 1979 - Feb 2015 | Yes |
| Mentions | - | Yes | Yes | Feb 2015+ | Yes |
| GKG v2 | - | Yes | Yes | Feb 2015+ | Yes |
| GKG v1 | - | Yes | Yes | 2013 - Feb 2015 | Yes |
| Web NGrams 3.0 | - | Yes | Yes | Jan 2020+ | Yes |
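
The rolling time constraints in the matrix can be checked before sending a request. The following is a minimal sketch, not library code: `ROLLING_WINDOWS` and `in_window` are hypothetical names, and "3 months" is approximated as 90 days.

```python
from datetime import datetime, timedelta, timezone

# Rolling windows from the matrix; "3 months" is approximated as 90 days (assumption).
ROLLING_WINDOWS = {
    "doc": timedelta(days=90),
    "geo": timedelta(days=7),
    "context": timedelta(hours=72),
}


def in_window(api: str, when: datetime, now: datetime) -> bool:
    """Return True if `when` lies inside the rolling window of `api` at time `now`."""
    return now - ROLLING_WINDOWS[api] <= when <= now


now = datetime(2024, 6, 30, 12, 0, tzinfo=timezone.utc)
five_days_ago = now - timedelta(days=5)

print(in_window("geo", five_days_ago, now))      # True: within 7 days
print(in_window("context", five_days_ago, now))  # False: older than 72 hours
```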

Key Concepts

Async-First Design

All I/O operations are async by default for optimal performance:

async with GDELTClient() as client:
    articles = await client.doc.query(doc_filter)

Synchronous wrappers are available for compatibility:

with GDELTClient() as client:
    articles = client.doc.query_sync(doc_filter)
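
One common way to offer a synchronous wrapper around an async method is to drive the coroutine with `asyncio.run`. The sketch below illustrates that general pattern with a hypothetical `ArticleEndpoint` class; it is not taken from the library's source and gdelt-py may implement `query_sync` differently.

```python
import asyncio


class ArticleEndpoint:
    """Hypothetical endpoint used only to illustrate the pattern."""

    async def query(self, keyword: str) -> list[str]:
        await asyncio.sleep(0)  # stand-in for network I/O
        return [f"article about {keyword}"]

    def query_sync(self, keyword: str) -> list[str]:
        # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
        # It raises RuntimeError when called from inside a running event loop.
        return asyncio.run(self.query(keyword))


articles = ArticleEndpoint().query_sync("climate")
print(articles)
```

Because `asyncio.run` cannot be nested, a wrapper built this way suits plain scripts; inside an already running event loop, the async method should be awaited directly.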

Streaming for Efficiency

Process large datasets without loading everything into memory:

async with GDELTClient() as client:
    async for event in client.events.stream(event_filter):
        process(event)  # Memory-efficient
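
To see why streaming keeps memory use flat, consider a self-contained async generator that yields records batch by batch instead of building one large list. `stream_events` and `count_events` here are hypothetical stand-ins, not the library's `client.events.stream`.

```python
import asyncio
from collections.abc import AsyncIterator


async def stream_events(total: int, batch_size: int = 3) -> AsyncIterator[dict]:
    """Yield events one at a time, fetching them in small batches."""
    for start in range(0, total, batch_size):
        await asyncio.sleep(0)  # stand-in for downloading one batch
        for event_id in range(start, min(start + batch_size, total)):
            yield {"id": event_id}


async def count_events() -> int:
    seen = 0
    # Only the current event is held in memory at any point.
    async for _event in stream_events(total=10):
        seen += 1
    return seen


count = asyncio.run(count_events())
print(count)
```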

Type Safety

Pydantic models throughout with full type hints:

event: Event = result[0]
print(event.goldstein_scale)  # Typed attribute; a bare assert would fail on a score of 0.0

Configuration

Flexible configuration via environment variables, TOML files, or programmatic settings:

from pathlib import Path

settings = GDELTSettings(
    timeout=60,
    max_retries=5,
    cache_dir=Path("/custom/cache"),
)

async with GDELTClient(settings=settings) as client:
    ...

Documentation

Full documentation is available at: https://rbozydar.github.io/py-gdelt/

Contributing

Contributions are welcome! See the Contributing Guide for details.

License

MIT License - see LICENSE file for details.
