Python-based framework to retrieve Global Database of
GDELT in Python with gdeltPyR
gdeltPyR is a Python-based framework to access and analyze Global Database of Events, Language, and Tone (GDELT) 1.0 and 2.0 data in Python Pandas or R dataframes (R dataframe output feature coming soon). A user can enter a single date, a date range (two strings), or several individual dates and get back a tidy data set ready for scientific or data-driven exploration.
gdeltPyR retrieves GDELT 1.0 and 2.0 data via parallel HTTP GET requests and is an alternative to accessing GDELT data via Google BigQuery. Because the requests run in parallel, the more CPUs or cores you have, the less time it takes to pull more data. Likewise, the more RAM you have, the more data you can pull. For RAM-limited workflows, create a pipeline that pulls data, writes it to disk, and flushes memory. The only limitation on data pulls with gdeltPyR is your hardware.
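The pull, write, flush pipeline for RAM-limited workflows could be sketched roughly as follows. This is a minimal illustration, not part of gdeltPyR: `daily_dates` and `pull_to_disk` are hypothetical helper names, and `fetch` stands for any caller-supplied function that takes a date string and returns an object with a pandas-style `to_csv` method.

```python
import datetime as dt
from pathlib import Path


def daily_dates(start, end):
    """Return every day from start to end (inclusive) as 'YYYY MM DD' strings."""
    first = dt.datetime.strptime(start, "%Y %m %d").date()
    last = dt.datetime.strptime(end, "%Y %m %d").date()
    days = (last - first).days
    return [(first + dt.timedelta(days=i)).strftime("%Y %m %d")
            for i in range(days + 1)]


def pull_to_disk(dates, fetch, out_dir):
    """Fetch one day at a time, write it to CSV, and release it before the next pull.

    `fetch` is a hypothetical, caller-supplied callable: it takes a date
    string and returns an object exposing a pandas-style `to_csv(path)`.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for date in dates:
        frame = fetch(date)                      # pull a single day
        path = out_dir / f"events_{date.replace(' ', '')}.csv"
        frame.to_csv(path)                       # write to disk
        written.append(path)
        del frame                                # drop the reference so memory can be reclaimed
    return written
```

With gdeltPyR, `fetch` might be something like `lambda d: gd.Search(d, table='events')` for a `gd` created with `gdelt.gdelt(version=2)`. Pulling one day per iteration keeps only a single day's data in memory at a time, at the cost of giving up some of the parallelism of a multi-day pull.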
The GDELT Project advertises itself as the largest, most comprehensive, and highest resolution open database of human society ever created. It monitors print, broadcast, and web news media in over 100 languages from every country in the world to stay continually updated on breaking developments anywhere on the planet. Its historical archives stretch back to January 1, 1979, and it tracks the world’s breaking events and reactions in near-realtime, as both the GDELT Event and Global Knowledge Graph databases update every 15 minutes. Visit the GDELT website to learn more about the project.
New Features
Added geodataframe output. This can be easily converted into a shapefile or choropleth visualization.
Added continuous integration testing for Windows, OSX, and Linux (Ubuntu)
Normalized columns output; export data with SQL ready columns (no special characters, all lowercase)
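To illustrate what SQL-ready column names look like, here is a minimal sketch of one way to normalize them. It is an assumption for illustration only and not necessarily how gdeltPyR implements the feature; `normalize_column` and `normalize_columns` are hypothetical names.

```python
import re


def normalize_column(name):
    """Lowercase a column name and collapse runs of special characters into '_'."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    return cleaned.strip("_").lower()


def normalize_columns(names):
    """Apply normalize_column to every name, preserving order."""
    return [normalize_column(n) for n in names]
```

On a pandas dataframe `df`, this could be applied with `df.columns = normalize_columns(df.columns)` before exporting to a SQL database.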
Coming Soon (version 0.1.11, as of 29 May 2017)
Query Google’s BigQuery directly from gdeltPyR using the pandas.io.gbq interface; requires authentication and Google Compute account
Adding a query for GDELT Visual Knowledge Graph (VGKG)
Adding a query for GDELT American Television Global Knowledge Graph (TV-GKG)
Installation
The latest release installs from PyPI:
pip install gdelt
Latest dev version of gdeltPyR can be installed from GitHub.com:
pip install git+https://github.com/linwoodc3/gdeltPyR
Basic Usage
#############################
# Import gdeltPyR; instantiate
#############################
import gdelt

# Query GDELT version 2.0
gd = gdelt.gdelt(version=2)

# Pull the events table for the date range 2016-10-19 to 2016-10-22
results = gd.Search(['2016 10 19', '2016 10 22'], table='events', coverage=True)
Full-on open source project with the following contributors:
Linwood Creekmore
- 2016-09-25
Released 0.1. Initial check-in of gdeltPyR.
- 2016-10-23
Working on 0.1 release, basic functionality all there
- 2016-10-25
Changed MANIFEST.in file; cleaned up egg.info
- 2016-10-30
Edited the warning strings and aligned some code with PEP8
- 2016-10-31
Added human readable CAMEO Codes column
- 2016-10-31
Bug fix; file not loading or downloading on install
- 2016-11-03
Added ability to pull GKG 1.0 data
- 2016-11-06
GDELT changed url structure for 2.0 events database.
- 2016-11-07
Typo on line 290 of base.py; removed and fixed bug.
- 2017-05-23
Updated to 0.1.10. Added geodataframe output. Started adding unittests (datefuncs first), added docstrings for functions, and improved PEP8 adherence. Fixed datecheck error on current day.
- 2017-05-27
Removed datetime parsing. Added unittests for events before 2013, GKG v1 before Apr 1, etc.
Hashes for gdelt-0.1.10.4-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 952c4ad5dcc3ba3f54c27d886570e2d189f8528d26909f9c569a94844c85850c
MD5 | af3c5b9f19bef3daab7a78259355515e
BLAKE2b-256 | 345d4fb5ad4ad018de41f632d20f3d74315e90407d8bb1549da8ffb2c59a13a9