Unified data hub for a better understanding of COVID-19 https://covid19datahub.io
Python Interface to COVID-19 Data Hub
Guidotti, E., Ardia, D., (2020).
COVID-19 Data Hub
Journal of Open Source Software, 5(51):2376
Setup and usage
Install from pip with
pip install covid19dh
Importing the main function
from covid19dh import covid19
x, src = covid19()
The package is regularly updated. Update with
pip install --upgrade covid19dh
covid19() returns 2 pandas dataframes:
- the data and
- references to the data sources.
country: List of country names (case-insensitive) or ISO 3166-1 codes (alpha-2, alpha-3 or numeric).
Fetching data from a particular country:
x, src = covid19("USA") # United States
Specify multiple countries at the same time:
x, src = covid19(["ESP","PT","andorra",250])
If country is omitted, the whole dataset is returned:
x, src = covid19()
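To illustrate how a mixed list such as ["ESP","PT","andorra",250] can refer to four distinct countries, here is a minimal sketch of identifier resolution. It is not the package's own lookup code: `ISO_TABLE` and `to_alpha3` are hypothetical names, and the table holds only a four-row subset of ISO 3166-1.

```python
# Hypothetical lookup table: a tiny subset of ISO 3166-1, for illustration only.
ISO_TABLE = [
    {"name": "Spain", "alpha2": "ES", "alpha3": "ESP", "numeric": 724},
    {"name": "Portugal", "alpha2": "PT", "alpha3": "PRT", "numeric": 620},
    {"name": "Andorra", "alpha2": "AD", "alpha3": "AND", "numeric": 20},
    {"name": "France", "alpha2": "FR", "alpha3": "FRA", "numeric": 250},
]


def to_alpha3(identifier):
    """Resolve a name, alpha-2, alpha-3 or numeric code to an alpha-3 code."""
    if isinstance(identifier, int):
        key = identifier
    else:
        text = str(identifier).strip()
        # Numeric codes may arrive as strings ("020"); names and codes are
        # compared in upper case so that matching is case-insensitive.
        key = int(text) if text.isdigit() else text.upper()
    for row in ISO_TABLE:
        candidates = {row["name"].upper(), row["alpha2"], row["alpha3"], row["numeric"]}
        if key in candidates:
            return row["alpha3"]
    raise ValueError(f"unknown country identifier: {identifier!r}")


codes = [to_alpha3(c) for c in ["ESP", "PT", "andorra", 250]]
print(codes)  # ['ESP', 'PRT', 'AND', 'FRA']
```

Normalizing everything to one code up front means the rest of a program only ever has to handle a single identifier format.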
raw: Logical. Skip data cleaning? With raw=False, the raw data are cleaned by filling missing dates with NaN values. This ensures that all locations share the same grid of dates and no single day is skipped. Then, NaN values are replaced with the previous non-NaN value.
x, src = covid19(raw = False)
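The cleaning step described above can be pictured with a small standard-library sketch. This is an illustration of the idea (complete daily grid, then carry the last value forward), not the package's implementation; `clean_series` is a hypothetical helper, `None` stands in for NaN, and days before the first report are simply left as `None` here.

```python
from datetime import date, timedelta


def clean_series(observations, start, end):
    """Build a complete daily grid and carry the last reported value forward.

    observations: dict mapping date -> value for the days that were reported.
    Days with no earlier report stay None (standing in for NaN in this sketch).
    """
    cleaned = {}
    last_seen = None
    day = start
    while day <= end:
        if observations.get(day) is not None:
            last_seen = observations[day]
        cleaned[day] = last_seen  # missing days inherit the previous value
        day += timedelta(days=1)
    return cleaned


reported = {date(2020, 4, 1): 10, date(2020, 4, 4): 25}
cleaned = clean_series(reported, date(2020, 3, 31), date(2020, 4, 5))
for day, value in cleaned.items():
    print(day, value)
```

In this example 2 and 3 April are filled with 10, and 5 April with 25, so every day in the range has an entry.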
start, end: Dates can be specified as datetime.date or as a str in year-month-day form, such as "2020-05-01":
from datetime import datetime
x, src = covid19("SWE", start = datetime(2020,4,1), end = "2020-05-01")
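Code that wraps covid19() may want to accept the same mix of date types. The following is a minimal sketch of one way to normalize them before use. `to_date` is a hypothetical helper, not part of covid19dh, and the "%Y-%m-%d" format is an assumption inferred from the string used in the example above.

```python
from datetime import date, datetime


def to_date(value):
    """Normalize a date, datetime or 'YYYY-MM-DD' string to datetime.date."""
    # datetime is a subclass of date, so it has to be checked first.
    if isinstance(value, datetime):
        return value.date()
    if isinstance(value, date):
        return value
    if isinstance(value, str):
        return datetime.strptime(value, "%Y-%m-%d").date()
    raise TypeError(f"unsupported date value: {value!r}")


start = to_date(datetime(2020, 4, 1))
end = to_date("2020-05-01")
print(start, end, (end - start).days)  # 2020-04-01 2020-05-01 30
```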
level: Integer. Granularity level of the data:
- 1: Country level
- 2: State, region or canton level
- 3: City or municipality level
from datetime import date
x, src = covid19("USA", level = 2, start = date(2020,5,1))
cache: Logical. Memory caching? Significantly improves performance on successive calls. Caching is enabled by default.
Caching can be disabled (e.g. for long-running programs) by:
x, src = covid19("FRA", cache = False)
vintage: Logical. Retrieve the snapshot of the dataset that was generated at the end date instead of using the latest version.
To fetch e.g. US data that were accessible on 22nd April 2020, type
x, src = covid19("USA", end = "2020-04-22", vintage = True)
The vintage data are collected at the end of the day, but published with a delay of approximately 48 hours, once the day is completed in all the timezones.
If vintage = True but end is not set, a warning is raised and None is returned.
x, src = covid19("USA", vintage = True) # too early to get today's vintage
UserWarning: vintage data not available yet
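A caller can estimate in advance whether a snapshot is likely to exist. The sketch below is a rough heuristic based only on the "approximately 48 hours" note above: `vintage_probably_available` and `PUBLICATION_DELAY` are hypothetical names, and measuring the end of the day in UTC is an assumption, so the result is a guess rather than a guarantee.

```python
from datetime import date, datetime, time, timedelta, timezone

# Assumption: taken from the "approximately 48 hours" publication delay.
PUBLICATION_DELAY = timedelta(hours=48)


def vintage_probably_available(end, now=None):
    """Guess whether the snapshot for the day `end` has been published."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Midnight (UTC) at the close of the requested day.
    end_of_day = datetime.combine(end + timedelta(days=1), time.min, tzinfo=timezone.utc)
    return now >= end_of_day + PUBLICATION_DELAY


end = date(2020, 4, 22)
too_early = datetime(2020, 4, 23, 12, 0, tzinfo=timezone.utc)
later = datetime(2020, 4, 25, 12, 0, tzinfo=timezone.utc)
print(vintage_probably_available(end, now=too_early))  # False
print(vintage_probably_available(end, now=later))      # True
```

Because the check is only approximate, the return value of covid19() should still be tested for None before it is used.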
The data sources are returned as the second value.
from covid19dh import covid19
x, src = covid19("USA")
print(src)