Web scraper for COVID-19 data for Poland.
The Python package covid19poland is part of the MFRatio project.
It provides access to data on COVID-19 deaths in Poland as well as overall death data.
Setup and usage
Install with pip:
pip install covid19poland
The current version includes several data sources:
- COVID-19 deaths from Wikipedia
- Online parser of the Polish Ministry of Health's Twitter feed
- Offline, manually checked data from the online parser
The package is regularly updated. Upgrade with
pip install --upgrade covid19poland
Wikipedia
The table is taken from the version of the Wikipedia page https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Poland as it stood at the beginning of June.
import covid19poland as PL
x = PL.wiki()
Once a better tabular source is found, it will replace the current one.
Parametrization
The level parameter sets the granularity of the data:
- Country level (default)
- State level
import covid19poland as PL
# country level
x1 = PL.fetch(level = 1)
# state level
x2 = PL.fetch(level = 2)
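The two levels above could be wrapped in a small helper that accepts a name instead of a number. This is only a minimal sketch: LEVELS, level_for and fetch_by_name are hypothetical names that are not part of covid19poland; only the PL.fetch(level=...) call comes from the package.

```python
# Hypothetical helper, not part of covid19poland.
LEVELS = {"country": 1, "state": 2}


def level_for(name):
    """Translate a granularity name into the numeric level used by fetch()."""
    try:
        return LEVELS[name.lower()]
    except KeyError:
        raise ValueError(
            f"unknown level {name!r}, expected one of {sorted(LEVELS)}"
        ) from None


def fetch_by_name(name="country"):
    """Call PL.fetch() with the level matching the given name."""
    # Imported lazily so level_for() works without the package installed.
    import covid19poland as PL
    return PL.fetch(level=level_for(name))
```

With the package installed, fetch_by_name("state") would be equivalent to PL.fetch(level = 2).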
Twitter data
Data from Twitter can be downloaded and parsed with
data, filtered, checklist = PL.twitter(start = "2020-06-01", end = "2020-07-01")
To turn on logging, run the following code before the twitter() function call.
import logging
logging.basicConfig(level = logging.INFO)
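If the default log lines are too terse, the same call can carry a format string. A minimal sketch using only the standard library; the helper name enable_logs and the format string are illustrative choices, not something the package prescribes, and force=True requires Python 3.8 or later.

```python
import logging


def enable_logs(level=logging.INFO):
    """Configure the root logger; call this before PL.twitter()."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace any handlers configured earlier (Python 3.8+)
    )


enable_logs()
```

Passing logging.DEBUG instead of the default would show more detail, assuming the parser emits messages at that level.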
The twitter() call returns three values:
- data - the deceased people with their place and date of death
- filtered - tweets that were filtered out, kept only to verify that nothing was missed
- checklist - list of dates that the parser is not sure about
The data can be saved to output files with
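import json
# parsed records of deceased people
with open("data/6_in.json", "w") as fd:
    json.dump(data, fd)
# tweets that were filtered out
with open("data/6_out.json", "w") as fd:
    json.dump(filtered, fd)
# dates the parser is not sure about
print(checklist)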
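The open() calls above fail if the data directory does not exist yet. A minimal sketch of a helper that creates the directory first, plus a reader for checking the result; save_json and load_json are hypothetical names, not part of covid19poland.

```python
import json
from pathlib import Path


def save_json(obj, path):
    """Write obj as JSON to path, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fd:
        # ensure_ascii=False keeps Polish characters readable in the file
        json.dump(obj, fd, ensure_ascii=False, indent=2)
    return path


def load_json(path):
    """Read a JSON file back into Python objects."""
    with Path(path).open(encoding="utf-8") as fd:
        return json.load(fd)
```

It could then be used as save_json(data, "data/6_in.json") and save_json(filtered, "data/6_out.json").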
Offline data
The Twitter data has already been manually checked and is included in the package.
Use the read() function from the offline submodule to get it
import covid19poland as PL
x = PL.offline.read()
The result is a pandas.DataFrame with one row per deceased person.
Contribution
Developed by Martin Benes.
Join on GitHub.