Web Scraper for Poland COVID19 data.
The Python package covid19poland is part of the MFRatio project.
It provides access to data on COVID-19 deaths in Poland, as well as data on overall deaths.
Setup and usage
Install from PyPI using pip with
pip install covid19poland
The current version includes several data sources:
- COVID-19 deaths from Wikipedia
- Online parser of the Twitter feed of the Polish Ministry of Health
- Offline, manually checked data from the online parser
The package is regularly updated. Upgrade with
pip install --upgrade covid19poland
Wikipedia
The table is taken from the version of the Wikipedia page https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Poland from the beginning of June.
import covid19poland as PL
x = PL.wiki()
Once a better tabular source is found, it will replace the current one.
Parametrization
The level parameter sets the granularity of the data:
- Country level (default)
- State level
import covid19poland as PL
# country level
x1 = PL.fetch(level = 1)
# state level
x2 = PL.fetch(level = 2)
Twitter data
The data from Twitter can be downloaded and parsed with
data, filtered, checklist = PL.twitter(start="2020-06-01", end="2020-07-01")
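To cover a longer period month by month, the start and end arguments can be generated rather than typed by hand. The sketch below shows one way to do that with the standard library. month_ranges is a hypothetical helper, not part of covid19poland, and it assumes the same convention as the call above, where end is the first day of the following month.

```python
from datetime import date


def month_ranges(first, last):
    """Yield (start, end) ISO date strings, one pair per month.

    `first` and `last` are (year, month) tuples, both inclusive.
    `end` is the first day of the month after `start`.
    """
    year, month = first
    while (year, month) <= last:
        # Roll over to January of the next year after December.
        if month == 12:
            next_year, next_month = year + 1, 1
        else:
            next_year, next_month = year, month + 1
        yield (
            date(year, month, 1).isoformat(),
            date(next_year, next_month, 1).isoformat(),
        )
        year, month = next_year, next_month


for start, end in month_ranges((2020, 6), (2020, 7)):
    # Each pair could be passed on as PL.twitter(start=start, end=end).
    print(start, end)
```

Each generated pair matches the string format used in the call above, so the loop body is the place to call the parser and store the three results for that month.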
To turn on logging, run the following code before calling the twitter() function.
import logging
logging.basicConfig(level = logging.INFO)
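If logging was already configured earlier in the session (for example in a notebook), a plain basicConfig() call has no effect. The variant below is a sketch using only the standard library; the format string is an arbitrary choice, and which logger names the package uses is not documented here.

```python
import logging

# force=True (Python 3.8+) removes handlers that were configured earlier,
# so these settings apply even if logging was already set up.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s: %(message)s",
    force=True,
)
```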
The twitter() call returns three values:
- data - the deceased people with their place and date of death
- filtered - tweets that were filtered out, kept only to validate that nothing was missed
- checklist - a list of dates that the parser is not sure about
The data can be saved to output files with
import json

# the data directory must already exist
with open("data/6_in.json", "w") as fd:
    json.dump(data, fd)
with open("data/6_out.json", "w") as fd:
    json.dump(filtered, fd)
print(checklist)
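The snippet above fails if the data directory does not exist yet. A minimal sketch of a helper that creates the directory first and reads the file back to confirm the round trip is shown below. save_json is a hypothetical name, not part of covid19poland, and the sample record is a placeholder that only stands in for the real parser output, whose exact structure is not documented here.

```python
import json
import tempfile
from pathlib import Path


def save_json(obj, path):
    """Write obj as JSON to path, creating parent directories as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fd:
        # ensure_ascii=False keeps Polish characters readable in the file.
        json.dump(obj, fd, ensure_ascii=False, indent=2)
    return path


# Placeholder record, not real data; it stands in for the `data` value.
sample = [{"date": "2020-06-01", "place": "Łódź"}]

with tempfile.TemporaryDirectory() as tmp:
    out = save_json(sample, Path(tmp) / "data" / "6_in.json")
    with out.open(encoding="utf-8") as fd:
        restored = json.load(fd)

print(restored == sample)
```

With the real results, the calls would be save_json(data, "data/6_in.json") and save_json(filtered, "data/6_out.json").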
Offline data
The Twitter data has already been manually checked and is included in the package.
Use the read() function from the offline submodule to load it
import covid19poland as PL
x = PL.offline.read()
Here the result is a pandas.DataFrame with one row per deceased person.
Contribution
Developed by Martin Benes.
Join on GitHub.