Web Scraper for Poland COVID19 data.

Web Scraper of COVID-19 data for Poland

The Python package covid19poland is part of the MFRatio project.

It provides access to data on COVID-19 deaths in Poland as well as overall death data.

Setup and usage

Install with pip:

pip install covid19poland

The current version includes several data sources:

  • COVID-19 deaths from Wikipedia
  • Online parser of the Polish Ministry of Health's Twitter account
  • Offline, manually checked data from the online parser

The package is regularly updated. Upgrade with

pip install --upgrade covid19poland
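
Because the package is updated regularly, it can help to check which version is currently installed before and after upgrading. The following is a minimal sketch using only the standard library (Python 3.8 or later); the helper name installed_version is hypothetical and not part of covid19poland.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="covid19poland"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


print(installed_version())
```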

Wikipedia

The table is taken from the early-June version of the Wikipedia page https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Poland

import covid19poland as PL

x = PL.wiki()

Once a better tabular source is found, it will replace the current one.

Parametrization

The level parameter sets the granularity of the data:

  1. Country level (default)
  2. State level

import covid19poland as PL

# country level
x1 = PL.fetch(level = 1)
# state level
x2 = PL.fetch(level = 2)
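
The bare integers 1 and 2 are easy to mix up in calling code. One option is to give them names, as in the sketch below. Level and fetch_at are hypothetical helpers, not part of covid19poland; the only assumption about the package is that fetch accepts level = 1 or level = 2, as shown above.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical names for the two documented granularity levels."""
    COUNTRY = 1  # documented as the default
    STATE = 2


def fetch_at(fetch, level=Level.COUNTRY):
    """Call `fetch` (for example PL.fetch) with a validated level.

    `fetch` is passed in as an argument so that this sketch does not
    depend on the package being importable.
    """
    level = Level(level)  # raises ValueError for anything other than 1 or 2
    return fetch(level=int(level))
```

With the package installed, fetch_at(PL.fetch, Level.STATE) would then be equivalent to PL.fetch(level = 2).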

Twitter data

The data from Twitter can be downloaded and parsed with

data, filtered, checklist = PL.twitter(start = "2020-06-01", end = "2020-07-01")
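
To cover a longer period one month at a time, the date range can be split into month-sized pieces first. The helper below is a hypothetical sketch using only the standard library. It mirrors the call above, where each range runs from the first day of a month to the first day of the next; whether twitter() treats the end date as inclusive is not stated here, so check for overlaps at the boundaries.

```python
from datetime import date


def month_ranges(start, end):
    """Yield (start, end) pairs of ISO date strings, one pair per month.

    `start` and `end` are ISO strings such as "2020-06-01". Each pair
    ends on the first day of the following month, or on `end` if that
    comes sooner.
    """
    current = date.fromisoformat(start)
    stop = date.fromisoformat(end)
    while current < stop:
        # First day of the following month, rolling over the year in December.
        if current.month == 12:
            nxt = date(current.year + 1, 1, 1)
        else:
            nxt = date(current.year, current.month + 1, 1)
        yield current.isoformat(), min(nxt, stop).isoformat()
        current = nxt


for month_start, month_end in month_ranges("2020-06-01", "2020-09-01"):
    print(month_start, month_end)
```

Each pair can then be passed on as PL.twitter(start = month_start, end = month_end).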

To turn on logging, run the following code before calling twitter().

import logging
logging.basicConfig(level = logging.INFO)

The twitter() call returns three values:

  • data - the deceased people, with their place and date of death
  • filtered - tweets that were filtered out, kept only to verify that nothing was missed
  • checklist - list of dates that the parser is not sure about

The data can be saved to output files with

import json

with open("data/6_in.json", "w") as fd:
    json.dump(data, fd)
with open("data/6_out.json", "w") as fd:
    json.dump(filtered, fd)
print(checklist)
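
The open() calls above raise FileNotFoundError if the data/ directory does not exist yet. The sketch below wraps the save step in a hypothetical helper, save_json, that creates the directory first. It assumes the objects are JSON-serializable, as the json.dump calls above imply.

```python
import json
from pathlib import Path


def save_json(obj, path):
    """Write `obj` as JSON to `path`, creating parent directories first."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fd:
        # ensure_ascii=False keeps non-ASCII characters readable in the file.
        json.dump(obj, fd, ensure_ascii=False, indent=2)
    return path
```

With this helper, the two writes become save_json(data, "data/6_in.json") and save_json(filtered, "data/6_out.json").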

Offline data

The Twitter data has already been checked manually and is shipped with the package. Use the read() function from the offline submodule to load it:

import covid19poland as PL

x = PL.offline.read()

The result is a pandas.DataFrame with one row per deceased person.

Contribution

Developed by Martin Benes.

Join on GitHub.
