Web Scraper for Poland COVID19 data.

Web Scraper of COVID-19 data for Poland

The Python package covid19poland is part of the MFRatio project.

It provides access to data on deaths due to COVID-19 in Poland, as well as data on overall deaths.

Setup and usage

Install with pip:

pip install covid19poland

The current version includes several data sources:

  • COVID-19 deaths from Wikipedia
  • Online parser of the Polish Ministry of Health's Twitter feed
  • Offline, manually checked data from the online parser

The package is regularly updated. Upgrade with

pip install --upgrade covid19poland

Wikipedia

The table is taken from the version of the Wikipedia page https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Poland from the beginning of June.

import covid19poland as PL

x = PL.wiki()

Once a better tabular source is found, it will replace the current one.

Parametrization

The level parameter sets the granularity of the data:

  1. Country level (default)
  2. State level

import covid19poland as PL

# country level
x1 = PL.fetch(level = 1)
# state level
x2 = PL.fetch(level = 2)

Twitter data

The Twitter data can be downloaded and parsed with

import covid19poland as PL

data, filtered, checklist = PL.twitter(start = "2020-06-01", end = "2020-07-01")
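
The example range above covers one month, so a longer period could be fetched one month at a time. The following is a minimal sketch of how such date windows might be generated. The helper month_windows is hypothetical and not part of covid19poland, and whether twitter() treats end as inclusive or exclusive is not stated here, so check for duplicated days at window boundaries.

```python
from datetime import date


def month_windows(start: date, end: date):
    """Yield (start, end) ISO date strings, one pair per calendar month.

    Each window runs from `current` to the first day of the next month,
    and the last window is clipped to `end`.
    """
    current = start
    while current < end:
        # First day of the following month.
        if current.month == 12:
            nxt = date(current.year + 1, 1, 1)
        else:
            nxt = date(current.year, current.month + 1, 1)
        stop = min(nxt, end)
        yield current.isoformat(), stop.isoformat()
        current = stop


windows = list(month_windows(date(2020, 4, 1), date(2020, 7, 1)))
for s, e in windows:
    print(s, e)
    # Assumed usage, not executed here:
    # data, filtered, checklist = PL.twitter(start=s, end=e)
```

Each pair can then be passed as the start and end arguments of twitter().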

To turn on logging, run the following code before calling twitter().

import logging
logging.basicConfig(level = logging.INFO)

The twitter() call returns three values:

  • data - the deceased people with their place and date of death
  • filtered - tweets that were filtered out, kept only to validate that nothing was missed
  • checklist - list of dates that the parser is not sure about

The data can be saved to output files with

import json

# the data/ directory must already exist
with open("data/6_in.json", "w") as fd:
    json.dump(data, fd)
with open("data/6_out.json", "w") as fd:
    json.dump(filtered, fd)
print(checklist)
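
Writing the files fails if the data directory does not exist or if the values are not JSON-serializable. The sketch below shows one way to guard against both. The helper save_json is hypothetical, and the sample record is a stand-in, since the real structure returned by twitter() may differ.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def save_json(obj, path):
    """Write obj to path as JSON, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fd:
        # default=str turns values json cannot encode (e.g. dates) into strings;
        # ensure_ascii=False keeps Polish characters readable in the file.
        json.dump(obj, fd, default=str, ensure_ascii=False, indent=2)
    return path


# Stand-in record; the real structure returned by twitter() may differ.
sample = [{"date": date(2020, 6, 1), "place": "Łódź"}]
out = save_json(sample, Path(tempfile.mkdtemp()) / "data" / "6_in.json")
loaded = json.loads(out.read_text(encoding="utf-8"))
```

With such a helper, the calls would be save_json(data, "data/6_in.json") and save_json(filtered, "data/6_out.json").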

Offline data

The Twitter data has already been manually checked and is included in the package. Use the read() function from the offline submodule to load it:

import covid19poland as PL

x = PL.offline.read()

The result is a pandas.DataFrame with one row per deceased person.

Contribution

Developed by Martin Benes.

Join on GitHub.
