Web Scraper for Eurostat data.
Eurostat
The package is a simple interface for fetching and parsing data from Eurostat:
- death counts
- population sizes
Install the package with
pip3 install eurostat_deaths
To import and fetch data, simply write
import eurostat_deaths as eurostat
The function deaths() fetches the death counts, and the function populations() fetches the populations.
Both are described in the sections below.
The package is regularly updated. Upgrade your local version by typing
pip3 install eurostat_deaths --upgrade
Deaths
from datetime import datetime
import eurostat_deaths as eurostat
data = eurostat.deaths(start = datetime(2019,1,1))
The parameter start sets the start of the data. The end is always now().
You receive weekly death counts. Since the total size of the data frame is about 218 MB, the call takes more than 15 minutes, and memory usage is significant.
In the future, the module will be reimplemented to use a Big Data framework such as PySpark.
A pandas data frame is returned.
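Because the data are weekly, it can help to check which ISO week a given start date falls in. The following is a minimal sketch using only the standard library. The helper name iso_week_label and the "YYYYWww" label format are assumptions made for illustration; they are not part of eurostat_deaths.

```python
from datetime import datetime


def iso_week_label(day: datetime) -> str:
    """Return a label such as '2019W01' for the ISO week containing `day`.

    Hypothetical helper: the label format is an assumption, not the
    package's documented output format.
    """
    iso_year, iso_week, _ = day.isocalendar()
    return f"{iso_year}W{iso_week:02d}"


# The start date used in the examples in this section.
print(iso_week_label(datetime(2019, 1, 1)))
```

Note that the ISO year can differ from the calendar year around New Year, so a start date in early January may belong to the last week of the previous ISO year.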
from datetime import datetime
import eurostat_deaths as eurostat
# does not return the data; creates a file with the result
eurostat.deaths(output = "deaths.csv", start = datetime(2019,1,1))
The parameter output = "deaths.csv" causes the output to be saved into the file deaths.csv before the call returns.
One additional setting is chunksize, which sets the size of the chunk that is processed at a time. The unit is thousands of rows.
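To illustrate what chunked processing means in general, here is a minimal sketch built on pandas' own `read_csv(chunksize=...)`. It is not the package's internal implementation: the function copy_in_chunks, the column names, and the synthetic input are all assumptions made for the example. The only detail taken from the text above is that the chunk size is given in thousands of rows.

```python
import io

import pandas as pd


def copy_in_chunks(source, target, chunksize_thousands=1):
    """Copy CSV data from `source` to `target` one chunk at a time.

    Hypothetical helper: only one chunk is held in memory at once.
    Returns the total number of data rows copied.
    """
    rows_per_chunk = chunksize_thousands * 1000  # unit: thousands of rows
    total_rows = 0
    reader = pd.read_csv(source, chunksize=rows_per_chunk)
    for index, chunk in enumerate(reader):
        # Write the header only once, with the first chunk.
        chunk.to_csv(target, header=(index == 0), index=False)
        total_rows += len(chunk)
    return total_rows


# Synthetic input with made-up column names and values.
source = io.StringIO(
    "week,deaths\n" + "".join(f"{i},{i * 2}\n" for i in range(2500))
)
target = io.StringIO()
copied = copy_in_chunks(source, target, chunksize_thousands=1)
print(copied)
```

A smaller chunk lowers peak memory usage at the cost of more iterations; a larger chunk does the opposite.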
Population
Yearly populations for NUTS-2 and NUTS-3 regions can be fetched as follows:
import eurostat_deaths as eurostat
data = eurostat.populations()
As with the deaths() call, populations() can be parametrized with chunksize (in thousands of rows) and output, which forwards the output to a file rather than returning it and hence saves the time needed to allocate a big data frame in main memory.
import eurostat_deaths as eurostat
# does not return the data; creates a file with the result
eurostat.populations(output = True)
Here the data volume is far lower, so the regular usage of returning the data frame is feasible.
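Once the populations are in a data frame, NUTS-2 and NUTS-3 rows can be told apart by the length of the region code: a NUTS code is a two-letter country code followed by one character per level, so NUTS-2 codes have four characters and NUTS-3 codes have five. The sketch below assumes a column named geo holding the region code; that column name and the sample values are made up for illustration and may not match what populations() returns.

```python
import pandas as pd


def nuts_level(code: str) -> int:
    """Return the NUTS level implied by the code length (e.g. 'CZ01' -> 2)."""
    return len(code) - 2


# Made-up sample frame; column names and population values are placeholders.
frame = pd.DataFrame(
    {
        "geo": ["CZ", "CZ01", "CZ010", "DE21", "DE212"],
        "population": [100, 40, 40, 60, 20],
    }
)

frame["nuts_level"] = frame["geo"].map(nuts_level)
nuts2 = frame[frame["nuts_level"] == 2]
nuts3 = frame[frame["nuts_level"] == 3]
print(nuts2["geo"].tolist(), nuts3["geo"].tolist())
```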
Credits
Author: Martin Benes.
File details
Details for the file eurostat_deaths-0.2.0.tar.gz.
File metadata
- Download URL: eurostat_deaths-0.2.0.tar.gz
- Upload date:
- Size: 5.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.7.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | f2c21153736e167d38605a0cb1851c4ee5bdfccb3eceaa23d01af9a4d24777c6
MD5 | 41939544b2bf61ac656191aeb34ba4cd
BLAKE2b-256 | ad4182663448c58784ce4eba481fd72eb444f3160097cd14406883801ff8579e
File details
Details for the file eurostat_deaths-0.2.0-py3-none-any.whl.
File metadata
- Download URL: eurostat_deaths-0.2.0-py3-none-any.whl
- Upload date:
- Size: 7.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.7.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1844ad345b4375caca7d4604fb5cdd61320c6e2d6650fa75ec84a4afdd05d7c4
MD5 | bddd84f01e6ae13c60f9e87e06b38575
BLAKE2b-256 | 0f2684414c5377a15a1dbab693c09a2c0d79bd523f91dea0d537c6772da4f64d