A web scraping package for data on members of the European Parliament.
MEP API
MEP API is a simple Python package that scrapes data on members of the European Parliament (MEPs) and outputs it as JSON.
Installation
Install the package with pip:
pip install mep_api
Usage
Scraping one MEP's information
To create an MEP object, import the package and pass the URL of an MEP's EP home page. For instance:
import mep_api
mep1 = mep_api.mep("https://www.europarl.europa.eu/meps/en/113892/ERIC_ANDRIEU/home")
Then you can add the information you want to the object:
mep1.get_personal_data()
mep1.get_committees()
mep1.get_assistants()
Alternatively, you can scrape everything at once:
mep1.scrape_all()
You can then either get a JSON string containing all of the MEP's information or write it to a file:
mep1.to_json() #returns JSON string
mep1.to_json("file.json") #writes JSON file to specified path
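Since to_json() returns a JSON string, it can be loaded with Python's standard json module for further processing. The sketch below is only an illustration and not part of mep_api: the helper name top_level_keys and the sample string are hypothetical, and it assumes the string encodes a JSON object at the top level. The actual keys depend on what the package scrapes.

```python
import json


def top_level_keys(json_string):
    """Return the sorted top-level keys of a JSON object string."""
    data = json.loads(json_string)
    if not isinstance(data, dict):
        raise TypeError("expected a JSON object at the top level")
    return sorted(data)


# Placeholder input; in practice this would be the string returned by mep1.to_json().
# The keys below are made up for illustration.
sample = '{"name": "placeholder", "committees": []}'
print(top_level_keys(sample))
```

Listing the keys this way is a quick means of inspecting the structure of the output before writing code that depends on it.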
Scraping multiple MEPs' information
You can scrape data for multiple MEPs in a single call with batch_scrape():
url_list = ["https://www.europarl.europa.eu/meps/en/113892/ERIC_ANDRIEU/home", "https://www.europarl.europa.eu/meps/en/124831/ISABELLA_ADINOLFI/home", "https://www.europarl.europa.eu/meps/en/28161/MARGRETE_AUKEN/home"]
mep_api.batch_scrape(url_list) #returns JSON string
mep_api.batch_scrape(url_list, outfile="file.json") #writes JSON file to specified path
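Before passing a long list to batch_scrape(), it can help to drop duplicates and set aside URLs that do not look like MEP home pages. The following is a minimal sketch using only the standard library. The helper split_mep_urls and the URL pattern are assumptions inferred from the example URLs in this README, not something mep_api provides or documents.

```python
import re

# Pattern inferred from the example URLs in this README; an assumption,
# not a documented rule of the European Parliament website or of mep_api.
MEP_URL_PATTERN = re.compile(
    r"^https://www\.europarl\.europa\.eu/meps/en/(\d+)/([^/]+)/home$"
)


def split_mep_urls(urls):
    """Split URLs into (valid, invalid) lists, dropping duplicates and keeping order."""
    valid, invalid = [], []
    seen = set()
    for url in urls:
        if url in seen:
            continue  # skip URLs that were already processed
        seen.add(url)
        if MEP_URL_PATTERN.match(url):
            valid.append(url)
        else:
            invalid.append(url)
    return valid, invalid


urls = [
    "https://www.europarl.europa.eu/meps/en/113892/ERIC_ANDRIEU/home",
    "https://www.europarl.europa.eu/meps/en/113892/ERIC_ANDRIEU/home",  # duplicate
    "https://example.com/not-an-mep",
]
valid, invalid = split_mep_urls(urls)
print(valid)    # URLs that match the assumed pattern
print(invalid)  # URLs set aside for manual review
```

The valid list can then be passed to batch_scrape() in place of the raw list.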
The get_mep_urls() function returns a list of all MEP home page URLs, which makes it easy to collect data on all current MEPs at once. It is also possible to scrape the available data for so-called "outgoing" MEPs, that is, MEPs who have left the Parliament during the current parliamentary term. To do so, call batch_scrape() with the argument add_outgoing=True (it is False by default). If no url_list is passed, the function collects data on outgoing MEPs only. It is not currently possible to collect data on a single outgoing MEP.
all_mep_urls = mep_api.get_mep_urls() #creates a list of all MEP URLs
mep_api.batch_scrape(all_mep_urls) #collects data on all current MEPs
mep_api.batch_scrape(all_mep_urls, add_outgoing=True) #collects data on all current MEPs and outgoing MEPs
mep_api.batch_scrape(add_outgoing=True) #collects data on outgoing MEPs only
File details
Details for the file mep_api-0.1.2.tar.gz.
File metadata
- Download URL: mep_api-0.1.2.tar.gz
- Upload date:
- Size: 6.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.0 CPython/3.8.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 629bd2e2790d27ed1c0ce2cb1a7b025be9f46257c995bf2432fb88c8bb2eae0c |
| MD5 | 3e1f4845aed730a45f48600689e95f43 |
| BLAKE2b-256 | 81c6db0220f6567d65382351886609b0452d133474e905c70f851c5637f7431e |
File details
Details for the file mep_api-0.1.2-py3-none-any.whl.
File metadata
- Download URL: mep_api-0.1.2-py3-none-any.whl
- Upload date:
- Size: 6.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.0 CPython/3.8.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 419bcc34e0498e9be382199d7a71af1c8b2667844a636f0a05c907d6172dafcb |
| MD5 | 1d171bfb00008506bcafac50fd03455c |
| BLAKE2b-256 | 993e1fccca9df6ec01c7768b4f8be37a6a548b5936dbe0a9a3cda6a5c3860130 |