
A web scraping package for data on members of the European Parliament.

Project description

MEP API

MEP API is a simple Python package that scrapes data on Members of the European Parliament and outputs it as tidy JSON.

Installation

Install the package with pip:

pip install mep_api

Usage

Scraping one MEP's information

To create an MEP object, import the package and pass the URL of an MEP's European Parliament home page. For instance:

import mep_api
mep1 = mep_api.mep("https://www.europarl.europa.eu/meps/en/113892/ERIC_ANDRIEU/home")

Then you can add the information you want to the object:

mep1.get_personal_data()
mep1.get_committees()
mep1.get_assistants()

or you can scrape everything at once:

mep1.scrape_all()

You can then either get a JSON string containing all of the MEP's information or write it to a file:

mep1.to_json() # returns a JSON string
mep1.to_json("file.json") # writes a JSON file to the specified path

Scraping multiple MEPs' information

It is possible to scrape data for multiple MEPs in a single call with batch_scrape() as follows:

url_list = ["https://www.europarl.europa.eu/meps/en/113892/ERIC_ANDRIEU/home", "https://www.europarl.europa.eu/meps/en/124831/ISABELLA_ADINOLFI/home", "https://www.europarl.europa.eu/meps/en/28161/MARGRETE_AUKEN/home"]
mep_api.batch_scrape(url_list) # returns a JSON string
mep_api.batch_scrape(url_list, outfile="file.json") # writes a JSON file to the specified path
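
The file written via outfile is plain JSON and can be read back with the standard library. A minimal sketch, assuming only that the file contains one entry per MEP (the exact top-level structure, list or mapping, is not specified here):

import json

with open("file.json", "r", encoding="utf-8") as fh:
    data = json.load(fh)  # list or mapping of MEP records

print(len(data))          # number of scraped entries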

The get_mep_urls() function returns a list of all MEP home page URLs, which makes it easy to collect data on every MEP at once. It is also possible to scrape the available data for so-called "outgoing" MEPs, i.e. MEPs who have left the Parliament during the current parliamentary term. To do so, call batch_scrape() with add_outgoing=True (the default is False). If no url_list is passed, only data on outgoing MEPs is collected. Collecting data on a single outgoing MEP is not currently supported.

all_mep_urls = mep_api.get_mep_urls() # list of all current MEP home page URLs
mep_api.batch_scrape(all_mep_urls) # collects data on all current MEPs
mep_api.batch_scrape(all_mep_urls, add_outgoing=True) # collects data on all current and outgoing MEPs
mep_api.batch_scrape(add_outgoing=True) # collects data on outgoing MEPs only
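
Putting the pieces together, a full scrape can be written to disk and then loaded for analysis. A minimal sketch, assuming that outfile and add_outgoing can be combined and that the output file holds one flat record per MEP; pandas is an extra dependency used here only for illustration:

import json
import pandas as pd
import mep_api

urls = mep_api.get_mep_urls()  # all current MEP home pages
mep_api.batch_scrape(urls, add_outgoing=True, outfile="all_meps.json")  # scrape and write to disk

with open("all_meps.json", "r", encoding="utf-8") as fh:
    records = json.load(fh)

df = pd.json_normalize(records)  # flatten the records into a table
print(df.shape)                  # number of MEPs and scraped fields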

Download files

Download the file for your platform.

Source Distribution

mep_api-0.1.2.tar.gz (6.6 kB)


Built Distribution


mep_api-0.1.2-py3-none-any.whl (6.7 kB)


File details

Details for the file mep_api-0.1.2.tar.gz.

File metadata

  • Download URL: mep_api-0.1.2.tar.gz
  • Upload date:
  • Size: 6.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.0 CPython/3.8.3

File hashes

Hashes for mep_api-0.1.2.tar.gz

  • SHA256: 629bd2e2790d27ed1c0ce2cb1a7b025be9f46257c995bf2432fb88c8bb2eae0c
  • MD5: 3e1f4845aed730a45f48600689e95f43
  • BLAKE2b-256: 81c6db0220f6567d65382351886609b0452d133474e905c70f851c5637f7431e


File details

Details for the file mep_api-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: mep_api-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 6.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.0 CPython/3.8.3

File hashes

Hashes for mep_api-0.1.2-py3-none-any.whl

  • SHA256: 419bcc34e0498e9be382199d7a71af1c8b2667844a636f0a05c907d6172dafcb
  • MD5: 1d171bfb00008506bcafac50fd03455c
  • BLAKE2b-256: 993e1fccca9df6ec01c7768b4f8be37a6a548b5936dbe0a9a3cda6a5c3860130

