A package to download Dutch parliament debates and subtitles
tweedekamer
A PyPI package for retrieving Dutch parliamentary debate data.
With this package you can download Dutch parliamentary debates and their subtitles. It is based on the website https://debatgemist.tweedekamer.nl. This package is not affiliated with the Dutch parliament, and commercial use of this package is not allowed.
This package is no longer in beta, but you may still encounter bugs and missing features, as the test suite is not finished.
Installation
Install this package with pip. It requires Python 3.7.9 or later.
pip install tweedekamer
Usage/Examples
Retrieving subtitles
from tweedekamer import Search
results = Search("belasting 2022", limit=1).result
results[0].subtitle.text
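A search can return several debates, so it can be handy to collect the subtitle text of every result at once. The helper below is only a sketch: `subtitle_texts` is a hypothetical name that is not part of the package, and it assumes nothing beyond the `subtitle.text` attribute layout shown above. It is demonstrated on stand-in objects rather than real search results, so it runs without network access.

```python
from types import SimpleNamespace


def subtitle_texts(results):
    """Return the subtitle text of every debate in `results` that has one.

    Hypothetical helper: it only relies on the `.subtitle.text` layout
    shown in the example above and skips entries where it is missing.
    """
    texts = []
    for debate in results:
        subtitle = getattr(debate, "subtitle", None)
        text = getattr(subtitle, "text", None)
        if text:
            texts.append(text)
    return texts


# Stand-in objects that mimic the attribute layout; with the real package
# you would pass Search("belasting 2022", limit=5).result instead.
fake_results = [
    SimpleNamespace(subtitle=SimpleNamespace(text="Goedemorgen.")),
    SimpleNamespace(subtitle=None),
]
print(subtitle_texts(fake_results))
```

Whether a debate can actually come back without subtitles is an assumption here; the guard simply keeps the loop from failing if that happens.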
Retrieving video link
from tweedekamer import Search
results = Search("belasting 2022", limit=1).result
results[0].video.url
Retrieving speaker information
Each debate has a list of speakers.
from tweedekamer import Search
results = Search("belasting 2022", limit=1).result
results[0].speakers[0].name
results[0].speakers[0].party
results[0].speakers[0].speach.text
results[0].speakers[0].speach.subtitle
results[0].speakers[0].speach.tokenized
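Because every speaker carries a `name` and a `party`, the speaker list is easy to summarise. The following is a minimal sketch that groups speaker names by party. `speakers_by_party` is a hypothetical helper, not a function of the package, and the demo uses stand-in objects with made-up names so that it runs offline.

```python
from collections import defaultdict
from types import SimpleNamespace


def speakers_by_party(debate):
    """Group the names of a debate's speakers by party.

    Hypothetical helper: it only uses the `speakers`, `name` and `party`
    attributes shown in the example above.
    """
    grouped = defaultdict(list)
    for speaker in debate.speakers:
        grouped[speaker.party].append(speaker.name)
    return dict(grouped)


# Stand-in debate with invented speakers; with the real package you would
# pass an element of Search(...).result instead.
fake_debate = SimpleNamespace(
    speakers=[
        SimpleNamespace(name="Spreker A", party="Partij X"),
        SimpleNamespace(name="Spreker B", party="Partij Y"),
        SimpleNamespace(name="Spreker C", party="Partij X"),
    ]
)
print(speakers_by_party(fake_debate))
```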
Retrieve from list of URLs
You can also retrieve data from a list of URLs.
You can find these URLs on the website https://debatgemist.tweedekamer.nl.
Each URL must be a string that starts with https://debatgemist.tweedekamer.nl/debatten/ and the URLs must be passed as a list.
from tweedekamer import Search
Search(urls=["https://debatgemist.tweedekamer.nl/debatten/vreemdelingen-en-asielbeleid-10"]).result
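The URL requirements can be checked before a list is handed to `Search`. The sketch below is one possible pre-check. `validate_debate_urls` and `DEBATE_URL_PREFIX` are hypothetical names that are not part of the package, and how the package itself reacts to invalid URLs is not documented here.

```python
DEBATE_URL_PREFIX = "https://debatgemist.tweedekamer.nl/debatten/"


def validate_debate_urls(urls):
    """Check that `urls` is a list of strings with the expected prefix.

    Hypothetical helper: returns the list unchanged when every entry is
    valid and raises otherwise.
    """
    if not isinstance(urls, list):
        raise TypeError("urls must be a list of strings")
    invalid = [
        url
        for url in urls
        if not isinstance(url, str) or not url.startswith(DEBATE_URL_PREFIX)
    ]
    if invalid:
        raise ValueError(f"Not a debate URL: {invalid!r}")
    return urls


urls = validate_debate_urls(
    ["https://debatgemist.tweedekamer.nl/debatten/vreemdelingen-en-asielbeleid-10"]
)
# The validated list can then be passed on as Search(urls=urls).result
```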
Export to CSV
Export the results of your query to CSV. You can write one row per speaker or keep the entire debate in a single row.
from tweedekamer import Search
Search("belasting 2022", limit=1).to_csv("entire_debate")
Search("belasting 2022", limit=1).to_csv("debate_per_speaker", separate_speakers=True)
Features
- Retrieve the date and other information about a debate
- Search debates by query, date range, and debate type
- Retrieve subtitle data
- Retrieve video data
Run Locally
Use these instructions if you want to edit the package locally.
Clone the project
git clone https://github.com/micheldore/tweedekamer
Go to the project directory
cd tweedekamer
Create the virtual environment (using Python 3)
python -m venv env
source env/bin/activate
Install dependencies
pip install -r requirements.txt
Optionally, install the local version of the package in editable mode
python -m pip install -e .
Hashes for tweedekamer-0.1.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 259fcc5725ec0acd560b0ab884a15b76050125110d7030c60723d8057cb4f4e1
MD5 | aa9029264158770c811428cfbbab9576
BLAKE2b-256 | aff431ecf13b0cefc536d60bc7d5fae9970aeec0f91ffdccac685ab6fa7f2c47