Client for Swiss parliament API
swissparlpy
This module provides easy access to the data of the OData webservice of the Swiss parliament.
Installation
swissparlpy is available on PyPI, so to install it simply use:
$ pip install swissparlpy
Usage
See the examples directory for more scripts.
Get tables and their variables
>>> import swissparlpy as spp
>>> spp.get_tables()[:5] # get first 5 tables
['MemberParty', 'Party', 'Person', 'PersonAddress', 'PersonCommunication']
>>> spp.get_variables('Party') # get variables of table `Party`
['ID', 'Language', 'PartyNumber', 'PartyName', 'StartDate', 'EndDate', 'Modified', 'PartyAbbreviation']
Get data of a table
>>> import swissparlpy as spp
>>> data = spp.get_data('Canton', Language='DE')
>>> data
<swissparlpy.client.SwissParlResponse object at 0x7f8e38baa610>
>>> data.count
26
>>> data[0]
{'ID': 2, 'Language': 'DE', 'CantonNumber': 2, 'CantonName': 'Bern', 'CantonAbbreviation': 'BE'}
>>> [d['CantonName'] for d in data]
['Bern', 'Neuenburg', 'Genf', 'Wallis', 'Uri', 'Schaffhausen', 'Jura', 'Basel-Stadt', 'St. Gallen', 'Obwalden', 'Appenzell A.-Rh.', 'Solothurn', 'Waadt', 'Zug', 'Aargau', 'Basel-Landschaft', 'Luzern', 'Thurgau', 'Freiburg', 'Appenzell I.-Rh.', 'Schwyz', 'Graubünden', 'Glarus', 'Tessin', 'Zürich', 'Nidwalden']
The return value of get_data is iterable, so you can easily loop over it. You can also use indices to access elements, e.g. data[1] to get the second element, or data[-1] to get the last one.
Even slicing is supported, so you can iterate over only the first 5 elements:
for rec in data[:5]:
    print(rec)
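Since each record is a plain dict (as in the data[0] output above), ordinary Python tools work on the result. The following is a minimal sketch, not part of swissparlpy: index_by is a hypothetical helper that builds a lookup table from any iterable of records. The sample row is copied from the output above so the block runs without network access.

```python
def index_by(records, key):
    """Return a dict mapping record[key] to the record itself.

    `records` can be any iterable of dicts, e.g. the return value of
    spp.get_data(...). On duplicate keys, later records overwrite earlier ones.
    """
    return {record[key]: record for record in records}


# Sample row copied from the data[0] output shown above; with the real
# library you would pass spp.get_data('Canton', Language='DE') instead.
sample = [
    {'ID': 2, 'Language': 'DE', 'CantonNumber': 2,
     'CantonName': 'Bern', 'CantonAbbreviation': 'BE'},
]

cantons = index_by(sample, 'CantonAbbreviation')
print(cantons['BE']['CantonName'])  # prints: Bern
```

A dict keyed by abbreviation avoids scanning the whole result each time a single canton is needed.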
Use together with pandas
To create a pandas DataFrame from get_data, simply pass the return value to the constructor:
>>> import swissparlpy as spp
>>> import pandas as pd
>>> parties = spp.get_data('Party', Language='DE')
>>> parties_df = pd.DataFrame(parties)
>>> parties_df
ID Language PartyNumber ... EndDate Modified PartyAbbreviation
0 12 DE 12 ... 2000-01-01 00:00:00+00:00 2010-12-26 13:05:26.430000+00:00 SP
1 13 DE 13 ... 2000-01-01 00:00:00+00:00 2010-12-26 13:05:26.430000+00:00 SVP
2 14 DE 14 ... 2000-01-01 00:00:00+00:00 2010-12-26 13:05:26.430000+00:00 CVP
3 15 DE 15 ... 2000-01-01 00:00:00+00:00 2010-12-26 13:05:26.430000+00:00 FDP-Liberale
4 16 DE 16 ... 2000-01-01 00:00:00+00:00 2010-12-26 13:05:26.430000+00:00 LDP
.. ... ... ... ... ... ... ...
78 1582 DE 1582 ... 2000-01-01 00:00:00+00:00 2015-12-03 08:48:38.250000+00:00 BastA
79 1583 DE 1583 ... 2000-01-01 00:00:00+00:00 2019-03-07 17:24:15.013000+00:00 CVPO
80 1584 DE 1584 ... 2000-01-01 00:00:00+00:00 2019-11-08 17:28:43.947000+00:00 Al
81 1585 DE 1585 ... 2000-01-01 00:00:00+00:00 2019-11-08 17:41:39.513000+00:00 EàG
82 1586 DE 1586 ... 2000-01-01 00:00:00+00:00 2021-08-12 07:59:22.627000+00:00 M-E
[83 rows x 8 columns]
Substrings
If you want to query for substrings there are two main operators to use:
__startswith:
>>> import swissparlpy as spp
>>> persons = spp.get_data("Person", Language="DE", LastName__startswith='Bal')
>>> persons.count
12
__contains:
>>> import swissparlpy as spp
>>> co2_business = spp.get_data("Business", Title__contains="CO2", Language = "DE")
>>> co2_business.count
265
You can suffix any field with those operators to query the data.
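To make the suffix convention concrete, here is a small client-side sketch that applies the same Field__operator=value keywords to plain dicts. It is an illustration only: swissparlpy sends the condition to the server, matches is a hypothetical helper rather than a library function, and the case-sensitive matching used here is an assumption about the semantics.

```python
def matches(record, **conditions):
    """Check a dict against keywords such as LastName__startswith='Bal'.

    A keyword without a suffix means plain equality.
    Assumption: prefix and substring matches are case-sensitive.
    """
    for keyword, expected in conditions.items():
        field, _, operator = keyword.partition('__')
        value = record[field]
        if operator == '':
            ok = value == expected
        elif operator == 'startswith':
            ok = value.startswith(expected)
        elif operator == 'contains':
            ok = expected in value
        else:
            raise ValueError(f"unsupported operator: {operator}")
        if not ok:
            return False
    return True


# Hypothetical records, made up for this illustration only.
people = [
    {'LastName': 'Balmer', 'Language': 'DE'},
    {'LastName': 'Seiler', 'Language': 'DE'},
]
selected = [p['LastName'] for p in people
            if matches(p, LastName__startswith='Bal', Language='DE')]
print(selected)  # prints: ['Balmer']
```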
Date ranges
To query for date ranges you can use the following operators in combination with a datetime object:
- __gt (greater than)
- __gte (greater than or equal)
- __lt (less than)
- __lte (less than or equal)
>>> import swissparlpy as spp
>>> from datetime import datetime
>>> business = spp.get_data(
... "Business",
... Language="DE",
... SubmissionDate__gt=datetime.fromisoformat('2019-09-30'),
... SubmissionDate__lte=datetime.fromisoformat('2019-10-31')
... )
>>> business.count
22
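Because __gt excludes the lower bound and __lte includes the upper one, consecutive windows of the form (start, end] cover a period without overlapping. The sketch below splits a period into such windows using only the standard library. date_windows is a hypothetical helper and the seven-day step is an arbitrary choice.

```python
from datetime import datetime, timedelta


def date_windows(start, end, step=timedelta(days=7)):
    """Yield (lower, upper) pairs that split (start, end] into windows.

    Each pair is meant for Field__gt=lower and Field__lte=upper, so a record
    dated exactly on a boundary belongs to exactly one window.
    """
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    lower = start
    while lower < end:
        upper = min(lower + step, end)
        yield lower, upper
        lower = upper


windows = list(date_windows(datetime(2019, 9, 30), datetime(2019, 10, 31)))
for lower, upper in windows:
    # With the library, each window could be queried separately, e.g.
    # spp.get_data("Business", Language="DE",
    #              SubmissionDate__gt=lower, SubmissionDate__lte=upper)
    print(lower.date(), upper.date())
```

Splitting a range this way keeps each request small, which also helps with the server-side errors that large queries can trigger.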
Advanced filter
Text query
It's possible to write text queries using operators like eq (equals), ne (not equals), lt/lte (less than/less than or equals), gt/gte (greater than/greater than or equals), startswith() and contains:
import swissparlpy as spp
import pandas as pd
persons = spp.get_data(
    "Person",
    filter="(startswith(FirstName, 'Ste') or LastName eq 'Seiler') and Language eq 'DE'"
)
df = pd.DataFrame(persons)
print(df[['FirstName', 'LastName']])
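When a filter string is assembled from variables, values that contain a single quote need care. The sketch below builds the same filter as above from Python values. It assumes the OData convention of escaping a single quote inside a string literal by doubling it. quote, eq and startswith are hypothetical helpers, not part of swissparlpy.

```python
def quote(value):
    """Render a Python string as an OData string literal.

    Assumption: a single quote inside the literal is escaped by doubling it.
    """
    return "'" + value.replace("'", "''") + "'"


def eq(field, value):
    """Build an equality condition, e.g. LastName eq 'Seiler'."""
    return f"{field} eq {quote(value)}"


def startswith(field, prefix):
    """Build a prefix condition, e.g. startswith(FirstName, 'Ste')."""
    return f"startswith({field}, {quote(prefix)})"


name_filter = (
    f"({startswith('FirstName', 'Ste')} or {eq('LastName', 'Seiler')})"
    f" and {eq('Language', 'DE')}"
)
print(name_filter)
# The string can then be passed on: spp.get_data("Person", filter=name_filter)
```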
Callable Filter
You can provide a callable as a filter which allows for more advanced filters.
swissparlpy.filter provides or_ and and_.
import swissparlpy as spp
import pandas as pd
# filter by FirstName == 'Stefan' OR LastName == 'Seiler'
def filter_by_name(ent):
    return spp.filter.or_(
        ent.FirstName == 'Stefan',
        ent.LastName == 'Seiler'
    )
persons = spp.get_data("Person", filter=filter_by_name, Language='DE')
df = pd.DataFrame(persons)
print(df[['FirstName', 'LastName']])
Large queries
Large queries (especially the tables Voting and Transcripts) may result in server-side errors (500 Internal Server Error). In these cases it is recommended to download the data in smaller batches, save the individual blocks and combine them after the download.
This is an example script to download all votes of the legislative period 50, session by session, and combine them afterwards in one DataFrame:
import swissparlpy as spp
import pandas as pd
import os
__location__ = os.path.realpath(os.getcwd())
path = os.path.join(__location__, "voting50")
# download votes of one session and save as pickled DataFrame
def save_votes_of_session(id, path):
    if not os.path.exists(path):
        os.mkdir(path)
    data = spp.get_data("Voting", Language="DE", IdSession=id)
    print(f"{data.count} rows loaded.")
    df = pd.DataFrame(data)
    pickle_path = os.path.join(path, f'{id}.pks')
    df.to_pickle(pickle_path)
    print(f"Saved pickle at {pickle_path}")

# get all sessions of the 50th legislative period
sessions50 = spp.get_data("Session", Language="DE", LegislativePeriodNumber=50)
sessions50.count
for session in sessions50:
    print(f"Loading session {session['ID']}")
    save_votes_of_session(session['ID'], path)

# combine all pickled blocks into one DataFrame
df_voting50 = pd.concat([pd.read_pickle(os.path.join(path, x)) for x in os.listdir(path)])
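The same batching idea can be made resumable, so that a failed run does not repeat completed downloads. The following is a sketch under stated assumptions, not the library's API: download_batches and fetch are hypothetical names, fetch stands for any callable that returns the records of one batch (for example a wrapper around spp.get_data), and the retry count and delay are arbitrary. It uses the standard library pickle module instead of pandas so that it runs on its own.

```python
import os
import pickle
import tempfile
import time


def download_batches(keys, fetch, out_dir, retries=3, delay=1.0):
    """Fetch one batch per key and pickle each result to out_dir/<key>.pkl.

    `keys` must be a list (it is iterated twice). Batches whose file already
    exists are skipped, so the function can be re-run after a failure.
    Returns all records combined into one list.
    """
    os.makedirs(out_dir, exist_ok=True)
    for key in keys:
        target = os.path.join(out_dir, f"{key}.pkl")
        if os.path.exists(target):
            continue  # already downloaded in an earlier run
        for attempt in range(1, retries + 1):
            try:
                records = list(fetch(key))
                break
            except Exception:  # broad on purpose: the error type is not assumed
                if attempt == retries:
                    raise  # give up after the last attempt
                time.sleep(delay)
        with open(target, "wb") as fh:
            pickle.dump(records, fh)

    combined = []
    for key in keys:
        with open(os.path.join(out_dir, f"{key}.pkl"), "rb") as fh:
            combined.extend(pickle.load(fh))
    return combined


# Stand-in for a real download; the session IDs below are made up.
# A real fetch could be:
#   lambda sid: spp.get_data("Voting", Language="DE", IdSession=sid)
def fake_fetch(session_id):
    return [{"IdSession": session_id, "row": n} for n in range(2)]


with tempfile.TemporaryDirectory() as tmp:
    rows = download_batches([5001, 5002], fake_fetch, tmp)
    print(len(rows))  # prints: 4
```

The combined list can be passed to pd.DataFrame in the same way as the return value of get_data.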
Documentation
A reference of the tables is available here. It contains a dependency diagram of all tables, detailed descriptions, and the code needed to generate this interactive documentation. The documentation can be recreated using dbdiagram.io.
Below is a first look at the dependencies between the tables in the API:
Credits
This library is inspired by the R package swissparl of David Zumbach.
Ralph Straumann initially asked about a Python version of swissparl on Twitter, which led to this project.
Development
To develop on this project, install flit:
pip install flit
flit install -s
Release
To create a new release, follow these steps (please respect Semantic Versioning):
- Adapt the version number in swissparlpy/__init__.py
- Update the CHANGELOG with the version
- Create a pull request to merge develop into main (make sure the tests pass!)
- Create a new release/tag on GitHub (on the main branch)
- The publication on PyPI happens via GitHub Actions on every tagged commit