GermaParlPy

DOI: 10.5281/zenodo.15180629

The GermaParlPy Python package provides functionality to deserialize, serialize, manage, and query the GermaParlTEI[^1] corpus and derived corpora.

The GermaParlTEI corpus comprises the plenary protocols of the German Bundestag (parliament), encoded in XML according to the TEI standard. The current version covers the first 19 legislative periods, encompassing transcribed speeches from the Bundestag's constituent session on 7 September 1949 to the final sitting of the Angela Merkel era in 2021. Spanning more than seven decades of parliamentary debate, it is a valuable resource for research across several disciplines.

For detailed information on the library, visit the official website.

Use Cases

Potential use cases range from the examination of research questions in political science, history or linguistics to the compilation of training data sets for AI.

In addition, this library makes it possible to access the GermaParl corpus in Python and to apply powerful NLP libraries such as spaCy or Gensim to it. Previously, the corpus could only be accessed through the polmineR package in the R programming language.
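As a minimal sketch of the kind of preprocessing that such NLP libraries typically expect, the snippet below turns speech texts into lists of tokens and counts word frequencies using only the Python standard library. The two sample sentences and the `tokenize` helper are illustrative assumptions; they stand in for speech text retrieved from the corpus and are not part of the GermaParlPy API.

```python
import re
from collections import Counter

# Placeholder speech texts (assumption): in practice these strings would
# come from the corpus rather than being hard-coded.
speeches = [
    "Die Regierung hat den Haushalt vorgelegt.",
    "Der Haushalt ist nicht ausgeglichen.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical helper)."""
    return re.findall(r"\w+", text.lower())


# One list of tokens per speech, a common input shape for NLP tooling.
tokenized = [tokenize(speech) for speech in speeches]

# Word frequencies across all speeches.
counts = Counter(token for doc in tokenized for token in doc)

print(tokenized[0])
print(counts.most_common(3))
```

A dedicated NLP library would replace the naive regular-expression tokenizer with proper German tokenization and lemmatization.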

Installation

GermaParlPy is available on PyPI:

pip install germaparlpy

API Reference

Click here for the full API Reference.

XML Structure

Click here to learn more about the XML Structure of the underlying corpus GermaParlTEI[^1].
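To give a rough idea of what working with TEI-style XML involves, here is a minimal sketch that reads speeches from a small, hand-written fragment using only the Python standard library. The element and attribute names (`sp`, `p`, `who`, `party`), the sample content, and the `extract_speeches` helper are simplified assumptions for illustration; consult the XML structure documentation for the actual schema, and note that this is not how GermaParlPy itself is implemented.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified TEI-like fragment (assumption): the real corpus
# files are larger and may use different elements and attributes.
SAMPLE = """<TEI>
  <text>
    <body>
      <div>
        <sp who="Speaker A" party="X">
          <p>Erster Absatz.</p>
          <p>Zweiter Absatz.</p>
        </sp>
        <sp who="Speaker B" party="Y">
          <p>Eine Antwort.</p>
        </sp>
      </div>
    </body>
  </text>
</TEI>"""


def extract_speeches(xml_text):
    """Return one dict per <sp> element with speaker metadata and joined text."""
    root = ET.fromstring(xml_text)
    speeches = []
    for sp in root.iter("sp"):
        # Join the paragraphs of one speech into a single string.
        paragraphs = [(p.text or "").strip() for p in sp.iter("p")]
        speeches.append(
            {
                "who": sp.get("who"),
                "party": sp.get("party"),
                "text": " ".join(paragraphs),
            }
        )
    return speeches


speeches = extract_speeches(SAMPLE)
for speech in speeches:
    print(speech["who"], "|", speech["party"], "|", speech["text"])
```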

Tutorials

I have prepared three example scripts that showcase the use of GermaParlPy and some potential use cases. You can find the scripts in the /example directory or here.

Contributing

Contributions and feedback are welcome! Feel free to open an issue or a pull request.

License

The code is licensed under the MIT License.

The GermaParl corpus, which is not part of this repository, is licensed under a CLARIN PUB+BY+NC+SA license.

Credits

Developed by Marlon-Benedikt George.

The underlying data set, the GermaParl corpus, was compiled and released by Blätte & Leonhardt (2024)[^1]. See also their R library polmineR, developed in the context of the PolMine Project, which served as an inspiration for this library.

[^1]: Blaette, A., and C. Leonhardt. GermaParl Corpus of Plenary Protocols. v2.2.0-rc1, Zenodo, 22 July 2024, doi:10.5281/zenodo.12795193
