
A small package for preprocessing German text


Preprocessing

Install: The project uses pipenv to manage dependencies. You can install all requirements, open a shell in the virtual environment, and download the German spaCy model with the following commands:

$ pipenv install
$ pipenv shell
$ pipenv run python -m spacy download de
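
To check that the installation worked, a small script along these lines can be run inside the pipenv environment. This is only a sketch: the helper `missing_modules` is a hypothetical name, and the module name `de_core_news_sm` is an assumption about which model package the `de` shortcut installs, so adjust it if your setup differs.

```python
import importlib.util

# Assumption: "spacy" is the library and "de_core_news_sm" is the model package
# that the "de" shortcut resolves to; change these names if your setup differs.
REQUIRED_MODULES = ["spacy", "de_core_news_sm"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

The check only looks modules up and does not import them, so it is fast and has no side effects.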

Still ToDo:

  • edit the stopword list
  • edit the tag list
  • maybe extend the custom lemmatization JSON file (a lot of work for little gain?)
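
To illustrate how a stopword list and a custom lemmatization JSON file might interact, here is a minimal sketch using only the standard library. It is not this package's actual API: the function `preprocess`, the sample stopwords, and the JSON entries are all hypothetical and exist only for illustration.

```python
import json

# Hypothetical data for illustration only; not the package's real lists.
CUSTOM_LEMMAS_JSON = '{"ging": "gehen", "Häuser": "Haus"}'
STOPWORDS = {"der", "die", "das", "und", "in"}


def preprocess(tokens, custom_lemmas, stopwords):
    """Drop stopwords, then map each remaining token through the custom lemma table."""
    result = []
    for token in tokens:
        if token.lower() in stopwords:
            continue  # stopwords are compared case-insensitively
        # Fall back to the token itself when no custom lemma is defined.
        result.append(custom_lemmas.get(token, token))
    return result


custom_lemmas = json.loads(CUSTOM_LEMMAS_JSON)
print(preprocess(["Die", "Häuser", "und", "der", "Garten"], custom_lemmas, STOPWORDS))
# prints ['Haus', 'Garten']
```

In this sketch, extending the JSON file only changes tokens that would otherwise pass through unchanged, which is one way to judge whether the extra work is worthwhile.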

This project uses the spaCy-IWNLP lemmatizations:

@InProceedings{liebeck-conrad:2015:ACL-IJCNLP,
  author    = {Liebeck, Matthias  and  Conrad, Stefan},
  title     = {{IWNLP: Inverse Wiktionary for Natural Language Processing}},
  booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
  year      = {2015},
  publisher = {Association for Computational Linguistics},
  pages     = {414--418},
  url       = {http://www.aclweb.org/anthology/P15-2068}
}


Files for spacy-german-preprocess, version 0.0.2
Filename                                        Size    File type  Python version
spacy_german_preprocess-0.0.2-py3-none-any.whl  4.5 MB  Wheel      py3
spacy_german_preprocess-0.0.2.tar.gz            4.1 MB  Source     None
