A small package for preprocessing German text.

Project description

Preprocessing

Install: The project uses pipenv to manage dependencies. The following commands install all requirements, open a shell in the virtual environment, and download the German spaCy model:

$ pipenv install
$ pipenv shell
$ pipenv run python -m spacy download de

Still to do:

  • Revise the stopword list
  • Revise the tag list
  • Possibly extend the custom lemmatization JSON file (a lot of work for little gain?)
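
To make the role of the stopword list and the custom lemmatization file more concrete, here is a minimal sketch in plain Python. It is not this package's API: the JSON layout (lowercased word form mapped to lemma), the tiny stopword set, and the function name `preprocess` are all assumptions made for illustration.

```python
import json

# Hypothetical override table: the real file's name and layout are not
# documented here, so a flat "word form -> lemma" mapping is assumed.
CUSTOM_LEMMAS_JSON = '{"häuser": "Haus", "ging": "gehen", "bäume": "Baum"}'

# Tiny illustrative stopword set, not the package's actual list.
STOPWORDS = {"der", "die", "das", "und", "in", "den"}


def preprocess(tokens, custom_lemmas, stopwords):
    """Drop stopwords, then swap each remaining token for its custom lemma if one exists."""
    result = []
    for token in tokens:
        key = token.lower()  # look up case-insensitively
        if key in stopwords:
            continue
        # Fall back to the unchanged token when no override is defined.
        result.append(custom_lemmas.get(key, token))
    return result


custom_lemmas = json.loads(CUSTOM_LEMMAS_JSON)
tokens = "Die Häuser und die Bäume".split()
print(preprocess(tokens, custom_lemmas, STOPWORDS))  # ['Haus', 'Baum']
```

In a sketch like this, every entry added to the JSON file only affects the exact word forms it lists, which is one reason extending such a file by hand can be a lot of work for a small improvement.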

This project uses the spaCy-IWNLP lemmatizations:

@InProceedings{liebeck-conrad:2015:ACL-IJCNLP,
  author    = {Liebeck, Matthias  and  Conrad, Stefan},
  title     = {{IWNLP: Inverse Wiktionary for Natural Language Processing}},
  booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
  year      = {2015},
  publisher = {Association for Computational Linguistics},
  pages     = {414--418},
  url       = {http://www.aclweb.org/anthology/P15-2068}
}
