A small package for preprocessing German text
Preprocessing
Install: The project uses pipenv to manage dependencies. You can install all requirements with the following command:
$ pipenv install
$ pipenv shell
$ pipenv run python -m spacy download de
Still to do:
- edit the stopword list
- edit the tag list
- maybe extend the custom lemmatization JSON file (a lot of work for little gain?)
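The to-do items above refer to a stopword list, a tag list, and a custom lemmatization JSON file. As a rough illustration of how three such resources could work together, here is a minimal pure-Python sketch. It is not the package's actual API: the names `STOPWORDS`, `KEEP_TAGS`, `CUSTOM_LEMMAS`, and `preprocess` are hypothetical, the data is made up, and the input is assumed to be already tokenized and tagged (in the real project, spaCy would supply tokens and tags).

```python
import json

# Hypothetical example data; the package's real lists and JSON file
# are not shown in this README.
STOPWORDS = {"der", "die", "das", "und", "ist"}
KEEP_TAGS = {"NOUN", "VERB", "ADJ"}
CUSTOM_LEMMAS = json.loads('{"häuser": "haus", "ging": "gehen"}')


def preprocess(tagged_tokens):
    """Drop stopwords and unwanted tags, then apply custom lemma overrides.

    `tagged_tokens` is an iterable of (token, tag) pairs.
    """
    result = []
    for token, tag in tagged_tokens:
        lowered = token.lower()
        # Skip stopwords and tokens whose tag is not in the keep list.
        if lowered in STOPWORDS or tag not in KEEP_TAGS:
            continue
        # Fall back to the lowercased token when no override exists.
        result.append(CUSTOM_LEMMAS.get(lowered, lowered))
    return result
```

Under these assumptions, extending the JSON file only changes the output for tokens that appear in it, which may be one way to judge whether the extra work is worthwhile.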
This project uses the spaCy-IWNLP lemmatizations:
@InProceedings{liebeck-conrad:2015:ACL-IJCNLP,
author = {Liebeck, Matthias and Conrad, Stefan},
title = {{IWNLP: Inverse Wiktionary for Natural Language Processing}},
booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
year = {2015},
publisher = {Association for Computational Linguistics},
pages = {414--418},
url = {http://www.aclweb.org/anthology/P15-2068}
}
File details
Details for the file spacy_german_preprocess-0.0.2.tar.gz.
File metadata
- Download URL: spacy_german_preprocess-0.0.2.tar.gz
- Upload date:
- Size: 4.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/40.6.2 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.7.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 32daa9fe58466a1437954089f4ca2c540aaa075a0ef2d6aa21408770828232e1 |
| MD5 | 350f835c5a76ea41d3aede6a0bfc66eb |
| BLAKE2b-256 | c1f27b7edb9c429e6bb5737ac12b2ecc2f29703973772f871232410e550dc899 |
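The digests listed for a file can be checked against a downloaded copy. The following is a minimal sketch using Python's standard `hashlib` module; the helper name `file_digests` is hypothetical, and the path you pass is wherever you saved the download.

```python
import hashlib


def file_digests(path, chunk_size=65536):
    """Return SHA256, MD5 and BLAKE2b-256 hex digests of the file at `path`."""
    sha256 = hashlib.sha256()
    md5 = hashlib.md5()
    # BLAKE2b-256 is BLAKE2b with a 32-byte (256-bit) digest.
    blake2b_256 = hashlib.blake2b(digest_size=32)
    with open(path, "rb") as f:
        # Read in chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha256.update(chunk)
            md5.update(chunk)
            blake2b_256.update(chunk)
    return {
        "SHA256": sha256.hexdigest(),
        "MD5": md5.hexdigest(),
        "BLAKE2b-256": blake2b_256.hexdigest(),
    }
```

Compare the returned values with the table for the file you downloaded; any mismatch means the file is not identical to the one that was uploaded.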
File details
Details for the file spacy_german_preprocess-0.0.2-py3-none-any.whl.
File metadata
- Download URL: spacy_german_preprocess-0.0.2-py3-none-any.whl
- Upload date:
- Size: 4.5 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.1 setuptools/40.6.2 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.7.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d89b13638e18d138c66ba0623fda5a8ab040391ce5eabbc7e23347e669b05a13 |
| MD5 | a79e62844ea86e14d75f424c67f43ebb |
| BLAKE2b-256 | ea78a0e4334a576f2cf8da816d60bc3977838ce672ea65ab4b90a4248efe6c97 |