Figurenerkennung (character recognition) for German literary texts.
An important step in the quantitative analysis of narrative texts is the automatic recognition of references to characters (figures), a special case of the generic NLP problem of named entity recognition (NER).
NER models are usually not designed for literary texts, which results in poor recall. This easy-to-use package continues the work of Jannidis et al. and uses state-of-the-art deep learning techniques, reaching a micro F1 score of 95.89 and a macro F1 score of 67.74.
Figurenerkennung statistics
These statistics are based on the test set of 55,868 tokens.
Tag | TP | FP | FN | TN |
---|---|---|---|---|
AppA | 2 | 2 | 288 | 2 |
AppTdfW | 418 | 131 | 383 | 418 |
Core | 626 | 114 | 211 | 626 |
pron | 2284 | 604 | 766 | 2284 |
_ | 52538 | 1546 | 749 | 52538 |
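The two F1 scores quoted in the description can be reproduced from the TP, FP and FN columns of this table. The sketch below is an illustration, not the package's own evaluation code: the helper names (`COUNTS`, `f1`, `micro_f1`, `macro_f1`) are hypothetical, and the macro variant used here (the F1 of the macro-averaged precision and recall) is an assumption, chosen because it matches the reported figure.

```python
# Confusion counts copied from the table above: tag -> (TP, FP, FN).
COUNTS = {
    "AppA": (2, 2, 288),
    "AppTdfW": (418, 131, 383),
    "Core": (626, 114, 211),
    "pron": (2284, 604, 766),
    "_": (52538, 1546, 749),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def micro_f1(counts):
    """Pool TP, FP and FN over all tags, then compute a single F1."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp / (tp + fp), tp / (tp + fn))


def macro_f1(counts):
    """F1 of the unweighted mean precision and mean recall over all tags.

    Assumption: the source does not state which macro variant was used.
    """
    precisions = [tp / (tp + fp) for tp, fp, _ in counts.values()]
    recalls = [tp / (tp + fn) for tp, _, fn in counts.values()]
    mean_precision = sum(precisions) / len(precisions)
    mean_recall = sum(recalls) / len(recalls)
    return f1(mean_precision, mean_recall)


micro = round(100 * micro_f1(COUNTS), 2)
macro = round(100 * macro_f1(COUNTS), 2)
print(micro, macro)  # prints: 95.89 67.74
```

The gap between the two scores comes mostly from the rare AppA tag: it barely affects the pooled micro score but pulls the macro average down. Averaging the per-tag F1 scores instead, which is the more common macro definition, gives a lower value for the same counts.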
Installation
$ pip install figur
Example
>>> import figur
>>> text = "Der Gärtner entfernte sich eilig, und Eduard folgte bald."
>>> figur.tag(text)
    SentenceId      Token      Tag
0            0        Der        _
1            0    Gärtner  AppTdfW
2            0  entfernte        _
3            0       sich     pron
4            0      eilig        _
5            0          ,        _
6            0        und        _
7            0     Eduard     Core
8            0     folgte        _
9            0       bald        _
10           0          .        _
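For downstream analysis it is often convenient to keep only the tokens that were tagged as character references. The following sketch works on plain `(sentence_id, token, tag)` rows copied from the output above, so it runs without the package installed. The helper `mentions_by_tag` is hypothetical and not part of `figur`. The return type of `figur.tag` is not documented here; if it is a pandas DataFrame, as the printed output suggests, equivalent rows could be obtained with `itertuples(index=False)`.

```python
from collections import defaultdict

# Rows copied from the example output above: (sentence_id, token, tag).
ROWS = [
    (0, "Der", "_"),
    (0, "Gärtner", "AppTdfW"),
    (0, "entfernte", "_"),
    (0, "sich", "pron"),
    (0, "eilig", "_"),
    (0, ",", "_"),
    (0, "und", "_"),
    (0, "Eduard", "Core"),
    (0, "folgte", "_"),
    (0, "bald", "_"),
    (0, ".", "_"),
]


def mentions_by_tag(rows, outside="_"):
    """Group tokens by tag, skipping tokens that carry the 'outside' tag."""
    grouped = defaultdict(list)
    for _sentence_id, token, tag in rows:
        if tag != outside:
            grouped[tag].append(token)
    return dict(grouped)


mentions = mentions_by_tag(ROWS)
print(mentions)
# {'AppTdfW': ['Gärtner'], 'pron': ['sich'], 'Core': ['Eduard']}
```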