Figurenerkennung (recognition of literary figures) for German literary texts

An important step in the quantitative analysis of narrative texts is the automatic recognition of references to figures, a special case of the generic NLP problem of Named Entity Recognition (NER).

NER models are usually not designed for literary texts, which results in poor recall. This easy-to-use package continues the work of Jannidis et al. and uses state-of-the-art deep learning techniques, reaching a micro F1 score of 95.89 and a macro F1 score of 67.74.

Figurenerkennung statistics

This is based on the test set of 55,868 tokens.

Tag          TP    FP   FN     TN
AppA          2     2  288      2
AppTdfW     418   131  383    418
Core        626   114  211    626
pron       2284   604  766   2284
_         52538  1546  749  52538
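
The two scores quoted earlier can be reproduced from the TP, FP and FN columns. The sketch below is not taken from the package; it assumes that the macro score is the harmonic mean of the tag-averaged precision and the tag-averaged recall. This is an inference from the numbers rather than something the project states: averaging the per-tag F1 scores instead would give a lower value, about 63.5.

```python
# Per-tag counts copied from the statistics table: (TP, FP, FN).
counts = {
    "AppA": (2, 2, 288),
    "AppTdfW": (418, 131, 383),
    "Core": (626, 114, 211),
    "pron": (2284, 604, 766),
    "_": (52538, 1546, 749),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Micro average: pool the counts over all tags, then compute one score.
tp = sum(c[0] for c in counts.values())
fp = sum(c[1] for c in counts.values())
fn = sum(c[2] for c in counts.values())
micro_f1 = f1(tp / (tp + fp), tp / (tp + fn))

# Macro average (assumed definition): average precision and recall
# over the tags first, then take their harmonic mean.
precisions = [t / (t + p) for t, p, _ in counts.values()]
recalls = [t / (t + n) for t, _, n in counts.values()]
macro_f1 = f1(sum(precisions) / len(precisions), sum(recalls) / len(recalls))

print(f"micro F1: {100 * micro_f1:.2f}")  # micro F1: 95.89
print(f"macro F1: {100 * macro_f1:.2f}")  # macro F1: 67.74
```

The gap between the two scores comes mostly from the rare AppA tag, whose recall is 2 out of 290 and which counts as much as any other tag in the macro average.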

Installation

$ pip install figur

Example

>>> import figur
>>> text = "Der Gärtner entfernte sich eilig, und Eduard folgte bald."
>>> figur.tag(text)
    SentenceId      Token      Tag
0            0        Der        _
1            0    Gärtner  AppTdfW
2            0  entfernte        _
3            0       sich     pron
4            0      eilig        _
5            0          ,        _
6            0        und        _
7            0     Eduard     Core
8            0     folgte        _
9            0       bald        _
10           0          .        _
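
A typical next step is to keep only the tokens that were tagged as figure references. The sketch below works on rows copied by hand from the output above, so it runs without the package installed. The helper `group_references` is hypothetical and not part of figur, and it assumes that `_` marks tokens that are not figure references, as the example output suggests.

```python
from collections import defaultdict

# Rows copied from the example output: (SentenceId, Token, Tag).
rows = [
    (0, "Der", "_"),
    (0, "Gärtner", "AppTdfW"),
    (0, "entfernte", "_"),
    (0, "sich", "pron"),
    (0, "eilig", "_"),
    (0, ",", "_"),
    (0, "und", "_"),
    (0, "Eduard", "Core"),
    (0, "folgte", "_"),
    (0, "bald", "_"),
    (0, ".", "_"),
]


def group_references(rows, outside_tag="_"):
    """Collect the tokens of every tag except the outside tag."""
    grouped = defaultdict(list)
    for _sentence_id, token, tag in rows:
        if tag != outside_tag:
            grouped[tag].append(token)
    return dict(grouped)


references = group_references(rows)
print(references)
# {'AppTdfW': ['Gärtner'], 'pron': ['sich'], 'Core': ['Eduard']}
```

The printed layout of `figur.tag(text)` looks like a pandas DataFrame. If that is the case (the project description does not confirm it), the same rows could be obtained with `itertuples(index=False)` on the result.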

Download files

Source distribution: figur-0.0.2.tar.gz (4.0 kB)
Built distribution: figur-0.0.2-py3-none-any.whl (5.1 kB, Python 3)
