A package for German Open Information Extraction

Project description

turCy

An Open Information Extraction System mainly designed for German.

Installation

pip install turcy

turCy can be applied to other languages as well, but some extra work is necessary because no patterns for English are shipped: you would have to build your own patterns first. A `pattern_builder` module is available for this purpose.

How it works

1. Building a Pattern

2. Extraction

Load the German Language Model from spaCy.
Add turCy to the nlp-Pipeline.
Pass the document to the pipeline.
Iterate over the sentences in the document and access the triples in each sentence.

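import spacy
import turcy


def example():
    # Load the German language model without the NER component.
    nlp = spacy.load("de_core_news_lg", exclude=["ner"])
    # Raise spaCy's character limit so that long documents can be processed.
    nlp.max_length = 2096700
    turcy.add_to_pipe(nlp)  # apply/use current patterns in list
    pipeline_params = {"attach_triple2sentence": {"pattern_list": "small"}}
    doc = nlp("Nürnberg ist eine Stadt in Deutschland.", component_cfg=pipeline_params)
    # Each sentence carries its extracted triples in the custom attribute `_.triples`.
    for sent in doc.sents:
        print(sent)
        for triple in sent._.triples:
            (subj, pred, obj) = triple["triple"]
            print(f"subject: '{subj}', predicate: '{pred}', object: '{obj}'")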
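For further processing it can be handy to collect the triples as flat rows instead of printing them. The following is a minimal sketch, not part of turCy: `triples_to_rows` is a hypothetical helper, and it only assumes what the example above shows, namely that each entry in `sent._.triples` is a mapping whose `"triple"` key holds a (subject, predicate, object) tuple.

```python
from typing import Any, Iterable, List, Mapping, Sequence, Tuple

Row = Tuple[str, str, str, str]


def triples_to_rows(
    sentences: Iterable[Tuple[str, Sequence[Mapping[str, Any]]]]
) -> List[Row]:
    """Flatten (sentence_text, triples) pairs into (sentence, subject, predicate, object) rows.

    Hypothetical helper: assumes every triple is a mapping with a "triple" key
    that holds a 3-tuple, as in the extraction example.
    """
    rows: List[Row] = []
    for sentence_text, triples in sentences:
        for triple in triples:
            subj, pred, obj = triple["triple"]
            # str() lets spaCy tokens/spans and plain strings be handled alike.
            rows.append((sentence_text, str(subj), str(pred), str(obj)))
    return rows
```

With a processed document this could be called as `triples_to_rows((sent.text, sent._.triples) for sent in doc.sents)`. Sentences without any triple simply contribute no rows.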

3. Results

References

Release history

0.0.42 (Mar 11, 2023), this version
0.0.35 (Jun 30, 2021)
0.0.34 (Jun 30, 2021)
0.0.33 (May 31, 2021)
0.0.32 (May 31, 2021)
0.0.31 (Apr 15, 2021)
0.0.28 (Apr 15, 2021)
0.0.27 (Apr 15, 2021)
0.0.25 (Apr 15, 2021)
0.0.24 (Apr 15, 2021)
0.0.23 (Apr 15, 2021)
0.0.14 (Jan 13, 2021)
0.0.13 (Jan 13, 2021)
0.0.12 (Jan 13, 2021)
0.0.1 (Jan 13, 2021)

Download files

Source Distribution

turcy-0.0.42.tar.gz (513.7 kB)

Uploaded Mar 11, 2023 Source

Built Distribution

turcy-0.0.42-py3-none-any.whl (1.2 MB)

Uploaded Mar 11, 2023 Python 3

Hashes for turcy-0.0.42.tar.gz
Algorithm	Hash digest
SHA256	`d37115f2b5c0f7777f8d36f92d8f98268e082690669bc268c6e6403dbda67062`
MD5	`f11b6de5aea0541e5b077c0757a3a4da`
BLAKE2b-256	`150fb9de25302b4c769d231092102a2bd8e16384564eda74f52a5a4ff0706798`

Hashes for turcy-0.0.42-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9d2e07509732881b694ba972813f87ebe371b80051b7d9e52f2e1f2e371f9c19`
MD5	`933760fa8e92cbb1afd5586b7fcddb58`
BLAKE2b-256	`0d710f2412a10191908466f8a7211e21448ebac1e68b2b6a3529dec3df39d0a7`
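
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical and not part of turCy or pip.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("turcy-0.0.42.tar.gz", "d37115f2b5c0f7777f8d36f92d8f98268e082690669bc268c6e6403dbda67062")` should return `True` for an intact copy of the source distribution, assuming the file sits in the current directory.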