A package for German Open Information Extraction

Project description

turCy

An Open Information Extraction System mainly designed for German.

Installation

pip install turcy

turCy can be applied to other languages as well; however, some extra work is necessary because no patterns for English are shipped, so you would have to build your own patterns first. For building patterns, a `pattern_builder` module is available.
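
The extraction example in this description loads spaCy's `de_core_news_lg` model, which `pip install turcy` may not install for you (with spaCy it is usually fetched via `python -m spacy download de_core_news_lg`). The following is a minimal sketch of a pre-flight check, assuming the model is installed as a regular importable package, as spaCy models normally are. The helper name `missing_packages` is hypothetical and not part of turCy or spaCy.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found as importable packages."""
    # find_spec returns None for a top-level package that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages the extraction example relies on (assumption: the spaCy model
# was installed as a package named exactly like the model).
required = ["spacy", "turcy", "de_core_news_lg"]
missing = missing_packages(required)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Running this before the extraction example gives a clearer message than the error raised by `spacy.load` when the model is absent.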

How it works

img_3.png

1. Building a Pattern

img_2.png

img_1.png

2. Extraction

  1. Load the German Language Model from spaCy.
  2. Add turCy to the nlp-Pipeline.
  3. Pass the document to the pipeline.
  4. Iterate over the sentences in the document and access the triples in each sentence.
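
import spacy
import turcy


def example():
    # Load the German model without the NER component.
    nlp = spacy.load("de_core_news_lg", exclude=["ner"])
    nlp.max_length = 2096700  # raise spaCy's default limit on text length (in characters)
    turcy.add_to_pipe(nlp)  # apply/use current patterns in list
    # Per-component settings: select the "small" pattern list.
    pipeline_params = {"attach_triple2sentence": {"pattern_list": "small"}}
    doc = nlp("Nürnberg ist eine Stadt in Deutschland.", component_cfg=pipeline_params)
    for sent in doc.sents:
        print(sent)
        # Each entry stores the extracted triple under the "triple" key.
        for triple in sent._.triples:
            subj, pred, obj = triple["triple"]
            print(f"subject:'{subj}', predicate:'{pred}' and object: '{obj}'")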
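
Once the triples are available, they can be collected for further processing. The sketch below groups them by subject; it relies only on the shape used in the example above (each entry holds a three-part value under the `"triple"` key). The function `group_by_subject` is a hypothetical helper, and the sample list is a hand-written stand-in for `sent._.triples`, not actual turCy output.

```python
from collections import defaultdict


def group_by_subject(triples):
    """Map each subject string to its list of (predicate, object) string pairs."""
    grouped = defaultdict(list)
    for entry in triples:
        subj, pred, obj = entry["triple"]
        # str() turns spaCy spans or tokens into plain text; strings pass through unchanged.
        grouped[str(subj)].append((str(pred), str(obj)))
    return dict(grouped)


# Hand-written stand-in for `sent._.triples` (illustrative only).
sample = [
    {"triple": ("Nürnberg", "ist", "eine Stadt")},
    {"triple": ("Nürnberg", "liegt in", "Deutschland")},
]
grouped = group_by_subject(sample)
print(grouped)
```

In a real pipeline, `sample` would be replaced by the triples gathered from every sentence in `doc.sents`.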

3. Results

img_5.png

img_6.png
