relatio

A Python package to extract underlying narrative statements from text.

What can this package do?

  1. Identify Agent-Verb-Patient (AVP) / Subject-Verb-Object (SVO) triplets in the text

    • AVPs are obtained via Semantic Role Labeling.
    • SVOs are obtained via Dependency Parsing.
    • A concrete example of AVP/SVO extraction:

    Original sentence: "Taxes kill jobs and hinder innovation."

    Triplets: [('taxes', 'kill', 'jobs'), ('taxes', 'hinder', 'innovation')]
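To make the dependency-parsing route more concrete, here is a minimal sketch of how SVO triplets can be read off a dependency parse. It does not use relatio or spaCy: the parse of the example sentence is written by hand, the `Token` class and the `extract_svos` helper are hypothetical names, and the spaCy-style labels (`nsubj`, `dobj`, `conj`) are an assumption about what a parser would return.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    text: str  # lower-cased surface form
    dep: str   # dependency label (assumed, spaCy-style)
    head: int  # index of the syntactic head; the root points to itself


# Hand-written parse of "Taxes kill jobs and hinder innovation."
# This is an assumption for illustration, not real parser output.
PARSE = [
    Token("taxes", "nsubj", 1),
    Token("kill", "ROOT", 1),
    Token("jobs", "dobj", 1),
    Token("and", "cc", 1),
    Token("hinder", "conj", 1),
    Token("innovation", "dobj", 4),
    Token(".", "punct", 1),
]


def children(tokens: List[Token], head: int, dep: str) -> List[str]:
    """Texts of the tokens attached to `head` with dependency label `dep`."""
    return [
        t.text
        for i, t in enumerate(tokens)
        if t.head == head and t.dep == dep and i != head
    ]


def extract_svos(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Collect (subject, verb, object) triplets from a parsed sentence."""
    triplets = []
    for i, tok in enumerate(tokens):
        # Treat the root and anything coordinated with it as a candidate verb.
        if tok.dep not in ("ROOT", "conj"):
            continue
        subjects = children(tokens, i, "nsubj")
        if not subjects and tok.dep == "conj":
            # A coordinated verb without its own subject borrows the
            # subject of the verb it is conjoined to.
            subjects = children(tokens, tok.head, "nsubj")
        for subj in subjects:
            for obj in children(tokens, i, "dobj"):
                triplets.append((subj, tok.text, obj))
    return triplets


triplets = extract_svos(PARSE)
print(triplets)
```

The sketch only handles active sentences with direct objects; passives, coordinated nouns, and prepositional objects would need extra rules.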

  2. Group agents and patients into interpretable entities in two ways:

    • Supervised classification of entities. Simply provide a list of entities and we will filter the triplets for you (e.g., ['Barack Obama', 'government', ...]).
    • Unsupervised classification via clustering of entities. We represent agents and patients as text embeddings and cluster them via KMeans or HDBSCAN. The number of clusters is chosen from the data.
    • A concrete example of a cluster:

    Interpretable entity: "tax"
    Related phrases: ['income tax', 'the tax rates', 'taxation in this country', ...]
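The supervised option above can be pictured as a simple filter over triplets. The following is a minimal sketch under stated assumptions, not relatio's implementation: `match_entity` and `filter_triplets` are hypothetical helpers, matching is done on whole words and is case-insensitive, and a triplet is kept when at least one of its roles mentions a listed entity. The package's actual matching rules may differ.

```python
import re
from typing import Iterable, List, Optional, Tuple

Triplet = Tuple[str, str, str]


def match_entity(phrase: str, entities: Iterable[str]) -> Optional[str]:
    """Return the first listed entity that occurs as a whole word or phrase."""
    for entity in entities:
        pattern = r"\b" + re.escape(entity.lower()) + r"\b"
        if re.search(pattern, phrase.lower()):
            return entity
    return None


def filter_triplets(triplets: List[Triplet], entities: List[str]) -> List[Triplet]:
    """Keep triplets that mention a listed entity (assumed rule: agent OR patient)."""
    kept = []
    for agent, verb, patient in triplets:
        agent_entity = match_entity(agent, entities)
        patient_entity = match_entity(patient, entities)
        if agent_entity is None and patient_entity is None:
            continue  # neither role mentions a known entity
        # Replace matched roles by the entity label, leave the others untouched.
        kept.append((agent_entity or agent, verb, patient_entity or patient))
    return kept


# Toy input, made up for illustration.
raw = [
    ("the government", "raises", "income tax"),
    ("Barack Obama", "signed", "the bill"),
    ("they", "like", "it"),
]
filtered = filter_triplets(raw, ["Barack Obama", "government", "tax"])
print(filtered)
```

Mapping every matched phrase to its entity label is what makes the resulting narratives comparable: 'income tax' and 'the tax rates' would both end up under 'tax'.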

  3. Visualize clusters and resulting narratives.

We currently support French and English out of the box. You can also provide a custom SVO-extraction function for any language supported by spaCy.

Installation

relatio runs on Linux and macOS (x86 platforms) and requires Python 3.7 or 3.8 and pip.
We highly recommend installing it in a virtual environment (or conda environment).

# upgrade pip, wheel and setuptools
python -m pip install -U pip wheel setuptools

# install the package
python -m pip install -U relatio

If you want to use Jupyter, make sure it is installed in the current environment.
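Before installing, it can help to confirm that the current interpreter matches the requirements stated above. The check below is a small sketch using only the standard library; `is_supported` is a hypothetical helper, and the set of x86 machine names is an assumption about what `platform.machine()` reports on such systems.

```python
import platform
import sys

SUPPORTED_SYSTEMS = {"Linux", "Darwin"}  # platform.system() reports macOS as "Darwin"
SUPPORTED_PYTHONS = {(3, 7), (3, 8)}
X86_MACHINES = {"x86_64", "i386", "i686"}  # assumed spellings of x86 architectures


def is_supported(system: str, machine: str, python_version) -> bool:
    """True if the OS, architecture, and Python version match the stated requirements."""
    return (
        system in SUPPORTED_SYSTEMS
        and machine in X86_MACHINES
        and tuple(python_version[:2]) in SUPPORTED_PYTHONS
    )


ok = is_supported(platform.system(), platform.machine(), sys.version_info)
print("Environment matches the stated requirements:", ok)
```

Run it with the same interpreter you intend to use for `python -m pip install -U relatio`, ideally from inside the activated virtual environment.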

Quickstart

Please see our hands-on tutorials:

Team

relatio is brought to you by

with special thanks to ETH Scientific IT Services for their support.

If you are interested in contributing to the project, please read the Development Guide.

Disclaimer

Remember that this is a research tool :)

Download files

Source Distribution: relatio-0.3.0.tar.gz (27.0 kB)
Built Distribution: relatio-0.3.0-py3-none-any.whl (28.7 kB, Python 3)
