A Python package to extract narrative statements from text

relatio

A Python package to extract underlying narrative statements from text.

What can this package do?

  1. Identify Agent-Verb-Patient (AVP) / Subject-Verb-Object (SVO) triplets in the text

    • AVPs are obtained via Semantic Role Labeling.
    • SVOs are obtained via Dependency Parsing.
    • A concrete example of AVP/SVO extraction:

    Original sentence: "Taxes kill jobs and hinder innovation."

    Triplets: [('taxes', 'kill', 'jobs'), ('taxes', 'hinder', 'innovation')]

  2. Group agents and patients into interpretable entities in two ways:

    • Supervised classification of entities. Simply provide a list of entities and we will filter the triplets for you (e.g., ['Barack Obama', 'government', ...]).
    • Unsupervised classification via clustering of entities. We represent agents and patients as text embeddings and cluster them via KMeans or HDBSCAN. The optimal number of topics is data-driven.
    • A concrete example of a cluster:

    Interpretable entity: "tax"
    Related phrases: ['income tax', 'the tax rates', 'taxation in this country', etc.]

  3. Visualize clusters and resulting narratives.
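
The supervised option in step 2 can be pictured with a small plain-Python sketch. It does not use relatio's API: `match_entity` and `filter_triplets` are hypothetical helpers, the case-insensitive substring match is an assumption about how a phrase is tied to an entity, and the third triplet is made-up input for illustration.

```python
from typing import List, Optional, Tuple

Triplet = Tuple[str, str, str]  # (agent, verb, patient)


def match_entity(phrase: str, entities: List[str]) -> Optional[str]:
    """Return the first known entity mentioned in `phrase`, or None."""
    lowered = phrase.lower()
    for entity in entities:
        # Assumption: a case-insensitive substring match counts as a mention.
        if entity.lower() in lowered:
            return entity
    return None


def filter_triplets(triplets: List[Triplet], entities: List[str]) -> List[Triplet]:
    """Keep triplets whose agent or patient mentions a known entity.

    A matched role is replaced by the entity name as given in `entities`.
    """
    kept = []
    for agent, verb, patient in triplets:
        agent_entity = match_entity(agent, entities)
        patient_entity = match_entity(patient, entities)
        if agent_entity is None and patient_entity is None:
            continue  # neither role mentions a known entity
        kept.append((agent_entity or agent, verb, patient_entity or patient))
    return kept


triplets = [
    ("taxes", "kill", "jobs"),
    ("taxes", "hinder", "innovation"),
    ("the government", "raise", "the tax rates"),  # made-up example input
]
filtered = filter_triplets(triplets, ["Barack Obama", "government"])
print(filtered)  # [('government', 'raise', 'the tax rates')]
```

Substring matching is deliberately naive: a short entity such as "tax" would also match "taxes", which may or may not be what you want.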

We currently support French and English out-of-the-box. You can also provide us with a custom SVO-extraction function for any language supported by spaCy.
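
To give a feel for what an SVO-extraction function does, here is a minimal sketch that walks a dependency parse of the example sentence. The signature relatio expects for a custom function is not described here, so `Token` and `extract_svos` are hypothetical names. The parse is written by hand and only mimics spaCy-style English labels (`nsubj`, `dobj`, `conj`, `ROOT`), so the example runs without spaCy or a language model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    text: str
    pos: str   # coarse part of speech, e.g. "VERB", "NOUN"
    dep: str   # dependency label, e.g. "nsubj", "dobj", "conj", "ROOT"
    head: int  # index of the head token (the root points to itself)


def extract_svos(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Extract (subject, verb, object) triplets from one parsed sentence."""

    def children(i: int, dep: str) -> List[Token]:
        return [t for t in tokens if t.head == i and t.dep == dep]

    triplets = []
    for i, tok in enumerate(tokens):
        if tok.pos != "VERB" or tok.dep not in ("ROOT", "conj"):
            continue
        subjects = children(i, "nsubj")
        # A conjoined verb without its own subject inherits the subject
        # of the verb it is attached to ("kill ... and hinder ...").
        if not subjects and tok.dep == "conj":
            subjects = children(tok.head, "nsubj")
        for subj in subjects:
            for obj in children(i, "dobj"):
                triplets.append((subj.text.lower(), tok.text.lower(), obj.text.lower()))
    return triplets


# Hand-written parse of "Taxes kill jobs and hinder innovation."
sentence = [
    Token("Taxes", "NOUN", "nsubj", 1),
    Token("kill", "VERB", "ROOT", 1),
    Token("jobs", "NOUN", "dobj", 1),
    Token("and", "CCONJ", "cc", 1),
    Token("hinder", "VERB", "conj", 1),
    Token("innovation", "NOUN", "dobj", 4),
    Token(".", "PUNCT", "punct", 1),
]
svos = extract_svos(sentence)
print(svos)  # [('taxes', 'kill', 'jobs'), ('taxes', 'hinder', 'innovation')]
```

With a real parser, the same traversal would read each token's dependency label and head from the parsed document. Label inventories differ between languages and models, so the label names above would need to be adapted.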

Installation

relatio runs on Linux and macOS (x86 platforms) and requires Python 3.7 or 3.8 and pip.
It is highly recommended to use a virtual environment (or conda environment) for the installation.

# upgrade pip, wheel and setuptools
python -m pip install -U pip wheel setuptools

# install the package
python -m pip install -U relatio

If you want to use Jupyter, make sure it is installed in the current environment.

Quickstart

Please see our hands-on tutorials:

Team

relatio is brought to you by

with special thanks to ETH Scientific IT Services for their support.

If you are interested in contributing to the project, please read the Development Guide.

Disclaimer

Remember that this is a research tool :)

Download files

Source Distribution: relatio-0.3.0.tar.gz (27.0 kB)

Built Distribution: relatio-0.3.0-py3-none-any.whl (28.7 kB, Python 3)
