A package for extracting causal chains from text

Causal Chain Extractor

This package extracts causal chains from text. It first summarizes the text into cause-effect pairs using the bart-cause-effect model from Hugging Face Transformers, then links those causes and effects by the cosine similarity of their embeddings, computed with a Sentence Transformer model.

[Image: visualization of the largest causal chain produced by the example usage]

Installation

The library can be installed with pip: pip install causal-chains

Usage

  1. Initialize a CausalChain object with a list of chunks of text as input.
  2. Run the create_effects method to get the cause and effect pairs from the text and then link the events based on cosine similarity of their embeddings.
  3. Run the visualize method to see the largest chain.

Methods:

The class "CausalChain" has the following methods:

  1. create_effects: This method uses the pipeline from the transformers library to analyze the text for cause-effect relationships. The text is first split into chunks, with each chunk containing a maximum of 3 sentences. This is done to limit the output length and avoid memory issues. The pipeline then generates summaries of the cause-effect relationships in the text. The triggers and effects of these relationships are stored in separate lists.

  2. create_connections: This method uses the SentenceTransformer from the sentence-transformers library to encode the triggers and effects generated in the create_effects method. The method then calculates the cosine similarity between the triggers and effects to determine if there is a cause-effect relationship between them. If the cosine similarity score is above a certain threshold (default is 0.6), a connection is established between the trigger and effect. The connections between triggers and effects are stored in a dictionary.

  3. find_biggest_chain: This method finds the longest chain of cause-effect relationships in the text. It starts from an effect and follows the connections stored in the connections dictionary to find other effects that are connected to it. The method continues this process until there are no more connections or the chain reaches a certain length (default is 10). The longest chain found is stored in the biggest_chain attribute.

  4. visualize: This method displays a given chain using pydot.
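The linking and chain-search steps described above can be sketched in plain Python. This is a minimal illustration, not the package's actual implementation: the function names (cosine, link_events, longest_chain), the hand-written toy vectors standing in for Sentence Transformer embeddings, and the choice to link one pair's effect to another pair's trigger are all assumptions. Only the 0.6 threshold and the chain length cap of 10 come from the description.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def link_events(pairs, embed, threshold=0.6):
    """Map pair index i to every pair j whose trigger resembles i's effect."""
    connections = {}
    for i, (_, effect) in enumerate(pairs):
        for j, (trigger, _) in enumerate(pairs):
            if i != j and cosine(embed[effect], embed[trigger]) > threshold:
                connections.setdefault(i, []).append(j)
    return connections

def longest_chain(connections, n_pairs, max_len=10):
    """Depth-first search for the longest path, capped at max_len pairs."""
    best = []

    def walk(path):
        nonlocal best
        if len(path) > len(best):
            best = list(path)
        if len(path) >= max_len:
            return
        for nxt in connections.get(path[-1], []):
            if nxt not in path:  # skip pairs already visited to avoid cycles
                walk(path + [nxt])

    for start in range(n_pairs):
        walk([start])
    return best

# Hypothetical (trigger, effect) pairs, as a summarization step might emit them.
pairs = [
    ("heavy rain", "river flooding"),
    ("the river flooded", "roads closed"),
    ("closed roads", "late deliveries"),
]
# Toy 4-dimensional vectors standing in for real sentence embeddings.
embed = {
    "heavy rain": [1.0, 0.0, 0.0, 0.0],
    "river flooding": [0.0, 1.0, 0.0, 0.0],
    "the river flooded": [0.1, 0.9, 0.0, 0.0],
    "roads closed": [0.0, 0.0, 1.0, 0.0],
    "closed roads": [0.0, 0.1, 0.9, 0.0],
    "late deliveries": [0.0, 0.0, 0.0, 1.0],
}

connections = link_events(pairs, embed)
chain = longest_chain(connections, len(pairs))
print(connections)  # {0: [1], 1: [2]}
print(chain)        # [0, 1, 2]
```

Storing the connections as a dictionary from pair index to linked pair indices keeps the chain search a simple graph walk. A lower threshold yields more links and longer, but noisier, chains.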

Example Usage:

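from causal_chain_extractor import CausalChain, util
import wikipedia

# Fetch the article text and split it into chunks
text = wikipedia.page("ChristopherColumbus").content
chunks = util.create_chunks(text)

# Build the chain object and link causes to effects
cc = CausalChain(chunks, device=0)
cc.create_connections()

# Draw the longest chain that was found
biggest_chain = cc.biggest_chain
cc.visualize(biggest_chain)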

The display that this code produces is shown at the top of this page.
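The example relies on util.create_chunks to prepare its input. As a rough sketch of what such a chunking step might look like, assuming the three-sentence limit mentioned under Methods, the hypothetical helper below splits text on sentence-ending punctuation and groups the sentences. The name chunk_sentences and the regex-based splitting are illustrative assumptions, not the package's own code.

```python
import re

def chunk_sentences(text, max_sentences=3):
    """Group sentences into chunks of at most max_sentences each."""
    # Naive split: break after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]

sample = "Rain fell. The river rose! Did roads close? They did. Deliveries slipped."
for chunk in chunk_sentences(sample):
    print(chunk)
# Rain fell. The river rose! Did roads close?
# They did. Deliveries slipped.
```

A regex split like this mishandles abbreviations such as "Dr." or "e.g.", so a real pipeline would likely use a proper sentence tokenizer.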

Google Colab Demo
