Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and prompting mass-media news into datasets for ML-model training

Project description

AREkit 0.25.2

AREkit (Attitude and Relation Extraction Toolkit) is a Python toolkit devoted to document-level Attitude and Relation Extraction between text objects in mass-media news.

Description

This toolkit aims at memory-efficient data processing in Relation Extraction (RE) related tasks.

Figure: AREkit pipeline design. For more details, see the paper "ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction".

In particular, this framework provides the following features:

  • pipelines and iterators for serializing large-scale collections without out-of-memory issues,
  • 🔗 EL (entity-linking) API support for objects,
  • ➰ avoidance of cyclic connections,
  • 📏 distance consideration between relation participants (in terms or sentences),
  • 📑 relation annotation and filtering rules,
  • *️⃣ entity formatting or masking, and more.
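
The features listed above can be illustrated with a minimal, library-independent sketch. It does not use the AREkit API: the `Entity` class and the functions `iter_pairs`, `filter_pairs`, and `serialize` are hypothetical names introduced only for this example. It assumes that "cyclic connections" means a relation whose source and target link to the same object, and that distance is measured either in terms or in sentences. Generators keep only one candidate pair in memory at a time.

```python
import json
from dataclasses import dataclass
from itertools import permutations
from typing import IO, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity mention (not an AREkit class)."""
    value: str
    group_id: int    # entity-linking id shared by synonymous mentions
    term_index: int  # position of the mention, counted in terms
    sent_index: int  # index of the sentence that holds the mention


Pair = Tuple[Entity, Entity]


def iter_pairs(entities: List[Entity]) -> Iterator[Pair]:
    """Lazily yield every ordered (source, target) candidate pair."""
    yield from permutations(entities, 2)


def filter_pairs(pairs: Iterable[Pair],
                 max_term_dist: int,
                 max_sent_dist: int) -> Iterator[Pair]:
    """Drop self-linked pairs and pairs whose participants are too far apart."""
    for src, tgt in pairs:
        if src.group_id == tgt.group_id:
            continue  # assumed meaning of "cyclic": both ends are the same object
        if abs(src.term_index - tgt.term_index) > max_term_dist:
            continue
        if abs(src.sent_index - tgt.sent_index) > max_sent_dist:
            continue
        yield src, tgt


def serialize(pairs: Iterable[Pair], out: IO[str]) -> int:
    """Write one JSON line per pair and return the number of lines written."""
    count = 0
    for src, tgt in pairs:
        out.write(json.dumps({"source": src.value, "target": tgt.value}) + "\n")
        count += 1
    return count
```

Chaining the stages as `serialize(filter_pairs(iter_pairs(entities), 10, 0), out)` streams pairs straight to the output, so the full set of candidate pairs is never materialized.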

The core functionality includes:

  • API for document presentation with EL (Entity Linking, i.e. Object Synonymy) support, used to prepare sentence-level relations (dubbed contexts);
  • API for context extraction;
  • Transferring relations from the sentence level to the document level, and more.
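
As a rough illustration of the last point, sentence-level labels can be merged into one document-level label per pair of linked objects. The sketch below uses majority voting, which is only one possible strategy and is an assumption here; the strategy AREkit applies may differ. The function name `to_document_level` and the string labels are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# A pair is identified by the entity-linking ids of its source and target.
Pair = Tuple[int, int]


def to_document_level(context_labels: Iterable[Tuple[Pair, str]]) -> Dict[Pair, str]:
    """Pick the most frequent sentence-level label for every pair.

    Ties resolve to the label that was seen first for that pair.
    """
    votes: Dict[Pair, Counter] = defaultdict(Counter)
    for pair, label in context_labels:
        votes[pair][label] += 1
    return {pair: counter.most_common(1)[0][0] for pair, counter in votes.items()}
```

Because pairs are keyed by linked ids rather than by surface strings, contexts that mention the same object under different names contribute to the same document-level relation.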

Installation

pip install git+https://github.com/nicolay-r/AREkit.git@0.25.2-rc

Usage

Please follow the tutorial section on the project Wiki for more details.

How to cite

Great research is accompanied by faithful references. If you use or extend our work, please cite it as follows:

@inproceedings{rusnachenko2024arelight,
  title={ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction},
  author={Rusnachenko, Nicolay and Liang, Huizhi and Kolomeets, Maxim and Shi, Lei},
  booktitle={European Conference on Information Retrieval},
  year={2024},
  organization={Springer}
}


Download files

Source Distribution

arekit-0.25.2.tar.gz (95.0 kB)

Uploaded Source

Built Distribution

arekit-0.25.2-py3-none-any.whl (132.3 kB)

Uploaded Python 3

File details

Details for the file arekit-0.25.2.tar.gz.

File metadata

  • Download URL: arekit-0.25.2.tar.gz
  • Upload date:
  • Size: 95.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.5

File hashes

Hashes for arekit-0.25.2.tar.gz
Algorithm Hash digest
SHA256 018fbc0d048f9e2e974f1669786ef78416c8694cbf61f7d480f917d495794d39
MD5 58436afbdf7f7e48c5371530aa667349
BLAKE2b-256 ee5c882a0936074b30942b9845cfee7dbaede98415fde251bd0264ace017db39
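
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `matches_digest` are hypothetical, and the local file path in the usage line is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare the file digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_digest("arekit-0.25.2.tar.gz", "018fbc0d048f9e2e974f1669786ef78416c8694cbf61f7d480f917d495794d39")` should return `True` for an intact copy of the source distribution.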

File details

Details for the file arekit-0.25.2-py3-none-any.whl.

File metadata

  • Download URL: arekit-0.25.2-py3-none-any.whl
  • Upload date:
  • Size: 132.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.5

File hashes

Hashes for arekit-0.25.2-py3-none-any.whl
Algorithm Hash digest
SHA256 75f48b65d57bc0dc1469d68c46f54cae862d565426d27cd30b6682140be82ff4
MD5 3ae98e5b6b6a049e4b4824587e69f8e4
BLAKE2b-256 d2be3b0b37c37ccc0c5e90bcba100c63cce479faf607554ad70c6a1738b26712
