A BERT-based inference module for negation detection (cues and scopes). Focus and event detection are planned for a future release.

Project description

neg-detect

neg-detect is a Python package for detecting negation cues and their scopes in text using fine-tuned BERT models. It provides a pipeline to process batched text inputs, identify negation cues (e.g., "not", "n't"), and determine the scope of negation within sentences. The package leverages the Hugging Face Transformers library and PyTorch for efficient inference.

Features

  • Negation Cue Detection: Identifies negation cues (e.g., "not", "n't") using the CueBertInference class.
  • Negation Scope Detection: Determines the scope of negation in text using the ScopeBertInference class.
  • Pipeline Processing: Combines cue and scope detection in a single pipeline for streamlined processing.
  • Batch Processing: Supports batched inputs for efficient inference.
  • GPU Support: Utilizes CUDA for accelerated inference on compatible hardware.
  • Planned: negation event and focus detection components will be added to the Pipeline.

Installation

Prerequisites

  • Python 3.6 or higher
  • PyTorch
  • Hugging Face Transformers
  • CUDA-enabled GPU (optional, for faster inference)

Install via PyPI

pip install neg-detect

Install Dependencies

Ensure dependencies are installed:

pip install torch transformers
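
To confirm that both dependencies are visible to the interpreter before loading any models, a small check along these lines can help. This is a minimal sketch using only the standard library; the helper name missing_dependencies is made up for illustration and is not part of neg-detect.

```python
import importlib.util


def missing_dependencies(modules=("torch", "transformers")):
    """Return the names of modules that cannot be found in this environment.

    Hypothetical helper, not part of neg-detect. It only looks the modules
    up and does not import them, so it is cheap to run.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = missing_dependencies()
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("torch and transformers are installed.")
```

If the first message is printed, rerun the pip command above for the listed packages.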

Usage

Basic Example

The following example demonstrates how to use the Pipeline class to detect negation cues and scopes in a batch of sentences.

from neg_detect import Pipeline

# Define input sentences
batch_tokens = [
    "Your sample input does n't go here , i live in Germany .".split(" "),
    "This is not another test sentence .".split(" ")
]

# Initialize pipeline with default models
pipe = Pipeline()

# Run inference
results = pipe.run(batch_tokens)

# Pretty print results
Pipeline.pretty_print(results)

Output:

Your            X
sample          X
input           X
does            X
n't             C
go              S
here            S
,               X
i               X
live            X
in              X
Germany         X
.               X

This            X
is              X
not             C
another         S
test            S
sentence        S
.               X
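
The example inputs are already tokenized, with the clitic "n't" and punctuation as separate tokens. For raw text, a rough pre-tokenizer could look like the sketch below. It is an assumption that the default models expect exactly this tokenization (it is inferred only from the example input), and the function tokenize is illustrative, not part of the package.

```python
import re

# Order matters: try the clitic "n't" first, then a word stem that is
# directly followed by "n't", then any other word, then single punctuation.
_TOKEN_RE = re.compile(r"n't|\w+(?=n't)|\w+|[^\w\s]", re.IGNORECASE)


def tokenize(text):
    """Split raw text into tokens, separating "n't" and punctuation.

    Hypothetical helper: "doesn't" becomes ["does", "n't"], and a trailing
    "." becomes its own token. Other contractions (e.g. "it's") are split
    at the apostrophe character, which may or may not suit the models.
    """
    return _TOKEN_RE.findall(text)


batch_tokens = [tokenize("This isn't a test.")]
print(batch_tokens)  # [['This', 'is', "n't", 'a', 'test', '.']]
```

The resulting batch_tokens list has the same shape as the one passed to pipe.run in the basic example.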

Advanced Usage

For custom models or tokenizers, you can initialize the pipeline with specific components:

from neg_detect import Pipeline, CueBertInference, ScopeBertInference

# Load custom models and tokenizers
mcue_path = "Lelon/8449368577"
mscope_path = "Lelon/5556020097"
model_cue, tokenizer_cue = CueBertInference.load_model_and_tokenizer(mcue_path, mcue_path)
model_scope, tokenizer_scope = ScopeBertInference.load_model_and_tokenizer(mscope_path, mscope_path)

# Initialize pipeline with custom components
pipe = Pipeline(
    components=[CueBertInference, ScopeBertInference],
    models=[model_cue, model_scope],
    tokenizers=[tokenizer_cue, tokenizer_scope]
)

# Define input
batch_tokens = [
    "This is not another test sentence .".split(" ")
]

# Run inference
results = pipe.run(batch_tokens, device="cuda:0", max_length=128)

# Print results
Pipeline.pretty_print(results)
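
The call above hard-codes device="cuda:0", which presumably fails on machines without a CUDA GPU. A fallback can be sketched as follows. The helper pick_device is hypothetical, and it is an assumption that pipe.run accepts the string "cpu" as a device.

```python
def pick_device(preferred="cuda:0"):
    """Return `preferred` if PyTorch reports a usable CUDA device, else "cpu".

    Hypothetical helper, not part of neg-detect. If PyTorch itself is not
    installed, "cpu" is returned as well.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


device = pick_device()
print("Running inference on:", device)

# Assumed usage with the pipeline from the example above:
# results = pipe.run(batch_tokens, device=device, max_length=128)
```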

Package Structure

  • CueBertInference: Detects negation cues (labeled as "C" for cues, "X" otherwise).
  • ScopeBertInference: Identifies the scope of negation (labeled as "S" for scope, "X" otherwise).
  • Pipeline: Combines CueBertInference and ScopeBertInference for end-to-end negation detection.
  • Special Tokens:
    • [CUE]: Marks negation cues.
    • [SCO]: Marks negation scope.
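
Given the per-token labels described above ("C", "S", "X"), contiguous cue and scope spans can be collected with a few lines of plain Python. This sketch works on parallel lists of tokens and labels. How to obtain such lists from the object returned by pipe.run is not documented here, so that step is left out, and label_spans is an illustrative name.

```python
from itertools import groupby


def label_spans(tokens, labels, target):
    """Return contiguous runs of tokens whose label equals `target`.

    Each span is a tuple (start, end, span_tokens) with `end` exclusive.
    Hypothetical helper, not part of neg-detect.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    spans = []
    index = 0
    for label, group in groupby(labels):
        length = len(list(group))
        if label == target:
            spans.append((index, index + length, tokens[index:index + length]))
        index += length
    return spans


# Tokens and labels copied from the second sentence of the basic example.
tokens = "This is not another test sentence .".split(" ")
labels = ["X", "X", "C", "S", "S", "S", "X"]

print(label_spans(tokens, labels, "C"))  # [(2, 3, ['not'])]
print(label_spans(tokens, labels, "S"))  # [(3, 6, ['another', 'test', 'sentence'])]
```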

Requirements

See requirements.txt for a full list of dependencies. Key dependencies include:

  • torch>=1.9.0
  • transformers>=4.9.0

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please open an issue on the GitHub repository.

Download files

Source Distribution

neg_detect-0.1.1.tar.gz (6.9 kB)

Uploaded Source

Built Distribution

neg_detect-0.1.1-py3-none-any.whl (7.0 kB)

Uploaded Python 3

File details

Details for the file neg_detect-0.1.1.tar.gz.

File metadata

  • Download URL: neg_detect-0.1.1.tar.gz
  • Upload date:
  • Size: 6.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for neg_detect-0.1.1.tar.gz
Algorithm Hash digest
SHA256 aba111008e3763073488cdada0f50e30967c5405e709c42c3f3e29cdb1b29b09
MD5 56e856c1adf4f4e519c6a6a4d460efef
BLAKE2b-256 7a0d3887086286f27433180e6102e7eac0d6230d34fcd8867cd95011aef3936c

File details

Details for the file neg_detect-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: neg_detect-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 7.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for neg_detect-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 4424d76c08ac2ab8d7b3dcb4d190289371bd9eae409d3c2e26368e81dd117530
MD5 20ba598a188ee6a7e837c1a4bd33e7c6
BLAKE2b-256 a1e65c0752ddd78f443d4804b006ab5bb7639834193fa5d193b00888020bc441
