
A small wrapper for BratEval that encapsulates the Java commands


BratEval Wrapper for NLP

This library wraps the Java-based BratEval utility for evaluating annotation data for named-entity recognition (NER). Provided that Git, a Java JDK, and Maven are available, it clones and compiles brateval and wraps the I/O interactions in Python.

Note that the wrapper currently uses release v0.3.2 of brateval. See: https://github.com/READ-BioMed/brateval/tree/v0.3.2

Note that a valid Java JDK and Maven environment must be set up correctly.

Install

First, make sure to install a Java JDK environment as well as Maven (for compiling the JAR file) before using the wrapper.

Preparation instructions for Ubuntu
# Install Java (11 or 21), Maven, and Git first
sudo apt install -y openjdk-21-jdk-headless maven git

# Add JAVA_HOME variable to ~/.bashrc
echo 'export JAVA_HOME=$(readlink -f /usr/bin/javac | sed "s:/bin/javac::")' >> $HOME/.bashrc

# Log in again to reload the JAVA_HOME environment variable, or export it manually for the current session
export JAVA_HOME=$(readlink -f /usr/bin/javac | sed "s:/bin/javac::")

Finally, install the package using pip: python3 -m pip install bratevalwrapper4nlp
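Because the wrapper depends on Git, a Java JDK, and Maven, it can help to confirm that these tools are reachable before the first evaluation. The following is a minimal sketch using only the Python standard library; it is not part of bratevalwrapper4nlp. The helper name missing_tools and the executable names (git, javac, mvn) are assumptions based on the Ubuntu instructions above.

```python
import os
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Executable names are assumptions: git (clone), javac (JDK), mvn (Maven).
    missing = missing_tools(["git", "javac", "mvn"])
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("Git, the Java compiler, and Maven were found on the PATH.")

    # JAVA_HOME is set by the preparation steps above; report it if present.
    java_home = os.environ.get("JAVA_HOME")
    print("JAVA_HOME:", java_home if java_home else "(not set)")
```

If anything is reported as missing, repeat the preparation steps before installing or using the wrapper.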

Example

The following script demonstrates the use of the evaluate function.

import json
from bratevalwrapper4nlp import evaluate

# Define document (ground truth and prediction)
doc_ground_truth = {
    "text": "This is a fine example.",
    "label": [
        (10, 14, "LABEL2"),
        (15, 22, "LABEL1"),
    ]
}
doc_prediction = {
    "text": "This is a fine example.",
    "label": [
        (10, 22, "LABEL1"),
    ]
}

# Verify the text spans
for src, doc in {"Ground Truth": doc_ground_truth, "Prediction": doc_prediction}.items():
    for lbl_start, lbl_stop, lbl_cls in doc.get("label", []):
        print("[{}] {} has label {}".format(
            src,
            repr(doc["text"][lbl_start:lbl_stop]),
            lbl_cls
        ))

# Run evaluation
score_response = evaluate(
    doc_ground_truth,     # or list of docs: [doc_ground_truth]
    doc_prediction,       # or list of docs: [doc_prediction]
    span_match="overlap", # "overlap", "exact", or float (overlap fraction between 0.0 and 1.0)
    type_match="exact"    # "exact", or "inexact" (ignores label classes)
)
scores = score_response["scores"]

print("Obtained scores:")
print(json.dumps(scores, indent=2))
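To build intuition for the span_match and type_match options, the sketch below checks by hand which ground-truth spans the prediction from the example would match. It is an illustrative simplification written for this README, not BratEval's actual matching logic: the helpers spans_overlap and is_match are hypothetical, spans are assumed to be half-open (start, stop) character offsets, and the float threshold variant of span_match is not covered.

```python
def spans_overlap(a, b):
    """True if two half-open (start, stop) spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def is_match(gold, pred, span_match="overlap", type_match="exact"):
    """Simplified match test between one gold and one predicted (start, stop, label)."""
    g_start, g_stop, g_cls = gold
    p_start, p_stop, p_cls = pred
    if span_match == "exact":
        span_ok = (g_start, g_stop) == (p_start, p_stop)
    else:  # treated as "overlap" in this sketch
        span_ok = spans_overlap((g_start, g_stop), (p_start, p_stop))
    # "inexact" ignores the label class; "exact" requires identical labels.
    type_ok = type_match == "inexact" or g_cls == p_cls
    return span_ok and type_ok


gold_labels = [(10, 14, "LABEL2"), (15, 22, "LABEL1")]
pred_label = (10, 22, "LABEL1")

for span_match, type_match in [("overlap", "exact"), ("exact", "exact"), ("overlap", "inexact")]:
    matched = [g for g in gold_labels if is_match(g, pred_label, span_match, type_match)]
    print(span_match, type_match, "->", matched)
```

Under these assumptions, the prediction matches only the LABEL1 span with overlapping spans and exact types, matches nothing when exact spans are required, and matches both ground-truth spans once label classes are ignored.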
