A small wrapper for BratEval that encapsulates the Java commands

BratEval Wrapper for NLP

This library wraps the Java-based BratEval utility for evaluating annotation data in named-entity recognition (NER). Provided that Git, a Java JDK, and Maven are available, it clones and compiles brateval and wraps the I/O interactions in Python.

The wrapper currently uses brateval release v0.3.2. See: https://github.com/READ-BioMed/brateval/tree/v0.3.2

A working Java JDK and Maven environment is required.
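
BratEval operates on brat standoff annotation files, so wrapping the I/O presumably involves serialising Python annotations into that format. The sketch below is an illustration only: `to_brat_standoff` is a hypothetical helper, not part of this package, and the wrapper's real serialisation may differ. It assumes documents shaped like the ones in the Example section (a `"text"` string and a `"label"` list of `(start, stop, type)` tuples) and emits one text-bound annotation per line.

```python
def to_brat_standoff(doc):
    """Render a {"text", "label"} document as brat text-bound annotation lines.

    Hypothetical helper for illustration; not the wrapper's actual code.
    """
    lines = []
    for idx, (start, stop, label) in enumerate(doc.get("label", []), start=1):
        covered = doc["text"][start:stop]
        # brat standoff layout: ID <TAB> TYPE START END <TAB> covered text
        lines.append("T{}\t{} {} {}\t{}".format(idx, label, start, stop, covered))
    return "\n".join(lines)


doc = {
    "text": "This is a fine example.",
    "label": [(10, 14, "LABEL2"), (15, 22, "LABEL1")],
}
print(to_brat_standoff(doc))
```

Each line carries the annotation ID, the type with its character offsets, and the covered text, separated by tabs.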

Install

Install a Java JDK and Maven (needed to compile the JAR file) before using the wrapper.

For Ubuntu, use the following commands:

# Install the Java 11 JDK and Maven first
sudo apt install -y openjdk-11-jdk-headless maven

# Add JAVA_HOME variable to ~/.bashrc
echo 'export JAVA_HOME=$(readlink -f /usr/bin/javac | sed "s:/bin/javac::")' >> $HOME/.bashrc

# Log in again to reload JAVA_HOME, or export the variable manually for the current session
export JAVA_HOME=$(readlink -f /usr/bin/javac | sed "s:/bin/javac::")

# Finally, install the package using pip
python3 -m pip install bratevalwrapper4nlp
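
Before the first evaluation it can help to confirm that the required tools are actually on the PATH. The following is a minimal sketch using only the standard library; `missing_tools` and `REQUIRED_TOOLS` are hypothetical names introduced here, and the tool list (`git`, `javac`, `mvn`) is an assumption derived from the prerequisites above, not a check performed by the package itself.

```python
import shutil

# Assumed executables, based on the Git, JDK, and Maven prerequisites
REQUIRED_TOOLS = ("git", "javac", "mvn")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing prerequisites: " + ", ".join(missing))
else:
    print("All prerequisites found.")
```

Checking for `javac` rather than `java` distinguishes a full JDK from a runtime-only installation.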

Example

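The following script prints the labelled spans of a ground-truth document and a predicted document, then evaluates the prediction against the ground truth using overlapping span matching and exact type matching.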

import json
from bratevalwrapper4nlp import evaluate

# Define document
doc_ground_truth = {
    "text": "This is a fine example.",
    "label": [
        (10, 14, "LABEL2"),
        (15, 22, "LABEL1"),
    ]
}
doc_prediction = {
    "text": "This is a fine example.",
    "label": [
        (10, 22, "LABEL1"),
    ]
}

for src, doc in {"Ground Truth": doc_ground_truth, "Prediction": doc_prediction}.items():
    for lbl_start, lbl_stop, lbl_cls in doc.get("label", []):
        print("[{}] {} has label {}".format(
            src,
            repr(doc["text"][lbl_start:lbl_stop]),
            lbl_cls
        ))

score_response = evaluate(
    doc_ground_truth,
    doc_prediction,
    span_match="overlap",
    type_match="exact"
)
scores = score_response["scores"]

print("Obtained scores:")
print(json.dumps(scores, indent=2))
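
To build intuition for what overlapping span matching combined with exact type matching means, here is a small pure-Python sketch. It is not BratEval's algorithm and makes no claim about the values `evaluate` returns; `spans_overlap` and `overlap_scores` are hypothetical helpers, and the counting scheme (a span counts as matched if any span on the other side overlaps it and has the same type) is a simplifying assumption.

```python
def spans_overlap(a, b):
    """True if the half-open character ranges of a and b share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def overlap_scores(gold, pred):
    """Precision, recall, and F1 under overlap span matching and exact type matching.

    Illustrative only; BratEval's own matching and counting may differ.
    """
    def matched(span, others):
        return any(spans_overlap(span, o) and span[2] == o[2] for o in others)

    pred_hits = sum(matched(p, gold) for p in pred)
    gold_hits = sum(matched(g, pred) for g in gold)
    precision = pred_hits / len(pred) if pred else 0.0
    recall = gold_hits / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


gold = [(10, 14, "LABEL2"), (15, 22, "LABEL1")]
pred = [(10, 22, "LABEL1")]
print(overlap_scores(gold, pred))
```

In this sketch the single prediction overlaps the gold `LABEL1` span and shares its type, so precision is 1.0, while the gold `LABEL2` span has no same-type counterpart, so recall is 0.5.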
