Few-Shot Named Entity Recognition using Span Markers

SpanMarker for Named Entity Recognition

🤗 Models | 🛠️ Getting Started In Google Colab | 📄 Documentation

SpanMarker is a framework for training powerful Named Entity Recognition models using familiar encoders such as BERT, RoBERTa and DeBERTa. Tightly implemented on top of the 🤗 Transformers library, SpanMarker can take advantage of its valuable functionality.

Based on the PL-Marker paper, SpanMarker sets itself apart through its accessibility and ease of use. It works out of the box with many common encoders, such as bert-base-cased and roberta-large, and automatically handles datasets annotated with the IOB, IOB2, BIOES or BILOU labeling schemes, as well as datasets that use no such scheme.
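
To make the annotation schemes concrete, the sketch below decodes IOB2 tags into entity spans in plain Python. It is an illustration only and not SpanMarker's own conversion code; the function name `iob2_to_spans` and the example sentence are assumptions made up for this example.

```python
from typing import List, Tuple


def iob2_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Decode IOB2 tags into (label, start, end) spans, with `end` exclusive."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # An "I-X" tag continues the open span only if that span has label X
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            # Close the span that was open, if any
            if label is not None:
                spans.append((label, start, i))
            if tag.startswith("B-") or tag.startswith("I-"):
                # "B-X" opens a span; a stray "I-X" is leniently treated as an opening
                start, label = i, tag[2:]
            else:
                start, label = None, None
    # Close a span that runs until the end of the sentence
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


# Hypothetical example sentence
tokens = ["Tom", "lives", "in", "New", "York", "City"]
tags = ["B-PER", "O", "O", "B-LOC", "I-LOC", "I-LOC"]
for label, start, end in iob2_to_spans(tags):
    print(label, tokens[start:end])
```

Working with (label, start, end) spans rather than per-token tags is what makes the choice of tagging scheme irrelevant once the data has been decoded.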

Documentation

Feel free to have a look at the documentation.

Installation

You may install the span_marker Python module via pip like so:

pip install span_marker
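
After installing, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the standard library (Python 3.8 or newer); the helper name `installed_version` is an assumption made up for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("span_marker")
if version is None:
    print("span_marker is not installed in this environment")
else:
    print(f"span_marker {version} is installed")
```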

Quick Start

Please have a look at our Getting Started notebook for details on how SpanMarker is commonly used. It explains the following snippet in more detail.

from datasets import load_dataset
from span_marker import SpanMarkerModel, Trainer
from transformers import TrainingArguments

dataset = load_dataset("DFKI-SLT/few-nerd", "supervised")
labels = dataset["train"].features["ner_tags"].feature.names

model_name = "bert-base-cased"
model = SpanMarkerModel.from_pretrained(model_name, labels=labels)

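args = TrainingArguments(
    output_dir="my_span_marker_model",
    learning_rate=5e-5,
    gradient_accumulation_steps=2,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=1,
    # Evaluate every `eval_steps` steps; without a step-based evaluation strategy,
    # `eval_steps` has no effect. Newer transformers releases name this `eval_strategy`.
    evaluation_strategy="steps",
    save_strategy="steps",
    eval_steps=200,
    logging_steps=50,
    fp16=True,  # mixed precision; requires a CUDA-capable GPU
    warmup_ratio=0.1,
)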

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"].select(range(8000)),
    eval_dataset=dataset["validation"].select(range(2000)),
)

trainer.train()
trainer.save_model("my_span_marker_model/checkpoint-final")

metrics = trainer.evaluate()
print(metrics)
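
The batch-related arguments in the snippet interact: gradients are accumulated over 2 batches of 4 samples, so each optimizer update sees 8 samples. The sketch below works through that arithmetic. It only approximates how a Hugging Face style trainer counts steps, and the function name `schedule_summary` is an assumption made up for this example. The number of samples the trainer actually sees may differ from the 8000 selected sentences after preprocessing, so it is treated as an input here.

```python
import math


def schedule_summary(num_samples, per_device_batch_size, grad_accum_steps,
                     num_epochs=1, warmup_ratio=0.0, num_devices=1):
    """Estimate optimizer updates from the batch-related training arguments."""
    # Samples contributing to a single optimizer update
    effective_batch_size = per_device_batch_size * grad_accum_steps * num_devices
    batches_per_epoch = math.ceil(num_samples / (per_device_batch_size * num_devices))
    # One optimizer update happens every `grad_accum_steps` batches
    updates_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    total_updates = updates_per_epoch * num_epochs
    warmup_updates = math.ceil(total_updates * warmup_ratio)
    return {
        "effective_batch_size": effective_batch_size,
        "total_updates": total_updates,
        "warmup_updates": warmup_updates,
    }


# Values from the snippet above, assuming a single device and 8000 training samples
summary = schedule_summary(
    num_samples=8000,
    per_device_batch_size=4,
    grad_accum_steps=2,
    num_epochs=1,
    warmup_ratio=0.1,
)
print(summary)
```

Under these assumptions, the learning rate would warm up over the first tenth of the optimizer updates before decaying.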

Because this work is based on PL-Marker, you may expect results similar to those on its Papers with Code leaderboard. Tests, documentation and further information on expected performance will come soon.

Pretrained Models

tomaarsen/span-marker-bert-base-fewnerd-fine-super is a model that I trained in just 4 hours on the fine-grained, supervised Few-NERD dataset. It reached a Test F1 of 0.7020, which is competitive on the all-time Few-NERD leaderboard. My training script resembles the one shown above.
- See this Weights and Biases report for training details.
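
For readers unfamiliar with the metric, the sketch below shows one common way to compute an F1 score for NER. The page does not state how the reported Test F1 was calculated; this example assumes the usual entity-level, exact-match definition, where a prediction counts only if its label and both boundaries match. The function name `span_f1` and the example spans are made up for illustration.

```python
from typing import Dict, Set, Tuple

Span = Tuple[str, int, int]  # (label, start, end)


def span_f1(gold: Set[Span], predicted: Set[Span]) -> Dict[str, float]:
    """Compute exact-match precision, recall and F1 over two sets of spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical spans for a single sentence
gold = {("PER", 0, 1), ("LOC", 3, 6), ("ORG", 8, 9)}
predicted = {("PER", 0, 1), ("LOC", 3, 5)}  # the LOC boundary is off by one
print(span_f1(gold, predicted))
```

To score a whole dataset with this sketch, each span would also need a sentence identifier so that spans from different sentences cannot match each other.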

Changelog

See CHANGELOG.md for news on all SpanMarker versions.

Release history

Version	Release date
1.7.0	Jan 8, 2025
1.6.0	Dec 5, 2024
1.5.0	Oct 31, 2023
1.4.0	Sep 29, 2023
1.3.0	Aug 24, 2023
1.2.5	Aug 24, 2023
1.2.4	Jul 18, 2023
1.2.3	Jun 20, 2023
1.2.2	Jun 20, 2023
1.2.1	Jun 19, 2023
1.2.0	Jun 15, 2023
1.1.1	Jun 13, 2023
1.1.0	Jun 10, 2023
1.0.1	May 1, 2023
1.0.0	May 1, 2023
0.2.2	Apr 13, 2023
0.2.1	Apr 7, 2023
0.2.0	Apr 6, 2023 (this version)
0.1.1	Mar 31, 2023
0.1.0	Mar 30, 2023

Download files

Source Distribution

span_marker-0.2.0.tar.gz (23.3 kB)

Uploaded Apr 6, 2023 Source

Built Distribution

span_marker-0.2.0-py3-none-any.whl (22.4 kB)

Uploaded Apr 6, 2023 Python 3

File details

Details for the file span_marker-0.2.0.tar.gz.

File metadata

Download URL: span_marker-0.2.0.tar.gz
Upload date: Apr 6, 2023
Size: 23.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Hashes for span_marker-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`e9728b4dd050b1d60d645c3c82d983ff57125bb4f417d41466eda801843f1b5b`
MD5	`b450c4f62adedade7d3de195345d049c`
BLAKE2b-256	`6d80b5aed3a636b9b240ef73fba54ecd367eebac67d61fb062868d0b1a606993`

File details

Details for the file span_marker-0.2.0-py3-none-any.whl.

File metadata

Download URL: span_marker-0.2.0-py3-none-any.whl
Upload date: Apr 6, 2023
Size: 22.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Hashes for span_marker-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d4bc8d991ac17d6c678973a340b668a53da011a3cc8e55355688c1aa298a9a7b`
MD5	`8e182e1c5b366a94f51956891b771787`
BLAKE2b-256	`a8c1c73034eaeaa4f2701ac204bd47f83dcbd93dd7cbcf77bfdb2dee6b8b78e6`
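
The published digests can be used to check that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of` and `matches_published_hash` are made up for this example, and it assumes the files were downloaded into the current directory under their original names.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digests copied from the tables above
PUBLISHED_SHA256 = {
    "span_marker-0.2.0.tar.gz":
        "e9728b4dd050b1d60d645c3c82d983ff57125bb4f417d41466eda801843f1b5b",
    "span_marker-0.2.0-py3-none-any.whl":
        "d4bc8d991ac17d6c678973a340b668a53da011a3cc8e55355688c1aa298a9a7b",
}

for name, expected in PUBLISHED_SHA256.items():
    if Path(name).exists():
        status = "OK" if matches_published_hash(name, expected) else "MISMATCH"
        print(name, status)
    else:
        print(name, "not found in the current directory")
```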
