Few-Shot Named Entity Recognition using Span Markers
SpanMarker for Named Entity Recognition
SpanMarker is a framework for training powerful Named Entity Recognition models using familiar encoders such as BERT, RoBERTa and DeBERTa. Tightly implemented on top of the 🤗 Transformers library, SpanMarker can take advantage of its valuable functionality.
Based on the PL-Marker paper, SpanMarker breaks the mold through its accessibility and ease of use. Crucially, SpanMarker works out of the box with many common encoders such as bert-base-cased and roberta-large, and automatically works with datasets using the IOB, IOB2, BIOES, BILOU or no label annotation scheme.
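To make the difference between these annotation schemes concrete, here is a minimal sketch of how tag sequences in each scheme can be reduced to the same entity spans. The helper `tags_to_spans` is hypothetical and written only for illustration; it is not part of the SpanMarker API and does not claim to mirror how SpanMarker normalizes labels internally.

```python
from typing import List, Optional, Tuple

# (label, start, end) with an exclusive end index, counted in tokens.
Span = Tuple[str, int, int]


def tags_to_spans(tags: List[str]) -> List[Span]:
    """Collapse IOB, IOB2, BIOES or BILOU tags into labelled token spans.

    Hypothetical helper for illustration only; not part of SpanMarker.
    """
    spans: List[Span] = []
    start: Optional[int] = None
    label: Optional[str] = None
    for i, tag in enumerate(tags):
        # "B-PER" -> ("B", "PER"); "O" -> ("O", "")
        prefix, _, tag_label = tag.partition("-")
        continues = prefix in ("I", "E", "L") and tag_label == label
        # Close the open span if this tag does not continue it.
        if start is not None and not continues:
            spans.append((label, start, i))
            start, label = None, None
        # Open a new span on B/S/U, or on I/E/L when nothing is open (IOB).
        if prefix in ("B", "S", "U") or (prefix in ("I", "E", "L") and start is None):
            start, label = i, tag_label
        # E/L and S/U mark the last token of an entity (BIOES, BILOU).
        if prefix in ("E", "L", "S", "U") and start is not None:
            spans.append((label, start, i + 1))
            start, label = None, None
    # Flush a span that runs to the end of the sentence.
    if start is not None:
        spans.append((label, start, len(tags)))
    return spans


iob2 = ["B-PER", "I-PER", "O", "B-LOC"]
bioes = ["B-PER", "E-PER", "O", "S-LOC"]
print(tags_to_spans(iob2))
print(tags_to_spans(bioes))
```

Both calls describe the same two entities, which is why a span-based model can treat the schemes interchangeably once the tags are converted.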
Installation
You may install the span_marker Python module via pip like so:
pip install span_marker
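If you want to confirm that the installation succeeded, one option is to query the installed distribution's metadata from Python. This is a small sketch using only the standard library; the helper name `installed_version` is an assumption made for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if span_marker is installed, otherwise None.
print(installed_version("span_marker"))
```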
Quick Start
Please have a look at our Getting Started Jupyter notebook for details on how SpanMarker is commonly used. That notebook explains the following snippet in more detail.
from datasets import load_dataset
from span_marker import SpanMarkerModel, Trainer
from transformers import TrainingArguments

# Load the supervised Few-NERD dataset and read its label names
dataset = load_dataset("DFKI-SLT/few-nerd", "supervised")
labels = dataset["train"].features["ner_tags"].feature.names

# Initialize a SpanMarker model on top of a pretrained encoder
model_name = "bert-base-cased"
model = SpanMarkerModel.from_pretrained(model_name, labels=labels)

args = TrainingArguments(
    output_dir="my_span_marker_model",
    learning_rate=5e-5,
    gradient_accumulation_steps=2,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=1,
    save_strategy="steps",
    eval_steps=200,  # used only if step-based evaluation is enabled
    logging_steps=50,
    bf16=True,  # requires hardware with bfloat16 support
    warmup_ratio=0.1,
)

# Train and evaluate on subsets of the data to keep the run short
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"].select(range(8000)),
    eval_dataset=dataset["validation"].select(range(2000)),
)

trainer.train()
trainer.save_model("my_span_marker_model/checkpoint-final")

# Compute and print metrics on the evaluation subset
metrics = trainer.evaluate()
print(metrics)
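As a rough guide to what the training arguments above imply, the following sketch estimates the effective batch size, the number of optimizer steps and the number of warmup steps. The function `estimate_schedule` is hypothetical and not part of SpanMarker or Transformers. It assumes a single device and one training sample per dataset row; preprocessing may produce a different number of model inputs, and the Trainer's own rounding may differ slightly.

```python
import math
from typing import Dict


def estimate_schedule(
    num_samples: int,
    per_device_batch_size: int,
    gradient_accumulation_steps: int,
    num_epochs: int,
    warmup_ratio: float,
    num_devices: int = 1,
) -> Dict[str, int]:
    """Approximate the training schedule implied by a set of arguments.

    Hypothetical helper for illustration; the real Trainer may round differently.
    """
    # Samples consumed per optimizer update.
    effective_batch = per_device_batch_size * gradient_accumulation_steps * num_devices
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    total_steps = steps_per_epoch * num_epochs
    # Fraction of all steps spent ramping the learning rate up.
    warmup_steps = round(total_steps * warmup_ratio)
    return {
        "effective_batch_size": effective_batch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# Values taken from the snippet above: 8000 rows, batch size 4, accumulation 2.
schedule = estimate_schedule(
    num_samples=8000,
    per_device_batch_size=4,
    gradient_accumulation_steps=2,
    num_epochs=1,
    warmup_ratio=0.1,
)
print(schedule)
```

Under these assumptions, each optimizer update sees 8 samples, so halving per_device_train_batch_size while doubling gradient_accumulation_steps keeps the effective batch size unchanged at lower memory cost.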
Because this work is based on PL-Marker, you may expect results similar to those on its Papers with Code leaderboard. Tests, documentation and further information on expected performance will come soon.