spaCy pipeline component for CRF entity extraction
spacy_crfsuite: CRF entity tagger for spaCy.
✨ Features
- spaCy NER component for Conditional Random Field entity extraction (via sklearn-crfsuite).
- Command-line training and evaluation, plus an example notebook.
- Supports JSON, CoNLL and Markdown annotations.
Installation
Python
pip install spacy_crfsuite
🚀 Quickstart
Usage as a spaCy pipeline component
spaCy pipeline
import spacy
from spacy_crfsuite import CRFEntityExtractor

nlp = spacy.blank('en')
# Load a trained CRF model from disk and add it to the pipeline.
# Passing the component instance to add_pipe is the spaCy v2 calling convention.
pipe = CRFEntityExtractor(nlp).from_disk("model.pkl")
nlp.add_pipe(pipe)

doc = nlp("show mexican restaurants up north")
for ent in doc.ents:
    print(ent.text, "--", ent.label_)

# Output:
# mexican -- cuisine
# north -- location
Follow this example notebook to train the CRF entity tagger from a few restaurant search examples.
Train & evaluate CRF entity tagger
Set up configuration file
$ cat << EOF > config.json
{"c1": 0.03, "c2": 0.06}
EOF
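In sklearn-crfsuite, `c1` and `c2` are the coefficients for L1 and L2 regularization. Whether spacy_crfsuite passes them through unchanged is an assumption here. As a minimal sketch, the same configuration file can be written from Python with the standard library; `write_config` is a hypothetical helper, not part of the package.

```python
import json
from pathlib import Path


def write_config(path, c1=0.03, c2=0.06):
    """Write CRF hyperparameters to a JSON file and return them as re-read from disk.

    c1 / c2 follow the sklearn-crfsuite naming (L1 / L2 regularization
    coefficients); the defaults mirror the values in the shell example.
    """
    config = {"c1": c1, "c2": c2}
    target = Path(path)
    target.write_text(json.dumps(config))
    # Read the file back so the caller sees exactly what the trainer would load.
    return json.loads(target.read_text())


if __name__ == "__main__":
    print(write_config("config.json"))
```

Larger values of either coefficient regularize more strongly, so a small dataset such as the 15-example file used below may be sensitive to them.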
Run training
$ python -m spacy_crfsuite.train examples/example.md -o model/ -c config.json
ℹ Loading config: config.json
ℹ Training CRF entity tagger with 15 examples.
ℹ Saving model to disk
✔ Successfully saved model to file.
/Users/talmago/git/spacy_crfsuite/model/model.pkl
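The training file above is a Markdown file. Assuming it uses inline annotations of the form `[entity text](label)` (an assumption; the exact syntax is not shown on this page), one annotated line can be turned into plain text plus character offsets roughly as follows. `parse_annotated` and the dictionary keys are illustrative names, not the package's API.

```python
import re

# Assumed inline annotation syntax: [entity text](label)
ENTITY_PATTERN = re.compile(r"\[(?P<text>[^\]]+)\]\((?P<label>[^)]+)\)")


def parse_annotated(line):
    """Return (plain_text, entities) for one annotated example line."""
    plain_parts = []
    entities = []
    cursor = 0  # position in the annotated line
    offset = 0  # length of the plain text built so far
    for match in ENTITY_PATTERN.finditer(line):
        # Copy the unannotated text that precedes this entity.
        before = line[cursor:match.start()]
        plain_parts.append(before)
        offset += len(before)
        # Record the entity with offsets into the plain text.
        value = match.group("text")
        entities.append({
            "start": offset,
            "end": offset + len(value),
            "value": value,
            "entity": match.group("label"),
        })
        plain_parts.append(value)
        offset += len(value)
        cursor = match.end()
    plain_parts.append(line[cursor:])
    return "".join(plain_parts), entities


if __name__ == "__main__":
    text, ents = parse_annotated("show [mexican](cuisine) restaurants up [north](location)")
    print(text)
    for ent in ents:
        print(ent)
```

Character offsets like these are what a token-level tagger needs in order to align labels with tokens before training.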
Evaluate on a dataset
$ python -m spacy_crfsuite.eval examples/example.md -m model/model.pkl
ℹ Loading model from file
model/model.pkl
✔ Successfully loaded CRF tagger
<spacy_crfsuite.crf_extractor.CRFExtractor object at 0x126e5f438>
ℹ Loading dev dataset from file
examples/example.md
✔ Successfully loaded 15 dev examples.
⚠ f1 score: 1.0
              precision  recall  f1-score  support
-                 1.000   1.000     1.000        2
B-cuisine         1.000   1.000     1.000        1
L-cuisine         1.000   1.000     1.000        1
U-cuisine         1.000   1.000     1.000        5
U-location        1.000   1.000     1.000        2
micro avg         1.000   1.000     1.000       11
macro avg         1.000   1.000     1.000       11
weighted avg      1.000   1.000     1.000       11
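The `B-`, `L-` and `U-` prefixes in the report match the BILOU tagging scheme (Begin, Inside, Last, Unit, Outside). The sketch below shows how entity spans map to such tags. It is an illustration of the scheme, not the package's own code: `bilou_tags` is a hypothetical helper, and it writes `O` for outside tokens, which is presumably what the report's `-` row stands for.

```python
def bilou_tags(tokens, entities):
    """Assign one BILOU tag per token.

    tokens:   list of token strings
    entities: list of (start_token, end_token_exclusive, label) spans
    """
    tags = ["O"] * len(tokens)  # default: outside any entity
    for start, end, label in entities:
        if end - start == 1:
            # Single-token entity
            tags[start] = f"U-{label}"
        else:
            # Multi-token entity: first, middle (if any), last
            tags[start] = f"B-{label}"
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"
            tags[end - 1] = f"L-{label}"
    return tags


if __name__ == "__main__":
    tokens = ["show", "south", "indian", "food", "up", "north"]
    spans = [(1, 3, "cuisine"), (5, 6, "location")]
    for token, tag in zip(tokens, bilou_tags(tokens, spans)):
        print(token, tag)
```

Under this scheme a two-token cuisine gives one `B-cuisine` and one `L-cuisine` tag, which is consistent with those two rows each having a support of 1.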