spaCy pipeline component for CRF entity extraction
spacy_crfsuite: CRF entity tagger for spaCy.
✨ Features
- spaCy NER component for Conditional Random Field entity extraction (via sklearn-crfsuite).
- Command-line tools for training and evaluation, plus an example notebook.
- Supports JSON, CoNLL and Markdown annotations.
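The Markdown annotation format is not spelled out here. As a minimal sketch, assuming it follows the common inline `[entity text](label)` convention (an assumption; check `examples/example.md` for the real format), an annotated line could be turned into plain text plus character-offset entities like this. The function name `parse_annotated` and the dictionary keys are hypothetical and not part of spacy_crfsuite.

```python
import re

# Assumption: entities are annotated inline as [entity text](label).
ENTITY_PATTERN = re.compile(r"\[(?P<text>[^\]]+)\]\((?P<label>[^)]+)\)")


def parse_annotated(line):
    """Return (plain_text, entities) for one annotated line.

    Hypothetical helper for illustration; not the library's own parser.
    """
    plain_parts = []
    entities = []
    cursor = 0  # position in the annotated line
    offset = 0  # position in the plain text built so far
    for match in ENTITY_PATTERN.finditer(line):
        # Copy the unannotated text that precedes this entity.
        before = line[cursor:match.start()]
        plain_parts.append(before)
        offset += len(before)
        # Record the entity with offsets into the plain text.
        text = match.group("text")
        entities.append(
            {
                "start": offset,
                "end": offset + len(text),
                "value": text,
                "entity": match.group("label"),
            }
        )
        plain_parts.append(text)
        offset += len(text)
        cursor = match.end()
    plain_parts.append(line[cursor:])
    return "".join(plain_parts), entities


text, entities = parse_annotated(
    "show [mexican](cuisine) restaurants up [north](location)"
)
```

Here `text` is the sentence with the markup stripped, and each entry in `entities` points back into it by character offset, which is the shape most sequence taggers expect before tokenization.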
Installation
Python
pip install spacy_crfsuite
🚀 Quickstart
Usage as a spaCy pipeline component
spaCy pipeline
import spacy
from spacy_crfsuite import CRFEntityExtractor
nlp = spacy.blank('en')
pipe = CRFEntityExtractor(nlp).from_disk("model.pkl")
nlp.add_pipe(pipe)
doc = nlp("show mexican restaurents up north")
for ent in doc.ents:
    print(ent.text, "--", ent.label_)
# Output:
# mexican -- cuisine
# north -- location
Follow this example notebook to train the CRF entity tagger from a few restaurant search examples.
Train & evaluate CRF entity tagger
Set up configuration file
$ cat << EOF > config.json
{"c1": 0.03, "c2": 0.06}
EOF
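In sklearn-crfsuite, `c1` and `c2` are the L1 and L2 regularization coefficients of the CRF; it seems likely, though it is an assumption here, that the training command passes these config values through unchanged. The same file can be written from Python, which is convenient when sweeping over several settings. The helper `write_config` below is hypothetical and only uses the standard library.

```python
import json
from pathlib import Path


def write_config(path, params):
    """Write CRF hyperparameters to a JSON file and return what was saved.

    Hypothetical helper; rejects negative regularization coefficients,
    which are not meaningful for c1 (L1) or c2 (L2).
    """
    for name, value in params.items():
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
    config_path = Path(path)
    config_path.write_text(json.dumps(params, indent=2))
    # Read the file back so the caller sees exactly what is on disk.
    return json.loads(config_path.read_text())


if __name__ == "__main__":
    # Same values as the shell example above.
    print(write_config("config.json", {"c1": 0.03, "c2": 0.06}))
```

Larger `c1` values tend to push more feature weights to exactly zero, while `c2` shrinks weights smoothly, so the two are usually tuned together.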
Run training
$ python -m spacy_crfsuite.train examples/example.md -o model/ -c config.json
ℹ Loading config: config.json
ℹ Training CRF entity tagger with 15 examples.
ℹ Saving model to disk
✔ Successfully saved model to file.
/Users/talmago/git/spacy_crfsuite/model/model.pkl
Evaluate on a dataset
$ python -m spacy_crfsuite.eval examples/example.md -m model/model.pkl
ℹ Loading model from file
model/model.pkl
✔ Successfully loaded CRF tagger
<spacy_crfsuite.crf_extractor.CRFExtractor object at 0x126e5f438>
ℹ Loading dev dataset from file
examples/example.md
✔ Successfully loaded 15 dev examples.
⚠ f1 score: 1.0
              precision    recall  f1-score   support
-                 1.000     1.000     1.000         2
B-cuisine         1.000     1.000     1.000         1
L-cuisine         1.000     1.000     1.000         1
U-cuisine         1.000     1.000     1.000         5
U-location        1.000     1.000     1.000         2
micro avg         1.000     1.000     1.000        11
macro avg         1.000     1.000     1.000        11
weighted avg      1.000     1.000     1.000        11
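The label names in the report (`B-cuisine`, `L-cuisine`, `U-cuisine`, and `-`) suggest a BILOU-style tagging scheme: `B-`, `I-` and `L-` mark the beginning, inside and last token of a multi-token entity, `U-` marks a single-token entity, and `-` marks tokens outside any entity. The sketch below illustrates that scheme under this assumption; `bilou_tags` is a hypothetical helper, not the library's implementation.

```python
def bilou_tags(num_tokens, spans, outside="-"):
    """Convert token-index entity spans to BILOU tags.

    spans: iterable of (start, end, label) with end exclusive.
    Hypothetical helper for illustration only.
    """
    tags = [outside] * num_tokens
    for start, end, label in spans:
        if end - start == 1:
            # Single-token entity.
            tags[start] = f"U-{label}"
        else:
            # Multi-token entity: first, middle (if any), last.
            tags[start] = f"B-{label}"
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"
            tags[end - 1] = f"L-{label}"
    return tags


tokens = ["show", "mexican", "restaurants", "up", "north"]
single = bilou_tags(len(tokens), [(1, 2, "cuisine"), (4, 5, "location")])
# single == ["-", "U-cuisine", "-", "-", "U-location"]

multi = bilou_tags(3, [(0, 2, "cuisine")])  # e.g. "south indian food"
# multi == ["B-cuisine", "L-cuisine", "-"]
```

Because every tag is scored as its own class, the report lists one row per prefix and label combination rather than one row per entity type.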