spaCy pipeline component for CRF entity extraction
spacy_crfsuite: CRF entity tagger for spaCy.
✨ Features
- spaCy NER component for Conditional Random Field entity extraction (via sklearn-crfsuite).
- Command-line tools for training and evaluation, plus an example notebook.
- Supports JSON, CoNLL and Markdown annotations.
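To give a feel for inline Markdown annotations, here is a minimal sketch that assumes entities are written as `[text](label)`, the convention used by Rasa-style training data. Both that syntax and the helper name `parse_markdown_example` are assumptions for illustration; they are not taken from the spacy_crfsuite API, so check `examples/example.md` for the format the package actually expects.

```python
import re

# Assumed inline annotation syntax: [entity text](label)
ENTITY_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_markdown_example(line):
    """Return (plain_text, entities) for one annotated line.

    Hypothetical helper: strips the annotation markup and records
    character offsets of each entity in the resulting plain text.
    """
    parts = []
    entities = []
    cursor = 0  # position in the annotated line
    offset = 0  # position in the plain text built so far
    for match in ENTITY_PATTERN.finditer(line):
        before = line[cursor:match.start()]
        parts.append(before)
        offset += len(before)
        value, label = match.group(1), match.group(2)
        entities.append(
            {"start": offset, "end": offset + len(value), "value": value, "entity": label}
        )
        parts.append(value)
        offset += len(value)
        cursor = match.end()
    parts.append(line[cursor:])
    return "".join(parts), entities


text, entities = parse_markdown_example(
    "show [mexican](cuisine) restaurants up [north](location)"
)
print(text)
for ent in entities:
    print(ent["value"], "--", ent["entity"], (ent["start"], ent["end"]))
```

Character offsets are kept alongside the labels because span-based formats usually need them to align annotations with tokens.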
Installation
Python
pip install spacy_crfsuite
🚀 Quickstart
Usage as a spaCy pipeline component
spaCy pipeline
import spacy
from spacy_crfsuite import CRFEntityExtractor
nlp = spacy.blank('en')
pipe = CRFEntityExtractor(nlp).from_disk("model.pkl")
nlp.add_pipe(pipe)
doc = nlp("show mexican restaurents up north")
for ent in doc.ents:
    print(ent.text, "--", ent.label_)
# Output:
# mexican -- cuisine
# north -- location
Follow this example notebook to train the CRF entity tagger from a few restaurant search examples.
Train & evaluate CRF entity tagger
Set up configuration file
$ cat << EOF > config.json
{"c1": 0.03, "c2": 0.06}
EOF
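The same configuration file can be written from Python. This is a small sketch rather than part of the package. In sklearn-crfsuite, `c1` and `c2` are the L1 and L2 regularization coefficients of the CRF; that spacy_crfsuite passes them through unchanged is an assumption here.

```python
import json

# Same values as the shell example; assumed to map onto the
# L1 (c1) and L2 (c2) regularization coefficients of the CRF.
config = {"c1": 0.03, "c2": 0.06}

with open("config.json", "w", encoding="utf-8") as f:
    json.dump(config, f)

# Read the file back to confirm it is valid JSON.
with open("config.json", encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded)
```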
Run training
$ python -m spacy_crfsuite.train examples/example.md -o model/ -c config.json
ℹ Loading config: config.json
ℹ Training CRF entity tagger with 15 examples.
ℹ Saving model to disk
✔ Successfully saved model to file.
/Users/talmago/git/spacy_crfsuite/model/model.pkl
Evaluate on a dataset
$ python -m spacy_crfsuite.eval examples/example.md -m model/model.pkl
ℹ Loading model from file
model/model.pkl
✔ Successfully loaded CRF tagger
<spacy_crfsuite.crf_extractor.CRFExtractor object at 0x126e5f438>
ℹ Loading dev dataset from file
examples/example.md
✔ Successfully loaded 15 dev examples.
⚠ f1 score: 1.0
              precision    recall  f1-score   support
           -      1.000     1.000     1.000         2
   B-cuisine      1.000     1.000     1.000         1
   L-cuisine      1.000     1.000     1.000         1
   U-cuisine      1.000     1.000     1.000         5
  U-location      1.000     1.000     1.000         2
   micro avg      1.000     1.000     1.000        11
   macro avg      1.000     1.000     1.000        11
weighted avg      1.000     1.000     1.000        11
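The `B-`, `L-` and `U-` prefixes in the report look like the BILOU tagging scheme: `B` marks the first token of a multi-token entity, `I` an inside token, `L` the last token, `U` a single-token entity, and an outside tag marks everything else (the report appears to use `-` for it). The sketch below shows that encoding under stated assumptions: whitespace tokenization, token-index spans, and a made-up sentence. `bilou_tags` is a hypothetical helper, not a function from spacy_crfsuite.

```python
def bilou_tags(n_tokens, spans, outside="O"):
    """Encode entity spans as BILOU tags.

    spans: list of (start_token, end_token_exclusive, label).
    Hypothetical helper for illustration only.
    """
    tags = [outside] * n_tokens
    for start, end, label in spans:
        if end - start == 1:
            # Single-token entity
            tags[start] = f"U-{label}"
        else:
            tags[start] = f"B-{label}"
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"
            tags[end - 1] = f"L-{label}"
    return tags


# Made-up example sentence with one two-token and one single-token entity.
tokens = "show me north indian food up north".split()
tags = bilou_tags(len(tokens), [(2, 4, "cuisine"), (6, 7, "location")])

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Each row of the report is then a per-tag score, which is why a multi-token cuisine shows up as separate `B-cuisine` and `L-cuisine` rows.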