spaCy pipeline for crfsuite entity extraction
Project description
spacy_crfsuite: crfsuite entity extraction for spaCy.
spacy_crfsuite is an entity extraction pipeline for spaCy based on crfsuite.
Install
Python
pip install spacy_crfsuite
Usage
spaCy usage
import os
import spacy
from spacy_crfsuite import CRFEntityExtractorFactory
# load spacy language model
nlp = spacy.blank('en')
# Will look for ``crf.pkl`` in current working dir
pipe = CRFEntityExtractorFactory(nlp, model_dir=os.getcwd())
nlp.add_pipe(pipe)
# Use CRF to extract entities
doc = nlp("given we launched L&M a couple of years ago")
for ent in doc.ents:
    print(ent.text, "--", ent.label_)
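The comment in the snippet above says the factory looks for `crf.pkl` in the model directory. Here is a minimal sketch of a check that could run before building the pipeline, assuming that file name from the comment. The helper `find_crf_model` is hypothetical and not part of spacy_crfsuite.

```python
from pathlib import Path
from typing import Optional


def find_crf_model(model_dir: str, model_name: str = "crf.pkl") -> Optional[Path]:
    """Return the path to the model file if it exists, else None.

    Hypothetical helper: the default file name is taken from the comment
    in the usage snippet, not from the library's documentation.
    """
    candidate = Path(model_dir) / model_name
    return candidate if candidate.is_file() else None
```

Calling `find_crf_model(os.getcwd())` before `CRFEntityExtractorFactory(...)` lets a script report a missing model with its own message instead of relying on whatever the factory does when the file is absent.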
Train a model
python -m spacy_crfsuite.trainer train <TRAIN> --model-dir <MODEL_DIR> --model-name <MODEL_NAME>
Evaluate a model
python -m spacy_crfsuite.trainer eval <DEV> --model-dir <MODEL_DIR> --model-name <MODEL_NAME>
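The train and eval commands differ only in the subcommand and the data file, so a script that runs both can build them from one function. This is a sketch that uses only the flags shown above; the helper name `trainer_command` and the file names in the usage note are illustrative assumptions.

```python
import sys
from typing import List


def trainer_command(action: str, data_path: str, model_dir: str, model_name: str) -> List[str]:
    """Build the argv list for `python -m spacy_crfsuite.trainer`.

    Only the two subcommands and two options shown in this document are used.
    """
    if action not in ("train", "eval"):
        raise ValueError(f"unsupported action: {action!r}")
    return [
        sys.executable, "-m", "spacy_crfsuite.trainer", action, data_path,
        "--model-dir", model_dir,
        "--model-name", model_name,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, for example with `trainer_command("train", "train.md", "models", "crf.pkl")` followed by the same call with `"eval"` and a dev file.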
Gold annotations example (markdown)
## Header
- what is my balance <!-- no entity -->
- how much do I have on my [savings](source_account) <!-- entity "source_account" has value "savings" -->
- how much do I have on my [savings account](source_account:savings) <!-- synonyms, method 1-->
- Could I pay in [yen](currency)? <!-- entity matched by lookup table -->
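To make the annotation syntax above concrete, the following sketch converts one list item into plain text plus character-offset entities. It is not the parser that spacy_crfsuite uses; the function name `parse_example` and the output keys are assumptions, as is the rule that `[text](entity:value)` overrides the value while `[text](entity)` takes the bracketed text as its value.

```python
import re
from typing import Dict, List, Tuple

# Matches [text](entity) and [text](entity:value).
ANNOTATION = re.compile(
    r"\[(?P<text>[^\]]+)\]\((?P<entity>[^:)]+)(?::(?P<value>[^)]+))?\)"
)
# Matches HTML comments such as <!-- no entity -->.
COMMENT = re.compile(r"<!--.*?-->")


def parse_example(line: str) -> Tuple[str, List[Dict[str, object]]]:
    """Strip the markup from one list item and return (plain_text, entities)."""
    line = COMMENT.sub("", line).strip()
    if line.startswith("- "):
        line = line[2:].lstrip()
    plain = ""
    entities: List[Dict[str, object]] = []
    cursor = 0
    for match in ANNOTATION.finditer(line):
        # Copy the unannotated text before this match, then the entity text.
        plain += line[cursor:match.start()]
        start = len(plain)
        plain += match.group("text")
        entities.append({
            "start": start,
            "end": len(plain),
            "entity": match.group("entity"),
            "value": match.group("value") or match.group("text"),
        })
        cursor = match.end()
    plain += line[cursor:]
    return plain, entities
```

Offsets are computed against the stripped text, so `plain[e["start"]:e["end"]]` always recovers the annotated span even when a synonym value differs from it.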
Download files
Source Distribution
spacy_crfsuite-0.1.0.tar.gz (12.8 kB)
Built Distribution
Hashes for spacy_crfsuite-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | c88647cd51331977cce0ebb36473fdd69f3bb7bf6993d04ff313562bb3adff36
MD5 | df3823166cc9b4c3911efcf1f22ba454
BLAKE2b-256 | 1c62c5a3f30208577c2f4df4d7c576b39898614fa742d5347ee306249dc3ec7f