spaCy pipeline for crfsuite entity extraction
Project description
spacy_crfsuite: crfsuite entity extraction for spaCy.
spacy_crfsuite is an entity extraction pipeline for spaCy based on crfsuite.
Install
Python
pip install spacy_crfsuite
Usage
spaCy usage
import os

import spacy
from spacy_crfsuite import CRFEntityExtractorFactory

# Create a blank English pipeline
nlp = spacy.blank('en')

# Will look for ``crf.pkl`` in current working dir
pipe = CRFEntityExtractorFactory(nlp, model_dir=os.getcwd())
# Passing the component object to add_pipe is the spaCy v2 calling style
nlp.add_pipe(pipe)

# Use CRF to extract entities
doc = nlp("given we launched L&M a couple of years ago")
for ent in doc.ents:
    print(ent.text, "--", ent.label_)
Train a model
python -m spacy_crfsuite.trainer train <TRAIN> --model-dir <MODEL_DIR> --model-name <MODEL_NAME>
Evaluate a model
python -m spacy_crfsuite.trainer eval <DEV> --model-dir <MODEL_DIR> --model-name <MODEL_NAME>
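The train and eval invocations share the same shape, so a script can assemble both from one helper. The sketch below is only illustrative: `build_trainer_command` is a hypothetical helper, not part of spacy_crfsuite, and the paths `train.md`, `dev.md`, and `models` are placeholder assumptions. Passing `crf.pkl` as the model name is also an assumption based on the file name mentioned in the usage example. Only the module path and the two flags come from the commands above.

```python
import sys
from typing import List


def build_trainer_command(action: str, data_path: str,
                          model_dir: str, model_name: str) -> List[str]:
    """Assemble the argument list for the trainer CLI (hypothetical helper)."""
    if action not in ("train", "eval"):
        raise ValueError(f"unsupported action: {action!r}")
    return [
        sys.executable, "-m", "spacy_crfsuite.trainer",
        action, data_path,
        "--model-dir", model_dir,
        "--model-name", model_name,
    ]


# Placeholder paths; substitute your own data and model locations.
train_cmd = build_trainer_command("train", "train.md", "models", "crf.pkl")
eval_cmd = build_trainer_command("eval", "dev.md", "models", "crf.pkl")

# Show the commands without running them.
print(" ".join(train_cmd))
print(" ".join(eval_cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` to run training followed by evaluation against the same model directory.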
Gold annotations example (markdown)
## Header
- what is my balance <!-- no entity -->
- how much do I have on my [savings](source_account) <!-- entity "source_account" has value "savings" -->
- how much do I have on my [savings account](source_account:savings) <!-- synonyms, method 1-->
- Could I pay in [yen](currency)? <!-- entity matched by lookup table -->
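To make the annotation syntax concrete, here is a minimal sketch of how one example line could be turned into plain text plus character-offset entity spans. It is not the parser that spacy_crfsuite itself uses; `parse_example`, the regular expressions, and the output dictionary keys are all assumptions made for illustration, inferred only from the examples above.

```python
import re
from typing import Dict, List, Tuple

# [surface text](label) or [surface text](label:value)
ENTITY_RE = re.compile(
    r"\[(?P<text>[^\]]+)\]\((?P<label>[^):]+)(?::(?P<value>[^)]+))?\)"
)
COMMENT_RE = re.compile(r"<!--.*?-->")


def parse_example(line: str) -> Tuple[str, List[Dict[str, object]]]:
    """Return (plain_text, entities) for one annotated list item."""
    # Drop HTML comments and the leading list marker.
    line = COMMENT_RE.sub("", line).strip()
    if line.startswith("- "):
        line = line[2:]

    parts: List[str] = []
    entities: List[Dict[str, object]] = []
    cursor = 0  # position in the annotated line
    offset = 0  # length of the plain text built so far
    for match in ENTITY_RE.finditer(line):
        before = line[cursor:match.start()]
        parts.append(before)
        offset += len(before)
        surface = match.group("text")
        entities.append({
            "start": offset,
            "end": offset + len(surface),
            "entity": match.group("label"),
            # Without an explicit ":value", the surface text is the value.
            "value": match.group("value") or surface,
        })
        parts.append(surface)
        offset += len(surface)
        cursor = match.end()
    parts.append(line[cursor:])
    return "".join(parts), entities


text, ents = parse_example(
    "- how much do I have on my [savings account](source_account:savings)"
    " <!-- synonyms, method 1-->"
)
print(text)
print(ents)
```

The synonym form keeps the surface text `savings account` in the sentence while recording `savings` as the normalized value, which is the distinction the third example line illustrates.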
Download files
Source Distribution
spacy_crfsuite-0.1.0.tar.gz (12.8 kB)
Built Distribution
spacy_crfsuite-0.1.0-py3-none-any.whl (14.1 kB)
File details
Details for the file spacy_crfsuite-0.1.0.tar.gz.
File metadata
- Download URL: spacy_crfsuite-0.1.0.tar.gz
- Upload date:
- Size: 12.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.6.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2dc50e7d90b4d6df6da04647aeb02984fc934f75427fa074731c6c78a44a671e |
| MD5 | 4c98d90aa33c33dc2450e27ca8f9aee0 |
| BLAKE2b-256 | d96eefa05fa497d3e95c5edfad2a2282d10927074992d305bbeb43eb911698b3 |
File details
Details for the file spacy_crfsuite-0.1.0-py3-none-any.whl.
File metadata
- Download URL: spacy_crfsuite-0.1.0-py3-none-any.whl
- Upload date:
- Size: 14.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.6.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c88647cd51331977cce0ebb36473fdd69f3bb7bf6993d04ff313562bb3adff36 |
| MD5 | df3823166cc9b4c3911efcf1f22ba454 |
| BLAKE2b-256 | 1c62c5a3f30208577c2f4df4d7c576b39898614fa742d5347ee306249dc3ec7f |