spaCy pipeline for crfsuite entity extraction

Project description

spacy_crfsuite: crfsuite entity extraction for spaCy.

spacy_crfsuite is an entity extraction pipeline for spaCy based on crfsuite.

Install

Python

pip install spacy_crfsuite

Usage

spaCy usage

import os
import spacy

from spacy_crfsuite import CRFEntityExtractorFactory

# Create a blank English pipeline (no pretrained components)
nlp = spacy.blank('en')

# Will look for ``crf.pkl`` in current working dir
pipe = CRFEntityExtractorFactory(nlp, model_dir=os.getcwd())
nlp.add_pipe(pipe)

# Use CRF to extract entities
doc = nlp("given we launched L&M a couple of years ago")
for ent in doc.ents:
    print(ent.text, "--", ent.label_)
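The snippet above expects a trained model file inside model_dir. As a minimal sketch, assuming the file is named crf.pkl as the comment in the snippet suggests, a small guard can fail early with a clearer message before the pipeline is built. The helper name find_model_file is hypothetical and not part of spacy_crfsuite.

```python
from pathlib import Path


def find_model_file(model_dir, model_name="crf.pkl"):
    """Return the path to the model file, or raise if it is missing.

    `model_name` defaults to "crf.pkl", which is an assumption taken from
    the usage comment; this helper is illustrative, not a library API.
    """
    path = Path(model_dir) / model_name
    if not path.is_file():
        raise FileNotFoundError(
            f"No model file {model_name!r} found in {str(model_dir)!r}; "
            "train a model first."
        )
    return path
```

Calling find_model_file(os.getcwd()) before constructing the pipe would surface a missing model as a FileNotFoundError instead of a failure inside the pipeline.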

Train a model

python -m spacy_crfsuite.trainer train <TRAIN> --model-dir <MODEL_DIR> --model-name <MODEL_NAME>

Evaluate a model

python -m spacy_crfsuite.trainer eval <DEV> --model-dir <MODEL_DIR> --model-name <MODEL_NAME>

Gold annotations example (markdown)

## Header
- what is my balance <!-- no entity -->
- how much do I have on my [savings](source_account) <!-- entity "source_account" has value "savings" -->
- how much do I have on my [savings account](source_account:savings) <!-- synonyms, method 1-->
- Could I pay in [yen](currency)?  <!-- entity matched by lookup table -->
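To make the annotation syntax concrete, here is a minimal sketch of how one annotated line could be turned into plain text plus character-offset entities. It is not the parser used by spacy_crfsuite; the function name parse_example and the output dictionary keys are assumptions chosen for illustration. It handles only the two forms shown above, [text](label) and [text](label:value).

```python
import re

# Matches [text](label) or [text](label:value), as in the examples above.
ANNOTATION = re.compile(
    r"\[(?P<text>[^\]]+)\]\((?P<label>[^:)]+)(?::(?P<value>[^)]+))?\)"
)
COMMENT = re.compile(r"<!--.*?-->")


def parse_example(line):
    """Return (plain_text, entities) for one annotated list item.

    Illustrative only: the entity dict layout is an assumption, not the
    library's internal format.
    """
    # Drop HTML comments and the leading "- " bullet.
    line = COMMENT.sub("", line).strip()
    if line.startswith("- "):
        line = line[2:]

    pieces, entities = [], []
    cursor = 0   # position in the annotated line
    offset = 0   # length of the plain text built so far
    for match in ANNOTATION.finditer(line):
        before = line[cursor:match.start()]
        pieces.append(before)
        offset += len(before)

        text = match.group("text")
        entities.append({
            "start": offset,
            "end": offset + len(text),
            "entity": match.group("label"),
            # With the label:value form, the value is the synonym target;
            # otherwise the value is the annotated text itself.
            "value": match.group("value") or text,
        })
        pieces.append(text)
        offset += len(text)
        cursor = match.end()

    pieces.append(line[cursor:])
    return "".join(pieces), entities
```

For the synonym example above, the plain text becomes "how much do I have on my savings account", and the single entity covers the span "savings account" with label source_account and value savings.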


Download files

Source Distribution

spacy_crfsuite-0.1.0.tar.gz (12.8 kB view details)

Uploaded Source

Built Distribution

spacy_crfsuite-0.1.0-py3-none-any.whl (14.1 kB view details)

Uploaded Python 3

File details

Details for the file spacy_crfsuite-0.1.0.tar.gz.

File metadata

  • Download URL: spacy_crfsuite-0.1.0.tar.gz
  • Upload date:
  • Size: 12.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.6.9

File hashes

Hashes for spacy_crfsuite-0.1.0.tar.gz
Algorithm Hash digest
SHA256 2dc50e7d90b4d6df6da04647aeb02984fc934f75427fa074731c6c78a44a671e
MD5 4c98d90aa33c33dc2450e27ca8f9aee0
BLAKE2b-256 d96eefa05fa497d3e95c5edfad2a2282d10927074992d305bbeb43eb911698b3

File details

Details for the file spacy_crfsuite-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: spacy_crfsuite-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 14.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.6.9

File hashes

Hashes for spacy_crfsuite-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 c88647cd51331977cce0ebb36473fdd69f3bb7bf6993d04ff313562bb3adff36
MD5 df3823166cc9b4c3911efcf1f22ba454
BLAKE2b-256 1c62c5a3f30208577c2f4df4d7c576b39898614fa742d5347ee306249dc3ec7f
