spaCy pipeline for crfsuite entity extraction

Project description

spacy_crfsuite: crfsuite entity extraction for spaCy.

spacy_crfsuite is an entity extraction pipeline for spaCy based on crfsuite.

Install

Python

pip install spacy_crfsuite

Usage

spaCy usage

import os
import spacy

from spacy_crfsuite import CRFEntityExtractorFactory

# load spacy language model
nlp = spacy.blank('en')

# Will look for ``crf.pkl`` in current working dir
pipe = CRFEntityExtractorFactory(nlp, model_dir=os.getcwd())
nlp.add_pipe(pipe)

# Use CRF to extract entities
doc = nlp("given we launched L&M a couple of years ago")
for ent in doc.ents:
    print(ent.text, "--", ent.label_)

Train a model

python -m spacy_crfsuite.trainer train <TRAIN> --model-dir <MODEL_DIR> --model-name <MODEL_NAME>

Evaluate a model

python -m spacy_crfsuite.trainer eval <DEV> --model-dir <MODEL_DIR> --model-name <MODEL_NAME>
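The train and eval commands share the same shape, so they are easy to script. Below is a minimal sketch that only assembles the command line shown above. The helper name `trainer_command` and the example paths and model name are hypothetical and not part of spacy_crfsuite.

```python
import shlex
import sys


def trainer_command(action, data_path, model_dir, model_name):
    """Build the argv list for `python -m spacy_crfsuite.trainer`.

    `action` is "train" or "eval", matching the two commands above.
    This helper is illustrative; it is not provided by spacy_crfsuite.
    """
    if action not in ("train", "eval"):
        raise ValueError(f"unknown action: {action!r}")
    return [
        sys.executable, "-m", "spacy_crfsuite.trainer", action, data_path,
        "--model-dir", model_dir,
        "--model-name", model_name,
    ]


# Hypothetical paths and model name, for illustration only.
cmd = trainer_command("train", "train.md", "models", "crf.pkl")
print(shlex.join(cmd))
```

To actually run the command, the list can be passed to `subprocess.run(cmd, check=True)`, which raises an error if the trainer exits with a non-zero status.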

Gold annotations example (markdown)

## Header
- what is my balance <!-- no entity -->
- how much do I have on my [savings](source_account) <!-- entity "source_account" has value "savings" -->
- how much do I have on my [savings account](source_account:savings) <!-- synonyms, method 1-->
- Could I pay in [yen](currency)?  <!-- entity matched by lookup table -->
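
To make the annotation syntax concrete, here is a minimal sketch that reads one annotated list item and returns the plain text with character-offset entity spans. It only illustrates the `[text](entity)` and `[text](entity:value)` forms shown above. It is not the parser spacy_crfsuite uses, and the name `parse_example` is hypothetical.

```python
import re

# Matches [surface text](entity) or [surface text](entity:value).
ENTITY_PATTERN = re.compile(r"\[([^\]]+)\]\(([^:)]+)(?::([^)]+))?\)")
# Matches trailing HTML comments such as <!-- no entity -->.
COMMENT_PATTERN = re.compile(r"<!--.*?-->")


def parse_example(line):
    """Return (plain_text, entities) for one annotated list item."""
    line = COMMENT_PATTERN.sub("", line).strip()
    if line.startswith("- "):
        line = line[2:]

    parts = []      # pieces of the plain text, in order
    entities = []   # one dict per annotated span
    cursor = 0      # position in the annotated line
    offset = 0      # length of the plain text emitted so far

    for match in ENTITY_PATTERN.finditer(line):
        before = line[cursor:match.start()]
        parts.append(before)
        offset += len(before)

        text, entity, value = match.group(1), match.group(2), match.group(3)
        entities.append({
            "start": offset,
            "end": offset + len(text),
            "entity": entity,
            # Without an explicit value, the surface text is the value.
            "value": value or text,
        })
        parts.append(text)
        offset += len(text)
        cursor = match.end()

    parts.append(line[cursor:])
    return "".join(parts), entities


text, entities = parse_example(
    "- how much do I have on my [savings account](source_account:savings)"
)
print(text)
print(entities)
```

Under this reading, the synonym form keeps "savings account" as the span in the text while recording "savings" as the entity value.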

Download files

Source Distribution

spacy_crfsuite-0.1.1.tar.gz (13.6 kB)

Built Distribution

spacy_crfsuite-0.1.1-py3-none-any.whl (15.3 kB, Python 3)
