Heritage Connector NLP

heritage-connector-nlp

Text processing for the Heritage Connector: a set of NLP utilities for the Heritage sector.

--- IN DEVELOPMENT ---

(Note about spaCy: the master branch and all releases after 0.2.1 use spaCy v3, which is currently available only as a nightly release and is not meant for production use.)

Includes:

  • information extraction (NER, NEL, relation classification)
  • labelling (Label Studio)
  • test suite for models

Usage

Label Studio

Setting up (first time):

  1. Run label-studio start labelling --init, which will start up Label Studio and take you to a configuration wizard.
  2. Select Named Entity Recognition from the top menu, and fill in the entity types you want to annotate.

Running: Run label-studio start labelling from the root directory.

Useful parameters:

  • --sampling=uniform: have Label Studio show documents in a random order
  • --label-config label_studio_config_sample.xml: load config from a file
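
As a rough illustration of what a config file for --label-config might contain, the sketch below generates a Label Studio labelling config for Named Entity Recognition from a list of entity types. It follows Label Studio's standard NER template (a Labels block attached to a Text block). The entity types, the function name build_ner_config and the assumption that each task stores its document under a "text" key are all hypothetical; the label_studio_config_sample.xml shipped with this repository may look different.

```python
import xml.etree.ElementTree as ET


def build_ner_config(entity_types):
    """Return a Label Studio NER labelling config as an XML string.

    Assumes each task stores its document under the "text" key,
    which is why the Text element reads its value from "$text".
    """
    view = ET.Element("View")
    # One <Label> per entity type, all attached to the text block below.
    labels = ET.SubElement(view, "Labels", name="label", toName="text")
    for entity_type in entity_types:
        ET.SubElement(labels, "Label", value=entity_type)
    ET.SubElement(view, "Text", name="text", value="$text")
    return ET.tostring(view, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical entity types; replace with the ones you want to annotate.
    print(build_ner_config(["PERSON", "ORG", "LOC", "DATE"]))
```

Saving the printed XML to a file and passing that file to --label-config should be equivalent to filling in the entity types by hand in the configuration wizard, assuming the tasks use the same "text" key.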
