
A Sparv plugin for computing word neighbours using a BERT model.


sparv-word-prediction-plugin


Plugin for applying BERT masking as a Sparv annotation.

Install

First, install Sparv, as suggested:

pipx install sparv-pipeline

Then install sparv-word-prediction-plugin with:

pipx inject sparv-pipeline sparv-word-prediction-plugin

Usage

Depending on how many explicit annotation exports you have, you can use this annotation exclusively by adding it as the only annotation to export under xml_export:

xml_export:
    annotations:
        - <token>:word_prediction.word-prediction--kb-bert

To use it together with other annotations, add it under export:

export:
    annotations:
        - <token>:word_prediction.transformer-neighbour
        ...


Configuration

You can configure this plugin by choosing a Hugging Face model, a Hugging Face tokenizer, and the number of neighbours to generate.

Model

The model defaults to KBLab/bert-base-swedish-cased but can be configured in config.yaml:

word_prediction:
    model: "KBLab/bert-base-swedish-cased"

Tokenizer

The tokenizer defaults to KBLab/bert-base-swedish-cased but can be configured in config.yaml:

word_prediction:
    tokenizer: "KBLab/bert-base-swedish-cased"

Number of Neighbours

The number of neighbours defaults to 5 but can be configured in config.yaml:

word_prediction:
    num_neighbours: 5
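
The three options above all live under the same word_prediction key, so they can be set together. The following is a minimal sketch of such a combined block in config.yaml; it assumes the options are combined in a single mapping (YAML does not allow the word_prediction key to be repeated at the top level) and simply restates the documented defaults:

```yaml
# config.yaml (sketch): all three word_prediction options in one block.
# The values shown are the defaults documented above; change only what you need.
word_prediction:
    model: "KBLab/bert-base-swedish-cased"
    tokenizer: "KBLab/bert-base-swedish-cased"
    num_neighbours: 5
```

If you switch to a different model, it is usually safest to set the tokenizer to the matching one, since a tokenizer from another model may not share the same vocabulary.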
