A Sparv plugin for computing word neighbours using a BERT model.

sparv-word-prediction-plugin

Plugin for applying BERT masking as a Sparv annotation.

Install

First, install Sparv, as suggested:

pipx install sparv-pipeline

Then install sparv-word-prediction-plugin with

pipx inject sparv-pipeline sparv-word-prediction-plugin

Usage

Depending on how many annotations you export explicitly, you can use this annotation exclusively by listing it as the only annotation under xml_export:

xml_export:
    annotations:
        - <token>:word_prediction.word-prediction--kb-bert

To use it together with other annotations, add it under export:

export:
    annotations:
        - <token>:word_prediction.transformer-neighbour
        ...

Configuration

You can configure this plugin by choosing a Hugging Face model, a Hugging Face tokenizer and the number of neighbours to generate.

Model

The model defaults to KBLab/bert-base-swedish-cased but can be configured in config.yaml:

word_prediction:
    model: "KBLab/bert-base-swedish-cased"

Tokenizer

The tokenizer defaults to KBLab/bert-base-swedish-cased but can be configured in config.yaml:

word_prediction:
    tokenizer: "KBLab/bert-base-swedish-cased"

Number of Neighbours

The number of neighbours defaults to 5 but can be configured in config.yaml:

word_prediction:
    num_neighbours: 5
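
YAML does not allow the same top-level key to appear twice in one file, so when you set more than one of these options they belong under a single word_prediction key. A minimal sketch of such a combined section in config.yaml, using only the keys and default values described above (swap in your own model and tokenizer names as needed):

```yaml
# Combined plugin settings in config.yaml (the values shown are the documented defaults)
word_prediction:
    model: "KBLab/bert-base-swedish-cased"      # Hugging Face model
    tokenizer: "KBLab/bert-base-swedish-cased"  # Hugging Face tokenizer
    num_neighbours: 5                           # number of neighbours to generate
```

Any option left out falls back to its default, so the section only needs to list the settings you want to change.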

Source Distribution

sparv_word_prediction_plugin-0.3.0.tar.gz (104.0 kB)
