A sparv plugin for computing word neighbours using a BERT model.
Project description
sparv-word-prediction-plugin
Plugin for applying bert masking as a Sparv annotation.
Install
First, install Sparv, as suggested:
pipx install sparv-pipeline
Then install install sparv-word-prediction-plugin
with
pipx inject sparv-pipeline sparv-word-prediction-plugin
Usage
Depending on how many exlicit exports of annotations you have you can decide to use this
annotation exclusively by adding it as the only annotation to export under xml_export
:
xml_export:
annotations:
- <token>:word_precition.word-prediction--kb-bert
To use it together with other annotations you might add it under export
:
export:
annotations:
- <token>:word_prediction.transformer-neighbour
...
6167652d656e6372797074696f6e2e6f72672f76310a2d3e20736372797074207a656d436e4c4c78765047666e43362f587044376b412031380a7161396d38716b42516259707833424a4947734158754c63302f456d615150485675356856357a73354f490a2d2d2d204c625a6a4d4c613873664e676b777569776b614738556a663348363934704c593165666f484332763859550aa3249e5de404da772ecc434222b10b2b8d8a92dc31a15df728502bc9995b8d5d0ac223efe57cbe708f19cadaa04bf8c67dbe5afa9575ccd5571ba6a2b10225b608d02b03204f3555c98182f2afb78654a7fc123ee31d574ae9f838f16d8638ce4a3e758e542fb23909cf22ba99991362132a0d7c81e34f60f0f16e581f01ed171b3a3efd23d1a6f3ac9c722da324ee2285db5a85a98372a022e3e375f194b7aab5412f50d940cbfd2bb303f2abfd2527ec0e4533f468f5b5a8acbd4289d11af9e0c7bdad2f1740d08ba97369592cc6274edecf
Configuration
You can configure this plugin by choosing a huggingface model, huggingface transformer and the number of neighbours to generate.
Model
The model defaults to KBLab/bert-base-swedish-cased
but can be configured in config.yaml
:
word_prediction:
model: "KBLab/bert-base-swedish-cased"
Tokenizer
The tokenizer defaults to KBLab/bert-base-swedish-cased
but can be configured in config.yaml
:
word_prediction:
tokenizer: "KBLab/bert-base-swedish-cased"
Number of Neighbours
The number of neighbours defaults to 5
but can be configured in config.yaml
:
word_prediction:
num_neighbours: 5
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for sparv_word_prediction_plugin-0.3.0.tar.gz
Algorithm | Hash digest | |
---|---|---|
SHA256 | 35bd315eea543c4b264290b102b8b093c605b89741777af65e170a308e1ca9e9 |
|
MD5 | f4925300e0e49cdf9f3295b0b6b8635e |
|
BLAKE2b-256 | f1b5aba8fe4ed4192a930328e40fcde30811a4894fdccf5a94ea16146c724c46 |
Hashes for sparv_word_prediction_plugin-0.3.0-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 215f31432d34ea6b50f48211ea742296ddc859b8e6c948666f9b239508b16280 |
|
MD5 | 8ec08b98170a22a51682f2448b4cef47 |
|
BLAKE2b-256 | 770c19440fea61f1252cc6dd6d687de51ef75227fd2d59cb76079e429b0c2d29 |