Sparv plugin for using wsd-rs with Sparv.
sparv-sbx-wsd-rs
This Sparv plugin is a rewrite of the internal module wsd that uses saldowsd-rs instead of saldowsd.jar.
The improvements over the internal module wsd are:
- Easier setup: the binary saldowsd is installed as a Python package from PyPI, instead of a manual install of saldowsd.jar.
- Faster analysis: saldowsd-rs is about 13% faster than saldowsd.jar.
- Uses less memory: saldowsd-rs uses about 35% less memory than saldowsd.jar.
Faster than using saldowsd.jar
The annotator sbx_wsd_rs, which uses saldowsd-rs, takes 12.8% less time than the Java version. See the results from
running both annotations on vivill on our server wombat.
[username@wombat vivill]$ sparv run --stats
Task Time taken Percentage
wsd:annotate 0:11:25 1.3%
sbx_wsd_rs:annotate 0:09:57 1.1%
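As a sanity check on the figure above, the following minimal sketch recomputes the relative reduction from the two timings reported by sparv run --stats. The helper names parse_hms and relative_reduction are illustrative only and are not part of Sparv or this plugin.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an H:MM:SS string as printed in the stats table."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def relative_reduction(baseline: float, candidate: float) -> float:
    """Return how much smaller candidate is than baseline, in percent."""
    return (baseline - candidate) / baseline * 100


# Timings copied from the stats output above.
java_time = parse_hms("0:11:25")  # wsd:annotate
rust_time = parse_hms("0:09:57")  # sbx_wsd_rs:annotate

reduction = relative_reduction(
    java_time.total_seconds(), rust_time.total_seconds()
)
print(f"{reduction:.1f}% less running time")
```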
Memory usage
Measured with heaptrack while loading the models and running a simple example (without Sparv). The Rust version uses about 35% less memory.
| Tool | Top-RSS |
|---|---|
| saldowsd (Rust) | 914 MB |
| saldowsd.jar (Java) | 1.4 GB |
An example of the output from Sparv can be seen here.
The annotations are probabilistic, so they always differ slightly (even wsd.sense differs from itself between runs).
Example of differences:
- anslag: |anslag..1:0.993|anslag..2:0.004|anslag..3:0.004|!=|anslag..1:0.992|anslag..3:0.004|anslag..2:0.004|
- avvikelse: |avvikelse..1:0.978|avvikelse..2:0.022|!=|avvikelse..1:0.977|avvikelse..2:0.023|
- utskottets: |utskott..2:0.835|utskott..3:0.109|utskott..1:0.056|!=|utskott..2:0.843|utskott..3:0.101|utskott..1:0.056|
- särskilda: |särskilja..1:0.587|särskild..1:0.413|!=|särskilja..1:0.589|särskild..1:0.411|
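To judge whether two runs differ only by this kind of noise, a sketch like the following may help. It assumes each annotation value is a pipe-delimited list of sense:probability pairs, as in the examples above. The helpers parse_senses and max_abs_diff are hypothetical and not part of Sparv or this plugin.

```python
def parse_senses(annotation: str) -> dict[str, float]:
    """Turn '|sense:prob|sense:prob|' into a {sense: prob} mapping."""
    senses: dict[str, float] = {}
    for item in annotation.strip("|").split("|"):
        if not item:
            continue  # empty annotation, nothing to parse
        # Split on the last colon so the sense id is kept intact.
        sense, _, probability = item.rpartition(":")
        senses[sense] = float(probability)
    return senses


def max_abs_diff(a: dict[str, float], b: dict[str, float]) -> float:
    """Largest absolute probability difference over all senses."""
    keys = a.keys() | b.keys()
    return max((abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys), default=0.0)


# The "utskottets" example from the list above.
run_a = parse_senses("|utskott..2:0.835|utskott..3:0.109|utskott..1:0.056|")
run_b = parse_senses("|utskott..2:0.843|utskott..3:0.101|utskott..1:0.056|")

diff = max_abs_diff(run_a, run_b)
same_ranking = sorted(run_a, key=run_a.get, reverse=True) == sorted(
    run_b, key=run_b.get, reverse=True
)
print(f"max difference: {diff:.3f}, same ranking: {same_ranking}")
```

Comparing the ranking separately from the probabilities is a design choice: in the anslag example the two lowest senses tie at 0.004 and swap places, so an exact string comparison reports a difference even though the top sense is unchanged.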
Changelog
This project keeps a changelog.
Minimum supported Python version
This library tries to support as many Python versions as possible. When a Python version is added or dropped, this library's minor version is bumped.
- v0.1.0: Python 3.11
License
This repository is licensed under the MIT license.
Development
Development prerequisites
To start developing on this repository:
- Clone the repo: git clone https://github.com/spraakbanken/sparv-sbx-wsd-rs.git
- Set up the environment: make dev
- Install pre-commit hooks: pre-commit install
Do your work.
Tasks to do:
- Test the code with make test or make test-w-coverage.
- Test the examples with make test-example-small.
- Lint the code with make lint.
- Check formatting with make check-fmt.
- Format the code with make fmt.
- Type-check the code with make type-check.
This repo uses conventional commits.
Release a new version
- Prepare the CHANGELOG: make prepare-release and then edit CHANGELOG.md.
- Add to git: git add CHANGELOG.md
- Commit with git commit -m 'chore(release): prepare release' or cog commit chore 'prepare release' release.
- Bump version (depends on bump-my-version)
  - Install with uv tool install bump-my-version
  - Major: make bumpversion part=major
  - Minor: make bumpversion part=minor
  - Patch: make bumpversion part=patch or make bumpversion
- Push main and tags to GitHub: git push main --tags or make publish
  - GitHub Actions will build, test and publish the package to PyPI.
- Add metadata for Språkbanken's resource
  - Generate metadata: make generate-metadata
  - Upload the files from examples/metadata/export/sbx_metadata/analysis to https://github.com/spraakbanken/metadata/tree/main/yaml/analysis.
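The three bumpversion parts follow the usual major.minor.patch scheme. As a rough illustration only (this is not how bump-my-version is implemented, and bump is a hypothetical helper), the effect of each part can be modeled as below. The default of patch mirrors running make bumpversion without a part.

```python
def bump(version: str, part: str = "patch") -> str:
    """Return the next version after bumping the given part."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"  # lower parts reset to zero
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


for part in ("major", "minor", "patch"):
    print(part, bump("0.1.1", part))
```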