WFST for Ukrainian Inverse Text Normalization (ITN) based on NVIDIA NeMo and Pynini
WFST for Ukrainian ITN
A simple weighted finite-state transducer (WFST) for Ukrainian inverse text normalization (ITN), based on NVIDIA NeMo and Pynini.
Usage
from ukr.wfst import graph, apply_fst_text
apply_fst_text("це трапилося дві тисячі девятнадцятого числа", graph) # це трапилося 2019 числа
apply_fst_text("мінус пять цілих одна десята відсотка", graph) # -5.1 %
apply_fst_text("двадцять дві тисячі сто один", graph) # 22101
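To make the arithmetic behind the last call concrete, here is a minimal plain-Python sketch of how the spoken number maps to 22101. It is not the WFST grammar used by this package: `words_to_number`, `UNITS` and `MULTIPLIERS` are hypothetical names introduced only for this illustration, and the vocabulary covers just the words in that one example.

```python
# Hypothetical helper for illustration only; not part of ukr.wfst.
# The vocabulary is deliberately limited to the words of one example.
UNITS = {"один": 1, "дві": 2, "двадцять": 20, "сто": 100}
MULTIPLIERS = {"тисячі": 1000}


def words_to_number(text: str) -> int:
    total = 0    # value of the groups already closed by a multiplier
    current = 0  # value of the group currently being read
    for word in text.split():
        if word in MULTIPLIERS:
            # "двадцять дві тисячі" -> (20 + 2) * 1000
            total += max(current, 1) * MULTIPLIERS[word]
            current = 0
        else:
            current += UNITS[word]  # raises KeyError on unknown words
    return total + current


print(words_to_number("двадцять дві тисячі сто один"))  # 22101
```

A transducer encodes the same mapping as a graph rather than a loop, which lets it be composed with the other grammars (dates, measures and so on).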
How it works
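There are two kinds of FST: taggers and verbalizers. A tagger turns spoken-form text into a tagged, structured string; a verbalizer turns that tagged string into the written form.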
This is a tagger:
from ukr.wfst import tMeasureFst, apply_fst_text
apply_fst_text("мінус пять цілих одна десята відсотка", tMeasureFst)
will return 'measure { decimal { negative: "true" integer_part: "5" fractional_part: "1" } units: "%" }'
And this is a verbalizer:
from ukr.wfst import vMeasureFst, apply_fst_text
apply_fst_text('measure { decimal { negative: "true" integer_part: "5" fractional_part: "1" } units: "%" }', vMeasureFst)
will return -5.1 %
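The verbalizer step can be pictured with a small plain-Python sketch that reads the tagged string and renders the written form. This is only an illustration of the data flow, assuming the tag format shown above; the package does this with a Pynini FST, not with regular expressions, and `parse_fields` and `verbalize_measure` are hypothetical names that do not exist in `ukr.wfst`.

```python
import re


def parse_fields(tagged: str) -> dict:
    """Collect every key: "value" pair from a tagged string, ignoring nesting."""
    return dict(re.findall(r'(\w+): "([^"]*)"', tagged))


def verbalize_measure(tagged: str) -> str:
    """Render a tagged measure as written text (illustration only)."""
    fields = parse_fields(tagged)
    sign = "-" if fields.get("negative") == "true" else ""
    number = fields["integer_part"]
    if "fractional_part" in fields:
        number += "." + fields["fractional_part"]
    return f"{sign}{number} {fields['units']}"


tagged = 'measure { decimal { negative: "true" integer_part: "5" fractional_part: "1" } units: "%" }'
print(verbalize_measure(tagged))  # -5.1 %
```

Splitting the work this way keeps the language-specific parsing in the tagger, while the verbalizer only decides how each tagged field is written out.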