Skip to main content

WFST for Ukrainian Inverse Text Normalization (ITN) based on NVIDIA NeMo and Pynini

Project description

WFST for Ukrainian ITN

Simple WFST for Ukrainian ITN based on NVIDIA NeMo and Pynini

Usage

from ukr.wfst import graph, apply_fst_text

apply_fst_text("це трапилося дві тисячі девятнадцятого числа", graph)  # це трапилося 2019 числа
apply_fst_text("мінус пять цілих одна десята відсотка", graph)  # -5.1 %
apply_fst_text("двадцять дві тисячі сто один", graph)  # 22101

How it works

We have two king of FST: taggers and verbalizers

This is a tagger:

from ukr.wfst import tMeasureFst, apply_fst_text

apply_fst_text("мінус пять цілих одна десята відсотка", tMeasureFst)  

will return "measure { decimal { negative: "true" integer_part: "5" fractional_part: "1" } units: "%" }"

And this is a verbalizers

from ukr.wfst import vMeasureFst, apply_fst_text

apply_fst_text('measure { decimal { negative: "true" integer_part: "5" fractional_part: "1" } units: "%" }', vMeasureFst)  

will return -5.1 %

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ukr_itn-0.1.1.tar.gz (12.0 kB view hashes)

Uploaded Source

Built Distribution

ukr_itn-0.1.1-py3-none-any.whl (19.9 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page