WFST for Ukrainian Inverse Text Normalization (ITN) based on NVIDIA NeMo and Pynini

WFST for Ukrainian ITN

A simple weighted finite-state transducer (WFST) for Ukrainian inverse text normalization (ITN), based on NVIDIA NeMo and Pynini.

Installation

pip install ukr-itn

Usage

from ukr.wfst import graph, apply_fst_text

apply_fst_text("це трапилося дві тисячі дев'ятнадцятого числа", graph)  # це трапилося 2019 числа
apply_fst_text("мінус пять цілих одна десята відсотка", graph)  # -5.1 %
apply_fst_text("двадцять дві тисячі сто один", graph)  # 22101

JSON output

For more advanced usage, you can get JSON output in which each token is labeled with its type:

from ukr.wfst import json_graph, apply_fst_text

apply_fst_text("це трапилося дві тисячі дев'ятнадцятого числа", json_graph)
# >>> '[{"word": "це"}, {"word": "трапилося"}, {"ordinal": "2019"}, {"word": "числа"}]' 
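The returned value is a JSON string, so it can be decoded with the standard library. The sketch below works on the sample string from the example above rather than calling the package, and it assumes every token is a one-key object with a string value, which is only confirmed for this one sample. The helper name `split_tokens` is hypothetical and not part of `ukr-itn`.

```python
import json

# Sample output copied from the example above. In practice this would be
# the string returned by apply_fst_text(text, json_graph).
raw = '[{"word": "це"}, {"word": "трапилося"}, {"ordinal": "2019"}, {"word": "числа"}]'


def split_tokens(raw_json):
    """Return (kind, value) pairs.

    Assumption: each token is an object with exactly one key whose value
    is a string, as in the sample above. Other token kinds may differ.
    """
    pairs = []
    for token in json.loads(raw_json):
        for kind, value in token.items():
            pairs.append((kind, value))
    return pairs


tokens = split_tokens(raw)

# Keep only the tokens that the normalizer actually rewrote.
normalized = [value for kind, value in tokens if kind != "word"]

# Rebuild the plain written-form sentence from the token values.
text = " ".join(value for _, value in tokens)
```

Here `normalized` holds only the rewritten tokens and `text` is the sentence rebuilt from all token values, which is handy when you need to know which spans were changed.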

How it works

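There are two kinds of FST: taggers and verbalizers. A tagger classifies spoken-form text into tagged tokens, and a verbalizer turns those tagged tokens into written form.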

This is a tagger:

from ukr.wfst import classifyFst, apply_fst_text

apply_fst_text("мінус пять цілих одна десята відсотка", classifyFst.fst)  

This returns: measure { decimal { negative: "true" integer_part: "5" fractional_part: "1" } units: "%" }

And this is a verbalizer:

from ukr.wfst import verbalizeFinalFst, apply_fst_text

apply_fst_text('measure { decimal { negative: "true" integer_part: "5" fractional_part: "1" } units: "%" }', verbalizeFinalFst.fst)  

This returns: -5.1 %
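To make the second stage concrete, here is a plain-Python stand-in for what a verbalizer does with the tagged string above. It is not how the package works internally (the package uses FSTs built with Pynini); it is a toy sketch that handles only this decimal-measure case. The function names `parse_fields` and `verbalize_measure` are hypothetical.

```python
import re

# Tagged string copied from the tagger example above.
TAGGED = 'measure { decimal { negative: "true" integer_part: "5" fractional_part: "1" } units: "%" }'


def parse_fields(tagged):
    """Collect every key: "value" pair, ignoring the nesting braces."""
    return dict(re.findall(r'(\w+):\s*"([^"]*)"', tagged))


def verbalize_measure(fields):
    """Toy verbalizer for the decimal-measure case only (an illustration)."""
    sign = "-" if fields.get("negative") == "true" else ""
    number = fields["integer_part"]
    if "fractional_part" in fields:
        number += "." + fields["fractional_part"]
    units = fields.get("units")
    return f"{sign}{number} {units}" if units else f"{sign}{number}"


fields = parse_fields(TAGGED)
written = verbalize_measure(fields)
```

The tagger decides what kind of token it is looking at and extracts its parts; the verbalizer only has to format those parts, which keeps the two stages independent.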
