Converts English tokens into the equivalent Sinhala representation using IPA (International Phonetic Alphabet)

Project description

SEETM (Sinhala-English Equivalent Token Mapper) lets you create maps of equivalent tokens and replace every token in a map with a single base token. This avoids out-of-vocabulary (OOV) tokens and produces one feature for all equivalent tokens in a Sinhala-English code-switching dataset used by Rasa-based conversational AIs.
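The idea of an equivalent token map can be illustrated with a minimal Python sketch. It is not SEETM's implementation or API: the map contents and the names `TOKEN_MAP`, `build_lookup`, and `map_tokens` are hypothetical, and plain whitespace splitting stands in for the tokenizer.

```python
from typing import Dict, List

# Hypothetical token map (not shipped with SEETM): each base token
# lists the tokens that should be treated as equivalent to it.
TOKEN_MAP: Dict[str, List[str]] = {
    "බැංකුව": ["bank", "bankuwa", "benkuwa"],
    "ගිණුම": ["account", "ginuma"],
}


def build_lookup(token_map: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the map so each equivalent token points to its base token."""
    lookup: Dict[str, str] = {}
    for base, equivalents in token_map.items():
        for token in equivalents:
            lookup[token.lower()] = base
    return lookup


def map_tokens(text: str, lookup: Dict[str, str]) -> List[str]:
    """Split on whitespace and replace any mapped token with its base token."""
    return [lookup.get(token.lower(), token) for token in text.split()]


lookup = build_lookup(TOKEN_MAP)
tokens = map_tokens("mage bank account eka", lookup)
print(tokens)
```

In this sketch, "bank", "bankuwa", and "benkuwa" all collapse to the same base token, so a downstream featurizer would see one vocabulary entry instead of three. Tokens without a mapping pass through unchanged.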

Features

  • Allows mapping multiple equivalent tokens into a base token
  • Fully supports Rasa 2.8.x projects
  • Provides an easy-to-use CLI
  • Provides an efficient server-based GUI
  • Provides a fully-functional custom whitespace tokenizer
  • Fully supports Sinhala in the GUI

What's Cooking?

  • Mapping suggestions in the SEETM server GUI
  • Automatically generated mappings

Limitations and Known Issues

  • The SEETM tokenizer must be added to the Rasa pipeline manually; otherwise the token maps have no effect
  • IPA-based suggestions may vary slightly depending on the origin of the IPA mapping (SEETM uses CMU)

Resources and References

📒 Docs: https://seetm.github.io
📦 PyPI: https://pypi.org/project/seetm/1.1.1/
🪵 Full Changelog: https://github.com/SEETM-NLP/seetm/blob/main/CHANGELOG.md


Download files

Source Distribution

seetm-1.1.1.tar.gz (964.8 kB)

Uploaded Source

Built Distribution

seetm-1.1.1-py3-none-any.whl (990.9 kB)

Uploaded Python 3

File details

Details for the file seetm-1.1.1.tar.gz.

File metadata

  • Download URL: seetm-1.1.1.tar.gz
  • Upload date:
  • Size: 964.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.13

File hashes

Hashes for seetm-1.1.1.tar.gz
Algorithm Hash digest
SHA256 2fc4767d0a197c11f95e7820ff3cbb5a5f5eec076e5b6cd3f4e00ea38a11d8d5
MD5 0448aa9d1901094ce5f3f0820c58e0a0
BLAKE2b-256 c2ab269f62ba42b6881bf733db7e6da88af9f57d4ea0869f8b339c7a44254209

File details

Details for the file seetm-1.1.1-py3-none-any.whl.

File metadata

  • Download URL: seetm-1.1.1-py3-none-any.whl
  • Upload date:
  • Size: 990.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.13

File hashes

Hashes for seetm-1.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 89585e74dc4b0d44c98654c8709442967861679373af72d3464fb038ea5fec5d
MD5 734a6cfaa7126d5fbc62e840757b6731
BLAKE2b-256 22263c9787637967becb38301d5f9cffdc67bff1e99740d6699c3c1d8e471ba2
