Converts English tokens into the equivalent Sinhala representation using IPA (International Phonetic Alphabet)
Project description
SEETM 0.0.1a1 Release
SEETM (Sinhala-English Equivalent Token Mapper) lets you create maps of equivalent tokens and replace every mapped token with a single base token. This avoids out-of-vocabulary (OOV) tokens and produces one feature for all equivalent tokens in a Sinhala-English code-switching dataset used by rasa-based conversational AIs.
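The idea of mapping equivalent tokens to a base token can be sketched in a few lines of plain Python. This is only an illustration, not SEETM's implementation: the `TOKEN_MAP` contents and the `build_lookup` and `map_tokens` helpers are hypothetical names chosen for this example, and the tokenization is a simple whitespace split.

```python
from typing import Dict, List

# Hypothetical token map: one base token -> its equivalent spellings.
# The entries are illustrative and not taken from SEETM.
TOKEN_MAP: Dict[str, List[str]] = {
    "බැංකුව": ["bank", "benkuwa", "bankuwa"],
}


def build_lookup(token_map: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the map so each equivalent token points at its base token."""
    lookup: Dict[str, str] = {}
    for base, equivalents in token_map.items():
        for token in equivalents:
            lookup[token.lower()] = base
    return lookup


def map_tokens(text: str, lookup: Dict[str, str]) -> List[str]:
    """Split on whitespace and replace every mapped token with its base token."""
    return [lookup.get(token.lower(), token) for token in text.split()]


lookup = build_lookup(TOKEN_MAP)
print(map_tokens("mage bank account eka", lookup))
```

Because every spelling collapses to the same base token, a downstream featurizer sees one vocabulary entry instead of several rare ones.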
Features
- Allows mapping multiple equivalent tokens into a base token
- Fully supports rasa 2.8.x projects
- Provides an easy-to-use CLI
- Provides an efficient server-based GUI
- Provides a fully-functional custom whitespace tokenizer
- Fully supports Sinhala in the GUI
What's Cooking?
- Mapping suggestions in the SEETM server GUI
- Automatically generated mappings
Limitations and Known Issues
- The SEETM tokenizer must be added to the rasa pipeline manually; otherwise the token maps have no effect
- IPA-based suggestions may vary slightly depending on the origin of the IPA mapping (SEETM uses CMU)
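To illustrate why the mapping origin matters, here is a minimal sketch of turning a CMU-style ARPAbet pronunciation into IPA. It is not how SEETM or the eng-to-ipa package does it: the `ARPABET_TO_IPA` table is a deliberately tiny, hypothetical subset, and the treatment of unstressed `AH` as a schwa is one common convention that other mappings may handle differently.

```python
# Hypothetical, partial ARPAbet-to-IPA table (assumption for illustration);
# a real converter covers the full phoneme set.
ARPABET_TO_IPA = {
    "HH": "h", "L": "l", "K": "k", "T": "t",
    "OW": "oʊ", "AE": "æ", "AH": "ʌ",
}


def arpabet_to_ipa(pronunciation: str) -> str:
    """Convert a space-separated ARPAbet string such as 'HH AH0 L OW1' to IPA."""
    symbols = []
    for phoneme in pronunciation.split():
        stress = phoneme[-1] if phoneme[-1].isdigit() else ""
        base = phoneme.rstrip("012")
        if base == "AH" and stress == "0":
            # Convention used here: unstressed AH is rendered as a schwa.
            symbols.append("ə")
        else:
            symbols.append(ARPABET_TO_IPA[base])
    return "".join(symbols)


print(arpabet_to_ipa("HH AH0 L OW1"))
print(arpabet_to_ipa("K AE1 T"))
```

A mapping that ignores stress digits would render the first vowel differently, which is the kind of slight variation mentioned above.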
Resources and References
- CMU Pronunciation Dictionary
- eng-to-ipa pip package (GitHub)
📒 Docs: https://seetm.github.io
📦 PyPi: https://pypi.org/project/seetn/0.0.1a1/
🪵 Full Changelog: Refer to the relevant GitHub branch (v0.0.1a1)
Download files
Source Distribution
seetm-0.0.1a2.tar.gz (42.5 kB)
Built Distribution
seetm-0.0.1a2-py3-none-any.whl (990.2 kB)
File details
Details for the file seetm-0.0.1a2.tar.gz.
File metadata
- Download URL: seetm-0.0.1a2.tar.gz
- Upload date:
- Size: 42.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.8.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 614806b604c7653a0a24cde7d5a8c8ccd59ebd0963b1b39db3248a393cafa651
MD5 | f74c0a6fcc6fa10bf876afc263617a95
BLAKE2b-256 | 5505fa2940f86ccc6bea13d9f3ae662d3549d4abf1edd378aaf9514797333598
File details
Details for the file seetm-0.0.1a2-py3-none-any.whl.
File metadata
- Download URL: seetm-0.0.1a2-py3-none-any.whl
- Upload date:
- Size: 990.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.8.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2627a52d89e97d2310ed6c29e14a574c4a996e21aac3c00277e502165dff498f
MD5 | 3249a8fb4e0f616d732caa185ade4a0f
BLAKE2b-256 | 52415b66b04ac1fa20497cbb7d7b22d4268f812821afb0eebef9e4e3d9e7c155
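
A downloaded file can be checked against the digests listed above with Python's standard `hashlib` module. The sketch below is one possible way to do it; the helper names `file_digests` and `matches_sha256` are made up for this example, and BLAKE2b-256 is assumed to mean BLAKE2b with a 32-byte digest.

```python
import hashlib
from typing import Dict


def file_digests(path: str, chunk_size: int = 65536) -> Dict[str, str]:
    """Compute SHA256, MD5 and BLAKE2b-256 hex digests of a file in one pass."""
    hashers = {
        "sha256": hashlib.sha256(),
        "md5": hashlib.md5(),
        # Assumption: BLAKE2b-256 is BLAKE2b with a 32-byte (256-bit) digest.
        "blake2b_256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Return True when the file's SHA256 digest equals the published one."""
    return file_digests(path)["sha256"] == expected_hex.strip().lower()
```

For example, calling `matches_sha256("seetm-0.0.1a2.tar.gz", "<published SHA256>")` with the digest copied from the table for that file should return `True` for an intact download.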