Converting universal tags to Apertium tags.
Project description
apertium2ud
Builds a mapping between the Apertium and Universal Dependencies tagsets, based on the information from the Apertium Wiki.
Loosely based on this code, hence the GPLv3 license.
To install, run
python -m pip install apertium2ud
The latest uploaded version is 0.0.4.
NB! The latest version on PyPI (yes, you can install the tool via pip) ships with the rules from the apertium-kir .udx file.
To build the machine-readable mapping, run
python apertium_wiki_parser.py
Apertium to Universal tags
>>> from apertium2ud.convert import a2ud
>>> tags = ["n", "pl", "acc"]
>>> a2ud(tags)
(['NOUN'], ['Number=Plur', 'Case=Acc'])
>>> tags_sophisticated = ["v", "tv", "ger", "nom", "cop", "aor", "p3", "pl"]
>>> a2ud(tags_sophisticated)
(['VERB', 'AUX'], ['Subcat=Tran', 'VerbForm=Vnoun', 'Case=Nom', 'Tense=Past', 'Person=3', 'Number=Plur'])
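In Apertium output, tags usually appear inline as `<n><pl><acc>` rather than as a Python list. The sketch below shows one way to get from such a string to the list shape used in the examples above. `split_analysis` is a hypothetical helper, not part of apertium2ud, and the sample analysis string is made up for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical helper (NOT part of apertium2ud): extracts the tag names
# from an Apertium-style analysis such as "кат<n><pl><acc>".
TAG_PATTERN = re.compile(r"<([^<>]+)>")


def split_analysis(analysis: str) -> Tuple[str, List[str]]:
    """Return (lemma, tags) for a single Apertium-style analysis string."""
    # Everything before the first "<" is treated as the lemma.
    lemma = analysis.split("<", 1)[0]
    # Every "<...>" group is treated as one tag, in order of appearance.
    tags = TAG_PATTERN.findall(analysis)
    return lemma, tags


# Illustrative input only; not taken from a real analyser run.
lemma, tags = split_analysis("кат<n><pl><acc>")
print(lemma, tags)
```

The resulting `tags` list has the same shape as the lists passed to `a2ud` above, so it could be handed to `a2ud(tags)` directly.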
Universal tags to Apertium
So far, the conversion is far from perfect:
Кыз NOUN {'Number[psor]=Sing', 'Number=Sing', 'Case=Nom', 'Person[psor]=3', 'Person=3'} ->
<px3sg><n><subj?nom?><sg><p3><px3sp>
досуна NOUN {'Number[psor]=Sing', 'Number=Sing', 'Person[psor]=3', 'Case=Dat', 'Person=3'} ->
<px3sg><n><sg><dat><p3><px3sp>
кат NOUN {'Case=Nom', 'Person=3', 'Number=Sing'} ->
<n><subj?nom?><sg><p3>
жазган VERB {'Aspect=Perf', 'Polarity=Pos', 'Number=Sing', 'Tense=Past', 'Person=3', 'Evident=Fh'} ->
<past3p><vblex?v?vbmod?><sg><aff><aor?past?pret?><perf><p3>
. PUNCT set() ->
<sent?apos?percent?clb?punct?>
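Some slots in the output above hold several names separated by `?`, such as `<subj?nom?>`. The sketch below assumes that these are alternative candidate tags for one slot (an interpretation, not something the package documents) and splits an output string into per-slot candidate lists. `parse_candidates` and `ambiguous_slots` are hypothetical helpers, not part of apertium2ud.

```python
import re
from typing import List

# Hypothetical helpers (NOT part of apertium2ud).
# Assumption: inside one "<...>" slot, names terminated by "?" are
# alternative candidates, e.g. "<subj?nom?>" means "subj or nom".
SLOT_PATTERN = re.compile(r"<([^<>]*)>")


def parse_candidates(tag_string: str) -> List[List[str]]:
    """Split an output string into one list of candidate tags per slot."""
    slots = []
    for slot in SLOT_PATTERN.findall(tag_string):
        # "subj?nom?" -> ["subj", "nom"]; "sg" -> ["sg"]
        options = [option for option in slot.split("?") if option]
        slots.append(options)
    return slots


def ambiguous_slots(tag_string: str) -> List[List[str]]:
    """Return only the slots that still have more than one candidate."""
    return [options for options in parse_candidates(tag_string) if len(options) > 1]


print(parse_candidates("<n><subj?nom?><sg><p3>"))
print(ambiguous_slots("<past3p><vblex?v?vbmod?><sg><aff><aor?past?pret?><perf><p3>"))
```

Listing the ambiguous slots this way makes it easier to see where the UD features alone do not determine a single Apertium tag.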
TODO
- Should sections, chunks and XML tags be added? No.
- Tests: Apertium -> UD -> Apertium, UD -> Apertium -> UD (sometimes losses are inevitable)
- Add the possibility to add rules based on a .udx file, which usually describes custom tags
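The round-trip tests mentioned in the list above could take roughly the following shape. This is a minimal sketch with toy stand-in converters: `toy_forward`, `toy_backward` and `round_trip_losses` are hypothetical names, the three-entry mapping is illustrative only, and `xyz` is a made-up tag. The real converters would be plugged in instead.

```python
from typing import Callable, List, Sequence

# Toy stand-ins for the real converters (NOT the apertium2ud API).
# The three pairs mirror examples shown earlier; everything else is omitted.
TOY_A2UD = {"pl": "Number=Plur", "sg": "Number=Sing", "acc": "Case=Acc"}
TOY_UD2A = {feature: tag for tag, feature in TOY_A2UD.items()}


def toy_forward(tags: Sequence[str]) -> List[str]:
    """Apertium tags -> UD features, silently dropping unknown tags."""
    return [TOY_A2UD[tag] for tag in tags if tag in TOY_A2UD]


def toy_backward(features: Sequence[str]) -> List[str]:
    """UD features -> Apertium tags, silently dropping unknown features."""
    return [TOY_UD2A[feature] for feature in features if feature in TOY_UD2A]


def round_trip_losses(
    tags: Sequence[str],
    forward: Callable[[Sequence[str]], List[str]],
    backward: Callable[[Sequence[str]], List[str]],
) -> List[str]:
    """Return the input tags that do not survive forward followed by backward."""
    restored = set(backward(forward(tags)))
    return [tag for tag in tags if tag not in restored]


print(round_trip_losses(["pl", "acc"], toy_forward, toy_backward))
print(round_trip_losses(["pl", "acc", "xyz"], toy_forward, toy_backward))
```

Reporting the lost tags instead of asserting strict equality fits the remark that some losses are inevitable: a test can then allow a known set of lossy tags and fail only on unexpected ones.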
How to cite
A citation is greatly appreciated if you use this work.
@misc{apertium2ud2023alekseev,
title = {{alexeyev/apertium2ud: mapping tagsets}},
year = {2023},
url = {https://github.com/alexeyev/apertium2ud}
}
Download files
Source Distribution
apertium2ud-0.0.7.tar.gz (24.2 kB)
Built Distribution
apertium2ud-0.0.7-py3-none-any.whl (23.7 kB)
File details
Details for the file apertium2ud-0.0.7.tar.gz.
File metadata
- Download URL: apertium2ud-0.0.7.tar.gz
- Upload date:
- Size: 24.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0a0aae24b3311324d6fadb16d9094a32959183e28f5c425003ce9ffe0d874a5f
MD5 | 38e7acd12a008dbc8586d49db0e35e4d
BLAKE2b-256 | 72751b97a636d2bb1eece963ba8dd40ccaadd2269ee116b80cbbc66543bc12e7
File details
Details for the file apertium2ud-0.0.7-py3-none-any.whl.
File metadata
- Download URL: apertium2ud-0.0.7-py3-none-any.whl
- Upload date:
- Size: 23.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 3d5d6d9a41d3a685caa1bf692e8d3362e6f2f8d0ca751fec7c551b51b310d6be
MD5 | 7d07df9a6d9689de208664b35d96c98b
BLAKE2b-256 | a8bcfa7abe4fea29e4e17b29e0dba2e41b383ca7c62b0879c893611dcb56abef