Morphological Analyser for Tatar language
Project description
Morphological Parser of Tatar language. Uses HFST-tool. Web form which uses this tool: http://tatmorphan.pythonanywhere.com/
To install:
$ pip install py_tat_morphan
or
$ git clone https://yaugear@bitbucket.org/yaugear/py_tat_morphan.git
$ cd py_tat_morphan
$ python setup.pu install
To use lookup:
$ tat_morphan_lookup
To process text:
$ tat_morphan_process_text <filename>
To process whole folder:
$ tat_morphan_process_folder <path_from>
or
$ tat_morphan_process_folder <path_from> <path_to>
Note: if you do not provide <path_to>, programm puts analyzed texts into folder near initial with ‘_analyzed’ postfix. Eg, if <path_from>=’/home/ramil/mytexts/’, then <path_to>=’/home/ramil/mytexts_analyzed/’.
To use as python module:
>>> from py_tat_morphan.morphan import Morphan
>>> morphan = Morphan()
>>> print(morphan.analyse('урманнарга'))
урман+N+PL(ЛАр)+DIR(ГА);
>>> print(morphan.lemma('урманнарга'))
[u'\u0443\u0440\u043c\u0430\u043d']
>>> print(morphan.pos('урманнарга'))
[u'N']
>>> print(morphan.process_text('Без урманга барабыз.'))
Без
без+N+Sg+Nom;без+PN;
урманга
урман+N+Sg+DIR(ГА);
барабыз
бар+V+PRES(Й)+1PL(бЫз);
.
Type1
>>> print(morphan.analyse_text('Без урманга барабыз.'))
[[(u'\u0411\u0435\u0437', u'\u0431\u0435\u0437+N+Sg+Nom;\u0431\u0435\u0437+PN;'), (u'\u0443\u0440\u043c\u0430\u043d\u0433\u0430', u'\u0443\u0440\u043c\u0430\u043d+N+Sg+DIR(\u0413\u0410);'), (u'\u0431\u0430\u0440\u0430\u0431\u044b\u0437', u'\u0431\u0430\u0440+V+PRES(\u0419)+1PL(\u0431\u042b\u0437);'), (u'.', 'Type1')]]
>>> print(morphan.disambiguate_text('Язгы ташуларда көймә йөздерәбез.'))
[[(u'\u042f\u0437\u0433\u044b', u'\u044f\u0437\u0433\u044b+Adj;'), (u'\u0442\u0430\u0448\u0443\u043b\u0430\u0440\u0434\u0430', u'\u0442\u0430\u0448\u0443+N+PL(\u041b\u0410\u0440)+LOC(\u0414\u0410);\u0442\u0430\u0448\u044b+V+VN_1(\u0443/\u04af/\u0432)+PL(\u041b\u0410\u0440)+LOC(\u0414\u0410);'), (u'\u043a\u04e9\u0439\u043c\u04d9', u'\u043a\u04e9\u0439\u043c\u04d9+N+Sg+Nom'), (u'\u0439\u04e9\u0437\u0434\u0435\u0440\u04d9\u0431\u0435\u0437', u'\u0439\u04e9\u0437+V+CAUS(\u0414\u042b\u0440)+PRES(\u0419)+1PL(\u0431\u042b\u0437);\u0439\u04e9\u0437\u0434\u0435\u0440+V+PRES(\u0419)+1PL(\u0431\u042b\u0437);'), (u'.', 'Type1')]]
To test:
$ python setup.py test
For feedback:
Versions:
1.2.1 | Uses HFST python package
1.2.2 | Add tat_morphan_lookup and tat_morphan_process_text scripts to bin/
1.2.3 | Fixed exception dictionary
1.2.4 | Fixed to use C HFST package
1.2.5 | Fixed morphophonetic and morphotacktic rules
1.2.6 | Fixed dictionary collection
1.2.7 | Added morphological disambiguation stage using contextual rules methods
1.2.8 | Fixed bug with ‘-’
1.2.9 | Fixed exception dictionary
1.2.10 | Fixed bug with disambiguation
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file py_tat_morphan-1.2.10.tar.gz.
File metadata
- Download URL: py_tat_morphan-1.2.10.tar.gz
- Upload date:
- Size: 3.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d4dd7e4a415976035c5db7e55f24b52c05d5d85ed7f268f5cd4f9b0a04283962
|
|
| MD5 |
a71eb2622a323375131a60b686e733ed
|
|
| BLAKE2b-256 |
66cdded925bfe6893ce6d54d28577aaff3a6535694e30bc810ad809ed21cb177
|