Annotated morphology in the world's languages
UniMorph: The Universal Morphology Initiative
The Universal Morphology (UniMorph) project is a collaborative effort to improve how NLP handles complex morphology in the world’s languages. The goal of UniMorph is to annotate morphological data in a universal schema that allows an inflected word from any language to be defined by its lexical meaning, typically carried by the lemma, and by a rendering of its inflectional form in terms of a bundle of morphological features from our schema. The specification of the schema is described in Sylak-Glassman (2016).
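The data model described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's own code: the Entry class, parse_features, and parse_line are hypothetical names, the tab-separated lemma/form/features layout is an assumption about how a record might be stored, and the sample record is written by hand rather than copied from the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One inflected word: its lemma, surface form, and feature bundle."""
    lemma: str
    form: str
    features: frozenset


def parse_features(bundle: str) -> frozenset:
    """Split a semicolon-separated bundle such as 'V;IND;PRS;2;SG' into tags."""
    return frozenset(tag for tag in bundle.split(";") if tag)


def parse_line(line: str) -> Entry:
    """Parse one tab-separated record: lemma, form, features (assumed order)."""
    lemma, form, bundle = line.rstrip("\n").split("\t")
    return Entry(lemma, form, parse_features(bundle))


# Illustrative record, written by hand rather than copied from the dataset.
entry = parse_line("recken\treckst\tV;IND;PRS;2;SG")
print(entry.lemma, entry.form, sorted(entry.features))
```

Storing the bundle as a set makes comparisons independent of the order in which the features are written.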
This tool provides turnkey command-line access to morphological annotations in over 100 languages.
Install the UniMorph Python package from PyPI:
pip3 install unimorph
The tool will then be available to you from the command line as unimorph. To see the features available, run unimorph --help.
Usage
List the ISO 639-3 codes of the available UniMorph languages.
unimorph list
Give the complete paradigm for a lemma.
unimorph inflect --word recken --lang deu
Get a particular form of the lemma. Quote the feature bundle, since an unquoted ; is a command separator in most shells.
unimorph inflect --word recken --features "V;IND;PRS;2;SG" --lang deu
Analyze a word form: What are its lemma and features?
unimorph analyze --word gereckt --lang deu
(You can also use short parameter names.)
unimorph analyze -w gereckt -l deu
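The commands above can also be driven from a script. The following is a rough sketch that uses only the subcommands and long flags shown in this section; build_command and run are hypothetical helper names, and the output format of the tool is not assumed.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(action: str, word: str, lang: str,
                  features: Optional[str] = None) -> List[str]:
    """Build an argv list using only the subcommands and flags shown above."""
    if action not in ("inflect", "analyze"):
        raise ValueError(f"unsupported action: {action}")
    argv = ["unimorph", action, "--word", word, "--lang", lang]
    if features is not None:
        argv += ["--features", features]
    return argv


def run(argv: List[str]) -> Optional[str]:
    """Run the command and return its stdout, or None if the tool is not installed."""
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


command = build_command("inflect", "recken", "deu", features="V;IND;PRS;2;SG")
print(command)
```

Calling run(command) executes the tool if it is on the PATH. Because the arguments are passed as a list and no shell is involved, the semicolons in the feature bundle need no quoting here.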
Records in UniMorph's inflectional databases cannot hope to exhaustively cover a language's lexicon, especially in light of novel words. If a word is missing, let us know.
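To see why coverage is bounded, it can help to picture inflection and analysis as lookups in a finite table. The sketch below is a toy model under that assumption, not the tool's implementation: the rows are written by hand for illustration, and inflect and analyze here are hypothetical Python functions that merely echo the subcommand names.

```python
from collections import defaultdict

# Hand-written (lemma, form, features) rows for illustration only.
ROWS = [
    ("recken", "reckst", "V;IND;PRS;2;SG"),
    ("recken", "reckt", "V;IND;PRS;3;SG"),
    ("recken", "gereckt", "V.PTCP;PST"),
]

by_lemma = defaultdict(list)
by_form = defaultdict(list)
for lemma, form, features in ROWS:
    by_lemma[lemma].append((form, features))
    by_form[form].append((lemma, features))


def inflect(lemma, features=None):
    """Return (form, features) pairs for a lemma, optionally filtered by bundle."""
    rows = by_lemma.get(lemma, [])
    if features is None:
        return list(rows)
    wanted = set(features.split(";"))
    return [(f, b) for f, b in rows if set(b.split(";")) == wanted]


def analyze(form):
    """Return (lemma, features) pairs for a surface form; empty if unseen."""
    return list(by_form.get(form, []))


print(inflect("recken", "V;IND;PRS;2;SG"))
print(analyze("gereckt"))
print(analyze("googeln"))  # not in the toy table
```

A word that is absent from the table yields an empty result rather than a guess, which is the behavior a purely record-based resource would be expected to show for novel words.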
Contribution
UniMorph is an open project! We want you!
Found a bug? Want to contribute source code? Submit an issue or pull request to the appropriate GitHub repository. Language-specific corrections or additions should be marked in their corresponding repository; improvements to the unimorph command-line tool should be noted in the unimorph repository.
Citation
If you use the latest version of the UniMorph datasets (v2.0), please cite Kirov et al. (2018):
@inproceedings{kirov-etal-2018-unimorph,
    title = "{U}ni{M}orph 2.0: Universal Morphology",
    author = {Kirov, Christo and
      Cotterell, Ryan and
      Sylak-Glassman, John and
      Walther, G{\'e}raldine and
      Vylomova, Ekaterina and
      Xia, Patrick and
      Faruqui, Manaal and
      Mielke, Sebastian and
      McCarthy, Arya and
      K{\"u}bler, Sandra and
      Yarowsky, David and
      Eisner, Jason and
      Hulden, Mans},
    booktitle = "Proceedings of the Eleventh International Conference on Language Resources and Evaluation ({LREC} 2018)",
    month = may,
    year = "2018",
    address = "Miyazaki, Japan",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://www.aclweb.org/anthology/L18-1293",
}
If you refer to the latest version of the universal annotation schema, please cite Sylak-Glassman et al. (2015):
@inproceedings{sylak-glassman-etal-2015-language,
    title = "A Language-Independent Feature Schema for Inflectional Morphology",
    author = "Sylak-Glassman, John and
      Kirov, Christo and
      Yarowsky, David and
      Que, Roger",
    booktitle = "Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)",
    month = jul,
    year = "2015",
    address = "Beijing, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/P15-2111",
    doi = "10.3115/v1/P15-2111",
    pages = "674--680",
}
Advanced usage
unimorph stores language databases in a default location. This can be overridden by setting the shell environment variable UNIMORPH to the preferred folder.
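The override logic can be pictured roughly as follows. This is a sketch under stated assumptions, not the tool's actual code: resolve_data_dir is a hypothetical helper, and the default folder used here (a .unimorph directory in the home folder) is an assumption, since the default location is not specified above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical default; the real default location is not stated in this README.
ASSUMED_DEFAULT = Path.home() / ".unimorph"


def resolve_data_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the folder named by UNIMORPH if set and non-empty, else the assumed default."""
    env = os.environ if env is None else env
    override = env.get("UNIMORPH")
    return Path(override).expanduser() if override else ASSUMED_DEFAULT


print(resolve_data_dir({"UNIMORPH": "/data/unimorph"}))
print(resolve_data_dir({}))
```

Passing the environment as an argument keeps the function easy to test; calling resolve_data_dir() with no argument reads the real process environment.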