An NLP library for small Uralic languages such as Skolt Sami and Moksha
Project description
Uralic NLP is a Python library for processing small Uralic languages. The languages that are currently supported are Skolt Sami, Ingrian, Meadow & Eastern Mari, Votic, Olonets-Karelian, Erzya, Moksha, Hill Mari, Udmurt, Tundra Nenets, Komi-Permyak and Finnish…
Currently, this tool provides uralicApi functionality, which uses the API of sanat.csc.fi. Through this API, you can perform morphological analysis, morphological generation, lemmatization and dictionary search for these languages. You can also download the morphological models and constraint grammars to your computer for faster processing (see Further information for more).
This library provides Omorfi as a service for Finnish.
Usage
from uralicNLP import uralicApi
from uralicNLP.cg3 import Cg3

# print() with a single argument works on both Python 2 and Python 3
print(uralicApi.analyze("voita", "fin"))  # Morphological analysis for the Finnish word form voita
print(uralicApi.generate("käsi+N+Sg+Par", "fin"))  # Generates the singular partitive form of the Finnish word käsi
print(uralicApi.dictionary_search("car", "sms"))  # Does a dictionary search for the word car in the Skolt Sami dictionary
print(uralicApi.lemmatize("voita", "fin"))  # Lists possible lemmas for the Finnish word form voita

uralicApi.download("fin")  # Downloads the CG and morphological models for Finnish
cg = Cg3("fin")  # Creates a constraint grammar (CG) disambiguator object for Finnish
print(cg.disambiguate(["Kissa", "voi", "nauraa", "!"]))  # Uses the CG to disambiguate the words in a tokenized sentence
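The calls above take two kinds of prepared input: a generation query such as "käsi+N+Sg+Par" and a pre-tokenized sentence such as ["Kissa", "voi", "nauraa", "!"]. The following is a minimal sketch of how such inputs could be built. The helpers build_generation_query and tokenize are hypothetical and not part of uralicNLP, the tag order is copied from the single example above, and the regex tokenizer is a naive assumption that may not suit every language.

```python
# -*- coding: utf-8 -*-
import re


def build_generation_query(lemma, tags):
    """Join a lemma and its morphological tags with '+'.

    Hypothetical helper: it only mirrors the "käsi+N+Sg+Par" format
    from the usage example; valid tags depend on the language model.
    """
    return "+".join([lemma] + list(tags))


def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation marks.

    Naive assumption: runs of word characters form one token, and every
    other non-space character is a token of its own.
    """
    return re.findall(r"\w+|[^\w\s]", sentence, re.UNICODE)


query = build_generation_query("käsi", ["N", "Sg", "Par"])
tokens = tokenize("Kissa voi nauraa!")

# With uralicNLP installed, these values could then be passed on, e.g.:
#   uralicApi.generate(query, "fin")
#   cg.disambiguate(tokens)
```

Keeping the input preparation separate from the library calls makes it possible to check the query strings and token lists without network access or downloaded models.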
Further information
Full documentation is available in the Uralic NLP GitHub repository.
You might also be interested in using Korp on Python to access corpora of Uralic languages.
This library will gain more functionality as my PhD studies progress. The library and the API were created by Mika Hämäläinen.
Download files
Source Distribution
Built Distribution
File details
Details for the file uralicNLP-1.0.5.tar.gz.
File metadata
- Download URL: uralicNLP-1.0.5.tar.gz
- Upload date:
- Size: 9.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.9.1 pkginfo/1.4.1 requests/2.18.4 setuptools/18.5 requests-toolbelt/0.8.0 tqdm/4.15.0 CPython/2.7.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f2aedc8090ba6cf7167d542f6ea6369f1f6ad083d92315b77283212220978aa8 |
| MD5 | 97782807d3052154d021ab5c9f183981 |
| BLAKE2b-256 | 15e80e85292debe9cb5c5b69c9ab1b22f1a8c753d95302887d9ae2e3d7dc9e28 |
File details
Details for the file uralicNLP-1.0.5-py2.py3-none-any.whl.
File metadata
- Download URL: uralicNLP-1.0.5-py2.py3-none-any.whl
- Upload date:
- Size: 11.3 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.9.1 pkginfo/1.4.1 requests/2.18.4 setuptools/18.5 requests-toolbelt/0.8.0 tqdm/4.15.0 CPython/2.7.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c4d5878b192420b5c5650b56aa9fdf78b9b4320f7429e7955e0cc4881b59e0c7 |
| MD5 | b5d06613673510ba1cc2c2e763fc6099 |
| BLAKE2b-256 | 7f8059ced34306ca12086eda1c1640aa9e64d9b8b7f43db82fe0390a21f005a1 |
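The SHA256 digests listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using only the Python standard library. The function names sha256_of_file and matches_expected are hypothetical, and the local file path in the usage comment is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Keep reading until read() returns an empty bytes object (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example, assuming the source distribution was saved to the current directory:
#   matches_expected(
#       "uralicNLP-1.0.5.tar.gz",
#       "f2aedc8090ba6cf7167d542f6ea6369f1f6ad083d92315b77283212220978aa8",
#   )
```

Reading in chunks keeps memory use constant regardless of file size, which matters little for files of this size but makes the helper reusable for larger downloads.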