An NLP library for small Uralic languages such as Skolt Sami and Moksha

Project description

Uralic NLP is a Python library for processing small Uralic languages. The languages that are currently supported are Skolt Sami, Ingrian, Meadow & Eastern Mari, Votic, Olonets-Karelian, Erzya, Moksha, Hill Mari, Udmurt, Tundra Nenets, Komi-Permyak and Finnish.
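The API calls in this description take a short language code as their second argument. As a rough sketch, the supported languages can be mapped to their ISO 639-3 codes as below. Only `fin` and `sms` are confirmed by the usage examples in this description; that the API accepts the other ISO 639-3 codes unchanged is an assumption, and `LANGUAGE_CODES` and `code_for` are hypothetical names that are not part of uralicNLP.

```python
# ISO 639-3 codes for the languages listed above.
# Assumption: the API accepts these codes; only "fin" and "sms"
# appear in the usage examples of this description.
LANGUAGE_CODES = {
    "Skolt Sami": "sms",
    "Ingrian": "izh",
    "Meadow & Eastern Mari": "mhr",
    "Votic": "vot",
    "Olonets-Karelian": "olo",
    "Erzya": "myv",
    "Moksha": "mdf",
    "Hill Mari": "mrj",
    "Udmurt": "udm",
    "Tundra Nenets": "yrk",
    "Komi-Permyak": "koi",
    "Finnish": "fin",
}


def code_for(language_name):
    """Return the code for a language name, ignoring case and outer spaces."""
    normalized = language_name.strip().lower()
    for name, code in LANGUAGE_CODES.items():
        if name.lower() == normalized:
            return code
    raise KeyError("Unknown language: %s" % language_name)


print(code_for("Skolt Sami"))
```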

Currently, this tool provides uralicApi functionality, which uses the API of sanat.csc.fi. Through this API, it’s possible to do morphological analysis, morphological generation, lemmatization and dictionary search for these languages.

This library provides Omorfi as a service for Finnish.

Usage

from uralicNLP import uralicApi

# print(...) with a single argument works on both Python 2 and Python 3

print(uralicApi.analyze("voita", "fin"))  # Morphological analysis for the Finnish word form voita

print(uralicApi.generate("käsi+N+Sg+Par", "fin"))  # Generates the singular partitive form of the Finnish word käsi

print(uralicApi.dictionary_search("car", "sms"))  # Does a dictionary search for the word car in the Skolt Sami dictionary

print(uralicApi.lemmatize("voita", "fin"))  # Lists possible lemmas for the Finnish word form voita
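The generation query above joins a lemma and its morphological tags with `+`. A minimal sketch of composing and splitting such strings in plain Python might look like this. `build_query` and `split_query` are hypothetical helpers, not part of uralicNLP, and the only tags used are the ones shown in the usage example.

```python
def build_query(lemma, tags):
    """Join a lemma and its tags into a 'lemma+Tag+Tag' string."""
    return "+".join([lemma] + list(tags))


def split_query(query):
    """Split a 'lemma+Tag+Tag' string into (lemma, [tags])."""
    parts = query.split("+")
    return parts[0], parts[1:]


# Singular partitive of the Finnish noun käsi, as in the usage example
query = build_query("käsi", ["N", "Sg", "Par"])
print(query)
print(split_query(query))
```

The resulting string can then be passed to uralicApi.generate together with a language code, as in the usage lines above.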

Further information

Full documentation is available in the Uralic NLP GitHub repository.

You might also be interested in using Korp on Python to access corpora of Uralic languages.

This library will have more functionality in the future as my PhD studies progress. This library and the API were created by Mika Hämäläinen.

Release history

1.0.2

1.0.1

1.0.0 (this version)

Download files

Filename: uralicNLP-1.0.0-py2.py3-none-any.whl (4.4 kB)
File type: Wheel
Python version: py2.py3
Upload date: Dec 7, 2017
