An NLP library for small Uralic languages such as Skolt Sami and Moksha

Project description

Uralic NLP is a Python library for processing small Uralic languages. The languages that are currently supported are Skolt Sami, Ingrian, Meadow & Eastern Mari, Votic, Olonets-Karelian, Erzya, Moksha, Hill Mari, Udmurt, Tundra Nenets, Komi-Permyak and Finnish…

Currently, the library provides uralicApi functionality, which uses the API of sanat.csc.fi. Through this API, you can perform morphological analysis, morphological generation, lemmatization and dictionary search for these languages. You can also download the morphological models and constraint grammars to your computer for faster processing (see Further information for more).

This library provides Omorfi as a service for Finnish.

Usage

from uralicNLP import uralicApi

print(uralicApi.analyze("voita", "fin")) # Morphological analysis for the Finnish word form voita

print(uralicApi.generate("käsi+N+Sg+Par", "fin")) # Generates the singular partitive form of the Finnish word käsi

print(uralicApi.dictionary_search("car", "sms")) # Does a dictionary search for the word car in the Skolt Sami dictionary

print(uralicApi.lemmatize("voita", "fin")) # Lists possible lemmas for the Finnish word form voita

from uralicNLP.cg3 import Cg3

uralicApi.download("fin") #Downloads the CG and morphological models for Finnish

cg = Cg3("fin") #Creates a constraint grammar (CG) disambiguator object for Finnish

cg.disambiguate(["Kissa","voi","nauraa", "!"]) #Uses the CG to disambiguate the words in a tokenized sentence
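The calls above take two kinds of prepared input: a generation query such as käsi+N+Sg+Par, and a tokenized sentence for the CG disambiguator. The sketch below shows one way to build both in plain Python 3. The helper names (build_generation_query, split_analysis, tokenize) are hypothetical and not part of Uralic NLP, and the regex tokenizer is a naive assumption that will not handle abbreviations or multiword tokens.

```python
import re


def build_generation_query(lemma, tags):
    """Join a lemma and its tags with '+', e.g. käsi+N+Sg+Par (hypothetical helper)."""
    return "+".join([lemma] + list(tags))


def split_analysis(analysis):
    """Split a '+'-separated string into (lemma, [tags]).

    Assumes the lemma itself contains no '+', which may not hold for every language.
    """
    parts = analysis.split("+")
    return parts[0], parts[1:]


def tokenize(sentence):
    """Naive tokenizer: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


query = build_generation_query("käsi", ["N", "Sg", "Par"])
tokens = tokenize("Kissa voi nauraa!")
print(query)
print(tokens)
```

The resulting query string can be passed to uralicApi.generate and the token list to cg.disambiguate, as in the usage lines above.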

Further information

Full documentation is available in the Uralic NLP GitHub repository.

You might also be interested in using Korp on Python to access corpora of Uralic languages.

This library will gain more functionality as my PhD studies progress. The library and the API were created by Mika Hämäläinen.


Download files

Source Distribution

uralicNLP-1.0.5.tar.gz (9.0 kB view details)

Uploaded Source

Built Distribution

uralicNLP-1.0.5-py2.py3-none-any.whl (11.3 kB view details)

Uploaded Python 2, Python 3

File details

Details for the file uralicNLP-1.0.5.tar.gz.

File metadata

  • Download URL: uralicNLP-1.0.5.tar.gz
  • Upload date:
  • Size: 9.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.9.1 pkginfo/1.4.1 requests/2.18.4 setuptools/18.5 requests-toolbelt/0.8.0 tqdm/4.15.0 CPython/2.7.10

File hashes

Hashes for uralicNLP-1.0.5.tar.gz
Algorithm Hash digest
SHA256 f2aedc8090ba6cf7167d542f6ea6369f1f6ad083d92315b77283212220978aa8
MD5 97782807d3052154d021ab5c9f183981
BLAKE2b-256 15e80e85292debe9cb5c5b69c9ab1b22f1a8c753d95302887d9ae2e3d7dc9e28
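
A downloaded file can be checked against the SHA256 digest listed above before installing it. The following is a minimal sketch using only the standard library hashlib module. The function names are hypothetical, and the file path in the comment assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example call (assumes the source archive is in the current directory):
# matches_published_hash(
#     "uralicNLP-1.0.5.tar.gz",
#     "f2aedc8090ba6cf7167d542f6ea6369f1f6ad083d92315b77283212220978aa8",
# )
```

The same check applies to the wheel file with its own SHA256 digest.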

File details

Details for the file uralicNLP-1.0.5-py2.py3-none-any.whl.

File metadata

  • Download URL: uralicNLP-1.0.5-py2.py3-none-any.whl
  • Upload date:
  • Size: 11.3 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.9.1 pkginfo/1.4.1 requests/2.18.4 setuptools/18.5 requests-toolbelt/0.8.0 tqdm/4.15.0 CPython/2.7.10

File hashes

Hashes for uralicNLP-1.0.5-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 c4d5878b192420b5c5650b56aa9fdf78b9b4320f7429e7955e0cc4881b59e0c7
MD5 b5d06613673510ba1cc2c2e763fc6099
BLAKE2b-256 7f8059ced34306ca12086eda1c1640aa9e64d9b8b7f43db82fe0390a21f005a1
