
Python wrapper for the Yandex MyStem 3


Introduction

This module is a Python wrapper for Yandex Mystem 3.0, an excellent morphological analyzer for the Russian language released in June 2014. A morphological analyzer lemmatizes text and derives a set of morphological attributes for each token. For details of the algorithm, see I. Segalovich, «A fast morphological algorithm with unknown word guessing induced by a dictionary for a web search engine», MLMTA-2003, Las Vegas, Nevada, USA.

Python is the language of choice for many computational linguists, including those working with Russian. The main motivation for this project was the absence of any Python wrapper for Mystem, one of the most popular morphological analyzers for Russian, along with PyMorphy2, TreeTagger and AOT.

The third version of Mystem introduces several important improvements, most importantly part-of-speech (POS) disambiguation. Our wrapper runs Mystem in the mode that performs POS disambiguation.

This wrapper is open source under the LGPLv3 license. Note, however, that Yandex Mystem itself is not open source and is licensed under the terms of the Yandex License.

System Requirements

The wrapper works with CPython 2.6+/3.3+ and PyPy 1.9+.

The wrapper was tested on Ubuntu Linux 12.04+, Mac OS X 10.9+ and Windows 7+.
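The CPython version ranges listed above can be checked before installing. The following is only a sketch: the helper name `is_supported_cpython` is hypothetical and not part of the wrapper, and it covers the CPython ranges only, not PyPy.

```python
import sys


def is_supported_cpython(version):
    """Return True if a (major, minor, ...) tuple falls in 2.6+ or 3.3+.

    Hypothetical helper for illustration; it is not provided by pymystem3.
    """
    major_minor = tuple(version[:2])
    if major_minor[0] == 2:
        return major_minor >= (2, 6)
    return major_minor >= (3, 3)


if __name__ == "__main__":
    if is_supported_cpython(sys.version_info):
        print("This interpreter is in a listed CPython range.")
    else:
        print("This interpreter is outside the listed CPython ranges.")
```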

Installation

  1. Stable version: https://pypi.python.org/pypi/pymystem3. You can install it using pip:

    pip install pymystem3

  2. Latest version: https://github.com/Digsolab/pymystem3

A Quick Example

>>> from pymystem3 import Mystem
>>> text = "Красивая мама красиво мыла раму"
>>> m = Mystem()
>>> lemmas = m.lemmatize(text)
>>> print(''.join(lemmas))
красивый мама красиво мыть рама
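The example joins the lemmas with an empty string, which suggests that the returned list also holds the separators between words. If only the words are needed, the whitespace tokens can be filtered out. The sketch below runs on a hard-coded list so that it works without Mystem installed. The shape of that list is an assumption about what `lemmatize` returns for the sentence above, and `content_lemmas` is a hypothetical helper, not part of the wrapper.

```python
# -*- coding: utf-8 -*-


def content_lemmas(lemmas):
    """Drop tokens that consist only of whitespace (spaces, newlines)."""
    return [token for token in lemmas if token.strip()]


# Assumed shape of m.lemmatize(text) for the sentence in the example above:
# lemmas interleaved with the separators found in the input text.
sample = ["красивый", " ", "мама", " ", "красиво", " ", "мыть", " ", "рама", "\n"]

words = content_lemmas(sample)
print(len(words))  # 5
```

With a real `Mystem` instance, the same helper would be applied as `content_lemmas(m.lemmatize(text))`.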

Issues

The current version should be considered an alpha release, so please let us know if something does not work as expected. Please report bugs or feature requests using the GitHub issue tracker (https://github.com/Digsolab/pymystem3/issues).

Authors

