A morphological dictionary tool.

# Morpho-Syntactic "MoSyn" API
## UAG - BIG DATA RESEARCH GROUP

----

### Description
This API was created by the Universidad Autónoma de Guadalajara (UAG) Big Data Research Group as a Natural Language Processing tool. It provides a Python library with functions that help perform morphological analysis on texts written in Spanish.

- For a quick overview of what morphological analysis is, see the following video: [El análisis morfológico de una oración](https://www.youtube.com/watch?v=BgAHya5ejJ8)
- Link to the EAGLES standard: [INTRODUCCIÓN A LAS ETIQUETAS EAGLES](http://www.cs.upc.edu/~nlp/tools/parole-sp.html)

Additional information about the creation of the morphological dictionary can be found in the following article:
* P.J. Castro Pérez, A.A. García Fuentes, M. E. Huerta Arreola, R. Dávila Pérez. (2016). Machine Readable Dictionary for Mexican Spanish. In Tecnologías Modernas para la Industria y la Educación. Cuernavaca, Morelos, Mexico: Institute Eng Electric Electronics Morelos Section, S.C. (607-95255).

----


## Installation
The following packages are required to install mosyn:
- Python: [https://www.python.org/](https://www.python.org/)
- pip: [https://pip.pypa.io/en/stable/installing/](https://pip.pypa.io/en/stable/installing/)
- NLTK: [http://www.nltk.org/](http://www.nltk.org/)

Once the dependencies above are installed, install mosyn:
```
# pip install mosyn
```

If at any point the following error appears:
```python
Resource u'tokenizers/punkt/english.pickle' not found. Please
use the NLTK Downloader to obtain the resource:

>>>nltk.download()

Searched in:
- '/home/ec2-user/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- u''
```
try the instructions in the following link:
[http://stackoverflow.com/questions/26570944/resource-utokenizers-punkt-english-pickle-not-found](http://stackoverflow.com/questions/26570944/resource-utokenizers-punkt-english-pickle-not-found)
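The error above means that the NLTK `punkt` tokenizer data is missing from every directory NLTK searched. As a minimal sketch of the usual fix (not part of mosyn; the helper names `punkt_is_installed` and `fetch_punkt` are hypothetical), the resource can be checked and downloaded from Python with NLTK's own `nltk.data.find` and `nltk.download`. If your NLTK version reports a different resource name in the error message, pass that name instead.

```python
def punkt_is_installed():
    """Return True if NLTK can locate the 'punkt' tokenizer data."""
    try:
        import nltk
    except ImportError:
        # NLTK itself is missing; install it before mosyn.
        return False
    try:
        nltk.data.find("tokenizers/punkt")
    except LookupError:
        # Same condition that produces the error message shown above.
        return False
    return True


def fetch_punkt():
    """Download the 'punkt' resource into the default nltk_data directory."""
    import nltk
    return nltk.download("punkt")


# Typical use (needs network access, so it is left commented out here):
# if not punkt_is_installed():
#     fetch_punkt()
```

Downloading only `punkt` is quicker than running the interactive `nltk.download()` with no arguments, which opens the full downloader.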




## Running examples
The examples may need to be downloaded from GitHub and copied locally. To work out of the box, they should be placed at the same level as the installed mosyn package.

Use the command `pydoc mosyn` to find the directory where mosyn has been installed:
```
$ pydoc mosyn
```

After you press Enter, a screen with text similar to the following should appear:
```
Help on package mosyn:

NAME
mosyn - # -*- coding: iso-8859-15 -*-

FILE
/Library/Python/2.7/site-packages/mosyn-1.0.5-py2.7.egg/mosyn/__init__.py

PACKAGE CONTENTS
__main__
mosyn
util (package)

(END)
```

In this example, mosyn is installed in `/Library/Python/2.7/site-packages/mosyn-1.0.5-py2.7.egg/mosyn`. Download the examples directory to that location, so that the directory listing looks like this:
```
-rw-r--r-- 1 root wheel 413 Sep 20 23:18 __init__.py
-rw-r--r-- 1 root wheel 439 Sep 20 23:18 __init__.pyc
-rw-r--r-- 1 root wheel 265 Sep 20 23:18 __main__.py
-rw-r--r-- 1 root wheel 256 Sep 20 23:18 __main__.pyc
drwxr-xr-x 3 root wheel 102 Sep 20 23:18 dict
drwxr-xr-x 8 root wheel 272 Sep 20 23:37 examples <<===
-rw-r--r-- 1 root wheel 12023 Sep 20 23:18 mosyn.py
-rw-r--r-- 1 root wheel 12875 Sep 20 23:18 mosyn.pyc
drwxr-xr-x 12 root wheel 408 Sep 20 23:18 util
```


Change into the examples directory and run one of the examples, for instance:
```
$ cd examples
$ python python2.x/parseFileSample.py
Processing: Poema20.txt.
.............................................................
" PUEDO " ( lema: poder )
VMIP1S0 -> singular verb without gender
----------------------------------------------------

" escribir " ( lema: escribir )
V0N0000 -> undefined number verb without gender
----------------------------------------------------

" los " ( lema: el )
DA0MP0 -> plural male determinant
PP3MPA00 -> plural male pronoun
NCMS000 -> singular male name
----------------------------------------------------

. . .

----------------------------------------------------

" escribo " ( lema: escribir )
V0IP1S0 -> singular verb without gender
VMIP1S0 -> singular verb without gender
----------------------------------------------------

" . " ( lema: . )
FP -> undefined number punctuation without gender
----------------------------------------------------
```
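The tags in the output above (`VMIP1S0`, `DA0MP0`, and so on) are EAGLES positional tags: the first character gives the category and the following positions encode attributes such as gender and number. The sketch below shows how such a tag can be read. It is not mosyn's implementation; the name `describe_tag` is hypothetical, and the attribute positions are an assumption based on the EAGLES tagset for Spanish linked in the Description, covering only four categories.

```python
# 0-based positions of the gender and number characters per category.
# Assumption: positions follow the EAGLES tagset for Spanish.
LAYOUT = {
    "V": ("verb", 6, 5),
    "N": ("noun", 2, 3),
    "D": ("determiner", 3, 4),
    "P": ("pronoun", 3, 4),
}
GENDER = {"M": "masculine", "F": "feminine", "C": "common", "N": "neuter"}
NUMBER = {"S": "singular", "P": "plural", "N": "invariable"}


def describe_tag(tag):
    """Decode category, gender and number from a tag; None for other categories."""
    layout = LAYOUT.get(tag[:1].upper())
    if layout is None:
        return None
    category, gender_pos, number_pos = layout

    def field(pos, table):
        # A '0' (or a tag that is too short) means the attribute is unspecified.
        return table.get(tag[pos].upper()) if pos < len(tag) else None

    return {
        "category": category,
        "gender": field(gender_pos, GENDER),
        "number": field(number_pos, NUMBER),
    }


print(describe_tag("DA0MP0"))
print(describe_tag("VMIP1S0"))
```

Attributes marked `0` in a tag come back as `None`, which corresponds to the "without gender" and "undefined number" wording in the example output.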

## Contact
Please address questions to uagdataanalysis@gmail.com.

To report a bug, create an issue at the following link:
[https://github.com/uagdataanalysis/mosynapi/issues](https://github.com/uagdataanalysis/mosynapi/issues)


<div style="text-align:center"><a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.</div>

