Skip to main content

use MER inside python

Project description

Downloads

Use MER scripts inside python.

(from the MER repository)

MER is a Named-Entity Recognition tool which given any lexicon and any input text returns the list of terms recognized in the text, including their exact location (annotations).

Given an ontology (owl file) MER is also able to link the entities to their classes.

More information about MER can be found in:

  • MER: a Shell Script and Annotation Server for Minimal Named Entity Recognition and Linking, F. Couto and A. Lamurias, Journal of Cheminformatics, 10:58, 2018 [https://doi.org/10.1186/s13321-018-0312-9]
  • MER: a Minimal Named-Entity Recognition Tagger and Annotation Server, F. Couto, L. Campos, and A. Lamurias, in BioCreative V.5 Challenge Evaluation, 2017 [https://www.researchgate.net/publication/316545534_MER_a_Minimal_Named-Entity_Recognition_Tagger_and_Annotation_Server]

Documentation

https://merpy.readthedocs.io/en/latest/

Dependencies

awk

MER was developed and tested using the GNU awk (gawk) and grep. If you have another awk interpreter in your machine, there's no assurance that the program will work.

For example, to install GNU awk on Ubuntu:

sudo apt-get install gawk

Installation

pip install merpy

or

python setup.py install

Then you might want to update the MER scripts and download preprocessed data:

>>> import merpy
>>> merpy.download_mer()
>>> merpy.download_lexicons()

Basic Usage

>>> import merpy
>>> merpy.download_lexicons()
>>> merpy.process_lexicon("hp")
>>> document = 'Influenza, commonly known as "the flu", is an infectious disease caused by an influenza virus. Symptoms can be mild to severe. The most common symptoms include: a high fever, runny nose, sore throat, muscle pains, headache, coughing, and feeling tired'
>>> entities = merpy.get_entities(document, "hp") # get_entities_mp uses multiprocessing (set n_cores param)
>>> print(entities)
[['111', '115', 'mild', 'http://purl.obolibrary.org/obo/HP_0012825'], ['119', '125', 'severe', 'http://purl.obolibrary.org/obo/HP_0012828'], ['168', '173', 'fever', 'http://purl.obolibrary.org/obo/HP_0001945'], ['214', '222', 'headache', 'http://purl.obolibrary.org/obo/HP_0002315'], ['224', '232', 'coughing', 'http://purl.obolibrary.org/obo/HP_0012735'], ['246', '251', 'tired', 'http://purl.obolibrary.org/obo/HP_0012378'], ['175', '185', 'runny nose', 'http://purl.obolibrary.org/obo/HP_0031417']]
>>> lexicons = merpy.get_lexicons()
>>> merpy.show_lexicons()
lexicons preloaded:
['lexicon', 'go', 'cell_line_and_cell_type', 'chebi_lite', 'chemical', 'hp', 'disease', 'wordnet_nouns', 'hpo', 'radlex', 'doid', 'protein', 'hpomultilang', 'tissue_and_organ', 'mirna', 'subcellular_structure']

lexicons loaded ready to use:
['lexicon', 'doid', 'hp']

lexicons with linked concepts:
['doid', 'hp', 'go', 'chebi_lite', 'lexicon']
>>> merpy.create_lexicon(["gene1", "gene2", "gene3"], "genelist")
wrote genelist lexicon
>>> merpy.process_lexicon("genelist")
>>> merpy.delete_lexicon("genelist")
deleted genelist lexicon
>>> merpy.download_lexicon("https://github.com/lasigeBioTM/MER/raw/biocreative2017/data/ChEBI.txt", "chebi")
wrote chebi lexicon
>>> merpy.process_lexicon("chebi")

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

merpy-1.2.tar.gz (16.3 kB view details)

Uploaded Source

Built Distributions

merpy-1.2.0-py3-none-any.whl (23.7 kB view details)

Uploaded Python 3

merpy-1.2-py3-none-any.whl (23.7 kB view details)

Uploaded Python 3

File details

Details for the file merpy-1.2.tar.gz.

File metadata

  • Download URL: merpy-1.2.tar.gz
  • Upload date:
  • Size: 16.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.1.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.5

File hashes

Hashes for merpy-1.2.tar.gz
Algorithm Hash digest
SHA256 ca350a71aa9faa8eb83b9a0bc4d2aae96c3c1103f56553aba9fe34859ec0f502
MD5 736173c1f202bd4164045fa25feae92c
BLAKE2b-256 faced714afb7d94beda520e6e1b349df748c227b3f50fc34ac19ea06615958b3

See more details on using hashes here.

File details

Details for the file merpy-1.2.0-py3-none-any.whl.

File metadata

  • Download URL: merpy-1.2.0-py3-none-any.whl
  • Upload date:
  • Size: 23.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.1.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.5

File hashes

Hashes for merpy-1.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fe61adb74860245b0a86f4b8d292ed02530c454493c70f7ce169555016f77b64
MD5 f1bd0afa6a1472a4ef833bec13353492
BLAKE2b-256 657c07ed321888e9c3406d9b18f9f0829490111bf94f5167ee5feefb8cf83edc

See more details on using hashes here.

File details

Details for the file merpy-1.2-py3-none-any.whl.

File metadata

  • Download URL: merpy-1.2-py3-none-any.whl
  • Upload date:
  • Size: 23.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.1.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.5

File hashes

Hashes for merpy-1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 e90c73f897240d17b8027e39833a8c64fb5d01accda448b8987bf14daebe7e4a
MD5 218d81ce95d7b4eb61885cfc956a409a
BLAKE2b-256 0929b793fd3bcb9af82fe45226b26932364a6580f6f369021ec78bf15a8affdc

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page