Use MER inside Python

Use MER scripts inside Python.

(from the MER repository)

MER is a named-entity recognition tool that, given any lexicon and any input text, returns the list of terms recognized in the text together with their exact locations (annotations).

Given an ontology (OWL file), MER can also link the entities to their classes.
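
To make the shape of these annotations concrete, here is a minimal pure-Python sketch of lexicon matching with offsets and links. It is only an illustration, not how MER works internally (MER relies on awk and grep scripts). The function name `find_terms` and the regular-expression matching are assumptions made for this example, and the two URIs are copied from the sample output in Basic Usage.

```python
import re


def find_terms(text, lexicon):
    """Return (start, end, term, uri) tuples for every lexicon term found in text.

    `lexicon` maps each term to the URI of its class. Matching is
    case-insensitive and restricted to whole words (an assumption of this
    sketch, not a statement about MER's matching rules).
    """
    annotations = []
    for term, uri in lexicon.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append((match.start(), match.end(), match.group(), uri))
    return sorted(annotations)


toy_lexicon = {
    "fever": "http://purl.obolibrary.org/obo/HP_0001945",
    "sore throat": "http://purl.obolibrary.org/obo/HP_0033050",
}
toy_annotations = find_terms("High fever and a sore throat.", toy_lexicon)
for annotation in toy_annotations:
    print(annotation)
```

Each tuple carries the start and end character offsets, the matched text, and the linked class, which mirrors the four fields merpy returns for lexicons with linked concepts.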

More information about MER can be found in:

  • MER: a Shell Script and Annotation Server for Minimal Named Entity Recognition and Linking, F. Couto and A. Lamurias, Journal of Cheminformatics, 10:58, 2018 [https://doi.org/10.1186/s13321-018-0312-9]
  • MER: a Minimal Named-Entity Recognition Tagger and Annotation Server, F. Couto, L. Campos, and A. Lamurias, in BioCreative V.5 Challenge Evaluation, 2017 [https://www.researchgate.net/publication/316545534_MER_a_Minimal_Named-Entity_Recognition_Tagger_and_Annotation_Server]

New

  • Package lexicons202103.tgz is available
  • Multilingual lexicons using DeCS

Documentation

https://merpy.readthedocs.io/en/latest/

Dependencies

awk

MER was developed and tested with GNU awk (gawk) and grep. If your machine uses a different awk interpreter, MER is not guaranteed to work.

For example, to install GNU awk on Ubuntu:

sudo apt-get install gawk
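
Before installing, you may want to check from Python whether a GNU awk is already on your PATH. The following is a rough sketch using only the standard library. The helper name `find_gnu_awk` is hypothetical (it is not part of merpy), and it assumes that GNU awk prints a banner containing "GNU Awk" when called with `--version`.

```python
import shutil
import subprocess


def find_gnu_awk():
    """Return the path of a GNU awk executable, or None if none is found."""
    for name in ("gawk", "awk"):
        path = shutil.which(name)
        if path is None:
            continue
        try:
            result = subprocess.run(
                [path, "--version"],
                stdin=subprocess.DEVNULL,  # never wait for input
                capture_output=True,
                text=True,
                timeout=5,
            )
        except (OSError, subprocess.SubprocessError):
            continue  # could not run this candidate, try the next one
        if "GNU Awk" in result.stdout:
            return path
    return None


awk_path = find_gnu_awk()
if awk_path is None:
    print("GNU awk not found; consider installing gawk")
else:
    print("GNU awk found at", awk_path)
```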

Installation

pip install merpy

or

python setup.py install

Then you might want to update the MER scripts and download preprocessed data:

>>> import merpy
>>> merpy.download_mer()
>>> merpy.download_lexicons()

Basic Usage

>>> import merpy
>>> merpy.download_lexicons()
>>> merpy.process_lexicon("hp")
>>> document = 'Influenza, commonly known as "the flu", is an infectious disease caused by an influenza virus. Symptoms can be mild to severe. The most common symptoms include: a high fever, runny nose, sore throat, muscle pains, headache, coughing, and feeling tired ... Acetylcysteine for reducing the oxygen transport and caffeine to stimulate ... fever, tachypnea ... fiebre, taquipnea ... febre, taquipneia' 
>>> entities = merpy.get_entities(document, "hp") # get_entities_mp uses multiprocessing (set n_cores param)
>>> print(entities)
[['111', '115', 'mild', 'http://purl.obolibrary.org/obo/HP_0012825'], ['119', '125', 'severe', 'http://purl.obolibrary.org/obo/HP_0012828'], ['168', '173', 'fever', 'http://purl.obolibrary.org/obo/HP_0001945'], ['181', '185', 'nose', 'http://purl.obolibrary.org/obo/UBERON_0000004'], ['200', '206', 'muscle', 'http://purl.obolibrary.org/obo/UBERON_0005090'], ['214', '222', 'headache', 'http://purl.obolibrary.org/obo/HP_0002315'], ['224', '232', 'coughing', 'http://purl.obolibrary.org/obo/HP_0012735'], ['246', '251', 'tired', 'http://purl.obolibrary.org/obo/HP_0012378'], ['288', '294', 'oxygen', 'http://purl.obolibrary.org/obo/CHEBI_15379'], ['295', '304', 'transport', 'http://purl.obolibrary.org/obo/GO_0006810'], ['335', '340', 'fever', 'http://purl.obolibrary.org/obo/HP_0001945'], ['342', '351', 'tachypnea', 'http://purl.obolibrary.org/obo/HP_0002789'], ['175', '185', 'runny nose', 'http://purl.obolibrary.org/obo/HP_0031417'], ['187', '198', 'sore throat', 'http://purl.obolibrary.org/obo/HP_0033050']]
>>> entities = merpy.get_entities(document, "bireme_decs_por2020") 
>>> print(entities)
[['378', '383', 'febre', 'https://decs.bvsalud.org/ths/?filter=ths_regid&q=D005334'], ['385', '395', 'taquipneia', 'https://decs.bvsalud.org/ths/?filter=ths_regid&q=D059246']]
>>> lexicons = merpy.get_lexicons()
>>> merpy.show_lexicons()
lexicons preloaded:
['lexicon', 'bireme_decs_por2020', 'bireme_decs_spa2020', 'wordnet-hyponym', 'radlex', 'doid', 'bireme_decs_eng2020', 'go', 'hp', 'chebi_lite']

lexicons loaded ready to use:
['bireme_decs_por2020', 'chebi_lite', 'hp', 'bireme_decs_spa2020', 'wordnet-hyponym', 'doid', 'lexicon', 'radlex', 'go', 'bireme_decs_eng2020']

lexicons with linked concepts:
['bireme_decs_eng2020', 'doid', 'hp', 'go', 'lexicon', 'bireme_decs_spa2020', 'bireme_decs_por2020', 'radlex', 'chebi_lite']
>>> merpy.create_lexicon(["gene1", "gene2", "gene3"], "genelist")
wrote genelist lexicon
>>> merpy.process_lexicon("genelist")
>>> merpy.delete_lexicon("genelist")
deleted genelist lexicon
>>> merpy.download_lexicon("https://github.com/lasigeBioTM/MER/raw/biocreative2017/data/ChEBI.txt", "chebi")
wrote chebi lexicon
>>> merpy.process_lexicon("chebi")
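
In the output above, every entity is a list of strings (start offset, end offset, matched text, linked URI), and multi-word terms appear after the single-word ones. The sketch below shows one possible way to convert such rows into typed records sorted by position and to check the offsets against the document. It does not call merpy; the names `Entity`, `to_entities` and `check_offsets` are hypothetical helpers, and skipping rows with fewer than three fields is a defensive assumption. The sample rows are taken from the "hp" output above.

```python
from typing import List, NamedTuple


class Entity(NamedTuple):
    start: int
    end: int
    text: str
    uri: str


def to_entities(raw: List[List[str]]) -> List[Entity]:
    """Convert [start, end, text, uri] string rows into sorted Entity records."""
    records = []
    for row in raw:
        if len(row) < 3:
            continue  # defensive: skip empty or malformed rows
        uri = row[3] if len(row) > 3 else ""  # lexicons without links have no URI
        records.append(Entity(int(row[0]), int(row[1]), row[2], uri))
    return sorted(records)  # tuples sort by start offset first


def check_offsets(document: str, entities: List[Entity]) -> List[Entity]:
    """Return the entities whose offsets do not point at their text (ignoring case)."""
    return [
        e for e in entities
        if document[e.start:e.end].casefold() != e.text.casefold()
    ]


document = ('Influenza, commonly known as "the flu", is an infectious disease '
            'caused by an influenza virus. Symptoms can be mild to severe.')
raw = [
    ['119', '125', 'severe', 'http://purl.obolibrary.org/obo/HP_0012828'],
    ['111', '115', 'mild', 'http://purl.obolibrary.org/obo/HP_0012825'],
]
entities = to_entities(raw)
mismatches = check_offsets(document, entities)
for entity in entities:
    print(entity.start, entity.end, entity.text, entity.uri)
```

With integer offsets, `document[entity.start:entity.end]` recovers the matched text directly, which is convenient when highlighting annotations or merging results from several lexicons.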
