Use MER inside Python
Use MER scripts inside Python.
(from the MER repository)
MER is a Named-Entity Recognition tool that, given any lexicon and any input text, returns the list of terms recognized in the text, including their exact locations (annotations).
Given an ontology (OWL file), MER can also link the entities to their classes.
More information about MER can be found in:
- MER: a Shell Script and Annotation Server for Minimal Named Entity Recognition and Linking, F. Couto and A. Lamurias, Journal of Cheminformatics, 10:58, 2018 [https://doi.org/10.1186/s13321-018-0312-9]
- MER: a Minimal Named-Entity Recognition Tagger and Annotation Server, F. Couto, L. Campos, and A. Lamurias, in BioCreative V.5 Challenge Evaluation, 2017 [https://www.researchgate.net/publication/316545534_MER_a_Minimal_Named-Entity_Recognition_Tagger_and_Annotation_Server]
Dependencies
awk
MER was developed and tested with GNU awk (gawk) and grep. If a different awk interpreter is installed on your machine, MER is not guaranteed to work.
For example, to install GNU awk on Ubuntu:
sudo apt-get install gawk
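Because MER depends on GNU awk and grep, it can help to confirm that both are on the PATH before calling merpy. The following is a minimal sketch using only the Python standard library; the helper name `find_tools` is hypothetical and is not part of merpy.

```python
import shutil


def find_tools(names=("gawk", "grep")):
    """Map each tool name to its full path on PATH, or None if it is missing.

    Hypothetical helper for illustration; merpy does not provide it.
    """
    return {name: shutil.which(name) for name in names}


# Report which of the tools MER relies on are available.
for tool, path in find_tools().items():
    print(f"{tool}: {path if path else 'NOT FOUND'}")
```

This only checks that an executable with that name exists; it does not verify that the installed version behaves like the one MER was tested with.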
Installation
pip install merpy
or
python setup.py install
Then you might want to update the MER scripts and download preprocessed data:
>>> import merpy
>>> merpy.download_mer()
>>> merpy.download_lexicons()
Basic Usage
>>> import merpy
>>> merpy.download_lexicons()
>>> merpy.process_lexicon("hp")
>>> document = 'Influenza, commonly known as "the flu", is an infectious disease caused by an influenza virus. Symptoms can be mild to severe. The most common symptoms include: a high fever, runny nose, sore throat, muscle pains, headache, coughing, and feeling tired'
>>> entities = merpy.get_entities(document, "hp")
>>> print(entities)
[['111', '115', 'mild', 'http://purl.obolibrary.org/obo/HP_0012825'], ['119', '125', 'severe', 'http://purl.obolibrary.org/obo/HP_0012828'], ['168', '173', 'fever', 'http://purl.obolibrary.org/obo/HP_0001945'], ['214', '222', 'headache', 'http://purl.obolibrary.org/obo/HP_0002315'], ['224', '232', 'coughing', 'http://purl.obolibrary.org/obo/HP_0012735'], ['246', '251', 'tired', 'http://purl.obolibrary.org/obo/HP_0012378'], ['175', '185', 'runny nose', 'http://purl.obolibrary.org/obo/HP_0031417']]
>>> lexicons = merpy.get_lexicons()
>>> merpy.show_lexicons()
lexicons preloaded:
['lexicon', 'go', 'cell_line_and_cell_type', 'chebi_lite', 'chemical', 'hp', 'disease', 'wordnet_nouns', 'hpo', 'radlex', 'doid', 'protein', 'hpomultilang', 'tissue_and_organ', 'mirna', 'subcellular_structure']
lexicons loaded ready to use:
['lexicon', 'doid', 'hp']
lexicons with linked concepts:
['doid', 'hp', 'go', 'chebi_lite', 'lexicon']
>>> merpy.create_lexicon(["gene1", "gene2", "gene3"], "genelist")
wrote genelist lexicon
>>> merpy.process_lexicon("genelist")
>>> merpy.download_lexicon("https://github.com/lasigeBioTM/MER/raw/biocreative2017/data/ChEBI.txt", "chebi")
wrote chebi lexicon
>>> merpy.process_lexicon("chebi")
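In the session above, merpy.get_entities returns each annotation as a list of strings, and the rows are not ordered by position. The sketch below shows one possible way to turn that output into typed records sorted by offset. It does not call merpy: it reuses the document and the printed result from the example. The names `Entity` and `to_entities` are hypothetical. Treating the offsets as a 0-based start and an exclusive end is an assumption that holds for this example, and the handling of rows with fewer fields is a defensive guess rather than documented behaviour.

```python
from typing import NamedTuple, Optional


class Entity(NamedTuple):
    start: int          # offset of the first character (0-based in this example)
    end: int            # offset just past the last character
    text: str           # the matched term
    uri: Optional[str]  # linked concept, if the lexicon has links


def to_entities(rows):
    """Convert rows of strings into Entity records sorted by position.

    Assumption: rows with fewer than three fields carry no annotation and
    are skipped; rows with only three fields have no linked concept.
    """
    entities = []
    for row in rows:
        if len(row) < 3:
            continue
        uri = row[3] if len(row) > 3 else None
        entities.append(Entity(int(row[0]), int(row[1]), row[2], uri))
    return sorted(entities, key=lambda e: (e.start, e.end))


# Document and output copied from the Basic Usage session.
document = 'Influenza, commonly known as "the flu", is an infectious disease caused by an influenza virus. Symptoms can be mild to severe. The most common symptoms include: a high fever, runny nose, sore throat, muscle pains, headache, coughing, and feeling tired'
raw_rows = [
    ['111', '115', 'mild', 'http://purl.obolibrary.org/obo/HP_0012825'],
    ['119', '125', 'severe', 'http://purl.obolibrary.org/obo/HP_0012828'],
    ['168', '173', 'fever', 'http://purl.obolibrary.org/obo/HP_0001945'],
    ['214', '222', 'headache', 'http://purl.obolibrary.org/obo/HP_0002315'],
    ['224', '232', 'coughing', 'http://purl.obolibrary.org/obo/HP_0012735'],
    ['246', '251', 'tired', 'http://purl.obolibrary.org/obo/HP_0012378'],
    ['175', '185', 'runny nose', 'http://purl.obolibrary.org/obo/HP_0031417'],
]

entities = to_entities(raw_rows)
for e in entities:
    # Slicing the document with the offsets should give back the term.
    assert document[e.start:e.end] == e.text
    print(e.start, e.end, e.text, e.uri.rsplit("/", 1)[-1])
```

Sorting by offset puts "runny nose" between "fever" and "headache", which matches the reading order of the text.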