OntoGene's Biomedical Entity Recogniser
Project description
OGER: OntoGene's Biomedical Entity Recogniser
A flexible, dictionary-based system for extracting biomedical terms from scientific articles.
Demo
A demo version of OGER is hosted at https://pub.cl.uzh.ch/purl/OGER.
Installation
Install OGER from its repository using pip:
pip install git+https://github.com/OntoGene/OGER.git
Make sure you use Python 3's pip (eg. pip3
).
Python 2.x is not supported.
Note: By default, pip
installs Python packages at the system level, which typically requires root/admin privileges.
To install OGER to a user-owned location, set the --user
flag.
Usage: Command-line Tool
The installation process should provide you with an executable oger
, which is the common starting point for a number of command-line tools.
Type oger
CMD
, followed by command-specific options, to run the desired tool:
oger run # run an annotation job
oger serve # start a REST API server
oger eval # determine annotation accurateness
oger test # run software tests
oger version # print the version number
To see a list of available options for each command, run eg. oger serve -h
.
As an alternative to the oger
executable, you may run python3 -m oger
CMD
.
Usage: Python Library
Get config and start a pipeline server:
>>> from oger.ctrl.router import Router, PipelineServer
>>> conf = Router(termlist_path='testfiles/test_terms.tsv')
>>> pl = PipelineServer(conf)
Note: test files can be downloaded here.
Load a text document from disk (using the included test suite):
>>> doc = pl.load_one('testfiles/txt/13373697.txt', 'txt')
>>> doc
<Article with 1 subelement at 0x7f8b2135d860>
>>> doc.text
'[The kind and the measure of ventilation disorders in tuberculous bronchostenosis in relation to its localization]. \n'
Download a collection of articles from PubMed:
>>> coll = pl.load_one(['21436587', '21436588'], fmt='pubmed')
>>> coll
<Collection with 2 subelements at 0x7f8b215f4cc0>
>>> coll[0]
<Article with 2 subelements at 0x7f8b2156a5f8>
>>> coll[0][0]
<Section with 1 subelement at 0x7f8b2156a358>
>>> coll[0][0].text
'Human prostate cancer metastases target the hematopoietic stem cell niche to establish footholds in mouse bone marrow.\n'
Run entity recognition:
>>> pl.process(coll)
>>> entity = next(coll[0].iter_entities())
>>> entity.text, entity.start, entity.end
('Human', 0, 5)
>>> entity.cid
'9606'
>>> entity.info
('organism', 'Homo sapiens', 'NCBI Taxonomy', '9606', 'DC')
Export to disk:
>>> with open('output/collection.json', 'w', encoding='utf8') as f:
... pl.write(coll, 'bioc_json', f)
The second argument specifies the output format.
OGER supports BioC (XML and JSON), PubTator, PubAnnotation JSON, BioNLP stand-off, and CoNLL, among others.
A full list of available formats is given here (see the export-format
parameter).
Documentation
Documentation is maintained in the GitHub wiki.
Prerequisites
OGER runs on Python 3.4+.
The following third-party libraries need to be installed (pip should take care of this):
Publications
If you use OGER in an academic context, please cite us:
Lenz Furrer, Anna Jancso, Nicola Colic, and Fabio Rinaldi (2019): OGER++: hybrid multi-type entity recognition. In: Journal of Cheminformatics 11:7. DOI: 10.1186/s13321-018-0326-3 | PDF | bibtex |
Lenz Furrer and Fabio Rinaldi (2017): OGER: OntoGene's Entity Recogniser in the BeCalm TIPS Task. In: Proceedings of the BioCreative V.5 Challenge Evaluation Workshop, pp. 175–182. | PDF | bibtex |
Marco Basaldella, Lenz Furrer, Carlo Tasso, and Fabio Rinaldi (2017): Entity recognition in the biomedical domain using a hybrid approach. In: Journal of Biomedical Semantics 8:51. DOI: 10.1186/s13326-017-0157-6 | PDF | bibtex |
License
OGER offers a dual licensing model.
You can redistribute OGER and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
The GNU Affero General Public License is designed to ensure that if a modified version is distributed or made accessible on a server (e.g. in a SaaS offering), the modified source code becomes available to the community.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see http://www.gnu.org/licenses/.
If you wish to use OGER under alternate terms, you may obtain a commercial license to OGER. Please contact us for more information (http://www.ontogene.org).
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file OGER-1.5.tar.gz
.
File metadata
- Download URL: OGER-1.5.tar.gz
- Upload date:
- Size: 679.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.8.10
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | df589b8d557c487dafd6d550efc90bc309466f0927b39d7191bd345b805c134e |
|
MD5 | 1368f5e618197e221a9eb2520ecb550f |
|
BLAKE2b-256 | 0d47001e99a3b06ac9e4b9de42f4bf370410e70cdf89cbf888573a594feb994c |
File details
Details for the file OGER-1.5-py3-none-any.whl
.
File metadata
- Download URL: OGER-1.5-py3-none-any.whl
- Upload date:
- Size: 760.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.8.10
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 825cf872faa70fe371747ca2a7dffc0f9749e37298ec25bc4eb91b647c3eda06 |
|
MD5 | 8e81059c170004a802744180be062ced |
|
BLAKE2b-256 | 43656d0b8dbc1bc88c806c1fb43c6cadeb728fbb11a0c11d525ccbc3716f3049 |