OntoGPT
Introduction
OntoGPT is a Python package for extracting structured information from text with large language models (LLMs), instruction prompts, and ontology-based grounding.
For more details, please see the full documentation.
Quick Start
OntoGPT runs on the command line, though there's also a minimal web app interface (see the Web Application section below).
- Ensure you have Python 3.9 or greater installed.
- Install with pip:
    pip install ontogpt
- Set your OpenAI API key:
    runoak set-apikey -e openai <your openai api key>
- See the list of all OntoGPT commands:
    ontogpt --help
- Try a simple example of information extraction:
    echo "One treatment for high blood pressure is carvedilol." > example.txt
    ontogpt extract -i example.txt -t drug
OntoGPT will retrieve the necessary ontologies and output results to the command line. Your output will provide all extracted objects under the heading extracted_object.
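For readers who would rather drive the example from a script, here is a minimal Python sketch of the same steps. It only wraps the `ontogpt extract -i ... -t ...` command shown above using the standard library; it does not use any OntoGPT Python API. The helper names (`python_is_supported`, `build_extract_command`, `run_example`) are hypothetical and exist only for this illustration, and the sketch assumes the OpenAI API key has already been set as described in the steps above.

```python
import shutil
import subprocess
import sys
from pathlib import Path

MIN_PYTHON = (3, 9)  # minimum version stated in the Quick Start


def python_is_supported(version_info=sys.version_info):
    """Return True if the given version meets the stated minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def build_extract_command(input_path, template):
    """Build the argument list for `ontogpt extract -i <file> -t <template>`."""
    return ["ontogpt", "extract", "-i", str(input_path), "-t", template]


def run_example(workdir="."):
    """Write the example sentence to a file and run the extraction on it."""
    if not python_is_supported():
        raise RuntimeError("OntoGPT requires Python 3.9 or greater")
    if shutil.which("ontogpt") is None:
        raise RuntimeError("ontogpt not found on PATH; try `pip install ontogpt`")

    # Same text that the `echo ... > example.txt` step writes.
    example = Path(workdir) / "example.txt"
    example.write_text("One treatment for high blood pressure is carvedilol.\n")

    # check=True raises CalledProcessError if the command exits with an error.
    result = subprocess.run(
        build_extract_command(example, "drug"),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_example())
```

Passing the command as a list rather than a single shell string avoids quoting problems if the input path contains spaces. The returned text is whatever the command prints, so the extracted objects should appear under extracted_object as described above.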
Web Application
There is a bare-bones web application for running OntoGPT and viewing results.
First, install the required dependencies with pip by running the following command:
pip install ontogpt[web]
Then run this command to start the web application:
web-ontogpt
NOTE: We do not recommend hosting this webapp publicly without authentication.
Evaluations
OntoGPT's functions have been evaluated on test data. Please see the full documentation for details on these evaluations and how to reproduce them.
Related Projects
- TALISMAN, a tool for generating summaries of functions enriched within a gene set. TALISMAN uses OntoGPT to work with LLMs.
Tutorials and Presentations
- Presentation: "Staying grounded: assembling structured biological knowledge with help from large language models" - presented by Harry Caufield as part of the AgBioData Consortium webinar series (September 2023)
- Presentation: "Transforming unstructured biomedical texts with large language models" - presented by Harry Caufield as part of the BOSC track at ISMB/ECCB 2023 (July 2023)
- Presentation: "OntoGPT: A framework for working with ontologies and large language models" - talk by Chris Mungall at Joint Food Ontology Workgroup (May 2023)
Citation
The information extraction approach used in OntoGPT, SPIRES, is described further in: Caufield JH, Hegde H, Emonet V, Harris NL, Joachimiak MP, Matentzoglu N, et al. Structured prompt interrogation and recursive extraction of semantics (SPIRES): A method for populating knowledge bases using zero-shot learning. Bioinformatics, Volume 40, Issue 3, March 2024, btae104, https://doi.org/10.1093/bioinformatics/btae104.
Acknowledgements
This project is part of the Monarch Initiative. We also gratefully acknowledge Bosch Research for their support of this research project.