Ontology Access Kit: Python library for common ontology operations over a variety of backends

Project description

Ontology Access Kit (OAK)

Python library for common ontology operations over a variety of backends.

OAK provides a collection of interfaces for various ontology operations, including:

  • look up basic features of an ontology element, such as its label, definition, relationships, or aliases
  • search an ontology for a term
  • validate an ontology
  • modify or delete terms
  • generate and visualize subgraphs
  • identify lexical matches and export as SSSOM mapping tables
  • perform more advanced operations, such as graph traversal, OWL axiom processing, or text annotation

These interfaces are separated from any particular backend, for which there are a number of different adapters. This means the same Python API and command line can be used regardless of whether the ontology:

  • is served by a remote API such as OLS or BioPortal
  • is present locally on the filesystem in owl, obo, obojson, or sqlite formats
  • is to be downloaded from a remote repository such as the OBO library
  • is queried from a remote database, including SPARQL endpoints (Ontobee/Ubergraph), a SQL database, or a Solr/ES endpoint
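
The separation between interface and backend described above can be illustrated with a small, self-contained sketch. It does not use OAK itself: `LabelLookup`, `InMemoryAdapter`, `RowAdapter`, and `describe` are hypothetical names invented for this illustration, and the real OAK class hierarchy is richer. The point is only that calling code written against an interface does not change when the backend does.

```python
from typing import Optional, Protocol


class LabelLookup(Protocol):
    """Hypothetical minimal interface: anything that maps a term ID to a label."""

    def label(self, curie: str) -> Optional[str]: ...


class InMemoryAdapter:
    """Toy backend that keeps labels in a dict (stands in for a local file)."""

    def __init__(self, labels: dict[str, str]) -> None:
        self._labels = labels

    def label(self, curie: str) -> Optional[str]:
        return self._labels.get(curie)


class RowAdapter:
    """Toy backend that scans (id, label) rows (stands in for a database table)."""

    def __init__(self, rows: list[tuple[str, str]]) -> None:
        self._rows = rows

    def label(self, curie: str) -> Optional[str]:
        for row_id, row_label in self._rows:
            if row_id == curie:
                return row_label
        return None


def describe(adapter: LabelLookup, curie: str) -> str:
    """Caller code depends only on the interface, not on a concrete backend."""
    return f"{curie} ! {adapter.label(curie)}"


backends = [
    InMemoryAdapter({"CL:0000540": "neuron"}),
    RowAdapter([("CL:0000540", "neuron")]),
]
results = [describe(backend, "CL:0000540") for backend in backends]
print(results)
```

Both toy backends produce the same line for the same term, which is the property the adapter design is meant to give users of the Python API and the command line.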

Documentation:

Contributing

See the contribution guidelines at CONTRIBUTING.md. All contributors are expected to uphold our Code of Conduct.

Usage

from oaklib import get_adapter
from oaklib.datamodels.vocabulary import IS_A, PART_OF
from oaklib.interfaces import OboGraphInterface

# connect to the CL sqlite database adapter
# (will first download if not already downloaded)
adapter = get_adapter("sqlite:obo:cl")

NEURON = "CL:0000540"

print('## Basic info')
print(f'ID: {NEURON}')
print(f'Label: {adapter.label(NEURON)}')

for alias in adapter.entity_aliases(NEURON):
    print(f'Alias: {alias}')

print('## Relationships (direct)')
# each relationship is unpacked as a (subject, predicate, object) triple
for _subject, predicate, obj in adapter.relationships([NEURON]):
    print(f' * {predicate} -> {obj} "{adapter.label(obj)}"')

print('## Ancestors (over IS_A and PART_OF)')
# ancestor traversal is a graph operation, so check that the adapter supports it
if not isinstance(adapter, OboGraphInterface):
    raise ValueError('This adapter does not support graph operations')

for ancestor in adapter.ancestors(NEURON, predicates=[IS_A, PART_OF]):
    print(f' * ANCESTOR: "{adapter.label(ancestor)}"')
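
To make the idea of "ancestors over IS_A and PART_OF" concrete, here is a minimal sketch of a predicate-restricted traversal over a toy edge list. It is not OAK's implementation: the `EX:` identifiers and the `toy_ancestors` function are made up for illustration, and this sketch leaves the start node out of the result, which may differ from what a given library returns.

```python
from collections import deque

# Predicate identifiers (is-a, part-of, has-part)
IS_A = "rdfs:subClassOf"
PART_OF = "BFO:0000050"
HAS_PART = "BFO:0000051"

# Toy graph as (subject, predicate, object) triples; all EX: IDs are hypothetical
EDGES = [
    ("EX:neuron", IS_A, "EX:cell"),
    ("EX:cell", PART_OF, "EX:organism"),
    ("EX:neuron", HAS_PART, "EX:axon"),
]


def toy_ancestors(start, predicates, edges=EDGES):
    """Breadth-first walk upward from start, following only the given predicates."""
    allowed = set(predicates)
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for subject, predicate, obj in edges:
            if subject == node and predicate in allowed and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return seen


over_both = toy_ancestors("EX:neuron", [IS_A, PART_OF])
over_is_a = toy_ancestors("EX:neuron", [IS_A])
print(sorted(over_both))
print(sorted(over_is_a))
```

Restricting the predicate list changes the answer: the has-part edge is never followed, and dropping PART_OF removes everything reachable only through a part-of edge.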

For more examples, see

Command Line

See:

Search

Use the pronto backend to fetch and parse an ontology from the OBO library, then use the search command:

runoak -i obolibrary:pato.obo search osmol 

Returns:

PATO:0001655 ! osmolarity
PATO:0001656 ! decreased osmolarity
PATO:0001657 ! increased osmolarity
PATO:0002027 ! osmolality
PATO:0002028 ! decreased osmolality
PATO:0002029 ! increased osmolality
PATO:0045034 ! normal osmolality
PATO:0045035 ! normal osmolarity
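
When search results like these are post-processed in a script, each line can be split at the ` ! ` separator. The following is a small sketch that assumes the `ID ! label` layout shown above; `parse_search_line` is a helper invented here, not part of OAK.

```python
def parse_search_line(line: str) -> tuple[str, str]:
    """Split an 'ID ! label' line into an (id, label) pair."""
    term_id, _, label = line.partition(" ! ")
    return term_id.strip(), label.strip()


# A few lines copied from the search output above
output = """\
PATO:0001655 ! osmolarity
PATO:0001656 ! decreased osmolarity
PATO:0002027 ! osmolality
"""

labels = dict(parse_search_line(line) for line in output.splitlines() if line.strip())
print(labels)
```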

QC and Validation

Perform validation on PR using a sqlite/rdftab instance:

runoak -i sqlite:../semantic-sql/db/pr.db validate

List all terms

List all terms in Mondo, fetched from the OBO library:

runoak -i obolibrary:mondo.obo terms 

Lexical index

Make a lexical index of all terms in Mondo:

runoak  -i obolibrary:mondo.obo lexmatch -L mondo.index.yaml
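
As a rough intuition for what a lexical index is, the sketch below groups term IDs by a normalized form of their label, so that terms sharing a label can be found with one lookup. This is a toy illustration only: it does not reproduce the structure of `mondo.index.yaml` or OAK's normalization rules, and `normalize` and `build_index` are hypothetical helpers. The term IDs and labels are taken from the search outputs shown elsewhere on this page.

```python
from collections import defaultdict


def normalize(label: str) -> str:
    """Toy normalization: lowercase and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def build_index(terms):
    """Map each normalized label to the set of term IDs that carry it."""
    index = defaultdict(set)
    for term_id, label in terms:
        index[normalize(label)].add(term_id)
    return index


TERMS = [
    ("BTO:0001357", "tentacle"),
    ("CEPH:0000256", "tentacle"),
    ("http://www.projecthalo.com/aura#Tentacle", "Tentacle"),
    ("UBERON:0013206", "nasal tentacle"),
]

index = build_index(TERMS)
# Keys shared by more than one term are candidate lexical matches
matches = {key: sorted(ids) for key, ids in index.items() if len(ids) > 1}
print(matches)
```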

Search

Searching over OBO using ontobee:

runoak  -i ontobee: search tentacle

yields:

http://purl.obolibrary.org/obo/CEPH_0000256 ! tentacle
http://purl.obolibrary.org/obo/CEPH_0000257 ! tentacle absence
http://purl.obolibrary.org/obo/CEPH_0000258 ! tentacle pad
...
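
The Ontobee results above are full OBO PURLs, while other backends on this page report the same terms as CURIEs such as CEPH:0000256. A minimal sketch of converting between the two forms follows; it assumes the `http://purl.obolibrary.org/obo/PREFIX_LOCALID` pattern visible in the output, and `contract` and `expand` are hypothetical helpers rather than OAK functions (real prefix handling uses configurable prefix maps).

```python
OBO_PURL = "http://purl.obolibrary.org/obo/"


def contract(uri: str) -> str:
    """Turn an OBO PURL into a CURIE; leave anything else unchanged."""
    if not uri.startswith(OBO_PURL):
        return uri
    # Split at the last underscore so the local ID is the trailing part
    prefix, sep, local_id = uri[len(OBO_PURL):].rpartition("_")
    if not sep or not prefix:
        return uri
    return f"{prefix}:{local_id}"


def expand(curie: str) -> str:
    """Turn a CURIE into an OBO PURL, assuming the prefix is an OBO ontology."""
    if curie.startswith(("http://", "https://")):
        return curie
    prefix, sep, local_id = curie.partition(":")
    if not sep:
        return curie
    return f"{OBO_PURL}{prefix}_{local_id}"


print(contract("http://purl.obolibrary.org/obo/CEPH_0000256"))
print(expand("CEPH:0000256"))
```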

Searching over a broader set of ontologies in BioPortal (requires an API key; see https://www.bioontology.org/wiki/BioPortal_Help#Getting_an_API_key):

runoak set-apikey bioportal YOUR-KEY-HERE
runoak  -i bioportal: search tentacle

yields:

BTO:0001357 ! tentacle
http://purl.jp/bio/4/id/200906071014668510 ! tentacle
CEPH:0000256 ! tentacle
http://www.projecthalo.com/aura#Tentacle ! Tentacle
CEPH:0000256 ! tentacle
...

Alternatively, you can add "BIOPORTAL_API_KEY" to your environment variables.

Searching over a more limited set of ontologies in Ubergraph:

runoak -v -i ubergraph: search tentacle

yields:

UBERON:0013206 ! nasal tentacle

Annotating Texts

runoak  -i bioportal: annotate neuron from CA4 region of hippocampus of mouse

yields:

object_id: CL:0000540
object_label: neuron
object_source: https://data.bioontology.org/ontologies/NIFDYS
match_type: PREF
subject_start: 1
subject_end: 6
subject_label: NEURON

object_id: http://www.co-ode.org/ontologies/galen#Neuron
object_label: Neuron
object_source: https://data.bioontology.org/ontologies/GALEN
match_type: PREF
subject_start: 1
subject_end: 6
subject_label: NEURON

...
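
The `subject_start` and `subject_end` fields locate each match in the input text. The sketch below parses the two records shown above and recovers the matched span. It assumes the offsets are 1-based and inclusive, which is inferred from this example alone (positions 1 to 6 cover "neuron") rather than from a specification; `parse_records` and `matched_span` are hypothetical helpers.

```python
TEXT = "neuron from CA4 region of hippocampus of mouse"

# The two annotation records shown above, as printed by the command
RAW = """\
object_id: CL:0000540
object_label: neuron
object_source: https://data.bioontology.org/ontologies/NIFDYS
match_type: PREF
subject_start: 1
subject_end: 6
subject_label: NEURON

object_id: http://www.co-ode.org/ontologies/galen#Neuron
object_label: Neuron
object_source: https://data.bioontology.org/ontologies/GALEN
match_type: PREF
subject_start: 1
subject_end: 6
subject_label: NEURON
"""


def parse_records(raw: str) -> list[dict[str, str]]:
    """Split blank-line separated 'key: value' blocks into dicts."""
    records = []
    for block in raw.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            key, _, value = line.partition(": ")
            record[key] = value
        records.append(record)
    return records


def matched_span(text: str, record: dict[str, str]) -> str:
    """Slice the text using offsets assumed to be 1-based and inclusive."""
    start = int(record["subject_start"])
    end = int(record["subject_end"])
    return text[start - 1:end]


records = parse_records(RAW)
spans = [matched_span(TEXT, record) for record in records]
print(spans)
```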

Mapping

Create a SSSOM mapping file for a set of ontologies:

robot merge -I http://purl.obolibrary.org/obo/hp.owl -I http://purl.obolibrary.org/obo/mp.owl convert --check false -o hp-mp.obo
runoak lexmatch -i hp-mp.obo -o hp-mp.sssom.tsv
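
A file such as hp-mp.sssom.tsv can be read with the standard library alone. The sketch below assumes the usual SSSOM TSV layout, in which metadata lines start with `#` and are followed by a tab-separated table; the sample rows use made-up `EX1:` and `EX2:` identifiers and are not real lexmatch output. `read_sssom` is a hypothetical helper.

```python
import csv
import io

# Illustrative stand-in for an SSSOM TSV file; all values are made up
SAMPLE = """\
# mapping_set_id: https://example.org/mappings/demo
# license: https://example.org/license
subject_id\tpredicate_id\tobject_id\tmapping_justification
EX1:001\tskos:closeMatch\tEX2:900\tsemapv:LexicalMatching
EX1:002\tskos:exactMatch\tEX2:901\tsemapv:LexicalMatching
"""


def read_sssom(handle) -> list[dict[str, str]]:
    """Skip '#'-prefixed metadata lines, then parse the remaining TSV table."""
    table_lines = (line for line in handle if not line.startswith("#"))
    return list(csv.DictReader(table_lines, delimiter="\t"))


mappings = read_sssom(io.StringIO(SAMPLE))
for row in mappings:
    print(row["subject_id"], row["predicate_id"], row["object_id"])
```

To read a real file, pass an open file handle instead of the `io.StringIO` wrapper, for example `open("hp-mp.sssom.tsv", newline="")`.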

Visualization of ancestor graphs

Use the sqlite backend to visualize the ancestor graph of 'vacuole' (GO:0005773), using the test ontology sqlite database:

runoak -i sqlite:tests/input/go-nucleus.db  viz GO:0005773

Same using ubergraph, restricting to is-a and part-of:

runoak -i ubergraph:  viz GO:0005773 -p i,BFO:0000050

Same using pronto, fetching the ontology from obolibrary:

runoak -i obolibrary:go.obo  viz GO:0005773
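
The `-i` values used throughout this page share a common shape: an optional scheme naming the backend, a colon, and a backend-specific resource. The sketch below splits such a value at the first colon. It is an illustration of the pattern only, not OAK's actual selector parser, and `Selector` and `parse_selector` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Selector(NamedTuple):
    scheme: Optional[str]  # backend name, or None when no scheme is given
    resource: str          # whatever follows the first colon


def parse_selector(value: str) -> Selector:
    """Split 'scheme:resource' at the first colon; bare paths have no scheme."""
    scheme, sep, rest = value.partition(":")
    if not sep:
        return Selector(None, value)
    return Selector(scheme, rest)


# Selector strings that appear in the commands on this page
examples = [
    "sqlite:obo:cl",
    "obolibrary:go.obo",
    "ubergraph:",
    "sqlite:tests/input/go-nucleus.db",
    "hp-mp.obo",
]
parsed = [parse_selector(value) for value in examples]
for selector in parsed:
    print(selector)
```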

Configuration

OAK uses pystow for caching. By default, this goes inside ~/.data/, but can be configured following these instructions.
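
To see where cached files are likely to land, the following sketch mirrors the lookup order as I understand it from pystow's documentation: a `PYSTOW_HOME` environment variable, if set, gives the full base directory; otherwise the base is a directory in the user's home whose name defaults to `.data` and can be changed with `PYSTOW_NAME`. These variable names are an assumption to check against the pystow documentation, and `cache_base` is a hypothetical helper that does not call pystow.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def cache_base(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Path:
    """Resolve the assumed base cache directory from environment settings."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    # Assumption: PYSTOW_HOME overrides the whole base directory
    if env.get("PYSTOW_HOME"):
        return Path(env["PYSTOW_HOME"])
    # Assumption: PYSTOW_NAME renames the default '.data' directory
    return home / env.get("PYSTOW_NAME", ".data")


default_dir = cache_base({}, Path("/home/user"))
custom_dir = cache_base({"PYSTOW_HOME": "/tmp/cache"}, Path("/home/user"))
print(default_dir)
print(custom_dir)
```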

Download files

Source Distribution

oaklib-0.7.0rc6.tar.gz (30.8 MB)

Uploaded Source

Built Distribution

oaklib-0.7.0rc6-py3-none-any.whl (704.0 kB)

Uploaded Python 3

File details

Details for the file oaklib-0.7.0rc6.tar.gz.

File metadata

  • Download URL: oaklib-0.7.0rc6.tar.gz
  • Size: 30.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for oaklib-0.7.0rc6.tar.gz
Algorithm Hash digest
SHA256 5ce253120d60cd6500e3e6362b38ca8a488517be6eb5674a3ad82f0024dcbe83
MD5 0290baf52f27eb4be8e8f3705bb8c4f6
BLAKE2b-256 4d57fe98bcf14af7a3255a0e37707025b437235d48793ca4fdb0a52d9522afb0

Provenance

The following attestation bundles were made for oaklib-0.7.0rc6.tar.gz:

Publisher: pypi-publish.yaml on INCATools/ontology-access-kit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file oaklib-0.7.0rc6-py3-none-any.whl.

File metadata

  • Download URL: oaklib-0.7.0rc6-py3-none-any.whl
  • Size: 704.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for oaklib-0.7.0rc6-py3-none-any.whl
Algorithm Hash digest
SHA256 a7b5a904923e43ea08f2c2d08152c787eb7bedb2cd5eb0dd4fc42a6332fc3cbb
MD5 e0cc65fb50da81821f1b629c4541fcd9
BLAKE2b-256 d356b687a794dc0b5704a7f50d6a2ef25c32ae933c59f74e025480c4a484f4b9

Provenance

The following attestation bundles were made for oaklib-0.7.0rc6-py3-none-any.whl:

Publisher: pypi-publish.yaml on INCATools/ontology-access-kit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
