Parse OBO formatted ontologies into networkx

obonet: load OBO-formatted ontologies into networkx

Read OBO-formatted ontologies in Python. obonet is:

  • user friendly
  • succinct
  • pythonic
  • modern
  • simple and tested
  • lightweight
  • networkx leveraging

This Python package loads OBO-serialized ontologies into networks. The function obonet.read_obo() takes an .obo file and returns a networkx.MultiDiGraph representation of the ontology. The parser was designed for versions 1.2 and 1.4 of the OBO specification.

Usage

See pyproject.toml for the minimum Python version required and the dependencies. OBO files can be read from a path, URL, or open file handle. Compression is inferred from the path's extension. See example usage below:

import networkx
import obonet

# Read the taxrank ontology
url = 'https://github.com/dhimmel/obonet/raw/main/tests/data/taxrank.obo'
graph = obonet.read_obo(url)

# Or read the xz-compressed taxrank ontology
url = 'https://github.com/dhimmel/obonet/raw/main/tests/data/taxrank.obo.xz'
graph = obonet.read_obo(url)

# Number of nodes
len(graph)

# Number of edges
graph.number_of_edges()

# Check if the ontology is a DAG
networkx.is_directed_acyclic_graph(graph)

# Mapping from term ID to name
id_to_name = {id_: data.get('name') for id_, data in graph.nodes(data=True)}
id_to_name['TAXRANK:0000006']  # TAXRANK:0000006 is species

# Find all superterms of species. Note that networkx.descendants gets
# superterms, while networkx.ancestors returns subterms.
networkx.descendants(graph, 'TAXRANK:0000006')

# Include parsed OBO clauses to preserve comments and trailing modifiers
graph = obonet.read_obo(url, include_clauses=True)
graph.nodes['TAXRANK:0000060']['_clauses']['is_a'][0]
# output preserves the OBO trailing comment after "!":
# {
#     'tag': 'is_a',
#     'value': 'TAXRANK:0000000',
#     'trailing_modifier': None,
#     'comment': 'taxonomic_rank',
# }

For a more detailed tutorial, see the Gene Ontology example notebook.

OBO files can also be converted to NetworkX node-link JSON from the command line:

uvx obonet tests/data/taxrank.obo --include-clauses --output=taxrank.json
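For readers unfamiliar with node-link JSON, here is a minimal sketch of the round trip using only networkx and the standard library. The toy graph is built by hand to resemble what obonet produces; the exact keys and layout written by the obonet command line are not assumed here and may differ.

```python
import json

import networkx

# Hand-built toy graph (hypothetical IDs), shaped like an obonet graph:
# edges go from subterm to superterm and are keyed by relationship type.
graph = networkx.MultiDiGraph(name="toy")
graph.add_node("TOY:1", name="root")
graph.add_node("TOY:2", name="child")
graph.add_edge("TOY:2", "TOY:1", key="is_a")

# Serialize to node-link data, pass it through JSON text, and restore it.
data = networkx.node_link_data(graph)
text = json.dumps(data)
restored = networkx.node_link_graph(json.loads(text))

print(restored.nodes(data=True))
print(list(restored.edges(keys=True)))
```

Some networkx versions emit a FutureWarning about the default name of the edge list in node-link data; the round trip itself still works because both calls use the same default.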

Comparison

This package specializes in reading OBO files into a networkx.MultiDiGraph. A more general ontology-to-NetworkX reader is available in the Python nxontology package via the nxontology.imports.pronto_to_multidigraph function. This function takes a pronto.Ontology object, which can be loaded from an OBO file, OBO Graphs JSON file, or Ontology Web Language 2 RDF/XML file (OWL). Using pronto_to_multidigraph allows creating a MultiDiGraph similar to the one created by obonet, with some differences in the amount of metadata retained.

The primary focus of the nxontology package is to provide an NXOntology class for representing ontologies based around a networkx.DiGraph. NXOntology provides optimized implementations for computing node similarity and other intrinsic ontology metrics. There are two important differences between a DiGraph for NXOntology and the MultiDiGraph produced by obonet:

  1. NXOntology is based on a DiGraph that does not allow multiple edges between the same two nodes. Multiple edges between the same two nodes must therefore be collapsed. By default, it only considers is a / rdfs:subClassOf relationships, but using pronto_to_multidigraph to create the NXOntology allows for retaining additional relationship types, like part of in the case of the Gene Ontology.

  2. NXOntology reverses the direction of relationships so edges go from superterm to subterm. Traditionally in ontologies, the is a relationships go from subterm to superterm, which can be confusing because graph functions such as ancestors then return more specific terms. NXOntology reverses edges so functions such as ancestors refer to more general concepts and descendants refer to more specific concepts.

The nxontology.imports.multidigraph_to_digraph function converts from a MultiDiGraph, like the one produced by obonet, to a DiGraph by filtering to the desired relationship types, reversing edges, and collapsing parallel edges.
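The three steps named above (filtering, reversing, collapsing) can be sketched in plain networkx. This is an illustration only and not the nxontology implementation: the function name collapse_to_digraph, its keep_keys parameter, and the toy graph are all hypothetical.

```python
import networkx


def collapse_to_digraph(multigraph, keep_keys=("is_a",)):
    """Hypothetical helper: filter by edge key, reverse, and collapse."""
    digraph = networkx.DiGraph()
    digraph.add_nodes_from(multigraph.nodes(data=True))
    for subterm, superterm, key in multigraph.edges(keys=True):
        if key not in keep_keys:
            continue  # filter: drop relationship types that were not requested
        # Reverse: superterm -> subterm. Collapse: a DiGraph stores at most
        # one edge per ordered node pair, so parallel edges merge here.
        digraph.add_edge(superterm, subterm)
    return digraph


# Toy graph in the obonet orientation (subterm -> superterm).
toy = networkx.MultiDiGraph()
toy.add_edge("TOY:2", "TOY:1", key="is_a")
toy.add_edge("TOY:2", "TOY:1", key="part_of")  # parallel edge, same node pair
toy.add_edge("TOY:3", "TOY:1", key="part_of")

only_is_a = collapse_to_digraph(toy)
both = collapse_to_digraph(toy, keep_keys=("is_a", "part_of"))

print(sorted(only_is_a.edges))
print(sorted(both.edges))

# After reversal, ancestors returns the more general terms.
print(networkx.ancestors(both, "TOY:2"))
```

Note that this sketch discards edge keys when collapsing, so the resulting DiGraph no longer records which relationship type linked two terms.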

Installation

The recommended approach is to install the latest release from PyPI using:

pip install obonet

However, if you'd like to install the most recent version from GitHub, use:

pip install git+https://github.com/dhimmel/obonet.git#egg=obonet

Contributing

We welcome feature suggestions and community contributions. Currently, only reading OBO files is supported.

Develop

Some development commands:

# install dependencies
uv sync --extra dev

# install git hooks
uv run prek install

# run all prek checks
uv run prek run --all-files

# run tests
uv run pytest

# generate changelog for release notes
git fetch --tags origin main
OLD_TAG=$(git describe --tags --abbrev=0)
git log --oneline --decorate=no --reverse $OLD_TAG..HEAD

Maintainers can make a new release at https://github.com/dhimmel/obonet/releases/new.
