
Client library for loading the OptimusKG biomedical knowledge graph from Harvard Dataverse.


License: MIT. Requires Python 3.12+.


Highlights

  • A modern biomedical knowledge graph with molecular, anatomical, clinical, and environmental modalities.
  • Integrates 65 heterogeneous resources grounded with 18 ontologies and controlled vocabularies using the BioCypher framework and the Biolink Model.
  • Contains 190,531 nodes across 10 entity types, 21,813,816 edges across 26 relation types, and 67,249,863 property instances encoding 110,276,843 values across 150 distinct property keys.
  • Independently validated using PaperQA3, a multimodal agent that retrieves and reasons over scientific literature.

Installation

pip install optimuskg
# Or with pipx.
pipx install optimuskg

Usage

The client fetches files from the gold layer with local caching, and supports loading the graph either as Polars DataFrames or as a NetworkX MultiDiGraph:

import optimuskg

# Download (once) and cache a file from the gold layer
path = optimuskg.get_file("nodes/gene.parquet")

# Load a single Parquet file
drugs = optimuskg.load_parquet("nodes/drug.parquet")

# Load nodes and edges (full graph or Largest Connected Component)
nodes, edges = optimuskg.load_graph(lcc=True)

# Load as NetworkX MultiDiGraph (JSON properties parsed)
G = optimuskg.load_networkx(lcc=True)
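
To make the lcc=True option concrete, here is a minimal standard-library sketch of what restricting a graph to its largest connected component means. It does not use or reproduce the optimuskg implementation: the largest_component helper, the toy edge list, and the choice to ignore edge direction (weak connectivity) are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def largest_component(edges):
    """Return the node set of the largest weakly connected component.

    Illustrative only: `edges` is a list of (source, target) pairs, and
    edge direction is ignored when deciding connectivity (an assumption).
    """
    # Build an undirected adjacency map from the directed edge list.
    neighbors = defaultdict(set)
    for source, target in edges:
        neighbors[source].add(target)
        neighbors[target].add(source)

    seen = set()
    best = set()
    for start in neighbors:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        # Keep the biggest component found so far.
        if len(component) > len(best):
            best = component
    return best


# Toy edge list with made-up identifiers (not real OptimusKG IDs).
toy_edges = [("g1", "d1"), ("d1", "g2"), ("g3", "d2")]
lcc_nodes = largest_component(toy_edges)
# Keep only the edges whose endpoints both lie in the largest component.
lcc_edges = [(s, t) for s, t in toy_edges if s in lcc_nodes and t in lcc_nodes]
```

In this toy graph the pair g3/d2 forms a separate, smaller component, so it is dropped along with its edge. With the real client, the filtering is done for you when lcc=True is passed.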

Configuration

Downloads are cached by default in platformdirs.user_cache_dir("optimuskg") (~/Library/Caches/optimuskg on macOS, ~/.cache/optimuskg on Linux). Override the location with the $OPTIMUSKG_CACHE_DIR environment variable or programmatically:

optimuskg.set_cache_dir("/path/to/cache")
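
The lookup described above can be pictured with the following standard-library sketch. It is not the client's code: resolve_cache_dir is a hypothetical helper, the precedence order (explicit path, then environment variable, then default) is an assumption the text does not confirm, and the fallback path stands in for the platformdirs default on Linux.

```python
import os
from pathlib import Path


def resolve_cache_dir(explicit=None, env=None):
    """Pick a cache directory: explicit argument, then env var, then default.

    Hypothetical helper; the precedence order and the fallback path are
    assumptions, not documented optimuskg behavior.
    """
    env = os.environ if env is None else env
    if explicit is not None:
        return Path(explicit).expanduser()
    from_env = env.get("OPTIMUSKG_CACHE_DIR")
    if from_env:
        return Path(from_env).expanduser()
    # Stand-in for platformdirs.user_cache_dir("optimuskg") on Linux.
    return Path.home() / ".cache" / "optimuskg"


# The env mapping is passed in explicitly so the example does not depend
# on the variables set in the current shell.
print(resolve_cache_dir(env={"OPTIMUSKG_CACHE_DIR": "/data/kg-cache"}))
print(resolve_cache_dir(env={}))
```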

To target a different dataset version (e.g., a pre-release), set the $OPTIMUSKG_DOI environment variable or call:

optimuskg.set_doi("doi:10.xxxx/XXXX")
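
The DOI string above is a placeholder and must be replaced with a real identifier. As a rough sanity check before passing a value to the client, a pattern test like the one below may help. This is a sketch under assumptions: looks_like_doi is a hypothetical helper, the "doi:" scheme prefix is inferred from the example above, and the 4 to 9 digit registrant rule is a common convention rather than something optimuskg is known to enforce.

```python
import re

# "doi:" scheme, the "10." directory indicator, a numeric registrant code,
# a slash, and a non-empty suffix without whitespace.
_DOI_PATTERN = re.compile(r"^doi:10\.\d{4,9}/\S+$")


def looks_like_doi(value):
    """Return True if `value` has the general shape doi:10.<digits>/<suffix>.

    Hypothetical helper: it checks the shape only and does not confirm
    that the DOI exists or points at an OptimusKG dataset.
    """
    return bool(_DOI_PATTERN.match(value))


# The suffix below is made up for illustration, not a real dataset DOI.
print(looks_like_doi("doi:10.1234/EXAMPLE/ABC123"))
# The unedited placeholder from the README fails the check.
print(looks_like_doi("doi:10.xxxx/XXXX"))
```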

Citation

If you use OptimusKG in your research, please cite:

@article{vittor2026optimuskg,
  title={OptimusKG: Unifying biomedical knowledge in a modern multimodal graph},
  author={Vittor, Lucas and Noori, Ayush and Arango, I{\~n}aki and Polonuer, Joaqu{\'\i}n and Rodriques, Sam and White, Andrew and Clifton, David A. and Zitnik, Marinka},
  journal={Nature Scientific Data},
  year={2026}
}

License

The optimuskg client is released under the MIT License. OptimusKG integrates multiple primary data resources, each of which is subject to its own license and terms of use. These terms may impose restrictions on redistribution, commercial use, or downstream applications of the resulting knowledge graph or its subsets. Some resources provide data under academic or noncommercial licenses, while others may impose attribution or usage requirements. As a result, use of OptimusKG may be partially restricted depending on the specific data components included in a given instantiation. Users are responsible for reviewing and complying with the license and terms of use of each primary dataset, as specified by the original data providers. OptimusKG does not alter or override these source-specific licensing conditions.

Made with ❤️ at Zitnik Lab, Harvard Medical School
