Convert iterable data to Semantic Knowledge Graphs, with a simple declarative mapping.
OntoWeaver
OntoWeaver is a tool for transforming iterable data (like tables) into Semantic Knowledge Graph (SKG) databases.
OntoWeaver lets you write a simple declarative mapping that expresses how columns from a table should be converted to typed nodes, edges or properties in an SKG.
SKG databases allow for an easy integration of very heterogeneous data, and OntoWeaver brings a reproducible approach to building them.
With OntoWeaver, you can easily write a script that automatically rebuilds an SKG from the input data each time you need it.
OntoWeaver has been tested on large-scale biomedical use cases (think: millions of nodes), and it is simple to operate for anyone with a basic knowledge of programming.
Basics
Mapping data
OntoWeaver provides a simple layer of abstraction on top of BioCypher, which remains responsible for doing the ontology alignment, supporting several graph database backends, and allowing reproducible & configurable builds.
With a pure BioCypher approach, you would have to write a whole adapter by hand. With OntoWeaver, you just have to express a mapping in YAML, which looks like this:
row: # The meaning of an entry in the input table.
    map:
        column: <column name in your CSV>
        to_subject: <ontology node type to use for representing a row>
transformers: # How to map cells to nodes and edges.
    - map: # Map a column to a node.
        column: <column name>
        to_object: <ontology node type to use for representing a column>
        via_relation: <edge type for linking subject and object nodes>
    - map: # Map a column to a property.
        column: <another name>
        to_property: <property name>
        for_object: <type holding the property>
metadata: # Optional properties added to every node and edge.
    - source: "My OntoWeaver adapter"
    - version: "v1.2.3"
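To make the idea behind such a mapping more concrete, here is a minimal, conceptual sketch in plain Python. It is not OntoWeaver's implementation and does not use its API: the `apply_mapping` function, the column names and the node and edge types are all hypothetical, chosen only to illustrate how a row subject, object nodes, edges and properties could be derived from a table.

```python
import csv
import io

# Hypothetical mapping, mirroring the YAML structure above as a Python dict.
# Column names and ontology types are made up for this illustration.
MAPPING = {
    "row": {"map": {"column": "patient_id", "to_subject": "patient"}},
    "transformers": [
        {"map": {"column": "variant", "to_object": "variant",
                 "via_relation": "patient_has_variant"}},
        {"map": {"column": "age", "to_property": "age",
                 "for_object": "patient"}},
    ],
}

CSV_DATA = """patient_id,variant,age
P1,BRCA1,54
P2,TP53,61
"""


def apply_mapping(rows, mapping):
    """Turn table rows into (nodes, edges), following the mapping."""
    nodes = {}  # (type, identifier) -> dict of properties
    edges = []  # (subject key, relation type, object key)
    subject = mapping["row"]["map"]
    rules = [transformer["map"] for transformer in mapping["transformers"]]
    for row in rows:
        # Each row yields one subject node.
        s_key = (subject["to_subject"], row[subject["column"]])
        nodes.setdefault(s_key, {})
        row_nodes = {subject["to_subject"]: s_key}
        # First pass: columns mapped to object nodes, linked to the subject.
        for rule in rules:
            if "to_object" in rule:
                o_key = (rule["to_object"], row[rule["column"]])
                nodes.setdefault(o_key, {})
                row_nodes[rule["to_object"]] = o_key
                edges.append((s_key, rule["via_relation"], o_key))
        # Second pass: columns mapped to properties of a node of this row.
        for rule in rules:
            if "to_property" in rule:
                target = row_nodes[rule["for_object"]]
                nodes[target][rule["to_property"]] = row[rule["column"]]
    return nodes, edges


nodes, edges = apply_mapping(csv.DictReader(io.StringIO(CSV_DATA)), MAPPING)
for key, properties in nodes.items():
    print("node:", key, properties)
for edge in edges:
    print("edge:", edge)
```

In this sketch, every row produces one subject node, each `to_object` rule adds a node plus an edge from the subject, and each `to_property` rule attaches a value to the node whose type is named in `for_object`.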
OntoWeaver can read anything that Pandas can load, which means a lot of tabular formats. It can also parse graphs from OWL files.
Usage
To configure your SKG, you need input data, a mapping (see above), and a BioCypher configuration: a schema.yaml and a biocypher_config.yaml.
In most cases, you will just need to call the ontoweave command to build
the SKG you prepared:
ontoweave my_data.csv:my_mapping.yaml --import-script-run
If you're using OntoWeaver from its Git repository, you will have to use UV:
uv run ontoweave data_A.csv:map_A.yaml data_B.tsv:map_B.yaml
The ontoweave command is highly configurable; see ontoweave --help for more
details.
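If you want to drive the command from a Python script, the data:mapping argument pairs shown above can be assembled programmatically. The sketch below is only an illustration: the helper name `build_ontoweave_command` and its parameters are hypothetical, and it relies solely on the command forms and the `--import-script-run` flag shown in this section.

```python
import shlex


def build_ontoweave_command(pairs, run_import_script=False, use_uv=False):
    """Assemble the argument list for an `ontoweave` call.

    `pairs` maps each data file to its mapping file (hypothetical helper,
    not part of OntoWeaver).
    """
    command = ["uv", "run", "ontoweave"] if use_uv else ["ontoweave"]
    # Each input is passed as <data file>:<mapping file>.
    command += [f"{data}:{mapping}" for data, mapping in pairs.items()]
    if run_import_script:
        command.append("--import-script-run")
    return command


command = build_ontoweave_command(
    {"data_A.csv": "map_A.yaml", "data_B.tsv": "map_B.yaml"},
    use_uv=True,
)
# Print the command as it would be typed in a shell.
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once OntoWeaver is installed.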
Detailed documentation with tutorials and a more detailed installation guide is available on the OntoWeaver website.
Installation
The project is written in Python and is tested with the UV environment manager. You can install the necessary dependencies in a virtual environment like this:
git clone https://github.com/oncodash/ontoweaver.git
cd ontoweaver
uv venv
uv pip install .
UV will create a virtual environment according to your configuration (either centrally or in the project folder).
You can then run any script by calling it directly (e.g. uv run ontoweave),
and it should just work. If you want to call scripts from anywhere in your
system, you will have to add the …/ontoweaver/src/ontoweaver directory to your PATH:
# Put this in your ~/.bashrc or ~/.zshrc
export PATH="$PATH:$HOME/<your path>/ontoweaver/src/ontoweaver/"
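To check whether the PATH change took effect, a small script using only the Python standard library can look up the executable. This is a minimal sketch: the `find_ontoweave` helper is a hypothetical name, not something OntoWeaver provides.

```python
import shutil


def find_ontoweave(executable="ontoweave"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


path = find_ontoweave()
if path is None:
    print("ontoweave was not found on PATH; "
          "try `uv run ontoweave` from the repository instead.")
else:
    print(f"ontoweave found at {path}")
```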
The package can also be used in a UV environment. Just run:
uv sync
UV will create a virtual environment according to your configuration, and you can call the CLI with:
uv run ./src/ontoweaver/ontoweave --help
Theoretically, OntoWeaver can export a knowledge graph in any of the formats supported by BioCypher (Neo4j, ArangoDB, CSV, RDF, PostgreSQL, SQLite, NetworkX, … see BioCypher's documentation).
Development
Tests
Tests are located in the tests/ subdirectory and may be a good starting point
to see OntoWeaver in practice. You may start with tests/test_simplest.py which
shows the simplest example of mapping tabular data through BioCypher.
To run tests, use pytest:
uv run pytest
Contributing
If you have questions or suggestions for improvement, feel free to open an issue or a pull request!