AutoRDF2GML: A Framework for Transforming RDF Data into Graph Representations for Graph Machine Learning.
AutoRDF2GML
Overview
AutoRDF2GML is a framework designed to transform RDF data into graph representations suitable for graph-based machine learning methods, e.g., Graph Neural Networks (GNNs). It uniquely generates content-based features from RDF datatype properties and topology-based features from RDF object properties, enabling the effective integration of Semantic Web technologies with Graph Machine Learning.
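The split between datatype properties and object properties can be illustrated with a small sketch. This is not AutoRDF2GML's implementation; it only assumes simple N-Triples lines, where a literal object starts with a double quote (a datatype property, the source of content-based features) and any other object is an IRI or blank node (an object property, the source of topology-based features). The function names and the `example.org` IRIs are made up for illustration.

```python
def split_triple(line):
    """Split one simple N-Triples line into (subject, predicate, object)."""
    body = line.strip()
    if body.endswith("."):
        body = body[:-1].rstrip()  # drop the terminating " ."
    subject, predicate, obj = body.split(None, 2)
    return subject, predicate, obj


def partition_triples(lines):
    """Return (literal_triples, resource_triples) for an iterable of lines."""
    literal_triples, resource_triples = [], []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        s, p, o = split_triple(stripped)
        if o.startswith('"'):
            literal_triples.append((s, p, o))   # datatype property
        else:
            resource_triples.append((s, p, o))  # object property
    return literal_triples, resource_triples


# Hypothetical sample data, not taken from SemOpenAlex.
sample = [
    '<https://example.org/W1> <http://purl.org/dc/terms/title> "Graph learning" .',
    '<https://example.org/A1> <http://www.w3.org/ns/org#memberOf> <https://example.org/I1> .',
]
literals, resources = partition_triples(sample)
print(len(literals), "literal triple(s),", len(resources), "resource triple(s)")
```

A full RDF parser would be needed for real data (escaped quotes, other serializations); the sketch only shows which kind of triple feeds which kind of feature.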
Installation
To install the current PyPI version, run:
pip install autordf2gml
We recommend installing the library in an isolated environment, such as venv or conda. Please note that the current version has only been tested with Python versions 3.8 to 3.9.9.
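As a convenience, a script can check whether the running interpreter falls inside the tested range before importing the library. The following is a minimal sketch; the helper name `is_tested_python` is hypothetical and not part of AutoRDF2GML, and versions outside the range may still work, they are simply untested.

```python
import sys

# Tested range stated in this README (inclusive on both ends).
TESTED_MIN = (3, 8, 0)
TESTED_MAX = (3, 9, 9)


def is_tested_python(version=None):
    """Return True if the (major, minor, micro) version is in the tested range."""
    if version is None:
        version = sys.version_info[:3]
    return TESTED_MIN <= tuple(version[:3]) <= TESTED_MAX


if not is_tested_python():
    current = ".".join(str(part) for part in sys.version_info[:3])
    print("Warning: autordf2gml has not been tested with Python " + current)
```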
Usage
To start using AutoRDF2GML, you need (1) an RDF file and (2) a configuration file that describes the transformation. In the configuration file, define the RDF classes and properties needed for your project. See the quick example below.
Quick Example
This example uses the semopenalex-C1793878-sample.nt RDF file, a curated subset from SemOpenAlex.
1. Preparing the configuration file
Fill in all the required fields in the config file; see config-soa-cb.ini and config-soa-tb.ini for examples of the content-based and topology-based transformations, respectively. The following shows an example of the config file format:
```ini
[InputPath] ;required
input_path = semopenalex-C1793878-sample.nt
[SavePath] ;required
save_path_numeric_graph = semopenalex/numeric-graph/
save_path_mapping = semopenalex/mapping/
[NLD] ;required
nld_class = work
[EMBEDDING] ;required
embedding_model = allenai/scibert_scivocab_uncased
[Nodes] ;required
classes = work, author, institution, source, concept, publisher
work = https://semopenalex.org/class/Work
author = https://semopenalex.org/class/Author
institution = https://semopenalex.org/class/Institution
[SimpleEdges] ;required
edge_names = author_institution
author_institution_start_node = author
author_institution_properties = http://www.w3.org/ns/org#memberOf
author_institution_end_node = institution
```
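Before running a transformation, it can help to sanity-check a config file with Python's standard `configparser`. The sketch below is an assumption about what is worth checking, not a description of how AutoRDF2GML parses the file: it treats the six sections marked `;required` above as mandatory, checks that every name in `classes` has a URI entry, and collects the three `<edge>_*` keys for each name in `edge_names`. All helper names are hypothetical.

```python
import configparser

# Sections marked ";required" in the example config above.
REQUIRED_SECTIONS = ["InputPath", "SavePath", "NLD", "EMBEDDING", "Nodes", "SimpleEdges"]


def load_config(text):
    """Parse config text; ';' after whitespace starts an inline comment."""
    parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(text)
    return parser


def split_list(value):
    """Split a comma-separated value into stripped, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def missing_sections(parser):
    """Return the required sections that are absent from the config."""
    return [name for name in REQUIRED_SECTIONS if not parser.has_section(name)]


def undeclared_classes(parser):
    """Return class names listed in 'classes' that have no URI entry."""
    nodes = parser["Nodes"]
    return [name for name in split_list(nodes.get("classes", "")) if name not in nodes]


def edge_definitions(parser):
    """Map each edge name to its (start_node, properties, end_node) values."""
    edges = parser["SimpleEdges"]
    return {
        name: (
            edges.get(name + "_start_node"),
            edges.get(name + "_properties"),
            edges.get(name + "_end_node"),
        )
        for name in split_list(edges.get("edge_names", ""))
    }
```

For example, `undeclared_classes(load_config(open("config-soa-cb.ini").read()))` would list any class that is named in `classes` but has no matching URI line.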
2. Using the library
```python
import autordf2gml

# Run the content-based transformation
autordf2gml.content_feature("config-soa-cb.ini")

# Run the topology-based transformation
autordf2gml.topology_feature("config-soa-tb.ini")

# Run the content-based transformation using only simple edges
autordf2gml.simpleedges_feature("config-aifb-cb-simple.ini")
```
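When the three entry points are called from a script, a thin wrapper can catch a mistyped mode or a missing config file before the library is loaded. This is a sketch under assumptions: the wrapper `run_transformation` and the mode names `content`, `topology`, and `simple` are hypothetical; only the three `autordf2gml` function names come from the usage shown above.

```python
import os

# Hypothetical short mode names mapped to the functions named in this README.
MODES = {
    "content": "content_feature",
    "topology": "topology_feature",
    "simple": "simpleedges_feature",
}


def run_transformation(mode, config_path):
    """Validate the arguments, then call the matching autordf2gml function."""
    if mode not in MODES:
        raise ValueError("unknown mode %r; expected one of %s" % (mode, sorted(MODES)))
    if not os.path.isfile(config_path):
        raise FileNotFoundError("config file not found: " + config_path)
    # Imported late so the checks above run even if the import is slow.
    import autordf2gml
    return getattr(autordf2gml, MODES[mode])(config_path)


# Example call (requires autordf2gml and the config file to be present):
# run_transformation("content", "config-soa-cb.ini")
```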
Our GitHub
The most recent updates, documentation, and examples can be accessed through the following repository:
File details
Details for the file autordf2gml-0.0.1.tar.gz.
File metadata
- Download URL: autordf2gml-0.0.1.tar.gz
- Upload date:
- Size: 15.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.8.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8a39a521dc7d470c1258894b3458cbfa3d73453ee9a307effa482c29727607c9
MD5 | b8f56bf1c1f859c4fd402976513a55e6
BLAKE2b-256 | 9a3ea438d0a4ea422348f2c5646e978d3e0a33ecc558ddb05fd1946a55e7482c
File details
Details for the file autordf2gml-0.0.1-py3-none-any.whl.
File metadata
- Download URL: autordf2gml-0.0.1-py3-none-any.whl
- Upload date:
- Size: 19.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.8.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 905519432e9261a1b7ffdcfaaaaca2cb49e3f5414b6d40647b7100cb96c0f027
MD5 | 9758f7799b52f81f4844c63268be1255
BLAKE2b-256 | f63412bcfaa9d1ba0329ceddd7b36761bf793c76808c058d9b04a04e0748c96c