AutoRDF2GML: A Framework for Transforming RDF Data into Graph Representations for Graph Machine Learning.

Project description

AutoRDF2GML

Overview

AutoRDF2GML is a framework designed to transform RDF data into graph representations suitable for graph-based machine learning methods, e.g., Graph Neural Networks (GNNs). It uniquely generates content-based features from RDF datatype properties and topology-based features from RDF object properties, enabling the effective integration of Semantic Web technologies with Graph Machine Learning.

Installation

To install the current PyPI version, run:

```shell
pip install autordf2gml
```

We recommend installing the library in an isolated environment, such as venv or conda. Note that the current version has only been tested with Python versions 3.8 to 3.9.9.
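If you want to confirm that your interpreter falls inside the tested range before installing, a minimal sketch could look like the following. The helper name `in_tested_range` and the two bounds are illustrative assumptions (reading "3.8" as 3.8.0); they are not part of AutoRDF2GML.

```python
import sys

# Assumed bounds, taken from the tested range stated above (3.8 read as 3.8.0).
TESTED_MIN = (3, 8, 0)
TESTED_MAX = (3, 9, 9)


def in_tested_range(version=None):
    """Return True if the given (major, minor, micro) tuple lies in the tested range.

    When no version is given, the running interpreter's version is used.
    """
    current = tuple(version or sys.version_info[:3])
    return TESTED_MIN <= current <= TESTED_MAX


if not in_tested_range():
    print("Warning: this Python version is outside the tested range 3.8 to 3.9.9.")
```

Running outside the range is not necessarily an error, so the sketch only prints a warning.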

Usage

To start using AutoRDF2GML, you need (1) an RDF file and (2) a configuration file describing the transformation. In the configuration file, define the RDF classes and properties needed for your project. The quick example below shows both.

Quick Example

This example uses the semopenalex-C1793878-sample.nt RDF file, a curated subset from SemOpenAlex.

1. Preparing the configuration file

Fill in all the required fields in the config file. See config-soa-cb.ini and config-soa-tb.ini for examples of the content-based and topology-based transformations, respectively. The following shows an example of the config file format:

```ini
[InputPath] ;required
input_path = semopenalex-C1793878-sample.nt

[SavePath] ;required
save_path_numeric_graph = semopenalex/numeric-graph/
save_path_mapping = semopenalex/mapping/

[NLD] ;required
nld_class = work

[EMBEDDING] ;required
embedding_model = allenai/scibert_scivocab_uncased

[Nodes] ;required
classes = work, author, institution, source, concept, publisher
work = https://semopenalex.org/class/Work
author = https://semopenalex.org/class/Author
institution = https://semopenalex.org/class/Institution

[SimpleEdges] ;required
edge_names = author_institution
author_institution_start_node = author
author_institution_properties = http://www.w3.org/ns/org#memberOf
author_institution_end_node = institution
```
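Before running a transformation, it can help to sanity-check the config file. The sketch below uses Python's standard `configparser` to look for the sections marked as required in the example and to check that each edge refers to declared node classes. The function name `check_config` and the specific checks are assumptions inferred from the example layout above; they do not reflect AutoRDF2GML's own validation.

```python
import configparser

# Section names marked ";required" in the example config above.
REQUIRED_SECTIONS = ["InputPath", "SavePath", "NLD", "EMBEDDING", "Nodes", "SimpleEdges"]


def split_names(value):
    """Split a comma-separated option value into a list of trimmed names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def check_config(text):
    """Return a list of human-readable problems found in the config text."""
    parser = configparser.ConfigParser(
        interpolation=None,              # URIs may contain '%'
        inline_comment_prefixes=(";",),  # tolerate trailing ';required' notes
    )
    parser.read_string(text)

    problems = [
        f"missing section [{section}]"
        for section in REQUIRED_SECTIONS
        if not parser.has_section(section)
    ]
    if problems:
        return problems

    # Assumed rule: every name listed in 'classes' has a URI entry of its own.
    nodes = parser["Nodes"]
    classes = split_names(nodes.get("classes", ""))
    for name in classes:
        if name not in nodes:
            problems.append(f"class '{name}' has no URI in [Nodes]")

    # Assumed rule: every edge has start_node, properties, and end_node keys,
    # and both nodes are declared classes.
    edges = parser["SimpleEdges"]
    for edge in split_names(edges.get("edge_names", "")):
        for suffix in ("start_node", "properties", "end_node"):
            key = f"{edge}_{suffix}"
            if key not in edges:
                problems.append(f"edge '{edge}' is missing '{key}'")
            elif suffix != "properties" and edges[key] not in classes:
                problems.append(f"'{key}' refers to unknown class '{edges[key]}'")
    return problems
```

A typical use would be `check_config(open("config-soa-cb.ini").read())`, printing each returned problem before calling the library. An empty list means the assumed checks passed, not that the library will accept the file.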

2. Using the library

```python
import autordf2gml

# Run the content-based transformation
autordf2gml.content_feature("config-soa-cb.ini")

# Run the topology-based transformation
autordf2gml.topology_feature("config-soa-tb.ini")

# Run the content-based transformation using only simple edges
autordf2gml.simpleedges_feature("config-aifb-cb-simple.ini")
```
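When scripting several runs, a thin wrapper that picks the transformation by name and fails early on a missing config file can be convenient. The following is a sketch under stated assumptions: `run_transformation`, the `MODES` labels, and the `library` parameter are hypothetical helpers introduced here; only the three function names come from the calls above.

```python
import importlib
import os

# Hypothetical mode labels mapped to the three functions used above.
MODES = {
    "content": "content_feature",
    "topology": "topology_feature",
    "simple": "simpleedges_feature",
}


def run_transformation(mode, config_path, library=None):
    """Call the transformation function matching `mode` with `config_path`.

    `library` defaults to the installed autordf2gml module; passing another
    object with the same function names is useful for dry runs.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode '{mode}', expected one of {sorted(MODES)}")
    if not os.path.isfile(config_path):
        raise FileNotFoundError(f"config file not found: {config_path}")
    if library is None:
        # Imported lazily so the checks above run even without the package.
        library = importlib.import_module("autordf2gml")
    return getattr(library, MODES[mode])(config_path)
```

For example, `run_transformation("topology", "config-soa-tb.ini")` would be equivalent to the direct `topology_feature` call, with the added file check.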

Our GitHub

The most recent updates, documentation, and examples can be accessed through the following repository:
