Converts instances from a CSV file into RDF.

Description

The entityrdfizer project converts entities of any domain, together with their data and metadata, into RDF. The entities and their data must be provided as input in an ABox CSV template that has been filled in with data. A set of ABox CSV template files is available at the following URL: https://github.com/cambridge-cares/TheWorldAvatar/tree/master/JPS_Ontology/KBTemplates/ABox

Installation

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Virtual environment setup

It is highly recommended to use a virtual environment for the entityrdfizer installation. The virtual environment can be created as follows:

(Windows)

$ python -m venv entityrdfizer_venv
$ entityrdfizer_venv\Scripts\activate.bat
(entityrdfizer_venv) $

(Linux)

$ python3 -m venv entityrdfizer_venv
$ source entityrdfizer_venv/bin/activate
(entityrdfizer_venv) $

The above commands will create and activate the virtual environment entityrdfizer_venv in the current directory.

Installation via pip

To install entityrdfizer, run the following command:

(entityrdfizer_venv) $ pip install entityrdfizer

Installation from the version-controlled source (for developers)

This type of installation is intended for developers only. To install entityrdfizer directly from its repository, first clone the TheWorldAvatar project. Then navigate to the TheWorldAvatar\EntityRDFizer directory and execute one of the following commands:

# build and install
(entityrdfizer_venv) $ pip install .

# or build for in-place development
(entityrdfizer_venv) $ pip install -e .

Alternatively, use the provided install_rdfizer.sh convenience script, which can create the virtual environment and install entityrdfizer in one go:

# create the environment and install the project
$ install_rdfizer.sh -v -i
# create the environment and install the project for in-place development
$ install_rdfizer.sh -v -i -e

Note that installing the project for in-place development (setting the -e flag) also installs the Python packages required for development and testing. To test the code, run either of the following commands:

(entityrdfizer_venv) $ pytest
# or
(entityrdfizer_venv) $ pytest tests

How to use

Usage:
    csv2rdf <csvFileOrDirPath> --csvType=<type> [--outDir=<dir>] [--csvTbox=<tbox>]

Options:
--csvType=<type> Type of the csv file.
                 Choose one of abox/tbox   [default: abox]
--outDir=<dir>   Output directory path
--csvTbox=<tbox> TBox in csv format to validate the input ABox csv file (for ABox writer only)
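
When the converter has to be driven from a Python script rather than typed by hand, the usage line above can be assembled programmatically. The following is only a sketch: the helper names build_csv2rdf_command and run_csv2rdf are hypothetical and not part of entityrdfizer, and the sketch assumes that the csv2rdf executable is on the PATH of the active virtual environment. Only the options listed above are used.

```python
import subprocess
from typing import List, Optional


def build_csv2rdf_command(
    csv_path: str,
    csv_type: str = "abox",
    out_dir: Optional[str] = None,
    csv_tbox: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for the csv2rdf command (hypothetical helper)."""
    if csv_type not in ("abox", "tbox"):
        raise ValueError("csv_type must be 'abox' or 'tbox'")
    command = ["csv2rdf", csv_path, f"--csvType={csv_type}"]
    if out_dir is not None:
        command.append(f"--outDir={out_dir}")
    if csv_tbox is not None:
        # The options text says --csvTbox is for the ABox writer only.
        if csv_type != "abox":
            raise ValueError("--csvTbox applies to the ABox writer only")
        command.append(f"--csvTbox={csv_tbox}")
    return command


def run_csv2rdf(*args, **kwargs) -> subprocess.CompletedProcess:
    """Run csv2rdf; assumes the executable is installed and on the PATH."""
    return subprocess.run(build_csv2rdf_command(*args, **kwargs), check=True)
```

Building the command as a list, rather than as a single string, avoids shell quoting problems with paths that contain spaces.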

csv file format for ABox

The input csv file must have at least 6 columns: A,B,C,D,E,F. Extra columns are ignored.
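
The column rule can be checked before a file is handed to the converter. The sketch below is not the entityrdfizer implementation; read_abox_rows is a hypothetical helper that keeps the first six columns of each row and rejects shorter rows. Skipping blank lines is an assumption made here for convenience.

```python
import csv
import io
from typing import List

REQUIRED_COLUMNS = 6  # columns A to F


def read_abox_rows(text: str) -> List[List[str]]:
    """Return the rows of an ABox CSV, trimmed to columns A-F (hypothetical helper)."""
    rows = []
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # assumption: blank lines are skipped
        if len(row) < REQUIRED_COLUMNS:
            raise ValueError(
                f"line {line_no}: expected at least {REQUIRED_COLUMNS} columns, got {len(row)}"
            )
        rows.append(row[:REQUIRED_COLUMNS])  # extra columns are ignored
    return rows
```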

The file passed to the --csvTbox parameter should follow the format of the examples in EntityRDFizer/tests/test_tboxes/ontocompchem/

Rows in csv file contain one of the following:

Ontology description containing prefixes for the TBox and the ABox.

For ABox prefix:

For TBox prefix:

The ontology prefix in Col C must end with a slash (/) or a hash (#). The full path of an entity will then be http://www.theworldavatar.com/ontology/ontospecies/ClassName or http://www.theworldavatar.com/ontology/ontospecies/OntoSpecies.owl#ClassName, respectively.
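
The effect of the trailing character can be illustrated with plain string concatenation. This is a minimal sketch, assuming that the full path is the prefix followed directly by the class name; full_entity_iri is a hypothetical helper, not a function of entityrdfizer.

```python
def full_entity_iri(prefix: str, class_name: str) -> str:
    """Join an ontology prefix and a class name (hypothetical helper)."""
    if not prefix.endswith(("/", "#")):
        raise ValueError("the ontology prefix must end with '/' or '#'")
    return prefix + class_name


slash_iri = full_entity_iri(
    "http://www.theworldavatar.com/ontology/ontospecies/", "ClassName"
)
hash_iri = full_entity_iri(
    "http://www.theworldavatar.com/ontology/ontospecies/OntoSpecies.owl#", "ClassName"
)
```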

Definition of an instance of class

The name of the instance can be either a full path or relative to the base ontology.

  • Col A: short class name for the ontology defined in the TBox, or a full IRI of the class name for a class from an external ontology
  • Col B: "Instance"
  • Col C: The new instance name. A full IRI of the instance may be given instead of a name relative to the base ontology.
  • Col D,E,F must be empty.
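
A row of this kind can be written with Python's csv module. The sketch below is illustrative only: instance_row is a hypothetical helper, and the class name Species and instance name Species_1 are made-up values, not taken from any template.

```python
import csv
import io
from typing import List


def instance_row(class_name: str, instance_name: str) -> List[str]:
    """Build a six-column row that defines an instance of a class."""
    # Col A: class, Col B: "Instance", Col C: instance name, Col D-F: empty
    return [class_name, "Instance", instance_name, "", "", ""]


buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(instance_row("Species", "Species_1"))  # hypothetical names
instance_csv = buffer.getvalue()
```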

Relation between two class instances

  • Col A: Subject. An instance name defined earlier in this file, or a full IRI of the instance
  • Col B: "Instance"
  • Col C: Object. The instance defined before this point or a full IRI of the instance
  • Col D: Predicate. Relative name or full IRI of the predicate in the triple: Col A predicate Col C.
  • Col E,F are not used. If the instances in Col A and Col C are given as relative names, they must be defined before this line.
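
The ordering requirement can be checked with a single pass over the rows. The sketch below is not part of entityrdfizer. It rests on two assumptions: a row with "Instance" in Col B and an empty Col D is an instance definition, while a non-empty Col D marks a relation; and a name is treated as a full IRI when it starts with http:// or https://. The helper names and the sample values are hypothetical.

```python
from typing import List, Tuple


def is_full_iri(name: str) -> bool:
    """Assumption: full IRIs start with http:// or https://."""
    return name.startswith(("http://", "https://"))


def undefined_references(rows: List[List[str]]) -> List[Tuple[int, str]]:
    """Return (row number, name) for relative names used before being defined."""
    defined = set()
    problems = []
    for row_no, row in enumerate(rows, start=1):
        col_a, col_b, col_c, col_d = row[:4]
        if col_b != "Instance":
            continue
        if col_d == "":
            defined.add(col_c)  # instance definition: Col C is the new name
        else:
            # relation: subject in Col A, object in Col C, predicate in Col D
            for name in (col_a, col_c):
                if not is_full_iri(name) and name not in defined:
                    problems.append((row_no, name))
    return problems


sample_rows = [
    ["Species", "Instance", "Species_1", "", "", ""],
    ["Species_1", "Instance", "Formula_1", "hasFormula", "", ""],
]
sample_problems = undefined_references(sample_rows)
```

In the sample, Formula_1 is reported because no earlier row defines it, whereas Species_1 passes.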

Assign data value to an instance

The data type can be given as a full path or as one of the predefined shortcuts: 'string', 'integer', 'float', 'double', 'decimal', 'datetime', 'boolean'. The predefined data types may also be written with the "xsd:" prefix, for example 'xsd:string'.

  • Col A: Full http:// address of the relation
  • Col B: "Data Property"
  • Col C: instance to assign the value
  • Col D is not used
  • Col E: value to be assigned
  • Col F: data type of the value.
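
The shortcut handling described above can be pictured as a lookup into the XML Schema namespace. This is a sketch under stated assumptions, not the converter's actual code: resolve_datatype and data_property_row are hypothetical helpers, and the mapping of the 'datetime' shortcut to xsd:dateTime is assumed rather than confirmed by this document.

```python
from typing import List

XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumption: each shortcut maps to the XSD type of the same name,
# with 'datetime' mapped to xsd:dateTime.
SHORTCUTS = {
    "string": "string",
    "integer": "integer",
    "float": "float",
    "double": "double",
    "decimal": "decimal",
    "datetime": "dateTime",
    "boolean": "boolean",
}


def resolve_datatype(name: str) -> str:
    """Expand a shortcut such as 'string' or 'xsd:string'; return other values unchanged."""
    key = name[4:] if name.startswith("xsd:") else name
    if key in SHORTCUTS:
        return XSD + SHORTCUTS[key]
    return name  # treated as a full path


def data_property_row(relation: str, instance: str, value: str, datatype: str) -> List[str]:
    """Build a six-column row assigning a data value; Col D is left empty."""
    return [relation, "Data Property", instance, "", value, datatype]
```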

Authors

Feroz Farazi (msff2@cam.ac.uk), 17 May 2021
