<img src="goku_rdf_slurp.png" width="10%" height="10%">

RDFIngest

tests Coverage Status PyPI version license: GPL v3

RDFIngest - A simple tool for ingesting local and remote RDF data sources into a triplestore.

WARNING: This project is in an early stage of development and should be used with caution.

Requirements

  • Python >= 3.11

Installation

RDFIngest is available on PyPI:

pip install rdfingest

The RDFIngest CLI can also be installed with pipx:

pipx install rdfingest

To install from source, either use poetry or run pip install . from the package folder.

Usage

RDFIngest reads two YAML files:

  • a config file for obtaining triplestore credentials and
  • a registry which defines the RDF sources to be ingested.

Example config:

service:
  endpoint: "https://sometriplestore.endpoint"
  user: "admin"
  password: "supersecretpassword123"

Example registry:

graphs:
  - source: https://someremote.ttl
    graph_id: https://somenamedgraph.id

  - source: [
    somelocal.ttl,
    https://someotherremote.ttl
    ]
    graph_id: https://someothernamedgraph.id
    
  - source: https://someremote.trig
  
  - source: [
    https://someotherremote.trig,
    someotherlocal.ttl,
    yetanotherremote.ttl
    ]
    graph_id: https://yetanothernamedgraph.id

RDFIngest parses all registered RDF sources and ingests the data as named graphs into the specified triplestore by executing POST requests for every source.

By default, a SPARQL DROP operation is also run for every graph ID before POSTing.
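The DROP step can be illustrated with a small sketch that only builds the SPARQL Update strings. Note that `drop_update` is a hypothetical helper and not part of RDFIngest; whether the tool itself uses the `SILENT` keyword is not stated here, so it is left as an option.

```python
def drop_update(graph_id: str, silent: bool = False) -> str:
    """Build a SPARQL 1.1 Update string that drops one named graph."""
    # Reject characters that are not allowed inside an IRI reference.
    if any(ch in graph_id for ch in '<>" {}|\\^`'):
        raise ValueError(f"not a valid IRI for a graph ID: {graph_id!r}")
    keyword = "DROP SILENT GRAPH" if silent else "DROP GRAPH"
    return f"{keyword} <{graph_id}>"


# Graph IDs taken from the example registry above.
graph_ids = ["https://somenamedgraph.id", "https://someothernamedgraph.id"]
for graph_id in graph_ids:
    print(drop_update(graph_id))
```

With `SILENT`, a SPARQL 1.1 endpoint reports success even if the graph does not exist, which may be convenient for a first ingest.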

For contextless RDF sources (triple formats such as Turtle), a graph_id is required. RDF Datasets (quad formats such as TriG) carry their own graph names and do not require a graph_id field.

For Datasets, the default graph is ignored, at least for now, because running automated DROP and/or POST operations on a remote default graph is considered somewhat dangerous.
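The graph_id rule can be checked before an ingest is attempted. The following is a minimal sketch, assuming that quad formats are recognised by file extension; `QUAD_SUFFIXES`, `is_dataset` and `missing_graph_ids` are hypothetical names and do not reflect how RDFIngest detects formats internally.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: quad-capable formats are recognised by file extension.
QUAD_SUFFIXES = {".trig", ".nq"}


def is_dataset(source: str) -> bool:
    """Return True if the source looks like an RDF Dataset (quad format)."""
    parsed = urlparse(source)
    # For "https://someremote.trig" the file-like part sits in the host name,
    # so fall back to netloc when the URL has no path.
    name = parsed.path or parsed.netloc
    return PurePosixPath(name).suffix.lower() in QUAD_SUFFIXES


def missing_graph_ids(entries: list[dict]) -> list[int]:
    """Return indexes of entries that hold a triple source but no graph_id."""
    problems = []
    for index, entry in enumerate(entries):
        sources = entry["source"]
        if isinstance(sources, str):
            sources = [sources]
        needs_graph_id = any(not is_dataset(s) for s in sources)
        if needs_graph_id and "graph_id" not in entry:
            problems.append(index)
    return problems


entries = [
    {"source": "https://someremote.ttl", "graph_id": "https://somenamedgraph.id"},
    {"source": "https://someremote.trig"},
    {"source": ["somelocal.ttl", "https://someotherremote.trig"]},
]
print(missing_graph_ids(entries))  # [2]
```

Only the third entry is reported: it mixes a Turtle file with a TriG file but gives no graph_id for the Turtle data.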

Namespaces are one honking great idea -- let's do more of those!

The tool accepts both local and remote RDF data sources.

Entry example

Consider the following entry:

graphs:
  - source: [
    https://someremote.trig,
    somelocal.ttl,
    anotherremote.ttl
    ]
    graph_id: https://somenamedgraph.id/

In this case, every named graph in the Dataset https://someremote.trig is ingested under its own named graph identifier, while somelocal.ttl and anotherremote.ttl are ingested into the named graph https://somenamedgraph.id/.
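The routing described for this entry can be sketched as a small planning function. This is an illustration only: `plan_entry` is a hypothetical helper, and detecting Datasets by file extension is an assumption rather than documented RDFIngest behaviour.

```python
# Assumption: Datasets (quad formats) are detected by file extension.
QUAD_SUFFIXES = (".trig", ".nq")


def plan_entry(entry: dict) -> list[tuple[str, str]]:
    """Map every source of a registry entry to the graph it would end up in."""
    sources = entry["source"]
    if isinstance(sources, str):
        sources = [sources]
    plan = []
    for source in sources:
        if source.lower().endswith(QUAD_SUFFIXES):
            # Datasets keep the graph names they declare themselves.
            target = "named graphs declared in the dataset"
        else:
            # Contextless sources go into the entry's graph_id.
            target = entry["graph_id"]
        plan.append((source, target))
    return plan


entry = {
    "source": ["https://someremote.trig", "somelocal.ttl", "anotherremote.ttl"],
    "graph_id": "https://somenamedgraph.id/",
}
for source, target in plan_entry(entry):
    print(f"{source} -> {target}")
```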

CLI

Run the rdfingest command.

rdfingest --config ./config.yaml --registry ./registry.yaml

Default values for config and registry are ./config.yaml and ./registry.yaml.

See also rdfingest --help.

RDFIngest class

Point an RDFIngest instance to a config file and a registry and invoke run_ingest.

from rdfingest import RDFIngest  # assumed import path; adjust if the class lives elsewhere

rdfingest = RDFIngest(
    config="./config.yaml",
    registry="./registry.yaml",
    drop=True,
    debug=False
)

rdfingest.run_ingest()


