RDFIngest
RDFIngest - A simple tool for ingesting local and remote RDF data sources into a triplestore.
WARNING: This project is in an early stage of development and should be used with caution.
Requirements
- Python >= 3.11
Installation
RDFIngest is available on PyPI:
pip install rdfingest
The RDFIngest CLI can also be installed with pipx:
pipx install rdfingest
For installation from source, either use poetry or run pip install . from the package folder.
Usage
RDFIngest reads two YAML files:
- a config file for obtaining triplestore credentials and
- a registry which defines the RDF sources to be ingested.
Example config:
service:
  endpoint: "https://sometriplestore.endpoint"
  user: "admin"
  password: "supersecretpassword123"
Example registry:
graphs:
  - source: https://someremote.ttl
    graph_id: https://somenamedgraph.id
  - source: [
      somelocal.ttl,
      https://someotherremote.ttl
      ]
    graph_id: https://someothernamedgraph.id
  - source: https://someremote.trig
  - source: [
      https://someotherremote.trig,
      someotherlocal.ttl,
      yetanotherremote.ttl
      ]
    graph_id: https://yetanothernamedgraph.id
RDFIngest parses all registered RDF sources and ingests the data as named graphs into the specified triplestore by executing POST requests for every source.
By default, a SPARQL DROP operation is also run for every graph ID before POSTing.
For contextless RDF sources a graph_id is required. RDF Datasets/Quad formats already carry their own graph identifiers and do not require a graph_id field.
For Datasets, the default graph (at least for now) is ignored. Running automated DROP and/or POST operations on a remote default graph is considered somewhat dangerous.
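The order of operations described above can be sketched roughly as follows. This is a minimal illustration, not RDFIngest's actual code: the function names `drop_statement` and `planned_operations` are hypothetical, and the exact SPARQL Update string the tool sends is an assumption (a plain `DROP GRAPH` is used here).

```python
def drop_statement(graph_id: str) -> str:
    """Build a SPARQL 1.1 Update string that drops one named graph."""
    return f"DROP GRAPH <{graph_id}>"


def planned_operations(
    graph_id: str, sources: list[str], drop: bool = True
) -> list[tuple[str, str]]:
    """List the operations for one graph ID in execution order.

    Hypothetical helper: it only plans, it sends no requests.
    """
    if not graph_id:
        # Mirrors the caution above: never touch the default graph.
        raise ValueError("a named graph ID is required")

    operations: list[tuple[str, str]] = []
    if drop:
        # The DROP runs once per graph ID, before any POST.
        operations.append(("UPDATE", drop_statement(graph_id)))
    # One POST per source, all into the same named graph.
    operations.extend(("POST", source) for source in sources)
    return operations


if __name__ == "__main__":
    for operation in planned_operations(
        "https://someothernamedgraph.id",
        ["somelocal.ttl", "https://someotherremote.ttl"],
    ):
        print(operation)
```

Planning the operations as plain tuples keeps the sketch free of network access; an actual ingest would send each one to the endpoint given in the config.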
Namespaces are one honking great idea -- let's do more of those!
The tool accepts both local and remote RDF data sources.
Entry example
Consider the following entry:
graphs:
  - source: [
      https://someremote.trig,
      somelocal.ttl,
      anotherremote.ttl
      ]
    graph_id: https://somenamedgraph.id/
In this case, every named graph in the Dataset https://someremote.trig is ingested under its own named graph identifier, while somelocal.ttl and anotherremote.ttl are ingested into the named graph https://somenamedgraph.id/.
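The mapping from sources to target graphs in such an entry can be sketched as below. This is an illustration under stated assumptions, not the tool's implementation: the names `is_quad_source`, `target_graphs` and `FROM_SOURCE` are hypothetical, and quad formats are recognised here purely by file extension (`.trig`, `.nq`), which may differ from how RDFIngest detects them.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: quad formats are identified by file extension only.
QUAD_SUFFIXES = {".trig", ".nq"}

# Placeholder meaning "graph IDs are taken from the Dataset itself".
FROM_SOURCE = "<named graphs defined in the source>"


def is_quad_source(source: str) -> bool:
    """Guess whether a local path or URL points to a quad format."""
    parsed = urlparse(source)
    # URLs like https://someremote.trig have an empty path,
    # so fall back to the host part.
    name = parsed.path or parsed.netloc
    return PurePosixPath(name).suffix.lower() in QUAD_SUFFIXES


def target_graphs(entry: dict) -> dict[str, str]:
    """Map every source of a registry entry to its target graph."""
    sources = entry["source"]
    if isinstance(sources, str):
        sources = [sources]

    targets: dict[str, str] = {}
    for source in sources:
        if is_quad_source(source):
            targets[source] = FROM_SOURCE
        elif entry.get("graph_id"):
            targets[source] = entry["graph_id"]
        else:
            raise ValueError(f"graph_id is required for {source}")
    return targets


if __name__ == "__main__":
    entry = {
        "source": [
            "https://someremote.trig",
            "somelocal.ttl",
            "anotherremote.ttl",
        ],
        "graph_id": "https://somenamedgraph.id/",
    }
    for source, graph in target_graphs(entry).items():
        print(f"{source} -> {graph}")
```

A contextless source without a graph_id raises an error in this sketch, matching the requirement stated for contextless RDF sources.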
CLI
Run the rdfingest command.
rdfingest --config ./config.yaml --registry ./registry.yaml
Default values for config and registry are ./config.yaml and ./registry.yaml.
Also see rdfingest --help.
RDFIngest class
Point an RDFIngest instance to a config file and a registry and invoke run_ingest.
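# Assumed import path; adjust if RDFIngest lives in a submodule.
from rdfingest import RDFIngest

rdfingest = RDFIngest(
    config="./config.yaml",
    registry="./registry.yaml",
    drop=True,    # run a SPARQL DROP for every graph ID before POSTing
    debug=False
)
rdfingest.run_ingest()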