RDFIngest
RDFIngest - A simple tool for ingesting local and remote RDF data sources into a triplestore.
WARNING: This project is in an early stage of development and should be used with caution.
Requirements
- Python >= 3.11
Installation
RDFIngest is available on PyPI:
pip install rdfingest
The RDFIngest CLI can also be installed with pipx:
pipx install rdfingest
To install from source, either use poetry or run pip install . from the package folder.
Usage
RDFIngest reads two YAML files:
- a config file for obtaining triplestore credentials and
- a registry which defines the RDF sources to be ingested.
Example config:
service:
  endpoint: "https://sometriplestore.endpoint"
  user: "admin"
  password: "supersecretpassword123"
Example registry:
graphs:
  - source: https://someremote.ttl
    graph_id: https://somenamedgraph.id
  - source: [
        somelocal.ttl,
        https://someotherremote.ttl
      ]
    graph_id: https://someothernamedgraph.id
  - source: https://someremote.trig
  - source: [
        https://someotherremote.trig,
        someotherlocal.ttl,
        yetanotherremote.ttl
      ]
    graph_id: https://yetanothernamedgraph.id
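Since a `source` value can be either a single string or a list, code that consumes the registry may want to normalize entries first. The following is a minimal sketch, not RDFIngest's own implementation: `RegistryEntry` and `normalize_entry` are hypothetical names, and the input is assumed to be the dictionary a YAML parser would return for the registry.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical container for one item of the `graphs` list."""

    sources: tuple[str, ...]
    graph_id: str | None = None


def normalize_entry(raw: dict) -> RegistryEntry:
    """Turn one parsed registry item into a uniform RegistryEntry."""
    source = raw["source"]
    # `source` may be a single string or a list of strings.
    sources = (source,) if isinstance(source, str) else tuple(source)
    # `graph_id` is optional, e.g. for quad formats such as TriG.
    return RegistryEntry(sources=sources, graph_id=raw.get("graph_id"))


# Two entries from the example registry, as already-parsed data.
registry = {
    "graphs": [
        {"source": "https://someremote.ttl", "graph_id": "https://somenamedgraph.id"},
        {"source": "https://someremote.trig"},
    ]
}
entries = [normalize_entry(item) for item in registry["graphs"]]
```

After normalization, every entry exposes a tuple of sources, so later steps do not need to distinguish the string and list forms.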
RDFIngest parses all registered RDF sources and ingests the data as named graphs into the specified triplestore by executing a POST request for every source.
By default, a SPARQL DROP operation is also run for every graph ID before POSTing.
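To make the DROP-then-POST sequence concrete, here is a rough sketch of what the two operations could look like for one named graph. It only builds the strings and sends nothing. Two assumptions are not stated in this README: that the POST addresses the graph through a `?graph=` query parameter in the style of the SPARQL 1.1 Graph Store HTTP Protocol, and that the DROP uses the `SILENT` keyword. The helper names `drop_update` and `graph_store_url` are hypothetical.

```python
from urllib.parse import urlencode


def drop_update(graph_id: str) -> str:
    """Build a SPARQL 1.1 Update string that drops one named graph.

    SILENT (an assumption here) suppresses the error for a missing graph.
    """
    return f"DROP SILENT GRAPH <{graph_id}>"


def graph_store_url(endpoint: str, graph_id: str) -> str:
    """Build a URL addressing one named graph via a `graph` parameter."""
    return f"{endpoint}?{urlencode({'graph': graph_id})}"


graph_id = "https://somenamedgraph.id"
update = drop_update(graph_id)
url = graph_store_url("https://sometriplestore.endpoint", graph_id)
```

The actual requests issued by RDFIngest may differ; triplestores also vary in which endpoints accept updates and graph uploads.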
Contextless RDF sources require a graph_id. RDF Datasets/quad formats do not require a graph_id field, since they carry their own named graph identifiers.
For Datasets, the default graph (at least for now) is ignored. Running automated DROP and/or POST operations on a remote default graph is considered somewhat dangerous.
Namespaces are one honking great idea -- let's do more of those!
The tool accepts both local and remote RDF data sources.
Entry example
Consider the following entry:
graphs:
  - source: [
        https://someremote.trig,
        somelocal.ttl,
        anotherremote.ttl
      ]
    graph_id: https://somenamedgraph.id/
In this case, every named graph in the Dataset https://someremote.trig is ingested under its own named graph identifier, while somelocal.ttl and anotherremote.ttl are ingested into the named graph https://somenamedgraph.id/.
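The routing described for this entry can be sketched as a small planning function. This is an illustration under stated assumptions rather than RDFIngest's actual logic: quad formats are detected purely by file extension (`.trig`, `.nq`), and `is_quad_source`, `plan_entry`, and the `DATASET_GRAPHS` marker are hypothetical names.

```python
from __future__ import annotations

from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: quad formats are recognised by file extension only.
QUAD_SUFFIXES = {".trig", ".nq"}
# Marker meaning "use the named graph identifiers found in the dataset".
DATASET_GRAPHS = "<named graphs from dataset>"


def is_quad_source(source: str) -> bool:
    """Guess from the name whether a source is an RDF Dataset/quad format."""
    parsed = urlparse(source)
    # URLs such as https://someremote.trig carry the name in the host part.
    name = parsed.path.rstrip("/") or parsed.netloc
    return PurePosixPath(name).suffix.lower() in QUAD_SUFFIXES


def plan_entry(sources: list[str], graph_id: str | None) -> list[tuple[str, str]]:
    """Return (source, target graph) pairs for one registry entry."""
    plan = []
    for source in sources:
        if is_quad_source(source):
            plan.append((source, DATASET_GRAPHS))
        elif graph_id is None:
            raise ValueError(f"graph_id required for contextless source: {source}")
        else:
            plan.append((source, graph_id))
    return plan


plan = plan_entry(
    ["https://someremote.trig", "somelocal.ttl", "anotherremote.ttl"],
    "https://somenamedgraph.id/",
)
```

Here `plan` pairs the TriG source with the dataset marker and both Turtle sources with https://somenamedgraph.id/. An entry that lists a contextless source without a graph_id raises a `ValueError`.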
CLI
Run the rdfingest command:
rdfingest --config ./config.yaml --registry ./registry.yaml
Default values for config and registry are ./config.yaml and ./registry.yaml.
Also see rdfingest --help.
RDFIngest class
Point an RDFIngest instance to a config file and a registry and invoke run_ingest.
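# Assumed import path; adjust if RDFIngest is exposed elsewhere in the package.
from rdfingest import RDFIngest

rdfingest = RDFIngest(
    config="./config.yaml",      # triplestore endpoint and credentials
    registry="./registry.yaml",  # RDF sources to ingest
    drop=True,                   # run a SPARQL DROP per graph ID before POSTing
    debug=False,
)
rdfingest.run_ingest()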