# RDFSolve

Wraps several RDF schema solver tools.

Extract RDF schemas from SPARQL endpoints and convert them to multiple formats (VoID, LinkML, JSON-LD).
## Installation

```shell
uv pip install rdfsolve
```
## Quick Start

### CLI

Extract a schema and convert it to multiple formats:

```shell
# Discover existing VoID metadata (fast)
rdfsolve discover --endpoint https://sparql.rhea-db.org/sparql

# Extract schema (uses discovered VoID if available)
rdfsolve extract --endpoint https://sparql.rhea-db.org/sparql \
    --output-dir ./output

# Export to different formats
rdfsolve export --void-file ./output/void_description.ttl \
    --format all --output-dir ./output
```
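To make the discovery step above concrete, here is a minimal sketch of the kind of SPARQL query a VoID discovery pass might issue against an endpoint. The query and the `build_void_probe_query` helper are illustrative assumptions, not the actual queries rdfsolve runs.

```python
# Illustrative only: a SPARQL query of the kind a VoID discovery step
# might issue to check whether an endpoint already publishes VoID metadata.

def build_void_probe_query(limit: int = 10) -> str:
    """Return a SPARQL SELECT that looks for void:Dataset descriptions."""
    return f"""
PREFIX void: <http://rdfs.org/ns/void#>
SELECT ?dataset ?partition WHERE {{
    ?dataset a void:Dataset .
    OPTIONAL {{ ?dataset void:classPartition ?partition }}
}} LIMIT {limit}
""".strip()

query = build_void_probe_query()
print(query.splitlines()[0])  # PREFIX void: <http://rdfs.org/ns/void#>
```

If such a probe returns results, an extractor can reuse the published VoID description instead of regenerating it from scratch, which is why `discover` is described as fast.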
**Extract command options:**

```shell
# Force fresh generation (bypasses discovered VoID)
rdfsolve extract --endpoint URL --force-generate

# Custom naming and URIs
rdfsolve extract --endpoint URL \
    --dataset-name mydata \
    --void-base-uri "http://example.org/mydata/well-known/void"

# Filter specific graphs
rdfsolve extract --endpoint URL \
    --graph-uri http://example.org/graph1 \
    --graph-uri http://example.org/graph2
```
**Export formats:**

- `csv` - Schema patterns table
- `jsonld` - JSON-LD representation
- `linkml` - LinkML YAML schema
- `shacl` - SHACL shapes for RDF validation
- `rdfconfig` - RDF-config YAML files (model, prefix, endpoint)
- `coverage` - Pattern frequency analysis
- `all` - All formats (default)
Export with a custom LinkML schema:

```shell
rdfsolve export --void-file void_description.ttl \
    --format linkml \
    --schema-name custom_schema \
    --schema-uri "http://example.org/schemas/custom" \
    --schema-description "Custom schema description"
```
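For orientation, a generated LinkML schema is a YAML document along the following lines. This is a hypothetical sketch: the class and slot names are invented, not actual rdfsolve output for any real endpoint.

```yaml
# Hypothetical LinkML output shape - classes and slots are illustrative.
id: http://example.org/schemas/custom
name: custom_schema
description: Custom schema description
prefixes:
  ex: http://example.org/
  rdfs: http://www.w3.org/2000/01/rdf-schema#
classes:
  Protein:
    class_uri: ex:Protein
    slots:
      - label
slots:
  label:
    slot_uri: rdfs:label
    range: string
```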
Export SHACL shapes for RDF validation:

```shell
# Export closed SHACL shapes (strict validation)
rdfsolve export --void-file void_description.ttl \
    --format shacl \
    --shacl-closed \
    --shacl-suffix Shape

# Export open SHACL shapes (flexible validation)
rdfsolve export --void-file void_description.ttl \
    --format shacl \
    --shacl-open
```
SHACL (Shapes Constraint Language) shapes define constraints on RDF data and can be used to validate RDF instances against the extracted schema. Closed shapes only allow properties explicitly defined in the schema, while open shapes are more permissive.
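The closed-versus-open distinction can be illustrated with a toy check. This is not a SHACL engine (real validation should use a dedicated library such as pySHACL against the exported shapes); the `conforms` helper and the instance data are invented for illustration.

```python
# Toy illustration of closed vs. open shapes - not a SHACL engine.

def conforms(instance: dict, allowed: set, closed: bool) -> bool:
    """Check an instance's properties against a shape's declared property list."""
    if closed:
        # Closed shape: every property on the instance must be declared.
        return set(instance) <= allowed
    # Open shape: extra properties are permitted.
    return True

instance = {"rdfs:label": "example", "ex:undeclared": "x"}
allowed = {"rdfs:label", "ex:sequence"}

print(conforms(instance, allowed, closed=True))   # False: undeclared property
print(conforms(instance, allowed, closed=False))  # True: open shapes permit extras
```

Choose closed shapes when the extracted schema is meant to be exhaustive, and open shapes when the endpoint may legitimately carry properties beyond those captured in the VoID description.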
Export RDF-config files:

```shell
rdfsolve export --void-file void_description.ttl \
    --format rdfconfig \
    --endpoint-url https://sparql.example.org/sparql \
    --graph-uri http://example.org/graph \
    --output-dir ./output
```
Creates a directory `{dataset}_config/` containing:

- `model.yml` - Class and property structure
- `prefix.yml` - Namespace prefix definitions
- `endpoint.yml` - SPARQL endpoint configuration
This structure is required by the rdf-config tool.
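As a rough sketch, the three files look something like the following. The keys follow rdf-config conventions, but the class and prefix names here are invented, not actual rdfsolve output.

```yaml
# model.yml (hypothetical)
- Protein protein:
  - a: ex:Protein
  - rdfs:label:
    - label: "example label"

# prefix.yml (hypothetical)
ex: <http://example.org/>
rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# endpoint.yml (hypothetical)
endpoint: https://sparql.example.org/sparql
```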
**Count instances per class:**

```shell
rdfsolve count --endpoint URL --output counts.csv
```
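The resulting CSV is easy to post-process. The column names below (`class`, `count`) are an assumption about the file layout rather than documented rdfsolve behavior, and the sample data is invented.

```python
# Post-processing a counts CSV. Column names are assumed, data is invented.
import csv
import io

# Stand-in for the counts.csv written by the count command
sample = (
    "class,count\n"
    "http://example.org/Protein,1200\n"
    "http://example.org/Gene,300\n"
)

rows = list(csv.DictReader(io.StringIO(sample)))
rows.sort(key=lambda r: int(r["count"]), reverse=True)

largest = rows[0]["class"]
print(largest)  # http://example.org/Protein
```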
**Service graph filtering:**

By default, `extract` and `count` exclude Virtuoso system graphs and well-known URIs. Use `--include-service-graphs` to include them.
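Conceptually, this filtering amounts to a prefix check on graph URIs. The prefix list below is a guess at typical Virtuoso system graphs; rdfsolve's actual exclusion list may differ, and `filter_service_graphs` is an illustrative helper, not part of the API.

```python
# Rough sketch of service-graph filtering. Prefixes are assumptions.
SERVICE_GRAPH_PREFIXES = (
    "http://www.openlinksw.com/",
    "urn:",
)

def filter_service_graphs(graphs, include_service=False):
    """Drop service graphs unless include_service is set (cf. --include-service-graphs)."""
    if include_service:
        return list(graphs)
    return [g for g in graphs if not g.startswith(SERVICE_GRAPH_PREFIXES)]

graphs = [
    "http://example.org/data",
    "http://www.openlinksw.com/schemas/virtrdf#",
]
print(filter_service_graphs(graphs))  # ['http://example.org/data']
```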
### Python API

```python
from rdfsolve.api import (
    generate_void_from_endpoint,
    load_parser_from_graph,
    count_instances_per_class,
    to_shacl_from_file,
    to_rdfconfig_from_file,
)

# Generate VoID from endpoint
void_graph = generate_void_from_endpoint(
    endpoint_url="https://sparql.example.org/",
    graph_uris=["http://example.org/graph"],
    void_base_uri="http://example.org/void",  # Custom partition URIs
)

# Load parser and extract schema
parser = load_parser_from_graph(void_graph)

# Export to different formats
schema_df = parser.to_schema()      # Pandas DataFrame
schema_jsonld = parser.to_jsonld()  # JSON-LD
linkml_yaml = parser.to_linkml_yaml(
    schema_name="my_schema",
    schema_base_uri="http://example.org/schemas/my_schema",
)

# Export to SHACL shapes for validation
shacl_ttl = parser.to_shacl(
    schema_name="my_schema",
    schema_base_uri="http://example.org/schemas/my_schema",
    closed=True,     # Closed shapes for strict validation
    suffix="Shape",  # Append "Shape" to class names
)

# Or use the convenience function
shacl_ttl = to_shacl_from_file(
    "void_description.ttl",
    schema_name="my_schema",
    closed=True,
)

# Export to RDF-config format
rdfconfig = to_rdfconfig_from_file(
    "void_description.ttl",
    endpoint_url="https://sparql.example.org/",
    graph_uri="http://example.org/graph",
)

# Save to {dataset}_config/ directory structure
import os

os.makedirs("dataset_config", exist_ok=True)
with open("dataset_config/model.yml", "w") as f:
    f.write(rdfconfig["model"])
with open("dataset_config/prefix.yml", "w") as f:
    f.write(rdfconfig["prefix"])
with open("dataset_config/endpoint.yml", "w") as f:
    f.write(rdfconfig["endpoint"])

# Count instances per class
class_counts = count_instances_per_class(
    "https://sparql.example.org/",
    graph_uris=["http://example.org/graph"],
)
```
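To give an intuition for what a VoID-derived schema table contains: each class partition contributes one row per (class, property) pattern. The data below is invented and the dict structure is a simplification; `parser.to_schema()` returns a richer pandas DataFrame.

```python
# Conceptual sketch only: class partitions flattened into (class, property)
# patterns. Invented data; not rdfsolve's internal representation.
partitions = {
    "http://example.org/Protein": ["rdfs:label", "ex:sequence"],
    "http://example.org/Gene": ["rdfs:label"],
}

patterns = [
    (cls, prop)
    for cls, props in partitions.items()
    for prop in props
]
print(len(patterns))  # 3
```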
## Features
- Extract RDF schemas from SPARQL endpoints using VoID partitions
- Discover existing VoID metadata or generate fresh
- Export to multiple formats: CSV, JSON-LD, LinkML, SHACL, RDF-config, coverage analysis
- SHACL shapes generation for RDF data validation
- RDF-config export for schema documentation (compatible with rdf-config tool)
- Customizable dataset naming and VoID partition URIs
- Service graph filtering (excludes Virtuoso system graphs by default)
- Instance counting per class with optional sampling
## Documentation

- Documentation: rdfsolve.readthedocs.io
- Results dashboard: jmillanacosta.github.io/rdfsolve
## License

MIT License - see LICENSE for details.
## File details

### rdfsolve-0.0.1.tar.gz

- Size: 718.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7

| Algorithm | Hash digest |
|---|---|
| SHA256 | `cae5107f05188799255acf21e0bfa1e39b1b2cc77da092eacc109d5efefcf172` |
| MD5 | `0f8b1a059619924cc2fbbcbc8a3821dc` |
| BLAKE2b-256 | `bca59821b4dab7d22471e76d13d50011bafcdd8cedb141c2b1998054a68af6e8` |
**Provenance:** the following attestation bundle was made for rdfsolve-0.0.1.tar.gz:

- Publisher: python-publish.yml on jmillanacosta/rdfsolve
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rdfsolve-0.0.1.tar.gz
- Subject digest: cae5107f05188799255acf21e0bfa1e39b1b2cc77da092eacc109d5efefcf172
- Sigstore transparency entry: 760287799
- Permalink: jmillanacosta/rdfsolve@d264fb26985add9f090c5ec4b66453112bf9fdab
- Branch / Tag: refs/tags/0.0.1
- Owner: https://github.com/jmillanacosta
- Access: public
- Token issuer: https://token.actions.githubusercontent.com
- Runner environment: github-hosted
- Publication workflow: python-publish.yml@d264fb26985add9f090c5ec4b66453112bf9fdab
- Trigger event: release
### rdfsolve-0.0.1-py3-none-any.whl

- Size: 67.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7

| Algorithm | Hash digest |
|---|---|
| SHA256 | `3ff1eec77c57fe11c0cfbfaa9c0ed33db9e0fd8d462780d2db5c8182f7293f22` |
| MD5 | `941044799cde0a57ed36eeecc7743eb8` |
| BLAKE2b-256 | `6a1af7b0b9bc936d8b02c4ad72f0f76e8f2d5b24d4402bdd30b79912a0cad105` |
**Provenance:** the following attestation bundle was made for rdfsolve-0.0.1-py3-none-any.whl:

- Publisher: python-publish.yml on jmillanacosta/rdfsolve
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rdfsolve-0.0.1-py3-none-any.whl
- Subject digest: 3ff1eec77c57fe11c0cfbfaa9c0ed33db9e0fd8d462780d2db5c8182f7293f22
- Sigstore transparency entry: 760287802
- Permalink: jmillanacosta/rdfsolve@d264fb26985add9f090c5ec4b66453112bf9fdab
- Branch / Tag: refs/tags/0.0.1
- Owner: https://github.com/jmillanacosta
- Access: public
- Token issuer: https://token.actions.githubusercontent.com
- Runner environment: github-hosted
- Publication workflow: python-publish.yml@d264fb26985add9f090c5ec4b66453112bf9fdab
- Trigger event: release