Python bindings for the PurRDF RDF 1.2 kernel, GTS carrier, SHACL, and slice tooling
PurRDF
The RDF 1.2 toolkit with a purr: primitives, codecs, SPARQL, SHACL, and graph transport.
One RDF engine. One behavior. Every language.
Why does this exist?
RDF tooling fragments along two axes.
Across languages: every ecosystem has its own parser, with its own bugs, its own corner-case interpretations, and its own subset of the spec. Move a graph from a Rust service to a Python pipeline to a browser and you have silently changed what the data means three times.
Across time: RDF 1.2 — triple terms, reifiers, base-direction literals — is where the standard is going, and almost no incumbent library carries it.
PurRDF exists so that a graph is the same graph everywhere. It is a from-scratch, dependency-light Rust core (parser, SPARQL engine, SHACL validator, and binary transport) carried verbatim into Python, WebAssembly/JavaScript, and C. There are deliberately no Cargo feature flags anywhere in the workspace (CI enforces this): a data carrier must not have optional behavior, so every consumer gets the same semantics, byte for byte.
PurRDF is the data backbone of the GMEOW stack and the reference home of the GTS graph-transport engine, but it assumes nothing about your ontology or application.
What's inside
- RDF 1.2 primitives: an immutable, value-interned dataset IR (TermId space, string arena, copy-on-write mutation), with triple terms in object position, reifier/annotation side-tables, and base-direction literals (rdf:dirLangString).
- Native codecs: first-party parsers/serializers for Turtle, TriG, N-Triples, N-Quads, RDF/XML, JSON-LD (star), and YAML-LD; byte-deterministic output.
- Canonicalization: W3C RDFC-1.0 dataset canonicalization, tested against the W3C fixture suite.
- SPARQL 1.1/1.2: native parser → algebra → multiset evaluator over the interned IR (property paths, aggregates, EXISTS decorrelation, cost-based BGP planning, injectable SERVICE federation), gated by the W3C SPARQL 1.1 conformance harness. Results in SPARQL JSON/XML/CSV/TSV.
- SHACL Core validation: a native validator gated against the SHACL conformance corpus, plus scoped SHACL 1.2 draft support for reifier shapes. (ShEx is planned; see docs/CUTOVER.md.)
- GTS graph transport: a single-file, content-addressed, append-only container for RDF 1.2 graphs and the binaries they reference, built from BLAKE3-chained CBOR segments with deterministic fold, COSE signing/encryption, and pure-Rust crypto (wasm-friendly). Spec in docs/GTS-SPEC.md, frozen cross-language conformance vectors in vectors/.
- Slices, mappings, and provenance: a manifest-based slice catalog with content-addressed artifact IDs, an explicit RDF↔GTS loss ledger (generated/rdf-loss-matrix.json), SSSOM mapping TSV support, and an FnO function-catalog codec.
- Zero-dependency foundations: purrdf-iri (RFC 3987/3986) and purrdf-xsd (XSD 1.1 value space) have no runtime dependencies at all; purrdf-events (the object-safe ingestion seam) has none either.
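To make the "content-addressed, append-only, hash-chained" idea concrete, here is a minimal sketch of a chained segment log. It is not the GTS wire format (that is defined in docs/GTS-SPEC.md): `hashlib.blake2b` stands in for BLAKE3 and canonical JSON stands in for CBOR purely because both ship with Python, and the names `GENESIS`, `segment_id`, `append`, and `verify` are hypothetical.

```python
import hashlib
import json

GENESIS = b"\x00" * 32  # hypothetical all-zero parent id for the first segment


def segment_id(parent: bytes, payload: bytes) -> bytes:
    """Content address of a segment: a hash over its parent id and its payload."""
    # Stand-in hash: the README names BLAKE3; blake2b is used only because
    # it is available in the standard library.
    return hashlib.blake2b(parent + payload, digest_size=32).digest()


def append(log: list, record: dict) -> None:
    """Append one record; earlier segments are never rewritten."""
    parent = log[-1]["id"] if log else GENESIS
    # Canonical JSON (sorted keys, fixed separators) stands in for deterministic CBOR.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    log.append({"id": segment_id(parent, payload), "parent": parent, "payload": payload})


def verify(log: list) -> bool:
    """Recompute the chain from the start; an edited or reordered segment breaks it."""
    parent = GENESIS
    for seg in log:
        if seg["parent"] != parent or seg["id"] != segment_id(parent, seg["payload"]):
            return False
        parent = seg["id"]
    return True


log = []
append(log, {"op": "add", "quad": ["ex:alice", "foaf:knows", "ex:bob"]})
append(log, {"op": "add", "quad": ["ex:alice", "foaf:name", "\"Alice\""]})
```

Because each id covers the parent id, changing any earlier payload invalidates every later segment, which is what lets a reader verify a file without trusting its source.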
Quickstart
Rust
```sh
cargo add purrdf
```

```rust
use purrdf::{parse_dataset, serialize_dataset, RdfDatasetBuilder, RdfLiteral, SerializeGraph};

// Build a dataset in interned TermId space.
let mut b = RdfDatasetBuilder::new();
let alice = b.intern_iri("https://example.org/alice");
let knows = b.intern_iri("http://xmlns.com/foaf/0.1/knows");
let bob = b.intern_iri("https://example.org/bob");
let name = b.intern_iri("http://xmlns.com/foaf/0.1/name");
let hi = b.intern_literal(RdfLiteral::simple("Alice"));
b.push_quad(alice, knows, bob, None);
b.push_quad(alice, name, hi, None);
let ds = b.freeze().expect("freeze");

// Serialize to any native codec and parse back, losslessly.
let ttl = serialize_dataset(&ds, "text/turtle", SerializeGraph::Dataset).unwrap();
let back = parse_dataset(&ttl, "text/turtle", None).unwrap();
assert_eq!(back.quad_count(), 2);
```
Python
```sh
pip install purrdf
```

```python
import purrdf

quads = purrdf.parse(
    '<https://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .',
    purrdf.RdfFormat.TURTLE,
)

from purrdf_native import shacl

# my_shapes and my_data are placeholders: shapes as Turtle text, data as N-Triples text.
report = shacl.validate(shapes_ttl=my_shapes, data_nt=my_data)
print(report["conforms"])
```
The Python package also ships an rdflib compatibility layer
and GTS relational exports (gts_to_sqlite, gts_to_duckdb, gts_to_parquet).
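The SHACL call in the Python quickstart takes two strings that are not defined there. As a hedged illustration, the sketch below supplies one plausible pair: a small shapes graph in Turtle and a one-triple data graph in N-Triples. The shape itself (`ex:PersonShape`) and the wrapper `run_validation` are assumptions made for this example; only the `shacl.validate(shapes_ttl=..., data_nt=...)` call and the `report["conforms"]` key come from the quickstart.

```python
# Hypothetical inputs for the quickstart's shacl.validate call.
MY_SHAPES = """\
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <https://example.org/> .

ex:PersonShape a sh:NodeShape ;
    sh:targetSubjectsOf foaf:name ;
    sh:property [
        sh:path foaf:name ;
        sh:datatype xsd:string ;
        sh:maxCount 1 ;
    ] .
"""

MY_DATA = '<https://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .\n'


def run_validation():
    """Call the validator as the quickstart does (requires purrdf to be installed)."""
    from purrdf_native import shacl  # import path taken from the quickstart

    report = shacl.validate(shapes_ttl=MY_SHAPES, data_nt=MY_DATA)
    return report["conforms"]
```

The import is deferred into the function so the module still loads where the package is absent. The result is deliberately not asserted here, since it has not been checked against a real install.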
JavaScript / WebAssembly
An RDF/JS-shaped API (DataFactory / Dataset / Stream)
over the same engine, including the RDF 1.2 features no incumbent RDF/JS library
carries — quoted triple terms and base-direction literals:
```js
import { ready, DataFactory, Dataset } from "@blackcatinformatics/purrdf";

await ready(); // one-time async wasm instantiation

const f = new DataFactory();
const rtl = f.directionalLiteral("مرحبا", "ar", "rtl");
const ds = new Dataset();
ds.add(f.quad(f.namedNode("https://ex/s"), f.namedNode("https://ex/says"), rtl));

const nq = ds.serialize("nquads"); // directions survive the round-trip
const reparsed = Dataset.parse(nq, "nquads");
```
See crates/rdf-wasm (make wasm-pkg builds the ESM package).
C
libpurrdf (crates/rdf-capi) exposes parse, serialize,
pattern iteration, copy-on-write mutation, SPARQL, and GTS round-trips behind a
panic-safe C ABI with a committed header (include/purrdf.h)
that CI checks for drift. Built with cargo-c: make capi-build.
Crate map
| Crate | What it is |
|---|---|
| purrdf | Umbrella facade: the RDF surface at the root, slice and shapes as modules. Start here. |
| purrdf-rdf | RDF 1.2 implementation: native codecs, GTS adapters, describe, canonicalization entry points. |
| purrdf-core | The kernel: interned IR, diagnostics, store traits, provenance, loss ledger, RDFC-1.0. |
| purrdf-gts | GTS container engine: reader, writer, fold, verify, COSE sign/encrypt. |
| purrdf-sparql-algebra | SPARQL 1.1/1.2 parser → query algebra AST. |
| purrdf-sparql-eval | Multiset SPARQL evaluator in interned TermId space. |
| purrdf-sparql-results | SPARQL results JSON/XML/CSV/TSV, plus a provenance-carrying extension. |
| purrdf-shapes | SHACL Core validation engine. |
| purrdf-slice | Slice catalog: manifests, typed artifacts, ownership/dependency analysis. |
| purrdf-iri | Zero-dependency IRI/URI parsing, resolution, normalization, CURIEs. |
| purrdf-xsd | Zero-dependency XSD 1.1 value space with SPARQL numeric promotion. |
| purrdf-events | Zero-dependency object-safe RDF event sink/source seam. |
| purrdf-wasm | The wasm32 engine behind the purrdf ESM package. |
| purrdf-capi | libpurrdf C ABI (unpublished; built via cargo-c). |
| purrdf-sparql-conformance | W3C SPARQL conformance harness (unpublished). |
Fast by measurement, not by assertion
The IR keeps every term once in a string arena addressed by copyable
NonZeroU32 ids, hashes with fixed-key ahash everywhere hot, and freezes datasets
into Box<[QuadRow]> tables with lazy ordinal permutation indexes (~4 bytes/quad
per axis). Performance claims are backed by criterion benchmarks rather than
adjectives — crates/rdf-core/benches/ir_layout.rs measures AoS vs. SoA vs.
predicate-adjacency layouts (allocation counts, high-water mark, end-to-end
latency), and the shipped layout is whichever wins. Run them with make bench.
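The layout described above can be hard to picture, so here is a minimal sketch of the two ideas involved: interning every term once behind a small non-zero integer id, and freezing rows into an immutable table with per-column permutation indexes built only when first queried. It is an illustration in Python, not PurRDF's Rust implementation; `Interner` and `FrozenQuads` are hypothetical names, and rows are simplified to (subject, predicate, object) triples of ids.

```python
from array import array
from bisect import bisect_left, bisect_right


class Interner:
    """Stores each distinct term once and hands out small integer ids."""

    def __init__(self):
        self._arena = []   # position (id - 1) -> term text
        self._ids = {}     # term text -> id

    def intern(self, term: str) -> int:
        tid = self._ids.get(term)
        if tid is None:
            self._arena.append(term)
            tid = len(self._arena)  # ids start at 1, mirroring a non-zero id type
            self._ids[term] = tid
        return tid

    def resolve(self, tid: int) -> str:
        return self._arena[tid - 1]


class FrozenQuads:
    """Immutable row table plus per-column permutation indexes built on first use."""

    def __init__(self, rows):
        self._rows = tuple(rows)   # each row: (s, p, o) as integer ids
        self._perms = {}           # column -> array of row ordinals

    def _perm(self, col: int) -> array:
        if col not in self._perms:  # lazy: pay for an index only when it is queried
            order = sorted(range(len(self._rows)), key=lambda i: self._rows[i][col])
            # One unsigned int per row ("I" is 4 bytes on common platforms).
            self._perms[col] = array("I", order)
        return self._perms[col]

    def matching(self, col: int, tid: int):
        """Return the rows whose column `col` equals `tid`, via binary search."""
        perm = self._perm(col)
        # Materialising the keys keeps the sketch short; a real index would search in place.
        keys = [self._rows[i][col] for i in perm]
        lo, hi = bisect_left(keys, tid), bisect_right(keys, tid)
        return [self._rows[i] for i in perm[lo:hi]]


terms = Interner()
alice = terms.intern("https://example.org/alice")
bob = terms.intern("https://example.org/bob")
knows = terms.intern("http://xmlns.com/foaf/0.1/knows")
name = terms.intern("http://xmlns.com/foaf/0.1/name")
hi = terms.intern('"Alice"')
table = FrozenQuads([(alice, knows, bob), (alice, name, hi)])
```

The index holds only row ordinals, never copies of the rows, which is why its cost is one integer per quad per indexed column.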
Conformance
- SPARQL: W3C SPARQL 1.1 test suite via purrdf-sparql-conformance.
- SHACL: gated against the SHACL Core conformance corpus (parity with pySHACL, inference="none").
- RDFC-1.0: W3C canonicalization fixtures (crates/rdf/tests/fixtures/rdfc/).
- GTS: the frozen, byte-exact cross-language vector corpus in vectors/, shared with the other GTS engines.
Development
```sh
make metadata   # regenerate + verify generated artifacts
make check      # fmt, build, tests, hygiene gates
make bench      # criterion benchmarks
```
Releases are tag-driven with OIDC trusted publishing (crates.io and PyPI), with
build-provenance attestations and SPDX SBOMs — see docs/RELEASE.md.
The GMEOW family
PurRDF is the library layer of a small family of linked-data projects:
- gmeow-ontology: the GMEOW reasoning-centric super-vocabulary and its publishing toolchain (PurRDF's primary consumer).
- gmeow-gts: the GTS specification and its multi-language engines; PurRDF hosts the Rust engine.
Extraction history and source commits: PROVENANCE.md.
Brand assets and usage: docs/BRAND.md.
License
Licensed under either of Apache License 2.0 or
MIT license at your option, as described in
LICENSING.md.
If you use PurRDF in research, please cite it — see CITATION.cff.
File details
Details for the file purrdf-0.1.3.tar.gz.
File metadata
- Download URL: purrdf-0.1.3.tar.gz
- Upload date:
- Size: 1.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f50e0a4308c5607c87193833d0bcc9cdd4820479f385ef55c8c1c4cbbfdeee6b |
| MD5 | 520bad982d01ec5b337e1081605205af |
| BLAKE2b-256 | afb6efb1eaa8b77640e2689ffa11a29b63963d46bb1c679bf87a6e927da1c67b |
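A downloaded archive can be checked against the SHA256 digest listed above. The following is a small sketch using only the standard library; the helper names `sha256_of` and `matches_published_digest` are hypothetical, and the local file path is whatever you saved the download as.

```python
import hashlib

# SHA256 digest listed above for purrdf-0.1.3.tar.gz.
EXPECTED_SHA256 = "f50e0a4308c5607c87193833d0bcc9cdd4820479f385ef55c8c1c4cbbfdeee6b"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a local file's digest with the published one (case-insensitive hex)."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_published_digest("purrdf-0.1.3.tar.gz")` on the downloaded file should return `True` only if the bytes match the published artifact.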
Provenance
The following attestation bundles were made for purrdf-0.1.3.tar.gz:
Publisher: release-pypi.yaml on Blackcat-Informatics/purrdf
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: purrdf-0.1.3.tar.gz
  - Subject digest: f50e0a4308c5607c87193833d0bcc9cdd4820479f385ef55c8c1c4cbbfdeee6b
  - Sigstore transparency entry: 2043153761
  - Sigstore integration time:
- Permalink: Blackcat-Informatics/purrdf@29f6f3a986bbbb1d77ef4713491035dc70e509d3
- Branch / Tag: refs/tags/py-v0.1.3
- Owner: https://github.com/Blackcat-Informatics
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-pypi.yaml@29f6f3a986bbbb1d77ef4713491035dc70e509d3
- Trigger Event: push
File details
Details for the file purrdf-0.1.3-cp313-abi3-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: purrdf-0.1.3-cp313-abi3-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 3.1 MB
- Tags: CPython 3.13+, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b48c25a9d6c21aa9eb1290e2808578a8193343a8a58bce8988ed55a758118746 |
| MD5 | feb51ef31fcd54480d4dbe6a5b4a8d13 |
| BLAKE2b-256 | dcbe199d74825416958f10c9fd58c4dca84540afe71e3793e231b81adcf33492 |
Provenance
The following attestation bundles were made for purrdf-0.1.3-cp313-abi3-manylinux_2_34_x86_64.whl:
Publisher: release-pypi.yaml on Blackcat-Informatics/purrdf
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: purrdf-0.1.3-cp313-abi3-manylinux_2_34_x86_64.whl
  - Subject digest: b48c25a9d6c21aa9eb1290e2808578a8193343a8a58bce8988ed55a758118746
  - Sigstore transparency entry: 2043153813
  - Sigstore integration time:
- Permalink: Blackcat-Informatics/purrdf@29f6f3a986bbbb1d77ef4713491035dc70e509d3
- Branch / Tag: refs/tags/py-v0.1.3
- Owner: https://github.com/Blackcat-Informatics
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-pypi.yaml@29f6f3a986bbbb1d77ef4713491035dc70e509d3
- Trigger Event: push