Python bindings for the PurRDF RDF 1.2 kernel, GTS carrier, SHACL, and slice tooling

PurRDF

The RDF 1.2 toolkit with a purr: primitives, codecs, SPARQL, SHACL, and graph transport.

One RDF engine. One behavior. Every language.

CI crates.io PyPI npm License: MIT OR Apache-2.0 MSRV 1.96


Why does this exist?

RDF tooling fragments along two axes.

Across languages: every ecosystem has its own parser, with its own bugs, its own corner-case interpretations, and its own subset of the spec. Move a graph from a Rust service to a Python pipeline to a browser and you have silently changed what the data means three times.

Across time: RDF 1.2 — triple terms, reifiers, base-direction literals — is where the standard is going, and almost no incumbent library carries it.

PurRDF exists so that a graph is the same graph everywhere. It is a from-scratch, dependency-light Rust core — parser to SPARQL engine to SHACL validator to binary transport — carried verbatim into Python, WebAssembly/JavaScript, and C. There are deliberately no Cargo feature flags anywhere in the workspace (CI enforces this): a data carrier must not have optional behavior, so every consumer gets the same byte-identical semantics.

PurRDF is the data backbone of the GMEOW stack and the reference home of the GTS graph-transport engine, but it assumes nothing about your ontology or application.

What's inside

  • RDF 1.2 primitives — an immutable, value-interned dataset IR (TermId space, string arena, copy-on-write mutation), with triple terms in object position, reifier/annotation side-tables, and base-direction literals (rdf:dirLangString).
  • Native codecs — first-party parsers/serializers for Turtle, TriG, N-Triples, N-Quads, RDF/XML, JSON-LD (star), and YAML-LD; byte-deterministic output.
  • Canonicalization — W3C RDFC-1.0 dataset canonicalization, tested against the W3C fixture suite.
  • SPARQL 1.1/1.2 — native parser → algebra → multiset evaluator over the interned IR (property paths, aggregates, EXISTS decorrelation, cost-based BGP planning, injectable SERVICE federation), gated by the W3C SPARQL 1.1 conformance harness. Results in SPARQL JSON/XML/CSV/TSV.
  • SHACL Core validation — a native validator gated against the SHACL conformance corpus, plus scoped SHACL 1.2 draft support for reifier shapes. (ShEx is planned; see docs/CUTOVER.md.)
  • GTS graph transport — a single-file, content-addressed, append-only container for RDF 1.2 graphs and the binaries they reference: BLAKE3-chained CBOR segments, deterministic fold, COSE signing/encryption, pure-Rust crypto (wasm-friendly). Spec in docs/GTS-SPEC.md, frozen cross-language conformance vectors in vectors/.
  • Slices, mappings, and provenance — a manifest-based slice catalog with content-addressed artifact IDs, an explicit RDF↔GTS loss ledger (generated/rdf-loss-matrix.json), SSSOM mapping TSV support, and an FnO function-catalog codec.
  • Zero-dependency foundations — purrdf-iri (RFC 3987/3986) and purrdf-xsd (XSD 1.1 value space) have no runtime dependencies at all; purrdf-events (the object-safe ingestion seam) has none either.
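
The GTS item above describes an append-only container whose segments are chained by hash. The following is a minimal sketch of that general idea only, not of the GTS format: it assumes `hashlib.blake2b` as a standard-library stand-in for BLAKE3 and JSON in place of CBOR, and `append_segment` and `verify_chain` are hypothetical names, not PurRDF APIs.

```python
import hashlib
import json

GENESIS = b"\x00" * 32  # assumed all-zero "previous digest" for the first segment


def _digest(prev: bytes, payload: bytes) -> bytes:
    # blake2b stands in for BLAKE3, which is not in the Python stdlib.
    return hashlib.blake2b(prev + payload, digest_size=32).digest()


def append_segment(chain: list[dict], record: dict) -> None:
    """Append a record whose digest also covers the previous segment's digest."""
    prev = bytes.fromhex(chain[-1]["digest"]) if chain else GENESIS
    # Sorted keys and fixed separators keep the payload bytes deterministic.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    chain.append({"payload": payload.decode(), "digest": _digest(prev, payload).hex()})


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every digest from the start; any edit breaks the chain."""
    prev = GENESIS
    for seg in chain:
        expected = _digest(prev, seg["payload"].encode())
        if expected.hex() != seg["digest"]:
            return False
        prev = expected
    return True


chain: list[dict] = []
append_segment(chain, {"s": "https://example.org/alice", "p": "foaf:name", "o": "Alice"})
append_segment(chain, {"s": "https://example.org/alice", "p": "foaf:knows", "o": "https://example.org/bob"})
print(verify_chain(chain))  # True

# Tampering with an earlier payload invalidates verification.
chain[0]["payload"] = chain[0]["payload"].replace("Alice", "Mallory")
print(verify_chain(chain))  # False
```

Because each digest covers the previous one, changing any earlier segment invalidates everything after it, which is what makes an append-only log cheap to verify from the start.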

Quickstart

Rust

cargo add purrdf

use purrdf::{parse_dataset, serialize_dataset, RdfDatasetBuilder, RdfLiteral, SerializeGraph};

// Build a dataset in interned TermId space.
let mut b = RdfDatasetBuilder::new();
let alice = b.intern_iri("https://example.org/alice");
let knows = b.intern_iri("http://xmlns.com/foaf/0.1/knows");
let bob = b.intern_iri("https://example.org/bob");
let name = b.intern_iri("http://xmlns.com/foaf/0.1/name");
let hi = b.intern_literal(RdfLiteral::simple("Alice"));
b.push_quad(alice, knows, bob, None);
b.push_quad(alice, name, hi, None);
let ds = b.freeze().expect("freeze");

// Serialize to any native codec and parse back, losslessly.
let ttl = serialize_dataset(&ds, "text/turtle", SerializeGraph::Dataset).unwrap();
let back = parse_dataset(&ttl, "text/turtle", None).unwrap();
assert_eq!(back.quad_count(), 2);

Python

pip install purrdf

import purrdf

quads = purrdf.parse(
    '<https://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .',
    purrdf.RdfFormat.TURTLE,
)

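# SHACL validation is imported from the purrdf_native module.
# my_shapes and my_data are placeholders for your own shapes (Turtle) and data (N-Triples) strings.
from purrdf_native import shacl
report = shacl.validate(shapes_ttl=my_shapes, data_nt=my_data)
print(report["conforms"])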

The Python package also ships an rdflib compatibility layer and GTS relational exports (gts_to_sqlite, gts_to_duckdb, gts_to_parquet).

JavaScript / WebAssembly

An RDF/JS-shaped API (DataFactory / Dataset / Stream) over the same engine, including the RDF 1.2 features no incumbent RDF/JS library carries — quoted triple terms and base-direction literals:

import { ready, DataFactory, Dataset } from "@blackcatinformatics/purrdf";

await ready(); // one-time async wasm instantiation

const f = new DataFactory();
const rtl = f.directionalLiteral("مرحبا", "ar", "rtl");

const ds = new Dataset();
ds.add(f.quad(f.namedNode("https://ex/s"), f.namedNode("https://ex/says"), rtl));

const nq = ds.serialize("nquads");           // directions survive the round-trip
const reparsed = Dataset.parse(nq, "nquads");

See crates/rdf-wasm (make wasm-pkg builds the ESM package).

C

libpurrdf (crates/rdf-capi) exposes parse, serialize, pattern iteration, copy-on-write mutation, SPARQL, and GTS round-trips behind a panic-safe C ABI with a committed header (include/purrdf.h) that CI checks for drift. Built with cargo-c: make capi-build.

Crate map

| Crate | What it is |
| --- | --- |
| purrdf | Umbrella facade: the RDF surface at the root, slice and shapes as modules. Start here. |
| purrdf-rdf | RDF 1.2 implementation: native codecs, GTS adapters, describe, canonicalization entry points. |
| purrdf-core | The kernel: interned IR, diagnostics, store traits, provenance, loss ledger, RDFC-1.0. |
| purrdf-gts | GTS container engine: reader, writer, fold, verify, COSE sign/encrypt. |
| purrdf-sparql-algebra | SPARQL 1.1/1.2 parser → query algebra AST. |
| purrdf-sparql-eval | Multiset SPARQL evaluator in interned TermId space. |
| purrdf-sparql-results | SPARQL results JSON/XML/CSV/TSV, plus a provenance-carrying extension. |
| purrdf-shapes | SHACL Core validation engine. |
| purrdf-slice | Slice catalog: manifests, typed artifacts, ownership/dependency analysis. |
| purrdf-iri | Zero-dependency IRI/URI parsing, resolution, normalization, CURIEs. |
| purrdf-xsd | Zero-dependency XSD 1.1 value space with SPARQL numeric promotion. |
| purrdf-events | Zero-dependency object-safe RDF event sink/source seam. |
| purrdf-wasm | The wasm32 engine behind the purrdf ESM package. |
| purrdf-capi | libpurrdf C ABI (unpublished; built via cargo-c). |
| purrdf-sparql-conformance | W3C SPARQL conformance harness (unpublished). |
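
The purrdf-sparql-eval entry mentions multiset evaluation. In SPARQL, intermediate results are bags of solution mappings, so duplicates carry a multiplicity instead of collapsing. The sketch below illustrates a bag join in plain Python under that reading. It is not the PurRDF evaluator, and `Solution`, `compatible`, `join`, and `mapping` are hypothetical names.

```python
from collections import Counter

# A solution mapping is a frozenset of (variable, value) pairs; a multiset of
# solutions is a Counter from mapping to multiplicity.
Solution = frozenset


def compatible(a: Solution, b: Solution) -> bool:
    """Two mappings are compatible if they agree on every shared variable."""
    bound = dict(a)
    return all(bound.get(var, val) == val for var, val in b)


def join(left: Counter, right: Counter) -> Counter:
    """Bag join: multiplicities of compatible pairs multiply, then accumulate."""
    out: Counter = Counter()
    for a, n_a in left.items():
        for b, n_b in right.items():
            if compatible(a, b):
                out[a | b] += n_a * n_b
    return out


def mapping(**bindings: str) -> Solution:
    """Build a solution mapping from keyword arguments (variable=value)."""
    return frozenset(bindings.items())


# ?x is bound to ex:alice twice; the duplicate is kept, not collapsed as in a set.
left = Counter({mapping(x="ex:alice"): 2, mapping(x="ex:bob"): 1})
right = Counter({mapping(x="ex:alice", name="Alice"): 1})

result = join(left, right)
print(result[mapping(x="ex:alice", name="Alice")])  # 2
print(len(result))  # 1
```

Keeping counts rather than repeated rows is one simple way to preserve duplicate semantics; a production evaluator would typically stream rows and use hash joins rather than a nested loop.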

Fast by measurement, not by assertion

The IR keeps every term once in a string arena addressed by copyable NonZeroU32 ids, hashes with fixed-key ahash everywhere hot, and freezes datasets into Box<[QuadRow]> tables with lazy ordinal permutation indexes (~4 bytes/quad per axis). Performance claims are backed by criterion benchmarks rather than adjectives — crates/rdf-core/benches/ir_layout.rs measures AoS vs. SoA vs. predicate-adjacency layouts (allocation counts, high-water mark, end-to-end latency), and the shipped layout is whichever wins. Run them with make bench.

Conformance

  • SPARQL — W3C SPARQL 1.1 test suite via purrdf-sparql-conformance.
  • SHACL — gated against the SHACL Core conformance corpus (parity with pySHACL, inference="none").
  • RDFC-1.0 — W3C canonicalization fixtures (crates/rdf/tests/fixtures/rdfc/).
  • GTS — the frozen, byte-exact cross-language vector corpus in vectors/, shared with the other GTS engines.
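
Byte-exact vectors, as in the GTS item above, reduce a conformance check to a bytes comparison. A minimal sketch, assuming a toy serializer that de-duplicates and sorts N-Quads-style lines (this is neither RDFC-1.0 nor the GTS format) and a hypothetical in-memory vector. `serialize_sorted` and `check_vector` are illustrative names, not PurRDF APIs.

```python
def serialize_sorted(statements: list[str]) -> bytes:
    """Toy deterministic serializer: de-duplicate, sort, one statement per line."""
    unique = sorted(set(s.strip() for s in statements if s.strip()))
    return "".join(line + "\n" for line in unique).encode("utf-8")


def check_vector(vector: dict) -> bool:
    """A vector passes only if the output matches the frozen bytes exactly."""
    return serialize_sorted(vector["input"]) == vector["expected"]


# Hypothetical frozen vector: input statements plus the exact expected bytes.
VECTOR = {
    "input": [
        '<https://example.org/bob> <http://xmlns.com/foaf/0.1/name> "Bob" .',
        '<https://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .',
    ],
    "expected": (
        b'<https://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .\n'
        b'<https://example.org/bob> <http://xmlns.com/foaf/0.1/name> "Bob" .\n'
    ),
}

print(check_vector(VECTOR))  # True

# Input order does not affect the output bytes.
reordered = {**VECTOR, "input": list(reversed(VECTOR["input"]))}
print(check_vector(reordered))  # True
```

Comparing bytes rather than parsed graphs means any engine, in any language, can be held to the same expected output without sharing code.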

Development

make metadata   # regenerate + verify generated artifacts
make check      # fmt, build, tests, hygiene gates
make bench      # criterion benchmarks

Releases are tag-driven with OIDC trusted publishing (crates.io and PyPI), with build-provenance attestations and SPDX SBOMs — see docs/RELEASE.md.

The GMEOW family

PurRDF is the library layer of a small family of linked-data projects:

  • gmeow-ontology — the GMEOW reasoning-centric super-vocabulary and its publishing toolchain (PurRDF's primary consumer).
  • gmeow-gts — the GTS specification and its multi-language engines; PurRDF hosts the Rust engine.

Extraction history and source commits: PROVENANCE.md. Brand assets and usage: docs/BRAND.md.

License

Licensed under either of Apache License 2.0 or MIT license at your option, as described in LICENSING.md.

If you use PurRDF in research, please cite it — see CITATION.cff.

Download files

Source distribution: purrdf-0.1.3.tar.gz (1.1 MB)

Built distribution: purrdf-0.1.3-cp313-abi3-manylinux_2_34_x86_64.whl (3.1 MB), for CPython 3.13+ on manylinux (glibc 2.34+), x86-64
File details

purrdf-0.1.3.tar.gz

  • Size: 1.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing: yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13
  • Provenance attestation publisher: release-pypi.yaml on Blackcat-Informatics/purrdf

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | f50e0a4308c5607c87193833d0bcc9cdd4820479f385ef55c8c1c4cbbfdeee6b |
| MD5 | 520bad982d01ec5b337e1081605205af |
| BLAKE2b-256 | afb6efb1eaa8b77640e2689ffa11a29b63963d46bb1c679bf87a6e927da1c67b |

purrdf-0.1.3-cp313-abi3-manylinux_2_34_x86_64.whl

  • Provenance attestation publisher: release-pypi.yaml on Blackcat-Informatics/purrdf

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | b48c25a9d6c21aa9eb1290e2808578a8193343a8a58bce8988ed55a758118746 |
| MD5 | feb51ef31fcd54480d4dbe6a5b4a8d13 |
| BLAKE2b-256 | dcbe199d74825416958f10c9fd58c4dca84540afe71e3793e231b81adcf33492 |