
SQLite-backed RDFLib store


RDFLib-SQLite3

An SQLite-backed RDFLib store.

RDFLib-SQLite3 allows RDFLib RDF graphs to be persisted in an SQLite database. Furthermore, it allows full-text and geospatial indexing using the SQLite FTS5 and R*Tree extensions.

Usage

from rdflib import Graph
import rdflib_sqlite3


# Create a Graph backed by an SQLite database
g = Graph("SQLite3")
# Open and create the database. See https://www.sqlite.org/uri.html for the URI format.
g.open("file:my-rdf.sqlite", create=True)

# do stuff with the graph...
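
The URI format referenced above is the standard SQLite one, so it can be explored with the sqlite3 module from the Python standard library alone. The following is a minimal sketch, not part of RDFLib-SQLite3: the temporary path and the table t are made up for illustration, and whether g.open() passes query parameters such as mode through to SQLite unchanged is an assumption.

```python
import os
import sqlite3
import tempfile

# Hypothetical location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "my-rdf.sqlite")

# mode=rwc: open read-write and create the file if it does not exist.
rw = sqlite3.connect(f"file:{path}?mode=rwc", uri=True)
rw.execute("CREATE TABLE t (x INTEGER)")
rw.execute("INSERT INTO t VALUES (1)")
rw.commit()
rw.close()

# mode=ro: open the same file read-only; any write is rejected.
ro = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
rows = ro.execute("SELECT x FROM t").fetchall()
try:
    ro.execute("INSERT INTO t VALUES (2)")
    write_rejected = False
except sqlite3.OperationalError:
    write_rejected = True
ro.close()
```

A read-only URI like this may be useful for inspecting an existing database without risking modification.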

Goals

RDFLib-SQLite3's primary goal is stability. RDFLib-SQLite3 should remain usable with minimal maintenance for many years, and the data format should be designed for long-term readability.

SQLite is a suitable backend as it uses a stable file format, offers long-term support, is well tested, and is widely used. RDFLib-SQLite3 uses the SQLite bindings that are provided as part of the Python 3 standard library, which will most likely be included in future versions of Python 3.

Database Schema

The RDF graph is persisted in the SQLite database using two tables:

  • rdf_term:

     CREATE TABLE IF NOT EXISTS rdf_term (
     	id INTEGER PRIMARY KEY,
     	term BLOB UNIQUE
     );
    

    This is a mapping from RDF terms encoded using RDF/CBOR to integer identifiers.

  • rdf_triple:

     CREATE TABLE IF NOT EXISTS rdf_triple (
     	subject INTEGER NOT NULL REFERENCES rdf_term ON DELETE RESTRICT,
     	predicate INTEGER NOT NULL REFERENCES rdf_term ON DELETE RESTRICT,
     	object INTEGER NOT NULL REFERENCES rdf_term ON DELETE RESTRICT
     );
    

    Holds triples, with each triple element being an identifier as stored in the rdf_term table. Additional indices are defined on the rdf_triple table for efficient querying (and for ensuring uniqueness of triples).
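
To illustrate how the two tables work together, here is a minimal sketch using only the sqlite3 module. It is not the code used by RDFLib-SQLite3: the helpers term_id and add_triple are hypothetical, terms are encoded as plain UTF-8 bytes as a stand-in for RDF/CBOR, and the unique index rdf_triple_spo is an assumed example of the additional indices mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Foreign keys are only enforced by SQLite when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE IF NOT EXISTS rdf_term (
    id INTEGER PRIMARY KEY,
    term BLOB UNIQUE
);
CREATE TABLE IF NOT EXISTS rdf_triple (
    subject INTEGER NOT NULL REFERENCES rdf_term ON DELETE RESTRICT,
    predicate INTEGER NOT NULL REFERENCES rdf_term ON DELETE RESTRICT,
    object INTEGER NOT NULL REFERENCES rdf_term ON DELETE RESTRICT
);
-- Assumed index: the project's actual index definitions are not shown here.
CREATE UNIQUE INDEX IF NOT EXISTS rdf_triple_spo
    ON rdf_triple (subject, predicate, object);
""")


def term_id(conn, term):
    """Return the integer id for a term, inserting it first if needed."""
    blob = term.encode("utf-8")  # stand-in for the RDF/CBOR encoding
    conn.execute("INSERT OR IGNORE INTO rdf_term (term) VALUES (?)", (blob,))
    row = conn.execute("SELECT id FROM rdf_term WHERE term = ?", (blob,)).fetchone()
    return row[0]


def add_triple(conn, s, p, o):
    """Store a triple as three term ids; duplicates are silently ignored."""
    ids = tuple(term_id(conn, t) for t in (s, p, o))
    conn.execute("INSERT OR IGNORE INTO rdf_triple VALUES (?, ?, ?)", ids)


add_triple(conn, "ex:alice", "foaf:knows", "ex:bob")
add_triple(conn, "ex:alice", "foaf:knows", "ex:bob")  # duplicate, ignored
add_triple(conn, "ex:bob", "foaf:knows", "ex:carol")

term_count = conn.execute("SELECT COUNT(*) FROM rdf_term").fetchone()[0]
triple_count = conn.execute("SELECT COUNT(*) FROM rdf_triple").fetchone()[0]

# ON DELETE RESTRICT: a term still used by a triple cannot be deleted.
try:
    conn.execute("DELETE FROM rdf_term WHERE term = ?", (b"ex:bob",))
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True
```

Because each distinct term is stored once and referenced by id, repeated subjects and predicates cost only an integer per triple.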

Limitations

  • No support for Quads
  • No support for REGEXTerm, Date?, DateRange? queries

TODOs

  • Triple removal
  • Tests
  • Database destruction (destroy method)
  • Database garbage collection (gc method)
  • Full-text search
  • Geospatial queries
  • Make SPARQL queries more efficient. RDFLib provides a SPARQL implementation that works with RDFLib-SQLite3. Unfortunately, performance is very limited as the SPARQL implementation does everything in Python. It would be much more efficient to offload query optimization, joins and even recursive queries to SQLite. This amounts to writing a SPARQL implementation that knows how to take advantage of SQLite.
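
As a rough illustration of what offloading joins and recursive queries to SQLite could look like, the sketch below hand-translates two SPARQL-style patterns into SQL over the schema described above. This is not an existing feature of RDFLib-SQLite3: the helper tid, the sample data, and the UTF-8 term encoding (instead of RDF/CBOR) are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rdf_term (id INTEGER PRIMARY KEY, term BLOB UNIQUE);
CREATE TABLE rdf_triple (
    subject INTEGER NOT NULL REFERENCES rdf_term,
    predicate INTEGER NOT NULL REFERENCES rdf_term,
    object INTEGER NOT NULL REFERENCES rdf_term
);
""")


def tid(term):
    """Hypothetical helper: intern a term and return its integer id."""
    blob = term.encode("utf-8")  # stand-in for the RDF/CBOR encoding
    conn.execute("INSERT OR IGNORE INTO rdf_term (term) VALUES (?)", (blob,))
    return conn.execute("SELECT id FROM rdf_term WHERE term = ?", (blob,)).fetchone()[0]


for s, p, o in [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
    ("ex:carol", "foaf:knows", "ex:dave"),
]:
    conn.execute("INSERT INTO rdf_triple VALUES (?, ?, ?)", (tid(s), tid(p), tid(o)))

# SELECT ?a ?c WHERE { ?a foaf:knows ?b . ?b foaf:knows ?c }
# The shared variable ?b becomes a self-join condition on rdf_triple.
two_hop = conn.execute("""
    SELECT a.term, c.term
    FROM rdf_triple t1
    JOIN rdf_triple t2 ON t2.subject = t1.object
    JOIN rdf_term a ON a.id = t1.subject
    JOIN rdf_term c ON c.id = t2.object
    WHERE t1.predicate = :knows AND t2.predicate = :knows
    ORDER BY a.term
""", {"knows": tid("foaf:knows")}).fetchall()

# SELECT ?x WHERE { ex:alice foaf:knows+ ?x }
# The property path becomes a recursive common table expression;
# UNION (not UNION ALL) removes duplicates, so cycles terminate.
reachable = conn.execute("""
    WITH RECURSIVE reach(node) AS (
        SELECT object FROM rdf_triple
        WHERE subject = :start AND predicate = :knows
        UNION
        SELECT t.object FROM rdf_triple t
        JOIN reach r ON t.subject = r.node
        WHERE t.predicate = :knows
    )
    SELECT term FROM rdf_term JOIN reach ON rdf_term.id = reach.node
    ORDER BY term
""", {"start": tid("ex:alice"), "knows": tid("foaf:knows")}).fetchall()
```

In both queries the joins run on integer ids inside SQLite, and terms are decoded only for the final result rows.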

Related Software

Publishing to PyPI

Make sure the version is set properly in pyproject.toml and rdflib_sqlite3/__init__.py.

pip install build twine

# Build the package
python -m build

# Upload using twine
twine upload dist/*

Acknowledgments

This software was initially developed as part of the SNSF-Ambizione funded research project "Computing the Social. Psychographics and Social Physics in the Digital Age".

License

AGPL-3.0-or-later

