SPARQLMojo
An SQLAlchemy-like ORM for SPARQL endpoints with Pydantic validation. Currently in beta, so there may be breaking changes.
Table of Contents
- Full Documentation Index
- Features
- Installation
- Version
- Usage
- HTTP Method Configuration
- Identity Map
- PREFIX Management System
- Language-Tagged Literals
- Collection Fields
- UPDATE Operations
- VALUES Clause Support
- Property Paths
- Ontology-Aware Models with SchemaRegistry
- Class Hierarchy Support
- Field-Level Filtering
- Running Tests
- Test Dataset
- Limitations
- Known Issues and Risks
- Release Process
- Dependencies
- Key Benefits of Pydantic Integration
- License
Features
- Declarative RDF models using Python classes with Pydantic validation
- Type-safe field definitions with automatic validation
- A session layer for querying and updating SPARQL endpoints
- A query compiler that converts Pythonic queries to SPARQL
- Session identity map to prevent duplicate instances and ensure consistency
- PREFIX management system for namespace handling with short-form IRIs
- Language-tagged literal support for multilingual text data
- Property path support with ORM-like convenience methods and inverse path support for reverse relationship traversal
- Field-level filtering with intuitive syntax and automatic datatype casting for numeric comparisons
- String filtering on IRI fields with chainable str(), lower(), upper() methods for case-insensitive matching
- Ontology-aware models with SchemaRegistry for automatic inverse relationship discovery via owl:inverseOf
- InverseField for clean, semantic reverse relationship navigation with automatic fallback to the SPARQL ^ operator
- Class hierarchy support with automatic polymorphic queries — querying a base class returns all subclass instances without any extra configuration
Installation
# Install dependencies
poetry install
# Or install the package in editable mode
pip install -e .
Version
Check the installed version:
import sparqlmojo
print(sparqlmojo.__version__)  # prints the installed version, e.g. "0.15.2"
Or from the command line:
python -c "import sparqlmojo; print(sparqlmojo.__version__)"
Versioning Workflow
This project uses semantic versioning with automated releases. See the Release Process section for details on creating releases.
Usage
from typing import Annotated
from sparqlmojo import (
Condition,
InverseField,
IRIField,
LiteralField,
Model,
ObjectPropertyField,
RDF_TYPE,
SchemaRegistry,
Session,
SPARQLCompiler,
SubjectField,
)
class Person(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="schema:Person")]
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("schema:name")] = None
age: Annotated[int | None, LiteralField("schema:age")] = None
knows: Annotated[str | None, ObjectPropertyField("schema:knows", range_="Person")] = None
# Create a session
s = Session(endpoint="http://example.org/sparql")
# For endpoints with separate read/write URLs (e.g., Fuseki):
# s = Session(
# endpoint="http://example.org/sparql", # For SELECT queries
# write_endpoint="http://example.org/update" # For INSERT/DELETE/UPDATE
# )
# Configure HTTP method for SELECT queries (see "HTTP Method Configuration" below):
# s = Session(endpoint="http://example.org/sparql", query_method="GET")
# Build and compile a query
q = s.query(Person).filter(Condition("age", ">", 30)).limit(5)
sparql = SPARQLCompiler.compile_query(q)
print(sparql)
# Create an instance with validation
bob = Person(iri="http://example.org/bob", name="Bob", age=28)
s.add(bob)
s.commit()
# Pydantic validates types automatically
try:
invalid = Person(iri="http://example.org/alice", name="Alice", age="not a number") # Raises ValidationError
except Exception as e:
print(f"Validation error: {e}")
HTTP Method Configuration
SPARQLMojo supports configurable HTTP methods for SPARQL SELECT queries. By default, POST is used to avoid URL length limitations with large queries.
Query Methods
| Method | Description | Use Case |
|---|---|---|
| POST | Use HTTP POST for SELECT queries (default) | Recommended for most cases; avoids URL length issues |
| GET | Use HTTP GET for SELECT queries | Required by some read-only endpoints; better caching |
Configuration
from sparqlmojo import Session
# Default: Always use POST (safest option)
session = Session(endpoint="http://example.org/sparql")
# or explicitly:
session = Session(endpoint="http://example.org/sparql", query_method="POST")
# Use GET (for endpoints that require it or for caching benefits)
session = Session(endpoint="http://example.org/sparql", query_method="GET")
When to Use Each Mode
POST (Default)
- Recommended for most applications
- No risk of HTTP 414 "URI Too Long" errors
- Works with queries of any size, including large VALUES clauses
- Some proxies/CDNs may not cache POST requests
GET
- Better HTTP caching (responses can be cached by proxies)
- Required by some read-only SPARQL endpoints
- Risk of HTTP 414 errors with large queries (URLs > 2000 characters)
- Query is visible in server access logs (potential security consideration)
Note: UPDATE queries (INSERT, DELETE) always use POST regardless of this setting, as required by the SPARQL protocol.
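Per the SPARQL 1.1 Protocol, the two modes differ in where the query text travels: in the URL for GET, in a form-encoded body for POST. A minimal sketch of the raw request shapes (no network call is made; the endpoint URL is a placeholder):

```python
from urllib.parse import urlencode

endpoint = "http://example.org/sparql"
query = "SELECT * WHERE { ?s ?p ?o } LIMIT 5"

# GET: the query is a URL parameter, so it counts toward URL length limits
get_url = f"{endpoint}?{urlencode({'query': query})}"

# POST: the query travels in the request body, so size is not an issue
post_body = urlencode({"query": query})
post_headers = {"Content-Type": "application/x-www-form-urlencoded"}

print(get_url)
```

This is why large VALUES clauses are safe under POST but can trip HTTP 414 under GET.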
Identity Map
SPARQLMojo includes a Session identity map to prevent duplicate instances and ensure consistency:
# First retrieval creates new instance
person1 = session.get(Person, "http://example.org/bob")
# Second retrieval returns the SAME instance (not a duplicate)
person2 = session.get(Person, "http://example.org/bob")
assert person1 is person2 # True - same object reference
# Changes to one reference are visible in all references
person1.name = "Robert"
print(person2.name) # "Robert" - same object
Benefits
- Memory Efficiency: Uses weak references for automatic garbage collection
- Consistency: All operations on the same entity work with the same object
- Performance: Avoids creating duplicate objects for the same entity
- Automatic Management: No manual cache management required
Manual Cache Management
# Remove specific instance from identity map
session.expunge(person)
# Clear all instances from identity map
session.expunge_all()
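The mechanics behind the identity map can be pictured with the standard library's weak references. This is an illustrative model only, not SPARQLMojo's actual implementation; the `IdentityMap` and `factory` names are invented for the sketch:

```python
import weakref

class IdentityMap:
    """Minimal identity-map sketch: at most one live object per IRI, weakly held."""
    def __init__(self):
        # weak values: entries disappear once no strong reference remains
        self._instances = weakref.WeakValueDictionary()

    def get_or_add(self, iri, factory):
        obj = self._instances.get(iri)
        if obj is None:
            obj = factory()             # materialize (e.g. from the endpoint)
            self._instances[iri] = obj  # cache without preventing GC
        return obj

class Person:
    def __init__(self, iri):
        self.iri = iri

imap = IdentityMap()
p1 = imap.get_or_add("http://example.org/bob", lambda: Person("http://example.org/bob"))
p2 = imap.get_or_add("http://example.org/bob", lambda: Person("http://example.org/bob"))
assert p1 is p2  # same object for the same IRI
```

The weak-value dictionary is what gives "memory efficiency" above its concrete meaning: cached objects are reclaimed as soon as application code drops its last reference.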
PREFIX Management System
SPARQLMojo includes a comprehensive PREFIX management system for namespace handling:
Features
- Built-in Common Prefixes: schema, foaf, rdf, rdfs, owl, xsd, dc, dcterms, skos, ex
- Custom Prefix Registration: Add your own namespace prefixes
- Short-form IRI Support: Use schema:Person instead of full IRIs
- Automatic PREFIX Declarations: SPARQL queries include proper PREFIX clauses
- IRI Expansion/Contraction: Convert between short-form and full IRIs
Usage
from typing import Annotated
from sparqlmojo import IRIField, LiteralField, Model, RDF_TYPE, Session, SubjectField
# Define model with short-form IRIs
class Person(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="schema:Person")]
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("schema:name")] = None
age: Annotated[int | None, LiteralField("schema:age")] = None
# Create session with built-in prefix registry
session = Session()
# Register custom prefix
session.register_prefix("my", "http://example.org/my/")
# Query generation with automatic PREFIX declarations
query = session.query(Person)
sparql = query.compile()
# Generates: PREFIX schema: <http://schema.org/> ...
# IRI expansion/contraction
expanded = session.expand_iri("schema:Person") # "http://schema.org/Person"
contracted = session.contract_iri("http://schema.org/Person") # "schema:Person"
Benefits
- Improved Developer Experience: No need to write full IRIs everywhere
- Better Readability: Code is more concise and understandable
- Easy Maintenance: Update namespace URIs in one place
- Standards Compliance: Generates proper SPARQL PREFIX declarations
Language-Tagged Literals
SPARQLMojo supports language-tagged literals via LangString and MultiLangString fields for multilingual text data with BCP 47 language tag validation.
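In SPARQL and Turtle, a language-tagged literal is written as `"text"@tag`. A small serializer sketch shows the target syntax; the `lang_literal` helper and the simplified tag pattern are illustrative (full BCP 47 allows more subtag forms than this regex checks), not SPARQLMojo's LangString implementation:

```python
import re

# Simplified language-tag check; real BCP 47 grammar is richer
_LANG_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z0-9]{1,8})*$")

def lang_literal(text: str, tag: str) -> str:
    """Serialize a language-tagged literal as it appears in SPARQL/Turtle."""
    if not _LANG_TAG.match(tag):
        raise ValueError(f"invalid language tag: {tag!r}")
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{tag}'

print(lang_literal("Hello", "en"))     # "Hello"@en
print(lang_literal("Servus", "de-AT")) # "Servus"@de-AT
```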
Collection Fields
SPARQLMojo supports collection fields (LiteralList, LangStringList, IRIList, TypedLiteralList) for aggregating multiple values from multi-valued RDF properties into Python lists, with support for filtering, size limiting, and efficient multi-field queries.
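The aggregation a collection field performs can be pictured as grouping flat SELECT bindings by subject, since SPARQL returns one row per value of a multi-valued property. A sketch of that grouping step (illustrative only; `group_values` and the `max_size` limit are invented names, not the library's internals):

```python
from collections import defaultdict

# Flat rows as a SPARQL SELECT would return them: one row per (subject, value)
rows = [
    {"s": "ex:book1", "author": "ex:alice"},
    {"s": "ex:book1", "author": "ex:bob"},
    {"s": "ex:book2", "author": "ex:carol"},
]

def group_values(rows, subject_var, value_var, max_size=None):
    """Collect a multi-valued property into one Python list per subject."""
    grouped = defaultdict(list)
    for row in rows:
        values = grouped[row[subject_var]]
        if max_size is None or len(values) < max_size:
            values.append(row[value_var])
    return dict(grouped)

print(group_values(rows, "s", "author"))
# {'ex:book1': ['ex:alice', 'ex:bob'], 'ex:book2': ['ex:carol']}
```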
UPDATE Operations
SPARQLMojo supports UPDATE operations with dirty tracking, as well as batch inserts, updates, and deletes with automatic chunking for large datasets.
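Automatic chunking for batch writes amounts to splitting the staged objects into fixed-size groups and issuing one UPDATE request per group. A sketch of the splitting step (the chunk size here is an arbitrary example, not the library's default):

```python
def chunked(items, size):
    """Yield successive fixed-size chunks; the last chunk may be shorter."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

staged = [f"http://example.org/person/{n}" for n in range(7)]
batches = list(chunked(staged, 3))
print([len(b) for b in batches])  # [3, 3, 1] — one UPDATE request per batch
```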
Running Tests
# Run all tests
poetry run pytest
# Run specific test file
poetry run pytest tests/test_basic.py
See Also: Test Fixtures Documentation for comprehensive documentation of shared fixtures, test models, and test organization.
Test Dataset
The project includes a comprehensive library management test dataset in tests/fixtures/library.ttl with Books, Users, and Checkout Records, along with worked examples showing how Python model instances translate to RDF triples.
Limitations
This is a prototype with several intentional limitations:
- No transaction support: Simple staging mechanism for inserts only
- No conflict resolution: Basic operations only
- Not production-ready: Focuses on demonstrating design patterns
For real-world use, consider adding:
- Proper literal typing
- Better parsing of results
- Streaming results and pagination
- Transaction support
Known Issues and Risks
Pydantic Internal API Dependency
SPARQLMojo uses Pydantic's internal ModelMetaclass to enable the intuitive field-level filtering syntax:
# This clean syntax is powered by the custom metaclass
query.filter(Person.name == "Alice")
query.filter(Product.price > 100)
The Risk: The metaclass is imported from Pydantic's private internal API:
from pydantic._internal._model_construction import ModelMetaclass as PydanticModelMetaclass
The _internal prefix indicates this is not part of Pydantic's public API and could change without notice in any Pydantic release. According to the Pydantic maintainers, they "want to be able to refactor the ModelMetaclass without it being considered a breaking change."
What This Means:
- ⚠️ No stability guarantees: The metaclass implementation may change in minor/patch releases
- ⚠️ No deprecation warnings: Changes won't be announced in advance
- ⚠️ Potential breakage: Any Pydantic update could require code changes
Mitigation Strategy:
- Pin Pydantic version carefully in production environments
- Test thoroughly after any Pydantic updates before upgrading
- Fallback available: If the metaclass breaks, fall back to the less elegant method-based approach:
# Alternative syntax that doesn't depend on private APIs
query.filter(Person._get_field_filter("name") == "Alice")
Why We Use It Anyway: The UX benefit of the SQLAlchemy-like syntax is significant for a prototype focused on design clarity. For production use, consider the risk-reward tradeoff for your specific needs.
References:
- Pydantic Issue #6381: ModelMetaclass Import Location
- Pydantic Discussion #7185: ModelField and ModelMetaclass in v2
VALUES Clause Support
SPARQLMojo supports the SPARQL VALUES clause for efficient query constraints with explicit value sets, via both an ORM-style field-reference API and a dict-style API for multi-variable bindings.
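Whichever API is used, the result compiles down to a standard SPARQL VALUES block. A hand-rolled generator sketch shows the target syntax (illustrative only — `values_clause` is not SPARQLMojo's compiler, and terms are assumed to already be valid SPARQL terms):

```python
def values_clause(bindings: dict[str, list[str]]) -> str:
    """Render a SPARQL VALUES block from variable -> term-list bindings.

    All lists must have equal length (one row per index).
    """
    vars_ = list(bindings)
    n_rows = len(bindings[vars_[0]])
    assert all(len(v) == n_rows for v in bindings.values())
    header = " ".join(f"?{v}" for v in vars_)
    rows = "\n".join(
        "  (" + " ".join(bindings[v][i] for v in vars_) + ")"
        for i in range(n_rows)
    )
    return f"VALUES ({header}) {{\n{rows}\n}}"

print(values_clause({"name": ['"Alice"', '"Bob"'], "age": ["30", "28"]}))
```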
Property Paths
SPARQLMojo supports SPARQL property paths for advanced relationship traversal, with ORM-like convenience methods (transitive, zero_or_more, inverse, etc.) and a PropertyPath escape hatch for complex expressions.
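The convenience methods map onto SPARQL 1.1 path operators: `+` (one or more), `*` (zero or more), `^` (inverse), `/` (sequence). A string-level sketch of that mapping (these standalone functions are illustrative, not the library's compiler):

```python
def transitive(p): return f"{p}+"       # one or more hops
def zero_or_more(p): return f"{p}*"     # zero or more hops
def inverse(p): return f"^{p}"          # traverse the property in reverse
def sequence(*ps): return "/".join(ps)  # follow paths left to right

# "people reachable via schema:knows, transitively, then their name"
path = sequence(transitive("schema:knows"), "schema:name")
print(path)  # schema:knows+/schema:name
```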
Ontology-Aware Models with SchemaRegistry
SPARQLMojo provides ontology-aware modeling through SchemaRegistry, enabling automatic inverse relationship discovery via owl:inverseOf and compile-time schema validation (domain, range, cardinality).
Class Hierarchy Support
SPARQLMojo supports rdfs:subClassOf class hierarchies — querying a base class automatically returns all registered subclass instances via polymorphic VALUES ?type queries, with no extra configuration required.
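The polymorphic pattern can be pictured as expanding a base class into a VALUES ?type constraint over every registered subclass. A sketch under assumed class names (the registry structure here is invented for illustration):

```python
# Hypothetical registry: base class IRI -> base plus all registered subclass IRIs
HIERARCHY = {
    "schema:CreativeWork": ["schema:CreativeWork", "schema:Book", "schema:Movie"],
}

def polymorphic_type_constraint(base: str) -> str:
    """Constrain ?type to the base class and every registered subclass."""
    types = HIERARCHY.get(base, [base])
    return "VALUES ?type { " + " ".join(types) + " }"

print(polymorphic_type_constraint("schema:CreativeWork"))
# VALUES ?type { schema:CreativeWork schema:Book schema:Movie }
```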
Field-Level Filtering
SPARQLMojo provides intuitive field-level filtering similar to SQLAlchemy, with Python comparison operators, automatic datatype casting, chainable string methods for IRI fields, and logical operators (and_, or_, not_).
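Conceptually, each Python comparison compiles to a SPARQL FILTER clause, with a cast applied for numeric comparisons. A toy compiler sketch — not SPARQLMojo's actual codegen, and the xsd:decimal cast is an assumed choice of cast function:

```python
def compile_filter(var: str, op: str, value) -> str:
    """Compile one comparison into a SPARQL FILTER clause (sketch)."""
    if isinstance(value, (int, float)):
        # cast so numeric comparison works even on untyped literals
        return f"FILTER(xsd:decimal(?{var}) {op} {value})"
    escaped = str(value).replace('"', '\\"')
    return f'FILTER(?{var} {op} "{escaped}")'

print(compile_filter("age", ">", 30))        # FILTER(xsd:decimal(?age) > 30)
print(compile_filter("name", "=", "Alice"))  # FILTER(?name = "Alice")
```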
Release Process
SPARQLMojo uses a tag-based release workflow with automated CHANGELOG management and Codeberg Releases.
Workflow Overview
- During Development: Update CHANGELOG.md in the [Unreleased] section when creating merge requests
- Accumulate Changes: Multiple MRs can add to [Unreleased] before a release
- Create Release: Tag the commit to trigger automated release creation
For Contributors (Merge Request Time)
When creating a merge request, update CHANGELOG.md under the [Unreleased] section:
## [Unreleased]
### Fixed
- Issue #123: Fixed bug in query compilation
### Added
- New feature for advanced filtering
### Changed
- Improved performance of batch operations
Follow Keep a Changelog format with sections:
- Fixed - Bug fixes
- Added - New features
- Changed - Changes to existing functionality
- Deprecated - Soon-to-be removed features
- Removed - Removed features
- Security - Security fixes
For Maintainers (Release Time)
When ready to release a new version:
# 1. Preview release notes and create tag
./scripts/tag-release.sh v0.12.0
# 2. Push the tag to trigger CI/CD automation
git push origin v0.12.0
The CI/CD workflow (.gitea/workflows/release.yml) automatically:
- Extracts release notes from the [Unreleased] section
- Updates CHANGELOG.md ([Unreleased] → [0.12.0] - 2026-03-05)
- Adds a new empty [Unreleased] section at the top
- Commits and pushes the CHANGELOG update to main
- Creates a Codeberg release with the extracted notes
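The "extract release notes" step boils down to slicing the CHANGELOG between the [Unreleased] heading and the next release heading. A regex sketch of that step (illustrative; the actual workflow script may differ):

```python
import re

CHANGELOG = """\
## [Unreleased]
### Fixed
- Issue #123: Fixed bug in query compilation

## [0.11.0] - 2026-01-10
### Added
- Earlier release notes
"""

def unreleased_notes(text: str) -> str:
    """Return the body of [Unreleased], up to the next '## [' heading."""
    m = re.search(r"^## \[Unreleased\]\n(.*?)(?=^## \[|\Z)", text, re.S | re.M)
    return m.group(1).strip() if m else ""

print(unreleased_notes(CHANGELOG))
```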
Manual Alternative (if CI/CD unavailable):
# 1. Create and push tag
git tag v0.12.0 && git push origin v0.12.0
# 2. Run publish script manually
./scripts/publish-release.sh v0.12.0
# 3. Push CHANGELOG update
git push origin main
Release Scripts
- tag-release.sh - Create an annotated tag with a release notes preview
- publish-release.sh - Update the CHANGELOG and publish to Codeberg
- create-release.sh - Legacy all-in-one script (use tag-release.sh instead)
See scripts/README.md for detailed documentation.
Version Format
Use semantic versioning: vMAJOR.MINOR.PATCH
- MAJOR: Breaking changes
- MINOR: New features (backward compatible)
- PATCH: Bug fixes (backward compatible)
Examples: v0.11.0, v1.0.0, v1.2.3
Dependencies
- pydantic>=2.12.4 - Data validation and type checking
- SPARQLWrapper>=2.0.0 - SPARQL endpoint communication
- rdflib>=6.0.0 - RDF graph parsing and manipulation
Key Benefits of Pydantic Integration
- Type Safety: Fields are validated at runtime against their type annotations
- Better IDE Support: Full autocomplete and type hints in modern IDEs
- Clear Error Messages: Pydantic provides detailed validation errors
- Automatic Coercion: Compatible types are automatically converted (e.g., "123" → 123 for int fields)
- Extra Field Protection: Unknown fields are rejected by default
License
This project is licensed under the MIT License - see the LICENSE file for details.