SPARQLMojo
An SQLAlchemy-like ORM for SPARQL endpoints with Pydantic validation. Currently in beta, so there may be breaking changes.
Features
- Declarative RDF models using Python classes with Pydantic validation
- Type-safe field definitions with automatic validation
- A session layer for querying and updating SPARQL endpoints
- A query compiler that converts Pythonic queries to SPARQL
- Session identity map to prevent duplicate instances and ensure consistency
- PREFIX management system for namespace handling with short-form IRIs
- Language-tagged literal support for multilingual text data
- Property path support with ORM-like convenience methods, including inverse paths for reverse relationship traversal
- Field-level filtering with intuitive syntax and automatic datatype casting for numeric comparisons
- String filtering on IRI fields with chainable str(), lower(), upper() methods for case-insensitive matching
- Ontology-aware models with SchemaRegistry for automatic inverse relationship discovery via owl:inverseOf
- InverseField for clean, semantic reverse relationship navigation with automatic fallback to the SPARQL ^ operator
Installation
# Install dependencies
poetry install
# Or install the package in editable mode
pip install -e .
Version
Check the installed version:
import sparqlmojo
print(sparqlmojo.__version__) # Output: 0.1.0
Or from the command line:
python -c "import sparqlmojo; print(sparqlmojo.__version__)"
Versioning Workflow
This project uses semantic versioning with automated releases. See the Release Process section for details on creating releases.
Usage
from typing import Annotated
from sparqlmojo import (
Condition,
InverseField,
IRIField,
LiteralField,
Model,
ObjectPropertyField,
RDF_TYPE,
SchemaRegistry,
Session,
SPARQLCompiler,
SubjectField,
)
class Person(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="schema:Person")]
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("schema:name")] = None
age: Annotated[int | None, LiteralField("schema:age")] = None
knows: Annotated[str | None, ObjectPropertyField("schema:knows", range_="Person")] = None
# Create a session
s = Session(endpoint="http://example.org/sparql")
# For endpoints with separate read/write URLs (e.g., Fuseki):
# s = Session(
# endpoint="http://example.org/sparql", # For SELECT queries
# write_endpoint="http://example.org/update" # For INSERT/DELETE/UPDATE
# )
# Configure HTTP method for SELECT queries (see "HTTP Method Configuration" below):
# s = Session(endpoint="http://example.org/sparql", query_method="GET")
# Build and compile a query
q = s.query(Person).filter(Condition("age", ">", 30)).limit(5)
sparql = SPARQLCompiler.compile_query(q)
print(sparql)
# Create an instance with validation
bob = Person(iri="http://example.org/bob", name="Bob", age=28)
s.add(bob)
s.commit()
# Pydantic validates types automatically
try:
invalid = Person(iri="http://example.org/alice", name="Alice", age="not a number") # Raises ValidationError
except Exception as e:
print(f"Validation error: {e}")
HTTP Method Configuration
SPARQLMojo supports configurable HTTP methods for SPARQL SELECT queries. By default, POST is used to avoid URL length limitations with large queries.
Query Methods
| Method | Description | Use Case |
|---|---|---|
| POST | Use HTTP POST for SELECT queries (default) | Recommended for most cases; avoids URL length issues |
| GET | Use HTTP GET for SELECT queries | Required by some read-only endpoints; better caching |
Configuration
from sparqlmojo import Session
# Default: Always use POST (safest option)
session = Session(endpoint="http://example.org/sparql")
# or explicitly:
session = Session(endpoint="http://example.org/sparql", query_method="POST")
# Use GET (for endpoints that require it or for caching benefits)
session = Session(endpoint="http://example.org/sparql", query_method="GET")
When to Use Each Mode
POST (Default)
- Recommended for most applications
- No risk of HTTP 414 "URI Too Long" errors
- Works with queries of any size, including large VALUES clauses
- Some proxies/CDNs may not cache POST requests
GET
- Better HTTP caching (responses can be cached by proxies)
- Required by some read-only SPARQL endpoints
- Risk of HTTP 414 errors with large queries (URLs > 2000 characters)
- Query is visible in server access logs (potential security consideration)
Note: UPDATE queries (INSERT, DELETE) always use POST regardless of this setting, as required by the SPARQL protocol.
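For reference, the SPARQL 1.1 Protocol framing behind these two modes can be sketched as follows. This is independent of SPARQLMojo; the function name and return shape are illustrative, showing only why GET embeds the query in the URL while POST carries it in the body:

```python
from urllib.parse import urlencode

def build_select_request(endpoint: str, query: str, method: str = "POST"):
    """Return (method, url, headers, body) for a SPARQL SELECT query."""
    if method == "GET":
        # Query goes into the URL -> subject to URL length limits (HTTP 414)
        url = f"{endpoint}?{urlencode({'query': query})}"
        return ("GET", url, {"Accept": "application/sparql-results+json"}, None)
    # POST with a form-encoded body -> no URL length limit
    body = urlencode({"query": query})
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/sparql-results+json",
    }
    return ("POST", endpoint, headers, body)
```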
Identity Map
SPARQLMojo now includes a Session identity map to prevent duplicate instances and ensure consistency:
# First retrieval creates new instance
person1 = session.get(Person, "http://example.org/bob")
# Second retrieval returns the SAME instance (not a duplicate)
person2 = session.get(Person, "http://example.org/bob")
assert person1 is person2 # True - same object reference
# Changes to one reference are visible in all references
person1.name = "Robert"
print(person2.name) # "Robert" - same object
Benefits
- Memory Efficiency: Uses weak references for automatic garbage collection
- Consistency: All operations on the same entity work with the same object
- Performance: Avoids creating duplicate objects for the same entity
- Automatic Management: No manual cache management required
Manual Cache Management
# Remove specific instance from identity map
session.expunge(person)
# Clear all instances from identity map
session.expunge_all()
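The identity-map mechanism described above can be sketched with the standard library's weak references. This is a simplified illustration of the pattern, not SPARQLMojo's actual implementation; the class and method names are hypothetical:

```python
import weakref

class IdentityMap:
    """Cache entities by IRI without preventing garbage collection."""

    def __init__(self):
        # Weak values: entries vanish once no other code holds the instance
        self._instances = weakref.WeakValueDictionary()

    def get_or_store(self, iri, factory):
        """Return the cached instance for `iri`, creating one if needed."""
        obj = self._instances.get(iri)
        if obj is None:
            obj = factory()
            self._instances[iri] = obj
        return obj
```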
PREFIX Management System
SPARQLMojo now includes a comprehensive PREFIX management system for namespace handling:
Features
- Built-in Common Prefixes: schema, foaf, rdf, rdfs, owl, xsd, dc, dcterms, skos, ex
- Custom Prefix Registration: Add your own namespace prefixes
- Short-form IRI Support: Use schema:Person instead of full IRIs
- Automatic PREFIX Declarations: SPARQL queries include proper PREFIX clauses
- IRI Expansion/Contraction: Convert between short-form and full IRIs
Usage
from typing import Annotated
from sparqlmojo import IRIField, LiteralField, Model, RDF_TYPE, Session, SubjectField
# Define model with short-form IRIs
class Person(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="schema:Person")]
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("schema:name")] = None
age: Annotated[int | None, LiteralField("schema:age")] = None
# Create session with built-in prefix registry
session = Session()
# Register custom prefix
session.register_prefix("my", "http://example.org/my/")
# Query generation with automatic PREFIX declarations
query = session.query(Person)
sparql = query.compile()
# Generates: PREFIX schema: <http://schema.org/> ...
# IRI expansion/contraction
expanded = session.expand_iri("schema:Person") # "http://schema.org/Person"
contracted = session.contract_iri("http://schema.org/Person") # "schema:Person"
Benefits
- Improved Developer Experience: No need to write full IRIs everywhere
- Better Readability: Code is more concise and understandable
- Easy Maintenance: Update namespace URIs in one place
- Standards Compliance: Generates proper SPARQL PREFIX declarations
Language-Tagged Literals
SPARQLMojo now supports language-tagged literals for multilingual text data with BCP 47 language tag validation:
LangString Field
Store single-language text with language tags:
from typing import Annotated
from sparqlmojo import IRIField, LangString, Model, RDF_TYPE, SubjectField
class Article(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Article")]
iri: Annotated[str, SubjectField()]
title_en: Annotated[str | None, LangString("http://schema.org/name", lang="en")] = None
title_fr: Annotated[str | None, LangString("http://schema.org/name", lang="fr")] = None
article = Article(
iri="http://example.org/article1",
title_en="Hello World",
title_fr="Bonjour le monde"
)
# Generates SPARQL with language tags:
# <article1> schema:name "Hello World"@en .
# <article1> schema:name "Bonjour le monde"@fr .
MultiLangString Field
Store multiple language versions in a single field:
from typing import Annotated
from sparqlmojo import IRIField, Model, MultiLangString, RDF_TYPE, SubjectField
class Document(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Document")]
iri: Annotated[str, SubjectField()]
title: Annotated[dict[str, str] | None, MultiLangString("http://schema.org/name")] = None
doc = Document(
iri="http://example.org/doc1",
title={
"en": "Hello",
"fr": "Bonjour",
"de": "Hallo",
"es": "Hola"
}
)
# Generates multiple SPARQL triples:
# <doc1> schema:name "Hello"@en .
# <doc1> schema:name "Bonjour"@fr .
# <doc1> schema:name "Hallo"@de .
# <doc1> schema:name "Hola"@es .
Complex Language Tags
Support for BCP 47 language tags with region and script codes:
from typing import Annotated
from sparqlmojo import IRIField, LangString, Model, MultiLangString, RDF_TYPE, SubjectField
class InternationalContent(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Article")]
iri: Annotated[str, SubjectField()]
# Region-specific variants
title_us: Annotated[str | None, LangString("http://schema.org/name", lang="en-US")] = None
title_gb: Annotated[str | None, LangString("http://schema.org/name", lang="en-GB")] = None
# Script-specific variants in a single field
chinese_title: Annotated[dict[str, str] | None, MultiLangString("http://schema.org/name")] = None
content = InternationalContent(
iri="http://example.org/content1",
title_us="Color",
title_gb="Colour",
chinese_title={
"zh-Hans": "简体中文", # Simplified Chinese
"zh-Hant": "繁體中文", # Traditional Chinese
}
)
Language Tag Validation
All language tags are validated against BCP 47 format:
# Valid tags
LangString("...", lang="en") # Simple language
LangString("...", lang="en-US") # Language + region
LangString("...", lang="zh-Hans") # Language + script
LangString("...", lang="zh-Hans-CN") # Language + script + region
# Invalid tags (will raise ValueError)
LangString("...", lang="EN") # Must be lowercase
LangString("...", lang="en us") # No spaces allowed
LangString("...", lang="english") # Must be 2-3 letter code
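A simplified validator covering exactly the tag shapes shown above might look like this. This is a hypothetical sketch; SPARQLMojo's real BCP 47 check (and full BCP 47 grammar) is stricter than this three-subtag pattern:

```python
import re

# language (2-3 lowercase letters) + optional script (e.g. Hans)
# + optional region (e.g. US); a deliberate subset of BCP 47
_LANG_TAG = re.compile(
    r"^[a-z]{2,3}"          # primary language subtag, lowercase
    r"(-[A-Z][a-z]{3})?"    # optional script subtag
    r"(-[A-Z]{2})?$"        # optional region subtag
)

def is_valid_lang_tag(tag: str) -> bool:
    return bool(_LANG_TAG.match(tag))
```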
Benefits
- RDF Standards Compliance: Proper @lang tag syntax with BCP 47 validation
- Multilingual Support: Store and retrieve text in multiple languages
- Flexible Data Modeling: Choose between separate fields or single multi-language field
- Automatic SPARQL Generation: Language tags are automatically added to generated queries
- Type Safety: Full Pydantic validation for field values and language codes
Collection Fields
SPARQLMojo supports collection fields for aggregating multiple values from multi-valued RDF properties into Python lists.
LiteralList - Aggregate Multiple Literal Values
from typing import Annotated
from sparqlmojo import IRIField, LiteralList, Model, RDF_TYPE, SubjectField
class Product(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Product")]
iri: Annotated[str, SubjectField()]
tags: Annotated[list[str] | None, LiteralList("http://schema.org/keywords")] = None
# Query returns all keyword values as a Python list
product = session.query(Product).first()
print(product.tags) # ['electronics', 'gadgets', 'portable']
LangStringList - Aggregate Language-Tagged Literals
For multi-valued properties with language tags (like rdfs:label with multiple translations):
from typing import Annotated
from sparqlmojo import IRIField, LangStringList, Model, RDF_TYPE, SubjectField
from sparqlmojo.orm.model import LangLiteral
class City(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/City")]
iri: Annotated[str, SubjectField()]
labels: Annotated[list[LangLiteral] | None, LangStringList(
"http://www.w3.org/2000/01/rdf-schema#label"
)] = None
# Query returns all labels with their language tags
city = session.query(City).first()
for label in city.labels:
print(f"{label.value} ({label.lang})")
# Output:
# Berlin (en)
# Berlin (de)
# Berlín (es)
IRIList - Aggregate Multiple IRI References
For multi-valued object properties:
from typing import Annotated
from sparqlmojo import IRIField, IRIList, Model, RDF_TYPE, SubjectField
class Person(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Person")]
iri: Annotated[str, SubjectField()]
friends: Annotated[list[str] | None, IRIList("http://schema.org/knows")] = None
# Query returns all friend IRIs as a list
person = session.query(Person).first()
print(person.friends)
# ['http://example.org/alice', 'http://example.org/bob', 'http://example.org/charlie']
TypedLiteralList - Aggregate Typed Literals with XSD Datatype Preservation
For multi-valued properties where you need to preserve the XSD datatype information (e.g., integers, decimals, dates):
from typing import Annotated
from sparqlmojo import IRIField, Model, RDF_TYPE, SubjectField, TypedLiteral, TypedLiteralList
class Document(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://example.org/Document")]
iri: Annotated[str, SubjectField()]
page_counts: Annotated[
list[TypedLiteral] | None,
TypedLiteralList("http://example.org/pageCount")
] = None
# Query returns TypedLiteral objects with preserved datatypes
doc = session.query(Document).first()
for pc in doc.page_counts:
print(f"{pc.value} (type: {type(pc.value).__name__}, datatype: {pc.datatype})")
# Output:
# 42 (type: int, datatype: http://www.w3.org/2001/XMLSchema#integer)
# 3.14 (type: Decimal, datatype: http://www.w3.org/2001/XMLSchema#decimal)
Type Conversion Mapping:
| XSD Datatype | Python Type |
|---|---|
| xsd:integer | int |
| xsd:decimal | decimal.Decimal |
| xsd:float | float |
| xsd:double | float |
| xsd:boolean | bool |
| xsd:date | datetime.date |
| xsd:dateTime | datetime.datetime |
| Unknown types | str |
Unlike LiteralList which loses datatype information during aggregation, TypedLiteralList preserves the XSD datatype IRI alongside each value, enabling proper Python type conversion.
Custom Separators
Collection fields use GROUP_CONCAT internally. You can customize the separator:
# Default separator is ASCII Unit Separator (\\x1f)
tags: Annotated[list[str] | None, LiteralList(
"http://schema.org/keywords",
separator="|" # Use pipe as separator
)] = None
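Client-side, the round trip amounts to splitting the GROUP_CONCAT string on the chosen separator. A sketch (the helper name is hypothetical):

```python
def split_group_concat(raw: str, separator: str = "\x1f") -> list[str]:
    """Split a GROUP_CONCAT result back into a list of values.

    An empty string means the property had no values, so return [].
    """
    return raw.split(separator) if raw else []
```

The ASCII Unit Separator default matters: unlike a comma or pipe, it is vanishingly unlikely to appear inside literal values, so splitting is safe without escaping.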
Limiting Collection Size
For properties with potentially millions of values (e.g., Wikidata's wdt:P31 instances), use the limit parameter to prevent memory issues:
from typing import Annotated
from sparqlmojo import IRIList, LangStringList, Model, SubjectField
from sparqlmojo.orm.model import LangLiteral
class WikidataClass(Model):
iri: Annotated[str, SubjectField()]
# Limit to first 1000 instances to avoid OOM on classes with millions
instances: Annotated[list[str] | None, IRIList(
"^wdt:P31", # Inverse path: entities that have this as their type
limit=1000
)] = None
# Limit labels to 100 per entity
labels: Annotated[list[LangLiteral] | None, LangStringList(
"rdfs:label",
limit=100
)] = None
The limit parameter:
- Must be a positive integer (raises TypeError for non-integers, ValueError for non-positive values)
- Applies LIMIT inside a nested SELECT before GROUP_CONCAT aggregation
- Defaults to None (unlimited)
Note: SPARQL LIMIT without ORDER BY returns results in arbitrary order, so "first N values" is not deterministic.
Multiple Collection Fields
Models can have multiple collection fields. SPARQLMojo uses scalar subqueries internally to avoid cartesian product explosion when querying models with multiple collection fields:
class WikidataEntity(Model):
# No rdf_type field - queries any entity without type constraint
iri: Annotated[str, SubjectField()]
labels: Annotated[list[LangLiteral] | None, LangStringList("rdfs:label")] = None
descriptions: Annotated[list[LangLiteral] | None, LangStringList("schema:description")] = None
aliases: Annotated[list[LangLiteral] | None, LangStringList("skos:altLabel")] = None
types: Annotated[list[str] | None, IRIList("wdt:P31")] = None
# Efficiently queries all collection fields without performance issues
entity = session.query(WikidataEntity).filter_by(s="http://www.wikidata.org/entity/Q42").first()
Filtering Collection Fields
Collection fields support polymorphic contains() for membership filtering, following SQLAlchemy conventions:
from typing import Annotated
from sparqlmojo import IRIField, IRIList, LiteralList, Model, RDF_TYPE, Session, SubjectField
class Book(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Book")]
iri: Annotated[str, SubjectField()]
genres: Annotated[list[str] | None, LiteralList("http://schema.org/genre")] = None
related_works: Annotated[list[str] | None, IRIList("http://schema.org/relatedLink")] = None
session = Session()
# Filter books that have "Science Fiction" as a genre
query = session.query(Book).filter(Book.genres.contains("Science Fiction"))
# Generates triple pattern: ?s <http://schema.org/genre> "Science Fiction" .
# Filter books related to a specific work
query = session.query(Book).filter(
Book.related_works.contains("http://example.org/books/dune")
)
# Generates: ?s <http://schema.org/relatedLink> <http://example.org/books/dune> .
Polymorphic Behavior: The contains() method behaves differently based on field type:
- Regular fields (LiteralField, LangString): Substring matching with FILTER(CONTAINS(...))
- Collection fields (LiteralList, IRIList, etc.): Membership check via triple pattern
This follows SQLAlchemy's convention where contains() does the right thing based on context.
Benefits
- Natural Python API: Work with Python lists instead of raw SPARQL results
- Efficient Queries: Uses SPARQL 1.1 scalar subqueries for optimal performance
- Language Tag Preservation: LangStringList maintains value-language associations
- Multiple Collection Support: Query models with many collection fields without cartesian products
- Intuitive Filtering: Polymorphic contains() works naturally for both substring and membership checks
UPDATE Operations
SPARQLMojo now supports UPDATE operations with dirty tracking:
# Get an existing person from the database
person = s.get(Person, "http://example.org/bob")
# Modify fields - changes are automatically tracked
person.age = 29
person.name = "Robert"
# Stage the update (only modified fields will be updated)
s.update(person)
# Commit the changes
s.commit() # Executes SPARQL DELETE/INSERT for changed fields
Dirty Tracking
person = Person(iri="http://example.org/bob", name="Bob", age=30)
# Mark as clean (baseline state)
person.mark_clean()
# Check if modified
print(person.is_dirty()) # False
# Modify a field
person.age = 31
print(person.is_dirty()) # True
# Get changes
changes = person.get_changes()
# {'age': (30, 31)}
# Reset tracking
person.mark_clean()
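The dirty-tracking behaviour above can be sketched as a snapshot comparison: record a baseline of field values at mark_clean() and diff against it later. This is a simplified stand-in, not SPARQLMojo's Model class:

```python
class Tracked:
    """Minimal snapshot-based dirty tracking."""

    def __init__(self, **fields):
        self.__dict__.update(fields)
        self._baseline = dict(fields)  # state as of the last mark_clean()

    def mark_clean(self):
        self._baseline = {
            k: v for k, v in self.__dict__.items() if k != "_baseline"
        }

    def get_changes(self):
        """Map of field -> (old, new) for every field that differs."""
        current = {k: v for k, v in self.__dict__.items() if k != "_baseline"}
        return {
            k: (self._baseline.get(k), v)
            for k, v in current.items()
            if self._baseline.get(k) != v
        }

    def is_dirty(self):
        return bool(self.get_changes())
```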
Partial Updates
Only fields that have been modified since mark_clean() was called will be updated:
person = s.get(Person, "http://example.org/bob") # Automatically marked clean
# Only age is modified
person.age = 31
s.update(person) # Only generates UPDATE for age field
s.commit()
SPARQL Generated
The update generates SPARQL DELETE/INSERT statements:
DELETE DATA {
<http://example.org/bob> <http://schema.org/age> "30" .
} ;
INSERT DATA {
<http://example.org/bob> <http://schema.org/age> "31" .
}
Batch Operations
SPARQLMojo now supports efficient batch operations for working with multiple instances:
Batch Inserts
# Create multiple instances
people = [
Person(iri=f"http://example.org/person{i}", name=f"Person{i}", age=20 + i)
for i in range(100)
]
# Add all instances in a single batch operation
s.add_all(people)
s.commit() # Generates efficient INSERT DATA with all triples
Batch Updates
# Get multiple instances
people = [s.get(Person, f"http://example.org/person{i}") for i in range(10)]
# Modify instances (dirty tracking works with batches)
for person in people:
person.age += 1
# Update all modified instances in batch
s.update_all(people)
s.commit() # Only generates updates for actually modified fields
Batch Deletes
# Create instances to delete
people_to_delete = [
Person(iri=f"http://example.org/person{i}")
for i in range(50, 100)
]
# Delete all instances in batch
s.delete_all(people_to_delete)
s.commit() # Generates efficient DELETE WHERE queries
Chunking for Large Batches
For very large datasets, SPARQLMojo automatically chunks operations:
# Configure chunk size (default: 1000 triples)
session = Session(max_batch_size=500)
# Large batch will be automatically chunked
large_batch = [Person(iri=f"http://example.org/person{i}", name=f"Person{i}") for i in range(10000)]
s.add_all(large_batch)
s.commit() # Automatically splits into multiple INSERT DATA queries
Performance Benefits
- Reduced overhead: Single method call instead of many individual calls
- Optimized SPARQL: Efficient INSERT DATA queries with many triples
- Automatic chunking: Prevents query size limits on endpoints
- Memory efficient: Processes large datasets in manageable chunks
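The chunking idea can be sketched generically: split a flat list of triples into batches of at most `max_batch_size` and emit one INSERT DATA query per batch. This is illustrative only; SPARQLMojo's internal implementation may differ:

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def insert_data_queries(triples: list[str], max_batch_size: int = 1000) -> list[str]:
    """Build one INSERT DATA query per chunk of triples."""
    return [
        "INSERT DATA {\n" + "\n".join(batch) + "\n}"
        for batch in chunked(triples, max_batch_size)
    ]
```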
Running Tests
# Run all tests
poetry run pytest
# Run specific test file
poetry run pytest tests/test_basic.py
See Also: Test Fixtures Documentation for comprehensive documentation of shared fixtures, test models, and test organization.
Test Dataset
The project includes a comprehensive library management test dataset in tests/fixtures/library.ttl with:
- 10 Books (classics like "The Great Gatsby", "1984", "Pride and Prejudice")
- 10 Users (library patrons with member IDs and contact information)
- 5 Checkout Records (linking books to users with checkout/due dates)
- Multiple Status Types (checked in, checked out, overdue)
Model Definitions
The test fixtures define three interconnected models:
from typing import Annotated
from sparqlmojo import IRIField, LiteralField, Model, ObjectPropertyField, RDF_TYPE, SubjectField
class Book(Model):
"""Book model for library system."""
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Book")]
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("http://schema.org/name")] = None
author: Annotated[str | None, LiteralField("http://schema.org/author")] = None
isbn: Annotated[str | None, LiteralField("http://schema.org/isbn")] = None
date_published: Annotated[str | None, LiteralField("http://schema.org/datePublished")] = None
status: Annotated[str | None, ObjectPropertyField("http://example.org/library/vocab/status")] = None
class Person(Model):
"""Person/User model for library system."""
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Person")]
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("http://schema.org/name")] = None
email: Annotated[str | None, LiteralField("http://schema.org/email")] = None
member_id: Annotated[str | None, LiteralField("http://example.org/library/vocab/memberId")] = None
member_since: Annotated[str | None, LiteralField("http://example.org/library/vocab/memberSince")] = None
class CheckoutRecord(Model):
"""Checkout record linking books to patrons."""
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://example.org/library/vocab/CheckoutRecord")]
iri: Annotated[str, SubjectField()]
patron: Annotated[str | None, ObjectPropertyField("http://example.org/library/vocab/patron")] = None
book: Annotated[str | None, ObjectPropertyField("http://example.org/library/vocab/book")] = None
checkout_date: Annotated[str | None, LiteralField("http://example.org/library/vocab/checkoutDate")] = None
due_date: Annotated[str | None, LiteralField("http://example.org/library/vocab/dueDate")] = None
status: Annotated[str | None, LiteralField("http://example.org/library/vocab/status")] = None
Python to RDF Triple Translation
Here's how SPARQLMojo translates Python model instances to RDF triples:
Python Code
from sparqlmojo import Session
# Create model instances
book = Book(
iri="http://example.org/library/book1",
name="The Great Gatsby",
author="F. Scott Fitzgerald",
isbn="978-0743273565",
date_published="1925"
)
person = Person(
iri="http://example.org/library/user1",
name="Alice Johnson",
email="alice.johnson@example.com",
member_id="LIB001",
member_since="2020-01-15"
)
checkout = CheckoutRecord(
iri="http://example.org/library/checkout1",
patron="http://example.org/library/user1",
book="http://example.org/library/book1",
checkout_date="2025-10-20",
due_date="2025-11-20",
status="active"
)
# Add to session and commit
session = Session(endpoint="http://example.org/sparql")
session.add(book)
session.add(person)
session.add(checkout)
session.commit()
Generated RDF Triples (Turtle Format)
# Book triples
<http://example.org/library/book1> a <http://schema.org/Book> .
<http://example.org/library/book1> <http://schema.org/name> "The Great Gatsby" .
<http://example.org/library/book1> <http://schema.org/author> "F. Scott Fitzgerald" .
<http://example.org/library/book1> <http://schema.org/isbn> "978-0743273565" .
<http://example.org/library/book1> <http://schema.org/datePublished> "1925" .
# Person triples
<http://example.org/library/user1> a <http://schema.org/Person> .
<http://example.org/library/user1> <http://schema.org/name> "Alice Johnson" .
<http://example.org/library/user1> <http://schema.org/email> "alice.johnson@example.com" .
<http://example.org/library/user1> <http://example.org/library/vocab/memberId> "LIB001" .
<http://example.org/library/user1> <http://example.org/library/vocab/memberSince> "2020-01-15" .
# CheckoutRecord triples (note: ObjectProperty fields become IRI references)
<http://example.org/library/checkout1> a <http://example.org/library/vocab/CheckoutRecord> .
<http://example.org/library/checkout1> <http://example.org/library/vocab/patron> <http://example.org/library/user1> .
<http://example.org/library/checkout1> <http://example.org/library/vocab/book> <http://example.org/library/book1> .
<http://example.org/library/checkout1> <http://example.org/library/vocab/checkoutDate> "2025-10-20" .
<http://example.org/library/checkout1> <http://example.org/library/vocab/dueDate> "2025-11-20" .
<http://example.org/library/checkout1> <http://example.org/library/vocab/status> "active" .
Key Translation Rules:
- Type Declaration: The rdf_type IRIField with the RDF_TYPE predicate becomes the rdf:type triple (shown as a in Turtle)
- Subject IRI: The iri SubjectField becomes the subject of all triples
- Literal Fields: Python strings/numbers become quoted literals in RDF
- ObjectProperty Fields: Python IRI strings become unquoted IRI references (linking entities)
- Field Names: Python snake_case field names map to full predicate IRIs defined in the model
This mapping allows you to work with Pythonic objects while maintaining full RDF semantics in the underlying data store.
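The translation rules can be sketched with plain dicts instead of SPARQLMojo's field metadata. The function name and dict-based "model" are illustrative assumptions:

```python
def to_triples(iri: str, rdf_type: str, literals: dict, objects: dict) -> list[str]:
    """Translate one entity into N-Triples-style lines.

    `literals` maps predicate IRI -> literal value (quoted in output);
    `objects` maps predicate IRI -> target IRI (unquoted, angle-bracketed).
    """
    triples = [f"<{iri}> a <{rdf_type}> ."]
    for pred, value in literals.items():
        triples.append(f'<{iri}> <{pred}> "{value}" .')   # literal: quoted
    for pred, target in objects.items():
        triples.append(f"<{iri}> <{pred}> <{target}> .")  # IRI: linked entity
    return triples
```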
Limitations
This is a prototype with several intentional limitations:
- No transaction support: Simple staging mechanism for inserts only
- No conflict resolution: Basic operations only
- Not production-ready: Focuses on demonstrating design patterns
For real-world use, consider adding:
- Proper literal typing
- Better parsing of results
- Streaming results and pagination
- Transaction support
Known Issues and Risks
Pydantic Internal API Dependency
SPARQLMojo uses Pydantic's internal ModelMetaclass to enable the intuitive field-level filtering syntax:
# This clean syntax is powered by the custom metaclass
query.filter(Person.name == "Alice")
query.filter(Product.price > 100)
The Risk: The metaclass is imported from Pydantic's private internal API:
from pydantic._internal._model_construction import ModelMetaclass as PydanticModelMetaclass
The _internal prefix indicates this is not part of Pydantic's public API and could change without notice in any Pydantic release. According to the Pydantic maintainers, they "want to be able to refactor the ModelMetaclass without it being considered a breaking change."
What This Means:
- ⚠️ No stability guarantees: The metaclass implementation may change in minor/patch releases
- ⚠️ No deprecation warnings: Changes won't be announced in advance
- ⚠️ Potential breakage: Any Pydantic update could require code changes
Mitigation Strategy:
- Pin Pydantic version carefully in production environments
- Test thoroughly after any Pydantic updates before upgrading
- Fallback available: If the metaclass breaks, fall back to the less elegant method-based approach:
# Alternative syntax that doesn't depend on private APIs
query.filter(Person._get_field_filter("name") == "Alice")
Why We Use It Anyway: The UX benefit of the SQLAlchemy-like syntax is significant for a prototype focused on design clarity. For production use, consider the risk-reward tradeoff for your specific needs.
References:
- Pydantic Issue #6381: ModelMetaclass Import Location
- Pydantic Discussion #7185: ModelField and ModelMetaclass in v2
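Applications worried about this risk can guard the private import themselves. A defensive sketch (the feature-flag name is illustrative, not part of SPARQLMojo):

```python
# Probe for the private Pydantic metaclass at import time and degrade
# gracefully if a future Pydantic release moves or removes it.
try:
    from pydantic._internal._model_construction import (
        ModelMetaclass as PydanticModelMetaclass,
    )
    HAS_FIELD_EXPRESSIONS = True
except ImportError:
    # Fallback: plain metaclass; callers must use method-based filtering
    PydanticModelMetaclass = type
    HAS_FIELD_EXPRESSIONS = False
```

Code can then branch on the flag and fall back to the method-based filter syntax when the expression syntax is unavailable.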
VALUES Clause Support
SPARQLMojo supports the SPARQL VALUES clause for efficient query constraints with explicit value sets.
ORM-Style API (Recommended)
The ORM-style API provides type-safe, model-aware value binding:
from typing import Annotated
from sparqlmojo import IRIField, LangString, LiteralField, Model, RDF_TYPE, Session, SubjectField
class Person(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Person")]
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("http://schema.org/name")] = None
age: Annotated[int | None, LiteralField("http://schema.org/age")] = None
class Label(Model):
# No rdf_type - property relationship, not a typed entity
entity_iri: Annotated[str, SubjectField()]
text: Annotated[str | None, LangString("http://www.w3.org/2000/01/rdf-schema#label")] = None
# ORM-style: type-safe field reference
query = session.query(Person).values(Person.name, ['Alice', 'Bob', 'Charlie'])
# Generates: VALUES (?name) { ("Alice") ("Bob") ("Charlie") }
# SubjectField automatically maps to ?s variable
query = session.query(Label).values(Label.entity_iri, [
'http://www.wikidata.org/entity/Q682',
'http://www.wikidata.org/entity/Q123'
])
# Generates: VALUES (?s) { (<http://www.wikidata.org/entity/Q682>) (<http://www.wikidata.org/entity/Q123>) }
Dict-Style API
For multiple variables or advanced use cases, use the dict-style API:
# Single variable VALUES clause
query = session.query(Person).values({
'name': ['Alice', 'Bob', 'Charlie']
})
# Generates: VALUES (?name) { ("Alice") ("Bob") ("Charlie") }
# Multiple variables VALUES clause
query = session.query(Person).values({
'name': ['Alice', 'Bob'],
'age': [30, 25]
})
# Generates: VALUES (?name ?age) { ("Alice" 30) ("Bob" 25) }
# Combined with other query methods
query = (
session.query(Person)
.values({'name': ['Alice', 'Bob', 'Charlie']})
.filter(Condition("age", ">", 25))
.limit(10)
)
# Generates: VALUES (?name) { ("Alice") ("Bob") ("Charlie") }
# FILTER(?age > 25)
# LIMIT 10
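The VALUES generation shown in the comments above can be sketched as a small formatter. This is illustrative, not SPARQLMojo's compiler; the IRI-detection heuristic in particular is a deliberate simplification:

```python
def format_values_clause(bindings: dict[str, list]) -> str:
    """Render a dict of variable -> values as a SPARQL VALUES clause."""
    vars_ = list(bindings)
    rows = zip(*bindings.values())  # one tuple per VALUES row

    def fmt(v):
        if isinstance(v, str) and v.startswith("http"):
            return f"<{v}>"  # naive IRI heuristic, for illustration only
        if isinstance(v, str):
            # minimal escaping of backslashes and quotes
            escaped = v.replace("\\", "\\\\").replace('"', '\\"')
            return f'"{escaped}"'
        return str(v)  # numbers render as plain literals

    header = " ".join(f"?{v}" for v in vars_)
    body = " ".join("(" + " ".join(fmt(v) for v in row) + ")" for row in rows)
    return f"VALUES ({header}) {{ {body} }}"
```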
Key Features
- ORM-Style API: Type-safe field references with query.values(Model.field, [values])
- SubjectField Support: Automatic mapping to the ?s variable for subject-based queries
- Single and Multiple Variables: Support for both single and multiple variable bindings
- Method Chaining: Works seamlessly with existing filter(), limit(), offset() methods
- SPARQL Injection Protection: Built-in security with automatic value escaping
- Comprehensive Validation: Validates variable names, list lengths, and data types
- Performance Optimization: Reduces need for multiple queries or complex filters
Benefits
- Efficient Query Constraints: VALUES clause allows inline value sets for better performance
- Cleaner Code: More readable than multiple OR conditions
- Type Safety: Proper formatting of different data types (strings, numbers, IRIs)
- Security: Automatic protection against SPARQL injection attacks
Property Paths
SPARQLMojo supports SPARQL property paths for advanced relationship traversal with an ORM-like API:
Convenience Methods (Recommended)
For common use cases, use convenience methods that automatically infer predicates from your model:
from typing import Annotated
from sparqlmojo import IRIField, LiteralField, Model, ObjectPropertyField, RDF_TYPE, Session, SubjectField
class Person(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="schema:Person")]
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("schema:name")] = None
knows: Annotated[str | None, ObjectPropertyField("schema:knows", range_="Person")] = None
manager: Annotated[str | None, ObjectPropertyField("schema:manager", range_="Person")] = None
parent: Annotated[str | None, ObjectPropertyField("schema:parent", range_="Person")] = None
# Transitive relationships (one-or-more: +)
# Find all people someone knows, directly or indirectly
query = session.query(Person).transitive('knows')
# Zero-or-more (*)
# Find all managers in the reporting chain
query = session.query(Person).zero_or_more('manager')
# Zero-or-one (?)
# Find people who may or may not have a parent
query = session.query(Person).zero_or_one('parent')
# Alternative paths (|)
# Find people who have either a parent or guardian
query = session.query(Person).alternative('parent', 'guardian')
# Inverse paths (^)
# Find children (inverse of parent relationship)
query = session.query(Person).inverse('child')
Method Chaining
Property path methods work seamlessly with other query methods:
# Find Alice's friends of friends
query = (
session.query(Person)
.transitive('knows')
.filter_by(name='Alice')
.limit(10)
)
# Find managers with ordering
query = (
session.query(Person)
.zero_or_more('manager')
.order_by('name')
)
Advanced: Complex Property Paths
For complex expressions that don't map to a single field, use PropertyPath directly:
from sparqlmojo import PropertyPath
# Sequence paths (A then B)
query = session.query(Person).path(
'colleague_email',
PropertyPath('schema:worksFor/^schema:worksFor/schema:email')
)
# Grouped operators
query = session.query(Person).path(
'contact',
PropertyPath('(schema:knows|schema:friend)/schema:email')
)
Inverse Property Paths in Model Fields
You can define fields that use inverse property paths directly in your model using IRIField with PropertyPath. This is useful for Wikidata-style patterns where you need to find resources through inverse relationships:
from typing import Annotated
from sparqlmojo import IRIField, Model, PropertyPath, SubjectField
class Child(Model):
iri: Annotated[str, SubjectField()]
# Find parent by traversing parent->child in reverse
parent: Annotated[str | None, IRIField(
PropertyPath("^<http://schema.org/children>")
)] = None
class WikidataStatement(Model):
iri: Annotated[str, SubjectField()]
# Find the property that defines this claim predicate
property_iri: Annotated[str | None, IRIField(
PropertyPath("^<http://wikiba.se/ontology#claim>")
)] = None
# Query generates: ?s ^<http://schema.org/children> ?parent .
# Which is equivalent to: ?parent <http://schema.org/children> ?s .
How it works:
- Normal pattern: `?subject <predicate> ?object` finds the objects of subjects
- Inverse pattern: `?subject ^<predicate> ?object` finds subjects where the object points to them via the predicate
Benefits
- Type-Safe: Validates that fields exist in your model
- No Field/Predicate Mismatch: Impossible to use wrong predicate for a field
- Clean API: ORM-like syntax for 90% of use cases
- Flexible: PropertyPath fallback for complex expressions
- Security: Built-in SPARQL injection prevention
Ontology-Aware Models with SchemaRegistry
SPARQLMojo provides ontology-aware modeling through SchemaRegistry. When a global registry is set, all models automatically receive:
- Inverse Discovery: `InverseField` uses named predicates from `owl:inverseOf`
- Schema Validation: Fields are validated against ontology constraints (domain, range, cardinality)
This follows the "Convention over Configuration" pattern: schema features are enabled by default when a registry exists. Models can opt out via `Meta.schema_aware = False`.
Quick Start
from typing import Annotated
from sparqlmojo import (
InverseField, IRIField, LiteralField, Model, RDF_TYPE,
SchemaRegistry, SubjectField
)
# Load ontology and activate - enables all schema features
registry = SchemaRegistry()
registry.load_from_file("schema.ttl", format="turtle", activate=True)
# Define models - schema features work automatically
class Child(Model):
iri: Annotated[str, SubjectField()]
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Person")]
name: Annotated[str | None, LiteralField("http://schema.org/name")] = None
# InverseField automatically discovers owl:inverseOf
parent: Annotated[str | None, InverseField("http://schema.org/children")] = None
# If ontology defines: schema:children owl:inverseOf schema:parent
# Generates: ?s <http://schema.org/parent> ?parent
# Otherwise: ?s ^<http://schema.org/children> ?parent
SchemaRegistry
The SchemaRegistry is a thread-safe cache for ontology metadata that can load property information from:
- RDF files (Turtle, RDF/XML, N3, etc.)
- SPARQL endpoints
- Manual registration
from sparqlmojo import SchemaRegistry, Session, PropertyInfo
# Load and activate in one step (recommended)
registry = SchemaRegistry()
registry.load_from_file("schema.ttl", format="turtle", activate=True)
# Or load then activate separately
registry = SchemaRegistry()
registry.load_from_file("schema.ttl", format="turtle")
registry.activate() # Enable schema features globally
# Check status or deactivate
if registry.is_active:
registry.deactivate()
# Create with SPARQL endpoint for lazy loading
registry = SchemaRegistry(endpoint="http://example.org/sparql", cache_ttl=3600)
# Manual registration of property metadata
from types import MappingProxyType
prop = PropertyInfo(
predicate_iri="http://schema.org/children",
inverse_of="http://schema.org/parent",
domain=frozenset({"http://schema.org/Person"}),
range_=frozenset({"http://schema.org/Person"}),
label=MappingProxyType({"en": "children", "de": "Kinder"}),
comment=MappingProxyType({"en": "Children of a person"})
)
registry.register_property(prop)
# Use with Session
session = Session(schema_registry=registry)
PropertyInfo Metadata
The PropertyInfo dataclass stores comprehensive ontology information as immutable types for thread-safe caching:
from types import MappingProxyType
from sparqlmojo import PropertyInfo
# Property information extracted from ontologies
property_info = PropertyInfo(
predicate_iri="http://schema.org/children",
# Inverse relationships (from owl:inverseOf)
inverse_of="http://schema.org/parent",
# Domain and range constraints (use frozenset for immutability)
domain=frozenset({"http://schema.org/Person"}),
range_=frozenset({"http://schema.org/Person"}),
# OWL characteristics
is_functional=False,
is_inverse_functional=False,
is_transitive=False,
is_symmetric=False,
# OWL cardinality constraints
max_cardinality=None, # owl:maxCardinality
min_cardinality=None, # owl:minCardinality
exact_cardinality=None, # owl:cardinality
# Property hierarchy (use frozenset)
subproperty_of=frozenset({"http://schema.org/relative"}),
# Multilingual labels and descriptions (use MappingProxyType for immutable dicts)
label=MappingProxyType({"en": "children", "de": "Kinder"}),
comment=MappingProxyType({"en": "Children of a person"})
)
# Check if property is single-valued (functional or max cardinality <= 1)
if property_info.is_single_valued:
print("Property allows at most one value")
Note: PropertyInfo uses immutable collection types (frozenset, MappingProxyType) to ensure thread-safe caching and prevent accidental modification of shared ontology metadata.
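The `is_single_valued` rule quoted above (functional, or max cardinality at most 1) reduces to a one-line check. A pure-Python restatement of the documented rule, not the library's actual implementation:

```python
def is_single_valued(is_functional: bool, max_cardinality) -> bool:
    # Documented rule: functional properties, or owl:maxCardinality <= 1,
    # allow at most one value per subject.
    return is_functional or (max_cardinality is not None and max_cardinality <= 1)

print(is_single_valued(False, 1))   # True  (maxCardinality 1)
print(is_single_valued(True, None)) # True  (functional)
print(is_single_valued(False, None))  # False (unconstrained)
```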
OwlType Enum
The OwlType StrEnum provides type-safe constants for OWL vocabulary terms used in property metadata and restrictions:
from sparqlmojo import OwlType
# Property types
OwlType.OBJECT_PROPERTY # http://www.w3.org/2002/07/owl#ObjectProperty
OwlType.DATATYPE_PROPERTY # http://www.w3.org/2002/07/owl#DatatypeProperty
# Property characteristics
OwlType.FUNCTIONAL_PROPERTY # owl:FunctionalProperty
OwlType.INVERSE_FUNCTIONAL_PROPERTY # owl:InverseFunctionalProperty
OwlType.TRANSITIVE_PROPERTY # owl:TransitiveProperty
OwlType.SYMMETRIC_PROPERTY # owl:SymmetricProperty
# Restriction predicates (for cardinality constraints)
OwlType.ON_PROPERTY # owl:onProperty
OwlType.CARDINALITY # owl:cardinality (exact)
OwlType.MAX_CARDINALITY # owl:maxCardinality
OwlType.MIN_CARDINALITY # owl:minCardinality
# Other
OwlType.INVERSE_OF # owl:inverseOf
Since OwlType is a StrEnum, values can be used directly in string comparisons or converted to URIRef for rdflib operations:
from rdflib import URIRef
from sparqlmojo import OwlType
# String comparison (StrEnum inherits from str)
if property_type == OwlType.FUNCTIONAL_PROPERTY:
print("This is a functional property")
# Use with rdflib
predicate = URIRef(OwlType.MAX_CARDINALITY)
InverseField with Auto-Discovery
InverseField automatically discovers inverse relationships from your ontology using owl:inverseOf. Auto-discovery is enabled by default when a global registry is set.
from typing import Annotated
from sparqlmojo import InverseField, LiteralField, Model, SubjectField
class Child(Model):
"""Model for finding parents through inverse relationship."""
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("http://schema.org/name")] = None
# Automatically discovers parent as inverse of children
parent: Annotated[str | None, InverseField("http://schema.org/children")] = None
class Author(Model):
"""Model for finding authored works."""
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("http://schema.org/name")] = None
# Automatically discovers authorOf as inverse of author
books: Annotated[str | None, InverseField("http://schema.org/author")] = None
How Auto-Discovery Works
When a global SchemaRegistry is set, InverseField queries it for owl:inverseOf metadata:
1. Without ontology metadata: uses the SPARQL inverse operator (`^`)

   # Generates: ?s ^<http://schema.org/children> ?parent
   # Equivalent to: ?parent <http://schema.org/children> ?s

2. With ontology metadata: uses the named inverse property

   # If ontology defines: schema:children owl:inverseOf schema:parent
   # Generates: ?s <http://schema.org/parent> ?parent

3. Discovery happens automatically when the registry is activated:

   from sparqlmojo import SchemaRegistry

   # Load ontology and activate in one step
   registry = SchemaRegistry()
   registry.load_from_file("schema.ttl", format="turtle", activate=True)

   # InverseField now uses schema:parent from ontology
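The fallback logic described above can be sketched in a few lines of plain Python. Here `inverse_of` is a hypothetical dict standing in for the SchemaRegistry's `owl:inverseOf` lookup; this is an illustration of the documented behavior, not library code:

```python
def inverse_pattern(predicate: str, inverse_of: dict) -> str:
    """Choose the triple pattern for an inverse relationship (sketch)."""
    named = inverse_of.get(predicate)
    if named is not None:
        # The ontology defines a named inverse: use it directly
        return f"?s <{named}> ?o ."
    # No metadata: fall back to the SPARQL inverse operator
    return f"?s ^<{predicate}> ?o ."

ontology = {"http://schema.org/children": "http://schema.org/parent"}
print(inverse_pattern("http://schema.org/children", ontology))
# ?s <http://schema.org/parent> ?o .
print(inverse_pattern("http://example.org/employs", ontology))
# ?s ^<http://example.org/employs> ?o .
```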
Example Ontology File
Here's a sample Turtle ontology defining inverse relationships:
@prefix schema: <http://schema.org/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
schema:children a owl:ObjectProperty ;
rdfs:label "children"@en, "Kinder"@de ;
rdfs:comment "Children of a person"@en ;
rdfs:domain schema:Person ;
rdfs:range schema:Person ;
owl:inverseOf schema:parent .
schema:parent a owl:ObjectProperty ;
rdfs:label "parent"@en, "Elternteil"@de ;
rdfs:comment "Parent of a person"@en ;
rdfs:domain schema:Person ;
rdfs:range schema:Person ;
owl:inverseOf schema:children .
schema:author a owl:ObjectProperty ;
rdfs:domain schema:CreativeWork ;
rdfs:range schema:Person ;
owl:inverseOf <http://example.org/authorOf> .
Comparison: Regular IRIField vs InverseField
from typing import Annotated
from sparqlmojo import IRIField, InverseField, LiteralField, Model, SubjectField
# Forward relationship: Find children of a person
class Person(Model):
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("http://schema.org/name")] = None
children: Annotated[str | None, IRIField("http://schema.org/children")] = None
# SPARQL: ?s <http://schema.org/children> ?children
# Inverse relationship: Find parent of a child
class Child(Model):
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("http://schema.org/name")] = None
parent: Annotated[str | None, InverseField("http://schema.org/children")] = None
# With ontology: ?s <http://schema.org/parent> ?parent
# Without ontology: ?s ^<http://schema.org/children> ?parent
# Both approaches are equivalent but InverseField:
# 1. Uses cleaner property names from ontology
# 2. Follows semantic web best practices
# 3. Automatically adapts to ontology changes
Use Cases
1. Family Relationships
# Find parents through children inverse
children_to_parents = session.query(Child).all()
2. Authorship
# Find all books written by an author
author_books = session.query(Author).filter_by(name="J.K. Rowling").first()
3. Employment
class Employee(Model):
iri: Annotated[str, SubjectField()]
employer: Annotated[str | None, InverseField("http://example.org/employs")] = None
# Find employer through inverse of "employs" relationship
4. Wikidata-Style Patterns
# Wikidata often requires inverse navigation
class WikidataEntity(Model):
iri: Annotated[str, SubjectField()]
# Find items that have this entity as their "instance of" value
instances: Annotated[
str | None, InverseField("http://www.wikidata.org/prop/direct/P31")
] = None
Schema Validation
When a global registry is set, models are automatically validated against ontology constraints at class definition time. This catches configuration errors early.
Validation checks:
- Domain constraints: Property domain matches the model's `rdf_type`
- Range constraints: Python type matches the XSD range
- Cardinality: Functional properties don't use collection fields
- Property existence: Predicate is defined in the ontology
from typing import Annotated
from sparqlmojo import IRIField, LiteralField, Model, RDF_TYPE, SchemaRegistry, SubjectField
from sparqlmojo.exc import DomainConstraintError
registry = SchemaRegistry()
registry.load_from_file("schema.ttl", activate=True)
# This will raise DomainConstraintError at class definition time
# because schema:author has domain schema:Book, not schema:Person
class Person(Model):
iri: Annotated[str, SubjectField()]
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Person")]
author: Annotated[str | None, LiteralField("http://schema.org/author")] = None
# DomainConstraintError: predicate expects domain [schema:Book],
# but model has rdf_type 'schema:Person'
Validation policies can be configured per-model:
class Person(Model):
class Meta:
# Configure validation behavior
unknown_property_policy = "warn" # warn, error, or ignore
domain_mismatch_policy = "error" # default: error
range_mismatch_policy = "error" # default: error
cardinality_mismatch_policy = "warn" # default: warn
iri: Annotated[str, SubjectField()]
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Person")]
Multiple rdf:types: Models can have multiple rdf:type fields. Domain validation
passes if any of the types matches the property domain. This follows standard RDF
semantics where rdfs:domain is an inference rule, not a constraint - a resource
with multiple types can use properties from any of its types.
See RDF Schema 1.1 - rdfs:domain:
"rdfs:domain is an instance of rdf:Property that is used to state that any resource that has a given property is an instance of one or more classes."
class PersonAndOrganization(Model):
iri: Annotated[str, SubjectField()]
# Multiple rdf:type fields - resource is both a Person and Organization
type1: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Person")]
type2: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Organization")]
# 'age' has domain Person - valid because one of the types matches
age: Annotated[int | None, LiteralField("http://schema.org/age")] = None
Opting Out of Schema Features
To disable all schema features for a specific model, use schema_aware = False:
class LegacyModel(Model):
class Meta:
schema_aware = False # Disables validation AND inverse discovery
iri: Annotated[str, SubjectField()]
# No validation errors even if predicates don't match ontology
To disable only inverse discovery for a specific field:
class Person(Model):
iri: Annotated[str, SubjectField()]
# Always use ^ operator, even if ontology has owl:inverseOf
parent: Annotated[
str | None, InverseField("http://schema.org/children", auto_discover=False)
] = None
Benefits
- Convention over Configuration: Schema features work automatically when registry is set
- Ontology-Aware: Leverages existing OWL/RDFS metadata for automatic configuration
- Early Error Detection: Validation catches misconfigured models at definition time
- Cleaner Models: Use semantic property names instead of inverse operators
- Flexible Fallback: Automatically falls back to the `^` operator when no inverse is defined
- Thread-Safe Caching: Registry caches ontology metadata with a configurable TTL
- Multiple Sources: Load from files, endpoints, or manual registration
- Multilingual Support: PropertyInfo includes labels and comments in multiple languages
- Standards Compliance: Follows OWL 2 and RDFS specifications
Field-Level Filtering
SPARQLMojo provides intuitive field-level filtering similar to SQLAlchemy, with automatic datatype casting for numeric comparisons.
Key Features
- Intuitive Syntax: Use Python comparison operators directly on model fields
- Automatic Datatype Casting: Numeric comparisons automatically cast to `xsd:decimal`/`xsd:integer`
- String Operations: `contains()`, `startswith()`, and `endswith()` methods (polymorphic: collection fields use a membership check)
- Membership Testing: `in_()` and `not_in()` operators
- Logical Operators: `and_()`, `or_()`, and `not_()` for complex conditions
- IRI Field Support: Proper handling of IRI fields with angle-bracket syntax
Basic Usage
from typing import Annotated
from sparqlmojo import IRIField, LiteralField, Model, RDF_TYPE, Session, SubjectField
from sparqlmojo.orm.filtering import FieldFilter, and_, or_
class Person(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Person")]
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("http://schema.org/name")] = None
age: Annotated[int | None, LiteralField("http://schema.org/age")] = None
email: Annotated[str | None, LiteralField("http://schema.org/email")] = None
entity_id: Annotated[str | None, IRIField("http://schema.org/identifier")] = None
session = Session()
# Basic equality filtering
query = session.query(Person).filter(Person.name == "Alice")
# Generates: FILTER(?name = "Alice")
# Numeric comparisons with automatic casting
query = session.query(Person).filter(Person.age > 18)
# Generates: FILTER(xsd:integer(?age) > 18)
# String operations
query = session.query(Person).filter(Person.email.contains("@example.com"))
# Generates: FILTER(CONTAINS(?email, "@example.com"))
# Logical operators
from sparqlmojo.orm.filtering import and_, or_
query = session.query(Person).filter(
and_(
Person.name == "Alice",
Person.age >= 18
)
)
# Generates: FILTER(?name = "Alice" && xsd:integer(?age) >= 18)
# IN operator
query = session.query(Person).filter(
Person.name.in_(["Alice", "Bob", "Charlie"])
)
# Generates: FILTER(?name IN ("Alice", "Bob", "Charlie"))
# IRI field filtering
query = session.query(Person).filter(
Person.entity_id == "http://example.org/Q682"
)
# Generates: FILTER(?entity_id = <http://example.org/Q682>)
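The automatic datatype casting shown in the comments above follows a simple rule: numeric Python values get wrapped in an XSD cast so comparisons work even when the endpoint stores numbers as plain literals. A pure-Python sketch of that rule (an illustration, not SPARQLMojo's compiler, which also escapes strings and handles IRIs):

```python
def compile_filter(var: str, op: str, value) -> str:
    """Sketch of FILTER generation with automatic numeric casting."""
    if isinstance(value, bool):
        raise TypeError("booleans are not covered by this sketch")
    if isinstance(value, int):
        # Cast so "42" and 42 compare equal on the endpoint
        return f"FILTER(xsd:integer(?{var}) {op} {value})"
    if isinstance(value, float):
        return f"FILTER(xsd:decimal(?{var}) {op} {value})"
    # Strings are quoted (escaping omitted here)
    return f'FILTER(?{var} {op} "{value}")'

print(compile_filter("age", ">", 18))       # FILTER(xsd:integer(?age) > 18)
print(compile_filter("name", "=", "Alice")) # FILTER(?name = "Alice")
```

Note the explicit `bool` guard: in Python `isinstance(True, int)` is true, so booleans must be handled before the integer branch.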
String Filtering on IRI Fields
For IRI fields, you often need to filter by the string content of the IRI rather than exact matching. SPARQLMojo provides chainable string function methods:
from typing import Annotated
from sparqlmojo import IRIField, LiteralField, Model, RDF_TYPE, Session, SubjectField
class Document(Model):
rdf_type: Annotated[str, IRIField(RDF_TYPE, default="http://schema.org/Document")]
iri: Annotated[str, SubjectField()]
name: Annotated[str | None, LiteralField("http://schema.org/name")] = None
format_type: Annotated[str | None, IRIField("http://example.org/formatType")] = None
session = Session()
# Filter IRI field by string content
query = session.query(Document).filter(
Document.format_type.str().contains("pdf")
)
# Generates: FILTER(CONTAINS(STR(?format_type), "pdf"))
# Case-insensitive filtering with lower()
query = session.query(Document).filter(
Document.format_type.str().lower().contains("pdf")
)
# Generates: FILTER(CONTAINS(LCASE(STR(?format_type)), "pdf"))
# Case-insensitive filtering with upper()
query = session.query(Document).filter(
Document.format_type.str().upper().contains("PDF")
)
# Generates: FILTER(CONTAINS(UCASE(STR(?format_type)), "PDF"))
# String prefix/suffix matching
query = session.query(Document).filter(
Document.format_type.str().startswith("http://")
)
# Generates: FILTER(STRSTARTS(STR(?format_type), "http://"))
query = session.query(Document).filter(
Document.format_type.str().lower().endswith("/pdf")
)
# Generates: FILTER(STRENDS(LCASE(STR(?format_type)), "/pdf"))
Available Methods:
| Method | Description | SPARQL Function |
|---|---|---|
| `str()` | Convert IRI to string | `STR()` |
| `lower()` | Convert to lowercase | `LCASE()` |
| `upper()` | Convert to uppercase | `UCASE()` |
| `contains(s)` | Check if string contains substring | `CONTAINS()` |
| `startswith(s)` | Check if string starts with prefix | `STRSTARTS()` |
| `endswith(s)` | Check if string ends with suffix | `STRENDS()` |
Note: The `str()` method is required before `lower()` or `upper()` when filtering IRI fields, as IRIs must first be converted to strings before string functions can be applied.
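The chaining above is just innermost-first composition of SPARQL string functions around a variable. A minimal sketch of that composition (hypothetical helper, not the library's compiler; `STR`, `LCASE`, and `UCASE` correspond to `str()`, `lower()`, and `upper()`):

```python
def wrap(var: str, *fns: str) -> str:
    """Compose SPARQL string functions innermost-first around a variable."""
    expr = f"?{var}"
    for fn in fns:
        expr = f"{fn}({expr})"
    return expr

# Document.format_type.str().lower().contains("pdf") becomes:
print(f'FILTER(CONTAINS({wrap("format_type", "STR", "LCASE")}, "pdf"))')
# FILTER(CONTAINS(LCASE(STR(?format_type)), "pdf"))
```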
Benefits
- Type Safety: Field references are validated against the model definition
- RDF Compatibility: Automatic datatype casting handles the common issue of numeric values stored as strings
- Intuitive API: Familiar syntax for developers coming from SQLAlchemy or Django ORM
- Backward Compatibility: The existing `Condition` class continues to work alongside the new filtering
- Performance: Efficient SPARQL generation with minimal overhead
Release Process
SPARQLMojo uses a tag-based release workflow with automated CHANGELOG management and Codeberg Releases.
Workflow Overview
- During Development: Update `CHANGELOG.md` in the `[Unreleased]` section when creating merge requests
- Accumulate Changes: Multiple MRs can add to `[Unreleased]` before a release
- Create Release: Tag the commit to trigger automated release creation
For Contributors (Merge Request Time)
When creating a merge request, update CHANGELOG.md under the [Unreleased] section:
## [Unreleased]
### Fixed
- Issue #123: Fixed bug in query compilation
### Added
- New feature for advanced filtering
### Changed
- Improved performance of batch operations
Follow the Keep a Changelog format with these sections:
- `Fixed` - Bug fixes
- `Added` - New features
- `Changed` - Changes to existing functionality
- `Deprecated` - Soon-to-be-removed features
- `Removed` - Removed features
- `Security` - Security fixes
For Maintainers (Release Time)
When ready to release a new version:
# 1. Preview release notes and create tag
./scripts/tag-release.sh v0.12.0
# 2. Push the tag to trigger CI/CD automation
git push origin v0.12.0
The CI/CD workflow (.gitea/workflows/release.yml) automatically:
- Extracts release notes from the `[Unreleased]` section
- Updates `CHANGELOG.md` (`[Unreleased]` → `[0.12.0] - 2026-03-05`)
- Adds a new empty `[Unreleased]` section at the top
- Commits and pushes the CHANGELOG update to main
- Creates a Codeberg release with the extracted notes
Manual Alternative (if CI/CD unavailable):
# 1. Create and push tag
git tag v0.12.0 && git push origin v0.12.0
# 2. Run publish script manually
./scripts/publish-release.sh v0.12.0
# 3. Push CHANGELOG update
git push origin main
Release Scripts
- `tag-release.sh` - Create an annotated tag with a release-notes preview
- `publish-release.sh` - Update the CHANGELOG and publish to Codeberg
- `create-release.sh` - Legacy all-in-one script (use `tag-release.sh` instead)
See scripts/README.md for detailed documentation.
Version Format
Use semantic versioning: vMAJOR.MINOR.PATCH
- MAJOR: Breaking changes
- MINOR: New features (backward compatible)
- PATCH: Bug fixes (backward compatible)
Examples: v0.11.0, v1.0.0, v1.2.3
Dependencies
- `pydantic>=2.12.4` - Data validation and type checking
- `SPARQLWrapper>=2.0.0` - SPARQL endpoint communication
- `rdflib>=6.0.0` - RDF graph parsing and manipulation
Key Benefits of Pydantic Integration
- Type Safety: Fields are validated at runtime against their type annotations
- Better IDE Support: Full autocomplete and type hints in modern IDEs
- Clear Error Messages: Pydantic provides detailed validation errors
- Automatic Coercion: Compatible types are automatically converted (e.g., `"123"` → `123` for int fields)
- Extra Field Protection: Unknown fields are rejected by default
License
This project is licensed under the MIT License - see the LICENSE file for details.