SPARQLx ✨🦋
Python library for httpx-based SPARQL Query and Update Operations according to the SPARQL 1.2 Protocol (see Protocol Implementation).
WARNING: This project is in an early stage of development and should be used with caution.
Features
- Async Interface: asyncio support with aquery() and AsyncContextManager API.
- Query Response Streaming: Streaming iterators for large result sets available with query_stream() and aquery_stream()
- Synchronous Concurrency Wrapper: Support for concurrent execution of multiple queries from synchronous code with queries()
- RDFLib Integration: Direct conversion to RDFLib SPARQL result representations
- Context Managers: Synchronous and asynchronous context managers for lexical resource management
- Client Sharing: Support for sharing and re-using httpx clients for HTTP connection pooling
Installation
sparqlx is a PEP 621-compliant package and available on PyPI.
pip install sparqlx
Usage
SPARQLWrapper.query
To run a query against an endpoint, instantiate a SPARQLWrapper object and call its query method:
import httpx
from sparqlx import SPARQLWrapper
sparql_wrapper = SPARQLWrapper(
    sparql_endpoint="https://query.wikidata.org/bigdata/namespace/wdq/sparql"
)
result: httpx.Response = sparql_wrapper.query("select * where {?s ?p ?o} limit 10")
The default response formats are JSON for SELECT and ASK queries and Turtle for CONSTRUCT and DESCRIBE queries.
SPARQLWrapper.query features a response_format parameter that takes
- "json", "xml", "csv", "tsv" for SELECT and ASK queries
- "turtle", "xml", "ntriples", "json-ld" for CONSTRUCT and DESCRIBE queries
- any other string; the supplied value will be passed as MIME Type to the Accept header.
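The way a response_format shorthand turns into an Accept header value can be pictured with a small sketch. The two dictionaries and the resolve_accept helper are hypothetical names used for illustration only and are not part of the sparqlx API. The MIME types are the standard registered ones, but the mapping the library actually uses may differ, in particular for "xml" and "json-ld".

```python
# Illustrative only: hypothetical helper, not sparqlx internals.
SELECT_ASK_FORMATS = {
    "json": "application/sparql-results+json",
    "xml": "application/sparql-results+xml",
    "csv": "text/csv",
    "tsv": "text/tab-separated-values",
}
CONSTRUCT_DESCRIBE_FORMATS = {
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
    "ntriples": "application/n-triples",
    "json-ld": "application/ld+json",
}


def resolve_accept(response_format: str, graph_result: bool) -> str:
    """Return an Accept header value for a response format shorthand.

    graph_result is True for CONSTRUCT/DESCRIBE and False for SELECT/ASK.
    Unknown strings are passed through unchanged as a MIME type.
    """
    table = CONSTRUCT_DESCRIBE_FORMATS if graph_result else SELECT_ASK_FORMATS
    return table.get(response_format, response_format)


print(resolve_accept("json", graph_result=False))
print(resolve_accept("xml", graph_result=True))
print(resolve_accept("application/trig", graph_result=True))
```

Note that "xml" resolves differently depending on the query type, which is why the sketch keeps two separate tables.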
If the convert parameter is set to True, SPARQLWrapper.query returns
- a list of Python dictionaries with dict-values cast to RDFLib objects for SELECT queries
- a Python bool for ASK queries
- an rdflib.Graph instance for CONSTRUCT and DESCRIBE queries.
Note that only JSON is supported as a response format for convert=True on SELECT and ASK query results.
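As a rough illustration of what converting SELECT results involves, the sketch below flattens a SPARQL JSON results document into a list of dictionaries using only the standard library. The function name bindings_to_dicts is hypothetical, and the values stay plain strings here; sparqlx casts them to RDFLib objects, which this sketch does not attempt.

```python
import json

# A SPARQL JSON results document as an endpoint might return it.
raw = """
{
  "head": {"vars": ["g", "s"]},
  "results": {"bindings": [
    {"g": {"type": "uri", "value": "urn:ng1"}, "s": {"type": "uri", "value": "urn:s"}},
    {"s": {"type": "uri", "value": "urn:s"}}
  ]}
}
"""


def bindings_to_dicts(document: dict) -> list:
    """Map every projected variable to its value, or None if unbound."""
    variables = document["head"]["vars"]
    return [
        {var: (binding[var]["value"] if var in binding else None) for var in variables}
        for binding in document["results"]["bindings"]
    ]


rows = bindings_to_dicts(json.loads(raw))
print(rows)
```

Unbound variables are absent from a JSON binding, so the sketch fills them with None to keep one key per projected variable in every row.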
Client Sharing and Configuration
By default, SPARQLWrapper creates and manages httpx.Client instances internally.
An httpx.Client can also be supplied by user code; this provides a configuration interface and allows for HTTP connection pooling.
Note that if an httpx.Client is supplied to SPARQLWrapper, user code is responsible for managing (closing) the client.
import httpx
from sparqlx import SPARQLWrapper
client = httpx.Client(timeout=10.0)
sparql_wrapper = SPARQLWrapper(
    sparql_endpoint="https://query.wikidata.org/bigdata/namespace/wdq/sparql",
    client=client,
)
result: httpx.Response = sparql_wrapper.query("select * where {?s ?p ?o} limit 10")
print(client.is_closed) # False
client.close()
print(client.is_closed) # True
It is also possible to configure SPARQLWrapper-managed clients by passing a dict holding httpx.Client kwargs to the client_config parameter:
import httpx
from sparqlx import SPARQLWrapper
sparql_wrapper = SPARQLWrapper(
    sparql_endpoint="https://query.wikidata.org/bigdata/namespace/wdq/sparql",
    client_config={"timeout": 10.0},
)
result: httpx.Response = sparql_wrapper.query("select * where {?s ?p ?o} limit 10")
In that case, SPARQLWrapper will internally create and manage httpx.Client instances (the default behavior if no client is provided), but will instantiate clients based on the supplied client_config kwargs.
SPARQLWrapper.aquery
SPARQLWrapper.aquery is an asynchronous version of SPARQLWrapper.query.
import asyncio
import httpx
from sparqlx import SPARQLWrapper
sparql_wrapper = SPARQLWrapper(
    sparql_endpoint="https://query.wikidata.org/bigdata/namespace/wdq/sparql"
)
async def run_queries(*queries: str) -> list[httpx.Response]:
    # Schedule all queries at once and wait for every response.
    return await asyncio.gather(*[sparql_wrapper.aquery(query) for query in queries])
results: list[httpx.Response] = asyncio.run(
    run_queries(*["select * where {?s ?p ?o} limit 10" for _ in range(10)])
)
For client sharing or configuration of internal client instances, pass an httpx.AsyncClient instance to aclient or kwargs to aclient_config respectively (see SPARQLWrapper.query).
SPARQLWrapper.queries
SPARQLWrapper.queries is a synchronous wrapper around asynchronous code that runs multiple queries concurrently from synchronous code.
from collections.abc import Iterator
import httpx
from sparqlx import SPARQLWrapper
sparql_wrapper = SPARQLWrapper(
    sparql_endpoint="https://query.wikidata.org/bigdata/namespace/wdq/sparql"
)
results: Iterator[httpx.Response] = sparql_wrapper.queries(
    *["select * where {?s ?p ?o} limit 100" for _ in range(10)]
)
Note that since SPARQLWrapper.queries runs async code under the hood, httpx client sharing or configuration requires setting aclient or aclient_config in the respective SPARQLWrapper.
Also, SPARQLWrapper.queries creates an event loop and therefore cannot be called from asynchronous code.
If an httpx.AsyncClient is supplied, the client will be closed after the first call to SPARQLWrapper.queries.
User code that wants to run multiple calls to queries can still exert control over the client by using aclient_config. For finer control over concurrent query execution, use the async interface.
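The general pattern behind a synchronous wrapper around asynchronous code can be sketched with the standard library alone. This is not the sparqlx implementation: fake_aquery stands in for an async HTTP request, and run_queries is a hypothetical name chosen for this example.

```python
import asyncio
from collections.abc import Iterator


async def fake_aquery(query: str) -> str:
    """Stand-in for an async HTTP request; returns a fake result."""
    await asyncio.sleep(0.01)
    return f"result for: {query}"


def run_queries(*queries: str) -> Iterator[str]:
    """Run all queries concurrently and return results in input order."""

    async def _gather() -> list:
        return await asyncio.gather(*(fake_aquery(q) for q in queries))

    # asyncio.run creates a new event loop and raises RuntimeError
    # if another loop is already running in this thread.
    return iter(asyncio.run(_gather()))


results = list(run_queries("ask {?s ?p ?o}", "select * {?s ?p ?o} limit 1"))
print(results)
```

Because asyncio.run refuses to start inside a running event loop, a wrapper built this way can only be called from synchronous code, which matches the restriction described above.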
Response Streaming
HTTP Responses can be streamed using the SPARQLWrapper.query_stream and SPARQLWrapper.aquery_stream Iterators.
from collections.abc import AsyncIterator, Iterator
from sparqlx import SPARQLWrapper
sparql_wrapper = SPARQLWrapper(
    sparql_endpoint="https://query.wikidata.org/bigdata/namespace/wdq/sparql",
)
stream: Iterator[bytes] = sparql_wrapper.query_stream(
    "select * where {?s ?p ?o} limit 10000"
)
astream: AsyncIterator = sparql_wrapper.aquery_stream(
    "select * where {?s ?p ?o} limit 10000"
)
The streaming method and chunk size (for chunked responses) can be controlled with the streaming_method and chunk_size parameters respectively.
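Since a byte stream is just an iterator of chunks, it can be consumed without holding the full response in memory. The sketch below shows one way to do that, writing chunks to a file as they arrive. fake_stream is a stand-in for a streamed response and write_stream is a hypothetical helper; neither is part of sparqlx.

```python
import tempfile
from collections.abc import Iterable, Iterator
from pathlib import Path


def fake_stream(payload: bytes, chunk_size: int) -> Iterator[bytes]:
    """Stand-in for a streamed HTTP response: yields fixed-size chunks."""
    for start in range(0, len(payload), chunk_size):
        yield payload[start : start + chunk_size]


def write_stream(chunks: Iterable[bytes], path: Path) -> int:
    """Write chunks to path as they arrive and return the byte count."""
    written = 0
    with path.open("wb") as file:
        for chunk in chunks:
            written += file.write(chunk)
    return written


payload = b"s,p,o\n" + b"urn:s,urn:p,urn:o\n" * 1000
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "result.csv"
    size = write_stream(fake_stream(payload, chunk_size=4096), target)
    roundtrip = target.read_bytes()

print(size == len(payload))
```

Any Iterator[bytes] can be passed to write_stream in place of fake_stream.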
Context Managers
SPARQLWrapper also implements the context manager protocol. This can be useful in two ways:
- Managed Client: Unless an httpx client is passed, SPARQLWrapper creates and manages clients internally. In that case, the context manager uses a single client per context and enables connection pooling within the context.
- Supplied Client: If an httpx client is passed, SPARQLWrapper will use that client instance and calling code is responsible for client management. In that case, the context manager will manage the supplied client.
import httpx
from sparqlx import SPARQLWrapper
sparql_wrapper = SPARQLWrapper(
    sparql_endpoint="https://query.wikidata.org/bigdata/namespace/wdq/sparql",
)
with sparql_wrapper as context_wrapper:
    result: httpx.Response = context_wrapper.query("select * where {?s ?p ?o} limit 10")
import httpx
from sparqlx import SPARQLWrapper
client = httpx.Client()
sparql_wrapper = SPARQLWrapper(
    sparql_endpoint="https://query.wikidata.org/bigdata/namespace/wdq/sparql",
    client=client,
)
with sparql_wrapper as context_wrapper:
    result: httpx.Response = context_wrapper.query("select * where {?s ?p ?o} limit 10")
    print(client.is_closed) # False
print(client.is_closed) # True
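The client lifecycle described in this section can be modeled with a toy example that needs neither httpx nor a network connection. FakeClient and Wrapper are hypothetical stand-ins written for this sketch; they only mirror the behavior stated above and say nothing about how sparqlx is implemented.

```python
from typing import Optional


class FakeClient:
    """Stand-in for an HTTP client that only tracks whether it is closed."""

    def __init__(self) -> None:
        self.is_closed = False

    def close(self) -> None:
        self.is_closed = True


class Wrapper:
    """Hypothetical wrapper that closes its client when the context exits."""

    def __init__(self, client: Optional[FakeClient] = None) -> None:
        self._supplied = client
        self.client: Optional[FakeClient] = None

    def __enter__(self) -> "Wrapper":
        # Use the supplied client if there is one, else create one per context.
        self.client = self._supplied or FakeClient()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Managed and supplied clients are both closed on context exit.
        self.client.close()


shared = FakeClient()
with Wrapper(client=shared) as wrapper:
    inside = shared.is_closed
after = shared.is_closed
print(inside, after)
```

Every call made inside the with block would use the same client object, which is what enables connection pooling within a context.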
Update Operations
SPARQLx supports Update Operations according to the SPARQL 1.2 Protocol.
The following methods implement SPARQL Update:
- SPARQLWrapper.update
- SPARQLWrapper.aupdate
- SPARQLWrapper.updates
Given an initially empty Triplestore with SPARQL and SPARQL Update endpoints, one could e.g. insert data like so:
import httpx
from sparqlx import SPARQLWrapper
sparql_wrapper = SPARQLWrapper(
    sparql_endpoint="https://triplestore/query",
    update_endpoint="https://triplestore/update",
    aclient_config={
        "auth": httpx.BasicAuth(username="admin", password="supersecret123")
    },
)
with sparql_wrapper as wrapper:
    store_empty: bool = not wrapper.query(
        "ask where {{?s ?p ?o} union {graph ?g {?s ?p ?o}}}", convert=True
    )
    assert store_empty, "Expected store to be empty."
    wrapper.updates(
        "insert data {<urn:s> <urn:p> <urn:o>}",
        "insert data {graph <urn:ng1> {<urn:s> <urn:p> <urn:o>}}",
        "insert data {graph <urn:ng2> {<urn:s> <urn:p> <urn:o>}}",
    )
    result = wrapper.query(
        "select ?g ?s ?p ?o where { {?s ?p ?o} union { graph ?g {?s ?p ?o} }}",
        convert=True,
    )
This will run the specified update operations asynchronously with an internally managed event loop; the query then returns the following Python conversion:
[
{
"g": rdflib.term.URIRef("urn:ng2"),
"s": rdflib.term.URIRef("urn:s"),
"p": rdflib.term.URIRef("urn:p"),
"o": rdflib.term.URIRef("urn:o"),
},
{
"g": rdflib.term.URIRef("urn:ng1"),
"s": rdflib.term.URIRef("urn:s"),
"p": rdflib.term.URIRef("urn:p"),
"o": rdflib.term.URIRef("urn:o"),
},
{
"g": None,
"s": rdflib.term.URIRef("urn:s"),
"p": rdflib.term.URIRef("urn:p"),
"o": rdflib.term.URIRef("urn:o"),
},
]
SPARQL 1.2 Protocol Implementation
SPARQLx aims to provide a convenient Python interface for interacting with SPARQL endpoints according to the SPARQL 1.2 Protocol.
The SPARQL Protocol provides a specification for HTTP operations targeting SPARQL Query and Update endpoints.
"[The SPARQL 1.2 Protocol] describes a means for conveying SPARQL queries and updates to a SPARQL processing service and returning the results via HTTP to the entity that requested them." (SPARQL 1.2 Protocol - Abstract)
Generally, the SPARQL 1.2 Protocol defines the following HTTP operations for SPARQL operations:
- GET (query)
- URL-encoded POST (query and update)
- POST directly (query and update)
See 2.2 Query Operation and 2.3 Update Operation.
SPARQLx implements URL-encoded POST for both Query and Update operations.
This makes it possible to send the requested response content type in the Accept header and both the Query/Update request strings and the request parameters in the request message body.
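To make the shape of a URL-encoded POST concrete, the sketch below builds such a request with the standard library and never sends it. The endpoint URL is a placeholder assumption, and the code illustrates the protocol operation rather than how sparqlx constructs requests through httpx.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

endpoint = "https://example.org/sparql"  # placeholder endpoint (assumption)
query = "select * where {?s ?p ?o} limit 10"

# The query string travels in the form-encoded message body.
body = urlencode({"query": query}).encode("ascii")
request = Request(
    endpoint,
    data=body,
    method="POST",
    headers={
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/sparql-results+json",
    },
)

# Decoding the body again recovers the original query string.
decoded = parse_qs(body.decode("ascii"))
print(request.get_method())
print(decoded["query"][0] == query)
```

For an update operation, the protocol uses the form field update in place of query.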
SPARQL Protocol Request Parameters
The SPARQL Protocol also specifies the following request parameters:
- version (0 or 1)
- default-graph-uri (0 or more)
- named-graph-uri (0 or more)
for Query Operations, where default-graph-uri and named-graph-uri correspond to SPARQL FROM and FROM NAMED respectively, and, if present, take precedence over SPARQL clauses.
- version (0 or 1)
- using-graph-uri (0 or more)
- using-named-graph-uri (0 or more)
for Update Operations, where using-graph-uri and using-named-graph-uri correspond to SPARQL USING and USING NAMED, and likewise take precedence over SPARQL clauses.
SPARQL Protocol request parameters are reflected in the SPARQLx object API:
- Methods implementing query operations take default_graph_uri and named_graph_uri parameters.
- Methods implementing update operations take using_graph_uri and using_named_graph_uri parameters.
- Both query and update methods take a version parameter.
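Parameters marked "0 or more" may be repeated in the request body, once per value. The sketch below shows how such a body could be encoded for a query operation with the standard library. encode_query_request is a hypothetical helper for illustration, not a sparqlx function, and the version parameter is left out.

```python
from collections.abc import Sequence
from urllib.parse import parse_qs, urlencode


def encode_query_request(
    query: str,
    default_graph_uri: Sequence[str] = (),
    named_graph_uri: Sequence[str] = (),
) -> str:
    """Build a URL-encoded body; repeatable parameters appear once per value."""
    pairs = [("query", query)]
    pairs += [("default-graph-uri", uri) for uri in default_graph_uri]
    pairs += [("named-graph-uri", uri) for uri in named_graph_uri]
    return urlencode(pairs)


body = encode_query_request(
    "select * where {?s ?p ?o}",
    default_graph_uri=["urn:g1", "urn:g2"],
    named_graph_uri=["urn:ng1"],
)
decoded = parse_qs(body)
print(decoded["default-graph-uri"])
```

An update operation would be encoded the same way with the update, using-graph-uri, and using-named-graph-uri fields.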