RDF Uploader
A tool for uploading RDF data to SPARQL endpoints.
When working with RDF data, it is common to need to upload knowledge graphs to several different triple stores. Although most stores claim to be standards-based, there are two relevant standards, the SPARQL 1.1 Graph Store Protocol and SPARQL 1.1 Update, and each store adds its own nuances around exact endpoint URLs, named graphs, and authentication. That makes juggling multiple proprietary upload tools a pain.
Introducing rdf_uploader, a single tool that can upload RDF data to a variety of triple stores. It is easy to use and has no dependencies on RDFLib or any store-specific libraries, relying solely on plain HTTP. With rdf_uploader you can upload your RDF data to different triple stores without dealing with multiple tools and their quirks.
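To make the two standards concrete, here is a minimal sketch of what each looks like over plain HTTP, using only the Python standard library. The endpoint URLs and the graph name are illustrative; this is not rdf_uploader's internal code.

```python
from urllib.request import Request
from urllib.parse import urlencode

turtle = "<http://example.org/s> <http://example.org/p> <http://example.org/o> ."

# 1. Graph Store Protocol: POST the serialized RDF directly to the
#    store's graph endpoint, targeting a named graph via ?graph=.
gsp_request = Request(
    "http://localhost:3030/dataset/data?"
    + urlencode({"graph": "http://example.org/graph"}),
    data=turtle.encode("utf-8"),
    headers={"Content-Type": "text/turtle"},
    method="POST",
)

# 2. SPARQL Update: wrap the triples in an INSERT DATA query and POST
#    it to the update endpoint with the SPARQL Update media type.
update = f"INSERT DATA {{ GRAPH <http://example.org/graph> {{ {turtle} }} }}"
upd_request = Request(
    "http://localhost:3030/dataset/update",
    data=update.encode("utf-8"),
    headers={"Content-Type": "application/sparql-update"},
    method="POST",
)
```

Both are plain HTTP POSTs; the differences are only in the URL, the request body, and the Content-Type header, which is why a dependency-free HTTP client is enough.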
Features
- Ingest RDF data into SPARQL endpoints using asynchronous operations
- Support for multiple RDF stores (MarkLogic, Blazegraph, Neptune, RDFox, and Stardog)
- Authentication support for secure endpoints
- Content type detection and customization
- Clear status outputs after each upload operation
- Concurrent uploads with configurable limits
Installation
From PyPI
pip install rdf-uploader
Using uv (faster installation)
uv pip install rdf-uploader
Development Installation
# Clone the repository
git clone https://github.com/yourusername/rdf-uploader.git
cd rdf-uploader
# Install with development dependencies
pip install -e ".[dev]"
# Or with uv
uv pip install -e ".[dev]"
Usage
Basic Usage
Upload a single RDF file to a SPARQL endpoint:
rdf-uploader upload path/to/file.ttl --endpoint http://localhost:3030/dataset/sparql
Multiple Files
Upload multiple RDF files:
rdf-uploader upload path/to/file1.ttl path/to/file2.n3 --endpoint http://localhost:3030/dataset/sparql
Specify Endpoint Type
rdf-uploader upload path/to/file.ttl --endpoint http://localhost:3030/dataset/sparql --type blazegraph
Available endpoint types:
- marklogic
- neptune
- blazegraph
- rdfox
- stardog
Specify Named Graph
rdf-uploader upload path/to/file.ttl --endpoint http://localhost:3030/dataset/sparql --graph http://example.org/graph
Authentication
For endpoints that require authentication:
rdf-uploader upload path/to/file.ttl --endpoint http://localhost:3030/dataset/sparql --username myuser --password mypass
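For stores that use HTTP Basic authentication, the credentials end up in an Authorization header like the one built below. This is a sketch using only the standard library; some stores (MarkLogic, for example) commonly use digest authentication instead.

```python
import base64

# Build an HTTP Basic Authorization header value from --username/--password.
# Illustrative only; not rdf_uploader's internal code.
def basic_auth(username: str, password: str) -> str:
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return f"Basic {token}"

print(basic_auth("myuser", "mypass"))
```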
Content Type
Specify the content type for the RDF data:
rdf-uploader upload path/to/file.ttl --endpoint http://localhost:3030/dataset/sparql --content-type "text/turtle"
If not specified, the content type is automatically detected based on the file extension:
- .ttl, .turtle: text/turtle
- .nt: application/n-triples
- .n3: text/n3
- .nq, .nquads: application/n-quads
- .rdf, .xml: application/rdf+xml
- .jsonld: application/ld+json
- .json: application/rdf+json
- .trig: application/trig
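The extension mapping above amounts to a simple lookup. The helper below is hypothetical (it is not rdf_uploader's actual API), and the text/turtle fallback for unknown extensions is an assumption.

```python
from pathlib import Path

# Extension-to-media-type table, as documented above.
MEDIA_TYPES = {
    ".ttl": "text/turtle",
    ".turtle": "text/turtle",
    ".nt": "application/n-triples",
    ".n3": "text/n3",
    ".nq": "application/n-quads",
    ".nquads": "application/n-quads",
    ".rdf": "application/rdf+xml",
    ".xml": "application/rdf+xml",
    ".jsonld": "application/ld+json",
    ".json": "application/rdf+json",
    ".trig": "application/trig",
}

def detect_content_type(path: str) -> str:
    # Hypothetical helper: look up the (lowercased) file extension,
    # falling back to text/turtle (an assumption, not documented behavior).
    ext = Path(path).suffix.lower()
    return MEDIA_TYPES.get(ext, "text/turtle")

print(detect_content_type("data/graph.nq"))  # application/n-quads
```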
Control Concurrency
Limit the number of concurrent uploads:
rdf-uploader upload path/to/*.ttl --endpoint http://localhost:3030/dataset/sparql --concurrent 10
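A --concurrent style limit is typically implemented with a semaphore around the upload coroutine. The sketch below shows the standard asyncio pattern with a stand-in upload() function; it is not the library's actual code.

```python
import asyncio

async def upload(path: str) -> str:
    # Stand-in for the real HTTP request.
    await asyncio.sleep(0)
    return f"uploaded {path}"

async def upload_all(paths, limit: int = 10):
    # At most `limit` uploads run at any one time.
    sem = asyncio.Semaphore(limit)

    async def bounded(path):
        async with sem:
            return await upload(path)

    # gather preserves input order in its results.
    return await asyncio.gather(*(bounded(p) for p in paths))

results = asyncio.run(upload_all([f"file{i}.ttl" for i in range(3)], limit=2))
print(results)
```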
Verbose Mode
Enable verbose output to see detailed information about each batch upload, including the number of triples per batch and server response codes:
rdf-uploader upload path/to/file.ttl --endpoint http://localhost:3030/dataset/sparql --verbose
Help
Get help on available commands and options:
rdf-uploader --help
rdf-uploader upload --help
Testing
The project uses pytest for testing. Tests are designed to work with live test databases.
# Run all tests
pytest
# Run specific test file
pytest tests/test_uploader.py
# Run with verbose output
pytest -v
Test Configuration
Tests use a local SPARQL endpoint by default. You can configure the test endpoint by setting environment variables:
export TEST_ENDPOINT_URL=http://localhost:3030/test
export TEST_ENDPOINT_TYPE=fuseki
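A sketch of how the tests might resolve this configuration, falling back to the local defaults when the variables are unset. The helper name is hypothetical; it is not the project's actual conftest.

```python
import os

def endpoint_config(env=os.environ):
    # Hypothetical helper: read the test endpoint settings with the
    # documented local defaults as fallbacks.
    return {
        "url": env.get("TEST_ENDPOINT_URL", "http://localhost:3030/test"),
        "type": env.get("TEST_ENDPOINT_TYPE", "fuseki"),
    }

print(endpoint_config({}))
```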
Development
Code Quality
The project uses ruff for linting and formatting:
# Run linting
ruff check .
# Run formatting
ruff format .
Type Checking
The project uses mypy for type checking:
mypy src tests
License
This project is licensed under the MIT License - see the LICENSE file for details.