
RDF Uploader

A tool for uploading RDF data to SPARQL endpoints

When working with RDF data and multiple triple stores, it is common to need to upload the same knowledge graph to several stores. Most stores support one of two standards for loading data: the SPARQL Graph Store Protocol and SPARQL Update. In practice, though, they differ in exact URL endpoints, named-graph handling, and authentication, so juggling multiple proprietary tools quickly becomes a pain.
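The difference between the two standards is mostly in how the request is shaped. The sketch below is illustrative Python, not rdf_uploader's internals (the helper names and endpoint URLs are made up): it builds the same one-triple upload both ways.

```python
from urllib.parse import urlencode

TTL = '<http://example.org/s> <http://example.org/p> "o" .'

def graph_store_request(endpoint, graph=None):
    """Graph Store Protocol: POST the serialized RDF as the request body.

    The target graph, if any, goes in the ?graph= query parameter."""
    url = endpoint if graph is None else f"{endpoint}?{urlencode({'graph': graph})}"
    return {"method": "POST", "url": url,
            "headers": {"Content-Type": "text/turtle"}, "body": TTL}

def sparql_update_request(endpoint, graph=None):
    """SPARQL Update: wrap the same triples in an INSERT DATA operation.

    The target graph, if any, goes in a GRAPH block inside the query."""
    triples = TTL if graph is None else f"GRAPH <{graph}> {{ {TTL} }}"
    return {"method": "POST", "url": endpoint,
            "headers": {"Content-Type": "application/sparql-update"},
            "body": f"INSERT DATA {{ {triples} }}"}
```

Same data, two request shapes; which one a store accepts (and at which URL) is exactly the per-store nuance the tool papers over.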

Introducing rdf_uploader, a single tool that can upload RDF data to a variety of data sources. It is easy to use and has no dependencies on RDFLib or any datastore-specific libraries, relying solely on pure HTTP. With rdf_uploader, you can seamlessly upload your RDF data to different triple stores without the hassle of dealing with multiple tools and their quirks.

Features

  • Ingest RDF data into SPARQL endpoints using asynchronous operations
  • Support for multiple RDF stores (MarkLogic, Blazegraph, Neptune, RDFox, and Stardog)
  • Authentication support for secure endpoints
  • Content type detection and customization
  • Clear status outputs after each upload operation
  • Concurrent uploads with configurable limits

Installation

From PyPI

pip install rdf-uploader

Using uv (faster installation)

uv pip install rdf-uploader

Development Installation

# Clone the repository
git clone https://github.com/yourusername/rdf-uploader.git
cd rdf-uploader

# Install with development dependencies
pip install -e ".[dev]"

# Or with uv
uv pip install -e ".[dev]"

Usage

Basic Usage

Upload a single RDF file to a SPARQL endpoint:

rdf-uploader upload path/to/file.ttl --endpoint http://localhost:3030/dataset/sparql

Multiple Files

Upload multiple RDF files:

rdf-uploader upload path/to/file1.ttl path/to/file2.n3 --endpoint http://localhost:3030/dataset/sparql

Specify Endpoint Type

rdf-uploader upload path/to/file.ttl --endpoint http://localhost:3030/dataset/sparql --type fuseki

Available endpoint types:

  • marklogic
  • neptune
  • blazegraph
  • rdfox
  • stardog
  • fuseki

Specify Named Graph

rdf-uploader upload path/to/file.ttl --endpoint http://localhost:3030/dataset/sparql --graph http://example.org/graph

Authentication

For endpoints that require authentication:

rdf-uploader upload path/to/file.ttl --endpoint http://localhost:3030/dataset/sparql --username myuser --password mypass
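The credentials end up in an HTTP Authorization header. As a minimal sketch (assuming plain HTTP Basic auth; the helper name is made up, and some stores may instead require Digest auth), the header is built like this:

```python
import base64

def basic_auth_header(username, password):
    # HTTP Basic auth: base64-encode "username:password" and prefix "Basic ".
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}
```

Note that Basic auth sends credentials effectively in the clear, so it should only be used over HTTPS or on a trusted network.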

Content Type

Specify the content type for the RDF data:

rdf-uploader upload path/to/file.ttl --endpoint http://localhost:3030/dataset/sparql --content-type "text/turtle"

If not specified, the content type is automatically detected based on the file extension:

  • .ttl, .turtle: text/turtle
  • .nt: application/n-triples
  • .n3: text/n3
  • .nq, .nquads: application/n-quads
  • .rdf, .xml: application/rdf+xml
  • .jsonld: application/ld+json
  • .json: application/rdf+json
  • .trig: application/trig
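The detection logic amounts to a lookup on the file extension. The sketch below mirrors the table above; `detect_content_type` and its fallback default are illustrative, not rdf_uploader's actual API:

```python
from pathlib import Path

# Extension-to-MIME mapping, mirroring the table above.
CONTENT_TYPES = {
    ".ttl": "text/turtle", ".turtle": "text/turtle",
    ".nt": "application/n-triples",
    ".n3": "text/n3",
    ".nq": "application/n-quads", ".nquads": "application/n-quads",
    ".rdf": "application/rdf+xml", ".xml": "application/rdf+xml",
    ".jsonld": "application/ld+json",
    ".json": "application/rdf+json",
    ".trig": "application/trig",
}

def detect_content_type(path, default="text/turtle"):
    # Case-insensitive lookup on the file's suffix, e.g. ".TTL" -> "text/turtle".
    return CONTENT_TYPES.get(Path(path).suffix.lower(), default)
```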

Control Concurrency

Limit the number of concurrent uploads:

rdf-uploader upload path/to/*.ttl --endpoint http://localhost:3030/dataset/sparql --concurrent 10
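Conceptually, capping concurrent uploads is a semaphore around each asynchronous upload task. The sketch below shows the pattern only; `upload_all` is a made-up helper, not rdf_uploader's API, and the real HTTP request is stubbed out:

```python
import asyncio

async def upload_all(files, limit=10):
    # At most `limit` uploads run at once; the rest wait on the semaphore.
    sem = asyncio.Semaphore(limit)

    async def upload_one(path):
        async with sem:  # blocks here once `limit` uploads are in flight
            await asyncio.sleep(0)  # stand-in for the actual HTTP request
            return (path, "ok")

    # gather preserves input order in its results.
    return await asyncio.gather(*(upload_one(f) for f in files))
```

Raising the limit speeds up bulk loads but puts more simultaneous write load on the store, so the right value depends on the endpoint.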

Verbose Mode

Enable verbose output to see detailed information about each batch upload, including the number of triples per batch and server response codes:

rdf-uploader upload path/to/file.ttl --endpoint http://localhost:3030/dataset/sparql --verbose

Help

Get help on available commands and options:

rdf-uploader --help
rdf-uploader upload --help

Testing

The project uses pytest for testing. Tests are designed to run against live test databases.

# Run all tests
pytest

# Run specific test file
pytest tests/test_uploader.py

# Run with verbose output
pytest -v

Test Configuration

Tests use a local SPARQL endpoint by default. You can configure the test endpoint by setting environment variables:

export TEST_ENDPOINT_URL=http://localhost:3030/test
export TEST_ENDPOINT_TYPE=fuseki

Development

Code Quality

The project uses ruff for linting and formatting:

# Run linting
ruff check .

# Run formatting
ruff format .

Type Checking

The project uses mypy for type checking:

mypy src tests

License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

Download the file for your platform.

Source Distribution

rdf_uploader-0.1.0.tar.gz (20.5 kB)

Built Distribution

rdf_uploader-0.1.0-py3-none-any.whl (9.9 kB)

File details

Details for the file rdf_uploader-0.1.0.tar.gz.

File metadata

  • Download URL: rdf_uploader-0.1.0.tar.gz
  • Size: 20.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-httpx/0.28.1

File hashes

Hashes for rdf_uploader-0.1.0.tar.gz:

  • SHA256: 07a2cfddefba39ede06aeb64931d27ff48021ed6aef8c31d52e03038907d03d8
  • MD5: f1d90c930c3b9bcd5a6b48ee693e3c27
  • BLAKE2b-256: 964306e1d83c9fa83e42b7d3859ca5d05695d5f28f273d48b74c73e4a8dfb53b

File details

Details for the file rdf_uploader-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: rdf_uploader-0.1.0-py3-none-any.whl
  • Size: 9.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-httpx/0.28.1

File hashes

Hashes for rdf_uploader-0.1.0-py3-none-any.whl:

  • SHA256: 470c1c75e171ef776cbd9c1a02f92aaacd80855d6745e47a21fb865e9fc90788
  • MD5: 1c87f1168c65d1d73eb57f4390f360e9
  • BLAKE2b-256: ef8ff30b079231a4cd9daf4c68cdd5bb77934071361bb25796f13494c270dba8
