Sage: a preemptive SPARQL query engine for online Knowledge Graphs

Sage: a SPARQL query engine for public Linked Data providers

Read the online documentation

SaGe is a SPARQL query engine for public Linked Data providers that implements Web preemption. The engine consists of a smart Sage client and a Sage SPARQL query server that hosts RDF datasets stored in the HDT format. This repository contains the Python implementation of the SaGe SPARQL query server.

SPARQL queries are suspended by the web server after a fixed quantum of time and resumed upon client request. Using Web preemption, Sage ensures stable response times for query execution and completeness of results under high load.
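
To make the suspend/resume cycle concrete, here is a toy model of Web preemption in plain Python. It is a sketch, not the SaGe implementation: the names `Page`, `execute` and `run_query` are hypothetical, the "query" is a simple predicate filter over an in-memory list, and the quantum is counted in scanned triples rather than elapsed time so that the example stays deterministic.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Toy dataset: a list of (subject, predicate, object) triples.
TRIPLES: List[Triple] = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "dave"),
    ("alice", "likes", "tea"),
    ("dave", "knows", "erin"),
]


@dataclass
class Page:
    """One server response: partial results plus an optional resume point."""
    results: List[Triple]
    next_offset: Optional[int]  # None means the query is complete


def execute(predicate: str, offset: int = 0, quantum: int = 2) -> Page:
    """Scan at most `quantum` triples starting at `offset`, then suspend.

    Assumption: a real server would bound the quantum by time; counting
    scanned triples is a stand-in that keeps this sketch reproducible.
    """
    end = min(offset + quantum, len(TRIPLES))
    results = [t for t in TRIPLES[offset:end] if t[1] == predicate]
    next_offset = end if end < len(TRIPLES) else None
    return Page(results, next_offset)


def run_query(predicate: str, quantum: int = 2) -> List[Triple]:
    """Client loop: resume the query until the server reports completion."""
    collected: List[Triple] = []
    offset: Optional[int] = 0
    while offset is not None:
        page = execute(predicate, offset, quantum)
        collected.extend(page.results)
        offset = page.next_offset
    return collected


if __name__ == "__main__":
    for triple in run_query("knows"):
        print(triple)
```

Whatever the quantum, the client collects the same complete result set; only the number of round trips changes.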

The complete approach and experimental results are described in a research paper accepted at The Web Conference 2019: Thomas Minier, Hala Skaf-Molli and Pascal Molli. "SaGe: Web Preemption for Public SPARQL Query services" in Proceedings of the 2019 World Wide Web Conference (WWW'19), San Francisco, USA, May 13-17, 2019.

We welcome feedback, comments and questions. Please send them to our mailing list or open an issue on our GitHub issue tracker.

Installation

Installation in a virtualenv is strongly advised!

Requirements:

  • Python 3.7 (or higher)
  • pip
  • gcc/clang with c++11 support
  • Python Development headers

You should have the Python.h header available on your system.
For example, for Python 3.7, install the python3.7-dev package on Debian/Ubuntu systems.

Installation using pip

The core engine of the SaGe SPARQL query server, together with its HDT and PostgreSQL backends, can be installed as follows:

pip install "sage-engine[hdt,postgres]"

The SaGe query engine uses various backends to load RDF datasets. Each backend is installed as an extra dependency. The above command installs both the HDT and PostgreSQL backends.

Manual Installation using poetry

The SaGe SPARQL query server can also be manually installed using the poetry dependency manager.

git clone https://github.com/sage-org/sage-engine
cd sage-engine
poetry install --extras "hdt postgres"

As with pip, the various SaGe backends are installed as extra dependencies, using the --extras flag.

Getting started

Server configuration

A Sage server is configured using a configuration file in YAML syntax. Below is a minimal working example of such a configuration file. A full example is available in the config_examples/ directory.

name: SaGe Test server
maintainer: Chuck Norris
quota: 75
max_results: 2000
graphs:
-
  name: dbpedia
  uri: http://example.org/dbpedia
  description: DBPedia
  backend: hdt-file
  file: datasets/dbpedia.2016.hdt

The quota and max_results fields are used to set the maximum time quantum and the maximum number of results allowed per request, respectively.

Each entry in the graphs field declares an RDF dataset with a name, URI, description, backend and options specific to this backend. The example above uses the hdt-file backend, which allows a Sage server to load RDF datasets from HDT files. Sage uses pyHDT to load and query HDT files.
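
Before starting a server, it can help to sanity-check a configuration. The sketch below works on a plain dictionary, such as the one a YAML parser like PyYAML's `yaml.safe_load` would return for the example above. The helper `check_config` is hypothetical and not part of SaGe, and the choice of which fields are mandatory is an assumption made for illustration.

```python
from typing import Any, Dict, List

# Assumption: these fields are treated as mandatory for this sketch only.
REQUIRED_TOP_LEVEL = ("name", "quota", "max_results", "graphs")
REQUIRED_GRAPH = ("name", "uri", "backend")


def check_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems: List[str] = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in config:
            problems.append(f"missing top-level field: {key}")
    for key in ("quota", "max_results"):
        if key in config:
            value = config[key]
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                problems.append(f"{key} must be a positive integer")
    for i, graph in enumerate(config.get("graphs") or []):
        for key in REQUIRED_GRAPH:
            if key not in graph:
                problems.append(f"graphs[{i}]: missing field: {key}")
        if graph.get("backend") == "hdt-file" and "file" not in graph:
            problems.append(f"graphs[{i}]: hdt-file backend needs a 'file' entry")
    return problems


# The minimal configuration from this README, as a parsed dictionary.
EXAMPLE: Dict[str, Any] = {
    "name": "SaGe Test server",
    "maintainer": "Chuck Norris",
    "quota": 75,
    "max_results": 2000,
    "graphs": [
        {
            "name": "dbpedia",
            "uri": "http://example.org/dbpedia",
            "description": "DBPedia",
            "backend": "hdt-file",
            "file": "datasets/dbpedia.2016.hdt",
        }
    ],
}

if __name__ == "__main__":
    for problem in check_config(EXAMPLE) or ["no problem found"]:
        print(problem)
```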

Starting the server

The sage executable, installed alongside the Sage server, lets you start a Sage server from a configuration file using Gunicorn, a Python WSGI HTTP server.

# launch Sage server with 4 workers on port 8000
sage my_config.yaml -w 4 -p 8000

The full usage of the sage executable is detailed below:

Usage: sage [OPTIONS] CONFIG

  Launch the Sage server using the CONFIG configuration file

Options:
  -p, --port INTEGER              The port to bind  [default: 8000]
  -w, --workers INTEGER           The number of server workers  [default: 4]
  --log-level [debug|info|warning|error]
                                  The granularity of log outputs  [default:
                                  info]
  --help                          Show this message and exit.
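
When the server is launched from a deployment script, the options above can be assembled programmatically. The following sketch builds the argument list for the sage executable using only the flags listed in the usage text. The helper `sage_command` is hypothetical, and its range checks on the port and worker count are assumptions, not behaviour of the sage executable itself.

```python
import shlex
from typing import List

# Log levels taken from the usage text of the sage executable.
LOG_LEVELS = ("debug", "info", "warning", "error")


def sage_command(config_path: str, port: int = 8000, workers: int = 4,
                 log_level: str = "info") -> List[str]:
    """Build the argv list for `sage`, mirroring its documented defaults."""
    if log_level not in LOG_LEVELS:
        raise ValueError(f"log_level must be one of {LOG_LEVELS}")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    return ["sage", config_path,
            "-p", str(port),
            "-w", str(workers),
            "--log-level", log_level]


if __name__ == "__main__":
    argv = sage_command("my_config.yaml", workers=4, port=8000)
    # Print a shell-safe version of the command line.
    print(" ".join(shlex.quote(arg) for arg in argv))
    # To actually start the server: subprocess.run(argv, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems when the configuration path contains spaces.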

SaGe Docker image

The Sage server is also available as a Docker image. To use it, remember to mount into the container the directory that contains your configuration file and your datasets.

docker pull callidon/sage
docker run -v path/to/config-file:/opt/data/ -p 8000:8000 callidon/sage sage /opt/data/config.yaml -w 4 -p 8000

Documentation

To generate the documentation, navigate to the docs directory and build the HTML pages:

cd docs/
make html
open build/html/index.html

Copyright 2017-2019 - GDD Team, LS2N, University of Nantes
