wikibase-sync

Python library to synchronise data between RDF files and Wikibase instances.

It updates your Wikibase instance based on the changes made to an RDF ontology.

How to install

You can install it manually from the source code:

git clone https://github.com/weso/wikibase-sync
cd wikibase-sync
python setup.py install

Python 3.7+ is recommended.

Examples

The following code synchronizes the changes between two RDF files to a given Wikibase instance:

from wbsync.triplestore import WikibaseAdapter
from wbsync.synchronization import GraphDiffSyncAlgorithm, OntologySynchronizer

# Connection details of the Wikibase instance (placeholders, replace with your own)
mediawiki_api_url = 'wikibase_api_endpoint'
sparql_endpoint_url = 'wikibase_sparql_endpoint'
username = 'wikibase_username'
password = 'wikibase_password'
adapter = WikibaseAdapter(mediawiki_api_url, sparql_endpoint_url, username, password)

algorithm = GraphDiffSyncAlgorithm()
synchronizer = OntologySynchronizer(algorithm)

source_content = "original rdf content goes here"
target_content = "final rdf content goes here"
ops = synchronizer.synchronize(source_content, target_content)
for op in ops:
    res = op.execute(adapter)
    if not res.successful:
        print(f"Error synchronizing triple: {res.message}")
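
When many operations are executed, it can be more convenient to collect the failure messages than to print them one by one. The following is a minimal sketch that relies only on what the example above shows (an `execute(adapter)` method returning a result with `successful` and `message`). The `execute_all` helper and the `FakeResult` and `FakeOperation` classes are hypothetical and exist only so the sketch runs without a Wikibase instance; they are not part of wbsync.

```python
def execute_all(ops, adapter):
    """Execute every operation and return the messages of the failed ones."""
    failures = []
    for op in ops:
        res = op.execute(adapter)
        if not res.successful:
            failures.append(res.message)
    return failures


# Hypothetical stand-ins (not wbsync classes) that mimic the interface
# used in the README: op.execute(adapter) -> result.successful / result.message
class FakeResult:
    def __init__(self, successful, message=""):
        self.successful = successful
        self.message = message


class FakeOperation:
    def __init__(self, result):
        self.result = result

    def execute(self, adapter):
        # A real operation would talk to the Wikibase through the adapter.
        return self.result


ops = [
    FakeOperation(FakeResult(True)),
    FakeOperation(FakeResult(False, "invalid statement")),
]
failures = execute_all(ops, adapter=None)
print(failures)
```

With real operations, `ops` would be the list returned by `synchronizer.synchronize(...)` and `adapter` the `WikibaseAdapter` created above.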

Leaving source_content empty is equivalent to adding the contents of target_content to the Wikibase, while leaving target_content empty is equivalent to removing the contents of source_content from the Wikibase, if present. Additional examples of synchronizing RDF files with a Wikibase instance can be found in the Synchronization notebook.
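
The behaviour with empty inputs can be pictured as a set difference between the two versions of the graph. The sketch below is a conceptual illustration only, assuming triples are modelled as plain tuples; `diff_triples` is a hypothetical helper and does not reproduce the internals of `GraphDiffSyncAlgorithm`, which are not described here.

```python
# Conceptual illustration only: this is NOT wbsync's implementation.
# Triples are modelled as plain (subject, predicate, object) tuples.

def diff_triples(source, target):
    """Return (additions, removals) needed to turn source into target."""
    source, target = set(source), set(target)
    additions = target - source  # triples present only in the target
    removals = source - target   # triples present only in the source
    return additions, removals


old = {("ex:Alice", "ex:knows", "ex:Bob")}
new = {("ex:Alice", "ex:knows", "ex:Carol")}

# Both versions given: one triple is removed and one is added.
print(diff_triples(old, new))

# Empty source: every triple in the target is an addition.
print(diff_triples(set(), new))

# Empty target: every triple in the source is a removal.
print(diff_triples(old, set()))
```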

Executing batch operations

Operations can also be executed in batches, where all the statements of a given entity are executed at once. This type of synchronization performs better, at the risk that a single invalid statement cancels the entire batch operation. The following code can be used to execute batch operations:

from wbsync.synchronization.operations import optimize_ops

def execute_batch_synchronization(source_content, target_content, synchronizer, adapter):
    ops = synchronizer.synchronize(source_content, target_content)
    batch_ops = optimize_ops(ops)
    for op in batch_ops:
        res = op.execute(adapter)
        if not res.successful:
            print(f"Error synchronizing triple: {res.message}")

More information about these operations and the time gained with them can be found in the Benchmarks notebook.
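
To see why batching reduces the amount of work, consider grouping single-statement operations by the entity they affect, so that each entity is handled once instead of once per statement. The sketch below is a conceptual illustration under that assumption; `group_by_entity` and the tuple representation of operations are hypothetical and do not describe how `optimize_ops` is implemented.

```python
from collections import defaultdict

# Conceptual illustration only: operations are modelled as
# (action, subject, predicate, object) tuples, not wbsync objects.

def group_by_entity(ops):
    """Group single-statement operations by their subject entity."""
    batches = defaultdict(list)
    for action, subject, predicate, obj in ops:
        batches[subject].append((action, predicate, obj))
    return dict(batches)


ops = [
    ("add", "ex:Alice", "ex:knows", "ex:Bob"),
    ("add", "ex:Alice", "ex:age", "30"),
    ("remove", "ex:Bob", "ex:knows", "ex:Carol"),
]

batches = group_by_entity(ops)
for subject, statements in batches.items():
    # One batch per entity instead of one call per statement.
    print(subject, len(statements))
```

In this model three statements collapse into two batches. It also shows the trade-off described above: if one statement in a batch is invalid, every statement grouped with it is affected.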
