
A tool for syncing dataset metadata between MADATA and Wikidata


madata


madata syncs the metadata of datasets between MADATA (Mannheim Data Repository) and Wikidata. It provides access to the MADATA metadata records directly in Python.


Installation

pip install madata

or

git clone https://github.com/UB-Mannheim/madata
cd madata/
pip install .

Initialization

On initialization, madata harvests the MADATA OAI-PMH interface, stores the Dublin Core metadata records in records.OAI_DC, and queries the Wikidata SPARQL endpoint for the list of metadata records published at MADATA. Example:

from madata import Metadata
records = Metadata()
print(records)
[('OAI', 'https://madata.bib.uni-mannheim.de/cgi/oai2'),
 ('MADATA records from OAI-PMH', 163),
 ('MADATA records at Wikidata', 1),
 ('In sync?', False)]

Every record rec in the list records.OAI_DC has the following attributes: rec.metadata (structured metadata record), rec.header (structured header for a metadata record) and rec.raw (raw DC metadata record). The raw header is available via rec.header.raw. Additionally, a pandas dataframe with the metadata records is stored in records.OAI_DC_df.
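To make the difference between a raw and a structured record concrete, here is a minimal offline sketch. It assumes that a structured record is a mapping from Dublin Core field names to lists of values; this text does not confirm that this is how madata represents rec.metadata. The sample XML and the helper dc_to_dict are made up for illustration and are not part of madata.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up oai_dc record for illustration only; not a real MADATA record.
RAW_RECORD = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example dataset</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:creator>Roe, Richard</dc:creator>
  <dc:date>2023</dc:date>
</oai_dc:dc>"""

# Namespace prefix that ElementTree puts in front of Dublin Core tag names.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def dc_to_dict(raw_xml):
    """Hypothetical helper: collect Dublin Core elements into {field: [values]}."""
    fields = defaultdict(list)
    for element in ET.fromstring(raw_xml).iter():
        # Keep only elements in the Dublin Core namespace that carry text.
        if element.tag.startswith(DC_NS) and element.text:
            fields[element.tag[len(DC_NS):]].append(element.text.strip())
    return dict(fields)


metadata = dc_to_dict(RAW_RECORD)
print(metadata["title"])
print(metadata["creator"])
```

Repeated fields such as creator end up as several entries in one list, which is why every value in the mapping is a list, even for fields that occur once.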

Syncing

To upload the MADATA metadata records to Wikidata, you need a Wikidata account. If you have one, run:

from madata import Metadata
records = Metadata()
records._sync()
>>> Wikidata username: 
>>> Wikidata password: 

Enter your username and password; madata then starts syncing the metadata records between MADATA and Wikidata.
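How madata decides which records still need to be uploaded is not described here. As a rough sketch of one plausible check, the records harvested from OAI-PMH can be compared with those already at Wikidata by identifier. The function sync_summary and the identifiers below are hypothetical and are not part of the madata API.

```python
def sync_summary(oai_ids, wikidata_ids):
    """Hypothetical check: compare two collections of record identifiers."""
    oai_ids, wikidata_ids = set(oai_ids), set(wikidata_ids)
    # Records harvested from the repository but not yet present at Wikidata.
    missing = sorted(oai_ids - wikidata_ids)
    return {
        "records_from_oai": len(oai_ids),
        "records_at_wikidata": len(wikidata_ids),
        "missing_at_wikidata": missing,
        "in_sync": not missing,
    }


# Made-up identifiers for illustration only.
summary = sync_summary(
    oai_ids=["rec-1", "rec-2", "rec-3"],
    wikidata_ids=["rec-2"],
)
print(summary["missing_at_wikidata"])
print(summary["in_sync"])
```

Comparing sets of identifiers rather than bare counts also catches the case where both sides hold the same number of records but not the same ones.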

SPARQL queries

The MADATA-subset at Wikidata: https://w.wiki/6s7R.


