Skip to main content

A library for managing DCAT metadata using Apache Atlas

Project description

atlasdcat

Tests codecov PyPI Read the Docs

A Python library for mapping Apache Atlas Glossary terms to DCAT metadata and vice versa.

Specification the Norwegian Application Profile of the DCAT standard.

Notice This library is part of the DCAT for Apache Atlas lifecycle and not a complete solution. The complete lifecycle contains the following actions:

  • Create glossary
  • Add dataset descriptions by using atlasdcat mapper (this library)
  • Assign dataset description glossary terms to entities (actual datasets)
  • Fetch glossary terms as DCAT catalog in RDF format (this library)

Usage

Install

% pip install atlasdcat

Getting started

Setup mapper

# Example...
from atlasdcat import AtlasDcatMapper, AtlasGlossaryClient
from pyapacheatlas.auth import BasicAuthentication

atlas_auth = BasicAuthentication(username="dummy", password="dummy")
atlas_client = AtlasGlossaryClient(
    endpoint_url="http://atlas", authentication=atlas_auth
)

mapper = AtlasDcatMapper(
    glossary_client=atlas_client,
    glossary_id="myglossary",
    catalog_uri="https://domain/catalog",
    catalog_language="http://publications.europa.eu/resource/authority/language/NOB",
    catalog_title="Catalog",
    catalog_publisher="https://domain/publisher",
    dataset_uri_template="http://domain/datasets/{guid}",
    distribution_uri_template="http://domain/distributions/{guid}",
    language="nb",
)

Map glossary terms to DCAT Catalog RDF resource

try:
    mapper.fetch_glossary()
    catalog = mapper.map_glossary_terms_to_dataset_catalog()
    print(catalog.to_rdf())
except Exception as e:
    print(f"An exception occurred: {e}")

Map DCAT Catalog RDF resource to glossary terms

catalog = Catalog()
catalog.identifier = "http://catalog-uri"
catalog.title = {"nb": "mytitle"}
catalog.publisher = "http://publisher"
catalog.language = ["nb"]
catalog.license = ""

dataset = Dataset()
dataset.title = {"nb": "Dataset"}
dataset.description = {"nb": "Dataset description"}
catalog.datasets = [dataset]

try:
    mapper.fetch_glossary()
    mapper.map_dataset_catalog_to_glossary_terms(catalog)
    mapper.save_glossary_terms()
except Exception as e:
    print(f"An exception occurred: {e}")

For an example of usage of this library in a simple server, see example.

Development

Requirements

% pip install poetry==1.1.13
% pip install nox==2022.1.7
% pip inject nox nox-poetry==1.0.0

Install developer tools

% git clone https://github.com/Informasjonsforvaltning/atlasdcat.git
% cd atlasdcat
% pyenv install 3.8.12
% pyenv install 3.9.10
% pyenv install 3.10.
% pyenv local 3.8.12 3.9.10 3.10.
% poetry install

Run all sessions

% nox

Run all tests with coverage reporting

% nox -rs tests

Debugging

You can enter into Pdb by passing --pdb to pytest:

nox -rs tests -- --pdb

You can set breakpoints directly in code by using the function breakpoint().

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

atlasdcat-1.1.2.tar.gz (16.5 kB view hashes)

Uploaded Source

Built Distribution

atlasdcat-1.1.2-py3-none-any.whl (15.7 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page