CLI tools for the creation of Kudaf Data Source components: 1) Variables Metadata, and 2) REST API Datasource back-end
KUDAF Datasource CLI Tools
This is a set of Command Line Interface (CLI) tools to facilitate the technical tasks required from Data Providers who want to make their data available on the KUDAF data-sharing platform.
The CLI can create the following Kudaf Data Source components:
- Variables Metadata, and/or
- REST API Datasource back-end (including variables metadata and possibly even the data files themselves)
About KUDAF
(Summary info and links for the Kudaf initiative)
High-level workflow for Data Source administrators
(Summary of the process for Datasource admins to make their data available in KUDAF, and where these CLI Tools fit in that journey)
Local installation instructions (Linux/Mac)
Download the package to your computer:

- Open your browser and navigate to the project's GitLab page: https://gitlab.sikt.no/kudaf/kudaf-datasource-tools
- Once there, download a ZIP file with the source code
- Move the zipped file to whichever directory you want to use for this installation
- Open a Terminal window and navigate to the directory where the zipped file is
- Unzip the downloaded file; it will create a folder called kudaf-datasource-tools-main
- Switch to the newly created folder:

$ cd path/to/kudaf-datasource-tools-main
Make sure Python3 is installed on your computer (versions from 3.8 up to 3.11 should work fine)
$ python3 --version
Install Poetry (Python package and dependency manager) on your computer
Full Poetry documentation can be found here: https://python-poetry.org/docs/
The official installer should work fine on the command line for Linux, macOS and Windows:
$ curl -sSL https://install.python-poetry.org | python3 -
If the installation was successful, configure Poetry to create virtual environments inside the project folder:
$ poetry config virtualenvs.in-project true
Mac users: Troubleshooting

In case of errors installing Poetry on your Mac, you may have to install it with pipx instead. pipx in turn requires Homebrew, so install that first (Homebrew documentation: https://brew.sh/):

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Once Homebrew is installed, proceed to install pipx:

$ brew install pipx
$ pipx ensurepath

Finally, install Poetry:

$ pipx install poetry
Create a Python virtual environment and activate it
$ python3 -m venv .venv
This creates the virtual environment in the hidden folder .venv. Activate it with:
$ source .venv/bin/activate
Install Kudaf Datasource Tools and other required Python packages
$ poetry install
Creating a YAML configuration file
Example YAML configuration file
The following file is included in the package and can be found in the datasource_tools/config folder:

config_example.yaml
---
projectInfo:
  name: "Kudaf-datasource-API-my-Datasource-Name"
  author: Author Name
  datasourceName: "my Datasource Name"
  datasourceId: "my-FeideKundeportal-Datasource-UUID"
dataFiles:
  - dataFile: &mydatafile
    fileNameExt: mydatafile.csv
    csvParseDelimiter: ";"  # Optional (default ","). The character used in the CSV file to separate values within each row
    fileDirectory: /path/to/my/datafiles/directory  # Only needed if different from the current directory
unitTypes:
  - MIN_ENHETSTYPE: &min_enhetstype  # Only needed if different from the global unit types: PERSON/VIRKSOMHET/KOMMUNE/FYLKE
    shortName: MIN_ENHETSTYPE
    name: Short identification label
    description: Detailed description of this unit type
    dataType: LONG  # One of STRING/DATE/LONG/DOUBLE
variables:
  - name: VARIABELENS_NAVN
    temporalityType: EVENT  # One of FIXED/EVENT/STATUS/ACCUMULATED
    sensitivityLevel: NONPUBLIC  # One of PUBLIC/NONPUBLIC
    populationDescription:
      - Description of the population this variable measures
    spatialCoverageDescription:
      - Norge
      - Other geographical description that applies to this data
    subjectFields:
      - Topics/concepts/terms that this data is about
    identifierVariables:
      - unitType: *min_enhetstype  # Can also be one of the global unit types: PERSON/VIRKSOMHET/KOMMUNE/FYLKE
    measureVariables:
      - label: Short label for what this variable measures/shows
        description: Detailed description of what this variable measures/shows
        dataType: STRING  # One of STRING/LONG/DATE/DOUBLE
    dataMapping:
      dataFile: *mydatafile
      identifierColumns:
        - Min_Identificatorkolonne  # CSV file column for the identifier
      measureColumns:
        - Min_Målkolonnen  # CSV file column for what is measured
      attributeColumns:  # Only needed for EVENT/STATUS/ACCUMULATED temporalityType(s)
        - Start_Time  # CSV file column for the start time
        - End_Time  # CSV file column for the end time
  - name: VARIABELENS_NAVN_ACCUM
    temporalityType: ACCUMULATED  # One of FIXED/EVENT/STATUS/ACCUMULATED
    sensitivityLevel: NONPUBLIC  # One of PUBLIC/NONPUBLIC
    populationDescription:
      - Description of the population this variable measures
    spatialCoverageDescription:
      - Norge
      - Other geographical description that applies to this data
    subjectFields:
      - Topics/concepts/terms that this data is about
    identifierVariables:
      - unitType: *min_enhetstype  # Can also be one of the global unit types: PERSON/VIRKSOMHET/KOMMUNE/FYLKE
    measureVariables:
      - label: Short label for what this variable measures/shows
        description: Detailed description of what this variable measures/shows
        dataType: LONG  # If accumulated data are to be summed, this should be one of LONG/DOUBLE
    dataMapping:
      dataFile: *mydatafile
      identifierColumns:
        - Min_Identificatorkolonne  # CSV file column for the identifier
      measureColumns:
        - Min_Målkolonnen  # CSV file column for what is measured
      measureColumnsAccumulated: False  # Optional (True/False, default False). Whether the data in the file are already accumulated
      attributeColumns:  # Only needed for EVENT/STATUS/ACCUMULATED temporalityType(s)
        - Start_Time  # CSV file column for the start time
        - End_Time  # CSV file column for the end time
...
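The enumerated fields in the example above (temporalityType, sensitivityLevel, dataType) only accept the values listed in their comments. As a rough illustration, a variable entry can be sanity-checked before running the generator with a short script like the following. The validation rules shown are a sketch based on the comments in config_example.yaml; the script is not part of the Kudaf Datasource Tools package:

```python
# Sketch: sanity-check one 'variables' entry against the allowed values
# documented in config_example.yaml. Illustrative only.

ALLOWED = {
    "temporalityType": {"FIXED", "EVENT", "STATUS", "ACCUMULATED"},
    "sensitivityLevel": {"PUBLIC", "NONPUBLIC"},
    "dataType": {"STRING", "DATE", "LONG", "DOUBLE"},
}

def check_variable(variable):
    """Return a list of problems found in a single variable entry."""
    problems = []
    for field in ("temporalityType", "sensitivityLevel"):
        value = variable.get(field)
        if value not in ALLOWED[field]:
            problems.append(f"{field}: {value!r} not in {sorted(ALLOWED[field])}")
    for measure in variable.get("measureVariables", []):
        if measure.get("dataType") not in ALLOWED["dataType"]:
            problems.append(f"measure dataType: {measure.get('dataType')!r}")
    # EVENT/STATUS/ACCUMULATED variables need start/end attribute columns
    if variable.get("temporalityType") != "FIXED":
        mapping = variable.get("dataMapping", {})
        if len(mapping.get("attributeColumns", [])) < 2:
            problems.append("attributeColumns: start and end time columns required")
    return problems

example = {
    "name": "VARIABELENS_NAVN",
    "temporalityType": "EVENT",
    "sensitivityLevel": "NONPUBLIC",
    "measureVariables": [{"dataType": "STRING"}],
    "dataMapping": {"attributeColumns": ["Start_Time", "End_Time"]},
}
print(check_variable(example))  # prints [] — an empty list means no problems found
```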
Kudaf Datasource Tools CLI operation
Navigate to the project directory and activate the virtual environment (only if not already activated):
$ source .venv/bin/activate
The kudaf-generate command should now be available. This is the main entry point to the CLI's functionalities.
Displaying the help menus
$ kudaf-generate --help
Usage: kudaf-generate [OPTIONS] COMMAND [ARGS]...
Kudaf Datasource Tools
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --install-completion Install completion for the current shell. │
│ --show-completion Show completion for the current shell, to copy it or customize the installation. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ api Generate a Kudaf Datasource REST API back-end │
│ metadata Generate Variables/UnitTypes Metadata │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
As we can see, there are two sub-commands available: api and metadata. We can obtain help on them as well:
$ kudaf-generate api --help
Usage: kudaf-generate api [OPTIONS]
Generate a Kudaf Datasource REST API back-end
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --config-yaml-path PATH Absolute path to the YAML configuration file │
│ [default: /home/me/current/directory/config.yaml] │
│ --input-data-files-dir PATH Absolute path to the data files directory │
│ [default: /home/me/current/directory] │
│ --output-api-dir PATH Absolute path to directory where the Datasource API folder is to be written │
│ to │
│ [default: /current/directory] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
$ kudaf-generate metadata --help
Usage: kudaf-generate metadata [OPTIONS]
Generate Variables/UnitTypes Metadata
JSON metadata files ('variables.json' and maybe 'unit_types.json') will be written to the
(optionally) given output directory.
If any of the optional directories is not specified, the current directory is used as default.
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --config-yaml-path PATH Absolute path to the YAML configuration file │
│ [default: /home/me/current/directory/config.yaml] │
│ --output-metadata-dir PATH Absolute path to directory where the Metadata files are to be written to │
│ [default: /home/me/current/directory] │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Generating metadata only from a YAML configuration file
$ kudaf-generate metadata --config-yaml-path /home/me/path/to/config.yaml --output-metadata-dir /home/me/path/to/folder
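The metadata command writes variables.json (and possibly unit_types.json) to the output directory. A quick way to inspect the result is a few lines of Python; note that the exact JSON layout assumed below (a list of variable objects, each with a "name" key mirroring the YAML configuration) is an assumption, not documented behaviour:

```python
# Sketch: list the variable names found in a generated variables.json.
# The JSON layout assumed here is an illustration, not a guarantee.
import json
from pathlib import Path

def list_variable_names(metadata_dir):
    """Return the 'name' of each entry in <metadata_dir>/variables.json."""
    variables = json.loads(Path(metadata_dir, "variables.json").read_text())
    return [v.get("name") for v in variables]
```

Running list_variable_names("/home/me/path/to/folder") after the command above would return the variable names defined in the YAML configuration file.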
Generating an API
$ kudaf-generate api --config-yaml-path /home/me/path/to/config.yaml --output-api-dir /home/me/path/to/folder
Hashes for kudaf_datasource_tools-0.1.0.tar.gz

Algorithm | Hash digest
---|---
SHA256 | ed87da2c4478a647eb6b3b8e1c681d656d9645125f6a1de4a291add661ec4049
MD5 | 4a77992a7dad424f196d53b2fd3a9106
BLAKE2b-256 | 8c905f1ae8d6a613ebaa613b735dfae98a973733eee60cdf87c1b044608422b3
Hashes for kudaf_datasource_tools-0.1.0-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | 7f519bb9bd0940724d4c9699fd6471f1db95fa08a8ba31ad1c4cb928209aaefb
MD5 | 11f08fb79c2d8fcb9583346cc536cb01
BLAKE2b-256 | 32281f06b904f37ac726068d45989d2453c6d4428b2def3559d5156298958dc6