
Document dataset metadata. For use in Statistics Norway's metadata system.


Datadoc


Document datasets in Statistics Norway

Usage


From Jupyter

  1. Open https://jupyter.dapla-staging.ssb.no or another Jupyter Lab environment
  2. Run pip install ssb-datadoc[gcs] in the terminal
  3. Upload a dataset to your Jupyter server (e.g. https://github.com/statisticsnorway/datadoc/blob/master/klargjorte_data/person_data_v1.parquet)
  4. Run from datadoc import main; main("./person_data_v1.parquet") in a notebook
  5. Datadoc will open in a new tab
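Step 4 can be wrapped in a small helper that fails early when the dataset has not been uploaded yet. This is a minimal sketch: `open_in_datadoc` is a hypothetical name introduced here for illustration, and only the `from datadoc import main` call comes from the steps above.

```python
from pathlib import Path


def open_in_datadoc(dataset_path: str) -> None:
    """Open a local dataset in Datadoc, failing early if the file is missing.

    Hypothetical helper; only `datadoc.main` comes from the steps above.
    """
    path = Path(dataset_path)
    if not path.is_file():
        raise FileNotFoundError(f"Upload the dataset first: {path}")

    # Imported lazily so the path check works even before the package is installed.
    from datadoc import main

    main(str(path))
```

In a notebook cell, `open_in_datadoc("./person_data_v1.parquet")` would then behave like step 4, with a clearer error if the upload in step 3 was skipped.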

Contributing

Prerequisites

  • Python >3.8 (3.10 is preferred)
  • Poetry, install via curl -sSL https://install.python-poetry.org | python3 -
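The prerequisites above can be checked with a short script before setting up the project. This is a sketch under stated assumptions: `python_status` and `poetry_installed` are hypothetical names, and ">3.8" is read literally as "3.9 or newer", which may be stricter than the project intends.

```python
import shutil
import sys


def python_status(version: tuple) -> str:
    """Classify a (major, minor) version against the prerequisites."""
    if version[:2] == (3, 10):
        return "preferred"
    # Assumption: ">3.8" is read literally, i.e. 3.9 or newer.
    if version[:2] > (3, 8):
        return "supported"
    return "unsupported"


def poetry_installed() -> bool:
    """Return True if a `poetry` executable is found on PATH."""
    return shutil.which("poetry") is not None


if __name__ == "__main__":
    print("Python:", python_status(sys.version_info[:2]))
    print("Poetry found:", poetry_installed())
```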

Dependency Management

Poetry is used for dependency management, and Poe the Poet runs the project's tasks inside Poetry's virtualenv. After cloning the project, first install the dependencies, then run the tests to verify that everything works.

Install all dependencies

poetry install --all-extras

Add dependencies

Main

poetry add <python package name>

Dev

poetry add --group dev <python package name>

Run tests

poetry run poe test
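The install-then-test sequence recommended after cloning can be scripted so that a failed install stops before the tests run. The sketch below uses only the two commands shown above; `SETUP_COMMANDS` and `run_setup` are hypothetical names, and the `runner` parameter exists only so the sequence can be exercised without invoking Poetry.

```python
import subprocess

# The two commands from this section, in the recommended order.
SETUP_COMMANDS = [
    ["poetry", "install", "--all-extras"],
    ["poetry", "run", "poe", "test"],
]


def run_setup(runner=subprocess.run) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in SETUP_COMMANDS:
        # check=True makes subprocess.run raise CalledProcessError
        # on a non-zero exit code, so later commands are skipped.
        runner(command, check=True)


if __name__ == "__main__":
    run_setup()
```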

Run project locally

To run the project locally:

poetry run poe datadoc "gs://ssb-staging-dapla-felles-data-delt/datadoc/klargjorte_data/person_data_v1.parquet"
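The dataset argument above is a Google Cloud Storage path of the form gs://bucket/object. As an illustration of that structure only, and not of how Datadoc itself resolves paths, the following sketch splits such a path with the standard library; `split_gcs_path` is a hypothetical helper.

```python
from urllib.parse import urlparse


def split_gcs_path(path: str) -> tuple:
    """Split a gs:// URL into (bucket, object_name)."""
    parsed = urlparse(path)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"Not a gs:// path: {path}")
    # The bucket is the network location; the object name is the
    # remaining path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    bucket, name = split_gcs_path(
        "gs://ssb-staging-dapla-felles-data-delt/datadoc/klargjorte_data/person_data_v1.parquet"
    )
    print(bucket)
    print(name)
```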

Run project locally in Jupyter

To run the project locally in Jupyter, run:

poetry run poe jupyter

A Jupyter instance should open in your browser. Open and run the cells in the .ipynb file to demo datadoc.

Bump version

poetry run poe bump-patch-version

Warning: run this on the default branch.

This command will:

  1. Increment version strings in files
  2. Commit the changes
  3. Tag the commit with the new version

Then just run git push origin --tags to push the changes and trigger the release process.
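To make step 1 concrete, a patch bump increments the last component of a MAJOR.MINOR.PATCH version string. The sketch below shows only that arithmetic; `bump_patch` is a hypothetical function, and it is not a description of how the bump-patch-version task is implemented.

```python
def bump_patch(version: str) -> str:
    """Return the version string with its patch component incremented."""
    # Assumes a plain MAJOR.MINOR.PATCH string with integer parts.
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


if __name__ == "__main__":
    print(bump_patch("0.2.1"))
```

The components are compared as integers, so "1.9.9" becomes "1.9.10" instead of rolling over to a new minor version.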
