Document dataset metadata. For use in Statistics Norway's metadata system.

Datadoc

Document datasets in Statistics Norway

Usage

From Jupyter

  1. Open https://jupyter.dapla-staging.ssb.no or another Jupyter Lab environment
  2. Datadoc comes preinstalled in Statistics Norway environments. Elsewhere, run pip install ssb-datadoc to install it
  3. Upload a dataset to your Jupyter server (e.g. https://github.com/statisticsnorway/datadoc/blob/master/klargjorte_data/befolkning/person_testdata_p2021-12-31_p2021-12-31_v1.parquet)
  4. Run the demo.ipynb notebook
  5. Datadoc will open in the notebook
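
For step 2, it can help to check from inside the notebook whether the package is already present before installing anything. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of Datadoc.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "ssb-datadoc") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it only queries package metadata
    and does not import or start Datadoc.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("ssb-datadoc is not installed; run: pip install ssb-datadoc")
    else:
        print(f"ssb-datadoc {version} is installed")
```

In a Statistics Norway environment the check should report a version; elsewhere it indicates whether the pip install step is still needed.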

Contributing

Local environment

Poetry is used for dependency management. Poe the Poet is used for running poe tasks within Poetry's virtualenv. After cloning this project, first install the necessary dependencies, then run the tests to verify that everything works.

1. Prerequisites

  • Python >=3.10
  • Poetry, install via curl -sSL https://install.python-poetry.org | python3 -
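
The two prerequisites above can be checked before installing dependencies. This is a small sketch under the assumption that Poetry is expected on PATH; the function names are hypothetical.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version taken from the prerequisites list


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def poetry_on_path() -> bool:
    """True if a `poetry` executable is found on PATH (assumed install style)."""
    return shutil.which("poetry") is not None


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    print("Poetry found:", poetry_on_path())
```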

2. Install dependencies

poetry install

3. Install pre-commit hooks

poetry run pre-commit install

4. Run tests

poetry run poe test

Add dependencies

Main

poetry add <python package name>

Dev

poetry add --group dev <python package name>

Run project locally

To run the project locally:

poetry run poe datadoc

Run project locally in Jupyter

To run the project locally in Jupyter, run:

poetry run poe jupyter

A Jupyter instance should open in your browser. Open and run the cells in the .ipynb file to demo datadoc.

Run the Dockerized application locally

docker run -p 8050:8050 \
  -v "$HOME/.config/gcloud/application_default_credentials.json:/application_default_credentials.json" \
  -e GOOGLE_APPLICATION_CREDENTIALS="/application_default_credentials.json" \
  datadoc
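
The command above mounts the local Google Cloud application default credentials into the container and points GOOGLE_APPLICATION_CREDENTIALS at the mounted file. A minimal sketch that assembles the same arguments and warns when the credentials file is missing might look like this. The function name is hypothetical, and it assumes an image tagged datadoc is already available locally.

```python
import shlex
from pathlib import Path

# Path inside the container, as used in the docker command above.
CONTAINER_CREDENTIALS = "/application_default_credentials.json"


def docker_run_command(credentials: Path, image: str = "datadoc", port: int = 8050) -> list:
    """Build the argument list for the docker run command (does not execute it)."""
    return [
        "docker", "run",
        "-p", f"{port}:{port}",
        "-v", f"{credentials}:{CONTAINER_CREDENTIALS}",
        "-e", f"GOOGLE_APPLICATION_CREDENTIALS={CONTAINER_CREDENTIALS}",
        image,
    ]


if __name__ == "__main__":
    creds = Path.home() / ".config" / "gcloud" / "application_default_credentials.json"
    if not creds.is_file():
        print(f"Credentials file not found: {creds}")
    # Print a shell-quoted command line that can be copied into a terminal.
    print(shlex.join(docker_run_command(creds)))
```

Building the argument list separately from running it keeps the sketch free of side effects; passing the list to subprocess.run would start the container.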

Release process

Run the relevant version command on a branch, e.g. one of:

poetry version patch
poetry version minor

Commit with a message like Bump version x.x.x -> y.y.y.

Open and merge a PR.

Use GitHub to tag and release.
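
To illustrate what the patch and minor bump rules do to a version number, and how the commit message is formed, here is a small sketch. It handles only plain MAJOR.MINOR.PATCH versions and is not Poetry's own implementation; the function names are hypothetical.

```python
def bump(version: str, rule: str) -> str:
    """Apply a 'patch' or 'minor' bump to a plain MAJOR.MINOR.PATCH version."""
    major, minor, patch = (int(part) for part in version.split("."))
    if rule == "patch":
        patch += 1
    elif rule == "minor":
        # A minor bump resets the patch component to zero.
        minor, patch = minor + 1, 0
    else:
        raise ValueError(f"unsupported rule: {rule!r}")
    return f"{major}.{minor}.{patch}"


def commit_message(old: str, new: str) -> str:
    """Format the commit message used in the release steps."""
    return f"Bump version {old} -> {new}"


if __name__ == "__main__":
    old = "0.4.0"
    print(commit_message(old, bump(old, "patch")))  # Bump version 0.4.0 -> 0.4.1
    print(commit_message(old, bump(old, "minor")))  # Bump version 0.4.0 -> 0.5.0
```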

Download files

Source Distribution

ssb_datadoc-0.4.0.tar.gz (329.5 kB)

Built Distribution

ssb_datadoc-0.4.0-py3-none-any.whl (348.0 kB)

File details

Details for the file ssb_datadoc-0.4.0.tar.gz.

File metadata

  • Download URL: ssb_datadoc-0.4.0.tar.gz
  • Size: 329.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.10.2 Linux/6.2.0-1018-azure

File hashes

Hashes for ssb_datadoc-0.4.0.tar.gz (algorithm, followed by hex digest):
SHA256 2e697b5a96f2bf27fb96150c9f67693a446841f92a8b6a06675582166775af2a
MD5 c28a7760582a8e16b03d528acc19ef4f
BLAKE2b-256 a0f873bd29af9b5050e27d6505703f9ed1c67731f99ae5b016161f807c4bd451


File details

Details for the file ssb_datadoc-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: ssb_datadoc-0.4.0-py3-none-any.whl
  • Size: 348.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.10.2 Linux/6.2.0-1018-azure

File hashes

Hashes for ssb_datadoc-0.4.0-py3-none-any.whl (algorithm, followed by hex digest):
SHA256 b041efd94b1b40391f85c8f348c0bf11175673eefcc0adef3d2ae7d7cd768059
MD5 6ec1a31b4047f9de7ccaf48cb14e8d4b
BLAKE2b-256 b636cc38366d8384a0dafa9c030095eac89137f2e6d0d39d7743f383cb6463fd
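
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using the standard library's hashlib. The helper names are hypothetical, and the file is assumed to have been downloaded to the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Digest copied from the source distribution's hash listing above.
    expected = "2e697b5a96f2bf27fb96150c9f67693a446841f92a8b6a06675582166775af2a"
    sdist = Path("ssb_datadoc-0.4.0.tar.gz")  # assumed download location
    if sdist.is_file():
        print("match" if matches(sdist, expected) else "MISMATCH")
    else:
        print(f"{sdist} not found; download it first")
```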

