Handles data transfer Statbank <-> Dapla for Statistics Norway

Dapla Statbank Client

Used internally by SSB (Statistics Norway). Validates and transfers data from Dapla to Statbank. Gets data from public and internal statbank.

Installing from PyPI with Poetry

If your project was set up with ssb-project create, open a terminal and navigate into the project folder (cd project-name). Then install the package:

poetry add dapla-statbank-client
ssb-project build

Create a notebook that uses the project's kernel, and run this code to verify that you can "log in":

from statbank import StatbankClient
stat_client = StatbankClient()
# When prompted, change LASTEBRUKER to your load-statbank username
# and fill in the password.
# The default publishing date is TOMORROW.
print(stat_client)
# Printing will show you all the default settings on the client.
# You can change for example date by specifying it: StatbankClient(date="2023-02-16")

Be aware that from the dapla-staging environment you will be sending to the Statbank TEST database, and your changes will not be published. For this you need the "test-password", which belongs to the same user (lastebruker) but differs from the ordinary password (lastepassord). If you are missing the test-password, ask the statbank-team to send it to you for your load user.

If you are in the main dapla-jupyterlab (prod), you WILL publish to statbanken, in the PROD database. So pay extra attention to the publishing date when in dapla-main-prod-jupyterlab, and be aware of which password you are entering, based on your environment. To see data actually published to the test database, you can use this link if you work at SSB.
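Because a transfer from prod really publishes, it can be worth sanity-checking the date before sending. The following is a minimal sketch using only the standard library. The helper check_publish_date is hypothetical and not part of this package, and it assumes the date is given as an ISO string (like the date="2023-02-16" example above) and that you want it strictly in the future.

```python
from datetime import date, timedelta
from typing import Optional


def check_publish_date(publish_date: str, today: Optional[date] = None) -> date:
    """Parse an ISO date string and require it to be later than today.

    Hypothetical helper for illustration; not part of the statbank package.
    """
    today = today or date.today()
    parsed = date.fromisoformat(publish_date)
    if parsed <= today:
        raise ValueError(f"Publish date {parsed} is not after today ({today}).")
    return parsed


# The client's documented default is tomorrow; the same value computed by hand:
tomorrow = date.today() + timedelta(days=1)
checked = check_publish_date(tomorrow.isoformat())
print(checked)
```

The checked date can then be passed on as a string, for example StatbankClient(date=checked.isoformat()).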

Usage Transferring

stat_client.transfer({"deltabellfilnavn.dat" : df_06399}, "06339")

The simplest form of usage is transferring directly with the transfer method on the client class. The Statbank table expects named "deltabeller" in a dictionary; see transferdata_template() below for an easy way to get the deltabell names in a dict. This might be all you need if this data has been sent in the same shape to statbanken before. If you are at all unsure, keep reading.

Building datasets

You can look at the "filbeskrivelse", which stat_client.get_description() returns as an instance of its own local class, StatbankUttrekksBeskrivelse:

description_06339 = stat_client.get_description(tableid="06339")
print(description_06339)

This should contain all the information you are used to reading from the old "Filbeskrivelse", and it describes how you should construct your data.

Your data must be placed in a data structure: a dict of pandas DataFrames. Take a look at how the dict should be constructed with:

description_06339.transferdata_template()

This both returns the dict and prints it, so use whichever suits you. Insert your own DataFrames into it, and send it to .validate() and/or .transfer(). It might look like this:

{"deltabellfilnavn.dat" : df_06399}

Other interesting attributes can be retrieved from the UttrekksBeskrivelse-object:

description_06339.subtables
description_06339.variables
description_06339.codelists
description_06339.suppression

Once you have started constructing your data, you can validate it against the Uttrekksbeskrivelse with the validate method, without starting a transfer. Validation runs against rules in this package, NOT by actually sending any data to statbanken. Call validate like this:

stat_client.validate({"deltabellfilnavn.dat" : df_06399}, tableid="06339")
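To give a feel for what client-side checks on codes and time formats can look like, here is a plain-Python sketch. It is not the package's implementation: the function names, the example codelist, and the four-digit year pattern are all assumptions made for illustration.

```python
import re


def check_codes(values, allowed_codes):
    """Return the sorted values that are missing from the allowed codelist."""
    return sorted(set(values) - set(allowed_codes))


def check_time_format(values, pattern=r"\d{4}"):
    """Return the values that do not fully match the expected time pattern."""
    compiled = re.compile(pattern)
    return [value for value in values if not compiled.fullmatch(value)]


# Hypothetical column contents and codelist, for illustration only.
region_column = ["0301", "1103", "9999"]
region_codelist = {"0301", "1103", "4601"}
time_column = ["2022", "2023", "23", "2022K1"]

unknown_codes = check_codes(region_column, region_codelist)
bad_times = check_time_format(time_column)
print(unknown_codes)  # ['9999']
print(bad_times)      # ['23', '2022K1']
```

Both the length and the characters of a time value are covered by a full-match against one pattern, which is why "23" and "2022K1" are both reported.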

Validation happens by default on the user side, in Python. It checks the number of tables, the number of columns, code usage in categorical columns, code usage in "suppression columns" (prikkekolonner), time formats (both length and characters used), and more. This might be a lot of feedback, but understanding it will help you debug what might be wrong with your data before sending it in.

If your data contains floats, validation might hint that you should use the .round_data() method to prepare your data. It uses the number of decimals defined in the UttrekksBeskrivelse to round UPWARDS (if you have any "pure 0.5 values"), and converts the values to strings with comma as the decimal sign along the way. It is used like this:

data_dict_06339 = description_06339.round_data({"deltabellfilnavn.dat" : df_06399})
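The rounding behaviour described above (ties rounded upwards, comma as decimal sign) differs from Python's built-in round, which rounds ties to the nearest even number. The sketch below shows the idea with the standard decimal module. It is an illustration under stated assumptions, not the code behind .round_data(): the function name is hypothetical, and ROUND_HALF_UP rounds ties away from zero, which may or may not match how the package treats negative values.

```python
from decimal import ROUND_HALF_UP, Decimal


def round_half_up_to_str(value, decimals):
    """Round ties upwards and return a string with comma as decimal sign.

    Hypothetical helper for illustration; not part of the statbank package.
    """
    quantum = Decimal(1).scaleb(-decimals)  # decimals=2 gives Decimal('0.01')
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
    return format(rounded, "f").replace(".", ",")


print(round(2.5))                      # 2, built-in rounds ties to even
print(round_half_up_to_str(2.5, 0))    # 3
print(round_half_up_to_str(0.125, 2))  # 0,13
print(round_half_up_to_str(7, 2))      # 7,00
```

Going through str(value) before Decimal avoids carrying the binary floating-point representation into the rounding step.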

Getting apidata

These functions retrieve public-facing data from the statbank. They can be imported directly, in which case they will not ask for a username and password, but they are also available through the client (which asks for username and password during initialization).

from statbank import apidata_all, apidata, apidata_rotate

df_06339 = apidata_all("06339", include_id=True)

apidata_all does not need a specified query; it builds its own query and tries to get all the data from the table. This might be too much, resulting in an error.

The include_id parameter is a bit magical: it gets both the code and value columns for categorical columns and tries to place them next to each other. It also checks whether the content is the same, in which case it does not include the content twice.

If you want to specify a query to limit the response, use the function apidata instead.
Here we are requesting an "internal table" which only people at SSB have access to, with a specified URL and query.

query = {'query': [{'code': 'Region', 'selection': {'filter': 'vs:Landet', 'values': ['0']}}, {'code': 'Alder', 'selection': {'filter': 'vs:AldGrupp19', 'values': ['000', '001', '002', '003', '004', '005', '006', '007', '008', '009', '010', '011', '012', '013', '014', '015', '016', '017', '018', '019', '020', '021', '022', '023', '024', '025', '026', '027', '028', '029', '030', '031', '032', '033', '034', '035', '036', '037', '038', '039', '040', '041', '042', '043', '044', '045', '046', '047', '048', '049', '050', '051', '052', '053', '054', '055', '056', '057', '058', '059', '060', '061', '062', '063', '064', '065', '066', '067', '068', '069', '070', '071', '072', '073', '074', '075', '076', '077', '078', '079', '080', '081', '082', '083', '084', '085', '086', '087', '088', '089', '090', '091', '092', '093', '094', '095', '096', '097', '098', '099', '100', '101', '102', '103', '104', '105', '106', '107', '108', '109', '110', '111', '112', '113', '114', '115', '116', '117', '118', '119+']}}, {'code': 'Statsbrgskap', 'selection': {'filter': 'vs:Statsborgerskap', 'values': ['000']}}, {'code': 'Tid', 'selection': {'filter': 'item', 'values': ['2022']}}], 'response': {'format': 'json-stat2'}}

df_folkemengde = apidata(
    "https://i.ssb.no/pxwebi/api/v0/no/prod_24v_intern/START/be/be01/folkemengde/Rd0002Aa",
    query,
    include_id=True,
)
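The long list of age codes in the query above does not have to be typed out by hand. As a small sketch, the same query dict can be built with a list comprehension; the codes, filters, and structure are copied from the query above, and only the variable name age_codes is new.

```python
# '000' through '118' as zero-padded strings, plus the open-ended top group.
age_codes = [f"{age:03d}" for age in range(119)] + ["119+"]

query = {
    "query": [
        {"code": "Region", "selection": {"filter": "vs:Landet", "values": ["0"]}},
        {"code": "Alder", "selection": {"filter": "vs:AldGrupp19", "values": age_codes}},
        {"code": "Statsbrgskap", "selection": {"filter": "vs:Statsborgerskap", "values": ["000"]}},
        {"code": "Tid", "selection": {"filter": "item", "values": ["2022"]}},
    ],
    "response": {"format": "json-stat2"},
}

print(len(age_codes))  # 120
```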

apimetadata gets metadata from the public api, like apidata does.

meta = apimetadata("05300")

apicodelist gets a specific codelist out of the metadata, or all the codelists.

all_codelists = apicodelist("05300")
avstand_codelist = apicodelist("05300", "Avstand1")

apidata_rotate is a thin wrapper around pivot_table. Stolen from: https://github.com/sehyoun/SSB_API_helper/blob/master/src/ssb_api_helper.py

df_folkemengde_rotert = apidata_rotate(df_folkemengde, 'tidskolonne', "verdikolonne")

Using a date-widget for publish day

For easier setting of the date on the client, after it has been initialized, you can use a date-picker in JupyterLab from ipywidgets.

date = stat_client.date_picker()
date
# Split the cell here: run the cell above, then set the date in the widget
# (don't run that cell again, or you'll have to set the date again).
# When the line below is then run, it should update the date on the client:
stat_client.set_publish_date(date)

Saving and restoring Uttrekksbeskrivelser and Transfers as json

From stat_client.transfer() you will receive a StatbankTransfer object, and from stat_client.get_description() a StatbankUttrekksBeskrivelse object. These can be serialized and saved to disk, and later restored. This could serve as a form of logging of which transfers were done. It can also be used to:

- have one notebook get all the descriptions of all tables produced from the pipeline (requires password),
- then have a notebook for each table that restores the description from the local JSON and can actually run .validate (without typing in a password),
- then have a notebook at the end that sends all the tables (requiring password entry a second time).

filbesk_06339 = stat_client.get_description("06339")
filbesk_06339.to_json("path.json")
# Later the file can be restored with
filbesk_06339_new = stat_client.read_description_json("path.json")

Some deeper data structures, like the DataFrames in the transfer, will not be serialized and stored with the transfer object in its JSON. All request parts that might include Auth are stripped.
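The idea of stripping auth-related parts before writing JSON can be sketched with the standard json module. This is not how the package does it internally: the key names in SENSITIVE_KEYS, the strip_sensitive helper, and the shape of the example record are all assumptions made for illustration.

```python
import json

# Hypothetical set of keys to drop; the package's actual rules may differ.
SENSITIVE_KEYS = {"authorization", "auth", "password"}


def strip_sensitive(obj):
    """Recursively copy dicts and lists, dropping keys that look like auth."""
    if isinstance(obj, dict):
        return {
            key: strip_sensitive(value)
            for key, value in obj.items()
            if key.lower() not in SENSITIVE_KEYS
        }
    if isinstance(obj, list):
        return [strip_sensitive(item) for item in obj]
    return obj


# Made-up record resembling a logged transfer, for illustration only.
record = {
    "tableid": "06339",
    "request": {
        "headers": {
            "Authorization": "Basic example-secret",
            "Content-Type": "multipart/form-data",
        }
    },
}

serialized = json.dumps(strip_sensitive(record), indent=2)
restored = json.loads(serialized)
print(restored["request"]["headers"])
```

The original record is left untouched, since strip_sensitive builds new dicts and lists rather than deleting keys in place.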

The logger

The statbank package creates its own logger using the Python logging package. The logger is available at statbank.logger. Many of the validations are logged at level "info" if they seem ok, or at level "warning" if things are not ok. The levels are colorized with colorama: green for INFO, magenta for WARNING. If you don't want to see the info parts of the validate method, you can change the logger's level before calling validate, like this:

import statbank
import logging
statbank.logger.setLevel(logging.WARNING)
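To see what raising the level does, here is a self-contained sketch using only the standard logging module. It uses a stand-in logger named demo_validation rather than statbank.logger, and the ListHandler class and the log messages are made up for illustration.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted records in a list so they are easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


# Stand-in logger; statbank.logger is assumed to behave like any other logger.
demo_logger = logging.getLogger("demo_validation")
demo_logger.propagate = False
handler = ListHandler()
demo_logger.addHandler(handler)

demo_logger.setLevel(logging.INFO)
demo_logger.info("Number of columns looks ok")
demo_logger.warning("Unknown code in column")

demo_logger.setLevel(logging.WARNING)
demo_logger.info("Number of columns looks ok")  # filtered out at this level
demo_logger.warning("Unknown code in column")

print(handler.messages)
```

After the level is raised to WARNING, only the warning is recorded, so the list ends up with one INFO entry and two WARNING entries.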

License

Distributed under the terms of the MIT license, Dapla Statbank Client is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Credits

This project was generated from Statistics Norway's SSB PyPI Template.

Release history

This version

1.3.5

Jun 19, 2025

1.3.4

May 27, 2025

1.3.3

May 23, 2025

1.3.2

May 23, 2025

1.3.1

Apr 5, 2025

1.3.0

Mar 14, 2025

1.2.10

Jan 29, 2025

1.2.9

Jan 22, 2025

1.2.8

Nov 20, 2024

1.2.7

Aug 30, 2024

1.2.6

Aug 28, 2024

1.2.4

Jul 2, 2024

1.2.3

May 22, 2024

1.2.2

May 2, 2024

1.2.1

Apr 22, 2024

1.2.0

Apr 2, 2024

1.1.2

Apr 2, 2024

1.1.1

Mar 22, 2024

1.1.0

Mar 5, 2024

1.0.10

Jan 26, 2024

1.0.9

Nov 14, 2023

1.0.8

Nov 14, 2023

1.0.7

Nov 13, 2023

1.0.6

Jun 21, 2023

1.0.5

Jun 21, 2023

1.0.4

May 8, 2023

1.0.2

Apr 13, 2023

1.0.0

Feb 16, 2023

0.0.11

Jan 20, 2023

0.0.10

Jan 19, 2023

0.0.9

Jan 10, 2023

0.0.8

Dec 21, 2022

0.0.5

Nov 18, 2022

0.0.4

Nov 18, 2022

0.0.3

Nov 11, 2022

0.0.2

Nov 3, 2022

0.0.1

Oct 24, 2022

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dapla_statbank_client-1.3.5.tar.gz (38.9 kB view details)

Uploaded Jun 19, 2025 Source

Built Distribution

dapla_statbank_client-1.3.5-py3-none-any.whl (40.8 kB view details)

Uploaded Jun 19, 2025 Python 3

File details

Details for the file dapla_statbank_client-1.3.5.tar.gz.

File metadata

Download URL: dapla_statbank_client-1.3.5.tar.gz
Upload date: Jun 19, 2025
Size: 38.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for dapla_statbank_client-1.3.5.tar.gz
Algorithm	Hash digest
SHA256	`1c1c229c008c6f5d764eb9330e234b8c0f1f4c99eceea9cfeb87a61df183bd81`
MD5	`095e1176e07c143ff954dd88db6ca231`
BLAKE2b-256	`3d94f05865bc60f1855e62c931806b8ba039e064477ea042c91b32c759eda17a`

Provenance

The following attestation bundles were made for dapla_statbank_client-1.3.5.tar.gz:

Publisher: release.yml on statisticsnorway/dapla-statbank-client

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dapla_statbank_client-1.3.5.tar.gz
- Subject digest: 1c1c229c008c6f5d764eb9330e234b8c0f1f4c99eceea9cfeb87a61df183bd81
- Sigstore transparency entry: 243895788
- Sigstore integration time: Jun 19, 2025
Source repository:
- Permalink: statisticsnorway/dapla-statbank-client@f10f79fa873eae5058ded392941f100a59531abb
- Branch / Tag: refs/heads/main
- Owner: https://github.com/statisticsnorway
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f10f79fa873eae5058ded392941f100a59531abb
- Trigger Event: push

File details

Details for the file dapla_statbank_client-1.3.5-py3-none-any.whl.

File metadata

Download URL: dapla_statbank_client-1.3.5-py3-none-any.whl
Upload date: Jun 19, 2025
Size: 40.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for dapla_statbank_client-1.3.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`473c3568eeb27c227c0ab5e85bcc4989c0f0f7a76a357997c5f22a09d5784ab9`
MD5	`a1f6dac693b11bca168fc7dea312e9d6`
BLAKE2b-256	`cab41cba0fbd286d06b3f35c6d6cad1c5752c386eeddcbbc6fafcd2c5687fcc2`

Provenance

The following attestation bundles were made for dapla_statbank_client-1.3.5-py3-none-any.whl:

Publisher: release.yml on statisticsnorway/dapla-statbank-client

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dapla_statbank_client-1.3.5-py3-none-any.whl
- Subject digest: 473c3568eeb27c227c0ab5e85bcc4989c0f0f7a76a357997c5f22a09d5784ab9
- Sigstore transparency entry: 243895791
- Sigstore integration time: Jun 19, 2025
Source repository:
- Permalink: statisticsnorway/dapla-statbank-client@f10f79fa873eae5058ded392941f100a59531abb
- Branch / Tag: refs/heads/main
- Owner: https://github.com/statisticsnorway
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f10f79fa873eae5058ded392941f100a59531abb
- Trigger Event: push
