Pseudonymization extensions for Dapla Toolbelt
Pseudonymize, repseudonymize and depseudonymize data on Dapla.
Usage
See the command-line reference for details.
Pseudonymize
from dapla_pseudo import pseudonymize
# Pseudonymize fields in a local file using the default key:
pseudonymize(file_path="./data/personer.json", fields=["fnr", "fornavn"])
# Pseudonymize fields in a local file, explicitly denoting the key to use:
pseudonymize(file_path="./data/personer.json", fields=["fnr", "fornavn"], key="ssb-common-key-1")
# Pseudonymize a local file using a custom key:
import json
custom_keyset = json.dumps({
    "encryptedKeyset": "CiQAp91NBhLdknX3j9jF6vwhdyURaqcT9/M/iczV7fLn...8XYFKwxiwMtCzDT6QGzCCCM=",
    "keysetInfo": {
        "primaryKeyId": 1234567890,
        "keyInfo": [
            {
                "typeUrl": "type.googleapis.com/google.crypto.tink.AesSivKey",
                "status": "ENABLED",
                "keyId": 1234567890,
                "outputPrefixType": "TINK",
            }
        ],
    },
    "kekUri": "gcp-kms://projects/some-project-id/locations/europe-north1/keyRings/some-keyring/cryptoKeys/some-kek-1",
})
pseudonymize(file_path="./data/personer.json", fields=["fnr", "fornavn"], key=custom_keyset)
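A custom keyset is passed as a JSON string, so a malformed one is easy to produce by accident. The following is a minimal sketch of a local sanity check to run before handing the string to pseudonymize. The helper name check_keyset and the rules it enforces are assumptions derived from the shape of the example above; they are not part of dapla_pseudo, and the service may apply different or stricter validation.

```python
import json


def check_keyset(keyset_json: str) -> None:
    """Raise ValueError if the keyset JSON lacks the structure shown in the example.

    Hypothetical helper: the checks mirror the example keyset only.
    """
    keyset = json.loads(keyset_json)

    # Top-level fields present in the example keyset.
    for field in ("encryptedKeyset", "keysetInfo", "kekUri"):
        if field not in keyset:
            raise ValueError(f"missing field: {field}")

    info = keyset["keysetInfo"]
    primary_id = info.get("primaryKeyId")

    # Collect the IDs of all keys marked as ENABLED.
    enabled_ids = [
        key.get("keyId")
        for key in info.get("keyInfo", [])
        if key.get("status") == "ENABLED"
    ]

    # Assumption: the primary key must refer to an enabled entry in keyInfo.
    if primary_id not in enabled_ids:
        raise ValueError(f"primaryKeyId {primary_id} does not match an enabled key")
```

Calling check_keyset(custom_keyset) returns silently for a keyset shaped like the example and raises ValueError otherwise.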
# Operate on data in a streaming manner:
import shutil
with pseudonymize("./data/personer.json", fields=["fnr", "fornavn", "etternavn"], stream=True) as res:
    with open("./data/personer_deid.json", "wb") as f:
        res.raw.decode_content = True
        shutil.copyfileobj(res.raw, f)
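The streaming example relies on shutil.copyfileobj, which reads from one file-like object and writes to another in fixed-size chunks, so the full payload never has to be held in memory. The sketch below shows that mechanism in isolation. It uses an in-memory io.BytesIO as a stand-in for res.raw (assumed here to be a binary file-like object), and the helper name copy_in_chunks is hypothetical.

```python
import io
import shutil
from typing import BinaryIO


def copy_in_chunks(source: BinaryIO, destination: BinaryIO, chunk_size: int = 64 * 1024) -> None:
    """Copy source to destination, reading at most chunk_size bytes at a time."""
    shutil.copyfileobj(source, destination, length=chunk_size)


# Stand-in for a streamed response body; real data would come from res.raw.
payload = b'{"fnr": "placeholder"}\n' * 1000
source = io.BytesIO(payload)
destination = io.BytesIO()

copy_in_chunks(source, destination, chunk_size=1024)
```

After the call, destination holds the same bytes as payload. In real use the destination would be a file opened in "wb" mode, as in the example above.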
# Map certain fields to a stable ID (SID):
pseudonymize(file_path="./data/personer.json", fields=["fornavn"], sid_fields=["fnr"])
Repseudonymize
from dapla_pseudo import repseudonymize
# Repseudonymize fields in a local file, denoting source and target keys to use:
repseudonymize(file_path="./data/personer_deid.json", fields=["fnr", "fornavn"], source_key="ssb-common-key-1", target_key="ssb-common-key-2")
Depseudonymize
from dapla_pseudo import depseudonymize
# Depseudonymize fields in a local file using the default key:
depseudonymize(file_path="./data/personer_deid.json", fields=["fnr", "fornavn"])
# Depseudonymize fields in a local file, explicitly denoting the key to use:
depseudonymize(file_path="./data/personer_deid.json", fields=["fnr", "fornavn"], key="ssb-common-key-1")
Note that depseudonymization requires elevated access privileges.
Requirements
Installation
You can install dapla-toolbelt-pseudo via pip from PyPI:
pip install dapla-toolbelt-pseudo
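To confirm that the installation succeeded, the installed version can be read from the package metadata using the standard library. This is a small sketch; the helper name installed_version is hypothetical, and the distribution name is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("dapla-toolbelt-pseudo")
    print(found if found else "dapla-toolbelt-pseudo is not installed")
```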
Contributing
Contributions are very welcome. To learn more, see the Contributor Guide.
License
Distributed under the terms of the MIT license, Pseudonymization extensions for Dapla Toolbelt is free and open source software.
Issues
If you encounter any problems, please file an issue along with a detailed description.
Credits
This project was generated from @cjolowicz's Hypermodern Python Cookiecutter template.