Small helper utilities for Dataiku DSS project variables and dataset metadata.

dataiku-utils

dataiku-utils is a small Python helper package for Dataiku DSS projects. It provides utility classes for working with project variables, dataset metadata, and SQL-backed dataset columns.

The package is designed for code running inside Dataiku DSS recipes, scenarios, or libraries where the dataiku Python module is available.

Features

  • Read and update Dataiku project variables.
  • Update project variables with a lightweight lock to reduce concurrent-write conflicts.
  • Manage a single project variable through a small object-oriented API.
  • Track whether a managed variable was updated.
  • Inspect Dataiku dataset schemas and SQL table locations.
  • Discover minimum, maximum, and next available values from a SQL-backed dataset column.

Installation

pip install dataiku-utils

For local development from a source archive:

pip install dataiku_utils-0.1.0.tar.gz

Runtime requirement

This package expects the dataiku Python module to be available at runtime. In normal usage, Dataiku DSS provides this module inside code environments used by recipes and scenarios.

The package does not declare dataiku as a hard PyPI dependency because the runtime module is generally provided by the DSS environment rather than installed from public PyPI.
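
Because `dataiku` is not installed as a dependency, code that may also run outside DSS (for example in unit tests) can check for the module first. The helpers below are a minimal sketch with hypothetical names (`dataiku_available`, `require_dataiku`); they are not part of dataiku-utils.

```python
import importlib.util


def dataiku_available() -> bool:
    """Return True when the `dataiku` module can be found in this environment."""
    return importlib.util.find_spec("dataiku") is not None


def require_dataiku():
    """Import and return `dataiku`, or fail with a clear message outside DSS."""
    if not dataiku_available():
        raise RuntimeError(
            "The 'dataiku' module is not available; "
            "run this code inside a Dataiku DSS code environment."
        )
    import dataiku  # imported lazily so this file still loads outside DSS
    return dataiku
```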

Quick start

Read and update project variables

from dataiku_utils import ProjectVariables

value = ProjectVariables.get_variable("last_processed_date")

ProjectVariables.safe_update_scope_variables(
    {"last_processed_date": "2026-01-31"},
    scope="standard",
)
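
For context, a scoped update like the one above can be sketched with the public Dataiku API, where a project's `get_variables()` returns a dict with `"standard"` and `"local"` entries and `set_variables()` saves it back. The helper `update_scope_variables` and the `FakeProject` stand-in are hypothetical and only make the sketch runnable outside DSS; how dataiku-utils implements this internally is not shown here.

```python
import copy


def update_scope_variables(project, updates, scope="standard"):
    """Merge `updates` into one scope of the project variables and save them.

    `project` only needs get_variables() and set_variables(), like the object
    returned by dataiku.api_client().get_default_project() inside DSS.
    """
    if scope not in ("standard", "local"):
        raise ValueError(f"unknown scope: {scope!r}")
    variables = project.get_variables()            # {"standard": {...}, "local": {...}}
    variables.setdefault(scope, {}).update(updates)
    project.set_variables(variables)
    return variables[scope]


class FakeProject:
    """In-memory stand-in for a DSS project (assumption, for illustration only)."""

    def __init__(self):
        self._variables = {"standard": {}, "local": {}}

    def get_variables(self):
        return copy.deepcopy(self._variables)

    def set_variables(self, variables):
        self._variables = copy.deepcopy(variables)


project = FakeProject()
result = update_scope_variables(project, {"last_processed_date": "2026-01-31"})
```

This sketch does no locking, so it is closer to a plain update than to `safe_update_scope_variables`.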

Manage a single variable

from dataiku_utils import SimpleValueVariablesUtils

variable = SimpleValueVariablesUtils(
    key="processing_date",
    initial_value="2026-01-01",
)

print(variable.value)
variable.value = "2026-01-02"
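
The object-oriented pattern above, a `value` property backed by a stored variable plus an "updated" flag, can be sketched over a plain dict. `ManagedVariable` is a hypothetical class, and two behaviors in it are assumptions rather than documented behavior of `SimpleValueVariablesUtils`: the initial value is only applied when the key is missing, and the flag is only set when the value actually changes.

```python
class ManagedVariable:
    """Hypothetical wrapper around one key of a dict-like variable store."""

    def __init__(self, store, key, initial_value=None):
        self._store = store
        self._key = key
        self.updated = False
        # Assumption: seed the initial value only when the key does not exist yet.
        if key not in store:
            store[key] = initial_value

    @property
    def value(self):
        return self._store[self._key]

    @value.setter
    def value(self, new_value):
        # Assumption: only a real change counts as an update.
        if new_value != self._store[self._key]:
            self._store[self._key] = new_value
            self.updated = True


store = {}
variable = ManagedVariable(store, key="processing_date", initial_value="2026-01-01")
variable.value = "2026-01-02"
```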

Inspect a dataset

import dataiku
from dataiku_utils import DatasetUtils

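# ignore_flow=True allows opening a dataset that is not declared as an input or output of the current recipe.
dataset = dataiku.Dataset("input_dataset", ignore_flow=True)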

columns = DatasetUtils.colnames_from_dataset(dataset)
table_name = DatasetUtils.table_fullname_from_sql_dataset(dataset)
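
To make the two lookups concrete, here is a sketch that works on plain data instead of a live dataset. It assumes a schema shaped like a list of `{"name": ..., "type": ...}` dicts and a location-info dict whose `"info"` entry carries `"schema"` and `"table"` keys; those shapes, and the helper names `colnames_from_schema` and `table_fullname`, are assumptions for illustration and may differ from what `DatasetUtils` does.

```python
def colnames_from_schema(schema):
    """Return the column names from a list of {"name": ..., "type": ...} dicts."""
    return [column["name"] for column in schema]


def table_fullname(location_info):
    """Build "schema.table", or just "table" when no schema is given."""
    info = location_info.get("info", {})
    table = info["table"]
    schema = info.get("schema")
    return f"{schema}.{table}" if schema else table


# Sample data standing in for what a SQL-backed dataset might report.
schema = [
    {"name": "event_id", "type": "bigint"},
    {"name": "event_date", "type": "date"},
]
location = {"locationInfoType": "SQL", "info": {"schema": "analytics", "table": "events"}}

columns = colnames_from_schema(schema)
full_name = table_fullname(location)
```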

Manage a variable based on a SQL-backed table column

from dataiku_utils import TableColumnValueVariablesUtils

variable = TableColumnValueVariablesUtils(
    value_key="last_processed_value",
    dataset_name="input_dataset",
    colname="event_date",
    coltype="DATE",
    start_min_value="2026-01-01",
    start_max_value="2026-12-31",
)

variable.create_if_not_exists()
updated = variable.update()
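
The minimum, maximum, and next-value lookups behind this class can be illustrated with an in-memory SQLite table. This is a sketch under assumptions: `column_bounds` and `next_value` are hypothetical helpers, and "next available value" is read here as the smallest column value strictly greater than the last processed one, which may not match the package's definition.

```python
import sqlite3


def column_bounds(conn, table, column):
    """Return (min, max) of a column. Identifiers must come from trusted code."""
    row = conn.execute(
        f'SELECT MIN("{column}"), MAX("{column}") FROM "{table}"'
    ).fetchone()
    return row[0], row[1]


def next_value(conn, table, column, last_value):
    """Return the smallest value strictly greater than last_value, or None."""
    row = conn.execute(
        f'SELECT MIN("{column}") FROM "{table}" WHERE "{column}" > ?',
        (last_value,),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [("2026-01-01",), ("2026-01-03",), ("2026-02-10",)],
)

bounds = column_bounds(conn, "events", "event_date")
following = next_value(conn, "events", "event_date", "2026-01-01")
```

ISO-formatted dates sort correctly as text, which is why plain `MIN` and `MAX` work in this example.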

Logging

All public helper classes inherit from LogBase, which provides one logger per class.

import logging
from dataiku_utils import ProjectVariables

ProjectVariables.logger().setLevel(logging.DEBUG)

Debug statements that may compute expensive values are guarded with logger.isEnabledFor(logging.DEBUG) where relevant.
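
The pattern described here, one logger per class plus a guard around costly debug output, can be sketched as follows. `LogBaseSketch` and `Loader` are hypothetical names, and naming the logger after the module and class is an assumption about how `LogBase` might do it.

```python
import logging


class LogBaseSketch:
    """Hypothetical base class that gives each subclass its own logger."""

    @classmethod
    def logger(cls):
        # logging.getLogger returns the same object for the same name,
        # so each class gets exactly one logger.
        return logging.getLogger(f"{cls.__module__}.{cls.__name__}")


class Loader(LogBaseSketch):
    @classmethod
    def load(cls, rows):
        log = cls.logger()
        # Skip the summary computation entirely unless DEBUG is enabled.
        if log.isEnabledFor(logging.DEBUG):
            log.debug("total characters loaded: %d", sum(len(row) for row in rows))
        return len(rows)


Loader.logger().setLevel(logging.DEBUG)
count = Loader.load(["a", "bb"])
```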

Public API

The package exposes the following classes:

  • LogBase
  • DatasetUtils
  • ProjectVariables
  • SimpleValueVariablesUtils
  • ValueVariablesUtils
  • WatchedValueVariablesUtils
  • TableColumnValueUtils
  • TableColumnValueVariablesUtils
  • TableColumnWatchedValueVariablesUtils

Development

Build locally:

python -m pip install --upgrade build twine
python -m build
twine check dist/*

Publish with the included helper script:

chmod +x .pypi.sh
./.pypi.sh

Use TestPyPI:

PYPI_REPOSITORY=testpypi ./.pypi.sh

License

Proprietary. If you intend to distribute the package under an open-source license, update the license metadata before publishing.

Download files

Source Distribution

dataiku_utils-0.1.0.tar.gz (11.7 kB)

Uploaded Source

Built Distribution

dataiku_utils-0.1.0-py3-none-any.whl (15.1 kB)

Uploaded Python 3

File details

Details for the file dataiku_utils-0.1.0.tar.gz.

File metadata

  • Download URL: dataiku_utils-0.1.0.tar.gz
  • Upload date:
  • Size: 11.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for dataiku_utils-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       4e60c5310cef94595018eedde2324061842bab56a5f5c45c7a404d25022bcd32
MD5          310e682fc43f5f52e9181488c8581712
BLAKE2b-256  057c6d66761a0ce6e1de42f64cac68e688d26e81f7f19e44fcde186e9b59d947

File details

Details for the file dataiku_utils-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: dataiku_utils-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 15.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for dataiku_utils-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       66dbd41640a08642ea25096c987670fc0a6b54b843076cee8a1446058179403a
MD5          423c67b04bca5e5412a414e4eb12953f
BLAKE2b-256  6dab36230f5f12ded9d86c96d1bf4728a0c2880f6ea203367218577eb98fa2ab
