GoodData Cloud lifecycle automation pipelines

GoodData Pipelines

A high-level library for automating the lifecycle of GoodData Cloud (GDC).

You can use the package to manage the following resources in GDC:

  1. Provisioning (create, update, delete)
    • User profiles
    • User Groups
    • User/Group permissions
    • User Data Filters
    • Child workspaces (incl. Workspace Data Filter settings)
  2. Backup and restore of workspaces
    • Create snapshots of workspace metadata and store them in local storage, AWS S3, or Azure Blob Storage
  3. LDM Extension
    • Extend the Logical Data Model of a child workspace with custom datasets and fields

If you would rather use a ready-made script than incorporate a library into your own program, have a look at GoodData Productivity Tools.

Provisioning

Entities can be managed in either full load or incremental mode.

Full load means that the input data represents the complete desired state of GDC once the script has finished. For example, the input data for workspace provisioning would specify every child workspace you want to exist in GDC. Any workspace that is present in GDC but not defined in the source data (i.e., your input) will be deleted.

Incremental load, on the other hand, treats the source data as instructions for specific changes, e.g., the creation or deletion of a particular workspace. You specify which workspaces to create or delete; all other workspaces already present in GDC remain as they are and are ignored by the provisioning script.
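
The difference between the two modes can be illustrated with plain Python sets. The sketch below is not part of gooddata_pipelines: the `Plan` class, the functions `plan_full_load` and `plan_incremental_load`, the boolean "should exist" flag, and the workspace IDs are all hypothetical and only model the semantics described above. Updates to existing workspaces are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Workspace IDs to create and delete (illustrative only)."""

    create: frozenset[str]
    delete: frozenset[str]


def plan_full_load(existing: set[str], desired: set[str]) -> Plan:
    # Full load: the input is the complete desired state, so anything
    # that exists but is not listed in the input gets deleted.
    return Plan(
        create=frozenset(desired - existing),
        delete=frozenset(existing - desired),
    )


def plan_incremental_load(existing: set[str], changes: dict[str, bool]) -> Plan:
    # Incremental load: each entry is an instruction (True = should exist,
    # False = should be removed). Workspaces not mentioned stay untouched.
    create = {ws for ws, keep in changes.items() if keep and ws not in existing}
    delete = {ws for ws, keep in changes.items() if not keep and ws in existing}
    return Plan(create=frozenset(create), delete=frozenset(delete))


existing = {"ws_a", "ws_b", "ws_c"}

# Full load: ws_b and ws_c are absent from the input, so both are deleted.
full = plan_full_load(existing, desired={"ws_a", "ws_d"})

# Incremental load: only ws_d and ws_b are touched; ws_a and ws_c remain.
incremental = plan_incremental_load(existing, changes={"ws_d": True, "ws_b": False})
```

Running the same kind of comparison against your real input before a full load is one way to see which entities would be deleted.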

The provisioning module exposes Provisioner classes for the different entities. Typical usage involves importing the Provisioner class together with the input data model that matches the entity and the planned provisioning method (full load or incremental):

import os
import logging

from csv import DictReader
from pathlib import Path

# Import the Entity Provisioner class and corresponding model from the gooddata_pipelines library
from gooddata_pipelines import UserFullLoad, UserProvisioner

# Create the Provisioner instance - you can also create the instance from a GDC yaml profile
provisioner = UserProvisioner(
    host=os.environ["GDC_HOSTNAME"], token=os.environ["GDC_AUTH_TOKEN"]
)

# Optional: set up logging and subscribe to logs emitted by the provisioner
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
provisioner.logger.subscribe(logger)

# Load your data from your data source
# Opening the file with newline="" lets the csv module handle quoted line breaks
source_data_path: Path = Path("path/to/some.csv")
with source_data_path.open(newline="") as source_file:
    source_data = list(DictReader(source_file))

# Validate your input data
full_load_data: list[UserFullLoad] = UserFullLoad.from_list_of_dicts(
    source_data
)

# Run the provisioning
provisioner.full_load(full_load_data)
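
Because a full load deletes anything missing from the input, it can help to check the CSV before it reaches the validation step. The following is a minimal sketch using only the standard library; the column names in `REQUIRED_COLUMNS` and the helper `load_rows` are hypothetical and should be replaced with the fields your chosen model expects.

```python
import csv
import io

# Hypothetical column names; adjust to the fields of the model you use.
REQUIRED_COLUMNS = {"user_id", "email"}


def load_rows(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text and fail early if a required column is missing."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    # fieldnames is None for empty input, hence the fallback to an empty list
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    return list(reader)


sample = "user_id,email\njane,jane@example.com\njohn,john@example.com\n"
rows = load_rows(sample)
```

The resulting list of dictionaries has the same shape as `source_data` in the example above, so it could be passed on to the model's validation in the same way.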

Ready-made scripts covering the basic use cases can be found in the GoodData Productivity Tools repository.

Backup and Restore of Workspaces

The backup and restore module allows you to create snapshots of GoodData Cloud workspaces and restore them later. Backups can be stored locally, in AWS S3, or in Azure Blob Storage.

import os

from gooddata_pipelines import BackupManager
from gooddata_pipelines.backup_and_restore.models.storage import (
    BackupRestoreConfig,
    LocalStorageConfig,
    StorageType,
)

# Configure backup storage
config = BackupRestoreConfig(
    storage_type=StorageType.LOCAL,
    storage=LocalStorageConfig(),
)

# Create the BackupManager instance
backup_manager = BackupManager.create(
    config=config,
    host=os.environ["GDC_HOSTNAME"],
    token=os.environ["GDC_AUTH_TOKEN"]
)

# Backup specific workspaces
backup_manager.backup_workspaces(workspace_ids=["workspace1", "workspace2"])

# Backup workspace hierarchies (workspace + all children)
backup_manager.backup_hierarchies(workspace_ids=["parent_workspace"])

# Backup entire organization
backup_manager.backup_entire_organization()

For S3 or Azure Blob Storage, configure the appropriate storage type and credentials in BackupRestoreConfig.

Bugs & Requests

Please use the GitHub issue tracker to submit bugs or request features.

Changelog

See GitHub releases for released versions and a list of changes.

Download files

Source Distribution

gooddata_pipelines-1.65.1.dev3.tar.gz (100.9 kB)

Uploaded Source

Built Distribution

gooddata_pipelines-1.65.1.dev3-py3-none-any.whl (100.9 kB)

Uploaded Python 3

File details

Details for the file gooddata_pipelines-1.65.1.dev3.tar.gz.

File metadata

  • Download URL: gooddata_pipelines-1.65.1.dev3.tar.gz
  • Upload date:
  • Size: 100.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gooddata_pipelines-1.65.1.dev3.tar.gz
Algorithm Hash digest
SHA256 24240f3cdd3d1b0d1402aa67383d6dbf32097c2a2c1692d0898b3cb4aef6b3a3
MD5 f1673df8c2df1d9566f64f0c3a3e2011
BLAKE2b-256 5e0d46eef2cda3985bfeb53e133dd9bb2d3c988ec70801ae4b5003eafc8cd01c

Provenance

The following attestation bundles were made for gooddata_pipelines-1.65.1.dev3.tar.gz:

Publisher: dev-release.yaml on gooddata/gooddata-python-sdk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gooddata_pipelines-1.65.1.dev3-py3-none-any.whl.

File hashes

Hashes for gooddata_pipelines-1.65.1.dev3-py3-none-any.whl
Algorithm Hash digest
SHA256 b6f8473b46f261f5f3a1460dbeff5061500d5188b9ae422baf33bf5d63ea6ae7
MD5 1dc27936fcd7f0ed321ae6b3e99a54a7
BLAKE2b-256 f13bc16141dad6528b92d22d3d54469614fe1358f0c3660cbad4712d39625e62

Provenance

The following attestation bundles were made for gooddata_pipelines-1.65.1.dev3-py3-none-any.whl:

Publisher: dev-release.yaml on gooddata/gooddata-python-sdk
