A small example package

Project description

DataExtraction

Overview

This program extracts data from a Collibra instance into a SQL environment. In our case, we extract from Gore's instance of Collibra into EXL_MDSDev. The program extracts the following objects:

  • Assets
  • Attributes
  • Attribute Types
  • Domains
  • Communities
  • Relations
  • Relation Types
  • Responsibilities

Each of these object types is stored in its own SQL table.

Instructions for Use

Setup

The source executable file is kept in -- Artifact URL --. This folder contains a main.exe file and all of its dependencies. You will need to add a 'prod_config.yml' file to the root of this folder. This config file can be found in the project's source code and should be structured as follows:

API_CONFIG:
  limit: 1000000
AUTH:
  username: <Valid admin username in environment>
  password: <Valid admin password in environment>
  auth-header: <Auto-generated basic auth token, generated using postman>
ENVIRONMENT:
  gore: wlgore-<Environment Instance (dev,test,prod)>.collibra.com
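
The auth-header value is described above as a basic auth token generated with Postman. As a minimal sketch, a token of the same kind can be derived in Python, assuming the header follows the standard HTTP Basic scheme (base64 of "username:password"). The function name is hypothetical, and the source does not say whether the program expects the "Basic " prefix inside the config value.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return an HTTP Basic Authorization header value (hypothetical helper)."""
    # Standard Basic scheme: base64-encode "username:password".
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    # Assumption: the "Basic " prefix is included; drop it if the program adds it itself.
    return f"Basic {token}"


if __name__ == "__main__":
    # Placeholder credentials, not real ones.
    print(basic_auth_header("example_user", "example_password"))
```

Generating the value from the same username and password that go into the AUTH section helps keep the three fields consistent with each other.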

Running

Open a command prompt in the root of the project folder, type main.exe, and press Enter. The program will start running and log its progress. During the run, it extracts all data and overwrites the raw SQL tables.
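
To illustrate how the config values might feed into a single extraction request, here is a rough sketch that builds a request URL from the ENVIRONMENT host and the API_CONFIG limit. The "/rest/2.0/<resource>" path and the offset/limit query parameters are assumptions based on Collibra's REST API conventions, not details taken from this project's source, and build_request_url is a hypothetical name.

```python
from urllib.parse import urlencode


def build_request_url(host: str, resource: str, limit: int, offset: int = 0) -> str:
    """Build a paged request URL for one object type (illustrative only)."""
    # Assumption: resources live under /rest/2.0/ and accept offset/limit parameters.
    query = urlencode({"offset": offset, "limit": limit})
    return f"https://{host}/rest/2.0/{resource}?{query}"


if __name__ == "__main__":
    # Values mirror the example config: the dev host and limit: 1000000.
    print(build_request_url("wlgore-dev.collibra.com", "assets", 1000000))
```

With a limit as large as the one in the example config, one request per object type would likely return everything in a single page, which fits the "extract all data and overwrite" behavior described above.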

SQL Tables

The following SQL tables are created or overwritten during a run of this program:

  • collibra_assets_raw
  • collibra_attributes_raw
  • collibra_attribute_types_raw
  • collibra_communities_raw
  • collibra_domains_raw
  • collibra_relations_raw
  • collibra_relation_types_raw
  • collibra_responsibilities_raw

SQL Stored Procedures

The following stored procedures are run on the raw tables to manipulate the data and add batch IDs:

  • collibra.load_collibra_assets
  • collibra.load_collibra_attributes
  • collibra.load_collibra_attribute_types
  • collibra.load_collibra_communities
  • collibra.load_collibra_domains
  • collibra.load_collibra_relations
  • collibra.load_collibra_relation_types
  • collibra.load_collibra_responsibilities

All of these procedures can be run together by executing the collibra._load_entire_batch procedure.
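
The raw table names and stored procedure names share a common pattern, which can be sketched as follows. This is only an illustration: it assumes every name follows the collibra_<object>_raw and collibra.load_collibra_<object> patterns, that the target database accepts T-SQL style EXEC statements, and all helper names are hypothetical.

```python
# Object types in the order the raw tables are listed in this document.
OBJECT_TYPES = [
    "assets",
    "attributes",
    "attribute_types",
    "communities",
    "domains",
    "relations",
    "relation_types",
    "responsibilities",
]


def raw_table_name(object_type: str) -> str:
    """Name of the raw table that the extraction overwrites."""
    return f"collibra_{object_type}_raw"


def load_procedure_name(object_type: str) -> str:
    """Name of the stored procedure that processes one raw table."""
    return f"collibra.load_collibra_{object_type}"


def exec_statements() -> list:
    """One EXEC statement per object type (assumes T-SQL syntax)."""
    return [f"EXEC {load_procedure_name(t)};" for t in OBJECT_TYPES]


if __name__ == "__main__":
    for statement in exec_statements():
        print(statement)
    # Per the document, a single call covers all of the above.
    print("EXEC collibra._load_entire_batch;")
```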

Migrating from dev/test to prod

To change the environment in which the data extraction runs, the prod_config.yml file within the src folder will need to be changed. The username and password must be changed to those of a user in the new environment. Additionally, the gore environment variable must be changed to the prod instance's URL. Example of a correctly configured prod_config.yml file for the prod environment:

API_CONFIG:
  limit: 1000000
AUTH:
  username: <Valid admin username in PROD environment>
  password: <Valid admin password in PROD environment>
  auth-header: <Auto-generated basic auth token, generated using postman>
ENVIRONMENT:
  gore: wlgore-prod.collibra.com
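
Before running against a new environment, it can help to sanity-check the edited config. The sketch below is one possible check, assuming the YAML has already been parsed into a Python dict (for example with a YAML library) and that hosts follow the wlgore-<env>.collibra.com pattern shown above. The function and constant names are hypothetical and not part of the project.

```python
# Sections and keys that appear in the example prod_config.yml.
REQUIRED_KEYS = {
    "API_CONFIG": ["limit"],
    "AUTH": ["username", "password", "auth-header"],
    "ENVIRONMENT": ["gore"],
}


def config_problems(config: dict, expected_env: str) -> list:
    """Return a list of human-readable problems; an empty list means the config looks fine."""
    problems = []
    for section, keys in REQUIRED_KEYS.items():
        values = config.get(section) or {}
        for key in keys:
            if not values.get(key):
                problems.append(f"missing {section}.{key}")
    # Check that the host points at the environment we intend to run against.
    expected_host = f"wlgore-{expected_env}.collibra.com"
    host = (config.get("ENVIRONMENT") or {}).get("gore")
    if host and host != expected_host:
        problems.append(f"ENVIRONMENT.gore is {host}, expected {expected_host}")
    return problems


if __name__ == "__main__":
    example = {
        "API_CONFIG": {"limit": 1000000},
        "AUTH": {"username": "u", "password": "p", "auth-header": "token"},
        "ENVIRONMENT": {"gore": "wlgore-dev.collibra.com"},
    }
    # The host still points at dev, so one problem is reported for a prod run.
    print(config_problems(example, "prod"))
```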

Download files

Source Distribution

data-extraction-c.lynch278-0.0.7.tar.gz (8.7 kB)

Uploaded Source

Built Distribution

File details

Details for the file data-extraction-c.lynch278-0.0.7.tar.gz.

File metadata

  • Download URL: data-extraction-c.lynch278-0.0.7.tar.gz
  • Upload date:
  • Size: 8.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.26.0 requests-toolbelt/0.9.1 urllib3/1.26.6 tqdm/4.63.1 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.8.1

File hashes

Hashes for data-extraction-c.lynch278-0.0.7.tar.gz
Algorithm Hash digest
SHA256 8ac774b785e06a26d86bd6026c74c5cb45671a95434de80f1fe6a278572402ca
MD5 d48c6e1e3ce830f5a6ce1a90554df34c
BLAKE2b-256 b1ccb193c86b87cf1bf34d7dd8014500012c74f5b068b0c49c181399d205c9d1

File details

Details for the file data_extraction_c.lynch278-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: data_extraction_c.lynch278-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 8.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.26.0 requests-toolbelt/0.9.1 urllib3/1.26.6 tqdm/4.63.1 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.8.1

File hashes

Hashes for data_extraction_c.lynch278-0.0.7-py3-none-any.whl
Algorithm Hash digest
SHA256 911b2afd4a3e45b19d4b97732b82dcb1de03a8cf72fa7f985d126995e6c4eb8f
MD5 5fcd47f0dcf4b5abcfc03d6072c45255
BLAKE2b-256 affe25c4ab1d7ccb594cc447938b0998418fc2be78295e8619c959aaea23bd3b
