An alternative approach utility for data warehouse DevOps for Snowflake - limited usage.

Development Status
- 4 - Beta
Environment
- Console
Intended Audience
- Developers
License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
- Python :: 3
- Python :: Implementation

Project description

Hashmap DataDefinitionOps

An alternative approach utility for data warehouse DevOps for Snowflake

Database DevOps has been an ongoing challenge, with few players compared to other areas such as data synchronization, data validation, and data governance. On every engagement, customers ask us to define and implement DevOps for their data warehouse (Snowflake) implementation. In general, database DevOps has involved quite a bit of complexity and ongoing tweaking to get right. Several tools are available in the market today, including:

- Sqitch
- Snowchange
- Redgate
- and others

Most of these are imperative in style. In an earlier article, "Don’t Do Analytics Engineering in Snowflake Until You Read This", I described a declarative approach that has since been implemented for multiple clients.

Of late, though, I have realized that in a typical development environment clients end up using tools like Erwin or Squirrel, which create the tables directly in the development database. Not all analysts can keep track of the latest changes in their Snowflake development database and inform the developer of what needs to be promoted. Tools like Sqitch and Flyway rely on you writing the script yourself and knowing what needs to be promoted. They also maintain deployment history by storing the state of each deployment in a set of tables in the database or a schema. Apart from the naming of the script, there is no clear indication of what is actually being deployed, for example whether a new table is being deployed or a new stage is being created. You will also still need to use Git to store the code as a safe practice.

These and other factors got me thinking about an alternative approach to database DevOps. I wanted to start from the following premises:

- It does not matter how the tables, views, and other database objects were created; we should be able to identify what has changed since the last deployment.
- Git is not just version control for code; it can also store state and track state changes across multiple releases. It is widely adopted in software engineering, and there are well-defined, documented approaches for reverting changes and so on.
- Almost every database has some form of metadata database or schema (INFORMATION_SCHEMA in the case of Snowflake) that reflects the current state of the objects defined in it.
- Using information from the metadata, we can technically reconstruct an object. Tools like Squirrel generate DDL scripts at any time using this approach.
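
To make the idea of comparing states concrete, here is a minimal sketch of diffing two metadata snapshots. It is not the tool's actual code: the snapshot shape (table name mapped to column names and data types), the function name `diff_snapshots`, and the sample data are all assumptions made for illustration. In practice the "previous" snapshot would be the state stored in Git and the "current" one would be read from INFORMATION_SCHEMA.

```python
from typing import Dict

# Hypothetical snapshot shape: {table_name: {column_name: data_type}}.
Snapshot = Dict[str, Dict[str, str]]


def diff_snapshots(previous: Snapshot, current: Snapshot) -> dict:
    """Compare two metadata snapshots and report additions and modifications."""
    changes = {"added_tables": [], "added_columns": [], "modified_columns": []}
    for table, columns in sorted(current.items()):
        if table not in previous:
            # The whole table is new since the last recorded state.
            changes["added_tables"].append(table)
            continue
        old_columns = previous[table]
        for column, data_type in sorted(columns.items()):
            if column not in old_columns:
                changes["added_columns"].append((table, column, data_type))
            elif old_columns[column] != data_type:
                changes["modified_columns"].append(
                    (table, column, old_columns[column], data_type)
                )
    return changes


# State recorded at the last deployment (illustrative data only).
last_deployed = {"ORDERS": {"ID": "NUMBER", "AMOUNT": "NUMBER(10,2)"}}
# State of the development database now (illustrative data only).
dev_now = {
    "ORDERS": {"ID": "NUMBER", "AMOUNT": "NUMBER(12,2)", "STATUS": "VARCHAR"},
    "CUSTOMERS": {"ID": "NUMBER"},
}

changes = diff_snapshots(last_deployed, dev_now)
print(changes)
```

The sketch only looks at additions and type changes; dropped tables or columns are deliberately ignored here to keep it short.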

Sounds compelling? If I have piqued your interest, what if I told you that the above is possible, and that I am sharing an alpha version of ‘Hashmap DataDefinitionOps’ with you?

Head over to our Git repo for documentation.

Features

These are the currently implemented features:

- Specific implementation for Snowflake
- Identify what has been added since the last deployment or a fresh start
- Detect whether a column has been added or modified
- Templatized Jinja scripts
- Apply to multiple targets without any modifications
- Dry-run script generated for a specific target
- Store state in Git for future deployments
- DDL generated for:
  - Schema
  - Table (create & alter)
  - View
  - Functions
  - Procedures
  - Grants
- Modify generated scripts with additional Snowflake-specific capabilities
- Substitute environment variables in generated DDL scripts
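
As a rough illustration of the last feature, the sketch below substitutes environment-specific values into a DDL script so the same script can be rendered for different targets. It is not the tool's implementation: it uses Python's standard `string.Template` rather than Jinja, and the function name `render_ddl`, the placeholder names `TARGET_DB` and `TARGET_SCHEMA`, and the table definition are all hypothetical.

```python
import os
from string import Template
from typing import Mapping, Optional

# Hypothetical DDL script with placeholders for the deployment target.
CREATE_ORDERS = (
    "CREATE TABLE IF NOT EXISTS ${TARGET_DB}.${TARGET_SCHEMA}.ORDERS "
    "(ID NUMBER, AMOUNT NUMBER(12,2));"
)


def render_ddl(script: str, variables: Optional[Mapping[str, str]] = None) -> str:
    """Fill placeholders from the given mapping, or from the process environment."""
    values = os.environ if variables is None else variables
    # substitute() raises KeyError if a placeholder has no value, so a
    # half-rendered script is never produced.
    return Template(script).substitute(values)


# Render the same script for two targets without modifying it.
dev_sql = render_ddl(CREATE_ORDERS, {"TARGET_DB": "DEV_DB", "TARGET_SCHEMA": "SALES"})
prod_sql = render_ddl(CREATE_ORDERS, {"TARGET_DB": "PROD_DB", "TARGET_SCHEMA": "SALES"})

# A dry run in this sketch simply prints the script instead of executing it.
print(dev_sql)
print(prod_sql)
```

Calling `render_ddl(CREATE_ORDERS)` with no mapping reads the values from the environment instead, which is closer to how a deployment pipeline would typically supply them.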

Release history

- 0.0.0.28 (Feb 12, 2021), this version
- 0.0.0.27 (Feb 11, 2021)
- 0.0.0.26 (Feb 11, 2021)
- 0.0.0.25 (Feb 11, 2021)
- 0.0.0.24 (Feb 8, 2021)

Download files

Source Distribution

hashmap-data-definitionOps-0.0.0.28.tar.gz (22.3 kB)

Uploaded Feb 12, 2021 Source

File details

Details for the file hashmap-data-definitionOps-0.0.0.28.tar.gz.

File metadata

Download URL: hashmap-data-definitionOps-0.0.0.28.tar.gz
Upload date: Feb 12, 2021
Size: 22.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/47.1.1 requests-toolbelt/0.9.1 tqdm/4.56.2 CPython/3.7.7

File hashes

Hashes for hashmap-data-definitionOps-0.0.0.28.tar.gz
Algorithm	Hash digest
SHA256	`618f588757775bd02643948072c1d7ce753cb08675f31f0a33ab07ec0c4654d2`
MD5	`c8a2b4a1815d2acda93291431791e8d7`
BLAKE2b-256	`be3396cfb468150503027a6aa9695fc392c54398b7386a4979a6b8bd8caa7c7f`
