Facilitate data engineering using a combination of Databricks, dbt, and Azure Data Factory
Ingenii Data Engineering Package
Details
- Current Version: 0.1.0
Overview
This package provides utilities for data engineering on Ingenii's Azure Data Platform. It can be used for local development, and it is also used in the Ingenii Databricks Runtime.
Usage
Import the package to use the functions within.
import ingenii_data_engineering
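If the import fails, it helps to confirm that the package is visible to the interpreter you are actually running. The following is a minimal sketch using only the standard library. The helper name is_importable is our own for illustration and is not part of the package.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    # find_spec returns None when a top-level module is not installed
    # in the current environment.
    return importlib.util.find_spec(module_name) is not None


# Reports whether the package from this README is available.
print("ingenii_data_engineering importable:",
      is_importable("ingenii_data_engineering"))
```

A result of False usually means the virtual environment is not activated or the package has not been installed into it.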
dbt
See details of how we validate dbt schemas in the dbt README file
Pre-processing
See details of working with pre-processing functions in the pre-processing README file
Development
Prerequisites
- A working knowledge of git SCM
- Installation of Python 3.7.3
Set up
- Complete the 'Getting Started > Prerequisites' section
- For Windows only:
  - Go to ezwinports. This is required to be able to run 'make' commands.
  - Download make-4.2.1-without-guile-w32-bin.zip (get the version without guile).
  - Extract the zip and copy the contents to C:\ProgramFiles\Git\mingw64\, merging the folders, but do NOT overwrite/replace any existing files.
- Run 'make setup' to copy the .env into place (.env-dist > .env)
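The setup step is described above as copying .env-dist to .env. For environments where make is not available, the same copy can be sketched in Python. This assumes the target does nothing beyond that copy, since the Makefile itself is not shown here. The function name copy_env_file and the choice to keep an existing .env are our own assumptions.

```python
import shutil
from pathlib import Path


def copy_env_file(repo_root: str = ".", overwrite: bool = False) -> bool:
    """Copy .env-dist to .env inside repo_root.

    Returns True if a copy was made, False if .env was left untouched.
    """
    source = Path(repo_root) / ".env-dist"
    target = Path(repo_root) / ".env"
    if not source.is_file():
        raise FileNotFoundError(f"{source} does not exist")
    # Assumption: an existing .env may hold local edits, so keep it
    # unless the caller explicitly asks to replace it.
    if target.exists() and not overwrite:
        return False
    shutil.copyfile(source, target)
    return True
```

Calling copy_env_file() from the root of the repository mirrors the described behaviour. Pass overwrite=True to reset .env from the template.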
Getting started
- Complete the 'Getting Started > Set up' section
- From the root of the repository, in a terminal (preferably in your IDE), run the following commands to set up a virtual environment:
    python -m venv venv
    . venv/bin/activate
    pip install -r requirements-dev.txt
    pre-commit install
  or for Windows:
    python -m venv venv
    . venv/Scripts/activate
    pip install -r requirements-dev.txt
    pre-commit install
- Note: if you get a 'permission denied' error when executing the 'pre-commit install' command, you'll need to run 'chmod -R 775 venv/bin/' to recursively update permissions in the venv/bin/ dir
- The following checks are run as part of pre-commit hooks: flake8 (note unit tests are not run as a hook)
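After activating the virtual environment, a quick check can confirm that the interpreter in use is the one inside venv and that it is recent enough. This is a small sketch using only the standard library. The function names are our own, and the version comparison is a minimum check rather than an exact match on 3.7.3.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def python_at_least(major: int, minor: int) -> bool:
    """Return True if the running Python is at least major.minor."""
    return sys.version_info >= (major, minor)


print("inside a virtual environment:", in_virtualenv())
print("Python 3.7 or newer:", python_at_least(3, 7))
```

If the first line reports False, the activate step was probably run in a different terminal session.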
Building
- Complete the 'Getting Started > Set up' section
- Run 'make build' to create the package in ./dist
- Run 'make clean' to remove dist files
Testing
- Complete the 'Getting Started > Set up' and 'Development' sections
- Run 'make test' to run the unit tests using pytest
- Run 'flake8' to run lint checks using flake8
- Run 'make qa' to run the unit tests and linting in a single command
- Run 'make clean' to remove pytest files
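Where make is not available, the combined check can be approximated from Python. This is a rough sketch that assumes the combined target simply runs pytest and then flake8, as the descriptions above suggest; the Makefile is not shown here, so the real target may do more. The names run_checks and QA_COMMANDS are our own.

```python
import subprocess
import sys
from typing import List, Sequence

# Assumption: the unit tests and lint checks are plain pytest and flake8
# runs, invoked through the current interpreter.
QA_COMMANDS = [
    [sys.executable, "-m", "pytest"],
    [sys.executable, "-m", "flake8"],
]


def run_checks(commands: Sequence[Sequence[str]]) -> List[int]:
    """Run each command in order and return the exit codes."""
    # Every command runs even if an earlier one fails, so a single
    # pass reports both test and lint problems.
    return [subprocess.run(list(cmd)).returncode for cmd in commands]
```

Calling run_checks(QA_COMMANDS) from the root of the repository returns one exit code per command, and any non-zero value indicates a failure.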
Version History
0.1.0: dbt schema validation, pre-processing class
Hashes for ingenii_data_engineering-0.1.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | d1e996363ff244532a3ee3d8ec250c73aedfaf65a069adeb15bea7d62da95d58
MD5 | 6e79f9442089995745bda600932432ab
BLAKE2b-256 | 7ead0b87cc5e4559ef93ae2609a08863eef2d543149ff70f095efa524e061ed7
Hashes for ingenii_data_engineering-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 357ddf0d2fdc3575ce89c81c53db31cf3a1702e460a1474b65ffe5fdbc67ca4d
MD5 | 2327bad5265508197341beb7ff3eda5b
BLAKE2b-256 | 0106f0fd52399864c7d5b23deb69ac8d6a9d32bd2bbe24a7502c399c310cd0b0
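The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using hashlib from the standard library. The function names and the local file path in the usage note are our own assumptions.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA256 hex digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, with the wheel saved in the current directory, matches_sha256("ingenii_data_engineering-0.1.0-py3-none-any.whl", "357ddf0d2fdc3575ce89c81c53db31cf3a1702e460a1474b65ffe5fdbc67ca4d") should return True for an intact download.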