Orchestration service for SQL-only ETL workflows.
Why SQLizer
In many cases, SQL alone is enough for ETL (extract/transform/load) pipelines: you can rely on CTAS (CREATE TABLE AS) queries and the built-in import/export features of your RDBMS or data warehouse software (e.g. Redshift).
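As a rough illustration of the CTAS idea, the sketch below does the transform and load steps in a single CREATE TABLE AS statement. It assumes an in-memory SQLite database as a stand-in for Redshift, and the table and column names (`raw_orders`, `customer_totals`) are made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; this is an illustration only.
conn = sqlite3.connect(":memory:")

# "Extract": a hypothetical raw table with a few rows.
conn.execute("CREATE TABLE raw_orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [("alice", 10.0), ("bob", 5.0), ("alice", 2.5)],
)

# "Transform" and "load" in one CTAS statement.
conn.execute(
    """
    CREATE TABLE customer_totals AS
    SELECT customer, SUM(amount) AS total
    FROM raw_orders
    GROUP BY customer
    """
)

totals = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(totals)
```

On a real warehouse the same statement shape applies, with the engine's own import/export commands handling the extract and final export steps.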
What is SQLizer
A simple orchestration service for SQL-only ETL workflows. This service was born out of a need to orchestrate a complete data processing pipeline on top of AWS Redshift.
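To give a feel for what orchestrating SQL-only steps can look like, here is a minimal sketch that runs every `.sql` file in a folder in file-name order. It is not SQLizer's actual implementation: the `run_sql_folder` helper, the file names, and the use of SQLite instead of Redshift are all assumptions made for the example.

```python
import sqlite3
import tempfile
from pathlib import Path


def run_sql_folder(conn, folder):
    """Execute every *.sql file in `folder` in file-name order; return the names run."""
    executed = []
    for path in sorted(Path(folder).glob("*.sql")):
        conn.executescript(path.read_text())
        executed.append(path.name)
    conn.commit()
    return executed


with tempfile.TemporaryDirectory() as workdir:
    # Two hypothetical workflow steps; the numeric prefix fixes the order.
    Path(workdir, "01_create.sql").write_text(
        "CREATE TABLE events (kind TEXT);"
        "INSERT INTO events VALUES ('click'), ('view'), ('click');"
    )
    Path(workdir, "02_aggregate.sql").write_text(
        "CREATE TABLE event_counts AS "
        "SELECT kind, COUNT(*) AS n FROM events GROUP BY kind;"
    )

    conn = sqlite3.connect(":memory:")  # stand-in for the warehouse connection
    executed = run_sql_folder(conn, workdir)
    counts = conn.execute(
        "SELECT kind, n FROM event_counts ORDER BY kind"
    ).fetchall()
    conn.close()
```

Sorting by file name keeps the ordering explicit and easy to review, which is one simple way to sequence steps when no dependency graph is available.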
Roadmap
[x] PostgreSQL/Redshift support
[x] Executing multiple queries from a folder
[ ] Executing a named query
[ ] Executing an inline query
[ ] MySQL/Aurora support
[ ] MongoDB support
[ ] Parallel execution of queries in one stage
[ ] Validation of the workflow
[ ] DAG for stages
[ ] Multi-connection support
Developing SQLizer
Setting up the development environment
python3 -m venv ./.venv
echo ".venv/" >> .gitignore
source .venv/bin/activate
pip install -e .
Optionally install development/test dependencies:
pip install pytest pytest-runner codecov pytest-cov recommonmark
Prepare the Docker image (and test it):
docker build -t sqlizer .
docker run --rm --name sqlizer-runner -e "job_id=sqlizer" -e "bucket=sss" sqlizer
Prepare test data:
aws s3 mb s3://sqlizer-workflows --profile your-profile
aws s3 sync ~/Code/sqlizer/test-data/ s3://sqlizer-workflows --profile your-profile
Add parameters to the Systems Manager's Parameter Store:
aws ssm put-parameter --overwrite --name sqlizer.default.auth --value user:password --type SecureString --description "authentication details for data-source" --profile your-profile
aws ssm put-parameter --overwrite --name sqlizer.default.host --value "some-cluster.redshift.amazonaws.com:5439/database" --type SecureString --description "url access for default data source" --profile your-profile
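The two parameter values above follow the shapes `user:password` and `host:port/database`. The sketch below shows one way such strings could be split and joined into a single connection URL of the `user:password@host:port/database` form. How SQLizer itself reads these parameters is not shown here, so `parse_auth` and `parse_host` are hypothetical helpers, and the sketch assumes the port is always present.

```python
def parse_auth(value):
    """Split 'user:password' at the first colon (the password may contain colons)."""
    user, _, password = value.partition(":")
    return user, password


def parse_host(value):
    """Split 'host:port/database' into (host, port, database)."""
    address, _, database = value.partition("/")
    host, _, port = address.rpartition(":")
    return host, int(port), database


# Placeholder values copied from the put-parameter commands above.
user, password = parse_auth("user:password")
host, port, database = parse_host(
    "some-cluster.redshift.amazonaws.com:5439/database"
)

connection_url = f"{user}:{password}@{host}:{port}/{database}"
print(connection_url)
```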
Run it locally:
export AWS_PROFILE=your-profile
#sqlizer --connection-url="root:some_secret_pass@some-cluster.redshift.amazonaws.com:5439/database" --bucket="s3://sqlizer-workflows"
sqlizer
Prepare the distribution:
pip install -U setuptools wheel
python setup.py build -vf && python setup.py bdist_wheel
pip install -U twine