
Orchestration service for SQL-only ETL workflows.


Why SQLizer

In many cases, SQL alone is enough to build ETL (extract/transform/load) pipelines, relying on CTAS (CREATE TABLE AS) queries and the built-in import/export features of your RDBMS or data warehouse software (e.g. Redshift).
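As a rough illustration of a CTAS-based transform step, the sketch below uses Python's built-in sqlite3 module as a stand-in for a warehouse such as Redshift. The table names (raw_orders, customer_totals) and the sample rows are made up for the example and are not part of SQLizer.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; all table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- "Extract": in a real pipeline this table would be filled by the
    -- warehouse's own import feature rather than by INSERT statements.
    CREATE TABLE raw_orders (customer TEXT, amount REAL);
    INSERT INTO raw_orders VALUES ('alice', 10.0), ('bob', 5.0), ('alice', 2.5);

    -- "Transform": the whole step is a single CTAS query.
    CREATE TABLE customer_totals AS
    SELECT customer, SUM(amount) AS total
    FROM raw_orders
    GROUP BY customer;
    """
)

totals = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(totals)
conn.close()
```

Each stage materializes its result as a new table, so later stages only need SQL to pick up where the previous one left off.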

What is SQLizer

A simple orchestration service for SQL-only ETL workflows. This service was born out of a need to orchestrate a complete data processing pipeline on top of AWS Redshift.


[x] PostgreSQL/Redshift support
[x] Executing multiple queries from a folder
[ ] Executing a named query
[ ] Executing an inline query
[ ] MySQL/Aurora support
[ ] MongoDB support
[ ] Parallel execution of queries in one stage
[ ] Validation of the workflow
[ ] DAG for stages
[ ] Multi-connection support
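To give a feel for what "executing multiple queries from a folder" can look like, here is a minimal sketch that runs every .sql file in a directory in file-name order. It is not SQLizer's actual implementation: the run_sql_folder helper, the file names, and the use of sqlite3 instead of PostgreSQL/Redshift are all assumptions made to keep the example self-contained.

```python
import sqlite3
import tempfile
from pathlib import Path


def run_sql_folder(conn, folder):
    """Execute every *.sql file in `folder`, in file-name order (hypothetical helper)."""
    executed = []
    for path in sorted(Path(folder).glob("*.sql")):
        conn.executescript(path.read_text())
        executed.append(path.name)
    return executed


# Build a throwaway "stage" folder with two numbered query files.
with tempfile.TemporaryDirectory() as stage_dir:
    Path(stage_dir, "01_extract.sql").write_text(
        "CREATE TABLE src AS SELECT 1 AS id UNION ALL SELECT 2;"
    )
    Path(stage_dir, "02_transform.sql").write_text(
        "CREATE TABLE doubled AS SELECT id * 2 AS id2 FROM src;"
    )

    conn = sqlite3.connect(":memory:")
    executed = run_sql_folder(conn, stage_dir)
    rows = conn.execute("SELECT id2 FROM doubled ORDER BY id2").fetchall()
    conn.close()

print(executed, rows)
```

Numeric file-name prefixes are one simple way to make the execution order explicit; the sketch assumes that convention rather than taking it from SQLizer.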

Developing SQLizer

Setting up the development environment

python3 -m venv ./.venv
echo ".venv/" >> .gitignore
source .venv/bin/activate
pip install -e .

Optionally install development/test dependencies:

pip install pytest pytest-runner codecov pytest-cov recommonmark

Prepare the docker image (and test it):

docker build -t sqlizer .
docker run --rm --name sqlizer-runner -e "job_id=sqlizer" -e "bucket=sss" sqlizer

Prepare test data:

aws s3 mb s3://sqlizer-workflows --profile your-profile
aws s3 sync ~/Code/sqlizer/test-data/ s3://sqlizer-workflows --profile your-profile

Add parameters to the Systems Manager's Parameter Store:

aws ssm put-parameter --overwrite --name sqlizer.default.auth --value user:password --type SecureString --description "authentication details for data-source" --profile your-profile
aws ssm put-parameter --overwrite --name --value "" --type SecureString --description "url access for default data source" --profile your-profile
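The two parameters above hold a user:password pair and a data-source URL. How SQLizer combines them is not documented here, so the following is only a sketch of one plausible approach: splice the credentials into the URL using the standard library. The with_credentials helper, the example host, and the sample password are hypothetical, and fetching the values from the Parameter Store is left out.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(url, auth):
    """Insert a "user:password" pair into a connection URL (hypothetical helper)."""
    # Split on the first colon only, so passwords may themselves contain colons.
    user, _, password = auth.partition(":")
    parts = urlsplit(url)
    # Percent-encode both fields so characters like "@" cannot break the URL.
    netloc = f"{quote(user, safe='')}:{quote(password, safe='')}@{parts.hostname}"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Example values only; real ones would come from the SecureString parameters.
dsn = with_credentials("postgresql://db.example.com:5439/analytics", "user:p@ss")
print(dsn)
```

Keeping the credentials in a separate SecureString parameter means the URL itself can be logged or shared without exposing the password.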

Run it locally:

export AWS_PROFILE=your-profile
#sqlizer --connection-url="" --bucket="s3://sqlizer-workflows"

Prepare the distribution:

pip install -U setuptools wheel
python setup.py build -vf && python setup.py bdist_wheel
pip install -U twine


Built Distribution

sqlizer-0.0.1-py3-none-any.whl (11.5 kB)
