Simple Python ETL tool.
DOMETL (Python ETL Tool)
Dometl is a Python ETL package.
Process
- Init - Initializes the database
dometl -t init
- Stage - Moves files into staging tables
dometl -t stage
- Live - Runs transformations to move data from staging to live tables
dometl -t live
- Test - Runs very simple tests on the data
dometl -t test
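The four steps are meant to run in the order listed. As a rough sketch, the helper below only assembles the command line for each step and executes nothing. `build_dometl_command` and `STEPS` are hypothetical names that are not part of dometl; the flags (`-t`, `-cp`, `-tb`, `-ep`) are the ones used in the run commands of this README.

```python
from typing import List, Optional

# Step names from the Process list, in the order they are meant to run.
STEPS = ("init", "stage", "live", "test")


def build_dometl_command(
    step: str,
    config_path: str,
    table: Optional[str] = None,
    extract_path: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for one dometl step (hypothetical helper)."""
    if step not in STEPS:
        raise ValueError(f"unknown step: {step!r}")
    command = ["dometl", "-t", step]
    if extract_path is not None:
        # -ep: file or folder to load during the stage step
        command += ["-ep", extract_path]
    if table is not None:
        # -tb: table the step works on
        command += ["-tb", table]
    # -cp: path to the configuration folder
    command += ["-cp", config_path]
    return command


# Print the commands for a full run on one table; nothing is executed here.
for step_name in STEPS:
    needs_table = step_name != "init"
    print(" ".join(build_dometl_command(
        step_name,
        "dometl_config",
        table="game" if needs_table else None,
    )))
```

The returned list could be handed to `subprocess.run` once the package is installed; that part is left out because it depends on the local setup.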
How to Install & Run the Package?
Run the initialization step
dometl -t init -cp dometl_config
# if you don't install the package
# python -c "from dometl import run_dometl; run_dometl()" -t init -cp dometl_config
Run the staging step
dometl -t stage -ep datasets\\game_data\\daily\\20221105_g.csv -tb ST_GAME -cp dometl_config
# if you don't install the package
# python -c "from dometl import run_dometl; run_dometl()" -t stage -ep datasets\\game_data\\daily\\20221105_g.csv -tb st_game -cp dometl_config
# python -c "from dometl import run_dometl; run_dometl()" -t stage -ep datasets\\game_data\\seasons -tb st_game -cp dometl_config
Run the live step
dometl -t live -tb game -cp dometl_config
# if you don't install the package
# python -c "from dometl import run_dometl; run_dometl()" -t live -tb game -cp dometl_config
Run the test step
dometl -t test -tb game -cp dometl_config
# if you don't install the package
# python -c "from dometl import run_dometl; run_dometl()" -t test -tb game -cp dometl_config
The simple testing consists of test queries, which are listed in the config.yaml file as shown below
tests:
table_name: ["some_test.sql", "other_test.sql"]
Each table can have a set of test queries. A query must be written so that it returns 0 rows when the test passes; if it returns one or more rows, the test fails. Ideally, the returned rows should help you find the root cause of the failure.
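To illustrate the zero-rows rule, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for the real database. It is not dometl's own implementation; the `game` table, its columns, the two queries, and the `run_test_query` helper are all hypothetical.

```python
import sqlite3


def run_test_query(connection: sqlite3.Connection, query: str) -> bool:
    """Return True when the query returns no rows, i.e. the test passes."""
    rows = connection.execute(query).fetchall()
    for row in rows:
        # Returned rows are the offending records; show them to aid debugging.
        print("failing row:", row)
    return len(rows) == 0


# Hypothetical live table with one duplicated game_id.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE game (game_id INTEGER, home_score INTEGER)")
connection.executemany(
    "INSERT INTO game VALUES (?, ?)", [(1, 3), (2, 0), (2, 5)]
)

# Hypothetical test: no game_id may appear more than once.
duplicate_ids = """
    SELECT game_id, COUNT(*) AS occurrences
    FROM game
    GROUP BY game_id
    HAVING COUNT(*) > 1
"""

# Hypothetical test: scores must not be negative.
negative_scores = "SELECT * FROM game WHERE home_score < 0"

duplicates_ok = run_test_query(connection, duplicate_ids)
scores_ok = run_test_query(connection, negative_scores)
print("duplicate_ids passed:", duplicates_ok)
print("negative_scores passed:", scores_ok)
```

The duplicate check returns the offending `game_id` together with its count, which is the kind of row that points directly at the cause of a failure.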
Configuration Folder
\folder
config.yaml # structure defined below
db_create.sql # custom file which creates and initializes the db
file1.sql # custom SQL file
file2.sql # custom SQL file
file3.sql # custom SQL file
file4.sql # custom SQL file
file5.sql # custom SQL file
Structure for config.yaml
credentials_path: "path/to/creds.yaml"
init_order: [
"db_create.sql",
"file1.sql",
"file2.sql",
]
etl:
table_name_1: "file3.sql"
table_name_2: "file4.sql"
table_name_3: "file5.sql"
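Since config.yaml refers to SQL files by name, a quick check that every referenced file exists can catch typos early. The sketch below works on an already parsed config (a plain dict mirroring the structure above) and assumes that file names are resolved relative to the configuration folder, which this README does not state explicitly. `referenced_sql_files` and `missing_sql_files` are hypothetical helpers, not part of dometl.

```python
from pathlib import Path
from typing import Any, Dict, List

# A dict mirroring the config.yaml structure shown above (already parsed).
example_config: Dict[str, Any] = {
    "credentials_path": "path/to/creds.yaml",
    "init_order": ["db_create.sql", "file1.sql", "file2.sql"],
    "etl": {
        "table_name_1": "file3.sql",
        "table_name_2": "file4.sql",
        "table_name_3": "file5.sql",
    },
    "tests": {"table_name_1": ["some_test.sql"]},
}


def referenced_sql_files(config: Dict[str, Any]) -> List[str]:
    """Collect every SQL file name mentioned in init_order, etl and tests."""
    files = list(config.get("init_order", []))
    files += list(config.get("etl", {}).values())
    for test_files in config.get("tests", {}).values():
        files += list(test_files)
    return files


def missing_sql_files(config: Dict[str, Any], folder: Path) -> List[str]:
    """Return the referenced files that do not exist in the folder, sorted."""
    # Assumption: names are relative to the configuration folder.
    return sorted(
        {name for name in referenced_sql_files(config)
         if not (folder / name).is_file()}
    )
```

Calling `missing_sql_files(example_config, Path("dometl_config"))` returns an empty list when the folder is complete.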
Structure for the creds.yaml
db_credentials:
username: ""
password: ""
hostname: ""
port: ""
db_name: ""
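As an illustration of how these five fields fit together, the sketch below turns a parsed `db_credentials` mapping into a connection URL. It assumes a PostgreSQL database (suggested by the psql commands in this README, but not confirmed) and the common `postgresql://` URL form; dometl itself may connect differently. `build_postgres_url` is a hypothetical helper.

```python
from typing import Dict
from urllib.parse import quote

# The five keys from the creds.yaml structure above.
REQUIRED_KEYS = ("username", "password", "hostname", "port", "db_name")


def build_postgres_url(db_credentials: Dict[str, str]) -> str:
    """Build a postgresql:// URL from the db_credentials mapping."""
    missing = [key for key in REQUIRED_KEYS if not db_credentials.get(key)]
    if missing:
        raise ValueError(f"missing credentials: {', '.join(missing)}")
    # Percent-encode user and password so characters like '@' or '/' are safe.
    user = quote(str(db_credentials["username"]), safe="")
    password = quote(str(db_credentials["password"]), safe="")
    hostname = db_credentials["hostname"]
    port = db_credentials["port"]
    db_name = db_credentials["db_name"]
    return f"postgresql://{user}:{password}@{hostname}:{port}/{db_name}"
```

Leaving any field empty, as in the template above, raises a `ValueError` that names the missing keys.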
Bonus
Run a script with psql
psql -U postgres -h 127.0.0.1 -d DBNAME -f path\path\file_name.sql
Copy CSV into a table
psql -U postgres -h 127.0.0.1 -d DBNAME -c "COPY table_name FROM '/some_name.csv' WITH (FORMAT csv)"
File details
Details for the file dometl-0.0.1.tar.gz.
File metadata
- Download URL: dometl-0.0.1.tar.gz
- Upload date:
- Size: 8.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4ab406e5e46a53a2b93890bb9252e810093663ba070ad237a13172582ae4dbc5 |
| MD5 | be19f4987a2242b4c8621b5113a4e1b4 |
| BLAKE2b-256 | e0ee8bae49a6d479626364e90ebb9e54207d38593bbaf60401b5bb549581fc70 |
File details
Details for the file dometl-0.0.1-py3-none-any.whl.
File metadata
- Download URL: dometl-0.0.1-py3-none-any.whl
- Upload date:
- Size: 9.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5254b02ef484d9bd4072348b93c0c14091128e16753298b497510a92dc809d05 |
| MD5 | 421a808d055e217068bf9414cc173bc5 |
| BLAKE2b-256 | 4892002c5b8dcdd78818d4076effa622042cebf84dcbf514982db461cda35f09 |