A pared-down metadata scraper + SQL runner.
whale-pipelines
whale-pipelines is a library based on amundsen's databuilder that enables easy extraction of metadata into whale's markdown format. The library references static config files in ~/.whale/ to establish connections and customize the scraping process. Whale also provides hooks into SQLAlchemy for easy execution of SQL queries against these locally defined connections, without having to specify connection strings at every request.
For information on the full CLI platform, visit whale.
There are two main functions: pull, which handles metadata extraction, and run, which enables execution of SQL queries.
pull
While whale invokes a build_script.py script to run pull, that script does nothing more than call pull() with some logging set up around it. So if you'd like to pare down or write a custom CI/CD pipeline, all you need to do is:
pip install whale-pipelines
then run:
import whale as wh
wh.pull()
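For a custom CI/CD step, the "pull() with some logging around it" pattern described above could look roughly like the sketch below. It is not whale's actual build_script.py: the helper name run_pull and the logger name are hypothetical, and the pull function is passed in as an argument so the wrapper does not assume anything about how whale is installed.

```python
import logging
import sys
from typing import Callable


def run_pull(pull_fn: Callable[[], object]) -> int:
    """Run a metadata pull with basic logging and return a process exit code."""
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
        stream=sys.stdout,
    )
    log = logging.getLogger("whale_ci")  # hypothetical logger name
    log.info("Starting metadata pull")
    try:
        pull_fn()
    except Exception:
        # Log the full traceback and signal failure to the CI runner.
        log.exception("Metadata pull failed")
        return 1
    log.info("Metadata pull finished")
    return 0
```

With whale-pipelines installed, a CI job would then call sys.exit(run_pull(wh.pull)) after import whale as wh, so that a failed pull fails the job.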
run
While libraries like pyodbc, sqlalchemy, pyhive, etc. provide easy-to-use interfaces against a warehouse, their stateless nature can make them a bit repetitive: whenever you need to write a query, you generally need to open a cursor, specifying your warehouse URI and credentials. run is a thin wrapper around SQLAlchemy that removes this (admittedly minor) step by automatically opening a connection against the connections defined in ~/.whale/config/connections.yaml.
To use this, simply run:
import whale as wh
wh.run()
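The idea behind this kind of wrapper can be illustrated with a small sketch: connection details are bound once, and callers afterwards supply only SQL. This is not whale's implementation. The standard library's sqlite3 stands in for SQLAlchemy here, an in-memory database stands in for a warehouse defined in connections.yaml, and make_runner is a hypothetical helper name.

```python
import sqlite3
from typing import Callable, List, Tuple


def make_runner(connect: Callable[[], sqlite3.Connection]) -> Callable[[str], List[Tuple]]:
    """Bind connection details once and return a function that only needs SQL."""

    def run(sql: str) -> List[Tuple]:
        conn = connect()  # connection details are resolved here, not by the caller
        try:
            return conn.execute(sql).fetchall()
        finally:
            conn.close()

    return run


# Stand-in for a connection that would normally come from a config file.
run = make_runner(lambda: sqlite3.connect(":memory:"))
rows = run("SELECT 1 + 1 AS two")
```

Each call opens and closes its own connection, which keeps the sketch simple; a real wrapper might reuse an engine or connection pool instead.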
A warehouse_name kwarg can be specified, which forces run to connect to the first warehouse whose name field matches the argument passed. If it is not given, the first warehouse in the list is used.
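The selection rule above can be sketched as a small function over a list of connection entries. This is an illustration, not whale's code: select_connection is a hypothetical name, the uri key in the example data is made up, and raising an error when no name matches is an assumption, since the behavior in that case is not described here.

```python
from typing import Dict, List, Optional


def select_connection(
    connections: List[Dict[str, str]],
    warehouse_name: Optional[str] = None,
) -> Dict[str, str]:
    """Pick the first entry whose name matches, or the first entry overall."""
    if not connections:
        raise ValueError("no connections are defined")
    if warehouse_name is None:
        # No name given: fall back to the first warehouse in the list.
        return connections[0]
    for entry in connections:
        if entry.get("name") == warehouse_name:
            return entry
    # Assumption: an unknown name is treated as an error.
    raise KeyError(f"no warehouse named {warehouse_name!r}")


# Hypothetical entries, as they might look after loading a connections file.
example = [
    {"name": "prod", "uri": "postgresql://prod-host/db"},
    {"name": "dev", "uri": "postgresql://dev-host/db"},
    {"name": "dev", "uri": "postgresql://dev-host-2/db"},
]
default_entry = select_connection(example)
dev_entry = select_connection(example, warehouse_name="dev")
```

Here default_entry is the prod entry because it comes first, and dev_entry is the first of the two entries named dev.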
File details
Details for the file whale-pipelines-1.5.1.tar.gz.
File metadata
- Download URL: whale-pipelines-1.5.1.tar.gz
- Upload date:
- Size: 35.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/51.1.2 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.8.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2f73365871b29548e72b7bf4cc15179506a3b0585603a6fc448df3617f792993
MD5 | eb9fa52f127a59176aebac36ec278d31
BLAKE2b-256 | 1662503ad9d8bcba0536ed018aea8125806ba6be711538fb89eda354c40f4d02
File details
Details for the file whale_pipelines-1.5.1-py2.py3-none-any.whl.
File metadata
- Download URL: whale_pipelines-1.5.1-py2.py3-none-any.whl
- Upload date:
- Size: 48.8 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/51.1.2 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.8.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 67214c3de2e2e88824dc9105c231f0f813875f05b9663a5f4207a3bdf70b9f13
MD5 | bb7f86314cfc1f3088ad93d8c1ea4784
BLAKE2b-256 | 60dd86f377f5ae138d6657dd1d8868f4a30168ee71aecf7e6248b35ea1a163f7