Versatile Data Kit SDK plugin provides support for PostgreSQL database and postgres transformation templates.
Project description
This plugin allows vdk-core to interface with and execute queries against a PostgreSQL database.
Usage
Run
pip install vdk-postgres
After this, data jobs will have access to a Postgres database connection, managed by Versatile Data Kit SDK.
If you want to use single postgres database instance. If it is the only database plugin installed , vdk would automatically use it. Otherwise, users need to set VDK_DB_DEFAULT_TYPE=POSTGRES as an environment variable or set 'db_default_type' option in the data job config file (config.ini).
Add the required configuration values using the config file, environment variables, or VDK secrets. The supported configuration variables include:
POSTGRES_DSN - libpq connection string. Check https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING
POSTGRES_DBNAME - database name
POSTGRES_USER - user name
POSTGRES_PASSWORD - user password
POSTGRES_HOST - the host we need to connect to, defaulting to UNIX socket, https://www.psycopg.org/docs/module.html"
POSTGRES_PORT - The port to connect to, defaulting to 5432
Set a default database in config.ini like this:
[vdk]
postgres_dbname=postgres
postgres_user=postgres
postgres_password=postgres
postgres_host=localhost
postgres_port=5433
Note: Default database configurations must be in the [vdk] section.
You can connect to the default database through 'job_input'. For instance
def run(job_input: IJobInput):
job_input.execute_query("select 'Hi Postgres!'")
You can register multiple Postgres databases, but there should always be a default one set in the vdk section. Additional databases are added in subsections (with names like "vdk_"). Here's an example config.ini with an additional database:
[vdk]
postgres_dbname=postgres
postgres_user=postgres
postgres_password=postgres
postgres_host=localhost
postgres_port=5432
[vdk_postgres_second]
postgres_dbname=postgres_second
postgres_user=postgres
postgres_password=postgres
postgres_host=localhost
postgres_port=5433
To connect to databases, use the 'job_input'. Here's an example that demonstrates creating tables in default and secondary databases:
def run(job_input: IJobInput):
job_input.execute_query(
sql="CREATE TABLE default_table "
"(some_data varchar, more_data varchar, "
"int_data bigint, float_data real, bool_data boolean)",
database="postgres", # executed against the default; database option can be omitted
)
job_input.execute_query(
sql="CREATE TABLE secondary_table "
"(some_data varchar, more_data varchar, "
"int_data bigint, float_data real, bool_data boolean)",
database="postgres_second", # executed against the secondary; database option is mandatory if omitted it will be executed against the default
)
VDK also supports data ingestion. Here's an example of sending data for ingestion into the default and secondary databases:
def run(job_input: IJobInput):
.....
job_input.send_object_for_ingestion(
payload=payload,
destination_table="default_table",
method="postgres",
target="postgres",
)
job_input.send_object_for_ingestion(
payload=payload,
destination_table="secondary_table",
method="postgres_second",
target="postgres_second",
)
Configuration
You can also run vdk config-help - search for those prefixed with "POSTGRES_" to see what configuration options are available.
Testing
Testing this plugin locally requires installing the dependencies listed in vdk-plugins/vdk-postgres/requirements.txt
Run
pip install -r requirements.txt
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Hashes for vdk_postgres-0.0.1269221667.tar.gz
Algorithm | Hash digest | |
---|---|---|
SHA256 | bb70f738de0118f4586d6da0cdffa6cc06a415f4b9c55a99a7175033ab22271d |
|
MD5 | cb9f2888ed417dc88fd8bc4392466760 |
|
BLAKE2b-256 | 4fc39028d4bea7e183e396d2fbcfc41023c27a2690a3b121291980355fa447c7 |