DuckDB Plugin for VDK.

duckdb

DuckDB plugin for the Versatile Data Kit (VDK). It lets users connect to and interact with DuckDB databases, and simplifies data extraction, transformation, and loading tasks when DuckDB is the data source or destination.

Usage

pip install vdk-duckdb

Configuration

Run vdk config-help to browse all available configuration options for your VDK installation.

Example

Query Execution

To use DuckDB as the default database type, set it in the job's config file, config.ini:

[vdk]
db_default_type = duckdb

The job can then run queries against DuckDB from a Python step:

    from vdk.api.job_input import IJobInput

    def run(job_input: IJobInput):
        job_input.execute_query("select 'Hi Duck!'")
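
As a slightly longer illustration, the sketch below creates a table, inserts a row, and reads it back. The table name `greetings` is hypothetical, and the sketch assumes that `execute_query` returns the result rows as a list of rows. The VDK import is type-only so the snippet also loads where VDK is not installed.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Type-only import, so this sketch also loads where VDK is not installed.
    from vdk.api.job_input import IJobInput

# Hypothetical table name, used only for illustration.
TABLE = "greetings"


def run(job_input: "IJobInput") -> None:
    # Each statement runs against the database selected by db_default_type.
    job_input.execute_query(f"CREATE TABLE IF NOT EXISTS {TABLE} (message VARCHAR)")
    job_input.execute_query(f"INSERT INTO {TABLE} VALUES ('Hi Duck!')")

    # Assumption: the call returns the result rows as a list of rows.
    rows = job_input.execute_query(f"SELECT message FROM {TABLE}")
    for row in rows:
        print(row[0])
```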

Ingestion

This plugin lets users ingest data into a DuckDB database. This can be preferable to inserting data manually, because VDK automatically handles serializing, packaging, and sending the data asynchronously, with configurable batching and throughput. To do so, set the variables required to connect to DuckDB, plus the following environment variable:

export VDK_INGEST_METHOD_DEFAULT=DUCKDB

Then, from inside the run function in a Python step, you can use the send_object_for_ingestion or send_tabular_data_for_ingestion methods to ingest your data.
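
A minimal sketch of both calls follows. The destination table `ducks` and the sample records are hypothetical, and the sketch assumes the table already exists with matching columns. No `method` argument is passed, so the ingestion method is expected to come from the default configured above.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Type-only import, so this sketch also loads where VDK is not installed.
    from vdk.api.job_input import IJobInput

# Hypothetical destination table, used only for illustration.
DESTINATION_TABLE = "ducks"


def run(job_input: "IJobInput") -> None:
    # Send a single record as a dict mapping column name to value.
    job_input.send_object_for_ingestion(
        payload={"id": 1, "name": "Donald"},
        destination_table=DESTINATION_TABLE,
    )

    # Send several rows at once; column_names gives the order of values in each row.
    job_input.send_tabular_data_for_ingestion(
        rows=[(2, "Daisy"), (3, "Scrooge")],
        column_names=["id", "name"],
        destination_table=DESTINATION_TABLE,
    )
```

`send_object_for_ingestion` suits records that are already dictionaries, while `send_tabular_data_for_ingestion` fits row-oriented data such as query results.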

Build and testing

pip install -r requirements.txt
pip install -e .
pytest

In the VDK repo, the ../build-plugin.sh script can also be used.

Note about the CI/CD:

.plugin-ci.yaml is needed only for plugins that are part of the Versatile Data Kit Plugin repo.

The CI/CD is split into two stages: a build stage and a release stage. The build stage is made up of a few jobs, all of which inherit from the same job configuration and differ only in the Python version they use (3.7, 3.8, 3.9 and 3.10). They run according to rules, which are ordered so that changes to a plugin's directory trigger that plugin's CI, but changes to a different plugin do not.

Download files

Source Distribution

vdk-duckdb-0.2.1216772137.tar.gz (7.8 kB view details)

Uploaded Source

File details

Details for the file vdk-duckdb-0.2.1216772137.tar.gz.

File metadata

  • Download URL: vdk-duckdb-0.2.1216772137.tar.gz
  • Upload date:
  • Size: 7.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.13

File hashes

Hashes for vdk-duckdb-0.2.1216772137.tar.gz

  • SHA256: c23d156a0fc3a2c741d2f25e8e083a78bcecf8f8ae2e383bd59a8804318457a9
  • MD5: f02754c650a8916d53a6b35687ef02d0
  • BLAKE2b-256: 8a694a545cf1441a052aea02a1cd0133480c051163da383dd918e3df02af31ed
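
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `matches_expected` are illustrative, and the local file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed for vdk-duckdb-0.2.1216772137.tar.gz.
EXPECTED_SHA256 = "c23d156a0fc3a2c741d2f25e8e083a78bcecf8f8ae2e383bd59a8804318457a9"


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected_hex: str = EXPECTED_SHA256) -> bool:
    """Compare a file's digest with the expected one, ignoring hex letter case."""
    return sha256_of(path) == expected_hex.lower()
```

For example, `matches_expected(Path("vdk-duckdb-0.2.1216772137.tar.gz"))` returns `True` only if the local file's digest equals the listed value.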
