Combine dbt-duckdb and Kedro Datasets to easily read Kedro Dataset configs (YAML), enabling conversion of Kedro projects to dbt.

Project description

Combine dbt-duckdb and Kedro Datasets to enable:

  • extending dbt to ingest a wide array of data formats; and
  • converting Kedro projects to dbt by reading your Kedro data catalog configs (YAML files) directly.

Demo

Install the plugin:

pip install dbt_duckdb_kedro_datasets

Then add your existing Kedro dataset definitions to your dbt sources like so:

version: 2

sources:
  - name: my_source # can call this anything
    schema: main
    meta:
      plugin: dbt_duckdb_kedro_datasets # this library
    tables:
      - name: my_table # can call this anything
        description: "A dbt_duckdb_kedro_datasets test"
        meta:
          type: pandas.CSVDataset
          filepath: ./data/1_raw/bikes.csv # file to ingest
          load_args:
            sep: ','

Now we can access this CSV in dbt:

select *
from {{ source('my_source', 'my_table') }}
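
The source definition above reuses the fields of a Kedro catalog entry (type, filepath, load_args) under each table's meta block. As a rough illustration of that mapping, the sketch below wraps already-parsed catalog entries in the dbt sources structure. The helper name catalog_to_dbt_source and its defaults are hypothetical and not part of this library; it assumes the catalog has already been loaded into a plain dict.

```python
from __future__ import annotations

from typing import Any


def catalog_to_dbt_source(
    catalog: dict[str, dict[str, Any]],
    source_name: str = "my_source",
    schema: str = "main",
) -> dict[str, Any]:
    """Wrap Kedro catalog entries in a dbt sources structure.

    Hypothetical helper for illustration only; not provided by the plugin.
    """
    tables = []
    for dataset_name, dataset_config in catalog.items():
        # Each Kedro entry (type, filepath, load_args, ...) is copied
        # unchanged into the table's `meta` block.
        tables.append({"name": dataset_name, "meta": dict(dataset_config)})

    return {
        "version": 2,
        "sources": [
            {
                "name": source_name,
                "schema": schema,
                # Tells dbt-duckdb which plugin should load these tables.
                "meta": {"plugin": "dbt_duckdb_kedro_datasets"},
                "tables": tables,
            }
        ],
    }


# A catalog entry as it might look after parsing a Kedro catalog YAML file.
kedro_catalog = {
    "my_table": {
        "type": "pandas.CSVDataset",
        "filepath": "./data/1_raw/bikes.csv",
        "load_args": {"sep": ","},
    }
}

dbt_sources = catalog_to_dbt_source(kedro_catalog)
```

Dumping dbt_sources with a YAML library would give a file shaped like the one shown earlier, minus the description field.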

For a more complete example look at this

Functionality

Through Kedro Datasets, this gives you access to read/write Excel sheets, Parquet, JSON, DeltaTable, Pickle, and many more formats.

Note: I've only tested this with CSV data so far, so please let me know if you run into any issues. In particular, non-tabular data (e.g. image pixel values) will probably not be compatible, since dbt expects DataFrame-like objects to be returned.
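
Switching formats should mostly be a matter of changing the type field in a table's meta block. The sketch below shows one way a type string such as pandas.CSVDataset could be split into a module path and class name inside the kedro_datasets package. This is a simplified assumption about the lookup, not the plugin's actual code: Kedro itself tries several package prefixes, while this hypothetical resolve_dataset_type helper tries only one and performs no import.

```python
from __future__ import annotations


def resolve_dataset_type(
    type_string: str, package: str = "kedro_datasets"
) -> tuple[str, str]:
    """Split a Kedro dataset type string into (module path, class name).

    Hypothetical helper for illustration; it only parses the string and
    does not check that the module or class exists.
    """
    module_suffix, _, class_name = type_string.rpartition(".")
    if not module_suffix or not class_name:
        raise ValueError(
            f"Expected a type like 'pandas.CSVDataset', got {type_string!r}"
        )
    return f"{package}.{module_suffix}", class_name


# Type strings for some of the formats mentioned above.
example_types = [
    "pandas.CSVDataset",
    "pandas.ExcelDataset",
    "pandas.ParquetDataset",
    "pandas.JSONDataset",
    "pickle.PickleDataset",
]

resolved = {name: resolve_dataset_type(name) for name in example_types}
```

With the real package installed, the resulting pair could be passed to importlib.import_module and getattr to obtain the dataset class.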
