A Singer target / Meltano loader for CrateDB, built with the Meltano SDK, and based on the Meltano PostgreSQL target.

Singer target / Meltano loader for CrateDB

About

A Singer target for CrateDB, built with the Meltano SDK for custom extractors and loaders, and based on the Meltano PostgreSQL target.

To learn more about Singer, Meltano, and friends, see the Singer Intro.

This package requires CrateDB 6.2 or higher.

Install

Usually, you will not install this package directly, but rather as part of a Meltano project. A corresponding snippet is outlined in the next section.

After adding it to your meltano.yml project definition file, you can install all defined components and their dependencies with a single command.

meltano install

Usage

You can run the CrateDB Singer target target-cratedb by itself, or in a pipeline using Meltano.

Meltano

Using the meltano add subcommand, you can add the plugin to your Meltano project.

meltano add loader target-cratedb

NB: This only works once the plugin has been released and registered on Meltano Hub. In the meantime, please add the configuration snippet manually.

CrateDB Cloud

In order to connect to CrateDB Cloud, configure the sqlalchemy_url setting within your meltano.yml configuration file like this.

- name: target-cratedb
  namespace: cratedb
  variant: cratedb
  pip_url: meltano-target-cratedb
  config:
    sqlalchemy_url: "crate://admin:K4IgMXNvQBJM3CiElOiPHuSp6CiXPCiQYhB4I9dLccVHGvvvitPSYr1vTpt4@example.aks1.westeurope.azure.cratedb.net:4200?ssl=true"
    add_record_metadata: true

On localhost

In order to connect to a standalone or on-premise instance of CrateDB, configure the sqlalchemy_url setting within your meltano.yml configuration file like this.

- name: target-cratedb
  namespace: cratedb
  variant: cratedb
  pip_url: meltano-target-cratedb
  config:
    sqlalchemy_url: crate://crate@localhost/
    add_record_metadata: true

Then, invoke the pipeline by using meltano run, like this.

meltano run tap-xyz target-cratedb

Standalone

You can also invoke it standalone by using the target-cratedb program. This example demonstrates how to load a file into the database.

First, acquire an example file in Singer format, containing a list of the countries of the world.

wget https://github.com/MeltanoLabs/target-postgres/raw/v0.0.9/target_postgres/tests/data_files/tap_countries.singer
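For orientation, a Singer file is a sequence of JSON messages, one per line. The following minimal Python sketch emits a SCHEMA message, RECORD messages, and a closing STATE message for a single stream. The helper name, the stream name, the fields, and the sample record are illustrative assumptions, and not necessarily what tap_countries.singer contains.

```python
import json


def singer_messages(stream, schema, key_properties, records):
    """Yield Singer messages for one stream, each serialized as a JSON line."""
    # SCHEMA describes the structure of the records that follow.
    yield json.dumps({
        "type": "SCHEMA",
        "stream": stream,
        "schema": schema,
        "key_properties": key_properties,
    })
    # RECORD carries one row of data each.
    for record in records:
        yield json.dumps({"type": "RECORD", "stream": stream, "record": record})
    # STATE lets a pipeline remember how far it got; empty in this sketch.
    yield json.dumps({"type": "STATE", "value": {}})


# Hypothetical example data, for illustration only.
schema = {
    "type": "object",
    "properties": {
        "code": {"type": "string"},
        "name": {"type": "string"},
    },
}
lines = list(
    singer_messages("countries", schema, ["code"], [{"code": "AT", "name": "Austria"}])
)
print("\n".join(lines))
```

Redirecting the printed lines into a file gives a small input that can be piped into a target in the same way as the downloaded example.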

Now, define the database connection string including credentials in SQLAlchemy format.

echo '{"sqlalchemy_url": "crate://crate@localhost/"}' > settings.json
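If the credentials contain characters that are special in URLs, writing the JSON by hand in the shell becomes error-prone. As a minimal sketch, the settings file could also be generated with Python. The helper names build_sqlalchemy_url and write_settings are hypothetical and not part of target-cratedb; the sketch assumes that user name and password should be percent-encoded, as is customary for SQLAlchemy connection URLs.

```python
import json
from pathlib import Path
from urllib.parse import quote


def build_sqlalchemy_url(user, host, password=None):
    """Hypothetical helper: compose a crate:// URL from its parts."""
    # Percent-encode credentials so characters like "@" or "/" stay unambiguous.
    credentials = quote(user, safe="")
    if password:
        credentials += ":" + quote(password, safe="")
    return f"crate://{credentials}@{host}/"


def write_settings(path, sqlalchemy_url):
    """Hypothetical helper: write the configuration file read via --config."""
    Path(path).write_text(json.dumps({"sqlalchemy_url": sqlalchemy_url}, indent=2))


if __name__ == "__main__":
    # Same content as the echo command above.
    write_settings("settings.json", build_sqlalchemy_url("crate", "localhost"))
```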

By using Unix pipes, load the data file into the database, referencing the path to the configuration file.

cat tap_countries.singer | target-cratedb --config=settings.json

Using the interactive terminal program, crash, you can run SQL statements on CrateDB.

pip install crash
crash --hosts localhost:4200

Now, you can verify that the data has been loaded correctly.

SELECT
    "code", "name", "capital", "emoji", "languages[1]"
FROM
    "melty"."countries"
ORDER BY
    "name"
LIMIT
    42;

Write Strategy

Meltano's target-postgres first receives data into a temporary table, and then updates the effective target table from it.

target-cratedb can also write directly into the target table, which is faster and may be important in certain situations.

The environment variable MELTANO_CRATEDB_STRATEGY_DIRECT controls the behavior.

  • MELTANO_CRATEDB_STRATEGY_DIRECT=true: Directly write to the target table.
  • MELTANO_CRATEDB_STRATEGY_DIRECT=false: Use a temporary table to stage updates.

Note: The current default value is true, which bypasses the way Meltano natively handles database updates. The reason is that staging through a temporary table does not yet satisfy all test cases.
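When a pipeline is launched from a script, the variable can be set for that single invocation instead of globally. The following is a minimal sketch: the helper names strategy_env and run_pipeline are hypothetical, tap-xyz is the same placeholder used in the Meltano section, and only the two documented values true and false are emitted.

```python
import os
import subprocess


def strategy_env(direct, base=None):
    """Hypothetical helper: return a copy of the environment with the strategy set."""
    env = dict(os.environ if base is None else base)
    # Only the two documented values are used.
    env["MELTANO_CRATEDB_STRATEGY_DIRECT"] = "true" if direct else "false"
    return env


def run_pipeline(direct=True):
    """Run the pipeline with the chosen write strategy (requires a Meltano project)."""
    return subprocess.run(
        ["meltano", "run", "tap-xyz", "target-cratedb"],
        env=strategy_env(direct),
        check=True,
    )
```

Calling run_pipeline(direct=False) would stage updates through a temporary table for that run only, leaving the surrounding shell environment untouched.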

Development

To work on this adapter as part of a real pipeline definition, link your sandbox to a development installation of meltano-target-cratedb, and configure the pip_url of the component to point to a different location than the vanilla package on PyPI.

Use this URL to directly point to a specific Git repository reference.

pip_url: git+https://github.com/crate/meltano-target-cratedb.git@main

Use a pip-like notation to link the CrateDB Singer target in development mode, so you can work on it while running the pipeline and iterating on its definition.

pip_url: --editable=/path/to/sources/meltano-target-cratedb
