A Singer target / Meltano loader for CrateDB, built with the Meltano SDK, and based on the Meltano PostgreSQL target.
Singer target / Meltano loader for CrateDB
About
A Singer target for CrateDB, built with the Meltano SDK for custom extractors and loaders, and based on the Meltano PostgreSQL target.
To learn more about Singer, Meltano, and friends, see the Singer Intro.
This package requires CrateDB 6.2 or higher.
Install
Usually, you will not install this package directly, but rather as part of a Meltano project. A corresponding snippet is outlined in the next section.
After adding it to your meltano.yml project definition file, you can install
all defined components and their dependencies with a single command.
meltano install
Usage
You can run the CrateDB Singer target target-cratedb by itself, or
in a pipeline using Meltano.
Meltano
Using the meltano add subcommand, you can add the plugin to your
Meltano project.
meltano add loader target-cratedb
NB: This only works once the plugin has been released and registered on Meltano Hub. In the meantime, please add the configuration snippet manually.
CrateDB Cloud
In order to connect to CrateDB Cloud, configure the sqlalchemy_url setting
within your meltano.yml configuration file like this.
- name: target-cratedb
  namespace: cratedb
  variant: cratedb
  pip_url: meltano-target-cratedb
  config:
    sqlalchemy_url: "crate://admin:K4IgMXNvQBJM3CiElOiPHuSp6CiXPCiQYhB4I9dLccVHGvvvitPSYr1vTpt4@example.aks1.westeurope.azure.cratedb.net:4200?ssl=true"
    add_record_metadata: true
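Passwords that contain characters such as `@` or `/` need to be percent-encoded before they are placed into a SQLAlchemy URL. The following is a minimal sketch of how such a URL could be composed; the helper name `build_cratedb_url` and the sample password are hypothetical and not part of this package.

```python
from urllib.parse import quote


def build_cratedb_url(user, password, host, port=4200, ssl=True):
    """Compose a crate:// SQLAlchemy URL (hypothetical helper, for illustration)."""
    # Percent-encode the credentials so that reserved characters
    # like "@" or "/" do not break URL parsing.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    url = f"crate://{credentials}@{host}:{port}"
    if ssl:
        url += "?ssl=true"
    return url


# The password below is a made-up example value.
print(build_cratedb_url("admin", "p@ss/word", "example.aks1.westeurope.azure.cratedb.net"))
# → crate://admin:p%40ss%2Fword@example.aks1.westeurope.azure.cratedb.net:4200?ssl=true
```

The resulting string can be used as the value of the sqlalchemy_url setting.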
On localhost
In order to connect to a standalone or on-premise instance of CrateDB, configure
the sqlalchemy_url setting within your meltano.yml configuration file like this.
- name: target-cratedb
  namespace: cratedb
  variant: cratedb
  pip_url: meltano-target-cratedb
  config:
    sqlalchemy_url: crate://crate@localhost/
    add_record_metadata: true
Then, invoke the pipeline by using meltano run, like this.
meltano run tap-xyz target-cratedb
Standalone
You can also invoke it standalone by using the target-cratedb program.
This example demonstrates how to load a file into the database.
First, acquire an example file in Singer format that contains a list of the countries of the world.
wget https://github.com/MeltanoLabs/target-postgres/raw/v0.0.9/target_postgres/tests/data_files/tap_countries.singer
Now, define the database connection string including credentials in SQLAlchemy format.
echo '{"sqlalchemy_url": "crate://crate@localhost/"}' > settings.json
By using Unix pipes, load the data file into the database, referencing the path to the configuration file.
cat tap_countries.singer | target-cratedb --config=settings.json
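If no example file is at hand, a small one can be generated. The sketch below emits newline-delimited JSON messages of type SCHEMA and RECORD, following the Singer message format. The stream name, schema, and records are made-up illustration values, and the helper names are hypothetical.

```python
import json


def singer_messages(stream, schema, key_properties, records):
    """Yield one SCHEMA message followed by one RECORD message per record."""
    yield {
        "type": "SCHEMA",
        "stream": stream,
        "schema": schema,
        "key_properties": key_properties,
    }
    for record in records:
        yield {"type": "RECORD", "stream": stream, "record": record}


def to_singer_lines(messages):
    """Serialize messages as newline-delimited JSON, one message per line."""
    return "".join(json.dumps(message) + "\n" for message in messages)


# Made-up sample data, not taken from tap_countries.singer.
schema = {
    "type": "object",
    "properties": {"code": {"type": "string"}, "name": {"type": "string"}},
}
records = [{"code": "AT", "name": "Austria"}, {"code": "CH", "name": "Switzerland"}]

text = to_singer_lines(singer_messages("countries", schema, ["code"], records))
print(text, end="")
```

Saving the printed output to a file gives input that can be piped into target-cratedb in the same way as shown above.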
Using the interactive terminal program, crash, you can run SQL
statements on CrateDB.
pip install crash
crash --hosts localhost:4200
Now, you can verify that the data has been loaded correctly.
SELECT
    "code", "name", "capital", "emoji", "languages[1]"
FROM
    "melty"."countries"
ORDER BY
    "name"
LIMIT
    42;
Write Strategy
Meltano's target-postgres first writes incoming data to a temporary table, and
then updates the effective target table from it.
CrateDB's target-cratedb offers the possibility to also write directly into
the target table, yielding speed improvements, which may be important in certain
situations.
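To illustrate the difference between the two strategies, here is a conceptual sketch. It uses an in-memory SQLite database as a stand-in, so it only demonstrates the general idea and does not reflect how target-postgres or target-cratedb are implemented. The table layout and function names are assumptions made for this example.

```python
import sqlite3


def write_direct(conn, rows):
    """Direct strategy: upsert incoming rows straight into the target table."""
    conn.executemany(
        "INSERT OR REPLACE INTO countries (code, name) VALUES (?, ?)", rows
    )


def write_staged(conn, rows):
    """Staged strategy: load into a temporary table, then merge into the target."""
    conn.execute("CREATE TEMP TABLE staging (code TEXT PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO staging (code, name) VALUES (?, ?)", rows)
    # Merge the staged rows into the target table in a single statement.
    conn.execute(
        "INSERT OR REPLACE INTO countries (code, name) "
        "SELECT code, name FROM staging"
    )
    conn.execute("DROP TABLE staging")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (code TEXT PRIMARY KEY, name TEXT)")

write_direct(conn, [("AT", "Austria"), ("CH", "Schweiz")])
write_staged(conn, [("CH", "Switzerland"), ("DE", "Germany")])

result = conn.execute("SELECT code, name FROM countries ORDER BY code").fetchall()
print(result)
# → [('AT', 'Austria'), ('CH', 'Switzerland'), ('DE', 'Germany')]
```

Both functions end with the same table contents. The direct variant skips the intermediate table, which is where the speed improvement comes from.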
The environment variable MELTANO_CRATEDB_STRATEGY_DIRECT controls the behavior.
- MELTANO_CRATEDB_STRATEGY_DIRECT=true: Directly write to the target table.
- MELTANO_CRATEDB_STRATEGY_DIRECT=false: Use a temporary table to stage updates.
Note: The current default value is true, which bypasses the native way Meltano
handles database updates. The reason is that the vanilla way does not yet
satisfy all test cases.
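When a pipeline is started from a script, the variable can be set in the environment of the child process. This is a minimal sketch, assuming that meltano is on the PATH and that the pipeline is named as in the example above. The helper names are hypothetical, and the pipeline function is defined here but not invoked.

```python
import os
import subprocess


def strategy_env(direct, base_env=None):
    """Return a copy of the environment with the strategy variable set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MELTANO_CRATEDB_STRATEGY_DIRECT"] = "true" if direct else "false"
    return env


def run_pipeline(direct):
    """Invoke the pipeline with the chosen write strategy (requires meltano)."""
    return subprocess.run(
        ["meltano", "run", "tap-xyz", "target-cratedb"],
        env=strategy_env(direct),
        check=True,
    )


# Show the value that would be passed for the staging strategy.
print(strategy_env(direct=False, base_env={})["MELTANO_CRATEDB_STRATEGY_DIRECT"])
# → false
```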
Development
To work on this adapter as part of a real pipeline definition, link your
sandbox to a development installation of meltano-target-cratedb, and configure
the pip_url of the component to point to a location other than the vanilla
package on PyPI.
Use this URL to directly point to a specific Git repository reference.
pip_url: git+https://github.com/crate/meltano-target-cratedb.git@main
Use a pip-like notation to link the CrateDB Singer target in development mode,
so you can work on it while running the pipeline and iterating on its
definition.
pip_url: --editable=/path/to/sources/meltano-target-cratedb