airflow provider for rudderstack
The Customer Data Platform for Developers
RudderStack Airflow Provider
The RudderStack Airflow Provider lets you programmatically schedule and trigger your Reverse ETL syncs from outside RudderStack and integrate them with your existing Airflow workflows.
For more information on using the Airflow Provider utility, refer to the documentation.
Installation
pip install rudderstack-airflow-provider
Usage
RudderstackOperator
Note: Use RudderstackRETLOperator for reverse ETL connections.
A simple DAG for triggering syncs for a RudderStack source:
with DAG(
    'rudderstack-sample',
    default_args=default_args,
    description='A simple tutorial DAG',
    schedule_interval=timedelta(days=1),
    start_date=datetime(2021, 1, 1),
    catchup=False,
    tags=['rs']
) as dag:
    rs_operator = RudderstackOperator(
        source_id='<source-id>',
        task_id='<any-task-id>',
        connection_id='rudderstack_conn'
    )
For the complete code, refer to this example.
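The snippet above passes `default_args` without defining it. As a minimal sketch, such a dictionary might look like the following. The keys are standard Airflow base operator arguments, but the values are illustrative assumptions and are not taken from the RudderStack example.

```python
from datetime import timedelta

# Illustrative defaults applied to every task in the DAG.
# The keys are standard Airflow BaseOperator arguments; the values are assumptions.
default_args = {
    "owner": "airflow",
    "depends_on_past": False,
    "retries": 1,                         # retry a failed task once
    "retry_delay": timedelta(minutes=5),  # wait five minutes before retrying
}
```

Any key set here can still be overridden per task by passing the same argument directly to the operator.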
Operator Parameters
Parameter | Description | Type | Default |
---|---|---|---|
source_id | Valid RudderStack source ID | String | None |
task_id | A unique task ID within a DAG | String | None |
wait_for_completion | If True, the task waits for the sync to complete. | Boolean | False |
connection_id | The Airflow connection to use for connecting to the RudderStack API. | String | rudderstack_default |
The RudderStack operator also supports all the parameters supported by the Airflow base operator.
For details on how to run the DAG in Airflow, refer to the documentation.
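To illustrate what `wait_for_completion=True` means in practice, here is a rough sketch of a polling loop. It is not the provider's actual implementation: the `wait_for_sync` helper, the `get_status` callable, and the status strings `running`, `finished`, and `failed` are all hypothetical names chosen for this example.

```python
import time
from typing import Callable


def wait_for_sync(get_status: Callable[[], str],
                  poll_interval: float = 10.0,
                  timeout: float = 3600.0) -> str:
    """Poll get_status until it reports a terminal state or the timeout expires.

    The status strings used here are assumptions for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in ("finished", "failed"):
            return status  # terminal state: stop waiting
        if time.monotonic() >= deadline:
            raise TimeoutError(f"sync still '{status}' after {timeout} seconds")
        time.sleep(poll_interval)
```

With `wait_for_completion=False` (the default), the task would presumably return as soon as the sync has been triggered, so downstream tasks cannot assume that the sync has finished.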
RudderstackRETLOperator
A simple DAG for triggering syncs for a RETL connection:
with DAG('rudderstack-sample',
         default_args=default_args,
         description='A simple tutorial DAG',
         schedule_interval=timedelta(days=1),
         start_date=datetime(2021, 1, 1),
         catchup=False,
         tags=['rs']) as dag:
    rs_operator = RudderstackRETLOperator(
        retl_connection_id='2aiDQzMqP6LNuUokWstmaubcZOP',
        task_id='retl-test-sync',
        connection_id='rudder_yeshwanth_dev',
        sync_type='full',
        wait_for_completion=True
    )
Operator Parameters
Parameter | Description | Type | Default |
---|---|---|---|
retl_connection_id | Valid RudderStack RETL connection ID | String (templatable) | None |
task_id | A unique task ID within a DAG | String | None |
wait_for_completion | If True, the task waits for the sync to complete. | Boolean | False |
connection_id | The Airflow connection to use for connecting to the RudderStack API. | String | rudderstack_default |
sync_type | Type of sync to trigger | incremental or full (templatable) | incremental |
For details on how to run the DAG in Airflow, refer to the documentation.
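The RETL operator's parameters and defaults can be collected in one place before they are passed to the operator. The sketch below is only an illustration: `RETLSyncParams` and `as_kwargs` are hypothetical helpers, not part of the provider. The check on `sync_type` only makes sense for literal values, since a templated value is resolved at run time.

```python
from dataclasses import asdict, dataclass

# Allowed literal values for sync_type, as listed in the parameter table.
VALID_SYNC_TYPES = ("incremental", "full")


@dataclass
class RETLSyncParams:
    """Hypothetical container mirroring the RETL operator parameters and defaults."""
    retl_connection_id: str
    task_id: str
    connection_id: str = "rudderstack_default"
    sync_type: str = "incremental"
    wait_for_completion: bool = False

    def __post_init__(self) -> None:
        # Reject anything other than the two documented sync types.
        if self.sync_type not in VALID_SYNC_TYPES:
            raise ValueError(
                f"sync_type must be one of {VALID_SYNC_TYPES}, got {self.sync_type!r}"
            )

    def as_kwargs(self) -> dict:
        """Return the parameters as keyword arguments for the operator."""
        return asdict(self)
```

Inside a DAG, the result could then be unpacked as `RudderstackRETLOperator(**params.as_kwargs())`.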
Contribute
We would love to see you contribute to this project. Get more information on how to contribute here.
License
The RudderStack Airflow Provider is released under the MIT License.
Contact Us
For more information or queries on this feature, you can contact us or start a conversation in our Slack community.
Download files
Hashes for rudderstack_airflow_provider-1.1.0.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | a336bf2dc71c25ec83cd995ad762f6b3b55e21b493088c0037f955baa6737ce9 |
MD5 | 9138ad883309c374b54b268a42f39ddd |
BLAKE2b-256 | 11b8e027cd5248e1e09ddcd8ff60e80aa5bfb554159ffb660fc0e2c87f14c74f |
Hashes for rudderstack_airflow_provider-1.1.0-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 26f98a80def9eabf59dc5e50ba92a62800c3b5660722ecf5e08e6097585f77e5 |
MD5 | a5a04328043a3392db95bf400a409bb6 |
BLAKE2b-256 | 31af2dffc4e8aca27fa9994df2aa2948775e37a9c4f41c98bc605b820a31e365 |
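A downloaded file can be checked against the published SHA256 digests using only the Python standard library. The sketch below assumes the file has already been saved locally. The helper names `sha256_of` and `matches_published_digest` are made up for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest(Path("rudderstack_airflow_provider-1.1.0.tar.gz"), "<SHA256 from the table>")` should return `True` for an unmodified download of the source distribution.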