Skip to main content

SDK for Watcher framework

Project description

Watcher SDK

Installation

You can install the Watcher SDK, etl-watcher-sdk, using your preferred package manager.

Store Pipeline and Address Lineage Configuration

Store your pipeline and address lineage configuration in a Python file.

from watcher.models.pipeline import Pipeline, PipelineConfig
from watcher.models.address_lineage import AddressLineage, SourceAddress, TargetAddress

MY_ETL_PIPELINE_CONFIG = PipelineConfig(
    pipeline=Pipeline(
        name="my-etl-pipeline",
        pipeline_type_name="extraction",
        default_watermark="2024-01-01",
    ),
    address_lineage=AddressLineage(
        source_addresses=[
            SourceAddress(
                name="source_db.source_schema.source_table",
                address_type_name="postgres",
                address_type_group_name="database",
            )
        ],
        target_addresses=[
            TargetAddress(
                name="target_db.target_schema.target_table",
                address_type_name="snowflake",
                address_type_group_name="warehouse",
            )
        ],
    ),
)

Sync Pipeline and Address Lineage Configuration

Sync your pipeline and address lineage configuration to the Watcher framework. This ensures your code is the source of truth for the pipeline and address lineage configuration.

from watcher.client import Watcher
from watcher.models.pipeline import PipelineConfig

watcher = Watcher("https://api.watcher.example.com")
synced_config = watcher.sync_pipeline_config(MY_ETL_PIPELINE_CONFIG)
print(f"Pipeline synced!")

Track Pipeline Execution

from watcher import Watcher
from watcher.models.pipeline import PipelineConfig
from watcher.models.execution import ETLMetrics

watcher = Watcher("https://api.watcher.example.com")

synced_config = watcher.sync_pipeline_config(MY_ETL_PIPELINE_CONFIG)

@watcher.track_pipeline_execution(
    pipeline_id=synced_config.pipeline.id, 
    active=synced_config.active
)
def etl_pipeline():
    print("Starting ETL pipeline")
    
    # Your ETL work here
    
    return ETLMetrics(
        inserts=100,
        total_rows=100,
        execution_metadata={"partition": "2025-01-01"},
    )

etl_pipeline()

Contributing

I welcome contributions to the Watcher SDK! Please see the Contributing Guidelines for details on how to get started, the development process, and how to submit pull requests.

Quick Start for Contributors

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Spin up the Watcher framework for manual integration testing
  5. Run tests in the repo with make test
  6. Submit a pull request

For detailed information about the coding standards, testing requirements, and contribution process, please refer to the Contributing Guidelines.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

etl_watcher_sdk-0.1.0.tar.gz (4.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

etl_watcher_sdk-0.1.0-py3-none-any.whl (6.6 kB view details)

Uploaded Python 3

File details

Details for the file etl_watcher_sdk-0.1.0.tar.gz.

File metadata

  • Download URL: etl_watcher_sdk-0.1.0.tar.gz
  • Upload date:
  • Size: 4.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for etl_watcher_sdk-0.1.0.tar.gz
Algorithm Hash digest
SHA256 835296634cf33922832fe8fc9fcccb96ac00e281dd18b71410cba378fb0aea61
MD5 63ef8849c5ec6b2d10fa46a6d94706ee
BLAKE2b-256 3929798dacad4a4ad659cfe1aa8cb4e280ce766ee3c709d5b6680acb0b335bfd

See more details on using hashes here.

File details

Details for the file etl_watcher_sdk-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for etl_watcher_sdk-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 4f5d87627836efb3a5014f8069f1dba88bccedba64f5baec279781a1d16f99e4
MD5 8915c6c1fd6cac8e10b8bb0e78c60600
BLAKE2b-256 f3b8b5acd738dd146b7576487ef9b098da36519a496be8f7772cea867bf2d6a9

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page