
mkpipe-loader-snowflake

Snowflake loader plugin for MkPipe. Writes Spark DataFrames into Snowflake tables using the native Snowflake Spark connector (spark-snowflake), which stages data through internal cloud storage (S3/Azure/GCS) and is significantly faster than JDBC for large datasets.
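
As an illustration of the mechanism, here is a minimal sketch of a connector write from PySpark. This is not mkpipe's internal code; the option keys are the spark-snowflake connector's documented names, the credentials and table name are placeholders, and the connector plus the Snowflake JDBC driver must be on the Spark classpath:

from pyspark.sql import SparkSession

SNOWFLAKE_SOURCE = "net.snowflake.spark.snowflake"

spark = SparkSession.builder.appName("snowflake-write-sketch").getOrCreate()
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])

# Connector options; the connector stages the data in cloud storage
# and bulk-loads it, rather than inserting row by row over JDBC.
sf_options = {
    "sfURL": "myaccount.snowflakecomputing.com",
    "sfUser": "myuser",
    "sfPassword": "mypassword",
    "sfDatabase": "MY_DATABASE",
    "sfSchema": "MY_SCHEMA",
    "sfWarehouse": "MY_WAREHOUSE",
}

(df.write
    .format(SNOWFLAKE_SOURCE)
    .options(**sf_options)
    .option("dbtable", "STG_EVENTS")
    .mode("append")
    .save())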

Documentation

For more detailed documentation, please visit the GitHub repository.

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.


Connection Configuration

Password-based authentication:

connections:
  snowflake_target:
    variant: snowflake
    host: myaccount.snowflakecomputing.com
    port: 443
    database: MY_DATABASE
    schema: MY_SCHEMA
    user: myuser
    password: mypassword
    warehouse: MY_WAREHOUSE

With RSA key pair authentication:

connections:
  snowflake_target:
    variant: snowflake
    host: myaccount.snowflakecomputing.com
    port: 443
    database: MY_DATABASE
    schema: MY_SCHEMA
    user: myuser
    warehouse: MY_WAREHOUSE
    private_key_file: /path/to/rsa_key.p8
    private_key_file_pwd: mypassphrase
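
How the loader consumes the key file and passphrase is internal to mkpipe. As a sketch of the common pattern only, this is how a PKCS#8 key is typically decrypted with the cryptography package and handed to the spark-snowflake connector's pem_private_key option, extending the sf_options dict from the earlier sketch (path and passphrase are placeholders):

from cryptography.hazmat.primitives import serialization

# Decrypt the PKCS#8 key with its passphrase, then re-serialize it
# unencrypted so it can be passed to the connector as a string.
with open("/path/to/rsa_key.p8", "rb") as f:
    private_key = serialization.load_pem_private_key(
        f.read(), password=b"mypassphrase"
    )

pem_bytes = private_key.private_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PrivateFormat.PKCS8,
    encryption_algorithm=serialization.NoEncryption(),
)

# The connector expects the bare base64 body, without PEM headers.
pem_body = (
    pem_bytes.decode()
    .replace("-----BEGIN PRIVATE KEY-----", "")
    .replace("-----END PRIVATE KEY-----", "")
    .replace("\n", "")
)

sf_options["pem_private_key"] = pem_body  # replaces sfPassword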

Table Configuration

pipelines:
  - name: pg_to_snowflake
    source: pg_source
    destination: snowflake_target
    tables:
      - name: public.events
        target_name: STG_EVENTS
        replication_method: full
        batchsize: 50000

Write Strategy

Control how data is written to Snowflake:

      - name: public.events
        target_name: STG_EVENTS
        write_strategy: upsert       # append | replace | upsert | merge
        write_key: [id]              # required for upsert/merge
| Strategy | Snowflake Behavior |
| --- | --- |
| append | Insert via Spark connector (default for incremental) |
| replace | Overwrite table via Spark connector (default for full). Use if_exists: append to preserve the existing table |
| upsert | Write to a temp table, then MERGE INTO target USING temp ON ... WHEN MATCHED THEN UPDATE ... WHEN NOT MATCHED THEN INSERT ... |
| merge | Same as upsert for Snowflake |

Note: upsert/merge requires write_key. The loader creates a temp table, writes data there, executes a MERGE statement, then drops the temp table.
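
A sketch of that flow, continuing the earlier example and using the connector's Utils.runQuery helper from PySpark. The table name (STG_EVENTS), key column (id), and column list are illustrative; the SQL mkpipe actually generates may differ:

# Stage the incoming batch in a temp table.
temp_table = "STG_EVENTS_TMP"
(df.write
    .format(SNOWFLAKE_SOURCE)
    .options(**sf_options)
    .option("dbtable", temp_table)
    .mode("overwrite")
    .save())

# Merge the temp table into the target on the write_key, then clean up.
merge_sql = f"""
    MERGE INTO STG_EVENTS AS t
    USING {temp_table} AS s
    ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET t.val = s.val
    WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val)
"""
sf_utils = spark._jvm.net.snowflake.spark.snowflake.Utils
sf_utils.runQuery(sf_options, merge_sql)
sf_utils.runQuery(sf_options, f"DROP TABLE IF EXISTS {temp_table}")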


Write Parallelism & Throughput

The Snowflake loader uses the native Spark connector. Two parameters control write performance:

      - name: public.events
        target_name: STG_EVENTS
        replication_method: full
        batchsize: 50000        # rows per batch insert (default: 10000)
        write_partitions: 4     # coalesce DataFrame to N partitions before writing

How they work

  • batchsize: number of rows buffered before sending to Snowflake. Larger batches reduce round-trips and staging overhead.
  • write_partitions: calls coalesce(N) on the DataFrame before writing, controlling the number of concurrent write operations to Snowflake (see the sketch below).
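
A sketch of how write_partitions maps onto a Spark write, continuing the earlier example. How mkpipe forwards batchsize to its writer is internal to the loader, so only write_partitions appears as an explicit Spark call here:

# coalesce(4) bounds the number of concurrent writer tasks, which is
# what write_partitions: 4 controls in the table configuration.
(df.coalesce(4)
    .write
    .format(SNOWFLAKE_SOURCE)
    .options(**sf_options)
    .option("dbtable", "STG_EVENTS")
    .mode("overwrite")
    .save())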

Performance Notes

  • Snowflake Warehouse size is the primary write performance lever. A larger warehouse processes inserts faster regardless of partition count.
  • The Spark connector stages data internally before committing. Large batchsize (50,000+) reduces staging overhead.
  • For very large loads, consider using Snowflake's native COPY INTO from an external stage (S3/GCS) instead; it is significantly faster but requires additional infrastructure (see the sketch after this list).
  • write_partitions: 4–8 is a good default to balance throughput and connection count.
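
For reference, a hedged sketch of that alternative: a COPY INTO issued through the same sf_utils helper as in the upsert sketch. The stage name, path, and file format are placeholders, and creating and populating the external stage is the extra infrastructure mentioned above:

copy_sql = """
    COPY INTO MY_SCHEMA.STG_EVENTS
    FROM @my_external_stage/events/
    FILE_FORMAT = (TYPE = PARQUET)
    MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
"""
sf_utils.runQuery(sf_options, copy_sql)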

All Table Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | required | Source table name |
| target_name | string | required | Snowflake destination table name |
| replication_method | full / incremental | full | Replication strategy |
| batchsize | int | 10000 | Rows per batch insert |
| write_partitions | int | (unset) | Coalesce DataFrame to N partitions before writing |
| write_strategy | string | append (incremental) / replace (full) | One of append, replace, upsert, merge |
| write_key | list | (none) | Key columns for upsert/merge (required for those strategies) |
| if_exists | string | inherited from settings | replace (drop+create) or append (preserve table) |
| dedup_columns | list | (none) | Columns used for mkpipe_id hash deduplication |
| tags | list | [] | Tags for selective pipeline execution |
| pass_on_error | bool | false | Skip table on error instead of failing |
