Snowflake Data Validation

Snowflake Data Validation is a command-line tool and Python library for validating data migrations and ensuring data quality between source and target databases, with a focus on Snowflake and SQL Server.


This package is in Private Preview.

🚀 Features

  • Multi-level validation: schema, statistical metrics, and data integrity.
  • Database connectors: support for SQL Server and Snowflake.
  • User-friendly CLI: commands for automation and orchestration.
  • Flexible configuration: YAML-based validation workflows.
  • Detailed reporting: comprehensive reports and progress tracking.
  • Extensible: architecture ready for more database engines.

📦 Installation

pip install snowflake-data-validation

For SQL Server support:

pip install "snowflake-data-validation[sqlserver]"

For development and testing:

pip install "snowflake-data-validation[all]"
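
To confirm which version ended up in the active environment, the standard library can query the installed distribution's metadata. This is a minimal sketch: the helper name `installed_version` is hypothetical and not part of the package; only the distribution name comes from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "snowflake-data-validation") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found else "snowflake-data-validation is not installed")
```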

⚡ Quick Start

Run a validation from SQL Server to Snowflake:

snowflake-data-validation sqlserver run-validation --data-validation-config-file ./config/conf.yaml

Or using the short alias:

sdv sqlserver run-validation --data-validation-config-file ./config/conf.yaml
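
For automation, the same command can be launched from a Python script. The sketch below only wraps the CLI invocation shown above with `subprocess`; the function names `build_command` and `run_validation` are hypothetical, and what the exit code means is an assumption, since this page does not document it.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(config_file: str, executable: str = "sdv") -> List[str]:
    """Assemble the argv list for the run-validation command shown above."""
    return [
        executable,
        "sqlserver",
        "run-validation",
        "--data-validation-config-file",
        config_file,
    ]


def run_validation(config_file: str) -> Optional[int]:
    """Run the CLI and return its exit code, or None if `sdv` is not on PATH."""
    if shutil.which("sdv") is None:
        return None
    # check=False: the caller inspects the exit code instead of catching an exception.
    completed = subprocess.run(build_command(config_file), check=False)
    return completed.returncode


if __name__ == "__main__":
    code = run_validation("./config/conf.yaml")
    print("sdv not found on PATH" if code is None else f"exit code: {code}")
```

Passing the arguments as a list avoids shell quoting problems when the config path contains spaces.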

🛠️ Configuration

Create a YAML file to define your validation workflow:

source_platform: SqlServer
target_platform: Snowflake
output_directory_path: /path/to/output
parallelization: false

source_connection:
  mode: credentials
  host: "server"
  port: 1433
  username: "user"
  password: "password"
  database: "db"

target_connection:
  mode: name
  name: "SnowflakeConnection"

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: false

comparison_configuration:
  tolerance: 0.01

tables:
  - fully_qualified_name: database.schema.table1
    use_column_selection_as_exclude_list: false
    column_selection_list:
      - column1
      - column2
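
The key name `use_column_selection_as_exclude_list` suggests that `column_selection_list` acts as an include list when the flag is false and as an exclude list when it is true. The sketch below illustrates that reading only. It is an assumption based on the key names, not the tool's implementation, and `effective_columns` is a hypothetical helper.

```python
from typing import Iterable, List


def effective_columns(
    table_columns: Iterable[str],
    column_selection_list: Iterable[str],
    use_column_selection_as_exclude_list: bool,
) -> List[str]:
    """Return the columns that would be validated, preserving table order."""
    selected = set(column_selection_list)
    if use_column_selection_as_exclude_list:
        # Exclude mode: validate everything except the listed columns.
        return [c for c in table_columns if c not in selected]
    # Include mode: validate only the listed columns.
    return [c for c in table_columns if c in selected]


if __name__ == "__main__":
    columns = ["column1", "column2", "column3"]
    print(effective_columns(columns, ["column1", "column2"], False))
    # ['column1', 'column2']
    print(effective_columns(columns, ["column1", "column2"], True))
    # ['column3']
```

Names are compared exactly here; how the tool handles identifier case across SQL Server and Snowflake is not covered by this sketch.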

See the documentation for more advanced configuration examples.
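
To make the `tolerance: 0.01` setting more concrete, here is one plausible way a metrics comparison with a tolerance could work. It treats the tolerance as a relative difference, which is an assumption: this page does not say whether the tool applies it as a relative or an absolute bound. The function `mismatched_metrics` and the metric names are hypothetical.

```python
import math
from typing import Dict, List


def mismatched_metrics(
    source: Dict[str, float],
    target: Dict[str, float],
    tolerance: float = 0.01,
) -> List[str]:
    """Return names of metrics that are missing on one side or differ by more
    than `tolerance`, interpreted as a relative difference."""
    mismatches = []
    for name in sorted(source.keys() | target.keys()):
        if name not in source or name not in target:
            mismatches.append(name)
        elif not math.isclose(source[name], target[name], rel_tol=tolerance):
            mismatches.append(name)
    return mismatches


if __name__ == "__main__":
    source = {"row_count": 1000, "avg_amount": 52.10, "max_amount": 980.0}
    target = {"row_count": 1000, "avg_amount": 52.40, "max_amount": 1000.0}
    print(mismatched_metrics(source, target, tolerance=0.01))
    # ['max_amount']
```

In this example `avg_amount` differs by about 0.6 percent and passes, while `max_amount` differs by 2 percent and is reported.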


🏗️ Architecture

  • CLI: main_cli.py, sqlserver_cli.py, snowflake_cli.py
  • Connectors: connector/
  • Extractors: extractor/
  • Validation: validation/
  • Configuration: configuration/
  • Orchestrator: comparison_orchestrator.py

Project structure:

snowflake-data-validation/
├── src/snowflake/snowflake_data_validation/
│   ├── main_cli.py
│   ├── sqlserver/
│   ├── snowflake/
│   ├── connector/
│   ├── extractor/
│   ├── validation/
│   ├── configuration/
│   ├── utils/
│   └── comparison_orchestrator.py
├── docs/
├── tests/
└── config_files/

📊 Reports

  • Schema validation results
  • Statistical comparison metrics
  • Detailed error logs and recommendations

🤝 Contributing

We welcome contributions! See our Contributing Guide for details on how to collaborate, set up your development environment, and submit PRs.


📄 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.


🆘 Support


Developed with ❄️ by Snowflake
