Snowflake Data Validation
Snowflake Data Validation is a command-line tool and Python library for validating data migrations and ensuring data quality between source and target databases, with a focus on Snowflake and SQL Server.
This package is in Private Preview.
Features
- Multi-level validation: schema, statistical metrics, and data integrity.
- Database connectors: support for SQL Server and Snowflake.
- User-friendly CLI: commands for automation and orchestration.
- Flexible configuration: YAML-based validation workflows.
- Detailed reporting: comprehensive reports and progress tracking.
- Extensible: architecture ready for more database engines.
Installation
pip install snowflake-data-validation
For SQL Server support:
pip install "snowflake-data-validation[sqlserver]"
For development and testing:
pip install "snowflake-data-validation[all]"
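After installing, it can be useful to confirm which version ended up in the environment. The following is a minimal sketch using only the standard library; the distribution name comes from the pip commands above, and `installed_version` is a helper name made up for this example, not part of the package.

```python
from importlib import metadata
from typing import Optional

# Distribution name taken from the pip commands above.
DIST_NAME = "snowflake-data-validation"


def installed_version(dist_name: str = DIST_NAME) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else f"{DIST_NAME} is not installed")
```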
Quick Start
Run a validation from SQL Server to Snowflake:
snowflake-data-validation sqlserver run-validation --data-validation-config-file ./config/conf.yaml
Or using the short alias:
sdv sqlserver run-validation --data-validation-config-file ./config/conf.yaml
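For automation, the same command can be launched from a Python script. The sketch below only assembles the arguments shown above and hands them to `subprocess`; `build_command` and `run_validation` are hypothetical helper names, and nothing is assumed about the CLI's output or the meaning of its exit codes.

```python
import shutil
import subprocess
from typing import List


def build_command(config_file: str, executable: str = "sdv") -> List[str]:
    """Assemble the run-validation command shown in the Quick Start."""
    return [
        executable,
        "sqlserver",
        "run-validation",
        "--data-validation-config-file",
        config_file,
    ]


def run_validation(config_file: str, executable: str = "sdv") -> int:
    """Run the CLI and return whatever exit code it reports."""
    command = build_command(config_file, executable)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} was not found on PATH")
    # check=False: leave the handling of a non-zero exit code to the caller.
    return subprocess.run(command, check=False).returncode
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the configuration path contains spaces.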
Configuration
Create a YAML file to define your validation workflow:
source_platform: SqlServer
target_platform: Snowflake
output_directory_path: /path/to/output
parallelization: false

source_connection:
  mode: credentials
  host: "server"
  port: 1433
  username: "user"
  password: "password"
  database: "db"

target_connection:
  mode: name
  name: "SnowflakeConnection"

validation_configuration:
  schema_validation: true
  metrics_validation: true
  row_validation: false

comparison_configuration:
  tolerance: 0.01

tables:
  - fully_qualified_name: database.schema.table1
    use_column_selection_as_exclude_list: false
    column_selection_list:
      - column1
      - column2
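A quick sanity check before running a long validation can catch a misspelled or missing key early. The sketch below works on a configuration that has already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`). The key names come from the example above; treating these keys as required is an assumption of this sketch, not a statement about the tool's actual schema, and `find_config_problems` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Top-level keys that appear in the example configuration.
# Treating all of them as required is an assumption of this sketch.
EXPECTED_KEYS = (
    "source_platform",
    "target_platform",
    "output_directory_path",
    "source_connection",
    "target_connection",
    "validation_configuration",
    "tables",
)


def find_config_problems(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if key not in config]
    tables = config.get("tables") or []
    if "tables" in config and not tables:
        problems.append("tables: list is empty")
    for index, table in enumerate(tables):
        if not isinstance(table, dict) or "fully_qualified_name" not in table:
            problems.append(f"tables[{index}]: missing fully_qualified_name")
    return problems


example = {
    "source_platform": "SqlServer",
    "target_platform": "Snowflake",
    "output_directory_path": "/path/to/output",
    "source_connection": {"mode": "credentials", "host": "server", "port": 1433},
    "target_connection": {"mode": "name", "name": "SnowflakeConnection"},
    "validation_configuration": {"schema_validation": True},
    "tables": [{"fully_qualified_name": "database.schema.table1"}],
}
print(find_config_problems(example))
```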
See the documentation for more advanced configuration examples.
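The `use_column_selection_as_exclude_list` flag suggests that `column_selection_list` acts as an include list when the flag is false and as an exclude list when it is true. The sketch below illustrates that reading; it is inferred from the key names only, `resolve_columns` is a hypothetical helper, and the matching here is exact and case-sensitive, which may differ from how the tool treats identifiers.

```python
from typing import Iterable, List


def resolve_columns(
    table_columns: Iterable[str],
    column_selection_list: Iterable[str],
    use_column_selection_as_exclude_list: bool,
) -> List[str]:
    """Return the columns to validate, preserving the table's column order."""
    selection = set(column_selection_list)
    if use_column_selection_as_exclude_list:
        # Exclude mode: keep every column that is NOT in the list.
        return [col for col in table_columns if col not in selection]
    # Include mode: keep only the columns that ARE in the list.
    return [col for col in table_columns if col in selection]


columns = ["column1", "column2", "column3"]
print(resolve_columns(columns, ["column1", "column2"], False))
print(resolve_columns(columns, ["column1", "column2"], True))
```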
Architecture
- CLI: main_cli.py, sqlserver_cli.py, snowflake_cli.py
- Connectors: connector/
- Extractors: extractor/
- Validation: validation/
- Configuration: configuration/
- Orchestrator: comparison_orchestrator.py
Project structure:
snowflake-data-validation/
├── src/snowflake/snowflake_data_validation/
│   ├── main_cli.py
│   ├── sqlserver/
│   ├── snowflake/
│   ├── connector/
│   ├── extractor/
│   ├── validation/
│   ├── configuration/
│   ├── utils/
│   └── comparison_orchestrator.py
├── docs/
├── tests/
└── config_files/
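To make the split between connectors and the orchestrator more concrete, here is a purely illustrative sketch of how a per-engine component can sit behind a shared interface so that further engines plug in without changes to the comparison step. Every name in it (`RowCountSource`, `InMemorySource`, `compare_row_counts`) is hypothetical and does not reflect the package's actual classes or APIs.

```python
from typing import Dict, Protocol


class RowCountSource(Protocol):
    """Hypothetical interface; not the package's actual connector API."""

    def row_count(self, table: str) -> int:
        ...


class InMemorySource:
    """Stand-in for a database connector, backed by a dict for illustration."""

    def __init__(self, counts: Dict[str, int]) -> None:
        self._counts = counts

    def row_count(self, table: str) -> int:
        return self._counts[table]


def compare_row_counts(source: RowCountSource, target: RowCountSource, table: str) -> bool:
    """Orchestration step: the same check works for any pair of sources."""
    return source.row_count(table) == target.row_count(table)


sql_server = InMemorySource({"database.schema.table1": 1000})
snowflake = InMemorySource({"database.schema.table1": 1000})
print(compare_row_counts(sql_server, snowflake, "database.schema.table1"))
```

Because the comparison depends only on the interface, supporting another engine in this sketch means writing one more class with a `row_count` method.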
Reports
- Schema validation results
- Statistical comparison metrics
- Detailed error logs and recommendations
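The `tolerance: 0.01` setting in the configuration hints at how statistical metrics might be compared between source and target. The sketch below shows one possible interpretation, a relative tolerance applied with `math.isclose`; whether the tool uses a relative or an absolute tolerance is not stated here, so that choice is an assumption. The metric names and `compare_metrics` are hypothetical.

```python
import math
from typing import Dict, List


def compare_metrics(
    source: Dict[str, float],
    target: Dict[str, float],
    tolerance: float = 0.01,
) -> List[str]:
    """Return one message per metric that differs beyond the tolerance."""
    mismatches = []
    for name in sorted(set(source) | set(target)):
        if name not in source or name not in target:
            mismatches.append(f"{name}: present on one side only")
        # Assumption: tolerance is relative to the larger of the two values.
        elif not math.isclose(source[name], target[name], rel_tol=tolerance):
            mismatches.append(f"{name}: source={source[name]} target={target[name]}")
    return mismatches


source_metrics = {"avg": 100.0, "max": 250.0}
target_metrics = {"avg": 100.5, "max": 260.0}
print(compare_metrics(source_metrics, target_metrics))
```

With these sample values, `avg` differs by about 0.5% and passes, while `max` differs by roughly 4% and is reported.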
Contributing
We welcome contributions! See our Contributing Guide for details on how to collaborate, set up your development environment, and submit PRs.
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Support
- Documentation: Full documentation
- Issues: GitHub Issues
Developed with ❄️ by Snowflake
File details
Details for the file snowflake_data_validation-0.0.15.tar.gz.
File metadata
- Download URL: snowflake_data_validation-0.0.15.tar.gz
- Upload date:
- Size: 217.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a36018b24e3d22477c8ddc2fc1ad75204769f89acc6628280db716748a69d05c |
| MD5 | e80d33e76881b71d82d7a89b83a88302 |
| BLAKE2b-256 | 1ad47e5f2ee964cdf0fc09d121520355887b526c804b1da58c90269686d8b327 |
File details
Details for the file snowflake_data_validation-0.0.15-py3-none-any.whl.
File metadata
- Download URL: snowflake_data_validation-0.0.15-py3-none-any.whl
- Upload date:
- Size: 255.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8b85b398a4d42bd4192160a20ed393998134ec0572b840ca1363d94816523881 |
| MD5 | b2b9d4e8d3db2ce6499a0dd4be2d39c0 |
| BLAKE2b-256 | bc4ba7d638daea93ef9542df0969d794e197e50ba87c54557195a43b73d0e26b |