Run Snowflake SQL locally, powered by DuckDB - lightweight in-memory SQL engine for development and testing
Project description
❄️🦆 SnowDuck
Run Snowflake SQL locally, powered by DuckDB
SnowDuck is a lightweight, in-memory SQL engine that emulates Snowflake's behavior for development and testing. Write and test Snowflake SQL locally without cloud access or costs.
Recent Improvements:
- ✅ Zero linting errors - Professional, clean codebase
- ✅ Modern CI/CD - PyPI Trusted Publishers with package attestations
- ✅ 128 tests passing in <2s - Fast, comprehensive test suite
- ✅ Production-ready for development & testing
Why SnowDuck?
- 🚀 Fast Development - Test SQL queries instantly without waiting for cloud connections
- 💰 Zero Cloud Costs - Develop and test locally without Snowflake compute charges
- 🧪 Easy Testing - Mock Snowflake databases for unit tests and CI/CD pipelines
- ⚡ Lightning Fast - Powered by DuckDB's in-memory execution engine
- 🔌 Drop-in Compatible - Uses Snowflake's connector interface - just patch and go
Features
| Feature | Support |
|---|---|
| DDL Operations | ✅ CREATE/DROP DATABASE, SCHEMA, TABLE |
| String Functions | ✅ UPPER, LOWER, CONCAT, LENGTH, SUBSTRING |
| Aggregate Functions | ✅ COUNT, SUM, AVG, MIN, MAX, GROUP BY |
| Window Functions | ✅ ROW_NUMBER, RANK, DENSE_RANK, PARTITION BY |
| Session Variables | ✅ SET/SELECT $variable syntax |
| Information Schema | ✅ Query metadata (databases, tables, columns) |
| Advanced SQL | ✅ CTEs, JOINs, subqueries, CASE statements |
Note: SnowDuck is experimental. Use for development/testing only, not production workloads.
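As an illustration of the Session Variables row above, Snowflake's `SET x = …` / `SELECT $x` syntax amounts to substituting stored values into later statements. The stdlib sketch below shows the idea only; the helper name `substitute_session_vars` is hypothetical and this is not SnowDuck's actual implementation:

```python
import re

def substitute_session_vars(sql: str, session: dict) -> str:
    """Replace $name references with the session's stored values.

    Illustrative sketch only - a real emulator (like SnowDuck's) must
    also handle quoting, identifiers, and type edge cases.
    """
    def repl(match):
        name = match.group(1).lower()
        if name not in session:
            raise KeyError(f"Session variable ${name} is not defined")
        value = session[name]
        # Quote strings; render numbers as literals
        return f"'{value}'" if isinstance(value, str) else str(value)
    return re.sub(r"\$(\w+)", repl, sql)

session = {}
session["min_salary"] = 80000  # emulates: SET min_salary = 80000

print(substitute_session_vars(
    "SELECT name FROM employees WHERE salary > $min_salary", session
))
# SELECT name FROM employees WHERE salary > 80000
```

With SnowDuck itself, the same effect is available through the patched connector's `SET`/`$variable` support listed in the feature table.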
Quick Start
Installation
```shell
# Using uv (recommended)
uv pip install snowduck

# Or using pip
pip install snowduck
```
Usage
```python
import snowflake.connector
from snowduck import start_patch_snowflake

# Patch the Snowflake connector to use DuckDB
start_patch_snowflake()

# Use Snowflake connector as normal - it's now backed by DuckDB!
with snowflake.connector.connect() as conn:
    cursor = conn.cursor()

    # Create a database
    cursor.execute("CREATE DATABASE my_database")
    cursor.execute("USE DATABASE my_database")

    # Create a table
    cursor.execute("""
        CREATE TABLE employees (
            id INTEGER,
            name VARCHAR,
            salary INTEGER
        )
    """)

    # Insert data
    cursor.execute("""
        INSERT INTO employees VALUES
            (1, 'Alice', 95000),
            (2, 'Bob', 75000),
            (3, 'Carol', 105000)
    """)

    # Query with Snowflake SQL
    cursor.execute("""
        SELECT
            name,
            salary,
            RANK() OVER (ORDER BY salary DESC) AS rank
        FROM employees
    """)

    for row in cursor.fetchall():
        print(f"{row[0]}: ${row[1]:,} (Rank: {row[2]})")
```
Output:
```
Carol: $105,000 (Rank: 1)
Alice: $95,000 (Rank: 2)
Bob: $75,000 (Rank: 3)
```
Data Persistence Options
SnowDuck supports two storage modes:
1. In-Memory (Default): Fast, isolated, data lost on exit

```python
start_patch_snowflake()  # or start_patch_snowflake(db_file=':memory:')
```

2. File-Based: Persistent, data survives restarts, perfect for testing

```python
# Data persists to file and survives program restarts
start_patch_snowflake(db_file='test_data.duckdb')

with snowflake.connector.connect() as conn:
    cursor = conn.cursor()
    cursor.execute("CREATE TABLE employees (...)")
    # Data saved to test_data.duckdb

# Later - data still exists!
with snowflake.connector.connect() as conn:
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM employees")  # ✅ Works!
```

3. Fresh Start: Reset clears existing data (great for notebooks!)

```python
# Deletes existing database file for a clean slate
start_patch_snowflake(db_file='demo.duckdb', reset=True)
```
Use file-based storage for:
- 🧪 Test fixtures that persist across test runs
- 🔄 Shared test data across multiple test files
- 🚀 CI/CD pipelines with reproducible datasets
- 📊 Development with sample data you want to keep
Data Seeding Made Easy
SnowDuck includes seed_table() for effortless test data creation:
```python
import pandas as pd
import snowflake.connector
from snowduck import seed_table, start_patch_snowflake

start_patch_snowflake()

with snowflake.connector.connect() as conn:
    # From dict of lists (easiest!)
    seed_table(conn, 'customers', {
        'id': [1, 2, 3],
        'name': ['Acme Corp', 'TechStart', 'DataCo'],
        'revenue': [1000000, 50000, 800000]
    })

    # From pandas DataFrame
    df = pd.read_csv('test_data.csv')
    seed_table(conn, 'orders', df)

    # From list of dicts
    seed_table(conn, 'products', [
        {'id': 1, 'name': 'Widget', 'price': 9.99},
        {'id': 2, 'name': 'Gadget', 'price': 19.99}
    ])
```
Why seed_table() is awesome:
- ✅ One line to create and populate tables
- ✅ Automatic data type inference
- ✅ Handles NULL values, timestamps, special characters
- ✅ Drops existing table by default (perfect for test fixtures)
- ✅ Works with dicts, DataFrames, or list of dicts
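The "automatic data type inference" point above is conceptually simple: look at each column's Python values and pick a matching SQL type. A rough stdlib-only sketch of the idea (the helper `infer_column_types` is hypothetical, not SnowDuck's actual code, which also covers timestamps and more types):

```python
def infer_column_types(data: dict) -> dict:
    """Map each column of a dict-of-lists to a plausible SQL type.

    Hypothetical sketch of the inference idea used by tools like
    seed_table(); real implementations handle many more cases.
    """
    def sql_type(values):
        # Ignore NULLs when deciding the type
        non_null = [v for v in values if v is not None]
        if not non_null:
            return "VARCHAR"
        if all(isinstance(v, bool) for v in non_null):
            return "BOOLEAN"
        if all(isinstance(v, int) and not isinstance(v, bool) for v in non_null):
            return "INTEGER"
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null):
            return "DOUBLE"
        return "VARCHAR"

    return {col: sql_type(vals) for col, vals in data.items()}

print(infer_column_types({
    'id': [1, 2, 3],
    'name': ['Acme Corp', None, 'DataCo'],
    'price': [9.99, 19.99, 5.0],
}))
# {'id': 'INTEGER', 'name': 'VARCHAR', 'price': 'DOUBLE'}
```

Note how the NULL in `name` does not disturb the inferred type, mirroring the "handles NULL values" bullet above.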
More Examples
Check out examples/notebook.ipynb for comprehensive examples including:
- Database and schema management
- Table operations and data manipulation
- String and aggregate functions
- Window functions and analytics
- Session variables
- Information schema queries
- Complex CTEs and subqueries
Use Cases
1. Local Development
```python
# Develop and test Snowflake SQL without cloud access
from snowduck import start_patch_snowflake
start_patch_snowflake()

# Your existing Snowflake code works unchanged!
import snowflake.connector
conn = snowflake.connector.connect()
```
2. Unit Testing
Recommended: Use the @mock_snowflake decorator for clean, isolated tests
```python
import pytest
import snowflake.connector
from snowduck import mock_snowflake, seed_table

# Simple unit test - automatic setup and cleanup
@mock_snowflake
def test_query_execution():
    """Each test gets a fresh, isolated in-memory database."""
    conn = snowflake.connector.connect()
    cursor = conn.cursor()
    cursor.execute("CREATE TABLE users (id INT, name VARCHAR)")
    cursor.execute("INSERT INTO users VALUES (1, 'Alice')")
    cursor.execute("SELECT * FROM users")
    assert cursor.fetchone() == (1, 'Alice')
    # ✅ Automatic cleanup - no side effects!

# Test with seed data
@mock_snowflake
def test_with_seed_data():
    """Use seed_table() for easy test data creation."""
    conn = snowflake.connector.connect()
    # One line to create and populate table!
    seed_table(conn, 'customers', {
        'id': [1, 2, 3],
        'name': ['Acme Corp', 'TechStart', 'DataCo'],
        'tier': ['enterprise', 'startup', 'enterprise']
    })
    cursor = conn.cursor()
    cursor.execute("SELECT COUNT(*) FROM customers WHERE tier = 'enterprise'")
    assert cursor.fetchone()[0] == 2
```
For shared fixtures across tests, use conftest.py:
```python
# conftest.py
import pytest
import snowflake.connector
from snowduck import patch_snowflake

@pytest.fixture
def conn():
    """Per-test isolation with automatic cleanup."""
    with patch_snowflake():
        conn = snowflake.connector.connect()
        yield conn
        # ✅ Automatic cleanup on exit

def test_my_feature(conn):
    cursor = conn.cursor()
    cursor.execute("SELECT 1")
    assert cursor.fetchone()[0] == 1
```
Testing Patterns:
- ✅ Simple tests: Use the `@mock_snowflake` decorator
- ✅ Shared setup: Use a fixture with the `patch_snowflake()` context manager
- ✅ Isolated tests: In-memory (default) - fresh DB per test
- ✅ Persistent data: File-based - `start_patch_snowflake(db_file='test.duckdb')`
3. CI/CD Integration
```yaml
# .github/workflows/test.yml
- name: Test SQL transformations
  run: |
    pip install snowduck pytest
    pytest tests/  # Uses SnowDuck instead of real Snowflake
```
4. REST API Server
```shell
# Install with server extras
uv pip install snowduck[server]

# Start the server
just serve
# Or: uvicorn snowduck.server:app --reload
```

The server runs at http://localhost:8000 and provides:
- SQL query execution via REST API
- Results in Arrow IPC format
- Multi-session support with session management
- Compatibility with Snowflake REST API clients
Architecture
SnowDuck works by:
- Patching - Intercepts Snowflake connector calls via `start_patch_snowflake()`
- Translating - Converts Snowflake SQL dialect to DuckDB-compatible SQL
- Executing - Runs queries in DuckDB's fast in-memory engine
- Emulating - Mimics Snowflake's information schema and metadata
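The "Translating" step can be illustrated with a toy rewrite rule. This stdlib sketch only shows the shape of the problem (for example, Snowflake's `NVL` maps to the standard `COALESCE`, and `IFF` to DuckDB's `IF`); the `translate` helper is hypothetical and SnowDuck's real translator parses SQL properly rather than pattern-matching it:

```python
import re

# Toy rewrite table: Snowflake function name -> DuckDB equivalent.
# Only name-for-name rewrites; real translation also restructures arguments.
REWRITES = {"NVL": "COALESCE", "IFF": "IF"}

def translate(sql: str) -> str:
    """Rewrite known Snowflake function names into DuckDB's dialect."""
    for snow, duck in REWRITES.items():
        # Match the function name only when directly followed by "("
        sql = re.sub(rf"\b{snow}\s*\(", f"{duck}(", sql, flags=re.IGNORECASE)
    return sql

print(translate(
    "SELECT NVL(nickname, name), IFF(salary > 90000, 'high', 'low') FROM employees"
))
# SELECT COALESCE(nickname, name), IF(salary > 90000, 'high', 'low') FROM employees
```

A production translator must also handle strings, comments, nested calls, and functions whose argument shapes differ between dialects, which is why the real pipeline sits between the patch layer and the DuckDB engine in the diagram below.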
```
┌─────────────────────┐
│  Your Application   │
│  (Snowflake code)   │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│   SnowDuck Patch    │  ← Intercepts connector calls
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│   SQL Translator    │  ← Snowflake → DuckDB dialect
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│    DuckDB Engine    │  ← Fast in-memory execution
└─────────────────────┘
```
Development
Prerequisites: Python 3.11+, uv, just
```shell
# Clone and setup
git clone https://github.com/hupe1980/snowduck.git
cd snowduck
uv sync

# Run tests (128 tests, <2s)
just test

# Run all quality checks
just check

# See all commands
just --list
```
Testing Best Practices
When writing tests with SnowDuck:

1. Use `@mock_snowflake` for unit tests - Clean, automatic cleanup

```python
from snowduck import mock_snowflake

@mock_snowflake
def test_my_feature():
    conn = snowflake.connector.connect()
    # Test code here - fresh DB, auto cleanup!
```

2. Use `seed_table()` for test data - One-line table creation

```python
seed_table(conn, 'test_table', {'id': [1, 2], 'name': ['a', 'b']})
```

3. Choose the right storage mode:
   - In-memory (default): Fast, isolated tests
   - File-based: Shared fixtures, persistent data

```python
# Isolated: Each test gets fresh DB
@mock_snowflake  # Uses in-memory by default
def test_isolated():
    ...

# Shared: Data persists across tests
@pytest.fixture(scope="session")
def shared_data():
    start_patch_snowflake(db_file='fixtures.duckdb')
```

4. Avoid anti-patterns:
   - ❌ Don't mix `start_patch_snowflake()` with `@mock_snowflake`
   - ❌ Don't forget cleanup with manual `start_patch_snowflake()`
   - ✅ Use decorators or context managers for automatic cleanup
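The reason decorators and context managers are safer than a manual global patch is that they restore state even when the test body raises. A minimal stdlib sketch of the pattern (the `connector` namespace and `patch_connect` helper here are hypothetical stand-ins, not SnowDuck's internals):

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Hypothetical stand-in for a connector module exposing connect()
connector = SimpleNamespace(connect=lambda: "real connection")

@contextmanager
def patch_connect(fake):
    """Swap in a fake connect() and always restore the original on exit."""
    original = connector.connect
    connector.connect = fake
    try:
        yield
    finally:
        # Runs even if the body raises - this is the automatic cleanup.
        connector.connect = original

with patch_connect(lambda: "duckdb-backed connection"):
    print(connector.connect())  # duckdb-backed connection

print(connector.connect())  # real connection
```

A bare `start_patch_snowflake()` call has no such `finally` on your behalf, which is why forgetting cleanup leaks the patch into later tests.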
Project Status
- ✅ 128 tests (100% passing, <2s execution)
- ✅ Zero linting errors - Professional, clean codebase
- ✅ 90%+ Snowflake compatibility - DDL, DML, functions, CTEs, window functions
- ✅ Modern CI/CD - PyPI trusted publishers with package attestations
- ✅ Production-ready for development & testing use cases
- 🚧 Experimental - Not a production Snowflake replacement
What's Next:
- Type safety improvements (mypy strict mode)
- Performance monitoring and benchmarks
- Enhanced error messages with helpful suggestions
- API documentation with Sphinx
Roadmap
- Additional Snowflake functions (JSON, ARRAY, etc.)
- Stored procedure emulation
- External table support
- Enhanced security features
- Performance benchmarks vs. Snowflake
Contributing
Contributions welcome! We'd love help with:
- 🐛 Bug reports and fixes
- ✨ New Snowflake function implementations
- 📖 Documentation improvements
- 🧪 Additional test coverage
Development Setup:
```shell
git clone https://github.com/hupe1980/snowduck.git
cd snowduck
uv sync
just test  # Run 128 tests in <2s
```
Code Quality:
- Zero linting errors (ruff)
- Comprehensive test coverage
- Modern CI/CD with package attestations
License
MIT License - see LICENSE for details.
**Built with ❄️ and 🦆**
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file snowduck-0.0.3.tar.gz.
File metadata
- Download URL: snowduck-0.0.3.tar.gz
- Upload date:
- Size: 179.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `3621f642f5ed94a73955d3223be46178ef1ac7f0b1c6259f52dcf4ea5dcd082f` |
| MD5 | `050ab82f08e4e7153afb59a026a6effb` |
| BLAKE2b-256 | `a015de532f4cece74a309766dbdda3661a3b05591a37b59123241d11ce522185` |
Provenance
The following attestation bundles were made for snowduck-0.0.3.tar.gz:
Publisher: release.yml on hupe1980/snowduck

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: snowduck-0.0.3.tar.gz
- Subject digest: 3621f642f5ed94a73955d3223be46178ef1ac7f0b1c6259f52dcf4ea5dcd082f
- Sigstore transparency entry: 844579578
- Sigstore integration time:
- Permalink: hupe1980/snowduck@12ac2d2a574a17282f81a29c3946395971c50220
- Branch / Tag: refs/tags/v0.0.3
- Owner: https://github.com/hupe1980
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@12ac2d2a574a17282f81a29c3946395971c50220
- Trigger Event: push
File details
Details for the file snowduck-0.0.3-py3-none-any.whl.
File metadata
- Download URL: snowduck-0.0.3-py3-none-any.whl
- Upload date:
- Size: 48.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `9520b46d68f998da9c5f29199c09a9d845db0c8ebf9a443cfaca8f62ac1ec4e2` |
| MD5 | `670ee391599612da0e16e5ecb2731cef` |
| BLAKE2b-256 | `8466308d3aad4c7cdf6d44641f01138a834e5579591140352c80398ffceb9c59` |
Provenance
The following attestation bundles were made for snowduck-0.0.3-py3-none-any.whl:
Publisher: release.yml on hupe1980/snowduck

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: snowduck-0.0.3-py3-none-any.whl
- Subject digest: 9520b46d68f998da9c5f29199c09a9d845db0c8ebf9a443cfaca8f62ac1ec4e2
- Sigstore transparency entry: 844579581
- Sigstore integration time:
- Permalink: hupe1980/snowduck@12ac2d2a574a17282f81a29c3946395971c50220
- Branch / Tag: refs/tags/v0.0.3
- Owner: https://github.com/hupe1980
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@12ac2d2a574a17282f81a29c3946395971c50220
- Trigger Event: push