Db2 dialect plugin for SQLGlot
Project description
Db2 SQLGlot Dialect Plugin
A Db2 dialect plugin for SQLGlot - a powerful SQL parser, transpiler, and optimizer.
Requirements
- Python: 3.10 - 3.12
- SQLGlot: 30.8.0 - 30.9.x (compatible with SQLMesh and other tools using SQLGlot 30.8.0)
Features
- Full Db2 SQL syntax support
- Cross-dialect transpilation (Db2 ↔ PostgreSQL, MySQL, Snowflake, etc.)
- Type mapping (BOOLEAN → SMALLINT, NCHAR/NVARCHAR support, etc.)
- Db2-specific functions (POSSTR, VARCHAR_FORMAT, DAYOFWEEK, DAYOFYEAR)
- FETCH FIRST syntax support
- NULL ordering support
- Window functions: Full support for RANK, DENSE_RANK, PERCENT_RANK, ROW_NUMBER, PARTITION BY
- SQLMesh compatibility: Automatic conversion of SQLMesh-specific functions (DATE_STR_TO_DATE, TIME_STR_TO_TIME, StrToTime) for SEED model support
✅ Test Results
All tests passing: 12 tests with 87% code coverage
$ python3 -m pytest tests/test_db2_dialect.py -v
============================= test session starts ==============================
tests/test_db2_dialect.py ............ [100%]
============================== 12 passed in 0.12s ==============================
Code Coverage:
Name Stmts Miss Cover
----------------------------------------------
db2_sqlglot/__init__.py 8 2 75%
db2_sqlglot/dialect.py 13 0 100%
db2_sqlglot/generator.py 57 9 84%
db2_sqlglot/parser.py 6 0 100%
----------------------------------------------
TOTAL 84 11 87%
Test Coverage
The test suite validates:
- ✅ Basic SQL: SELECT, INSERT, UPDATE, DELETE, CREATE/DROP TABLE
- ✅ Type conversions: INTEGER→INT, NCHAR→GRAPHIC, NVARCHAR→VARGRAPHIC, DBCLOB→CLOB
- ✅ Functions: POSSTR, VARCHAR_FORMAT, DAYOFWEEK, DAYOFYEAR, GREATEST, LEAST
- ✅ Boolean handling: TRUE/FALSE → 1/0
- ✅ Date/Time: CURRENT DATE, CURRENT TIMESTAMP, DATEDIFF→DAYS
- ✅ FETCH FIRST: Pagination with FETCH FIRST n ROWS ONLY
- ✅ OFFSET: OFFSET n ROWS syntax
- ✅ NULL ordering: NULLS FIRST, NULLS LAST
- ✅ Joins: INNER, LEFT, RIGHT joins
- ✅ Aggregations: COUNT, SUM, AVG, MIN, MAX
- ✅ Subqueries & CTEs: WITH clause support
- ✅ CASE expressions: Simple and searched CASE
- ✅ Operators: IN, BETWEEN, LIKE, IS NULL
- ✅ Set operations: UNION, UNION ALL
- ✅ Variable tokens: @var syntax
- ✅ Typed division: Proper numeric division handling
- ✅ SQLMesh compatibility: DATE_STR_TO_DATE, TIME_STR_TO_TIME, StrToTime function conversions
Installation
From Source (Development)
cd db2-sqlglot-dialect
pip install -e .
From PyPI
pip install db2-sqlglot-dialect
Usage
Once installed, the Db2 dialect is automatically discovered by SQLGlot:
from sqlglot import transpile
# Transpile from PostgreSQL to Db2 (LIMIT → FETCH FIRST)
result = transpile(
"SELECT * FROM table1 LIMIT 10",
read="postgres",
write="db2"
)
print(result[0])
# Output: SELECT * FROM table1 FETCH FIRST 10 ROWS ONLY
# Transpile from Db2 to PostgreSQL
# Note: FETCH FIRST is preserved (PostgreSQL supports it natively - SQL standard)
result = transpile(
"SELECT * FROM table1 FETCH FIRST 10 ROWS ONLY",
read="db2",
write="postgres"
)
print(result[0])
# Output: SELECT * FROM table1 FETCH FIRST 10 ROWS ONLY
# Both LIMIT and FETCH FIRST work in PostgreSQL
# If you need LIMIT specifically, use it in the source query
result = transpile(
"SELECT * FROM table1 LIMIT 10",
read="db2",
write="postgres"
)
print(result[0])
# Output: SELECT * FROM table1 LIMIT 10
Supported Features
Data Types
Native Db2 Types (Perfect Roundtrip)
- Standard types: INTEGER, BIGINT, SMALLINT, DECIMAL, VARCHAR, CHAR, DATE, TIMESTAMP, CLOB, BLOB, BOOLEAN
- Db2-specific types: GRAPHIC, VARGRAPHIC, DBCLOB, DATALINK, ROWID, DECFLOAT, XML
Cross-Dialect Type Conversions
When transpiling from other databases to Db2:
- ✅ TEXT → CLOB
- ✅ BYTEA/BINARY → BLOB
- ✅ TINYINT → SMALLINT
- ✅ TIMESTAMPTZ → TIMESTAMP
- ⚠️ SERIAL/BIGSERIAL → Preserved as-is (may need manual conversion)
- ⚠️ LONGTEXT/MEDIUMTEXT → Preserved as-is (may need manual conversion)
Note: Unsupported or dialect-specific constructs may be preserved as-is during transpilation. This behavior prioritizes syntax preservation over potentially lossy or incorrect transformations. Users should review and adjust preserved types as needed for their specific Db2 version and requirements.
Functions
POSSTR(haystack, needle)- String positionVARCHAR_FORMAT(timestamp, format)- Time to string conversionDAYOFWEEK(date)- Extract day of weekDAYOFYEAR(date)- Extract day of yearCURRENT DATE,CURRENT TIMESTAMP- Date arithmetic with
+and-
SQL Features
FETCH FIRST n ROWS ONLYsyntaxOFFSET n ROWSsyntax- NULL ordering (
NULLS FIRST,NULLS LAST) - Typed division
- Variable tokens (
@var)
Window Functions
Full support for SQL window functions commonly used in analytics and data transformation:
- RANK() - Assigns a rank to each row within a partition
- DENSE_RANK() - Like RANK() but without gaps in ranking
- PERCENT_RANK() - Relative rank of a row within a partition
- ROW_NUMBER() - Sequential number for each row
- PARTITION BY - Divides result set into partitions
- ORDER BY - Defines ordering within window
Example:
-- SQLMesh model with window functions
SELECT
customer_id,
total_revenue,
RANK() OVER (ORDER BY total_revenue DESC) AS revenue_rank,
DENSE_RANK() OVER (ORDER BY total_revenue DESC) AS dense_rank,
PERCENT_RANK() OVER (ORDER BY total_revenue DESC) AS percent_rank
FROM customer_stats
All window functions work seamlessly with Db2 and are fully compatible with SQLMesh models.
SQLMesh Integration
This dialect includes full support for SQLMesh:
SEED Models:
- DATE_STR_TO_DATE() → Automatically converted to Db2's
DATE()function - TIME_STR_TO_TIME() → Automatically converted to Db2's
TIME()function - StrToTime() → Automatically converted to Db2's
TIMESTAMP()function
Analytics Models:
- Full window function support (RANK, DENSE_RANK, PERCENT_RANK, ROW_NUMBER)
- CTE (Common Table Expressions) support
- Complex aggregations with GROUP BY
These features enable SQLMesh to work seamlessly with Db2 for both data loading (SEED models) and transformation (analytics models).
Example:
# SQLMesh generates this for SEED models:
SELECT DATE_STR_TO_DATE('2024-01-15') FROM table
# Automatically transpiled to Db2:
SELECT DATE('2024-01-15') FROM table
CI/CD
GitHub Actions Workflows
This project includes two automated CI/CD workflows:
-
Unit Tests (
.github/workflows/unit-tests.yml)- Triggers: Push to main, Pull Requests, Manual dispatch
- Platforms: Ubuntu, macOS, Windows
- Python versions: 3.10, 3.11, 3.12
- Features:
- Concurrency control (cancels outdated runs)
- Unit tests with coverage reporting
- Code quality checks (ruff, flake8)
- Codecov integration
-
Build and Publish Release (
.github/workflows/build_release.yaml)- Triggers:
- Version tag push (e.g.,
v1.0.1) - Publishes to Test PyPI only - Manual workflow dispatch (with choice: Test PyPI or PyPI, plus optional ref)
- Version tag push (e.g.,
- Features:
- Authorized release gating (only ShubhamKapoor992 and amitkumar293)
- Builds distribution packages (wheel + sdist)
- Tests installation on multiple platforms before publish
- Tag push: Automatically publishes to Test PyPI for testing
- Production release: Requires manual workflow dispatch to publish to PyPI
- Supports approval-gated publishing through environment protection rules
- Triggers:
Creating a Release
Step 1: Update Version
# Edit pyproject.toml
version = "1.0.1"
# Commit and push
git commit -am "Bump version to 1.0.1"
git push origin main
Step 2: Trigger the Release Workflow
You can trigger the release workflow in either of these ways:
Option A: Push a Version Tag
git tag v1.0.1
git push origin v1.0.1
This triggers .github/workflows/build_release.yaml.
Option B: Run Manually from GitHub Actions
- Go to Actions tab → Build and Publish Release
- Click "Run workflow"
- Select:
- Where to publish? →
test-pypiorpypi - Git ref → tag or branch (for example,
v1.0.1ormain)
- Where to publish? →
- Click "Run workflow"
Step 3: What Happens Next
When you push a tag (e.g., v1.0.1):
- Workflow checks authorization (only ShubhamKapoor992 and amitkumar293)
- Builds the package
- Validates with
twine check - Tests wheel installation on Ubuntu, macOS, Windows
- Publishes to Test PyPI automatically for testing
To publish to production PyPI:
- Go to Actions tab → Build and Publish Release
- Click "Run workflow"
- Select:
- Where to publish? →
pypi - Git ref → tag (e.g.,
v1.0.1)
- Where to publish? →
- Click "Run workflow"
- Approve if environment protection is configured
Step 4: Verify Installation
Test PyPI (after tag push):
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ db2-sqlglot-dialect
Production PyPI (after manual release):
pip install db2-sqlglot-dialect
Why This Approach Works Well
✅ Authorized releases only: Release workflow checks allowed GitHub users ✅ Build verification first: Package is built and validated before publish ✅ Install verification: Wheel is tested before publish ✅ Approval support: GitHub environments can require approval before publish ✅ Flexible triggering: Supports both version tags and manual dispatch
Development
Running Tests
pytest tests/
Running Tests with Coverage
pytest tests/ -v --cov=db2_sqlglot --cov-report=term --cov-report=html
Code Quality
The project uses ruff for formatting and linting:
# Run formatting
ruff format .
# Run linting
ruff check .
# Run formatting and repository checks
pre-commit run --all-files
Contributors should always run pre-commit run --all-files before opening a pull request.
Project Structure
db2-sqlglot-dialect/
├── db2_sqlglot/
│ ├── __init__.py
│ ├── dialect.py # Main dialect class
│ ├── generator.py # SQL generation logic
│ └── parser.py # SQL parsing logic
├── tests/
│ └── test_db2_dialect.py
├── pyproject.toml # Modern Python packaging (PEP 517/518/621)
└── README.md
Note: This project uses modern Python packaging with pyproject.toml only (no setup.py needed). Entry points are defined in pyproject.toml and work seamlessly with SQLGlot's plugin discovery system.
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Run
pre-commit run --all-files - Ensure all tests pass and code passes linting
- Submit a pull request
All pull requests will automatically run:
- Linting checks (ruff, flake8)
- Unit tests across multiple Python versions and platforms
- Code quality validation
Support
For issues and questions:
- GitHub Issues: https://github.com/IBM/db2-sqlglot-plugin/issues
Credits
Built on top of SQLGlot by Toby Mao.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file db2_sqlglot_dialect-1.0.0.tar.gz.
File metadata
- Download URL: db2_sqlglot_dialect-1.0.0.tar.gz
- Upload date:
- Size: 23.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0a7a014114f23b6fc763342620aa3d351f5b4fd4f8232fa05c8e04049d151e94
|
|
| MD5 |
1e2e19a3ed8026d84b4010d736071939
|
|
| BLAKE2b-256 |
85e946cd8774e95c4718a74317c15069533154aabb33fe810026d0573e2cb3ab
|
Provenance
The following attestation bundles were made for db2_sqlglot_dialect-1.0.0.tar.gz:
Publisher:
build_release.yaml on IBM/db2-sqlglot-plugin
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
db2_sqlglot_dialect-1.0.0.tar.gz -
Subject digest:
0a7a014114f23b6fc763342620aa3d351f5b4fd4f8232fa05c8e04049d151e94 - Sigstore transparency entry: 1935540216
- Sigstore integration time:
-
Permalink:
IBM/db2-sqlglot-plugin@f41ca0e929c4c15c7db5a70392c4e37127fba623 -
Branch / Tag:
refs/tags/v1.0.0 - Owner: https://github.com/IBM
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
build_release.yaml@f41ca0e929c4c15c7db5a70392c4e37127fba623 -
Trigger Event:
workflow_dispatch
-
Statement type:
File details
Details for the file db2_sqlglot_dialect-1.0.0-py3-none-any.whl.
File metadata
- Download URL: db2_sqlglot_dialect-1.0.0-py3-none-any.whl
- Upload date:
- Size: 17.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8d1ca15cea5a015f87032e4f1b0aa5f6663b4349bb6c31afdddceda30444c5cf
|
|
| MD5 |
20fddcdbadb85e8c8e07ec2a8cbd1a7d
|
|
| BLAKE2b-256 |
f716b8562734625fd3dec26a16a01dd3612be95ac11ce8afb5597bb0a424135d
|
Provenance
The following attestation bundles were made for db2_sqlglot_dialect-1.0.0-py3-none-any.whl:
Publisher:
build_release.yaml on IBM/db2-sqlglot-plugin
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
db2_sqlglot_dialect-1.0.0-py3-none-any.whl -
Subject digest:
8d1ca15cea5a015f87032e4f1b0aa5f6663b4349bb6c31afdddceda30444c5cf - Sigstore transparency entry: 1935540222
- Sigstore integration time:
-
Permalink:
IBM/db2-sqlglot-plugin@f41ca0e929c4c15c7db5a70392c4e37127fba623 -
Branch / Tag:
refs/tags/v1.0.0 - Owner: https://github.com/IBM
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
build_release.yaml@f41ca0e929c4c15c7db5a70392c4e37127fba623 -
Trigger Event:
workflow_dispatch
-
Statement type: