AI-Native Data Governance: TypeScript for Databases
GoQuality CLI
GoQuality brings type safety to your data. Define types once, validate everywhere. AI generates the types; you govern the rules.
┌─────────────────────────────────────────────────────────────────┐
│ Database → AI Inference → YAML Types → Validation → ✓ │
│ │
│ "email" Email pattern: ^... 99.8% valid │
│ "amount" USD min: 0 100% valid │
│ "status" OrderStatus enum: [...] 98.2% valid │
└─────────────────────────────────────────────────────────────────┘
Installation
# Basic installation (includes PostgreSQL and DuckDB)
pip install goquality
# Cloud Data Warehouses
pip install goquality[snowflake] # Snowflake
pip install goquality[bigquery] # Google BigQuery
pip install goquality[databricks] # Databricks SQL & Unity Catalog
# Traditional Databases
pip install goquality[mysql] # MySQL / MariaDB
pip install goquality[mssql] # Microsoft SQL Server / Azure SQL
# All database connectors
pip install goquality[all-connectors]
# Development installation
pip install goquality[dev]
Quick Start
# 1. Initialize a new project
goquality init
# 2. Generate types from your database using AI
goquality generate --source postgres://user:pass@localhost/mydb
# 3. Review and edit the generated goquality.yaml
# 4. Run validation checks
goquality check --source postgres://user:pass@localhost/mydb
# 5. Diagnose any issues
goquality doctor --source postgres://user:pass@localhost/mydb
Commands
goquality init
Initialize a new GoQuality configuration file.
goquality init [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--source` | `-s` | Database connection string to test |
| `--path` | `-p` | Path for configuration file (default: goquality.yaml) |
Examples:
# Create default config
goquality init
# Create config and test database connection
goquality init --source postgres://localhost/mydb
# Create config at custom path
goquality init --path config/goquality.yaml
goquality generate
Generate type mappings using AI inference. Profiles your database schema and uses an LLM to suggest appropriate types for each column.
goquality generate [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--source` | `-s` | Database connection string (required) |
| `--output` | `-o` | Output file path (default: goquality.yaml) |
| `--schema` | | Database schema to profile |
| `--provider` | | LLM provider: openai, anthropic, ollama (default: openai) |
Environment Variables:
- `OPENAI_API_KEY` - Required for OpenAI provider
- `ANTHROPIC_API_KEY` - Required for Anthropic provider
- `OLLAMA_HOST` - Ollama server URL (default: `http://localhost:11434`)
Examples:
# Generate using OpenAI (default)
goquality generate --source postgres://localhost/mydb
# Generate using Anthropic Claude
goquality generate --source postgres://localhost/mydb --provider anthropic
# Generate for specific schema
goquality generate --source postgres://localhost/mydb --schema public
# Generate using local Ollama
OLLAMA_HOST=http://localhost:11434 goquality generate \
--source postgres://localhost/mydb \
--provider ollama
goquality check
Run validation checks against your database. This is the core command that validates your data against the defined types.
goquality check [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--config` | `-c` | Configuration file path (default: goquality.yaml) |
| `--source` | `-s` | Database connection string |
| `--table` | `-t` | Only check this specific table |
| `--output` | `-o` | Output format: table, json, yaml, csv, markdown, junit |
| `--fail-threshold` | | Percentage of failures allowed (0-100) |
| `--fail-on-error/--no-fail-on-error` | | Exit with error code on failures (default: true) |
| `--quiet` | `-q` | Only show errors and summary |
| `--skip-references` | | Skip reference (FK) validation |
| `--skip-contracts` | | Skip contract validation |
| `--skip-freshness` | | Skip freshness validation |
| `--skip-volume` | | Skip volume (row count) validation |
| `--only` | | Only run specific validation: types, references, contracts, freshness, volume |
| `--parallel` | | Run table validations in parallel for faster execution |
| `--workers` | | Number of parallel workers (default: 4, only used with --parallel) |
| `--sample-size` | | Validate only N random rows per table (for large tables) |
| `--sample-percent` | | Validate only X% of rows per table (0.0-100.0) |
| `--notify/--no-notify` | | Send notifications configured in goquality.yaml (default: true) |
| `--webhook` | | Send results to this webhook URL |
| `--slack-webhook` | | Send Slack notification to this webhook URL |
| `--metrics-file` | | Save validation metrics to this JSON file |
| `--profile` | | Enable performance profiling |
Exit Codes:
- `0` - All checks passed (or within threshold)
- `1` - Validation failures detected (above threshold)
Examples:
# Basic check
goquality check --source postgres://localhost/mydb
# Check specific table
goquality check --source postgres://localhost/mydb --table users
# Output as JSON (for CI/CD pipelines)
goquality check --source postgres://localhost/mydb --output json
# Output as JUnit XML (for CI/CD test reporting)
goquality check --source postgres://localhost/mydb --output junit > results.xml
# Allow up to 5% failures
goquality check --source postgres://localhost/mydb --fail-threshold 5
# Generate markdown report
goquality check --source postgres://localhost/mydb --output markdown > report.md
# Send results to a webhook
goquality check --source postgres://localhost/mydb --webhook https://your-api.com/results
# Send Slack notification
goquality check --source postgres://localhost/mydb --slack-webhook https://hooks.slack.com/services/xxx
# Quiet mode for scripts
goquality check --source postgres://localhost/mydb --quiet
# Don't fail on errors (always exit 0)
goquality check --source postgres://localhost/mydb --no-fail-on-error
# Run in parallel for faster validation of many tables
goquality check --source postgres://localhost/mydb --parallel --workers 8
# Only run type validation (skip references and contracts)
goquality check --source postgres://localhost/mydb --only types
# Only run reference/FK validation
goquality check --source postgres://localhost/mydb --only references
# Only run freshness checks
goquality check --source postgres://localhost/mydb --only freshness
# Only run volume checks
goquality check --source postgres://localhost/mydb --only volume
# Use sampling for large tables (10,000 rows per table)
goquality check --source postgres://localhost/mydb --sample-size 10000
# Sample 1% of each table for quick validation
goquality check --source postgres://localhost/mydb --sample-percent 1
# Save metrics to JSON file
goquality check --source postgres://localhost/mydb --metrics-file metrics.json
# Enable performance profiling
goquality check --source postgres://localhost/mydb --profile
# Both metrics and profiling
goquality check --source postgres://localhost/mydb --metrics-file metrics.json --profile
Output Formats:
Table (default):
┏━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Column ┃ Type ┃ Rows ┃ Valid % ┃ Status ┃ Details ┃
┡━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ email │ Email │ 10,000 │ 99.8% │ ✓ PASS │ │
│ status │ Status │ 10,000 │ 98.2% │ ✗ FAIL │ 180 invalid │
└──────────┴────────┴────────┴─────────┴──────────┴───────────────┘
JSON:
{
"summary": {
"total_checks": 5,
"passed": 4,
"failed": 1,
"failure_rate": 20.0,
"threshold": 0.0,
"threshold_passed": false
},
"tables": [...]
}
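A pipeline step can gate on the `summary` object directly. The following is a minimal sketch, not part of GoQuality: the helper name `gate` is hypothetical, it relies only on the field names shown in the JSON example above, and it assumes that "within threshold" means `failure_rate <= threshold`.

```python
import json

# Sample report copied from the JSON example above; "tables" is left empty here.
SAMPLE_REPORT = """
{
  "summary": {
    "total_checks": 5,
    "passed": 4,
    "failed": 1,
    "failure_rate": 20.0,
    "threshold": 0.0,
    "threshold_passed": false
  },
  "tables": []
}
"""


def gate(report_text: str) -> int:
    """Return 0 or 1 from a JSON report (hypothetical helper)."""
    summary = json.loads(report_text)["summary"]
    # Assumption: a run is within threshold when failure_rate <= threshold.
    within = summary["failure_rate"] <= summary["threshold"]
    print(
        f"{summary['passed']}/{summary['total_checks']} checks passed, "
        f"{summary['failure_rate']:.1f}% failed "
        f"(threshold {summary['threshold']:.1f}%)"
    )
    return 0 if within else 1


exit_code = gate(SAMPLE_REPORT)
```

For the sample report the helper agrees with the `threshold_passed` field: one failure out of five exceeds a threshold of 0.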
JUnit XML:
<?xml version="1.0" encoding="UTF-8"?>
<testsuites name="GoQuality Data Validation" tests="5" failures="1">
<testsuite name="public.users" tests="3" failures="0">
<testcase name="email (Email)" classname="public.users"/>
<testcase name="id (UUID)" classname="public.users"/>
</testsuite>
<testsuite name="public.orders" tests="2" failures="1">
<testcase name="total (USD)" classname="public.orders">
<failure message="3 rows failed validation" type="ValidationError">
Column: total
Type: USD
Invalid rows: 3 (0.03%)
</failure>
</testcase>
</testsuite>
</testsuites>
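Tools that do not understand JUnit natively can still read the report with a standard XML parser. The sketch below assumes only the element and attribute names shown in the example above; `failed_cases` is a hypothetical helper name, and the sample is trimmed to one passing and one failing test case.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the JUnit example above (bytes, so the XML declaration is honoured).
SAMPLE_JUNIT = b"""<?xml version="1.0" encoding="UTF-8"?>
<testsuites name="GoQuality Data Validation" tests="5" failures="1">
  <testsuite name="public.users" tests="3" failures="0">
    <testcase name="email (Email)" classname="public.users"/>
  </testsuite>
  <testsuite name="public.orders" tests="2" failures="1">
    <testcase name="total (USD)" classname="public.orders">
      <failure message="3 rows failed validation" type="ValidationError">
        Column: total
      </failure>
    </testcase>
  </testsuite>
</testsuites>
"""


def failed_cases(junit_xml: bytes) -> list:
    """Return (table, check, message) for every test case with a <failure> child."""
    root = ET.fromstring(junit_xml)
    found = []
    for case in root.iter("testcase"):
        failure = case.find("failure")
        if failure is not None:
            found.append(
                (case.get("classname"), case.get("name"), failure.get("message"))
            )
    return found


failures = failed_cases(SAMPLE_JUNIT)
for table, check, message in failures:
    print(f"{table}: {check}: {message}")
```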
goquality validate
Validate configuration file syntax without connecting to a database.
goquality validate [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--config` | `-c` | Configuration file to validate (default: goquality.yaml) |
Examples:
# Validate default config
goquality validate
# Validate specific config
goquality validate --config staging.yaml
goquality types
List and search available types in the standard library.
goquality types [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--search` | `-s` | Search types by name or description |
| `--tag` | `-t` | Filter by tag |
| `--show` | | Show details for a specific type |
Examples:
# List all types
goquality types
# Search for email types
goquality types --search email
# Filter by tag
goquality types --tag finance
goquality types --tag healthcare
goquality types --tag regional
# Show type details
goquality types --show Email
goquality types --show CreditCardNumber
Available Tags:
- `core` - Basic string/number types
- `finance` - Currency, banking, payments
- `healthcare` - Medical codes, identifiers
- `ecommerce` - Products, orders, shipping
- `saas` - API keys, tokens, SaaS identifiers
- `regional` - Country-specific formats
- `analytics` - Metrics, percentages, scores
- `iot` - Sensors, devices, protocols
- `pii` - Personally identifiable information
goquality doctor
Diagnose your GoQuality environment and configuration.
goquality doctor [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--config` | `-c` | Configuration file to check |
| `--source` | `-s` | Database connection to test |
| `--verbose` | `-v` | Show detailed information |
Checks Performed:
- Python version compatibility
- Core dependencies installed
- Database drivers available
- LLM providers configured
- Type library loading
- Configuration file validity
- Database connectivity
- Environment variables
Examples:
# Basic diagnostics
goquality doctor
# Check with database connection
goquality doctor --source postgres://localhost/mydb
# Verbose output
goquality doctor --verbose
goquality stats
Show statistics about the type library and configuration.
goquality stats [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--config` | `-c` | Configuration file path |
Examples:
goquality stats
goquality version
Show version information.
goquality version
goquality connections
Manage database connections configured in goquality.toml.
# List configured connections
goquality connections list
# Test a specific connection
goquality connections test dev
# Test all connections
goquality connections test-all
# Show connection details (credentials masked)
goquality connections show dev
goquality config
Manage project configuration (goquality.toml).
# Create a new goquality.toml
goquality config init
# Show current configuration
goquality config show
# Validate configuration
goquality config validate
# Show config file path
goquality config path
Configuration File
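GoQuality reads type definitions, model mappings, and validation rules from a YAML configuration file. The default file is goquality.yaml. Connections, AI providers, and environments are configured separately in goquality.toml.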
Full Example
# GoQuality Configuration
# https://goquality.dev/docs
# Custom type definitions (extend or override stdlib)
types:
  # Simple type with pattern
  - name: EmployeeId
    description: "Internal employee identifier"
    base: String
    pattern: "^EMP-[0-9]{6}$"
    min_length: 10
    max_length: 10
  # Type extending stdlib
  - name: CorporateEmail
    description: "Company email address"
    base: String
    extends: Email
    pattern: "^[a-z.]+@acme\\.com$"
  # Numeric type with range
  - name: DiscountPercent
    description: "Discount percentage"
    base: Decimal
    min: 0
    max: 100
    precision: 2
  # Enum type
  - name: Department
    description: "Company department"
    base: String
    enum: ["engineering", "sales", "marketing", "hr", "finance"]
  # Type with uniqueness constraint
  - name: ProductSKU
    description: "Unique product SKU"
    base: String
    pattern: "^[A-Z]{2}-[0-9]{6}$"
    unique: true
# Model mappings (table → column types)
models:
  - table: public.users
    columns:
      - name: id
        type: UUID
      - name: email
        type: CorporateEmail
      - name: employee_id
        type: EmployeeId
      - name: department
        type: Department
      - name: created_at
        type: Timestamp
    # Volume check: ensure table isn't empty
    volume:
      min_rows: 1
  - table: public.orders
    columns:
      - name: id
        type: UUID
      - name: user_id
        type: UUID
      - name: total_amount
        type: USD
      - name: discount
        type: DiscountPercent
        allow_null: true  # Override type's nullability
      - name: status
        type: OrderStatus
    # Freshness check: ensure recent data
    freshness:
      column: created_at
      warn_after:
        hours: 1
      error_after:
        hours: 6
    # Volume check: bounded row count
    volume:
      min_rows: 1000
      max_rows: 50000000
  # Pattern matching - apply to all audit tables
  - table: "*_audit"
    columns:
      - name: "*_at"  # Match created_at, updated_at, deleted_at
        type: Timestamp
      - name: "*_by"  # Match created_by, updated_by
        type: UUID
# Explicit relationships (FK validation)
relationships:
  - from: orders.user_id
    to: users.id
  - from: orders.shipping_address_id
    to: addresses.id
    name: "Order Shipping Address"
    nullable: true
  - from: order_items.order_id
    to: orders.id
# Ad-hoc checks (quick SQL rules)
checks:
  - "on": orders
    name: "Order integrity"
    rules:
      - "total_amount >= 0"
      - "created_at <= NOW()"
      - "status IS NOT NULL"
  - "on": users
    name: "User constraints"
    rules:
      - "email IS NOT NULL"
      - "created_at <= NOW()"
# SQL contracts (complex cross-table validation)
contracts:
  - name: order_items_sum_matches_total
    description: "Order items should sum to order total"
    # Uses total_amount, the column name declared for public.orders above
    sql: |
      SELECT o.id, o.total_amount, SUM(oi.quantity * oi.unit_price) as items_sum
      FROM orders o
      JOIN order_items oi ON o.id = oi.order_id
      GROUP BY o.id, o.total_amount
      HAVING ABS(o.total_amount - SUM(oi.quantity * oi.unit_price)) > 0.01
    expect: empty
    severity: error
  - name: recent_orders_exist
    description: "Should have orders in the last 24 hours"
    sql: |
      SELECT 1 FROM orders
      WHERE created_at > NOW() - INTERVAL '24 hours'
      LIMIT 1
    expect: not_empty
    severity: warning
# Notifications (optional)
notifications:
  # Slack notification on failures
  - type: slack
    url: ${SLACK_WEBHOOK_URL}
    trigger: failure
    mention_on_failure:
      - U12345678  # Slack user ID
  # Webhook for custom integrations
  - type: webhook
    url: https://your-api.com/goquality-results
    trigger: always
    headers:
      Authorization: "Bearer ${API_TOKEN}"
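The `*_audit` model in the example above uses wildcard names for both tables and columns. How GoQuality resolves these wildcards is not specified here; the sketch below assumes shell-style globbing (as in Python's `fnmatch`) and that the first matching pattern wins. `resolve_type` is a hypothetical helper, not a GoQuality function.

```python
from fnmatch import fnmatchcase

# Column patterns and types copied from the "*_audit" model above.
AUDIT_COLUMN_TYPES = {"*_at": "Timestamp", "*_by": "UUID"}


def resolve_type(column, patterns):
    """Return the type of the first pattern that matches `column`, or None.

    Assumption: shell-style, case-sensitive matching, first match wins.
    """
    for pattern, type_name in patterns.items():
        if fnmatchcase(column, pattern):
            return type_name
    return None


for column in ("created_at", "updated_by", "status"):
    print(column, "->", resolve_type(column, AUDIT_COLUMN_TYPES))

# The same matching applied to table names.
print(fnmatchcase("orders_audit", "*_audit"))
```

Under these assumptions a column such as `status` matches no pattern and would need an explicit entry.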
Notifications Configuration
Configure notifications to alert your team when validation runs complete.
| Field | Type | Description |
|---|---|---|
| `type` | string | Notification type: webhook, slack |
| `url` | string | Webhook URL (use ${ENV_VAR} for secrets) |
| `trigger` | string | When to notify: always, failure, success, threshold_exceeded |
| `headers` | object | Custom HTTP headers (optional) |
| `auth_token` | string | Bearer token for Authorization header (optional) |
| `timeout_seconds` | float | Request timeout (default: 30) |
| `retry_count` | int | Number of retries on failure (default: 3) |
| `include_samples` | bool | Include sample failure values (default: true) |
| `max_failures_shown` | int | Maximum failures to show (default: 10) |
| `channel` | string | Slack channel override (Slack only) |
| `mention_on_failure` | array | Slack user IDs to mention on failure (Slack only) |
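The `${ENV_VAR}` placeholders used for `url` and header values keep secrets out of the file. The sketch below shows one way such placeholders could be expanded; it is an illustration, not GoQuality's implementation. The function name `interpolate` is hypothetical, and raising an error for an unset variable is an assumption.

```python
import re

# Matches ${NAME}, where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(value, env):
    """Replace every ${NAME} in `value` with env[NAME].

    Assumption: an unset variable is an error rather than an empty string.
    """
    def replace(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(replace, value)


# In real use, pass os.environ; a plain dict keeps the example deterministic.
example_env = {"API_TOKEN": "s3cr3t"}
header = interpolate("Bearer ${API_TOKEN}", example_env)
print(header)
```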
Type Definition Fields
| Field | Type | Description |
|---|---|---|
| `name` | string | PascalCase type name (required) |
| `description` | string | Human-readable description (required) |
| `base` | string | Base type: String, Integer, Decimal, Boolean, Date, Timestamp |
| `extends` | string | Parent type to inherit from |
| `pattern` | string | Regex pattern (String types) |
| `min_length` | int | Minimum string length |
| `max_length` | int | Maximum string length |
| `not_empty` | bool | Reject empty/whitespace strings |
| `min` | number | Minimum value (numeric types) |
| `max` | number | Maximum value (numeric types) |
| `precision` | int | Decimal places (Decimal type) |
| `enum` | array | Allowed values |
| `allow_null` | bool | Whether NULL is permitted (default: false) |
| `unique` | bool | Values must be unique |
| `references` | string | FK reference as table.column (for cross-table validation) |
| `where` | string | SQL WHERE clause filter on referenced table |
| `tags` | array | Searchable tags |
| `examples` | array | Example valid values |
| `deprecated` | bool | Mark as deprecated |
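To make the per-value fields concrete, the sketch below applies a handful of them (`allow_null`, `pattern`, `min_length`, `max_length`, `min`, `max`, `enum`) to single values in plain Python. It is an illustration only: GoQuality validates inside the database, the `violations` helper is hypothetical, and Python's `re` dialect may differ from the regex engine of a given database.

```python
import re
from decimal import Decimal

# Two definitions copied from the full example, written as dictionaries.
EMPLOYEE_ID = {"base": "String", "pattern": "^EMP-[0-9]{6}$",
               "min_length": 10, "max_length": 10}
DISCOUNT_PERCENT = {"base": "Decimal", "min": 0, "max": 100}


def violations(value, type_def):
    """Return the names of the fields in `type_def` that `value` violates."""
    if value is None:
        # allow_null defaults to false, as stated in the table above.
        return [] if type_def.get("allow_null", False) else ["allow_null"]
    failed = []
    text = str(value)
    if "pattern" in type_def and not re.search(type_def["pattern"], text):
        failed.append("pattern")
    if "min_length" in type_def and len(text) < type_def["min_length"]:
        failed.append("min_length")
    if "max_length" in type_def and len(text) > type_def["max_length"]:
        failed.append("max_length")
    if "min" in type_def and Decimal(text) < Decimal(str(type_def["min"])):
        failed.append("min")
    if "max" in type_def and Decimal(text) > Decimal(str(type_def["max"])):
        failed.append("max")
    if "enum" in type_def and value not in type_def["enum"]:
        failed.append("enum")
    return failed


print(violations("EMP-123456", EMPLOYEE_ID))
print(violations("EMP-12", EMPLOYEE_ID))
print(violations(150, DISCOUNT_PERCENT))
print(violations(None, DISCOUNT_PERCENT))
```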
Connection Strings
GoQuality supports multiple database backends via connection strings.
PostgreSQL
# Full format
postgres://user:password@host:port/database
# Examples
postgres://postgres:secret@localhost:5432/mydb
postgresql://user:pass@db.example.com/production
postgres://localhost/mydb # Local with defaults
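Connection strings of this shape are ordinary URLs, so their parts can be inspected with Python's standard library before they are handed to GoQuality. The sketch below is a hypothetical helper (`describe` is not a GoQuality function) that deliberately leaves the password out of its result so it is safe to log.

```python
from urllib.parse import parse_qs, urlsplit


def describe(connection_string):
    """Split a URL-style connection string into its parts, omitting the password."""
    parts = urlsplit(connection_string)
    return {
        "dialect": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/") or None,
        # Query parameters; only the first value of each key is kept.
        "options": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


full = describe("postgres://postgres:secret@localhost:5432/mydb")
minimal = describe("postgres://localhost/mydb")
print(full)
print(minimal)
```

Fields missing from the string come back as `None`, which is where the "local with defaults" form relies on driver defaults.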
DuckDB
# In-memory database
duckdb://:memory:
# File database
duckdb:///path/to/database.db
# CSV/Parquet files (auto-detected)
/path/to/data.csv
/path/to/data.parquet
./relative/path/data.csv
Snowflake
# Full format
snowflake://user@account/database/schema?warehouse=WAREHOUSE
# Examples
snowflake://john@xy12345/analytics/public?warehouse=COMPUTE_WH
snowflake://user@account/db/schema?warehouse=WH&role=ANALYST
Environment Variables:
export SNOWFLAKE_ACCOUNT=xy12345
export SNOWFLAKE_USER=john
export SNOWFLAKE_PASSWORD=secret
export SNOWFLAKE_DATABASE=analytics
export SNOWFLAKE_SCHEMA=public
export SNOWFLAKE_WAREHOUSE=COMPUTE_WH
BigQuery
# Format
bigquery://project-id/dataset
# Examples
bigquery://my-project/analytics
bigquery://prod-data-warehouse/sales
Environment Variables:
export GOOGLE_CLOUD_PROJECT=my-project
export BIGQUERY_DATASET=analytics
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
MySQL / MariaDB
# Full format
mysql://user:password@host:port/database
# Examples
mysql://root:secret@localhost:3306/mydb
mysql://user:pass@db.example.com/production
mariadb://user:pass@localhost/mydb # MariaDB compatible
# Cloud providers
mysql://admin:pass@mydb.cluster-xxxxx.us-east-1.rds.amazonaws.com:3306/mydb # AWS RDS
mysql://user:pass@34.xxx.xxx.xxx:3306/mydb # Cloud SQL
Environment Variables:
export MYSQL_HOST=localhost
export MYSQL_PORT=3306
export MYSQL_USER=myuser
export MYSQL_PASSWORD=secret
export MYSQL_DATABASE=mydb
Microsoft SQL Server
# Full format
mssql://user:password@host:port/database
# Examples
mssql://sa:Password123@localhost:1433/mydb
sqlserver://user:pass@server.database.windows.net/mydb # Azure SQL
# With schema
mssql://user:pass@localhost/mydb?schema=dbo
# Windows Authentication
mssql://localhost/mydb?trusted_connection=true
Environment Variables:
export MSSQL_HOST=localhost
export MSSQL_PORT=1433
export MSSQL_USER=sa
export MSSQL_PASSWORD=secret
export MSSQL_DATABASE=mydb
export MSSQL_SCHEMA=dbo
Databricks
# Full format
databricks://hostname/schema?http_path=/sql/...&catalog=main&access_token=xxx
# Examples
databricks://my-workspace.cloud.databricks.com/default?http_path=/sql/1.0/warehouses/abc123&access_token=dapiXXX
# Azure Databricks
databricks://adb-123456789.7.azuredatabricks.net/default?http_path=/sql/1.0/warehouses/abc&access_token=dapiXXX
# With Unity Catalog
databricks://hostname/myschema?http_path=/sql/1.0/warehouses/abc&catalog=production&access_token=dapiXXX
Environment Variables:
export DATABRICKS_HOST=my-workspace.cloud.databricks.com
export DATABRICKS_TOKEN=dapiXXXXXXXXXX
export DATABRICKS_HTTP_PATH=/sql/1.0/warehouses/abc123
export DATABRICKS_CATALOG=main
export DATABRICKS_SCHEMA=default
Project Configuration (goquality.toml)
Store database connections, AI provider settings, and environments in a TOML configuration file.
File Location
GoQuality looks for project config in:
- `./goquality.toml` (project root)
- `./.goquality/config.toml`
- `~/.config/goquality/config.toml` (user-level defaults)
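The lookup can be pictured as a search over these three paths. The sketch below assumes they are tried in the order listed and that the first existing file wins; whether GoQuality instead merges user-level defaults with project settings is not stated here. Both function names are hypothetical.

```python
from pathlib import Path


def candidate_paths(project_root, home):
    """Return the three documented locations, in the order listed above."""
    return [
        project_root / "goquality.toml",
        project_root / ".goquality" / "config.toml",
        home / ".config" / "goquality" / "config.toml",
    ]


def find_project_config(project_root, home):
    """Return the first candidate that exists as a file, or None.

    Assumption: first match wins; no merging between files.
    """
    for path in candidate_paths(project_root, home):
        if path.is_file():
            return path
    return None


print(find_project_config(Path.cwd(), Path.home()))
```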
Quick Start
# Create a new goquality.toml
goquality config init
# List configured connections
goquality connections list
# Test a connection
goquality connections test dev
# Validate configuration
goquality config validate
Full Example
# goquality.toml - Project configuration
[project]
name = "My Data Project"
#──────────────────────────────────────────────────────────────────────────────
# DATABASE CONNECTIONS
#──────────────────────────────────────────────────────────────────────────────
[connections]
default = "dev" # Default connection when --source not specified
[connections.local]
connection_string = "duckdb://:memory:"
description = "Local testing with DuckDB"
[connections.dev]
dialect = "postgres"
host = "localhost"
port = 5432
database = "myapp_dev"
user = "${DB_USER}" # Environment variable interpolation
password = "${DB_PASSWORD}"
description = "Development database"
[connections.staging]
dialect = "postgres"
host = "${STAGING_DB_HOST}"
database = "myapp_staging"
user = "${STAGING_DB_USER}"
password = "${STAGING_DB_PASSWORD}"
description = "Staging environment"
[connections.prod]
connection_string = "postgres://${PROD_USER}:${PROD_PASS}@prod.example.com/myapp"
description = "Production database (read-only)"
[connections.warehouse]
dialect = "snowflake"
host = "xy12345.snowflakecomputing.com"
database = "analytics"
schema = "public"
user = "${SNOWFLAKE_USER}"
password = "${SNOWFLAKE_PASSWORD}"
description = "Snowflake data warehouse"
[connections.warehouse.options]
warehouse = "COMPUTE_WH"
role = "ANALYST"
#──────────────────────────────────────────────────────────────────────────────
# AI / LLM CONFIGURATION (for `goquality generate`)
#──────────────────────────────────────────────────────────────────────────────
[ai]
default = "openai" # Default AI provider
[ai.openai]
api_key = "${OPENAI_API_KEY}"
model = "gpt-4o" # Optional: override default model
[ai.anthropic]
api_key = "${ANTHROPIC_API_KEY}"
model = "claude-sonnet-4-20250514"
[ai.ollama]
host = "http://localhost:11434"
model = "llama3"
#──────────────────────────────────────────────────────────────────────────────
# ENVIRONMENTS (bundle connection + AI + settings per environment)
#──────────────────────────────────────────────────────────────────────────────
[environments]
default = "dev"
[environments.dev]
connection = "dev"
ai = "ollama" # Use local LLM in dev
fail_threshold = 10 # More lenient in development
[environments.staging]
connection = "staging"
ai = "openai"
fail_threshold = 5
[environments.prod]
connection = "prod"
ai = "openai"
fail_threshold = 0 # Zero tolerance in production
#──────────────────────────────────────────────────────────────────────────────
# DEFAULT CLI OPTIONS
#──────────────────────────────────────────────────────────────────────────────
[defaults]
parallel = true
workers = 4
output = "table"
notify = true
Using Named Connections
# Use default connection (from goquality.toml)
goquality check
# Use named connection
goquality check --source dev
goquality check --source staging
goquality check --source warehouse
# Explicit connection string still works
goquality check --source postgres://user:pass@localhost/mydb
Using Environments
# Use default environment
goquality check
# Use named environment (bundles connection + AI + settings)
goquality check --env prod
goquality generate --env dev
# Environment via environment variable
export GOQUALITY_ENV=staging
goquality check
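An environment can be chosen in three ways: the `--env` flag, the `GOQUALITY_ENV` variable, or the `default` key under `[environments]`. The sketch below shows one plausible resolution order (flag, then variable, then default); that precedence is an assumption, and `resolve_environment` is a hypothetical helper. The dictionary mirrors the shape a TOML parser such as Python 3.11's `tomllib` would produce for the `[environments]` tables in the full example.

```python
# The [environments] tables from the full example, as a parsed dictionary.
CONFIG = {
    "environments": {
        "default": "dev",
        "dev": {"connection": "dev", "ai": "ollama", "fail_threshold": 10},
        "staging": {"connection": "staging", "ai": "openai", "fail_threshold": 5},
        "prod": {"connection": "prod", "ai": "openai", "fail_threshold": 0},
    }
}


def resolve_environment(config, cli_env=None, environ=None):
    """Pick an environment and return its settings plus its name.

    Assumed precedence: --env flag, then GOQUALITY_ENV, then the default key.
    """
    environ = environ or {}
    environments = config["environments"]
    name = cli_env or environ.get("GOQUALITY_ENV") or environments["default"]
    settings = environments.get(name)
    if not isinstance(settings, dict):
        raise KeyError(f"unknown environment: {name}")
    return {"name": name, **settings}


print(resolve_environment(CONFIG))
print(resolve_environment(CONFIG, environ={"GOQUALITY_ENV": "staging"}))
print(resolve_environment(CONFIG, cli_env="prod", environ={"GOQUALITY_ENV": "staging"}))
```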
Connection Management Commands
# List all configured connections
goquality connections list
goquality connections list --verbose
# Test a specific connection
goquality connections test dev
goquality connections test # Tests default connection
# Test all connections
goquality connections test-all
# Show connection details (credentials masked)
goquality connections show dev
Config Management Commands
# Create new config file
goquality config init
goquality config init --name "My Project"
# Show current configuration
goquality config show
goquality config show --verbose
# Validate configuration
goquality config validate
# Show config file path
goquality config path
CI/CD Integration
GitHub Actions
name: Data Quality
on:
  push:
    branches: [main]
  schedule:
    - cron: '0 6 * * *'  # Daily at 6 AM
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install GoQuality
        run: pip install goquality[postgres]
      - name: Validate Configuration
        run: goquality validate
      - name: Run Data Quality Checks
        run: |
          goquality check \
            --source ${{ secrets.DATABASE_URL }} \
            --output junit \
            --fail-threshold 1 \
            > results.xml
      - name: Upload Test Results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: quality-report
          path: results.xml
      - name: Publish Test Results
        uses: dorny/test-reporter@v1
        if: always()
        with:
          name: GoQuality Results
          path: results.xml
          reporter: java-junit
GitHub Actions with Slack Notifications
name: Data Quality with Notifications
on:
  schedule:
    - cron: '0 6 * * *'  # Daily at 6 AM
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install GoQuality
        run: pip install goquality[postgres]
      - name: Run Data Quality Checks
        run: |
          goquality check \
            --source ${{ secrets.DATABASE_URL }} \
            --output junit \
            --slack-webhook ${{ secrets.SLACK_WEBHOOK_URL }} \
            > results.xml
        env:
          GOQUALITY_SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK_URL }}
GitLab CI
data-quality:
  image: python:3.11
  stage: test
  script:
    - pip install goquality[postgres]
    - goquality validate
    - goquality check --source $DATABASE_URL --output junit > report.xml
  artifacts:
    reports:
      junit: report.xml
    paths:
      - report.xml
    expire_in: 1 week
Pre-commit Hook
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: goquality-validate
        name: Validate GoQuality Config
        entry: goquality validate
        language: system
        files: goquality\.yaml$
        pass_filenames: false
Standard Library Types
GoQuality includes 320+ pre-defined types organized by category.
Core Types
| Type | Base | Description |
|---|---|---|
| `Email` | String | Email address |
| `EmailNullable` | String | Optional email |
| `UUID` | String | UUID v4 |
| `URL` | String | HTTP/HTTPS URL |
| `PhoneNumber` | String | International phone |
| `Hostname` | String | DNS hostname |
Finance Types
| Type | Base | Description |
|---|---|---|
| `USD` | Decimal | US Dollar amount |
| `EUR` | Decimal | Euro amount |
| `CreditCardNumber` | String | Credit card (Luhn) |
| `IBAN` | String | International bank account |
| `BIC` | String | Bank identifier code |
| `ABARoutingNumber` | String | US routing number |
Healthcare Types
| Type | Base | Description |
|---|---|---|
| `ICD10` | String | ICD-10 diagnosis code |
| `CPT` | String | CPT procedure code |
| `NPI` | String | National Provider ID |
| `NDC` | String | National Drug Code |
| `LOINC` | String | Lab test code |
E-commerce Types
| Type | Base | Description |
|---|---|---|
| `SKU` | String | Stock keeping unit |
| `UPC` | String | UPC-A barcode |
| `EAN13` | String | EAN-13 barcode |
| `ASIN` | String | Amazon product ID |
| `ISBN13` | String | Book ISBN-13 |
Regional Types
| Type | Base | Description |
|---|---|---|
| `SSN` | String | US Social Security |
| `USZipCode` | String | US ZIP code |
| `USState` | String | US state code |
| `GermanVATNumber` | String | German VAT |
| `UKPostcode` | String | UK postcode |
| `IndianPAN` | String | Indian tax ID |
Analytics Types
| Type | Base | Description |
|---|---|---|
| `Percentage` | Decimal | 0-100 percentage |
| `Rate` | Decimal | 0-1 rate |
| `Score` | Decimal | 0-100 score |
| `MRR` | Decimal | Monthly recurring revenue |
| `NPSScore` | Integer | Net promoter score |
Browse all types:
goquality types
goquality types --tag finance
goquality types --search email
Custom Validators (Plugins)
GoQuality supports custom validation logic via Python plugins.
Creating a Validator
# .goquality/plugins/my_validators.py
from goquality.plugins import register_validator


@register_validator("is_palindrome", description="Check if string is palindrome")
def is_palindrome(value: str) -> bool:
    # Ignore case and spaces, then compare the string with its reverse.
    clean = value.lower().replace(" ", "")
    return clean == clean[::-1]


@register_validator("divisible_by_three", description="Check divisibility by three")
def divisible_by_three(value: int) -> bool:
    return value % 3 == 0
Built-in Advanced Validators
| Validator | Description |
|---|---|
| `luhn` | Luhn checksum (credit cards) |
| `iban` | IBAN checksum |
| `isbn10` | ISBN-10 checksum |
| `isbn13` | ISBN-13 checksum |
| `ean13` | EAN-13 barcode checksum |
| `upc` | UPC-A barcode checksum |
| `email_format` | Email format validation |
| `ipv4` | IPv4 address format |
| `ipv6` | IPv6 address format |
| `mac_address` | MAC address format |
| `json` | Valid JSON string |
| `base64` | Valid Base64 encoding |
| `future_date` | Date in the future |
| `past_date` | Date in the past |
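For readers unfamiliar with the checksums in this table, the Luhn algorithm is a useful reference point: starting from the rightmost digit, every second digit is doubled (subtracting 9 when the result exceeds 9), and the total must be divisible by 10. The sketch below is a standalone implementation of that published algorithm, not GoQuality's own `luhn` validator; accepting spaces and hyphens as separators is an assumption.

```python
def luhn_valid(number):
    """Return True if `number` passes the Luhn checksum.

    Assumption: spaces and hyphens are separators and are ignored.
    """
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # widely used test card number
print(luhn_valid("4111 1111 1111 1112"))  # last digit changed
```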
Security & Observability
Security Features
GoQuality includes two safeguards for production use: validation of contract SQL and query timeouts.
SQL Injection Prevention
All contract SQL is validated before execution:
- Only SELECT statements are allowed
- Dangerous keywords are blocked (DROP, DELETE, UPDATE, etc.)
- SQL is parsed and validated using AST analysis
- Invalid SQL is rejected at config load time
contracts:
- name: safe_contract
sql: SELECT * FROM users WHERE active = true # ✅ Valid
expect: not_empty
# This would be rejected:
# sql: DROP TABLE users # ❌ Rejected
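The rules above can be approximated in a few lines to show what "SELECT only" means in practice. The sketch below is a deliberately simpler stand-in: it scans keywords with regular expressions rather than parsing an AST as GoQuality is described as doing. The keyword list beyond DROP, DELETE, and UPDATE is an assumption, and `is_select_only` is a hypothetical helper.

```python
import re

# DROP, DELETE and UPDATE come from the list above; the rest are assumptions.
BLOCKED_KEYWORDS = {"DROP", "DELETE", "UPDATE", "INSERT", "ALTER", "TRUNCATE"}


def is_select_only(sql):
    """Return True if `sql` is a single SELECT with no blocked keywords.

    Keyword scan only; a real validator would parse the statement.
    """
    # Blank out string literals and comments so their contents are not scanned.
    stripped = re.sub(r"'(?:[^']|'')*'", "''", sql)
    stripped = re.sub(r"--[^\n]*", " ", stripped)
    stripped = re.sub(r"/\*.*?\*/", " ", stripped, flags=re.DOTALL)
    statements = [part.strip() for part in stripped.split(";") if part.strip()]
    if len(statements) != 1:
        return False
    # Underscores stay inside tokens, so created_at is not read as CREATE.
    tokens = re.findall(r"[A-Za-z_]+", statements[0].upper())
    if not tokens or tokens[0] != "SELECT":
        return False
    return not BLOCKED_KEYWORDS.intersection(tokens)


print(is_select_only("SELECT * FROM users WHERE active = true"))
print(is_select_only("DROP TABLE users"))
print(is_select_only("SELECT 1; DROP TABLE users"))
```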
Query Timeout
Configure query timeouts to prevent hanging validations:
# Via environment variable
export GOQUALITY_QUERY_TIMEOUT_SECONDS=600
# Or in .env file
GOQUALITY_QUERY_TIMEOUT_SECONDS=600
Default timeout: 300 seconds (5 minutes)
Observability Features
Structured Logging
GoQuality supports structured logging to files:
# Log to file
export GOQUALITY_LOG_FILE=goquality.log
# JSON format for log aggregation
export GOQUALITY_LOG_JSON=true
# Set log level
export GOQUALITY_LOG_LEVEL=DEBUG
Log files automatically rotate (10MB max, 5 backups).
Metrics Collection
Collect validation metrics for analysis:
# Save metrics to JSON
goquality check --source postgres://... --metrics-file metrics.json
Metrics include:
- Overall statistics (tables, columns, checks, pass rates)
- Type validation metrics
- Reference validation metrics
- Contract validation metrics
- Performance metrics (query times, durations)
Example metrics output:
{
"run_id": "abc123",
"timestamp": "2024-01-15T10:30:00Z",
"total_tables": 10,
"total_columns": 45,
"total_checks": 45,
"passed_checks": 42,
"failed_checks": 3,
"duration_seconds": 12.5,
"pass_rate": 0.933,
"query_count": 45,
"avg_query_time_seconds": 0.278
}
Performance Profiling
Enable performance profiling to identify bottlenecks:
goquality check --source postgres://... --profile
Profiling shows:
- Query execution times
- Table validation durations
- Slowest tables
- Overall performance summary
Example output:
Performance Summary:
Total duration: 12.50s
Tables validated: 10
Total queries: 45
Total rows: 1,234,567
Avg query time: 0.278s
Slowest table: orders (3.45s)
Environment Variables
| Variable | Description | Default |
|---|---|---|
| `GOQUALITY_QUERY_TIMEOUT_SECONDS` | Query timeout in seconds | 300 |
| `GOQUALITY_LOG_FILE` | Path to log file | None (stderr) |
| `GOQUALITY_LOG_JSON` | Output logs as JSON | false |
| `GOQUALITY_LOG_LEVEL` | Log level (DEBUG, INFO, WARNING, ERROR) | INFO |
| `GOQUALITY_LOG_FILE_MAX_BYTES` | Max log file size before rotation | 10485760 (10MB) |
| `GOQUALITY_LOG_FILE_BACKUP_COUNT` | Number of backup log files | 5 |
Troubleshooting
Common Issues
"Config file not found"
# Create a config file
goquality init
# Or specify path
goquality check --config path/to/config.yaml
"Unknown type: X"
# List available types
goquality types --search X
# Check if custom type is defined in config
goquality validate
"Connection failed"
# Run diagnostics
goquality doctor --source YOUR_CONNECTION_STRING
# Check if driver is installed
pip install goquality[postgres] # or [snowflake], [bigquery]
"LLM API error"
# Check API key is set
echo $OPENAI_API_KEY
# Try different provider
goquality generate --source ... --provider anthropic
goquality generate --source ... --provider ollama
Debug Mode
# Enable verbose logging
GOQUALITY_DEBUG=1 goquality check --source ...
# Or use log level
GOQUALITY_LOG_LEVEL=DEBUG goquality check --source ...
Logging to File
# Log to file
export GOQUALITY_LOG_FILE=goquality.log
goquality check --source postgres://localhost/mydb
# JSON format for log aggregation
export GOQUALITY_LOG_JSON=true
export GOQUALITY_LOG_FILE=goquality.log
goquality check --source postgres://localhost/mydb
Notification Environment Variables
# Webhook URL for notifications (alternative to --webhook flag)
export GOQUALITY_WEBHOOK_URL=https://your-api.com/goquality-results
# Slack webhook URL for notifications (alternative to --slack-webhook flag)
export GOQUALITY_SLACK_WEBHOOK=https://hooks.slack.com/services/xxx/yyy/zzz
Getting Help
# General help
goquality --help
# Command-specific help
goquality check --help
goquality generate --help
License
MIT License - see LICENSE for details.
Contributing
Contributions welcome! See CONTRIBUTING.md for guidelines.
Links
- Documentation: https://goquality.dev/docs
- GitHub: https://github.com/goquality/goquality
- PyPI: https://pypi.org/project/goquality/
Download files
File details
Details for the file goquality-0.2.0.tar.gz.
File metadata
- Download URL: goquality-0.2.0.tar.gz
- Upload date:
- Size: 325.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `0633f17e5b819ca9dc8bd0a5bdd800607b03d59cf4cc15c995fb7465229b8b49` |
| MD5 | `95e1b36fb90772cde3d8dac07e0ceb77` |
| BLAKE2b-256 | `74385f1104b18b434bd2a0f99746463c25cbf2bd6921b63ac301a5875c706b3e` |
File details
Details for the file goquality-0.2.0-py3-none-any.whl.
File metadata
- Download URL: goquality-0.2.0-py3-none-any.whl
- Upload date:
- Size: 290.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `8c0c038e088baae6952566219335aca454cded272f2ffcf6205c34eb14cf9bb3` |
| MD5 | `f6541c669a88ab598c2a288655a3a508` |
| BLAKE2b-256 | `1ea6dbbab06f5478d96b86d99eb01f77e1424bcf60ef06390a4e2c2f43d6de79` |