AI-Native Data Governance: TypeScript for Databases
GoQuality CLI
GoQuality brings type safety to your data. Define types once, validate everywhere. Let AI generate the types while you govern the rules.
┌─────────────────────────────────────────────────────────────────┐
│  Database  →  AI Inference  →  YAML Types  →  Validation  →  ✓  │
│                                                                 │
│  "email"      Email            pattern: ^...  99.8% valid       │
│  "amount"     USD              min: 0         100% valid        │
│  "status"     OrderStatus      enum: [...]    98.2% valid       │
└─────────────────────────────────────────────────────────────────┘
Installation
# Basic installation
pip install goquality
# With PostgreSQL support
pip install goquality[postgres]
# With Snowflake support
pip install goquality[snowflake]
# With BigQuery support
pip install goquality[bigquery]
# With all database drivers
pip install goquality[all]
# Development installation
pip install goquality[dev]
Quick Start
# 1. Initialize a new project
goquality init
# 2. Generate types from your database using AI
goquality generate --source postgres://user:pass@localhost/mydb
# 3. Review and edit the generated goquality.yaml
# 4. Run validation checks
goquality check --source postgres://user:pass@localhost/mydb
# 5. Diagnose any issues
goquality doctor --source postgres://user:pass@localhost/mydb
Commands
goquality init
Initialize a new GoQuality configuration file.
goquality init [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--source` | `-s` | Database connection string to test |
| `--path` | `-p` | Path for configuration file (default: goquality.yaml) |
Examples:
# Create default config
goquality init
# Create config and test database connection
goquality init --source postgres://localhost/mydb
# Create config at custom path
goquality init --path config/goquality.yaml
goquality generate
Generate type mappings using AI inference. Profiles your database schema and uses an LLM to suggest appropriate types for each column.
goquality generate [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--source` | `-s` | Database connection string (required) |
| `--output` | `-o` | Output file path (default: goquality.yaml) |
| `--schema` | | Database schema to profile |
| `--provider` | | LLM provider: openai, anthropic, ollama (default: openai) |
Environment Variables:
- `OPENAI_API_KEY` - Required for OpenAI provider
- `ANTHROPIC_API_KEY` - Required for Anthropic provider
- `OLLAMA_HOST` - Ollama server URL (default: `http://localhost:11434`)
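The provider-to-variable mapping above can be checked before running `goquality generate`. The following is a minimal sketch, not part of GoQuality: `provider_status` is a hypothetical helper, and it encodes only the three variables listed above.

```python
import os
from typing import Mapping, Optional

# Variables documented above; Ollama falls back to a default URL.
REQUIRED_KEYS = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}
OLLAMA_DEFAULT_HOST = "http://localhost:11434"


def provider_status(provider: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return a short readiness message for an LLM provider (hypothetical helper)."""
    env = os.environ if env is None else env
    if provider == "ollama":
        return f"ollama: using {env.get('OLLAMA_HOST', OLLAMA_DEFAULT_HOST)}"
    if provider not in REQUIRED_KEYS:
        raise ValueError(f"unknown provider: {provider}")
    key = REQUIRED_KEYS[provider]
    return f"{provider}: ready" if env.get(key) else f"{provider}: missing {key}"


if __name__ == "__main__":
    for name in ("openai", "anthropic", "ollama"):
        print(provider_status(name))
```

Passing an explicit `env` mapping keeps the helper easy to test without touching the real environment.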
Examples:
# Generate using OpenAI (default)
goquality generate --source postgres://localhost/mydb
# Generate using Anthropic Claude
goquality generate --source postgres://localhost/mydb --provider anthropic
# Generate for specific schema
goquality generate --source postgres://localhost/mydb --schema public
# Generate using local Ollama
OLLAMA_HOST=http://localhost:11434 goquality generate \
  --source postgres://localhost/mydb \
  --provider ollama
goquality check
Run validation checks against your database. This is the core command that validates your data against the defined types.
goquality check [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--config` | `-c` | Configuration file path (default: goquality.yaml) |
| `--source` | `-s` | Database connection string |
| `--table` | `-t` | Only check this specific table |
| `--output` | `-o` | Output format: table, json, yaml, csv, markdown |
| `--fail-threshold` | | Percentage of failures allowed (0-100) |
| `--fail-on-error/--no-fail-on-error` | | Exit with error code on failures (default: true) |
| `--quiet` | `-q` | Only show errors and summary |
Exit Codes:
- `0` - All checks passed (or within threshold)
- `1` - Validation failures detected (above threshold)
Examples:
# Basic check
goquality check --source postgres://localhost/mydb
# Check specific table
goquality check --source postgres://localhost/mydb --table users
# Output as JSON (for CI/CD pipelines)
goquality check --source postgres://localhost/mydb --output json
# Allow up to 5% failures
goquality check --source postgres://localhost/mydb --fail-threshold 5
# Generate markdown report
goquality check --source postgres://localhost/mydb --output markdown > report.md
# Quiet mode for scripts
goquality check --source postgres://localhost/mydb --quiet
# Don't fail on errors (always exit 0)
goquality check --source postgres://localhost/mydb --no-fail-on-error
Output Formats:
Table (default):
┏━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Column   ┃ Type   ┃ Rows   ┃ Valid % ┃ Status   ┃ Details       ┃
┡━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ email    │ Email  │ 10,000 │ 99.8%   │ ✓ PASS   │               │
│ status   │ Status │ 10,000 │ 98.2%   │ ✗ FAIL   │ 180 invalid   │
└──────────┴────────┴────────┴─────────┴──────────┴───────────────┘
JSON:
{
  "summary": {
    "total_checks": 5,
    "passed": 4,
    "failed": 1,
    "failure_rate": 20.0,
    "threshold": 0.0,
    "threshold_passed": false
  },
  "tables": [...]
}
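For CI scripts, the JSON summary can be read with Python's standard library. The sketch below assumes only the `summary` fields shown in the sample above. `summarize` is a hypothetical helper, and the relationship `failure_rate = failed / total_checks * 100` is inferred from the sample numbers rather than from documented behavior.

```python
import json

# Sample payload mirroring the JSON output above ("tables" left empty here).
SAMPLE = """
{
  "summary": {
    "total_checks": 5,
    "passed": 4,
    "failed": 1,
    "failure_rate": 20.0,
    "threshold": 0.0,
    "threshold_passed": false
  },
  "tables": []
}
"""


def summarize(report_text: str) -> str:
    """Build a one-line summary from a check report (hypothetical helper)."""
    summary = json.loads(report_text)["summary"]
    # Recompute the rate from the counts; assumed to match "failure_rate".
    rate = 100.0 * summary["failed"] / summary["total_checks"]
    verdict = "OK" if summary["threshold_passed"] else "FAILED"
    return (
        f"{verdict}: {summary['failed']}/{summary['total_checks']} checks failed "
        f"({rate:.1f}%, threshold {summary['threshold']:.1f}%)"
    )


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

In a pipeline, the same function could be applied to the contents of a file written with `--output json`.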
goquality validate
Validate configuration file syntax without connecting to a database.
goquality validate [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--config` | `-c` | Configuration file to validate (default: goquality.yaml) |
Examples:
# Validate default config
goquality validate
# Validate specific config
goquality validate --config staging.yaml
goquality types
List and search available types in the standard library.
goquality types [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--search` | `-s` | Search types by name or description |
| `--tag` | `-t` | Filter by tag |
| `--show` | | Show details for a specific type |
Examples:
# List all types
goquality types
# Search for email types
goquality types --search email
# Filter by tag
goquality types --tag finance
goquality types --tag healthcare
goquality types --tag regional
# Show type details
goquality types --show Email
goquality types --show CreditCardNumber
Available Tags:
- `core` - Basic string/number types
- `finance` - Currency, banking, payments
- `healthcare` - Medical codes, identifiers
- `ecommerce` - Products, orders, shipping
- `saas` - API keys, tokens, SaaS identifiers
- `regional` - Country-specific formats
- `analytics` - Metrics, percentages, scores
- `iot` - Sensors, devices, protocols
- `pii` - Personally identifiable information
goquality doctor
Diagnose your GoQuality environment and configuration.
goquality doctor [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--config` | `-c` | Configuration file to check |
| `--source` | `-s` | Database connection to test |
| `--verbose` | `-v` | Show detailed information |
Checks Performed:
- Python version compatibility
- Core dependencies installed
- Database drivers available
- LLM providers configured
- Type library loading
- Configuration file validity
- Database connectivity
- Environment variables
Examples:
# Basic diagnostics
goquality doctor
# Check with database connection
goquality doctor --source postgres://localhost/mydb
# Verbose output
goquality doctor --verbose
goquality stats
Show statistics about the type library and configuration.
goquality stats [OPTIONS]
Options:
| Option | Short | Description |
|---|---|---|
| `--config` | `-c` | Configuration file path |
Examples:
goquality stats
goquality version
Show version information.
goquality version
Configuration File
GoQuality uses YAML configuration files. The default file is goquality.yaml.
Full Example
# GoQuality Configuration
# https://goquality.dev/docs
# Custom type definitions (extend or override stdlib)
types:
  # Simple type with pattern
  - name: EmployeeId
    description: "Internal employee identifier"
    base: String
    pattern: "^EMP-[0-9]{6}$"
    min_length: 10
    max_length: 10
  # Type extending stdlib
  - name: CorporateEmail
    description: "Company email address"
    base: String
    extends: Email
    pattern: "^[a-z.]+@acme\\.com$"
  # Numeric type with range
  - name: DiscountPercent
    description: "Discount percentage"
    base: Decimal
    min: 0
    max: 100
    precision: 2
  # Enum type
  - name: Department
    description: "Company department"
    base: String
    enum: ["engineering", "sales", "marketing", "hr", "finance"]
  # Type with uniqueness constraint
  - name: ProductSKU
    description: "Unique product SKU"
    base: String
    pattern: "^[A-Z]{2}-[0-9]{6}$"
    unique: true
# Model mappings (table → column types)
models:
  - table: public.users
    columns:
      - name: id
        type: UUID
      - name: email
        type: CorporateEmail
      - name: employee_id
        type: EmployeeId
      - name: department
        type: Department
      - name: created_at
        type: Timestamp
  - table: public.orders
    columns:
      - name: id
        type: UUID
      - name: user_id
        type: UUID
      - name: total_amount
        type: USD
      - name: discount
        type: DiscountPercent
        allow_null: true # Override type's nullability
      - name: status
        type: OrderStatus
# Ad-hoc checks (quick SQL rules)
checks:
  - "on": orders
    name: "Order integrity"
    rules:
      - "total_amount >= 0"
      - "created_at <= NOW()"
      - "status IS NOT NULL"
  - "on": users
    name: "User constraints"
    rules:
      - "email IS NOT NULL"
      - "created_at <= NOW()"
Type Definition Fields
| Field | Type | Description |
|---|---|---|
| `name` | string | PascalCase type name (required) |
| `description` | string | Human-readable description (required) |
| `base` | string | Base type: String, Integer, Decimal, Boolean, Date, Timestamp |
| `extends` | string | Parent type to inherit from |
| `pattern` | string | Regex pattern (String types) |
| `min_length` | int | Minimum string length |
| `max_length` | int | Maximum string length |
| `not_empty` | bool | Reject empty/whitespace strings |
| `min` | number | Minimum value (numeric types) |
| `max` | number | Maximum value (numeric types) |
| `precision` | int | Decimal places (Decimal type) |
| `enum` | array | Allowed values |
| `allow_null` | bool | Whether NULL is permitted (default: false) |
| `unique` | bool | Values must be unique |
| `foreign_key` | string | Reference table.column for FK validation |
| `tags` | array | Searchable tags |
| `examples` | array | Example valid values |
| `deprecated` | bool | Mark as deprecated |
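To make the field semantics concrete, here is a rough Python sketch of how a few of these fields might be applied to a single value. It is an illustration under assumptions, not GoQuality's implementation: the `TypeDef` class and `is_valid` function are hypothetical, only a subset of fields is covered, and `pattern` is assumed to use Python `re` syntax.

```python
import re
from dataclasses import dataclass
from typing import Any, Optional, Sequence


@dataclass
class TypeDef:
    """Hypothetical in-memory form of a subset of the fields above."""
    name: str
    pattern: Optional[str] = None
    min_length: Optional[int] = None
    max_length: Optional[int] = None
    min: Optional[float] = None
    max: Optional[float] = None
    enum: Optional[Sequence[Any]] = None
    allow_null: bool = False  # documented default


def is_valid(value: Any, t: TypeDef) -> bool:
    """Check one value against the constraints set on `t`."""
    if value is None:
        return t.allow_null
    if t.enum is not None and value not in t.enum:
        return False
    if isinstance(value, str):
        # String constraints: length bounds, then the regex pattern.
        if t.min_length is not None and len(value) < t.min_length:
            return False
        if t.max_length is not None and len(value) > t.max_length:
            return False
        if t.pattern is not None and re.search(t.pattern, value) is None:
            return False
    else:
        # Numeric constraints: inclusive range.
        if t.min is not None and value < t.min:
            return False
        if t.max is not None and value > t.max:
            return False
    return True


employee_id = TypeDef("EmployeeId", pattern=r"^EMP-[0-9]{6}$", min_length=10, max_length=10)
discount = TypeDef("DiscountPercent", min=0, max=100, allow_null=True)

if __name__ == "__main__":
    print(is_valid("EMP-123456", employee_id))
    print(is_valid("EMP-12", employee_id))
    print(is_valid(None, discount))
```

The two sample definitions mirror `EmployeeId` and `DiscountPercent` from the full configuration example; set-level constraints such as `unique` and `foreign_key` need the whole column and are left out of this per-value sketch.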
Connection Strings
GoQuality supports multiple database backends via connection strings.
PostgreSQL
# Full format
postgres://user:password@host:port/database
# Examples
postgres://postgres:secret@localhost:5432/mydb
postgresql://user:pass@db.example.com/production
postgres://localhost/mydb # Local with defaults
DuckDB
# In-memory database
duckdb://:memory:
# File database
duckdb:///path/to/database.db
# CSV/Parquet files (auto-detected)
/path/to/data.csv
/path/to/data.parquet
./relative/path/data.csv
Snowflake
# Full format
snowflake://user@account/database/schema?warehouse=WAREHOUSE
# Examples
snowflake://john@xy12345/analytics/public?warehouse=COMPUTE_WH
snowflake://user@account/db/schema?warehouse=WH&role=ANALYST
Environment Variables:
export SNOWFLAKE_ACCOUNT=xy12345
export SNOWFLAKE_USER=john
export SNOWFLAKE_PASSWORD=secret
export SNOWFLAKE_DATABASE=analytics
export SNOWFLAKE_SCHEMA=public
export SNOWFLAKE_WAREHOUSE=COMPUTE_WH
BigQuery
# Format
bigquery://project-id/dataset
# Examples
bigquery://my-project/analytics
bigquery://prod-data-warehouse/sales
Environment Variables:
export GOOGLE_CLOUD_PROJECT=my-project
export BIGQUERY_DATASET=analytics
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
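The URL-style formats above follow the usual `scheme://user:password@host:port/path?query` layout, so they can be inspected with Python's standard `urllib.parse`. This is only a sketch for debugging connection strings: `describe` is a hypothetical helper and says nothing about how GoQuality parses them internally. Plain file paths such as `/path/to/data.csv` are not URLs and are not handled here.

```python
from urllib.parse import parse_qs, urlsplit


def describe(conn: str) -> dict:
    """Split a URL-style connection string into its parts (hypothetical helper)."""
    parts = urlsplit(conn)
    return {
        "dialect": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        # Path segments: database for PostgreSQL, database/schema for Snowflake.
        "path": [p for p in parts.path.split("/") if p],
        # Query options such as warehouse and role.
        "options": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }


if __name__ == "__main__":
    print(describe("postgres://postgres:secret@localhost:5432/mydb"))
    print(describe("snowflake://user@account/db/schema?warehouse=WH&role=ANALYST"))
```

The password is deliberately left out of the result so that the output is safe to paste into logs or bug reports.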
Connection Configuration File
Store multiple database connections in a YAML file for easy switching.
File Location
GoQuality looks for connection configs in:
- `.goquality/connections.yaml`
- `goquality-connections.yaml`
- `~/.config/goquality/connections.yaml`
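A lookup over these locations could be sketched as follows. This assumes the paths are tried in the order listed and that the first existing file wins, which the list suggests but does not state. `find_connections_file` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional

# Candidate paths from the list above; the order of precedence is an assumption.
CANDIDATES = (
    ".goquality/connections.yaml",
    "goquality-connections.yaml",
    "~/.config/goquality/connections.yaml",
)


def find_connections_file(project_dir: str = ".") -> Optional[Path]:
    """Return the first candidate that exists, or None (hypothetical helper)."""
    for candidate in CANDIDATES:
        path = Path(candidate).expanduser()
        # Relative candidates are resolved against the project directory.
        if not path.is_absolute():
            path = Path(project_dir) / path
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    print(find_connections_file())
```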
Example
# .goquality/connections.yaml
# Default connection to use
default: dev
connections:
  local:
    # Quoted because a plain YAML scalar cannot end with a colon
    connection_string: "duckdb://:memory:"
    description: Local testing with DuckDB
  dev:
    dialect: postgres
    host: localhost
    port: 5432
    database: myapp_dev
    user: developer
    password: devpass
    description: Development database
  staging:
    dialect: postgres
    host: ${STAGING_DB_HOST}
    database: myapp_staging
    user: ${STAGING_DB_USER}
    password: ${STAGING_DB_PASSWORD}
    description: Staging environment
  prod:
    connection_string: postgres://${PROD_USER}:${PROD_PASS}@prod.example.com/myapp
    description: Production database (read-only)
  warehouse:
    dialect: snowflake
    host: xy12345.snowflakecomputing.com
    database: analytics
    schema: public
    user: ${SNOWFLAKE_USER}
    password: ${SNOWFLAKE_PASSWORD}
    options:
      warehouse: COMPUTE_WH
      role: ANALYST
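The `${VAR}` placeholders in this example appear to be filled in from environment variables. A minimal sketch of that kind of substitution is shown below. `expand_env` is a hypothetical helper, and raising an error on an unset variable is a design choice made here, not documented GoQuality behavior.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace each ${NAME} in `text` with its value from `env`."""
    env = os.environ if env is None else env

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(substitute, text)


if __name__ == "__main__":
    demo_env = {"PROD_USER": "reader", "PROD_PASS": "s3cret"}
    print(expand_env("postgres://${PROD_USER}:${PROD_PASS}@prod.example.com/myapp", demo_env))
```

Failing loudly on a missing variable avoids silently connecting with an empty user name or password.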
Using Named Connections
# Use default connection
goquality check
# Use named connection
goquality check --source dev
goquality check --source staging
goquality check --source warehouse
CI/CD Integration
GitHub Actions
name: Data Quality
on:
  push:
    branches: [main]
  schedule:
    - cron: '0 6 * * *' # Daily at 6 AM
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install GoQuality
        run: pip install goquality[postgres]
      - name: Validate Configuration
        run: goquality validate
      - name: Run Data Quality Checks
        run: |
          goquality check \
            --source ${{ secrets.DATABASE_URL }} \
            --output json \
            --fail-threshold 1 \
            > results.json
      - name: Upload Results
        uses: actions/upload-artifact@v4
        with:
          name: quality-report
          path: results.json
GitLab CI
data-quality:
  image: python:3.11
  stage: test
  script:
    - pip install goquality[postgres]
    - goquality validate
    - goquality check --source $DATABASE_URL --output markdown > report.md
  artifacts:
    paths:
      - report.md
    expire_in: 1 week
Pre-commit Hook
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: goquality-validate
        name: Validate GoQuality Config
        entry: goquality validate
        language: system
        files: goquality\.yaml$
        pass_filenames: false
Standard Library Types
GoQuality includes 300+ pre-defined types organized by category.
Core Types
| Type | Base | Description |
|---|---|---|
| `Email` | String | Email address |
| `EmailNullable` | String | Optional email |
| `UUID` | String | UUID v4 |
| `URL` | String | HTTP/HTTPS URL |
| `PhoneNumber` | String | International phone |
| `Hostname` | String | DNS hostname |
Finance Types
| Type | Base | Description |
|---|---|---|
| `USD` | Decimal | US Dollar amount |
| `EUR` | Decimal | Euro amount |
| `CreditCardNumber` | String | Credit card (Luhn) |
| `IBAN` | String | International bank account |
| `BIC` | String | Bank identifier code |
| `ABARoutingNumber` | String | US routing number |
Healthcare Types
| Type | Base | Description |
|---|---|---|
| `ICD10` | String | ICD-10 diagnosis code |
| `CPT` | String | CPT procedure code |
| `NPI` | String | National Provider ID |
| `NDC` | String | National Drug Code |
| `LOINC` | String | Lab test code |
E-commerce Types
| Type | Base | Description |
|---|---|---|
| `SKU` | String | Stock keeping unit |
| `UPC` | String | UPC-A barcode |
| `EAN13` | String | EAN-13 barcode |
| `ASIN` | String | Amazon product ID |
| `ISBN13` | String | Book ISBN-13 |
Regional Types
| Type | Base | Description |
|---|---|---|
| `SSN` | String | US Social Security |
| `USZipCode` | String | US ZIP code |
| `USState` | String | US state code |
| `GermanVATNumber` | String | German VAT |
| `UKPostcode` | String | UK postcode |
| `IndianPAN` | String | Indian tax ID |
Analytics Types
| Type | Base | Description |
|---|---|---|
| `Percentage` | Decimal | 0-100 percentage |
| `Rate` | Decimal | 0-1 rate |
| `Score` | Decimal | 0-100 score |
| `MRR` | Decimal | Monthly recurring revenue |
| `NPSScore` | Integer | Net promoter score |
Browse all types:
goquality types
goquality types --tag finance
goquality types --search email
Custom Validators (Plugins)
GoQuality supports custom validation logic via Python plugins.
Creating a Validator
# .goquality/plugins/my_validators.py
from goquality.plugins import register_validator
@register_validator("is_palindrome", description="Check if string is palindrome")
def is_palindrome(value: str) -> bool:
    # Ignore case and spaces when comparing the string with its reverse
    clean = value.lower().replace(" ", "")
    return clean == clean[::-1]
@register_validator("divisible_by_three", description="Check divisibility by 3")
def divisible_by_three(value: int) -> bool:
    return value % 3 == 0
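Because validators are plain functions that take a value and return a `bool`, their logic can be exercised without a database. The sketch below uses a local stand-in for `register_validator` so that it runs without GoQuality installed. The stand-in (a registry dict, and a decorator that returns the function unchanged) is an assumption for illustration, not the real `goquality.plugins` implementation.

```python
from typing import Any, Callable, Dict

# Local stand-in registry; the real decorator lives in goquality.plugins.
VALIDATORS: Dict[str, Callable[[Any], bool]] = {}


def register_validator(name: str, description: str = ""):
    """Stand-in decorator: record the function under `name` and return it."""
    def decorator(func: Callable[[Any], bool]) -> Callable[[Any], bool]:
        VALIDATORS[name] = func
        return func
    return decorator


@register_validator("is_palindrome", description="Check if string is palindrome")
def is_palindrome(value: str) -> bool:
    clean = value.lower().replace(" ", "")
    return clean == clean[::-1]


if __name__ == "__main__":
    print(VALIDATORS["is_palindrome"]("Never odd or even"))
    print(is_palindrome("GoQuality"))
```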
Built-in Advanced Validators
| Validator | Description |
|---|---|
| `luhn` | Luhn checksum (credit cards) |
| `iban` | IBAN checksum |
| `isbn10` | ISBN-10 checksum |
| `isbn13` | ISBN-13 checksum |
| `ean13` | EAN-13 barcode checksum |
| `upc` | UPC-A barcode checksum |
| `email_format` | Email format validation |
| `ipv4` | IPv4 address format |
| `ipv6` | IPv6 address format |
| `mac_address` | MAC address format |
| `json` | Valid JSON string |
| `base64` | Valid Base64 encoding |
| `future_date` | Date in the future |
| `past_date` | Date in the past |
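As an illustration of what a checksum validator does, here is a small standalone version of the Luhn algorithm in Python. It is a generic textbook implementation, not GoQuality's `luhn` code. The function name `luhn_valid` and the choice to ignore spaces and hyphens are assumptions.

```python
def luhn_valid(number: str) -> bool:
    """Return True if `number` passes the Luhn checksum."""
    digits = number.replace(" ", "").replace("-", "")
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for index, char in enumerate(reversed(digits)):
        d = int(char)
        if index % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


if __name__ == "__main__":
    print(luhn_valid("4111 1111 1111 1111"))
    print(luhn_valid("4111 1111 1111 1112"))
```

A checksum of this kind catches most single-digit typos and adjacent transpositions, which a regex pattern alone cannot do.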
Troubleshooting
Common Issues
"Config file not found"
# Create a config file
goquality init
# Or specify path
goquality check --config path/to/config.yaml
"Unknown type: X"
# List available types
goquality types --search X
# Check if custom type is defined in config
goquality validate
"Connection failed"
# Run diagnostics
goquality doctor --source YOUR_CONNECTION_STRING
# Check if driver is installed
pip install goquality[postgres] # or [snowflake], [bigquery]
"LLM API error"
# Check API key is set
echo $OPENAI_API_KEY
# Try different provider
goquality generate --source ... --provider anthropic
goquality generate --source ... --provider ollama
Debug Mode
# Enable verbose logging
GOQUALITY_DEBUG=1 goquality check --source ...
Getting Help
# General help
goquality --help
# Command-specific help
goquality check --help
goquality generate --help
License
MIT License - see LICENSE for details.
Contributing
Contributions welcome! See CONTRIBUTING.md for guidelines.
Links
- Documentation: https://goquality.dev/docs
- GitHub: https://github.com/goquality/goquality
- PyPI: https://pypi.org/project/goquality/