Skip to main content

A flexible, extensible command-line tool for automated data quality validation

Project description

ValidateLite

Python 3.8+ License: MIT Code Coverage

ValidateLite: A lightweight data validation tool for engineers who need answers, fast.

Unlike other complex data validation tools, ValidateLite provides two powerful, focused commands for different scenarios:

  • vlite check: For quick, ad-hoc data checks. Need to verify if a column is unique or not null right now? The check command gets you an answer in 30 seconds, zero config required.

  • vlite schema: For robust, repeatable database schema validation. It's your best defense against schema drift. Embed it in your CI/CD and ETL pipelines to enforce data contracts, ensuring data integrity before it becomes a problem.


Core Use Case: Automated Schema Validation

The vlite schema command is key to ensuring the stability of your data pipelines. It allows you to quickly verify that a database table or data file conforms to a defined structure.

Scenario 1: Gate Deployments in CI/CD

Automatically check for breaking schema changes before they get deployed, preventing production issues caused by unexpected modifications.

Example Workflow (.github/workflows/ci.yml)

jobs:
  validate-db-schema:
    name: Validate Database Schema
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'

      - name: Install ValidateLite
        run: pip install validatelite

      - name: Run Schema Validation
        run: |
          vlite schema --conn "mysql://${{ secrets.DB_USER }}:${{ secrets.DB_PASS }}@${{ secrets.DB_HOST }}/sales" \
                           --rules ./schemas/customers_schema.json

Scenario 2: Monitor ETL/ELT Pipelines

Set up validation checkpoints at various stages of your data pipelines to guarantee data quality and avoid "garbage in, garbage out."

Example Rule File (customers_schema.json)

{
  "customers": {
    "rules": [
      { "field": "id", "type": "integer", "required": true },
      { "field": "name", "type": "string", "required": true },
      { "field": "email", "type": "string", "required": true },
      { "field": "age", "type": "integer", "min": 18, "max": 100 },
      { "field": "gender", "enum": ["Male", "Female", "Other"] },
      { "field": "invalid_col" }
    ]
  }
}

Run Command:

vlite schema --conn "mysql://user:pass@host:3306/sales" --rules customers_schema.json

Quick Start: Ad-Hoc Checks with check

For temporary, one-off validation needs, the check command is your best friend.

1. Install (if you haven't already):

pip install validatelite

2. Run a check:

# Check for nulls in a CSV file's 'id' column
vlite check --conn "customers.csv" --table customers --rule "not_null(id)"

# Check for uniqueness in a database table's 'email' column
vlite check --conn "mysql://user:pass@host/db" --table customers --rule "unique(email)"

Learn More


📝 Development Blog

Follow the journey of building ValidateLite through our development blog posts:


📄 License

This project is licensed under the MIT License.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

validatelite-0.4.2.tar.gz (285.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

validatelite-0.4.2-py3-none-any.whl (160.4 kB view details)

Uploaded Python 3

File details

Details for the file validatelite-0.4.2.tar.gz.

File metadata

  • Download URL: validatelite-0.4.2.tar.gz
  • Upload date:
  • Size: 285.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.11.13

File hashes

Hashes for validatelite-0.4.2.tar.gz
Algorithm Hash digest
SHA256 b92b2bd9b0fa2260e2f3cb2d4529ca5b1e61718b27a2b9e227a64306c1cd0215
MD5 805c6a82883d447b5565fd06af30fa51
BLAKE2b-256 d4d1651b2402a1c32e8b433e30f992e7c78afa46973a1a574449b21e059c1f1a

See more details on using hashes here.

File details

Details for the file validatelite-0.4.2-py3-none-any.whl.

File metadata

  • Download URL: validatelite-0.4.2-py3-none-any.whl
  • Upload date:
  • Size: 160.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.11.13

File hashes

Hashes for validatelite-0.4.2-py3-none-any.whl
Algorithm Hash digest
SHA256 0abc653fe06d31ef12847283800c1466eae4a5a18283c3f535c09fa0281a978b
MD5 deb8b9c1f8fbca3afc32941f01b9042c
BLAKE2b-256 5cb7b72312d0dff165ddd1514f9e2f6987a990d5f1dc8b34d4773f50936eced0

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page