Validoopsie is a simple and light data validation library.

Validoopsie

Validoopsie is a lightweight, user-friendly data validation library for Python. It lets you declare classes and chain validations together in a style reminiscent of popular DataFrame libraries, which makes it familiar to developers who regularly work with dataframes.

Thanks to the excellent work by Narwhals, Validoopsie incorporates the "Bring Your Own DataFrame" (BYOD) concept: you can validate any DataFrame that Narwhals supports. For the full range of supported DataFrames, see the Narwhals documentation.

The syntax of Validoopsie is built for ease of use. Every validation is exposed as its own method, and these methods can be chained together. This keeps validation code simple and readable, and spares you from adapting to a new API each time you switch DataFrame libraries.

Validoopsie draws significant inspiration from the Great Expectations library and aims to distill data validation into something straightforward and efficient. Whether you are checking data integrity or ensuring compliance with data standards, it offers a streamlined way to do so.

Table of Contents

  1. Installation
  2. Getting Started
  3. Development
  4. License

Installation

  • pip

    pip install Validoopsie

  • uv

    uv add Validoopsie
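
After installing, one quick way to confirm that the package is visible to the current interpreter is to query its metadata. This is a minimal sketch using only the standard library; the helper name installed_version is illustrative and not part of Validoopsie.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if the package is not installed.
print(installed_version("validoopsie"))
```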

Getting Started

Validoopsie is easy to use, so much so that you could do it half-asleep. Thanks to the BYOD (Bring Your Own DataFrame) concept, you only need to pass your DataFrame to the Validate class and chain the validations you want.

import pandas as pd

from validoopsie import Validate

p_df = pd.DataFrame(
    {
        "name": ["John", "Doe", "Jane"],
        "target_name": ["John", "Doe", "Jane"],
        "last_name": ["Smith", "Smith", "Smith"],
        "age": [25, 30, 35],
    },
)

# `vd` stands for Validate Data.
# "name" is compared with "age" below, so the first check fails (see OUTPUT).
vd = Validate(p_df)
vd.EqualityValidation.PairColumnEquality(
    column="name",
    target_column="age",
    impact="high",
).UniqueValidation.ColumnUniqueValuesToBeInList(
    column="last_name",
    values=["Smith"],
).ValuesValidation.ColumnValuesToBeBetween(
    column="age",
    min_value=20,
    max_value=40,
)

vd.results

OUTPUT:

{
  "Summary": {
    "passed": false,
    "validations": [
      "PairColumnEquality_name",
      "ColumnUniqueValuesToBeInList_last_name",
      "ColumnValuesToBeBetween_age"
    ],
    "failed_validation": ["PairColumnEquality_name"]
  },
  "PairColumnEquality_name": {
    "validation": "PairColumnEquality",
    "impact": "high",
    "timestamp": "2025-10-07T11:25:04.213211+02:00",
    "column": "name",
    "result": {
      "status": "Fail",
      "threshold_pass": false,
      "message": "The column 'name' is not equal to the column'age'.",
      "failing_items": [
        "Doe - column name - column age - 30",
        "Jane - column name - column age - 35",
        "John - column name - column age - 25"
      ],
      "failed_number": 3,
      "frame_row_number": 3,
      "threshold": 0.0,
      "failed_percentage": 1.0
    }
  },
  "ColumnUniqueValuesToBeInList_last_name": {
    "validation": "ColumnUniqueValuesToBeInList",
    "impact": "low",
    "timestamp": "2025-10-07T11:25:04.216417+02:00",
    "column": "last_name",
    "result": {
      "status": "Success",
      "threshold_pass": true,
      "message": "All items passed the validation.",
      "frame_row_number": 3,
      "threshold": 0.0
    }
  },
  "ColumnValuesToBeBetween_age": {
    "validation": "ColumnValuesToBeBetween",
    "impact": "low",
    "timestamp": "2025-10-07T11:25:04.217300+02:00",
    "column": "age",
    "result": {
      "status": "Success",
      "threshold_pass": true,
      "message": "All items passed the validation.",
      "frame_row_number": 3,
      "threshold": 0.0
    }
  }
}
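
The structure above lends itself to simple post-processing. The following is a minimal sketch, assuming the results are available as a plain Python dictionary shaped like the output shown; the helper name failed_validations and the trimmed sample data are illustrative and not part of Validoopsie.

```python
from typing import Any


def failed_validations(results: dict[str, Any]) -> list[dict[str, Any]]:
    """Build one small record per failed validation listed in the Summary block."""
    records = []
    for name in results["Summary"].get("failed_validation", []):
        entry = results[name]
        result = entry["result"]
        records.append(
            {
                "name": name,
                "impact": entry["impact"],
                "column": entry["column"],
                "failed_number": result.get("failed_number"),
                "failed_percentage": result.get("failed_percentage"),
            }
        )
    return records


# Trimmed copy of the output above, used here as sample input.
sample_results = {
    "Summary": {
        "passed": False,
        "validations": ["PairColumnEquality_name", "ColumnValuesToBeBetween_age"],
        "failed_validation": ["PairColumnEquality_name"],
    },
    "PairColumnEquality_name": {
        "validation": "PairColumnEquality",
        "impact": "high",
        "column": "name",
        "result": {"status": "Fail", "failed_number": 3, "failed_percentage": 1.0},
    },
    "ColumnValuesToBeBetween_age": {
        "validation": "ColumnValuesToBeBetween",
        "impact": "low",
        "column": "age",
        "result": {"status": "Success"},
    },
}

for record in failed_validations(sample_results):
    print(record)
```

Reading the failed names from the Summary block avoids scanning every entry, and the use of .get keeps the helper tolerant of fields that only appear on failing results.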

You can also display the validation results in a formatted table using the display_summary method, which provides a clean and readable view of your validation results:

vd.display_summary()

Display a fully detailed table with all available information:

vd.display_summary(information="full")

Customize the table format (supports tabulate formatting options):

vd.display_summary(tablefmt="pipe", maxcolwidths=20)

The display_summary method supports two information levels:

  • "short" (default): Shows key metrics like timestamp, impact, status, validation type, column, threshold, and failure details
  • "full": Shows all available validation and result fields

You can also customize the table appearance using any tabulate formatting options such as tablefmt for different table styles (e.g., "github", "grid", "pipe", "html") and maxcolwidths to control column width.

To confirm that all your validations were executed and to surface failures as errors, call the validate method. Note that an error is raised only when the impact level is set to "high"; failures at other impact levels do not raise an error.

NOTE: The raised error is a custom ValueError.

import pandas as pd

from validoopsie import Validate

p_df = pd.DataFrame(
    {
        "name": ["John", "Doe", "Jane"],
        "target_name": ["John", "Doe", "Jane"],
        "last_name": ["Smith", "Smith", "Smith"],
        "age": [25, 30, 35],
    },
)

# `vd` stands for Validate Data
vd = Validate(p_df)
vd.EqualityValidation.PairColumnEquality(
    column="name",
    target_column="age",
    impact="high",
).UniqueValidation.ColumnUniqueValuesToBeInList(
    column="last_name",
    values=["Smith"],
).ValuesValidation.ColumnValuesToBeBetween(
    column="age",
    min_value=20,
    max_value=40,
).validate()

Thanks to loguru, the output provides condensed, color-coded information on each validation and its status.
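
The impact rule described above (only failed validations with impact "high" raise) can be illustrated on a results dictionary shaped like the earlier output. This is a sketch of the described behaviour, not Validoopsie's implementation: the function name and the sample data are hypothetical, and a plain ValueError stands in for the library's custom one.

```python
from typing import Any


def raise_on_high_impact_failures(results: dict[str, Any]) -> None:
    """Raise ValueError if any failed validation has impact 'high'."""
    failed = results["Summary"].get("failed_validation", [])
    high = [name for name in failed if results[name].get("impact") == "high"]
    if high:
        raise ValueError(f"High-impact validations failed: {', '.join(high)}")


# Two hypothetical results that differ only in the impact level.
high_impact = {
    "Summary": {"passed": False, "failed_validation": ["PairColumnEquality_name"]},
    "PairColumnEquality_name": {"impact": "high", "result": {"status": "Fail"}},
}
low_impact = {
    "Summary": {"passed": False, "failed_validation": ["PairColumnEquality_name"]},
    "PairColumnEquality_name": {"impact": "low", "result": {"status": "Fail"}},
}

# The low-impact failure is recorded in the results but nothing is raised.
raise_on_high_impact_failures(low_impact)

# The high-impact failure raises, so callers can stop a pipeline early.
try:
    raise_on_high_impact_failures(high_impact)
except ValueError as error:
    print(f"Validation stopped: {error}")
```

Because the library's error is described as a custom ValueError, an `except ValueError` clause like the one in this sketch should also be a reasonable way to catch it around a real validate() call.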

Development

Validoopsie includes a Makefile to simplify development tasks:

# Install dependencies
make setup

# Run linters (mypy, ruff)
make lint

# Run tests (includes doctests, stubtest)
make test

# Run both lint and test
make all

For more information on development, check the contribution guidelines.

License

MIT © Validoopsie

Original Creator - Akmal Soliev
