Are your data meeting all your expectations?

Data Expectations

Are your data meeting your expectations?


Data Expectations is a Python library that takes a declarative approach to asserting qualities of your datasets. Instead of a test like is_sorted to determine whether a column is ordered, the expectation is column_values_are_increasing. Most of the time you don't need to know how the data got that way; you are only interested in what it looks like now.
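To make the declarative idea concrete, here is a minimal standalone sketch of what an "increasing values" expectation might check. It does not use the library and is not its implementation: the helper name `values_are_increasing` is hypothetical, and treating "increasing" as "never decreasing" is an assumption.

```python
from typing import Any, Iterable, Mapping


def values_are_increasing(
    records: Iterable[Mapping[str, Any]], column: str, ignore_nulls: bool = True
) -> bool:
    """Hypothetical helper: True if the column never decreases between records."""
    previous = None
    for record in records:
        value = record.get(column)
        if value is None:
            if ignore_nulls:
                continue  # nulls are skipped rather than compared
            return False  # a null breaks the expectation when not ignored
        if previous is not None and value < previous:
            return False  # found a value lower than the one before it
        previous = value
    return True


rows = [{"id": 1}, {"id": 2}, {"id": None}, {"id": 5}]
print(values_are_increasing(rows, "id"))
print(values_are_increasing(rows, "id", ignore_nulls=False))
```

The check describes the state of the data (each value is at least as large as the previous one) and says nothing about whether a sort was ever performed.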

Expectations can be used alongside, or in place of, a schema validator. However, Data Expectations is intended to validate the data in a dataset, not the structure of a table. Records should be Python dictionaries (or dictionary-like objects) and can be evaluated one at a time or as an entire list of dictionaries.

Data Expectations was inspired by the great Great Expectations library, but we wanted something lighter and easier to quickly set up and run. Data Expectations can do less, but it does it with a fraction of the effort and has zero dependencies.

Use Cases

  • Use Data Expectations as a step in data processing pipelines, testing that the data conforms to expectations before it is committed to the warehouse.
  • Use Data Expectations to simplify validating user-supplied values.
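The pipeline use case can be sketched as a gate that separates records that pass a check from those that fail. This is an illustrative, library-free sketch: `split_by_check` and `age_is_plausible` are hypothetical names, and the stand-in check raises `ValueError`. In real use the check would presumably call `de.evaluate_record` and the gate would catch `de.errors.ExpectationNotMetError`, as in the Example Usage section.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def age_is_plausible(record: Record) -> None:
    """Stand-in check (hypothetical): raise ValueError unless 0 <= age <= 120."""
    age = record.get("age")
    if not isinstance(age, int) or not 0 <= age <= 120:
        raise ValueError(f"unexpected age: {age!r}")


def split_by_check(
    records: Iterable[Record], check: Callable[[Record], None]
) -> Tuple[List[Record], List[Record]]:
    """Return (accepted, rejected) so only accepted records move downstream."""
    accepted: List[Record] = []
    rejected: List[Record] = []
    for record in records:
        try:
            check(record)
        except ValueError:
            rejected.append(record)  # hold back for inspection
        else:
            accepted.append(record)  # safe to commit
    return accepted, rejected


incoming = [{"name": "charles", "age": 12}, {"name": "anne", "age": 430}]
accepted, rejected = split_by_check(incoming, age_is_plausible)
print(len(accepted), len(rejected))
```

Keeping rejected records in their own list, rather than dropping them, makes it easier to report why a batch was incomplete.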

Provided Expectations

  • expect_column_to_exist (column)
  • expect_column_values_to_not_be_null (column)
  • expect_column_values_to_be_of_type (column, expected_type, ignore_nulls:true)
  • expect_column_values_to_be_in_type_list (column, type_list, ignore_nulls:true)
  • expect_column_values_to_be_more_than (column, threshold, ignore_nulls:true)
  • expect_column_values_to_be_less_than (column, threshold, ignore_nulls:true)
  • expect_column_values_to_be_between (column, maximum, minimum, ignore_nulls:true)
  • expect_column_values_to_be_increasing (column, ignore_nulls:true)
  • expect_column_values_to_be_decreasing (column, ignore_nulls:true)
  • expect_column_values_to_be_in_set (column, symbols, ignore_nulls:true)
  • expect_column_values_to_match_regex (column, regex, ignore_nulls:true)
  • expect_column_values_to_match_like (column, like, ignore_nulls:true)
  • expect_column_values_length_to_be (column, length, ignore_nulls:true)
  • expect_column_values_length_to_be_between (column, maximum, minimum, ignore_nulls:true)
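Most of the expectations above take an ignore_nulls option, and one of them matches values against a `like` pattern. The sketch below shows how these two ideas might combine. It is not the library's code: the helper names are hypothetical, and it assumes `like` follows SQL LIKE conventions (`%` for any run of characters, `_` for exactly one character) and that an ignored null counts as passing.

```python
import re
from typing import Any, Optional


def like_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a SQL-style LIKE pattern into a compiled regular expression."""
    parts = []
    for char in pattern:
        if char == "%":
            parts.append(".*")  # any run of characters, including none
        elif char == "_":
            parts.append(".")  # exactly one character
        else:
            parts.append(re.escape(char))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)


def value_matches_like(
    value: Optional[Any], pattern: str, ignore_nulls: bool = True
) -> bool:
    """Hypothetical check: does the value match the LIKE pattern?"""
    if value is None:
        return ignore_nulls  # a null passes only when nulls are ignored
    return like_to_regex(pattern).fullmatch(str(value)) is not None


print(value_matches_like("charles", "ch%"))
print(value_matches_like(None, "ch%", ignore_nulls=False))
```

Escaping the literal characters matters: without `re.escape`, a pattern such as `a.b` would also match `axb`.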

Install

pip install data_expectations

Data Expectations has no external dependencies and can be used ad hoc, in the moment, without complex setup.

Example Usage

Testing Python Dictionaries

import data_expectations as de
from data_expectations import Expectation
from data_expectations import Behaviors

TEST_DATA = {"name": "charles", "age": 12}

set_of_expectations = [
    Expectation(Behaviors.EXPECT_COLUMN_TO_EXIST, column="name"),
    Expectation(Behaviors.EXPECT_COLUMN_TO_EXIST, column="age"),
    Expectation(Behaviors.EXPECT_COLUMN_VALUES_TO_BE_BETWEEN, column="age", config={"minimum": 0, "maximum": 120}),
]

expectations = de.Expectations(set_of_expectations)
try:
    # Raises ExpectationNotMetError if the record fails any expectation in the set.
    de.evaluate_record(expectations, TEST_DATA)
except de.errors.ExpectationNotMetError:
    print("Data Didn't Meet Expectations")

Testing Individual Values

import data_expectations as de
from data_expectations import Expectation
from data_expectations import Behaviors

expectation = Expectation(Behaviors.EXPECT_COLUMN_VALUES_TO_BE_BETWEEN, column="age", config={"minimum": 0, "maximum": 120})

try:
    # Tests a single value against the expectation; raises ExpectationNotMetError on failure.
    expectation.test_value(55)
except de.errors.ExpectationNotMetError:
    print("Data Didn't Meet Expectations")
