
A minimal, yet powerful schema validation library for python


Features

This library exists because I started writing it before I came across Cerberus, Marshmallow and Schema.

While the other schema validation libraries have powerful DSLs, their APIs are too big for simple needs. This library has only one concept to keep in mind - a validator is a function that returns a tuple like (False, "Some error message") or (True, None). That's all.

A schema is a dict with 2 keys - "fields" and "validators". The org_schema example below also sets a third, optional key, allow_unknown_fields.

validators is a list of validator functions applied to the input data as a whole rather than at field level. Each one accepts a single dictionary (the whole input data) and returns a tuple in the format described above.
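
As a minimal sketch of a schema-level validator, the function below checks two keys against each other. The names passwords_match and signup_schema are made up for illustration and are not part of the library. The schema is only built here, not passed to the library, and the example schemas further down wrap their validators in func_and_desc rather than passing bare functions.

```python
def passwords_match(data):
    """Schema-level validator: receives the whole input dictionary."""
    if data.get("password") != data.get("confirm_password"):
        return (False, "Passwords do not match")
    return (True, None)


# Hypothetical schema that would use the validator above.
signup_schema = {
    "fields": {
        "password": {"required": True, "type": str},
        "confirm_password": {"required": True, "type": str},
    },
    "validators": [passwords_match],
}

print(passwords_match({"password": "a", "confirm_password": "a"}))
print(passwords_match({"password": "a", "confirm_password": "b"}))
```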

fields is a dict whose keys are the names of the keys in the dictionary being validated. Each field is in turn a dict with one or more of the following optional keys

  1. required - True/False. Alternatively it can be a function which accepts the input dictionary and returns True/False. If not specified, the field is assumed to be not required.

  2. type - The type of data the field expects. It can be any valid Python type - int / str / unicode / date / datetime / Decimal / list / dict (or anything else which is a Python type). It can also be a list of types, in which case the data should be of any one of those types.

  3. validators - A list of validator functions. Each function should accept 2 arguments - the value of the particular key being processed and the whole dictionary itself (in case the validator needs access to the whole data instead of that field alone to decide whether the value is valid). It has to return a tuple, either (True, None) or (False, "some error message"). The error need not be a string. It can be any valid JSON-serializable data structure, such as a list or dict.

  4. permitted_values - A list of permitted values for the field.

  5. If type is list, you can also send the following keys

    i. list_item_type - The type of each item in the list. It can be any Python type or a list of types.

    ii. list_item_schema - If list_item_type is dict, you can optionally provide list_item_schema to validate each dict in the list against another schema.

    iii. permitted_values_for_list_items - Used in the example below to restrict the values allowed for each item in the list.

  6. If type is dict, you can also send dict_schema - the schema to validate the dict against.
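
The field options above can be sketched with a small, hypothetical booking schema. The names booking_schema, end_after_start and end_is_required are invented for this illustration. It shows a callable required, a two-argument field validator, permitted_values and list_item_type. The validators are called directly here and the schema is not passed to the library.

```python
from datetime import date


def end_after_start(end, booking):
    """Field-level validator: gets the field value and the whole dictionary."""
    start = booking.get("start")
    if start is not None and end < start:
        return (False, "end must not be earlier than start")
    return (True, None)


def end_is_required(booking):
    """Callable `required`: end is mandatory only for confirmed bookings."""
    return booking.get("status") == "confirmed"


booking_schema = {
    "fields": {
        "status": {
            "required": True,
            "type": str,
            "permitted_values": ["draft", "confirmed"],
        },
        "start": {"required": True, "type": date},
        "end": {
            "required": end_is_required,
            "type": date,
            "validators": [end_after_start],
        },
        "tags": {"type": list, "list_item_type": str},
    }
}

booking = {"status": "confirmed", "start": date(2020, 1, 2)}
print(end_is_required(booking))
print(end_after_start(date(2020, 1, 1), booking))
```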

At both field and schema level, all validators will be applied one after another and their errors will be collected together in the output.
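
The collect-everything behaviour can be pictured with the sketch below. It is not the library's actual code. run_validators is a hypothetical helper that only illustrates the idea that a failing validator does not stop the ones after it.

```python
def run_validators(validators, *args):
    """Apply every validator in order and collect all errors (hypothetical helper)."""
    errors = []
    for validator in validators:
        ok, error = validator(*args)
        if not ok:
            errors.append(error)
    # Same tuple convention as a single validator.
    return (len(errors) == 0, errors or None)


def not_empty(name, data):
    return (True, None) if name else (False, "Name is empty")


def short_enough(name, data):
    return (True, None) if len(name) <= 5 else (False, "Name is too long")


def no_digits(name, data):
    if any(ch.isdigit() for ch in name):
        return (False, "Name contains digits")
    return (True, None)


checks = [not_empty, short_enough, no_digits]
print(run_validators(checks, "agent007", {}))
# (False, ['Name is too long', 'Name contains digits'])
print(run_validators(checks, "bob", {}))
# (True, None)
```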

To apply the schema, call validate_dict(dictionary, schema) (or validate_list_of_dicts(list_of_dicts, dict_schema)).

The output is a tuple in the same format as the one defined above for validators.

Example:

Let's define 2 schemas

person_schema = {
    "fields": {
        "name": {
            "required": True,
            "type": (str, unicode)
        },
        "gender": {
            "required": True,
            "type": (str, unicode),
            "permitted_values": ("Male", "Female")
        },
        "age": {
            "required": func_and_desc(
                lambda person: person['gender']=='Female',
                "Required if gender is female"),
            "type": int,
            "validators": [
                func_and_desc(
                    lambda age, person: (False, "Too old")
                    if age > 40 else (True, None),
                    "Has to be less than 40")
            ]
        },
        "access_levels": {
            "type": list,
            "list_item_type": int,
            "permitted_values_for_list_items": range(1, 10)
        }
    },
}

org_schema = {
    "fields": {
        "name": {
            "required": True,
            "type": (str, unicode)

        },
        "ceo": {
            "required": True,
            "type": dict,
            "dict_schema": person_schema
        },
        "members": {
            "required": True,
            "type": list,
            "list_item_type": dict,
            "list_item_schema": person_schema
        }
    },
    "validators": [
        func_and_desc(
            lambda org: (False, "Non member cannot be CEO")
            if org["ceo"] not in org["members"] else (True, None),
            "Non member cannot be CEO")
    ],
    "allow_unknown_fields": True
}

And some data to validate against the schema

isaac = {"gender": "Male", "name": "Isaac", "age": "new", "access_levels": [1,4,60]}
surya = {"gender": "Male", "name": "Surya", "age": "h", "city": "Chennai"}
senthil = {"gender": "Male", "name": "Senthil"}
mrx = {"gender": "m", "name": "x"}
sharanya = {
    "gender": "Female", "name": "Sharanya",
    "access_levels": [4, 5, 60]}

Let's first validate some persons

validate_dict(mrx, person_schema)

Output is

(False,
{
    'FIELD_LEVEL_ERRORS': {
        'gender': {
            'PERMITTED_VALUES_ERROR': 'Field data can be one of the following only: Male/Female'
        }
    }
})

Another person

validate_dict(surya, person_schema)

Output

(False,
{
    'FIELD_LEVEL_ERRORS': {
        'age': {
            'HAS_TO_BE_LESS_THAN_40': 'Too old',
            'TYPE_ERROR': 'Field data should be of type int'
        }
    },
    'UNKNOWN_FIELDS': ['city']
})

Now validating the same person, but allowing unknown fields

validate_dict(surya, person_schema, allow_unknown_fields=True)

Output

(False,
{
    'FIELD_LEVEL_ERRORS': {
        'age': {
            'HAS_TO_BE_LESS_THAN_40': 'Too old',
            'TYPE_ERROR': 'Field data should be of type int'
        }
    }
})

Finally, let's create an organization and validate it

inkmonk = {
    "name": "Inkmonk",
    "ceo": isaac,
    "members": [surya, senthil, sharanya],
    "city": "Chennai"
}
validate_dict(inkmonk, org_schema)

Output

(False,
{
    'FIELD_LEVEL_ERRORS': {
        'ceo': {
            'VALIDATION_ERRORS_FOR_OBJECT': {
                'FIELD_LEVEL_ERRORS': {
                    'access_levels': {
                        'VALIDATION_ERRORS_FOR_OBJECTS_IN_LIST': [
                            None,
                            None,
                            {
                                'PERMITTED_VALUES_ERROR': 'Field data can be one of the following only: 1/2/3/4/5/6/7/8/9'
                            }
                        ]
                    },
                    'age': {
                        'HAS_TO_BE_LESS_THAN_40': 'Too old',
                        'TYPE_ERROR': 'Field data should be of type int'
                    }
                }
            }
        },
        'members': {
            'VALIDATION_ERRORS_FOR_OBJECTS_IN_LIST': [
                {
                    'FIELD_LEVEL_ERRORS': {
                        'age': {
                            'HAS_TO_BE_LESS_THAN_40': 'Too old',
                            'TYPE_ERROR': 'Field data should be of type int'
                        }
                    },
                    'UNKNOWN_FIELDS': ['city']
                },
                None,
                {
                    'FIELD_LEVEL_ERRORS': {
                        'access_levels': {
                            'VALIDATION_ERRORS_FOR_OBJECTS_IN_LIST': [
                                None,
                                None,
                                {
                                    'PERMITTED_VALUES_ERROR': 'Field data can be one of the following only: 1/2/3/4/5/6/7/8/9'
                                }
                            ]
                        },
                        'age': {
                            'MISSING_FIELD_ERROR': 'Required if gender is female'
                        }
                    },
                    'MISSING_FIELDS': ['age']
                }
            ]
        }
    },
    'SCHEMA_LEVEL_ERRORS': ['Non member cannot be CEO'],
    'UNKNOWN_FIELDS': ['city']
})

Understanding the errors output

The library is structured to report errors at any level of nesting.

At the outermost level, there are the following keys

"FIELD_LEVEL_ERRORS" - Contains the errors mapped to each field

"SCHEMA_LEVEL_ERRORS" - A list of errors found for the schema as a whole

"UNKNOWN_FIELDS" - If the validation is configured to not allow unknown fields and the data had any, they will be listed here

"MISSING_FIELDS" - List of all missing required fields.

Inside 'FIELD_LEVEL_ERRORS', each field has a dict of errors mapped to it. The keys of the dict are the names of the errors and the values are the error strings. An error dict for a field would look like {'TYPE_ERROR': "This field should have type int only"} or {"PERMITTED_VALUES_ERROR": "The object should have value high/low only"}

If a particular field is of type dict and dict_schema is defined, you can also expect to see a key named VALIDATION_ERRORS_FOR_OBJECT inside errors['FIELD_LEVEL_ERRORS']['particular_field_name']. In that case errors['FIELD_LEVEL_ERRORS']['particular_field_name']['VALIDATION_ERRORS_FOR_OBJECT'] will contain another errors object, obtained by matching the data in this field alone against the nested schema (so that errors object will in turn have FIELD_LEVEL_ERRORS, SCHEMA_LEVEL_ERRORS etc).

If a particular field is of type list and list_item_type is defined, then if there are validation errors for the items in the list, you can expect to see errors['FIELD_LEVEL_ERRORS']['particular_field_name']['VALIDATION_ERRORS_FOR_OBJECTS_IN_LIST']. This will be a list of error objects. If the field is a list of primitive types, each error object will have keys like TYPE_ERROR or PERMITTED_VALUES_ERROR. If it is a list of objects of another schema (defined by list_item_schema), each item in the errors list will be an errors object obtained by validating against that schema, so it will have FIELD_LEVEL_ERRORS, SCHEMA_LEVEL_ERRORS etc. (If an item has no error, the errors list holds None at that index instead of an error object.)
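
Because the errors object nests in the way described above, it can be handy to flatten it into one entry per error. The sketch below assumes only the key names shown in this document. flatten_errors is a hypothetical helper written for Python 3, not something the library provides, and the sample errors dict is a trimmed-down imitation of the outputs above.

```python
NESTED_OBJECT = "VALIDATION_ERRORS_FOR_OBJECT"
NESTED_LIST = "VALIDATION_ERRORS_FOR_OBJECTS_IN_LIST"


def flatten_errors(errors, path=""):
    """Yield (path, error_name, error_value) for every error found."""
    if errors is None:  # list items without errors are reported as None
        return
    for key, value in errors.items():
        if key == "FIELD_LEVEL_ERRORS":
            for field, field_errors in value.items():
                child = f"{path}.{field}" if path else field
                yield from flatten_errors(field_errors, child)
        elif key == NESTED_OBJECT:
            yield from flatten_errors(value, path)
        elif key == NESTED_LIST:
            for index, item_errors in enumerate(value):
                yield from flatten_errors(item_errors, f"{path}[{index}]")
        else:
            # TYPE_ERROR, MISSING_FIELDS, UNKNOWN_FIELDS, custom names, ...
            yield (path, key, value)


errors = {
    "FIELD_LEVEL_ERRORS": {
        "ceo": {
            NESTED_OBJECT: {
                "FIELD_LEVEL_ERRORS": {
                    "age": {"TYPE_ERROR": "Field data should be of type int"}
                }
            }
        },
        "members": {NESTED_LIST: [None, {"MISSING_FIELDS": ["age"]}]},
    },
    "UNKNOWN_FIELDS": ["city"],
}

flat = list(flatten_errors(errors))
for entry in flat:
    print(entry)
```

An empty path in the result refers to the top-level dictionary being validated.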

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.


Files for schemalite, version 0.2.1: schemalite-0.2.1.tar.gz (19.1 kB), source distribution.
