
For when truth is a little fuzzy.


Join the chat at https://gitter.im/domaintools/koalified_python

Koalified


As engineers, we would love it if all our source data was of perfect quality and structured identically. However, this is not something we always have control over. Koalified is built for the cases where you don’t have control over the source data but want to capture as much data as possible, while getting a sense of the quality of that data versus your known ideal.

Koalified allows you to specify:

  • What data must contain

  • What data can contain

  • What data ideally contains

All within one single and concise schema definition.

It also has built-in support for pulling schemas from a central schema service or service system, and for composing schemas together.

Koalified is built on top of a YAML base with the following symbol-based rules:

  • ! = required

  • ? = fully optional (won’t impact score)

  • + = multiple

  • ~ = weight, needs to be followed by a number (name~20). The default weight is 1.

  • @ = extend schema (provide URI of schema to extend)

  • & = include/nest schema (provide URI of schema to include)

  • = = allow validator to mutate the given data

  • ** = a field name that represents all extra undefined keys in the input data. Can be used to include and normalize data beyond what is strictly defined. All extra data is
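To make the key symbols more concrete, here is a minimal sketch of how a key such as `contact+!` or `name~20` could be split into a base name and its rules. `parse_key` and `SUFFIX_FLAGS` are hypothetical names used only for illustration and are not part of koalified. The sketch assumes the symbols are written as suffixes on the key, as in the examples in this README, and it only covers `!`, `?`, `+` and `~`.

```python
import re

# Hypothetical mapping for illustration only; not koalified's internals.
SUFFIX_FLAGS = {"!": "required", "?": "optional", "+": "multiple"}


def parse_key(key):
    """Split a schema key into its base name and the rules attached to it."""
    rules = {"required": False, "optional": False, "multiple": False, "weight": 1}

    # Pull out an optional weight such as "~20" first (default weight is 1).
    weight = re.search(r"~(\d+)", key)
    if weight:
        rules["weight"] = int(weight.group(1))
        key = key[:weight.start()] + key[weight.end():]

    # Strip trailing symbols one at a time, recording each as a flag.
    while key and key[-1] in SUFFIX_FLAGS:
        rules[SUFFIX_FLAGS[key[-1]]] = True
        key = key[:-1]

    return key, rules


name, rules = parse_key("contact+!")
print(name, rules)
```

With this sketch, `contact+!` yields the name `contact` with both `required` and `multiple` set, while `name~20` yields `name` with a weight of 20.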

Using koalified

Creating a schema:

from koalified.schema import Schema

schema = Schema(text="""
name:
    - match [A-Za-z]
    - str= longest=10:int cut=true:bool
age: int minimum=18:int maximum=120:int
contact+!:
    phone!:
       - phone=
    fax:
       - phone=""")

You can either pass in the YAML data directly, as shown above, or pass in an HTTP or local disk location.

When creating the schema object, you can specify the following instantiation arguments:

  • fail_fast: (default: True) if set to True, validation fails as soon as the first requirement is not met and raises only that exception. If set to False, all encountered errors are collected and returned.

  • score_fields: (default: False) if set to True, a score will be returned for all individual fields in addition to the overall score.

  • explain: (default: False) if set to True, a detailed explanation behind the scoring will be returned.

  • allow_imports: (default: True) if set to True, the schema will be allowed to import and extend other schemas either locally or over http.

  • precompile: (default: False) if set to True, the schema will be compiled immediately upon instantiation of the class. If set to False, the schema is compiled upon its first use.

  • supported_types: (default: None) a dictionary of type_names to callables that will cast into the given type or raise an exception. Can be used to add custom schema types.
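As a rough illustration of the arguments above, the sketch below builds a custom type callable and a set of instantiation options. The `percentage` type and the `schema_options` name are hypothetical. The sketch also assumes that a `supported_types` callable receives a single raw value, which this README does not spell out.

```python
def percentage(value):
    """Cast value to a float between 0 and 100, or raise ValueError."""
    number = float(value)  # raises ValueError for non-numeric strings
    if not 0 <= number <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return number


# Argument names are taken from the list above; the values are examples.
schema_options = {
    "fail_fast": False,     # collect every error instead of stopping at the first
    "score_fields": True,   # also score each individual field
    "explain": True,        # include an explanation of the scoring
    "supported_types": {"percentage": percentage},
}

# With koalified installed, these could presumably be passed along as:
#     Schema(text=..., **schema_options)
print(percentage("42"))
```

Keeping the options in a plain dictionary makes it easy to share one configuration across several schemas.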

Using a schema:

schema({'name': 'timothy', 'age': 29, 'contact': [{'fax':'1800phonenumber', 'phone': '5555555555'}]}) == \
       {'__metadata__': {'schema_version': '4f5f88bc', 'score': 0.75}, 'age': 29, 'contact': [{'fax': '1800phonenumber', 'phone': '5555555555'}], 'name': 'timothy'}
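The returned dictionary carries the quality information under the `__metadata__` key, next to the validated data. The sketch below shows one way to separate the two and act on the score. `split_result`, `is_acceptable` and the `minimum_score` threshold are hypothetical helpers, and the result literal is copied from the example above instead of being produced by a live schema call.

```python
# Result copied from the example above.
result = {
    "__metadata__": {"schema_version": "4f5f88bc", "score": 0.75},
    "age": 29,
    "contact": [{"fax": "1800phonenumber", "phone": "5555555555"}],
    "name": "timothy",
}


def split_result(result):
    """Separate the quality metadata from the validated record."""
    record = dict(result)  # shallow copy so the caller's dict is untouched
    metadata = record.pop("__metadata__", {})
    return metadata, record


def is_acceptable(result, minimum_score=0.5):
    """Return True when the overall score meets a caller-chosen threshold."""
    metadata, _ = split_result(result)
    return metadata.get("score", 0.0) >= minimum_score


metadata, record = split_result(result)
print(metadata["score"], sorted(record))
```

A caller might store every record regardless of score and use the threshold only to flag low-quality rows for review.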

Installing koalified

Installing koalified is as simple as:

pip3 install koalified --upgrade

Ideally, within a virtual environment.

Why koalified?

Koalified was built to help solve the case where the source of data can’t be fully trusted but needs to be stored. It allows specifying what should be, what can be, and what must be, in one concise schema definition.


Thanks and I hope you find koalified helpful!

~Timothy Crosley
