Python utility for validating IDS

TetraScience IDS Validator

Overview
Usage
Components
Changelog
- v1.0.0

Overview

Each validation check should lead to a pass or fail
Find as many failures as possible before terminating the validator, to make it easier to fix what’s wrong.
Take definitions into account by resolving $ref references with the "jsonref" library

The validator will validate these files in an IDS folder:

schema.json
elasticsearch.json
athena.json (upcoming)

You can find the validation rules in:

Usage

pipenv run python -m ids_validator path/to/ids/folder

This will run the required checks for the @idsConventionVersion mentioned in schema.json.

If @idsConventionVersion is missing in schema.json or if it is not supported by schema_validator, only generic checks will be run.
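
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the function name `select_check_set`, the `SUPPORTED_VERSIONS` set, and the assumption that the version is stored as a `const` under the root `properties` of schema.json are all hypothetical.

```python
import json
from pathlib import Path

# Hypothetical set of supported convention versions (an assumption for this sketch).
SUPPORTED_VERSIONS = {"v1.0.0"}


def select_check_set(ids_folder: str) -> str:
    """Return the name of the check set to run for the IDS in ids_folder."""
    schema = json.loads((Path(ids_folder) / "schema.json").read_text())
    # Assumption: the version is declared as a const under the root properties.
    version = (
        schema.get("properties", {})
        .get("@idsConventionVersion", {})
        .get("const")
    )
    if version in SUPPORTED_VERSIONS:
        return version
    # Missing or unsupported version: only the generic checks apply.
    return "generic"
```

With this sketch, a folder whose schema.json declares v1.0.0 selects the v1.0.0 checks, and any other folder falls back to "generic".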

Components

Node

Node: UserDict class is an abstraction for a dict in schema.json
When crawling schema.json, each key-value pair whose value is a dict is cast to the Node type.
For each key-value pair, Node has the following attributes:
- name (default=root): The key
- data: The value (a dict)
- path (default=root): The fully qualified path of the key in schema.json
File: ids_node.py
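
A minimal sketch of such a class, assuming only the three attributes listed above. The real class in ids_node.py may differ, and the `child()` helper is hypothetical, added here only to show how a path is built.

```python
from collections import UserDict


class Node(UserDict):
    """Sketch of a dict wrapper that remembers its key and its path."""

    def __init__(self, data: dict, name: str = "root", path: str = "root"):
        super().__init__(data)  # UserDict stores the wrapped dict in self.data
        self.name = name
        self.path = path

    def child(self, key: str) -> "Node":
        # Hypothetical helper: wrap a nested dict and extend the path.
        return Node(self.data[key], name=key, path=f"{self.path}.{key}")


schema = {"type": "object", "properties": {"samples": {"type": "array"}}}
root = Node(schema)
samples = root.child("properties").child("samples")
print(samples.name, samples.path)  # samples root.properties.samples
```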

Checker Classes

A checker class must implement AbstractChecker.
When crawling schema.json, its run() method is called for each node.
run() implements the rules/conditions to be checked when validating the node.
run() accepts two arguments:
- node: Node: The Node for which the checks are run
- context: dict
  - It contains Python dicts for schema.json, athena.json and convention_version.
  - It is used to pass supplementary data required for running complex checks.
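
The interface might look roughly like the sketch below. The context key names and the `AthenaPresenceChecker` class are assumptions made for illustration, and failures are simplified to plain strings here.

```python
from abc import ABC, abstractmethod


class AbstractChecker(ABC):
    """Interface sketch: run() is called once per node visited by the crawler."""

    @abstractmethod
    def run(self, node, context: dict) -> list:
        """Check node and return a list of failures (empty when it passes)."""


class AthenaPresenceChecker(AbstractChecker):
    """Hypothetical checker that only needs the context, not the node."""

    def run(self, node, context: dict) -> list:
        if context.get("athena.json") is None:
            return ["athena.json was not provided"]
        return []


# Assumed key names; the real context keys may be spelled differently.
context = {
    "schema.json": {"type": "object"},
    "athena.json": None,
    "convention_version": "v1.0.0",
}
failures = AthenaPresenceChecker().run(node={}, context=context)
```

Because run() is abstract, instantiating AbstractChecker directly raises a TypeError, which catches checker classes that forget to implement it.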

Validator

The Validator class implements the crawler.
It has the following attributes:
- ids: dict: schema.json converted to a Python dict
- athena: dict: athena.json converted to a Python dict
- checks_list: A list of instantiated checker classes. This list of checks is run for each node
Validator.traverse_ids() crawls from Node to Node in ids:dict, calling run() on the node for each checker in checks_list
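
The crawl can be pictured with the following sketch. It is not the package's implementation: the `logs` attribute, the context key names, the `Node` stand-in, and the `PathRecorder` checker are assumptions made to keep the example self-contained.

```python
from collections import UserDict


class Node(UserDict):
    """Stand-in for the Node class described above."""

    def __init__(self, data, name="root", path="root"):
        super().__init__(data)
        self.name = name
        self.path = path


class Validator:
    def __init__(self, ids: dict, athena: dict, checks_list: list):
        self.ids = ids
        self.athena = athena
        self.checks_list = checks_list
        self.logs = []  # every failure found, collected across all nodes

    def traverse_ids(self, node=None):
        node = Node(self.ids) if node is None else node
        context = {"schema.json": self.ids, "athena.json": self.athena}
        # Run every checker on this node; keep going even if some fail.
        for checker in self.checks_list:
            self.logs += checker.run(node, context)
        # Recurse into every value that is itself a dict.
        for key, value in node.items():
            if isinstance(value, dict):
                self.traverse_ids(Node(value, name=key, path=f"{node.path}.{key}"))


class PathRecorder:
    """Hypothetical checker that reports each visited path, to show the crawl order."""

    def run(self, node, context):
        return [node.path]


ids = {"type": "object", "properties": {"a": {"type": "string"}}}
validator = Validator(ids=ids, athena={}, checks_list=[PathRecorder()])
validator.traverse_ids()
print(validator.logs)  # ['root', 'root.properties', 'root.properties.a']
```

Collecting failures in a list instead of raising on the first one matches the goal of finding as many failures as possible in a single run.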

List of Checker Classes

Base Classes

AbstractChecker
- Every checker class must implement it.
- File: abstract_checker.py
RuleBasedChecker
- It is a base class that allows validating a Node against a set of rules
- It comes in handy for implementing checks for property Nodes that have a predefined template
- The child class inheriting RuleBasedChecker must define rules
- rules is a dict that maps Node.path to a set of rules:dict
- The set of rules for a Node.path may contain the following keys:
  - type: str: defines what the type value of the Node should be
  - min_properties: list: defines the minimum set of property names that must exist for the Node. More properties can exist in addition to min_properties
  - properties: list: defines a set of property names that must exactly match the property list of the Node
  - min_required: list: The required list of the Node must contain at least the values mentioned in min_required
  - required: list: The required list of the Node must only contain values listed in required
Rule-based checkers defined for v1 conventions can be found here
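
To make the five rule keys concrete, here is a rough sketch of how such rules could be applied to one node. The function `check_rules`, the example path, and the example rule values are hypothetical, and the package's own RuleBasedChecker may evaluate rules differently.

```python
def check_rules(path: str, data: dict, rules: dict) -> list:
    """Return one message per violated rule for the node at path."""
    node_rules = rules.get(path, {})
    props = set(data.get("properties", {}))
    required = set(data.get("required", []))
    failures = []
    if "type" in node_rules and data.get("type") != node_rules["type"]:
        failures.append(f"{path}: 'type' must be {node_rules['type']!r}")
    if "min_properties" in node_rules:
        missing = set(node_rules["min_properties"]) - props
        if missing:
            failures.append(f"{path}: missing properties {sorted(missing)}")
    if "properties" in node_rules and props != set(node_rules["properties"]):
        expected = sorted(node_rules["properties"])
        failures.append(f"{path}: properties must be exactly {expected}")
    if "min_required" in node_rules:
        missing = set(node_rules["min_required"]) - required
        if missing:
            failures.append(f"{path}: 'required' is missing {sorted(missing)}")
    if "required" in node_rules:
        extra = required - set(node_rules["required"])
        if extra:
            failures.append(f"{path}: 'required' has unexpected {sorted(extra)}")
    return failures


# Hypothetical rules for a single path.
rules = {
    "root.properties.systems.items": {
        "type": "object",
        "min_properties": ["id", "name"],
        "min_required": ["name"],
    }
}
data = {"type": "object", "properties": {"name": {}}, "required": []}
failures = check_rules("root.properties.systems.items", data, rules)
```

In this example the node has the right type but lacks the id property and does not list name as required, so two failures are reported. A path with no entry in rules produces no failures.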

Generic

AdditionalPropertyChecker: additional_property.py
RequiredPropertiesChecker: required_property.py
DatacubesChecker: datacubes.py
RootNodeChecker: root_node.py
TypeChecker: type_check.py
AthenaChecker: athena.py (work in progress)

V1

V1ChildNameChecker: child_name.py
V1ConventionVersionChecker: convention_version_check.py
V1SystemNodeChecker: nodes_checker.py
V1SampleNodeChecker: nodes_checker.py
V1UserNodeChecker: nodes_checker.py
V1RootNodeChecker: root_node.py
V1SnakeCaseChecker: snake_case.py

Writing New Checks

Checkers must implement AbstractChecker
The run() method implements one or more checks for the node
If there are no failures, an empty list must be returned
If there are failures, it must return a list of one or more tuples
Each tuple contains two values
- log message:str: The message to be logged when the check fails
- criticality: either Log.CRITICAL or Log.WARNING
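
As an illustration of this return contract, the sketch below defines a made-up checker. The `Log` enum and `AbstractChecker` class are stand-ins so the example runs on its own, the node is a plain dict for brevity, and the two checks themselves are hypothetical, not rules from the IDS conventions.

```python
from abc import ABC, abstractmethod
from enum import Enum


class Log(Enum):
    """Stand-in for the package's Log levels."""

    CRITICAL = "critical"
    WARNING = "warning"


class AbstractChecker(ABC):
    """Stand-in for the package's base class."""

    @abstractmethod
    def run(self, node, context: dict) -> list:
        ...


class DescribedTypeChecker(AbstractChecker):
    """Hypothetical checker: a node needs a 'type' and should have a 'description'."""

    def run(self, node, context: dict) -> list:
        logs = []
        if "type" not in node:
            logs.append(("'type' is missing", Log.CRITICAL))
        if "description" not in node:
            logs.append(("'description' is missing", Log.WARNING))
        return logs  # empty list when every check passes


checker = DescribedTypeChecker()
passing = checker.run({"type": "string", "description": "A name"}, context={})
failing = checker.run({}, context={})
```

Here `passing` is an empty list, and `failing` holds two (message, criticality) tuples, one critical and one warning.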

Extending Checkers Classes

Pattern 1

class ChildChecker(ParentChecker):
    def run(self, node: Node, context: dict):
        logs = []
        # Implement new checks and append failures to logs

        # Run the parent checker and append its logs
        logs += super().run(node, context)
        return logs

If the checks_list passed to Validator contains ChildChecker, it must not also contain ParentChecker. Otherwise the ParentChecker checks run twice, and any failures they find are logged twice.

TODO: Instead of returning logs, we could return set(logs) to remove duplicates, but that would not avoid executing the same code twice
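
The duplication can be reproduced with the small sketch below. The checker bodies, the string criticality values, and the `run_all` helper are placeholders. For removing duplicates it uses `dict.fromkeys` rather than `set`, which is a substitution made here because it keeps the original order of the logs.

```python
class ParentChecker:
    def run(self, node, context):
        return [("parent check failed", "critical")]


class ChildChecker(ParentChecker):
    def run(self, node, context):
        logs = [("child check failed", "warning")]
        logs += super().run(node, context)  # parent checks run here
        return logs


def run_all(checks_list, node, context):
    """Hypothetical helper that mimics running every checker on one node."""
    logs = []
    for checker in checks_list:
        logs += checker.run(node, context)
    return logs


# Listing both classes runs the parent checks twice.
duplicated = run_all([ParentChecker(), ChildChecker()], node={}, context={})

# Order-preserving removal of duplicates; the parent code still ran twice.
deduplicated = list(dict.fromkeys(duplicated))
```

`duplicated` contains the parent failure twice, while `deduplicated` contains each failure once. Listing only ChildChecker avoids both the duplicate entry and the repeated work.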

Pattern 2

class ChildChecker(ParentChecker):
    def run(self, node: Node, context: dict):
        logs = []
        # Implement new checks and append failures to logs
        # Use or override helper functions of the parent class
        return logs

Running Checks for Specific Nodes

class AdhocChecker(AbstractChecker):
    def run(self, node: Node, context: dict):
        logs = []
        paths = []
        # paths is a list of fully qualified paths to keys in schema.json
        # each path must start from root
        # eg: root.samples
        # eg: root.samples.items.properties.property_name
        if node.path in paths:
            # Implement new checks and append failures to logs
            # (perform_new_checks is a placeholder for your own function)
            logs += perform_new_checks(node, context)
        return logs

List of Checks for Validator

checks_dict, defined here, maps the type of validation that we want to perform to the list of checks that need to be run for that validation
The list of checks is actually a list of instantiated checker objects
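
The shape of such a mapping might look like the sketch below. The key names ("generic", "v1.0.0"), the choice to extend the generic list for v1, and the empty checker bodies are assumptions. Only the class names come from the lists above.

```python
class TypeChecker:
    """Stand-in with an empty body; the real checker lives in type_check.py."""

    def run(self, node, context):
        return []


class V1SnakeCaseChecker:
    """Stand-in with an empty body; the real checker lives in snake_case.py."""

    def run(self, node, context):
        return []


# Each value is a list of checker instances, not checker classes.
generic_checks = [TypeChecker()]
checks_dict = {
    "generic": generic_checks,
    # Assumption: versioned validation runs the generic checks plus its own.
    "v1.0.0": generic_checks + [V1SnakeCaseChecker()],
}

# Unknown validation types fall back to the generic checks in this sketch.
checks_list = checks_dict.get("v2.0.0", checks_dict["generic"])
```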

Changelog

v1.0.0

Add checker classes for generic validation
Add checker classes for v1.0.0 convention validation

Release history

0.10.5: Mar 8, 2024
0.10.4: Feb 7, 2024
0.10.3: Feb 2, 2024
0.10.2: Dec 13, 2023
0.10.1: Dec 12, 2023
0.10.0: Dec 11, 2023
0.9.16: Jun 13, 2023
0.9.15: May 23, 2023
0.9.14: Jan 23, 2023
0.9.13: Dec 7, 2022
0.9.12: May 4, 2022
0.9.11: Feb 23, 2022
0.9.10: Feb 8, 2022
0.9.9: Jan 7, 2022
0.9.8: Dec 14, 2021
0.9.7: Dec 14, 2021
0.9.6: Nov 23, 2021
0.9.5: Nov 17, 2021
0.9.4: Nov 10, 2021 (this version)
0.9.3: Nov 8, 2021
0.9.2: Nov 8, 2021
0.9.1: Nov 8, 2021
0.9.0: Nov 5, 2021

Download files

Source Distribution

ts-ids-validator-0.9.4.tar.gz (24.4 kB), uploaded Nov 10, 2021, Source

Built Distribution

ts_ids_validator-0.9.4-py3-none-any.whl (32.1 kB), uploaded Nov 10, 2021, Python 3

Hashes for ts-ids-validator-0.9.4.tar.gz
Algorithm	Hash digest
SHA256	`4af5a0aba7bbec0304bcc053ebd8a501f794f8fbebba9ba4a6cf6f401005a9b6`
MD5	`eb9eae9b1c8808a3ced3abaced7652a8`
BLAKE2b-256	`70215264f9ba99d6e869d1dd0764bdc699c3122cc83404bfe181df2ee278e75a`

Hashes for ts_ids_validator-0.9.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d5a0b37a3518acf5ab72154c4f8c9b94ba809386a4d219a9f4bbc2b44cab906e`
MD5	`5d2f2b60d17febbdac6b480a2c69a7ef`
BLAKE2b-256	`4e9f5bbb8ec45b9f7d617738b27855e1e42b81b63acfecd723045344605885a0`