OCSF Schema Validation
OCSF Schema Validator
A utility to validate contributions to the OCSF schema. It is intended to catch human error in contributions so that the schema stays machine-readable.
OCSF provides several include mechanisms to facilitate reuse, but this means individual schema files may be incomplete. This complicates using off-the-shelf schema definition tools for validation.
Query is a federated search solution that normalizes disparate security data to OCSF. This validator is adapted from active code and documentation generation tools written by the Query team.
Getting Started
Prerequisites
- python >3.11
- pip
- A copy of the OCSF schema
Installation
You can install the validator with pip:
$ pip install ocsf-validator
Usage
You can run the validator against your working copy of the schema to identify problems before submitting a PR. Invoke the validator using python and provide it with the path to the root of your working copy.
Examples:
$ python -m ocsf_validator .
$ python -m ocsf_validator ../ocsf-schema
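To run the same check from a script (for example, a pre-commit hook), a minimal sketch might look like the following. It assumes only the python -m ocsf_validator invocation shown above; the helper names build_command and schema_is_valid are hypothetical and are not part of the validator.

```python
import subprocess
import sys


def build_command(schema_path: str) -> list[str]:
    """Build the same command line shown in the examples above."""
    # sys.executable keeps the call inside the current Python environment.
    return [sys.executable, "-m", "ocsf_validator", schema_path]


def schema_is_valid(schema_path: str) -> bool:
    """Run the validator and report whether it exited with code 0."""
    # check=False: a non-zero exit is an expected outcome, not an exception.
    result = subprocess.run(build_command(schema_path), check=False)
    return result.returncode == 0
```

Calling schema_is_valid(".") from the root of a working copy would then mirror the first example above.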
Tests
The validator performs the following tests on a copy of the schema:
- The schema is readable and all JSON is valid. [FATAL]
- The directory structure meets expectations. [WARNING]
- The targets in $include, profiles, and extends directives can be found. [ERROR]
- All required attributes in schema definition files are present. [WARNING]
- There are no unrecognized attributes in schema definition files. [WARNING]
- All attributes in the attribute dictionary are used. [WARNING]
- There are no name collisions within a record type. [WARNING]
- All attributes are defined in the attribute dictionary. [WARNING]
If any ERROR or FATAL tests fail, the validator exits with a non-zero exit code.
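The exit-code rule above can be sketched as a small function. This is an illustration only: the Severity enum and exit_code function are hypothetical names, and the value 1 is an assumption (the text only promises a non-zero code).

```python
from enum import Enum


class Severity(Enum):
    # Labels taken from the test list above; the enum itself is hypothetical.
    WARNING = "WARNING"
    ERROR = "ERROR"
    FATAL = "FATAL"


def exit_code(failures: list[Severity]) -> int:
    """Return 1 if any failed test is ERROR or FATAL, otherwise 0."""
    blocking = {Severity.ERROR, Severity.FATAL}
    # Warnings alone never change the exit code.
    return 1 if any(severity in blocking for severity in failures) else 0
```

Under this sketch, a run that only produces warnings still exits with 0, so warnings alone would not fail a CI job.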
Technical Overview
The OCSF metaschema is represented as record types by filepath, achieved as follows:
- Record types are represented using Python's type system by defining them as Python TypedDicts in types.py. This allows the validator to take advantage of Python's reflection capabilities.
- Files and record types are associated by pattern matching the file paths. These patterns are named in matchers.py to allow mistakes to be caught by a type checker.
- Types are mapped to filepath patterns in type_mapping.py.
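The three ideas above can be sketched together as follows. This is not the validator's actual code: the record types, their keys, the glob patterns, and the function names are all hypothetical, chosen only to show how TypedDict reflection and path matching might combine.

```python
from fnmatch import fnmatchcase
from typing import Any, Optional, TypedDict


class _NamedDefn(TypedDict):
    # Keys every record in this sketch must have (hypothetical).
    name: str
    caption: str


class EventDefn(_NamedDefn, total=False):
    # Optional key, declared with total=False (hypothetical).
    description: str


class ObjectDefn(_NamedDefn):
    pass


# Hypothetical mapping of file path patterns to record types.
TYPE_PATTERNS: list[tuple[str, type]] = [
    ("events/*.json", EventDefn),
    ("objects/*.json", ObjectDefn),
]


def record_type_for(path: str) -> Optional[type]:
    """Return the record type whose pattern matches the path, if any."""
    for pattern, record_type in TYPE_PATTERNS:
        if fnmatchcase(path, pattern):
            return record_type
    return None


def missing_keys(path: str, data: dict[str, Any]) -> set[str]:
    """Required keys (read by reflection) that the file does not define."""
    record_type = record_type_for(path)
    if record_type is None:
        return set()
    return set(record_type.__required_keys__) - set(data)


def unknown_keys(path: str, data: dict[str, Any]) -> set[str]:
    """Keys in the file that the record type does not declare."""
    record_type = record_type_for(path)
    if record_type is None:
        return set()
    known = record_type.__required_keys__ | record_type.__optional_keys__
    return set(data) - set(known)
```

Because the required and optional keys are read from the TypedDict itself, adding a key to a record type in this sketch would update both checks without further code changes.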
The contents of the OCSF schema to be validated are primarily represented as a Reader defined in reader.py. Readers load the schema definitions to be validated from a source (usually from a filesystem) and contain them without judgement. The process_includes function and other contents of processor.py mutate the contents of a Reader by applying OCSF's various include mechanisms.
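To illustrate the idea of mutating loaded contents in place, here is a minimal sketch of resolving $include directives. It stands in a plain dict (path to parsed JSON) for the Reader, and the function name apply_includes is hypothetical. The merge rule, where keys already present in the including file win over included ones, is an assumption made for the example and may not match the validator's behaviour.

```python
from copy import deepcopy
from typing import Any


def apply_includes(contents: dict[str, dict[str, Any]]) -> None:
    """Resolve every $include directive in place."""
    for data in contents.values():
        _resolve(data, contents)


def _resolve(node: Any, contents: dict[str, dict[str, Any]]) -> None:
    if not isinstance(node, dict):
        return
    # $include may name one file or a list of files.
    targets = node.pop("$include", [])
    if isinstance(targets, str):
        targets = [targets]
    for target in targets:
        # A missing target raises KeyError here; a real validator would
        # report it as an error instead.
        for key, value in contents[target].items():
            if key != "$include":
                # Assumption: locally defined keys take precedence.
                node.setdefault(key, deepcopy(value))
    # Recurse only after merging, so the dict is not changed mid-iteration.
    for value in node.values():
        _resolve(value, contents)
```

This sketch does not follow $include directives at the top level of an included file and does not guard against include cycles.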
Validators are defined in validators.py and test the schema contents for various problematic conditions. Validators should pass Exceptions to a special error Collector defined in errors.py. This module also defines a number of custom exception types that represent problematic schema states. The Collector raises errors by default, but can also hold them until they're aggregated by a larger validation process (e.g., the ValidationRunner).
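The raise-by-default, hold-on-request behaviour described above might be sketched like this. SketchCollector, its throw flag, its handle method, and MissingRequiredKeyError are hypothetical names for illustration; the real Collector in errors.py may expose a different interface.

```python
class MissingRequiredKeyError(Exception):
    """Hypothetical custom exception for a problematic schema state."""

    def __init__(self, key: str, file: str) -> None:
        self.key = key
        self.file = file
        super().__init__(f"Missing required key '{key}' in {file}")


class SketchCollector:
    """Raise errors immediately, or hold them for later aggregation."""

    def __init__(self, throw: bool = True) -> None:
        self.throw = throw
        self.errors: list[Exception] = []

    def handle(self, error: Exception) -> None:
        if self.throw:
            raise error
        # Holding mode: keep the error so a runner can report it later.
        self.errors.append(error)
```

Raising by default keeps a validator easy to test in isolation, while the holding mode lets a larger process report every problem in a single run.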
The ValidationRunner combines all of the building blocks above to read a proposed schema from a filesystem, validate the schema, and provide useful output and a non-zero exit code if any errors were encountered.
Contributing
After checking out, you'll want to install dependencies:
poetry install
Before committing, run the formatters and tests:
poetry run isort
poetry run black
poetry run pyright
poetry run pytest
If you're adding a validator, do the following:
- Write your validate_ function in validate.py to apply a function to the relevant keys in a reader that will run your desired validation. See validators.py for examples.
- Add any custom errors in errors.py.
- Create an option to change its severity level in ValidatorOptions and map it in the constructor of ValidationRunner in runner.py.
- Invoke the new validator in ValidationRunner.validate.
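As a rough picture of the first two steps, a new validator and its custom error could look like the sketch below. Everything here is hypothetical: the rule (captions must not be empty), the EmptyCaptionError type, and the use of a plain dict and a report callback in place of the real Reader and Collector.

```python
from typing import Any, Callable


class EmptyCaptionError(Exception):
    """Hypothetical custom error for this example rule."""

    def __init__(self, file: str) -> None:
        self.file = file
        super().__init__(f"Empty caption in {file}")


def validate_no_empty_captions(
    contents: dict[str, dict[str, Any]],
    report: Callable[[Exception], None],
) -> None:
    """Report every file whose caption is present but empty."""
    for path, data in contents.items():
        caption = data.get("caption")
        if isinstance(caption, str) and caption.strip() == "":
            # Hand the problem to the caller instead of raising directly.
            report(EmptyCaptionError(path))
```

Passing the reporting function in keeps the check free of policy: the caller decides whether a finding is raised at once or collected and given a severity.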