Skip to main content

A Python library for generating OCSF JSON schemas.

Project description

OCSF JSON Schema

[!NOTE] This is a work in progress—primarily for my own learning about OCSF. Please treat this as beta-grade code.

Overview

This project provides a tool for generating JSON Schema files that can be used for validating instances of events that follow the Open Cybersecurity Schema Framework.

Project Goal

The goal of this project is to generate OCSF JSON Schema files locally using Python, eliminating the need to download them from schema.ocsf.io. Given the large number of possible variations across OCSF versions, classes, objects, and profiles, this approach provides a more efficient way to validate events—especially when dealing with diverse inputs. It also reduces reliance on external servers, making validation both faster and more sustainable.

OCSF JSON Schemas are complex, consisting of numerous interrelated classes and objects spanning multiple schema versions. Additionally, the structure must adapt based on the selected OCSF profiles. This tool generates schemas that closely align with those from schema.ocsf.io but introduces key differences:

  • Explicit JSON Schema Draft Version: We explicitly define JSON Schema Draft 2020-12 and validate our outputs against it. The schemas from schema.ocsf.io do not specify a draft version.
  • Extended Features: We add support for the deprecated flag, as well as the at_least_one and just_one constraints.
  • Absolute $id References: Classes and objects are assigned absolute $id values corresponding to their canonical URIs on schema.ocsf.io, ensuring consistency.

OCSF Version

The following OCSF versions are packaged for convenience in ocsf_json_schema/ocsf/. Please also see the README for details on generating the Picket version of the schema files, which can give a slight performance boost.

  • 1.0.0
  • 1.0.0-rc.2
  • 1.0.0-rc.3
  • 1.1.0
  • 1.2.0
  • 1.3.0
  • 1.4.0
  • 1.5.0

You can also bring your own schema if required or desired.

Setup

Requires Python 3.10 or above. There are no other dependencies needed for normal use.

To run the tests, install the dev dependencies (pytest, pytest-cov & jsonschema).

pip install -e '.[dev]' 

Tests can be run with

pytest

Usage

Lookup class schemas

This will generate the JSON schema for:

  • OCSF Schema 1.4.0
  • The 'authentication' class (3002)
  • With the 'cloud' and 'datetime' profiles
  • Including the schema of all referenced OCSF objects embedded.
import json
from ocsf_json_schema import get_ocsf_schema, OcsfJsonSchemaEmbedded

# Loads the packaged 1.4.0 version of the OCSF Schema
ocsf_schema = OcsfJsonSchemaEmbedded(get_ocsf_schema(version='1.4.0'))

# If you only have the class_uid, you can look up the class_name.
class_name = ocsf_schema.lookup_class_name_from_uid(class_uid=3002)

# Returns the JSON schema for the 'authentication' class, 
# with the 'cloud' and 'datetime' profiles applied.
json_schema = ocsf_schema.get_class_schema(
    class_name=class_name, profiles=['cloud', 'datetime']
)

# See what was generated.
print(json.dumps(json_schema, indent=2))

If you don't want objects embedded, i.e. you want the schema only for the class itself, you can use the OcsfJsonSchema rather than OcsfJsonSchemaEmbedded.

Lookup object schemas

This will generate the JSON schema for:

  • OCSF Schema 1.4.0
  • The 'metadata' object
  • With the 'cloud' and 'datetime' profiles
  • Including the schema of all referenced OCSF objects embedded.
import json
from ocsf_json_schema import get_ocsf_schema, OcsfJsonSchemaEmbedded

# Loads the packaged 1.4.0 version of the OCSF Schema
ocsf_schema = OcsfJsonSchemaEmbedded(get_ocsf_schema(version='1.4.0'))

# Returns the JSON schema for the 'metadata' object, 
# with the 'cloud' and 'datetime' profiles applied.
json_schema = ocsf_schema.get_object_schema(
    object_name='metadata', profiles=['cloud', 'datetime']
)

# See what was generated.
print(json.dumps(json_schema, indent=2))

If you don't want other objects embedded, i.e. you want the schema only for the object itself, you can use the OcsfJsonSchema rather than OcsfJsonSchemaEmbedded.

Lookup class or object schemas by their URI

Absolute OCSF schema URIs look like:

This will generate the JSON schema for:

  • OCSF Schema 1.4.0
  • The 'authentication' class (3002)
  • With the 'cloud' and 'datetime' profiles
  • Including the schema of all referenced OCSF objects embedded.
import json
from ocsf_json_schema import get_ocsf_schema, OcsfJsonSchemaEmbedded

# Loads the packaged 1.4.0 version of the OCSF Schema
ocsf_schema = OcsfJsonSchemaEmbedded(get_ocsf_schema(version='1.4.0'))

# Returns the JSON schema for the 'authentication' class, 
# with the 'cloud' and 'datetime' profiles applied.
json_schema = ocsf_schema.get_schema_from_uri(
    uri="https://schema.ocsf.io/schema/1.4.0/classes/authentication?profiles=cloud,datetime"
)

# See what was generated.
print(json.dumps(json_schema, indent=2))

Bring your own schema

If you want to use a version of the OCSF schema that's not packaged (-dev instances, for example), then you can being your own schema.

For example, download the dev schema:

curl -o 1.6.0-dev.json https://schema.ocsf.io/1.6.0-dev/export/schema

Then

import json
from ocsf_json_schema import get_ocsf_schema, OcsfJsonSchemaEmbedded

with open("1.6.0-dev.json", 'r') as file:
    schema_from_file = json.load(file)

# Loads the version of the OCSF Schema from the above file.
ocsf_schema = OcsfJsonSchemaEmbedded(schema_from_file)

# If you only have the class_uid, you can lookup the class_name.
class_name = ocsf_schema.lookup_class_name_from_uid(class_uid=3002)

# Returns the JSON schema for the 'authentication' class, 
# with the 'cloud' and 'datetime' profiles applied.
json_schema = ocsf_schema.get_class_schema(
    class_name=class_name, profiles=['cloud', 'datetime']
)

# See what was generated.
print(json.dumps(json_schema, indent=2))

Validate an OCSF log against the schema

[!NOTE] The validation process itself is outside the scope of this project, but here's an example of how you could do it. jsonschema isn't a dependency of ocsf-json-schema, so you'll need to install it yourself if you wish to use it.

The generated JSON Schema files can be used with any JSON validator that supports 2020-12. Python's jsonschema, for example.

Assuming you have an instance of a OCSF event in the file authentication.log.json:

import json
from jsonschema import validate, exceptions
from ocsf_json_schema import get_ocsf_schema, OcsfJsonSchemaEmbedded

# Loads the packaged 1.4.0 version of the OCSF Schema
ocsf_schema = OcsfJsonSchemaEmbedded(get_ocsf_schema(version='1.4.0'))

# Returns the JSON schema for the 'authentication' class, 
# with the 'cloud' and 'datetime' profiles applied.
json_schema = ocsf_schema.get_class_schema(
    class_name='authentication', profiles=['cloud', 'datetime']
)

with open("authentication.log.json", 'r') as file:
    log_file = json.load(file)

try:
    # An exception is raised if the log file's schema is not as expected.
    validate(instance=log_file, schema=json_schema)
    print("Log's schema is valid.")
except exceptions.SchemaError as e:
    print(f"Log's schema is invalid: {e}")

Notes on validation

  • null values are not supported and, if present, will likely result in the validation failing. If a value is null, the key/value pair should be removed before validation. This aligns with the JSON Schema files from schema.ocsf.io.
  • If you are validating a file that was previously parquet, be careful of fields that should be a dictionary, but may have been converted to a list of tuples. The validator will expect these fields to be a dictionary. This will most likely occur for fields of type object. For example, unmapped.

Build

This repo can be built as a package with the following.

pip install build
rm ocsf_json_schema/ocsf/*.pkl
python -m build

Licence

This project is licensed under the MIT License - see the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ocsf_json_schema-1.0.0.tar.gz (1.3 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ocsf_json_schema-1.0.0-py3-none-any.whl (1.4 MB view details)

Uploaded Python 3

File details

Details for the file ocsf_json_schema-1.0.0.tar.gz.

File metadata

  • Download URL: ocsf_json_schema-1.0.0.tar.gz
  • Upload date:
  • Size: 1.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-httpx/0.28.1

File hashes

Hashes for ocsf_json_schema-1.0.0.tar.gz
Algorithm Hash digest
SHA256 504785efcf28179693fcca750c60d56b02c243408e6ee372eaa2442dd1b8584f
MD5 3d5b052042c75ae784ba554703449679
BLAKE2b-256 96bbf0b01c66a26ec9b317842a123c8827d2250116fa5155b1b4ee1a92c2e88d

See more details on using hashes here.

File details

Details for the file ocsf_json_schema-1.0.0-py3-none-any.whl.

File metadata

File hashes

Hashes for ocsf_json_schema-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 86bef579e71866965462752090470d47420a432498eadfc933380895e439096a
MD5 7a801d6aa2c4ad49b56b2f89424c1c00
BLAKE2b-256 248d4020035f3da84815fd67fea74f7f88288eda41a008debf611d77a29eac4c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page