Tools for working with the OCSF schema

OCSF Library for Python

Tools for building Python scripts and applications leveraging the OCSF.

Quick Start

If you just want to use this library as a CLI tool, install it with pip or poetry and try the following commands:

python -m ocsf.compile path/to/ocsf-schema
python -m ocsf.compare my-schema-export.json path/to/ocsf-schema
python -m ocsf.schema 1.2.0
python -m ocsf.validate.compatibility path/to/ocsf-schema 1.2.0

About

This project began with two goals:

  1. Provide the OCSF community with a validator that tests for breaking changes in ocsf-schema PRs.
  2. Begin to provide the OCSF community with more composable tools and libraries, as well as approachable reference implementations of OCSF-related functions, in order to make OCSF more "hackable."

The scope of this project may grow to include things like a reference implementation of an OCSF schema compiler.

The project targets Python 3.11 for a balance of capability and availability. The root level package, ocsf, is a namespace package so that other repositories and artifacts can also use the ocsf namespace.

This library is divided into several discrete packages.

ocsf.util: The utilities package

The ocsf.util package provides the get_schema function. This function leverages the functionality in the ocsf.schema and ocsf.api packages (below) to easily build an OCSF schema from a file on disk, a working copy of an OCSF repository, or from the API.

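from ocsf.util import get_schema

schema = get_schema("1.1.0")  # a released version, retrieved from the API
schema = get_schema("./1.3.0-dev.json")  # a schema file on disk
schema = get_schema("path/to/ocsf-schema")  # a working copy of the repository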

ocsf.schema: The Schema Package

The ocsf.schema package contains Python data classes that represent an OCSF schema as returned by the OCSF server's API endpoints. See the ocsf.schema.model module for the data model definitions.

It also includes utilities to parse the schema from a JSON string or file.
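As a rough illustration of what parsing a schema export into data classes can look like, here is a minimal standard-library sketch. The MiniSchema and MiniEvent classes and the parse_schema helper are hypothetical stand-ins, not the types defined in ocsf.schema.model, and the JSON shape is deliberately simplified.

```python
import json
from dataclasses import dataclass, field


@dataclass
class MiniEvent:
    # Hypothetical, simplified stand-in for an event class definition.
    name: str
    caption: str = ""
    attributes: dict[str, dict] = field(default_factory=dict)


@dataclass
class MiniSchema:
    # Hypothetical, simplified stand-in for a whole schema.
    version: str
    classes: dict[str, MiniEvent] = field(default_factory=dict)


def parse_schema(text: str) -> MiniSchema:
    """Parse a (simplified) schema export from a JSON string."""
    raw = json.loads(text)
    classes = {
        name: MiniEvent(
            name=name,
            caption=body.get("caption", ""),
            attributes=body.get("attributes", {}),
        )
        for name, body in raw.get("classes", {}).items()
    }
    return MiniSchema(version=raw["version"], classes=classes)


example = '{"version": "1.2.0", "classes": {"file_activity": {"caption": "File Activity"}}}'
schema = parse_schema(example)
print(schema.version, sorted(schema.classes))
```

Typed classes like these let a type checker catch misspelled keys that a plain dictionary would silently accept.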

ocsf.repository: The Repository Package

The ocsf.repository package contains a typed Python representation of a working copy of an OCSF schema repository. Said another way, it represents the OCSF metaschema and repository contents in Python.

It also includes the read_repo function to read a repository from disk.

ocsf.compile: An OCSF Compiler

The ocsf.compile package "compiles" the OCSF schema from a repository just as the OCSF server does (with very few exceptions). It is meant to provide:

  1. An easy-to-use CLI tool to compile a repository into a single JSON schema file.
  2. A reference implementation for others looking to better understand OCSF compilation or to create their own compiler.
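To give a feel for what "compiling" involves, the sketch below flattens inheritance, which is one step a compiler of this kind has to perform. It assumes definitions are plain dictionaries that name their parent with an extends key. The resolve_attributes helper is hypothetical and covers only this one step, not the full behavior of ocsf.compile.

```python
def resolve_attributes(name: str, definitions: dict[str, dict], seen: tuple = ()) -> dict:
    """Return the attributes of `name`, merged with those of its ancestors."""
    if name in seen:
        raise ValueError(f"inheritance cycle detected at {name}")
    definition = definitions[name]
    own = definition.get("attributes", {})
    parent = definition.get("extends")
    if parent is None:
        return dict(own)
    # Start from the parent's attributes, then let the child override them.
    merged = resolve_attributes(parent, definitions, seen + (name,))
    merged.update(own)
    return merged


# Hypothetical, heavily simplified repository contents.
definitions = {
    "base_event": {"attributes": {"time": {"requirement": "required"}}},
    "file_activity": {
        "extends": "base_event",
        "attributes": {"file": {"requirement": "required"}},
    },
}

flat = resolve_attributes("file_activity", definitions)
print(sorted(flat))
```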

ocsf.api: The API Package

The ocsf.api package exports an OcsfApiClient, which is a lightweight HTTP client that can retrieve a version of the schema over HTTP and cache it on the local filesystem. It uses the export/schema, api/versions, api/profiles, and api/extensions endpoints of the OCSF server.
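The fetch-then-cache pattern described above can be sketched with the standard library alone. The cached_fetch helper and its parameters are hypothetical and do not reflect the OcsfApiClient interface. The HTTP request is replaced by a stand-in function so the example runs offline.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def cached_fetch(key: str, fetch: Callable[[], str], cache_dir: Path) -> dict:
    """Return parsed JSON for `key`, calling fetch() only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    text = fetch()
    data = json.loads(text)  # parse first so invalid responses are never cached
    path.write_text(text, encoding="utf-8")
    return data


calls = []


def fake_fetch() -> str:
    # Stand-in for an HTTP GET against a schema export endpoint.
    calls.append(1)
    return '{"version": "1.2.0"}'


with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch("schema-1.2.0", fake_fetch, Path(tmp))
    second = cached_fetch("schema-1.2.0", fake_fetch, Path(tmp))

print(first == second, len(calls))
```

Keying the cache on the schema version works because a released version is not expected to change once published.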

ocsf.compare: The Compare Package

The ocsf.compare package compares two versions of the OCSF schema and generates a type-safe difference. Its aim is to make schema comparisons easy to work with.

This package grew out of a library used internally at Query. The original is used extensively to manage upgrading Query's data model to newer versions of OCSF, as well as to build adapters between different OCSF flavors (like AWS Security Lake on rc2 and Query on 1.1).

There is a very simple __main__ implementation to demonstrate the comparison. You can use it as follows:

$ poetry run python -m ocsf.compare 1.0.0 1.2.0

The comparison API is straightforward. Want to look for removed events?

diff = compare(get_schema("1.0.0"), get_schema("1.1.0"))
for name, event in diff.classes.items():
    if isinstance(event, Removal):
        print(f"Oh no, we've lost {name}!")

Or changed data types?

diff = compare(get_schema("1.0.0"), get_schema("1.1.0"))
for name, event in diff.classes.items():
    if isinstance(event, ChangedEvent):
        for attr_name, attr in event.attributes.items():
            if isinstance(attr, ChangedAttr):
                if isinstance(attr.type, Change):
                    print(f"Who changed this data type? {name}.{attr_name}")

Or new objects?

diff = compare(get_schema("1.0.0"), get_schema("1.1.0"))
for name, obj in diff.objects.items():
    if isinstance(obj, Addition):
        print(f"A new object {name} has been discovered!")
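For readers curious how a typed difference of this style can be modeled, here is a self-contained sketch. The Addition, Removal, Change, and NoChange classes below are simplified stand-ins that borrow the names used in the examples above. They are not the library's definitions, and diff_dicts is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass
class Addition(Generic[T]):
    after: T


@dataclass
class Removal(Generic[T]):
    before: T


@dataclass
class Change(Generic[T]):
    before: T
    after: T


@dataclass
class NoChange:
    pass


Difference = Union[Addition[T], Removal[T], Change[T], NoChange]


def diff_dicts(old: dict[str, T], new: dict[str, T]) -> dict[str, Difference[T]]:
    """Classify every key found in either mapping."""
    result: dict[str, Difference[T]] = {}
    for key in sorted(old.keys() | new.keys()):
        if key not in old:
            result[key] = Addition(new[key])
        elif key not in new:
            result[key] = Removal(old[key])
        elif old[key] != new[key]:
            result[key] = Change(old[key], new[key])
        else:
            result[key] = NoChange()
    return result


old = {"file_activity": "v1", "legacy_event": "v1"}
new = {"file_activity": "v2", "new_event": "v1"}
sketch_diff = diff_dicts(old, new)
removed = [key for key, value in sketch_diff.items() if isinstance(value, Removal)]
print(removed)
```

Because each outcome is its own class, isinstance checks narrow the type for a checker such as pyright, which is what makes loops like the ones above safe to write.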

ocsf.validate.framework: The Validation Framework Package

The ocsf.validate.framework package provides a lightweight framework for validators. It was inspired by the needs of ocsf-validator, which may be ported to this framework in the future.

ocsf.validate.compatibility: The Backwards Compatibility Validator

The ocsf.validate.compatibility package provides a backwards compatibility validator for OCSF schemata. It compares two versions of a schema and reports any breaking changes between the old and new version.
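The kind of check such a validator performs can be sketched as follows. The three rules here (removed attribute, changed type, newly required attribute) are illustrative assumptions and not the validator's actual rule set. The breaking_changes helper and the dictionary shapes are hypothetical.

```python
def breaking_changes(old: dict[str, dict], new: dict[str, dict]) -> list[str]:
    """List changes to attributes that could break existing consumers or producers."""
    findings = []
    for name, before in sorted(old.items()):
        after = new.get(name)
        if after is None:
            findings.append(f"{name}: attribute removed")
            continue
        if before.get("type") != after.get("type"):
            findings.append(
                f"{name}: type changed from {before.get('type')} to {after.get('type')}"
            )
        if before.get("requirement") != "required" and after.get("requirement") == "required":
            findings.append(f"{name}: became required")
    return findings


# Hypothetical attribute definitions from two versions of one class.
old_attrs = {
    "uid": {"type": "string_t", "requirement": "optional"},
    "count": {"type": "integer_t", "requirement": "optional"},
    "name": {"type": "string_t", "requirement": "optional"},
}
new_attrs = {
    "uid": {"type": "string_t", "requirement": "required"},
    "count": {"type": "long_t", "requirement": "optional"},
    "label": {"type": "string_t", "requirement": "optional"},
}

for finding in breaking_changes(old_attrs, new_attrs):
    print(finding)
```

Additions such as label are ignored in this sketch on the assumption that a new optional attribute does not break anything.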

Getting Started

PyPI

The easiest way to install ocsf-lib is from PyPI using pip or poetry:

$ pip install ocsf-lib

From Source

If you want to work with the source, the recommended installation is with asdf and poetry.

$ asdf install
$ poetry install

Contributing

This project uses ruff for formatting and linting, pyright for type checking, and pytest as its test runner.

Before submitting a PR, make sure you've run the following:

$ poetry run ruff format
$ poetry run ruff check
$ poetry run pyright
$ poetry run pytest

Type Checking

With great effort, this library passes pyright's strict mode type checking. Keep it that way! The OCSF schema is big, and even the metaschema is a lot to hold in your head. Having the type checker identify mistakes for you can be very helpful.

There is one cast used from the concrete ChangedModel types (ChangedSchema, ChangedAttr, etc.) in the compare package to the generic type. For the life of me, I can't figure it out. I blame pyright but it's probably my own fault.

Tests

Running unit tests:

$ poetry run pytest -m "not integration"

Running integration tests:

$ poetry run pytest -m integration

NOTE: Some of the integration tests require an OCSF server instance and currently use the public instance at https://schema.ocsf.io. They should probably use a local instance of the OCSF server instead.
