A (branching) Behaviour Synthesizer

Absynthe: A (branching) Behavior Synthesizer

Motivation

Absynthe came about in response to the need for test data for analyzing the performance and accuracy of log analysis algorithms. Even though plenty of real-life logs are available, e.g. /var/log/ on Unix-based laptops, they do not serve the purpose of test data. For that, we need to understand the core application logic that is generating these logs.

A more interesting situation arises while trying to test log analytic (and anomaly detection) solutions for distributed applications where multiple sources or modules emit their respective log messages in a single log queue or stream. This means that consecutive log lines could have originated from different, unrelated application components. Absynthe provides ground truth models to simulate such situations.

You need Absynthe if you wish to simulate the behavior of any well-defined process, whether it's a computer application or a business process flow.

Overview

Each business process or computer application is modelled as a control flow graph (or CFG), which typically has one or more root (i.e. entry) nodes and multiple leaf (i.e. end) nodes.

Tree-like CFG

An example of a simple, tree-like CFG generated using Absynthe is shown below. This is like a tree since nodes are laid out in levels, and nodes at level i have outgoing edges only to nodes at level i + 1.
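The level structure described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Absynthe's TreeBuilder works internally; the function name build_layered_cfg, the node naming scheme and the level_sizes parameter are all hypothetical.

```python
import random
from typing import Dict, List


def build_layered_cfg(level_sizes: List[int], seed: int = 0) -> Dict[str, List[str]]:
    """Build a toy layered graph whose edges go only from level i to level i + 1.

    Hypothetical helper for illustration; not part of Absynthe.
    """
    rng = random.Random(seed)
    # Name each node after its level and its position within that level.
    levels = [[f"L{i}_N{j}" for j in range(size)]
              for i, size in enumerate(level_sizes)]
    graph: Dict[str, List[str]] = {node: [] for level in levels for node in level}
    for upper, lower in zip(levels, levels[1:]):
        # Give every node in the lower level one parent so it is reachable.
        for child in lower:
            graph[rng.choice(upper)].append(child)
        # Give every childless node in the upper level one child, so that
        # only the last level contains leaves.
        for parent in upper:
            if not graph[parent]:
                graph[parent].append(rng.choice(lower))
    return graph


toy_cfg = build_layered_cfg([2, 4, 4])
for node, successors in toy_cfg.items():
    print(node, "->", successors)
```

Nodes in the first level play the role of roots and nodes in the last level play the role of leaves.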

Each behavior is the sequence of nodes encountered while traversing this CFG from a root to a leaf. Of course, a CFG might contain loops which could be traversed multiple times before arriving at the leaf. Moreover, if there are multiple CFGs, then Absynthe can synthesize interleaved behaviors. This means that a single sequence of nodes might contain nodes from multiple CFGs. We are ultimately interested in this interleaving behavior, which is produced by multiple CFGs.
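As a rough sketch of these two ideas, the snippet below walks two tiny hand-written graphs from root to leaf and merges the walks into one sequence. It assumes acyclic graphs stored as plain dictionaries; the names traverse and interleave are hypothetical and do not reflect Absynthe's own Behavior classes.

```python
import random
from typing import Dict, Iterator, List, Tuple

Graph = Dict[str, List[str]]


def traverse(graph: Graph, root: str, rng: random.Random) -> Iterator[str]:
    """Yield the nodes on one random walk from root to a leaf."""
    node = root
    yield node
    while graph[node]:  # a leaf has no successors
        node = rng.choice(graph[node])
        yield node


def interleave(walks: Dict[str, Iterator[str]],
               rng: random.Random) -> Iterator[Tuple[str, str]]:
    """Merge several walks, preserving the node order within each walk."""
    active = dict(walks)
    while active:
        cfg_id = rng.choice(sorted(active))
        try:
            yield cfg_id, next(active[cfg_id])
        except StopIteration:
            # This walk has reached its leaf; stop drawing from it.
            del active[cfg_id]


rng = random.Random(42)
cfg_a: Graph = {"a0": ["a1", "a2"], "a1": [], "a2": ["a3"], "a3": []}
cfg_b: Graph = {"b0": ["b1"], "b1": ["b2"], "b2": []}
merged = list(interleave({"A": traverse(cfg_a, "a0", rng),
                          "B": traverse(cfg_b, "b0", rng)}, rng))
for cfg_id, node in merged:
    print(cfg_id, node)
```

Consecutive entries of merged may come from different graphs, yet filtering by CFG ID recovers each individual root-to-leaf behavior.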

The above screenshot shows logs generated by Absynthe. Each log line starts with a time stamp, followed by a session ID, CFG ID, and a log message. At present, the log message is simply a random concatenation of the node ID to which the log message corresponds. A single CFG might participate in multiple sessions, where each session is a different traversal of the CFG. Therefore, we maintain both session ID and CFG ID in the log line.

Directed Cyclic CFG

An example of a more complex CFG, a directed cyclic graph, is shown in the figure below. It expands the tree-like graph illustrated above by:

  1. attaching loops on some of the nodes,
  2. constructing skip-level edges, i.e. edges from a node at level i to a node at level ≥(i + 2), and
  3. optionally, upward edges (not shown here), i.e. edges from a node at level i to a node at level ≤(i - 1).
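The edge kinds listed above differ only in how far apart the levels of their endpoints are. A minimal sketch of that classification follows; the function classify_edge and its labels are hypothetical, and the "same-level" case is not covered by the list above.

```python
def classify_edge(src_level: int, dst_level: int) -> str:
    """Label an edge by the difference between its endpoint levels."""
    delta = dst_level - src_level
    if delta == 1:
        return "tree"        # level i to level i + 1, as in the tree-like CFG
    if delta >= 2:
        return "skip-level"  # level i to level >= i + 2
    if delta <= -1:
        return "upward"      # level i to level <= i - 1
    return "same-level"      # delta == 0; not described in the list above


for src, dst in [(0, 1), (0, 3), (4, 2), (2, 2)]:
    print(src, dst, classify_edge(src, dst))
```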

The identifiers of nodes appearing in loops are helpfully prefixed with the identifiers of the nodes where these loops start and finish. Moreover, loops could be traversed multiple times in a single behavior, as illustrated in the figure below.

Installation

This package has been developed with Python 3.6.* and depends on scipy 1.2.1. Things might not work with Python 3.7.* or scipy 1.3.*. Therefore, consider creating a virtual environment if your default configuration differs.

The latest release is available on PyPI; simply run pip install absynthe. The master branch of this repository will always provide the latest release.

For the latest features not yet released, clone or download the develop branch and then:

# Change dir to absynthe
cd /path/to/absynthe

# Install dependencies
pip install -r requirements.txt

# Install absynthe
pip install .

Usage

It is possible to start using Absynthe with two classes:

  1. any concrete implementation of the abstract GraphBuilder class, which generates CFGs, and
  2. any concrete implementation of the abstract Behavior class, which traverses the CFGs generated above and emits log messages.

For instance, consider the basicLogGeneration method in ./examples/01_generateSimpleBehavior.py:

from absynthe.graph_builder import TreeBuilder
from absynthe.behavior import MonospaceInterleaving


def basicLogGeneration(numRoots: int = 2, numLeaves: int = 4,
                       branching: int = 2, numInnerNodes: int = 16,
                       loggerNodeTypes: str = "SimpleLoggerNode"):
    # Capture all the arguments required by GraphBuilder class
    tree_kwargs = {TreeBuilder.KW_NUM_ROOTS: str(numRoots),
                   TreeBuilder.KW_NUM_LEAVES: str(numLeaves),
                   TreeBuilder.KW_BRANCHING_DEGREE: str(branching),
                   TreeBuilder.KW_NUM_INNER_NODES: str(numInnerNodes),
                   TreeBuilder.KW_SUPPORTED_NODE_TYPES: loggerNodeTypes}

    # Instantiate a concrete GraphBuilder. Note that the
    # generateNewGraph() method of this class returns a
    # new, randomly generated graph that (more or less)
    # satisfies all the parameters provided to the
    # constructor, viz. tree_kwargs in the present case.
    simpleTreeBuilder = TreeBuilder(**tree_kwargs)

    # Instantiate a concrete behavior generator. Some
    # behavior generators do not print unique session ID
    # for each run, but it's nice to have those.
    wSessionID: bool = True
    exBehavior = MonospaceInterleaving(wSessionID)

    # Add multiple graphs to this behavior generator. The
    # behaviors that it will synthesize would essentially
    # be interleavings of simultaneous traversals of all
    # these graphs.
    exBehavior.addGraph(simpleTreeBuilder.generateNewGraph())
    exBehavior.addGraph(simpleTreeBuilder.generateNewGraph())
    exBehavior.addGraph(simpleTreeBuilder.generateNewGraph())
    exBehavior.addGraph(simpleTreeBuilder.generateNewGraph())

    # Specify how many behaviors are to be synthesized,
    # and get going.
    numTraversalsOfEachGraph: int = 2
    for logLine in exBehavior.synthesize(numTraversalsOfEachGraph):
        print(logLine)
    return

In order to generate behaviors from a directed cyclic CFG, create a DCG as shown in ./examples/03_generateControlFlowDCG.py and then generate behaviors after adding the DCG to a behavior object as shown in the code snippet above.

Note: When generating a behavior, i.e. when traversing a graph, successors of nodes are chosen based on the probability distributions associated with those nodes. Different nodes rely on different distributions and these nodes are randomly assigned in the graphs that are constructed by generateNewGraph() methods, resulting in graphs with a mix of nodes.
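To make the idea of distribution-driven successor selection concrete, here is a minimal sketch that uses only the Python standard library. It assumes each node simply carries one weight per successor; the function choose_successor and the example weights are hypothetical and are not taken from Absynthe's node classes.

```python
import random
from collections import Counter
from typing import Sequence


def choose_successor(successors: Sequence[str],
                     weights: Sequence[float],
                     rng: random.Random) -> str:
    """Pick one successor with probability proportional to its weight."""
    return rng.choices(successors, weights=weights, k=1)[0]


rng = random.Random(7)
# Hypothetical node with two successors, favouring the first one.
draws = Counter(choose_successor(["left", "right"], [0.8, 0.2], rng)
                for _ in range(10_000))
print(draws)
```

Over many traversals, the first successor should be chosen roughly four times as often as the second.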

Release Notes

Note: This tool is still in alpha stage, so backward compatibility is not guaranteed between releases. However, as long as users stick to the graph builders' generateNewGraph() methods, they should avoid compatibility problems.

Major changes in v0.0.2

  1. Added new graph builders, viz. DAGBuilder and DCGBuilder, which build CFGs with skip-level edges and loops respectively.
  2. Added new node, viz. BinomialNode, which exploits the binomial distribution in order to select its successors at the time of graph traversal.
  3. Added a separate utility class called Utils in absynthe.cfg.utils.py to create a new Node object from any of the concrete implementations of Node at random. All concrete implementations of Node are therefore transparently available to graph builders (and everyone else) through this utility.
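One plausible way for a node to use the binomial distribution when selecting a successor is to draw an index from Binomial(n - 1, p), where n is the number of successors. The sketch below illustrates that idea only; it is an assumption about the general technique, not a description of how BinomialNode is implemented, and pick_binomial_successor is a hypothetical name.

```python
import random
from typing import Sequence


def pick_binomial_successor(successors: Sequence[str],
                            p: float,
                            rng: random.Random) -> str:
    """Select successors[k], where k is drawn from Binomial(len - 1, p)."""
    if not successors:
        raise ValueError("a leaf node has no successors")
    trials = len(successors) - 1
    # A binomial draw is the number of successes in independent Bernoulli trials.
    index = sum(1 for _ in range(trials) if rng.random() < p)
    return successors[index]


rng = random.Random(3)
options = ["n0", "n1", "n2", "n3", "n4"]
print([pick_binomial_successor(options, 0.5, rng) for _ in range(10)])
```

With p = 0.5, successors near the middle of the list are chosen most often, while p close to 0 or 1 pushes the choice towards the first or last successor.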

Coming up in future releases

  1. Sophisticated interleaving behaviors
  2. Logger nodes that emit more lifelike log messages
  3. Anomalous behaviors
