
A (branching) Behaviour Synthesizer


Absynthe: A (branching) Behavior Synthesizer

Motivation

Absynthe came about in response to the need for test data for analyzing the performance and accuracy of log analysis algorithms. Even though plenty of real-life logs are available, e.g. /var/log/ on Unix-based laptops, they do not serve the purpose of test data. For that, we need to understand the core application logic that is generating these logs.

A more interesting situation arises while trying to test log analytic (and anomaly detection) solutions for distributed applications where multiple sources or modules emit their respective log messages in a single log queue or stream. This means that consecutive log lines could have originated from different, unrelated application components. Absynthe provides ground truth models to simulate such situations.

You need Absynthe if you wish to simulate the behavior of any well defined process -- whether it's a computer application or a business process flow.

Overview

Each business process or computer application is modelled as a control flow graph (or CFG), which typically has one or more root (i.e. entry) nodes and multiple leaf (i.e. end) nodes.

Tree-like CFG

An example of a simple, tree-like CFG generated using Absynthe is shown below. This is like a tree since nodes are laid out in levels, and nodes at level i have outgoing edges only to nodes at level i + 1.
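The level structure described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Absynthe's TreeBuilder: the function name build_levelled_graph, the node naming scheme and the level_sizes parameter are all hypothetical, and the sketch does not guarantee that every node is reachable from a root.

```python
import random
from typing import Dict, List


def build_levelled_graph(level_sizes: List[int], branching: int,
                         seed: int = 0) -> Dict[str, List[str]]:
    """Return adjacency lists in which nodes at level i point only to level i + 1.

    Hypothetical helper for illustration; it is not part of Absynthe.
    """
    rng = random.Random(seed)
    # Name nodes by level and position, e.g. "L0N1" is node 1 at level 0.
    levels = [[f"L{i}N{j}" for j in range(size)]
              for i, size in enumerate(level_sizes)]
    graph: Dict[str, List[str]] = {node: [] for level in levels for node in level}
    # Connect each node to `branching` distinct nodes in the next level.
    for upper, lower in zip(levels, levels[1:]):
        for node in upper:
            graph[node] = rng.sample(lower, min(branching, len(lower)))
    return graph


if __name__ == "__main__":
    for node, successors in build_levelled_graph([2, 4, 4], branching=2).items():
        print(node, "->", successors)
```

Nodes in the first level play the role of roots, and nodes in the last level, which have no successors, play the role of leaves.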

Each behavior is the sequence of nodes encountered while traversing this CFG from a root to a leaf. Of course, a CFG might contain loops which could be traversed multiple times before arriving at the leaf. Moreover, if there are multiple CFGs, then Absynthe can synthesize interleaved behaviors. This means that a single sequence of nodes might contain nodes from multiple CFGs. We are ultimately interested in this interleaving behavior, which is produced by multiple CFGs.
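To make the idea of interleaved behaviors concrete, here is a minimal sketch that walks several small graphs from root to leaf and merges the walks into one sequence. It assumes graphs are plain adjacency dictionaries and picks successors uniformly at random; the names traverse and interleave are hypothetical and do not reflect Absynthe's own classes.

```python
import random
from typing import Dict, Iterator, List, Tuple

Graph = Dict[str, List[str]]


def traverse(graph: Graph, root: str, rng: random.Random) -> Iterator[str]:
    """Yield the nodes visited on one random walk from root to a leaf."""
    node = root
    while True:
        yield node
        successors = graph[node]
        if not successors:  # a leaf has no successors, so the behavior ends
            return
        node = rng.choice(successors)


def interleave(walks: Dict[str, Iterator[str]],
               rng: random.Random) -> Iterator[Tuple[str, str]]:
    """Merge several walks, advancing one randomly chosen walk per step."""
    active = dict(walks)
    while active:
        cfg_id = rng.choice(sorted(active))
        step = next(active[cfg_id], None)
        if step is None:  # this walk has reached its leaf
            del active[cfg_id]
            continue
        yield cfg_id, step


if __name__ == "__main__":
    rng = random.Random(1)
    g1: Graph = {"a": ["b"], "b": []}
    g2: Graph = {"x": ["y"], "y": ["z"], "z": []}
    walks = {"cfg1": traverse(g1, "a", rng), "cfg2": traverse(g2, "x", rng)}
    for cfg_id, node in interleave(walks, rng):
        print(cfg_id, node)
```

Within the merged sequence, the nodes belonging to any one CFG still appear in their original traversal order; only the relative order across CFGs is random.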

The above screenshot shows logs generated by Absynthe. Each log line starts with a time stamp, followed by a session ID, CFG ID, and a log message. At present, the log message is simply a random concatenation of the node ID to which the log message corresponds. A single CFG might participate in multiple sessions, where each session is a different traversal of the CFG. Therefore, we maintain both session ID and CFG ID in the log line.
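The field order described above (time stamp, session ID, CFG ID, message) could be assembled as in the following sketch. The separator, the ISO 8601 time stamp and the function name format_log_line are assumptions made for illustration; the exact layout that Absynthe emits may differ.

```python
from datetime import datetime, timezone


def format_log_line(session_id: str, cfg_id: str, message: str,
                    when: datetime) -> str:
    """Join the four fields with single spaces (hypothetical layout)."""
    return " ".join([when.isoformat(), session_id, cfg_id, message])


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(format_log_line("session-1", "cfg-1", "node-7 node-7", now))
```

Keeping both identifiers in every line is what allows a log analysis tool to separate interleaved sessions again.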

Directed Cyclic CFG

An example of a more complex CFG, a directed cyclic graph, is shown in the figure below. It expands the tree-like graph illustrated above by:

  1. attaching loops on some of the nodes,
  2. constructing skip-level edges, i.e. edges from a node at level i to a node at level ≥(i + 2), and
  3. optionally, upward edges (not shown here), i.e. edges from a node at level i to a node at level ≤(i - 1).

The identifiers of nodes appearing in loops are helpfully prefixed with the identifiers of the nodes where these loops start and finish. Moreover, loops could be traversed multiple times in a single behavior, as illustrated in the figure below.
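The edge kinds listed above differ only in how the level of the target node relates to the level of the source node. The sketch below spells out that comparison; classify_edge and its labels are hypothetical names chosen for illustration, and the "same-level" label is an assumption for the case the text does not cover.

```python
def classify_edge(src_level: int, dst_level: int) -> str:
    """Label an edge by comparing the levels of its end points."""
    if dst_level == src_level + 1:
        return "tree"        # the only kind present in the tree-like CFG
    if dst_level >= src_level + 2:
        return "skip-level"  # jumps over at least one level
    if dst_level <= src_level - 1:
        return "upward"      # points back towards the roots
    return "same-level"      # not described in the text; assumed label


if __name__ == "__main__":
    for edge in [(1, 2), (1, 4), (3, 1), (2, 2)]:
        print(edge, classify_edge(*edge))
```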

Installation

This package has been developed with Python 3.6.* and depends on scipy 1.2.1. Things might not work with Python 3.7.* or scipy 1.3.*. Therefore, consider creating a virtual environment if your default configuration differs.

The latest release is available on PyPI; simply run pip install absynthe. The master branch of this repository will always provide the latest release.

For the latest features not yet released, clone or download the develop branch and then:

# Change dir to absynthe
cd /path/to/absynthe

# Install dependencies
pip install -r requirements.txt

# Install absynthe
pip install .

Usage

It is possible to start using Absynthe with two classes:

  1. any concrete implementation of the abstract GraphBuilder class, which generates CFGs, and
  2. any concrete implementation of the abstract Behavior class, which traverses the CFGs generated above and emits log messages.

For instance, consider the basicLogGeneration method in ./examples/01_generateSimpleBehavior.py:

from absynthe.graph_builder import TreeBuilder
from absynthe.behavior import MonospaceInterleaving


def basicLogGeneration(numRoots: int = 2, numLeaves: int = 4,
                       branching: int = 2, numInnerNodes: int = 16,
                       loggerNodeTypes: str = "SimpleLoggerNode"):
    # Capture all the arguments required by GraphBuilder class
    tree_kwargs = {TreeBuilder.KW_NUM_ROOTS: str(numRoots),
                   TreeBuilder.KW_NUM_LEAVES: str(numLeaves),
                   TreeBuilder.KW_BRANCHING_DEGREE: str(branching),
                   TreeBuilder.KW_NUM_INNER_NODES: str(numInnerNodes),
                   TreeBuilder.KW_SUPPORTED_NODE_TYPES: loggerNodeTypes}

    # Instantiate a concrete GraphBuilder. Note that the
    # generateNewGraph() method of this class returns a
    # new, randomly generated graph that (more or less)
    # satisfies all the parameters provided to the
    # constructor, viz. tree_kwargs in the present case.
    simpleTreeBuilder = TreeBuilder(**tree_kwargs)

    # Instantiate a concrete behavior generator. Some
    # behavior generators do not print unique session ID
    # for each run, but it's nice to have those.
    wSessionID: bool = True
    exBehavior = MonospaceInterleaving(wSessionID)

    # Add multiple graphs to this behavior generator. The
    # behaviors that it will synthesize would essentially
    # be interleavings of simultaneous traversals of all
    # these graphs.
    exBehavior.addGraph(simpleTreeBuilder.generateNewGraph())
    exBehavior.addGraph(simpleTreeBuilder.generateNewGraph())
    exBehavior.addGraph(simpleTreeBuilder.generateNewGraph())
    exBehavior.addGraph(simpleTreeBuilder.generateNewGraph())

    # Specify how many behaviors are to be synthesized,
    # and get going.
    numTraversalsOfEachGraph: int = 2
    for logLine in exBehavior.synthesize(numTraversalsOfEachGraph):
        print(logLine)
    return

In order to generate behaviors from a directed cyclic CFG, create a DCG as shown in ./examples/03_generateControlFlowDCG.py and then generate behaviors after adding the DCG to a behavior object as shown in the code snippet above.

Note: When generating a behavior, i.e. when traversing a graph, successors of nodes are chosen based on the probability distributions associated with those nodes. Different nodes rely on different distributions and these nodes are randomly assigned in the graphs that are constructed by generateNewGraph() methods, resulting in graphs with a mix of nodes.
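As a rough illustration of distribution-driven successor selection, the sketch below contrasts a uniform choice with a binomial one. It is written against the standard library only and is an assumption about how such nodes could work, not a description of Absynthe's node classes; choose_uniform, choose_binomial and the parameter p are hypothetical.

```python
import random
from typing import Sequence


def choose_uniform(successors: Sequence[str], rng: random.Random) -> str:
    """Every successor is equally likely."""
    return rng.choice(successors)


def choose_binomial(successors: Sequence[str], rng: random.Random,
                    p: float = 0.5) -> str:
    """Pick the successor at an index drawn from Binomial(n - 1, p).

    The index is the number of successes in n - 1 Bernoulli trials, so it
    always lies in the range 0 .. n - 1, where n = len(successors).
    """
    index = sum(rng.random() < p for _ in range(len(successors) - 1))
    return successors[index]


if __name__ == "__main__":
    rng = random.Random(0)
    successors = ["n0", "n1", "n2", "n3", "n4"]
    counts = {name: 0 for name in successors}
    for _ in range(1000):
        counts[choose_binomial(successors, rng)] += 1
    print(counts)  # middle successors are chosen most often when p = 0.5
```

With a binomial choice, some successors are visited far more often than others, which makes the resulting behaviors less uniform than a purely random walk would be.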

Release Notes

Note: This tool is still in alpha stage, so backward compatibility is not guaranteed between releases. However, as long as users stick to the graph builders' generateNewGraph() methods, they should stay clear of compatibility problems.

Major changes in v0.0.2

  1. Added new graph builders, viz. DAGBuilder and DCGBuilder, which build CFGs with skip-level edges and loops respectively.
  2. Added new node, viz. BinomialNode, which exploits the binomial distribution in order to select its successors at the time of graph traversal.
  3. Added a separate utility class called Utils in absynthe.cfg.utils.py to create a new Node object from any of the concrete implementations of Node at random. All concrete implementations of Node are therefore transparently available to graph builders (and everyone else) through this utility.

Coming up in future releases

  1. Sophisticated interleaving behaviors
  2. Logger nodes that emit more lifelike log messages
  3. Anomalous behaviors
