Grammarinator: Grammar-based Random Test Generator

ANTLRv4 grammar-based test generator

Grammarinator is a random test generator / fuzzer that creates test cases according to an input ANTLR v4 grammar. The motivation behind this grammar-based approach is to leverage the large variety of publicly available ANTLR v4 grammars.

A trophy page listing the issues found with Grammarinator is available on the project wiki.

Requirements

Python >= 3.7
Java SE >= 7 JRE or JDK (the latter is optional)

Install

To use Grammarinator in another project, it can be added to setup.cfg as an install requirement (if using setuptools with declarative config):

[options]
install_requires =
    grammarinator

To install Grammarinator manually, e.g., into a virtual environment, use pip:

pip install grammarinator

The above approaches install the latest release of Grammarinator from PyPI. Alternatively, for the development version, clone the project and perform a local install:

pip install .

Usage

As a first step, Grammarinator takes an ANTLR v4 grammar and creates a test generator script in Python 3. Grammarinator supports a subset of the features of the ANTLR grammar format; the supported subset is described in the Grammar overview section of the documentation. The produced generator can be subclassed later to customize it further if needed.

Basic command-line syntax of test generator creation:

grammarinator-process <grammar-file(s)> -o <output-directory> --no-actions

Notes

Grammarinator uses the ANTLR v4 grammar format as its input, which makes existing grammars (lexer and parser rules) easily reusable. However, because a fuzzer and a parser have inherently different goals, inlined code (actions and conditions, header and members blocks) is most probably not reusable and may even prevent proper execution. For first experiments with existing grammar files, grammarinator-process supports the command-line option --no-actions, which skips all such code blocks during fuzzer generation. Once the inlined code is tuned for fuzzing, that option may be omitted.

Once a fuzzer has been generated and optionally customized, it can be executed with the grammarinator-generate script (or by manually instantiating it in a custom-written driver).

Basic command-line syntax of grammarinator-generate:

grammarinator-generate <generator> -r <start-rule> -d <max-depth> \
  -o <output-pattern> -n <number-of-tests> \
  -t <transformer1> -t <transformer2>

Besides generating test cases from scratch based on the ANTLR grammar, Grammarinator can also recombine existing inputs or mutate only a small portion of them. To use these additional generation approaches, a population of selected test cases has to be prepared. The preparation is done with the grammarinator-parse tool, which processes the input files with an ANTLR grammar (possibly the same one as the generator grammar) and builds Grammarinator tree representations from them (stored with the .grt extension). Given a population of such .grt files, grammarinator-generate can make use of them through the --population CLI option. If the --population option is set, Grammarinator chooses a strategy (generation, mutation, or recombination) randomly at the creation of every new test case. Any unwanted strategy can be disabled with the --no-generate, --no-mutate, or --no-recombine option.

Basic command-line syntax of grammarinator-parse:
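The per-test strategy choice described above can be pictured with a small sketch. This is not Grammarinator's code: the function name choose_strategy, its parameters, and the uniform choice among the enabled strategies are assumptions made for illustration (the text only says the choice is random).

```python
import random


def choose_strategy(has_population, generate=True, mutate=True,
                    recombine=True, rng=random):
    """Pick one enabled strategy at random (hypothetical helper).

    The keyword arguments mirror the idea behind --no-generate,
    --no-mutate and --no-recombine; a uniform choice is an assumption.
    """
    strategies = []
    if generate:
        strategies.append('generate')
    # Mutation and recombination need existing trees to work on,
    # so they are only candidates when a population is available.
    if has_population:
        if mutate:
            strategies.append('mutate')
        if recombine:
            strategies.append('recombine')
    if not strategies:
        raise ValueError('no strategy is enabled')
    return rng.choice(strategies)


# One independent choice is made for every new test case.
rng = random.Random(0)
for _ in range(3):
    print(choose_strategy(has_population=True, mutate=False, rng=rng))
```

Without a population, only generation from scratch remains as a candidate in this sketch.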

grammarinator-parse <grammar-file(s)> -r <start-rule> \
  -i <input_file(s)> -o <output-directory>

Notes

Real-life grammars often use recursive rules to express certain patterns. However, when such rules are used for generation, the call stack can easily become unexpectedly deep. The --max-depth (or -d) option limits this depth and, with it, the size of the generated test cases.
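To see why a depth limit also bounds the size of the output, consider a toy recursive rule. The sketch below is a hand-written illustration, not Grammarinator's generated code; the grammar, the function names, and the 0.3 stop probability are all assumptions.

```python
import random


# Toy grammar (hypothetical): expr : DIGIT | '(' expr '+' expr ')' ;
def gen_expr(max_depth, rng, depth=0):
    """Generate an expression whose recursion never exceeds max_depth."""
    # Once the depth budget is used up, only the non-recursive
    # alternative is allowed; otherwise stop early now and then.
    if depth >= max_depth or rng.random() < 0.3:
        return str(rng.randint(0, 9))
    left = gen_expr(max_depth, rng, depth + 1)
    right = gen_expr(max_depth, rng, depth + 1)
    return f'({left}+{right})'


def nesting(text):
    """Return the deepest parenthesis nesting level found in text."""
    depth = deepest = 0
    for ch in text:
        if ch == '(':
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ')':
            depth -= 1
    return deepest


rng = random.Random(42)
sample = gen_expr(max_depth=3, rng=rng)
print(sample, nesting(sample))
```

With a budget of 3, the expression can hold at most 8 digits, because each level at most doubles the number of leaves.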

Another peculiarity of ANTLR grammars is that they support so-called hidden tokens. These rules typically describe elements of the target language that can be placed almost anywhere without breaking the syntax. The most common examples are comments and whitespace. However, when such grammars, which do not define explicitly where whitespace may or may not appear in rules, are used to generate test cases, the missing spaces have to be inserted manually. This can be done by applying a serializer (with the -s option) to the tree representation of the output tests. Grammarinator provides a simple serializer that inserts a space after every unparser rule (grammarinator.runtime.simple_space_serializer).
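The role of a serializer can be sketched as a function that takes a tree and returns a string. The Node class below is a hypothetical stand-in, not Grammarinator's tree type, and space_serializer is a simplified illustration that joins token texts with single spaces; it is not the implementation of simple_space_serializer.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: leaves carry token text in src."""
    name: str
    src: Optional[str] = None
    children: List['Node'] = field(default_factory=list)


def space_serializer(root):
    """Collect token texts in document order and join them with spaces."""
    tokens = []

    def walk(node):
        if node.src is not None:
            tokens.append(node.src)
        for child in node.children:
            walk(child)

    walk(root)
    return ' '.join(tokens)


tree = Node('assignment', children=[
    Node('ID', 'a'),
    Node('EQ', '='),
    Node('NUMBER', '1'),
])
print(space_serializer(tree))  # prints: a = 1
```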

In some cases, we may want to postprocess the output tree itself (without serializing it), for example to enforce some logic that cannot be expressed by a context-free grammar. The transformer mechanism (the -t option) serves this purpose. Like a serializer, a transformer takes a tree as input, but instead of creating a string representation, it is expected to return the modified (transformed) tree object.
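As an illustration of the tree-in, tree-out contract, the sketch below enforces a constraint that a context-free grammar cannot express: a closing tag must repeat the name of its opening tag. The Node class, the rule names element and TAG_NAME, and the function match_closing_tags are hypothetical; Grammarinator's real tree classes are not used here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: leaves carry token text in src."""
    name: str
    src: Optional[str] = None
    children: List['Node'] = field(default_factory=list)


def match_closing_tags(root):
    """Make each element's closing tag name equal to its opening one.

    Takes a tree and returns the (modified) tree, not a string.
    """
    for child in root.children:
        match_closing_tags(child)
    if root.name == 'element':
        names = [c for c in root.children if c.name == 'TAG_NAME']
        if len(names) == 2:
            names[1].src = names[0].src
    return root


tree = Node('element', children=[
    Node('TAG_NAME', 'div'),
    Node('TEXT', 'hello'),
    Node('TAG_NAME', 'span'),   # mismatch produced by random generation
])
fixed = match_closing_tags(tree)
print([c.src for c in fixed.children])  # prints: ['div', 'hello', 'div']
```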

As a final thought, keep in mind that the original purpose of grammars is the syntactic validation of inputs. As a consequence, these grammars encode syntactic expectations only, not semantic rules. To add semantic knowledge to the generated tests, custom fuzzers can inherit from the generated ones and redefine the methods corresponding to lexer or parser rules in ways that encode the required knowledge (e.g., HTMLCustomGenerator).
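The subclassing idea can be sketched with a toy stand-in for a generated fuzzer. ToyGenerator and ToyCustomGenerator are hypothetical classes written for this example; real generated methods have different signatures and build tree nodes, whereas these return plain strings to keep the sketch short.

```python
import random
import string


class ToyGenerator:
    """Hypothetical stand-in for a generated fuzzer: one method per rule."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()

    def TAG_NAME(self):
        # Grammar-only knowledge: any run of lowercase letters is valid.
        length = self.rng.randint(1, 8)
        return ''.join(self.rng.choice(string.ascii_lowercase)
                       for _ in range(length))

    def element(self):
        name = self.TAG_NAME()
        return f'<{name}></{name}>'


class ToyCustomGenerator(ToyGenerator):
    """Adds semantic knowledge by redefining a single rule method."""

    KNOWN_TAGS = ('div', 'span', 'p', 'a')

    def TAG_NAME(self):
        # Semantic knowledge: only emit tag names a browser recognizes.
        return self.rng.choice(self.KNOWN_TAGS)


print(ToyGenerator(random.Random(0)).element())
print(ToyCustomGenerator(random.Random(0)).element())
```

Only the overridden rule changes; the inherited element method picks up the new behavior without modification.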

Working Example

The repository contains a minimal example to generate HTML files. To give it a try, run the processor first:

grammarinator-process examples/grammars/HTMLLexer.g4 examples/grammars/HTMLParser.g4 \
  -o examples/fuzzer/

Then, use the generator to produce test cases:

grammarinator-generate HTMLCustomGenerator.HTMLCustomGenerator -r htmlDocument -d 20 \
  -o examples/tests/test_%d.html -n 100 \
  -s HTMLGenerator.html_space_serializer \
  --sys-path examples/fuzzer/

Compatibility

Grammarinator was tested on:

Linux (Ubuntu 16.04 / 18.04 / 20.04)
OS X / macOS (10.12 / 10.13 / 10.14 / 10.15 / 11)
Windows (Server 2012 R2 / Server version 1809 / Windows 10)

Citations

Background on Grammarinator is published in:

Renata Hodovan, Akos Kiss, and Tibor Gyimothy. Grammarinator: A Grammar-Based Open Source Fuzzer. In Proceedings of the 9th ACM SIGSOFT International Workshop on Automating Test Case Design, Selection, and Evaluation (A-TEST 2018), pages 45-48, Lake Buena Vista, Florida, USA, November 2018. ACM. https://doi.org/10.1145/3278186.3278193

Copyright and Licensing

Licensed under the BSD 3-Clause License.

Release history

23.7 (Jul 15, 2023; this version)
19.3 (Mar 30, 2019)
18.10 (Oct 31, 2018)
17.7 (Jul 26, 2017)
17.5 (May 15, 2017)

