A reverse-parser as a Hypothesis strategy: generate examples from an EBNF grammar
Hypothesis-Grammar
(pre-alpha... the stuff I've tried all works, not well tested yet though)
What is it?
Hypothesis-Grammar is a "reverse parser" - given a grammar it will generate examples of that grammar.
It is implemented as a Hypothesis strategy.
(If you are looking to generate text from a grammar for purposes other than testing with Hypothesis then this lib can still be useful, but I strongly recommend looking at the tools provided with NLTK instead.)
Usage
So, how does this look?
First you need a grammar. Our grammar format is based on that used by the Lark parser library. You can see our grammar-parsing grammar here. More details of our grammar format below.
Here is an example of using Hypothesis-Grammar:
from hypothesis_grammar import strategy_from_grammar
st = strategy_from_grammar(
    grammar="""
DET: "the" | "a"
N: "man" | "park" | "dog"
P: "in" | "with"
s: np vp
np: DET N
pp: P np
vp: "slept" | "saw" np | "walked" pp
""",
    start="s",
)
st.example()
# ['a', 'dog', 'saw', 'the', 'man']
st.example()
# ['a', 'park', 'saw', 'a', 'man']
st.example()
# ['the', 'man', 'slept']
or as a test...
from hypothesis import given
from hypothesis_grammar import strategy_from_grammar
@given(
    strategy_from_grammar(
        grammar="""
DET: "the" | "a"
N: "man" | "park" | "dog"
P: "in" | "with"
s: np vp
np: DET N
pp: P np
vp: "slept" | "saw" np | "walked" pp
""",
        start="s",
    )
)
def test_grammar(example):
    nouns = {"man", "park", "dog"}
    assert any(noun in example for noun in nouns)
The grammar is taken from an example in the NLTK docs and converted into our "simplified Lark" format. start="s" tells the parser that the start rule is s.
As you can see, we have produced a Hypothesis strategy which is able to generate examples which match the grammar (in this case, short sentences which sometimes make sense).
The output will always be a flat list of token strings. If you want a sentence you can just " ".join(example).
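As a small illustration, the sample outputs shown earlier can be turned into sentences like this. The token lists are copied from the examples above; to_sentence is a helper name made up for this sketch and is not part of Hypothesis-Grammar.

```python
def to_sentence(tokens):
    """Join a flat list of token strings into a space-separated sentence."""
    return " ".join(tokens)


# Token lists copied from the st.example() outputs shown earlier.
samples = [
    ["a", "dog", "saw", "the", "man"],
    ["the", "man", "slept"],
]

sentences = [to_sentence(tokens) for tokens in samples]
print(sentences)
# ['a dog saw the man', 'the man slept']
```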
But the grammar doesn't have to describe text, it might represent a sequence of actions for example. In that case you might want to convert your result tokens into object instances, which could be done via a lookup table.
(But if you're generating action sequences for tests then probably you should check out Hypothesis' stateful testing features first)
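A rough sketch of the lookup-table idea, using only the standard library. Everything here is hypothetical: the token names ("login", "add_item", "logout"), the state dictionary and the run helper are invented for illustration, and the hard-coded token list stands in for one example generated by the strategy.

```python
def login(state):
    return {**state, "logged_in": True}


def add_item(state):
    return {**state, "items": state["items"] + 1}


def logout(state):
    return {**state, "logged_in": False}


# Lookup table: generated token string -> callable that applies the action.
ACTIONS = {
    "login": login,
    "add_item": add_item,
    "logout": logout,
}


def run(tokens):
    """Apply the action for each token, in order, to a fresh state."""
    state = {"logged_in": False, "items": 0}
    for token in tokens:
        state = ACTIONS[token](state)
    return state


# Stand-in for one generated example (a flat list of token strings).
final_state = run(["login", "add_item", "add_item", "logout"])
print(final_state)
# {'logged_in': False, 'items': 2}
```

Because the strategy output is always a flat list of strings, a plain dict keyed by token is enough; an unknown token raises KeyError, which is usually what you want in a test.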
Grammar details
- Whitespace is ignored
- 'Terminals' must be named all-caps (terminals only reference literals, not other rules), e.g. DET
- 'Rules' must be named all-lowercase, e.g. np
- LHS (name) and RHS are separated by :
- String literals must be quoted with double-quotes e.g. "man"
- You can also use regex literals, they are delimited with forward-slash, e.g. /the[a-z]{0,2}/. Content for the regex token is generated using Hypothesis' from_regex strategy, with fullmatch=True.
- Adjacent tokens are concatenated, i.e. DET N means a DET followed by a N.
- | is alternation, so "in" | "with" means one-of "in" or "with"
- ? means optional, i.e. "in"? means "in" is expected zero-or-one time.
- * i.e. "in"* means "in" is expected zero-or-many times.
- + i.e. "in"+ means "in" is expected one-or-many times.
- ~ <num> means exactly-<num> times.
- ~ <min>..<max> is a range, expected between-<min>-and-<max> times.
- ( and ) are for grouping, the group can be quantified using any of the modifiers above.
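To make the quantifier rules above concrete, here is a standard-library sketch that maps each quantifier to an inclusive repetition range and expands a token accordingly. It is not how Hypothesis-Grammar is implemented: repeat_bounds, expand and MAX_REPEATS are names invented for this illustration, and the cap on the open-ended quantifiers is an assumption, since this document does not say how the library bounds * and +.

```python
import random

# Cap for the open-ended quantifiers in this sketch only (an assumption).
MAX_REPEATS = 5


def repeat_bounds(quantifier):
    """Map a quantifier to an inclusive (min, max) repetition range."""
    if quantifier == "?":
        return (0, 1)
    if quantifier == "*":
        return (0, MAX_REPEATS)
    if quantifier == "+":
        return (1, MAX_REPEATS)
    if quantifier.startswith("~"):
        spec = quantifier[1:].strip()
        if ".." in spec:
            low, high = spec.split("..")
            return (int(low), int(high))
        return (int(spec), int(spec))
    raise ValueError(f"unknown quantifier: {quantifier!r}")


def expand(token, quantifier, rng):
    """Repeat token a random number of times allowed by the quantifier."""
    low, high = repeat_bounds(quantifier)
    return [token] * rng.randint(low, high)


rng = random.Random(0)
for quantifier in ["?", "*", "+", "~ 3", "~ 2..4"]:
    print(quantifier, repeat_bounds(quantifier), expand("in", quantifier, rng))
```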