Tools to build automata from your custom rule
Project description
Automata Tools
Tools to build automata from your custom rule.
This package provides a set of handy tools to programmatically build automata, so you can build NFA、DFA、MinimizedDFA、WFA from any custom rules.
Usage
Install
conda install -c conda-forge automata-tools # not available yet
# or
pip install automata-tools
Import
See example in examples/NFAfromCustomRule.py
from typing import List
from automata_tools import BuildAutomata, Automata
automata: List[Automata] = []
BuildAutomata
characterStruct
Build simple (0)-[a]->(1)
automata
automata.append(BuildAutomata.characterStruct(char))
unionStruct
Build automata that is an "or" of two sub-automata (1)<-[a]-(0)-[b]->(1)
# to match "a|b"
a = automata.pop()
b = automata.pop()
if operator == "|":
automata.append(BuildAutomata.unionStruct(b, a))
concatenationStruct
Build automata that is an "and" of two sub-automata (0)-[a]->(1)-[b]->(2)
# to match "ab"
a = automata.pop()
b = automata.pop()
automata.append(BuildAutomata.concatenationStruct(b, a))
starStruct
Build automata that looks like the "Kleene closure"
# to match "a*"
if operator == "*":
a = automata.pop()
automata.append(BuildAutomata.starStruct(a))
skipStruct
Build automata that looks like the "Kleene closure" but without the loop back (1)<-[ε]-(2)
, so it only match the token once at most.
# to match "a*"
if operator == "?":
a = automata.pop()
automata.append(BuildAutomata.skipStruct(a))
repeatRangeStruct
Build automata that will match the same token for several times (0)-[a]->(1)-[a]->(2)-[a]->(3)
# to match "a{3}"
repeatedAutomata = BuildAutomata.repeatStruct(automata, 3)
repeatStruct
Build automata that will match the same token for n to m times
(0)-[a]->(1)-[a]->(4), (0)-[a]->(2)-[a]->(3)-[a]->(4)
# to match "a{2,3}"
repeatedAutomata = BuildAutomata.repeatRangeStruct(automata, 2, 3)
Automata
See example in features/steps/customRule.py
from automata_tools import DFAFromNFA, Automata
from your_implementation import NFAFromRegex, executor
nfa: Automata = NFAFromRegex().buildNFA(rule)
minDFA: Automata = DFAFromNFA(nfa).getMinimizedDFA()
minDFA.setExecuter(executor)
print(minDFA.execute(someText))
where executor
is a function like the one in examples/NFAfromCustomRule.py:
def executor(tokens, startState, finalStates, transitions):
return True
setExecuter
Set an executor to the automata that can freely use state and transition of the automata, and return a boolean value.
from automata_tools import IAutomataExecutor
defaultExecuter: IAutomataExecutor = lambda tokens, startState, finalStates, transitions: True
minDFA.setExecuter(defaultExecuter)
setTokenizer
Set an tokenizer to the automata that can transform string to list of string token, which will be used by the executer.
minDFA.setExecuter(lambda input: input.split(' ')[::-1])
NFAtoDFA
Make automata state transitions not so ambiguous
nfa = NFAFromRegex().buildNFA(rule)
dfa = NFAtoDFA(nfa)
DFAtoMinimizedDFA
Allow you minify Automata state
nfa = NFAFromRegex().buildNFA(rule)
minDFA = DFAtoMinimizedDFA(NFAtoDFA(nfa))
Weighted Finite Automata
WFA, it can execute automata use matrix multiplication, so it can be very fast compare to brute force execution, especially when state space is large.
from automata_tools import WFA, get_word_to_index
_, wordToIndex = get_word_to_index([ruleParser(context.rule), tokenizer(text)])
wfa = WFA(minDFA, wordToIndex, dfa_to_tensor)
wfa.execute(text)
get_word_to_index
Given [['token', 'another'], ['token_in_rule']]
, return something like
{'token': 0, 'another': 1, ...}
So we can translate automata state to a matrix.
WFA
Given an automata, a word index like {'token': 0, 'another': 1, ...}
, and a function that transform automata to tensor (see example at customRuleDFAToTensor), return a WFA instance.
Development
Environment
Create environment from the text file:
conda env create --file automataTools-env.txt
conda activate automataTools
Save env file: conda list --explicit > automataTools-env.txt
Python Path
Create a .env
file with content PYTHONPATH=automataTools
Publish
To pypi
rm -rf ./dist && rm -rf ./build && rm -rf ./automata_tools.egg-info && python3 setup.py sdist bdist_wheel && twine upload dist/*
To Conda
# I'm learning how to do...
Resources
Probably the original reference source
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file automata-tools-2.0.1.tar.gz
.
File metadata
- Download URL: automata-tools-2.0.1.tar.gz
- Upload date:
- Size: 12.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.2.0.post20200210 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.8.1
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 1a7c7e64ecc5f82f5f928e71093d5c05d569cd89e9f118babebeebf22518126f |
|
MD5 | 8bcf455a3290af3a28962ff79ee336e1 |
|
BLAKE2b-256 | 0ead7c21625d06489ad561e74d37c322532e9a842977de7d053ba19443af114c |
File details
Details for the file automata_tools-2.0.1-py3-none-any.whl
.
File metadata
- Download URL: automata_tools-2.0.1-py3-none-any.whl
- Upload date:
- Size: 12.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.2.0.post20200210 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.8.1
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 0b055cd1b29fd97a472fcd4782b6f5cfa1fe2bb5653969ab3dbf35a0d33b2117 |
|
MD5 | d66eb4b99babc4db829787da996e0253 |
|
BLAKE2b-256 | e29613e3e6a7f6219382f69280bf2099a384d329ac980e62425be9128815f660 |