Skip to main content

Tools to build automata from your custom rule

Project description

Automata Tools

Tools to build automata from your custom rule.

This package provides a set of handy tools to programmatically build automata, so you can build NFA、DFA、MinimizedDFA、WFA from any custom rules.

Usage

Install

conda install -c conda-forge automata-tools # not available yet
# or
pip install automata-tools

Import

See example in examples/NFAfromCustomRule.py

from typing import List
from automata_tools import BuildAutomata, Automata

automata: List[Automata] = []

BuildAutomata

characterStruct

Build simple (0)-[a]->(1) automata

automata.append(BuildAutomata.characterStruct(char))

unionStruct

Build automata that is an "or" of two sub-automata (1)<-[a]-(0)-[b]->(1)

# to match "a|b"
a = automata.pop()
b = automata.pop()
if operator == "|":
    automata.append(BuildAutomata.unionStruct(b, a))

concatenationStruct

Build automata that is an "and" of two sub-automata (0)-[a]->(1)-[b]->(2)

# to match "ab"
a = automata.pop()
b = automata.pop()
automata.append(BuildAutomata.concatenationStruct(b, a))

starStruct

Build automata that looks like the "Kleene closure"

# to match "a*"
if operator == "*":
    a = automata.pop()
    automata.append(BuildAutomata.starStruct(a))

skipStruct

Build automata that looks like the "Kleene closure" but without the loop back (1)<-[ε]-(2), so it only match the token once at most.

# to match "a*"
if operator == "?":
    a = automata.pop()
    automata.append(BuildAutomata.skipStruct(a))

repeatRangeStruct

Build automata that will match the same token for several times (0)-[a]->(1)-[a]->(2)-[a]->(3)

# to match "a{3}"
repeatedAutomata = BuildAutomata.repeatStruct(automata, 3)

repeatStruct

Build automata that will match the same token for n to m times

(0)-[a]->(1)-[a]->(4), (0)-[a]->(2)-[a]->(3)-[a]->(4)

# to match "a{2,3}"
repeatedAutomata = BuildAutomata.repeatRangeStruct(automata, 2, 3)

Automata

See example in features/steps/customRule.py

from automata_tools import DFAFromNFA, Automata

from your_implementation import NFAFromRegex, executor

nfa: Automata = NFAFromRegex().buildNFA(rule)
minDFA: Automata = DFAFromNFA(nfa).getMinimizedDFA()
minDFA.setExecuter(executor)

print(minDFA.execute(someText))

where executor is a function like the one in examples/NFAfromCustomRule.py:

def executor(tokens, startState, finalStates, transitions):
    return True

setExecuter

Set an executor to the automata that can freely use state and transition of the automata, and return a boolean value.

from automata_tools import IAutomataExecutor

defaultExecuter: IAutomataExecutor = lambda tokens, startState, finalStates, transitions: True
minDFA.setExecuter(defaultExecuter)

setTokenizer

Set an tokenizer to the automata that can transform string to list of string token, which will be used by the executer.

minDFA.setExecuter(lambda input: input.split(' ')[::-1])

NFAtoDFA

Make automata state transitions not so ambiguous

nfa = NFAFromRegex().buildNFA(rule)
dfa = NFAtoDFA(nfa)

DFAtoMinimizedDFA

Allow you minify Automata state

nfa = NFAFromRegex().buildNFA(rule)
minDFA = DFAtoMinimizedDFA(NFAtoDFA(nfa))

Weighted Finite Automata

WFA, it can execute automata use matrix multiplication, so it can be very fast compare to brute force execution, especially when state space is large.

from automata_tools import WFA, get_word_to_index

_, wordToIndex = get_word_to_index([ruleParser(context.rule), tokenizer(text)])
wfa = WFA(minDFA, wordToIndex, dfa_to_tensor)
wfa.execute(text)

get_word_to_index

Given [['token', 'another'], ['token_in_rule']], return something like

{'token': 0, 'another': 1, ...}

So we can translate automata state to a matrix.

WFA

Given an automata, a word index like {'token': 0, 'another': 1, ...}, and a function that transform automata to tensor (see example at customRuleDFAToTensor), return a WFA instance.

Development

Environment

Create environment from the text file:

conda env create --file automataTools-env.txt
conda activate automataTools

Save env file: conda list --explicit > automataTools-env.txt

Python Path

Create a .env file with content PYTHONPATH=automataTools

Publish

To pypi

rm -rf ./dist && rm -rf ./build && rm -rf ./automata_tools.egg-info && python3 setup.py sdist bdist_wheel && twine upload dist/*

To Conda

# I'm learning how to do...

Resources

Automata Theory Course Slides

Probably the original reference source

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for automata-tools, version 2.0.1
Filename, size File type Python version Upload date Hashes
Filename, size automata_tools-2.0.1-py3-none-any.whl (12.3 kB) File type Wheel Python version py3 Upload date Hashes View
Filename, size automata-tools-2.0.1.tar.gz (12.8 kB) File type Source Python version None Upload date Hashes View

Supported by

Pingdom Pingdom Monitoring Google Google Object Storage and Download Analytics Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page