Skip to main content

Tools to build automata from your custom rule

Project description

Automata Tools

Tools to build automata from your custom rule.

This package provides a set of handy tools to programmatically build automata, so you can build NFA、DFA、MinimizedDFA、WFA from any custom rules.

Usage

Install

conda install -c conda-forge automata-tools # not available yet
# or
pip install automata-tools

Import

See example in examples/NFAfromCustomRule.py

from typing import List
from automata_tools import BuildAutomata, Automata

automata: List[Automata] = []

BuildAutomata

characterStruct

Build simple (0)-[a]->(1) automata

automata.append(BuildAutomata.characterStruct(char))

unionStruct

Build automata that is an "or" of two sub-automata (1)<-[a]-(0)-[b]->(1)

# to match "a|b"
a = automata.pop()
b = automata.pop()
if operator == "|":
    automata.append(BuildAutomata.unionStruct(b, a))

concatenationStruct

Build automata that is an "and" of two sub-automata (0)-[a]->(1)-[b]->(2)

# to match "ab"
a = automata.pop()
b = automata.pop()
automata.append(BuildAutomata.concatenationStruct(b, a))

starStruct

Build automata that looks like the "Kleene closure"

# to match "a*"
if operator == "*":
    a = automata.pop()
    automata.append(BuildAutomata.starStruct(a))

skipStruct

Build automata that looks like the "Kleene closure" but without the loop back (1)<-[ε]-(2), so it only match the token once at most.

# to match "a*"
if operator == "?":
    a = automata.pop()
    automata.append(BuildAutomata.skipStruct(a))

repeatRangeStruct

Build automata that will match the same token for several times (0)-[a]->(1)-[a]->(2)-[a]->(3)

# to match "a{3}"
repeatedAutomata = BuildAutomata.repeatStruct(automata, 3)

repeatStruct

Build automata that will match the same token for n to m times

(0)-[a]->(1)-[a]->(4), (0)-[a]->(2)-[a]->(3)-[a]->(4)

# to match "a{2,3}"
repeatedAutomata = BuildAutomata.repeatRangeStruct(automata, 2, 3)

Automata

See example in features/steps/customRule.py

from automata_tools import DFAFromNFA, Automata

from your_implementation import NFAFromRegex, executor

nfa: Automata = NFAFromRegex().buildNFA(rule)
minDFA: Automata = DFAFromNFA(nfa).getMinimizedDFA()
minDFA.setExecuter(executor)

print(minDFA.execute(someText))

where executor is a function like the one in examples/NFAfromCustomRule.py:

def executor(tokens, startState, finalStates, transitions):
    return True

setExecuter

Set an executor to the automata that can freely use state and transition of the automata, and return a boolean value.

from automata_tools import IAutomataExecutor

defaultExecuter: IAutomataExecutor = lambda tokens, startState, finalStates, transitions: True
minDFA.setExecuter(defaultExecuter)

setTokenizer

Set an tokenizer to the automata that can transform string to list of string token, which will be used by the executer.

minDFA.setExecuter(lambda input: input.split(' ')[::-1])

NFAtoDFA

Make automata state transitions not so ambiguous

nfa = NFAFromRegex().buildNFA(rule)
dfa = NFAtoDFA(nfa)

DFAtoMinimizedDFA

Allow you minify Automata state

nfa = NFAFromRegex().buildNFA(rule)
minDFA = DFAtoMinimizedDFA(NFAtoDFA(nfa))

Weighted Finite Automata

WFA, it can execute automata use matrix multiplication, so it can be very fast compare to brute force execution, especially when state space is large.

from automata_tools import WFA, get_word_to_index

_, wordToIndex = get_word_to_index([ruleParser(context.rule), tokenizer(text)])
wfa = WFA(minDFA, wordToIndex, dfa_to_tensor)
wfa.execute(text)

get_word_to_index

Given [['token', 'another'], ['token_in_rule']], return something like

{'token': 0, 'another': 1, ...}

So we can translate automata state to a matrix.

WFA

Given an automata, a word index like {'token': 0, 'another': 1, ...}, and a function that transform automata to tensor (see example at customRuleDFAToTensor), return a WFA instance.

Development

Environment

Create environment from the text file:

conda env create --file automataTools-env.txt
conda activate automataTools

Save env file: conda list --explicit > automataTools-env.txt

Python Path

Create a .env file with content PYTHONPATH=automataTools

Publish

To pypi

rm -rf ./dist && rm -rf ./build && rm -rf ./automata_tools.egg-info && python3 setup.py sdist bdist_wheel && twine upload dist/*

To Conda

# I'm learning how to do...

Resources

Automata Theory Course Slides

Probably the original reference source

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

automata-tools-2.0.1.tar.gz (12.8 kB view details)

Uploaded Source

Built Distribution

automata_tools-2.0.1-py3-none-any.whl (12.3 kB view details)

Uploaded Python 3

File details

Details for the file automata-tools-2.0.1.tar.gz.

File metadata

  • Download URL: automata-tools-2.0.1.tar.gz
  • Upload date:
  • Size: 12.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.2.0.post20200210 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.8.1

File hashes

Hashes for automata-tools-2.0.1.tar.gz
Algorithm Hash digest
SHA256 1a7c7e64ecc5f82f5f928e71093d5c05d569cd89e9f118babebeebf22518126f
MD5 8bcf455a3290af3a28962ff79ee336e1
BLAKE2b-256 0ead7c21625d06489ad561e74d37c322532e9a842977de7d053ba19443af114c

See more details on using hashes here.

File details

Details for the file automata_tools-2.0.1-py3-none-any.whl.

File metadata

  • Download URL: automata_tools-2.0.1-py3-none-any.whl
  • Upload date:
  • Size: 12.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.2.0.post20200210 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.8.1

File hashes

Hashes for automata_tools-2.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 0b055cd1b29fd97a472fcd4782b6f5cfa1fe2bb5653969ab3dbf35a0d33b2117
MD5 d66eb4b99babc4db829787da996e0253
BLAKE2b-256 e29613e3e6a7f6219382f69280bf2099a384d329ac980e62425be9128815f660

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page