A tiny and minimal finite state machine for Python.
TinyFSM
About
TinyFSM is a fast and minimal finite state machine engine for Python. It runs a finite state machine defined as a list of traversals, where each traversal is given by the current state name, the next state name, and a traversal function that is evaluated on the event currently being processed. Once a traversal function returns true, the current state changes to the next state of the matched traversal.
This library can be used as a backbone for building larger-scale FSM-based tokenizers and parsers.
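To make the traversal-matching idea described above concrete, here is a minimal standalone sketch in plain Python. It does not use TinyFSM and is not its implementation: the names `Edge`, `NoMatchingEdge`, and `step` are hypothetical, and two behaviours are assumptions made for illustration (the first matching traversal wins, and an event that matches no traversal raises an error).

```python
from dataclasses import dataclass
from typing import Callable, Generic, Sequence, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Edge(Generic[T]):
    """Hypothetical stand-in for a traversal: source, target, predicate."""
    source: str
    target: str
    accepts: Callable[[T], bool]


class NoMatchingEdge(Exception):
    """Raised when no edge leaving the current state accepts the event."""


def step(edges: Sequence[Edge[T]], state: str, event: T) -> str:
    """Return the next state for `event`.

    Assumption: edges are tried in definition order and the first match wins.
    """
    for edge in edges:
        if edge.source == state and edge.accepts(event):
            return edge.target
    raise NoMatchingEdge(f"state {state!r} has no edge accepting {event!r}")


edges = [
    Edge("initial", "word", str.isalpha),
    Edge("initial", "number", str.isdigit),
    Edge("word", "word", str.isalpha),
    Edge("word", "number", str.isdigit),
    Edge("number", "number", str.isdigit),
    Edge("number", "word", str.isalpha),
]

# Feed events one at a time and record the state after each one.
state = "initial"
visited = []
for char in "ab1":
    state = step(edges, state, char)
    visited.append(state)

print(visited)  # ['word', 'word', 'number']
```

The machine itself holds nothing but the current state name. Everything else, such as buffering characters into tokens, is left to the caller.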
Installation
$ pip install tinyfsm
Quickstart
Here's a simple text tokenizer written using the TinyFSM library. It splits
text composed of alphanumeric characters into groups of words and numbers and
returns a list of (token_name, token_data) tuples:
from tinyfsm.api import Traversal, StateMachineRunner, InputRejectedError

# Each traversal: current state, next state, predicate evaluated on the input.
definition = [
    Traversal[str]("initial", "word", str.isalpha),
    Traversal[str]("initial", "number", str.isdigit),
    Traversal[str]("initial", "final", lambda input: input == ""),
    Traversal[str]("word", "word", str.isalpha),
    Traversal[str]("word", "number", str.isdigit),
    Traversal[str]("word", "final", lambda input: input == ""),
    Traversal[str]("number", "number", str.isdigit),
    Traversal[str]("number", "word", str.isalpha),
    Traversal[str]("number", "final", lambda input: input == ""),
]


class Listener:
    def __init__(self, output: list[tuple[str, str]]):
        self._output = output
        self._buffer = ""

    def on_state_change(self, input: str, prev_state: str, current_state: str):
        # On leaving "word" or "number", emit the buffered characters as a token.
        if prev_state != current_state:
            if prev_state == "word":
                self._output.append(("WORD", self._buffer))
            if prev_state == "number":
                self._output.append(("NUMBER", self._buffer))
            self._buffer = ""

    def on_dispatch_done(self, input: str, current_state: str):
        # Accumulate the characters of the token currently being read.
        self._buffer += input


def tokenize(text: str) -> list[tuple[str, str]]:
    out: list[tuple[str, str]] = []
    listener = Listener(out)
    runner = StateMachineRunner(definition, listener)
    with runner:
        for char in text:
            runner.dispatch(char)
        # The empty string matches the traversals into "final" (end of input).
        runner.dispatch("")
    return out
The tokenizer above splits any text composed only of letters and digits into tokens like this:
out = tokenize("foo123bar456")
print(out)  # Would print: [('WORD', 'foo'), ('NUMBER', '123'), ('WORD', 'bar'), ('NUMBER', '456')]
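When experimenting with the traversal table, it can help to have an independent reference to compare against. The sketch below is not part of TinyFSM and swaps in a different technique: it produces the same kind of (token_name, token_data) list using `itertools.groupby` from the standard library. The names `classify` and `reference_tokenize` are hypothetical, and raising `ValueError` for unsupported characters is an assumption chosen for this example.

```python
from itertools import groupby


def classify(char: str) -> str:
    """Map a character to a token name; reject anything else."""
    if char.isalpha():
        return "WORD"
    if char.isdigit():
        return "NUMBER"
    raise ValueError(f"unsupported character: {char!r}")


def reference_tokenize(text: str) -> list[tuple[str, str]]:
    """Group consecutive characters of the same class into one token."""
    return [(name, "".join(chars)) for name, chars in groupby(text, key=classify)]


print(reference_tokenize("foo123bar456"))
# [('WORD', 'foo'), ('NUMBER', '123'), ('WORD', 'bar'), ('NUMBER', '456')]
```

Comparing the output of `tokenize` with `reference_tokenize` on a handful of inputs is a quick way to catch a missing or misordered traversal.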
License
This project is released under the terms of the MIT license.