Generic tree dataclass with visitor
Project description
dataclass-tree
Overview
dataclass-tree provides a set of base classes and utilities for building and traversing tree-like data structures using Python's dataclass feature. It is designed to make tree construction, traversal, and manipulation both intuitive and type-safe, with a particular emphasis on consistent type hinting for maximum IDE support.
Main Classes
Tree
The Tree class serves as the abstract base for all tree nodes. It is designed around the concept of child groups that can have arbitrary names and may contain any number of Tree items (or subclasses thereof).
It is a dataclass, so any field annotated with list[Tree] will treated as child groups, while other fields function like normal. Child groups are treated special in the following way:
- They are automatically initialized to empty lists, even if no default factory is specified, and even if the field is annotated as optional with
list[Tree]|None. - They can be initialized with the
optional_treelist()function to clearly mark them as such when hovering over them in an IDE. - Child groups can be traversed with the
visit()method, with support with different orders:PREORDER,POSTORDER,INORDER, andLEVELORDER.
The Tree base class has no child groups defined, so you will need to subclass it (or LabeledTree) to use it. (See examples below). Subclasses must use the @dataclass decorator,
and any fields defined in the subclass are added to those defined in the superclass. Fields with the same name will replace those in the superclass, and order is significant. (This is standard dataclass behavior).
You can create trees containing a mix of Tree subclasses, each with their own uniquely nae child groups. For example, you could have a tree representing an expression in a programming language, with a Binop class with left and right child groups, and a Unop class having just an expression child group. Even though they have different names, they would be visited in a consistent way by a visitor, regardless their names.
LabeledTree
A subclass of Tree with an additional label field. This is a very common usecase, so this class is provided as a convenience to save you some typing.
Visitor
The Visitor class is designed to process Tree nodes while traversing a tree. It uses a flexible dispatch mechanism that locates the appropriate visitor method for each node type and is typically used as an argument for the Tree.visit method. The Visitor class does not traverse a tree on its own, but will be called to process a node by the Tree.visit method.
A Visitor has the following features:
- Dynamically dispatches to visitor methods based on the node's class name and the visitor's class name.
- Supports strict enforcement of visitor methods, i.e. raising a
NotImplementedexception if no suitable method is found, or it can be configured to allow a generic fallback. - Traversal logic is separated from the operation performed on each node, making it easy to implement new behaviors with minimal boilerplate.
Type Hinting for IDE Support
All core classes and their fields use type annotations. For example, child groups are always annotated as list[Tree] or Optional[list[Tree]]. This clarity enables IDEs to:
- Offer autocompletion for node attributes and methods.
- Provide real-time type checking to catch mistakes early.
- Generate informative tooltips as you code.
To make tooltips even more explicit for attributes that are optional (and any field annotated as list[Tree] will be optional and initialized to an empty list automatically),
the function optional_treelist() may be used as a field initializer.
For example, if you would simply declare a class like this, the hints by your IDE would be incomplete:
@dataclass
class Example(Tree):
left: list[Tree]
right: list[Tree]
They would show up like :arrow_right: left: list[Tree], right: list[Tree] -> Example
The following would make the type checker unhappy (it would flag the None initializers)
@dataclass
class Example(Tree):
left: list[Tree] = None
right: list[Tree] = None
This would be ok
@dataclass
class Example(Tree):
left: list[Tree]|None = None
right: list[Tree]|None = None
The hint would show up like :arrow_right: left: list[Tree]|None = None, right: list[Tree]|None = None -> Example
But the following alternative is more readable (IMHO, your taste might be different), and it will make
sure that when hovering over one of the attributes in an instance, it will show the type as list[Tree] instead of
list[Tree]|None, which is more correct because it is always initialized to an empty list and assigning None to
it is not a good idea.
@dataclass
class Example(Tree):
left: list[Tree] = optional_treelist()
right: list[Tree] = optional_treelist()
The hint for the constructor would be :arrow_right: left: list[Tree] = optional_treelist(), right: list[Tree] = optional_treelist() -> Example
and hovering over myexamplenode.left would show :arrow_right: left: list[Tree]
Example Usage
A simple example showing class with different child groups.
from dataclasses import dataclass
from dataclass_tree import Tree, LabeledTree, Visitor, Order, optional_treelist
@dataclass
class Unop(LabeledTree):
expression: : list[Tree] = optional_treelist()
@dataclass
class Binop(LabeledTree):
left: list[Tree] = optional_treelist()
right: list[Tree] = optional_treelist()
# a leaf class, i.e. without any fields annotated as list[Tree]
@dataclass
class Literal(Tree):
value: str
# Build a simple tree
root = Unop("-")
root.expression = Binop(label="+")
root.left.append(Literal("42"))
root.right.append(Literal("7"))
# Define a visitor with a single, generic method that will apply to any node
class Printer(Visitor):
def _do_printer(self, node: Tree, level: int):
print(" " * (2 * level) + str(node))
# Traverse the tree
root.visit(Printer(), order=Order.PREORDER)
License
GPL V3
Supported traversal methods
Preorder, postorder, inorder, and levelorder.
See https://en.wikipedia.org/wiki/Tree_traversal
Limitations and possible future enhancements
The Tree.visit() function is implemented with simple recursion. This means that for large trees Python's recusion depth may be a limitation and you might get
RecursionError: maximum recursion depth exceeded. You can overcome this by setting a different limit
but in the future I might replace this by a queue based implementation.
Currently the Tree.visit() method has its visitor attribute annotated as Visitor; It might be worth while to replace this by a protocol instead, to make applying a visitor even more versatile.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file dataclass_tree-0.5.0.tar.gz.
File metadata
- Download URL: dataclass_tree-0.5.0.tar.gz
- Upload date:
- Size: 24.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6503567990242d136dfa656c8a9a907585a52b476ddc61b1635497f5179ef446
|
|
| MD5 |
7498173cd74a98ecd27840dc50eb7de5
|
|
| BLAKE2b-256 |
5237481351e06eed63b6ca767d447bcb82853d650274cb6602f7036774b96b9c
|
Provenance
The following attestation bundles were made for dataclass_tree-0.5.0.tar.gz:
Publisher:
python-publish.yml on varkenvarken/dataclass-tree
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
dataclass_tree-0.5.0.tar.gz -
Subject digest:
6503567990242d136dfa656c8a9a907585a52b476ddc61b1635497f5179ef446 - Sigstore transparency entry: 427714924
- Sigstore integration time:
-
Permalink:
varkenvarken/dataclass-tree@f4901072c3c5f516b29280f36362046dc2abeda7 -
Branch / Tag:
refs/tags/v0.5.0-beta - Owner: https://github.com/varkenvarken
-
Access:
private
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
python-publish.yml@f4901072c3c5f516b29280f36362046dc2abeda7 -
Trigger Event:
release
-
Statement type: