Skip to main content

Generic tree dataclass with visitor

Project description

PythonTestCoveragePypi

dataclass-tree

Overview

dataclass-tree provides a set of base classes and utilities for building and traversing tree-like data structures using Python's dataclass feature. It is designed to make tree construction, traversal, and manipulation both intuitive and type-safe, with a particular emphasis on consistent type hinting for maximum IDE support.

Main Classes

Tree

The Tree class serves as the abstract base for all tree nodes. It is designed around the concept of child groups that can have arbitrary names and may contain any number of Tree items (or subclasses thereof).

It is a dataclass, so any field annotated with list[Tree] will treated as child groups, while other fields function like normal. Child groups are treated special in the following way:

  • They are automatically initialized to empty lists, even if no default factory is specified, and even if the field is annotated as optional with list[Tree]|None.
  • They can be initialized with the optional_treelist() function to clearly mark them as such when hovering over them in an IDE.
  • Child groups can be traversed with the visit() method, with support with different orders: PREORDER, POSTORDER, INORDER, and LEVELORDER.

The Tree base class has no child groups defined, so you will need to subclass it (or LabeledTree) to use it. (See examples below). Subclasses must use the @dataclass decorator, and any fields defined in the subclass are added to those defined in the superclass. Fields with the same name will replace those in the superclass, and order is significant. (This is standard dataclass behavior).

You can create trees containing a mix of Tree subclasses, each with their own uniquely nae child groups. For example, you could have a tree representing an expression in a programming language, with a Binop class with left and right child groups, and a Unop class having just an expression child group. Even though they have different names, they would be visited in a consistent way by a visitor, regardless their names.

LabeledTree

A subclass of Tree with an additional label field. This is a very common usecase, so this class is provided as a convenience to save you some typing.

Visitor

The Visitor class is designed to process Tree nodes while traversing a tree. It uses a flexible dispatch mechanism that locates the appropriate visitor method for each node type and is typically used as an argument for the Tree.visit method. The Visitor class does not traverse a tree on its own, but will be called to process a node by the Tree.visit method.

A Visitor has the following features:

  • Dynamically dispatches to visitor methods based on the node's class name and the visitor's class name.
  • Supports strict enforcement of visitor methods, i.e. raising a NotImplemented exception if no suitable method is found, or it can be configured to allow a generic fallback.
  • Traversal logic is separated from the operation performed on each node, making it easy to implement new behaviors with minimal boilerplate.

Type Hinting for IDE Support

All core classes and their fields use type annotations. For example, child groups are always annotated as list[Tree] or Optional[list[Tree]]. This clarity enables IDEs to:

  • Offer autocompletion for node attributes and methods.
  • Provide real-time type checking to catch mistakes early.
  • Generate informative tooltips as you code.

To make tooltips even more explicit for attributes that are optional (and any field annotated as list[Tree] will be optional and initialized to an empty list automatically), the function optional_treelist() may be used as a field initializer.

For example, if you would simply declare a class like this, the hints by your IDE would be incomplete:

    @dataclass
    class Example(Tree):
        left: list[Tree]
        right: list[Tree]

They would show up like :arrow_right: left: list[Tree], right: list[Tree] -> Example

The following would make the type checker unhappy (it would flag the None initializers)

    @dataclass
    class Example(Tree):
        left: list[Tree] = None
        right: list[Tree] = None

This would be ok

    @dataclass
    class Example(Tree):
        left: list[Tree]|None = None
        right: list[Tree]|None = None

The hint would show up like :arrow_right: left: list[Tree]|None = None, right: list[Tree]|None = None -> Example

But the following alternative is more readable (IMHO, your taste might be different), and it will make sure that when hovering over one of the attributes in an instance, it will show the type as list[Tree] instead of list[Tree]|None, which is more correct because it is always initialized to an empty list and assigning None to it is not a good idea.

    @dataclass
    class Example(Tree):
        left: list[Tree] = optional_treelist()
        right: list[Tree] = optional_treelist()

The hint for the constructor would be :arrow_right: left: list[Tree] = optional_treelist(), right: list[Tree] = optional_treelist() -> Example

and hovering over myexamplenode.left would show :arrow_right: left: list[Tree]

Example Usage

A simple example showing class with different child groups.

from dataclasses import dataclass
from dataclass_tree import Tree, LabeledTree, Visitor, Order, optional_treelist

@dataclass
class Unop(LabeledTree):
    expression: : list[Tree] = optional_treelist()

@dataclass
class Binop(LabeledTree):
    left: list[Tree] = optional_treelist()
    right: list[Tree] = optional_treelist()

# a leaf class, i.e. without any fields annotated as list[Tree]
@dataclass
class Literal(Tree):
    value: str

# Build a simple tree
root = Unop("-")
root.expression = Binop(label="+")
root.left.append(Literal("42"))
root.right.append(Literal("7"))

# Define a visitor with a single, generic method that will apply to any node
class Printer(Visitor):
    def _do_printer(self, node: Tree, level: int):
        print(" " * (2 * level) + str(node))

# Traverse the tree
root.visit(Printer(), order=Order.PREORDER)

License

GPL V3

Supported traversal methods

Preorder, postorder, inorder, and levelorder.

See https://en.wikipedia.org/wiki/Tree_traversal

Limitations and possible future enhancements

The Tree.visit() function is implemented with simple recursion. This means that for large trees Python's recusion depth may be a limitation and you might get RecursionError: maximum recursion depth exceeded. You can overcome this by setting a different limit but in the future I might replace this by a queue based implementation.

Currently the Tree.visit() method has its visitor attribute annotated as Visitor; It might be worth while to replace this by a protocol instead, to make applying a visitor even more versatile.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataclass_tree-0.5.0.tar.gz (24.8 kB view details)

Uploaded Source

File details

Details for the file dataclass_tree-0.5.0.tar.gz.

File metadata

  • Download URL: dataclass_tree-0.5.0.tar.gz
  • Upload date:
  • Size: 24.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataclass_tree-0.5.0.tar.gz
Algorithm Hash digest
SHA256 6503567990242d136dfa656c8a9a907585a52b476ddc61b1635497f5179ef446
MD5 7498173cd74a98ecd27840dc50eb7de5
BLAKE2b-256 5237481351e06eed63b6ca767d447bcb82853d650274cb6602f7036774b96b9c

See more details on using hashes here.

Provenance

The following attestation bundles were made for dataclass_tree-0.5.0.tar.gz:

Publisher: python-publish.yml on varkenvarken/dataclass-tree

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page