Skip to main content

Library for building and working with arbitrary ASTs on top dataclasses

Project description

pypi pypi python Ruff Test Status

Commerial-grade, well tested, documented and typed Python library for modeling, building, traversing, transforming, transferring and even pattern matching abstract syntax trees (AST) for arbtirary languages.

Features

  • 🌳 Easy to use dataclasses based, strictly typed, pseudo-immutable AST modeling
  • 📝 "Magic", auto-maintained node registry, allowing node cross-referncing and retrievel
  • 📚 Source origin tracking for each node
  • 📺 Zero-setup pretty printing using Rich
  • 💾 json, msgpack, yaml or plain dict (de)serialization
  • 🏃‍♀️ AST traversal: depth-first or breadth-first, top down or bottom up, with filtering and pruning
  • 🎯 Xpath-like AST search (top to the node)
  • 👯‍♂️ Node pattern matching with ability to capture specific sub-trees or attributes
  • ... and more!

Installation

pip install pyoak

Basic Usage

from dataclasses import dataclass

from pyoak.node import ASTNode
from pyoak.origin import CodeOrigin, TextSource, get_code_range
from rich import print


@dataclass
class Name(ASTNode):
    identifier: str


@dataclass
class NumberLiteral(ASTNode):
    value: int


@dataclass
class Stmt(ASTNode):
    pass


@dataclass
class AssignStmt(Stmt):
    name: Name
    value: NumberLiteral


def parse_assignment(code: str) -> AssignStmt:
    # This should be a real parse logic
    s = TextSource(source_uri="important_place", source_type="text/plain", _raw=code)
    stmt_o = CodeOrigin(source=s, position=get_code_range(0, 1, 0, 5, 1, 5))
    x_o = CodeOrigin(source=s, position=get_code_range(0, 1, 0, 1, 1, 1))
    num_o = CodeOrigin(source=s, position=get_code_range(4, 1, 4, 5, 1, 5))
    return AssignStmt(
        name=Name(identifier="x", origin=x_o),
        value=NumberLiteral(value=1, origin=num_o),
        origin=stmt_o,
    )


node = parse_assignment("x = 1\nother_stuff")
print(node)  # Prints a pretty tree
print(
    f"Original source code: {node.origin.get_raw()}"
)  # prints `Original source code: x = 1`
assert node is AssignStmt.get(node.id)  # you can always get the node using only its id
dup_node = node.duplicate()  # returns a deep copy of the node
assert node is not dup_node  # they are different, including id, but...
dup_node.original_id = node.id  # ...they can be linked together
assert dup_node == node  # True
# which is the same as
assert (
    dup_node.content_id == node.content_id
)  # content id uniquely represents the subtree

# Now let's get an iterable of all Name, NumberLiteral nodes
# this will traverse all the way down the tree, not just the direct children
some_children = node.gather((Name, NumberLiteral))

for subtree in some_children:
    print(subtree)
    print(f"Original source code: {subtree.origin.get_raw()}")

and this is just the tip of the iceberg!

Documentation

TBD

Credits

  • The (de)serialization code is based on the awesome mashumaro
  • Pattern matching definitions and engine runs on lark-parser

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyoak-1.0.0.tar.gz (40.4 kB view hashes)

Uploaded Source

Built Distribution

pyoak-1.0.0-py3-none-any.whl (42.4 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page