sequoia-diff

An awesome tool to work with abstract syntax trees, providing:

  • Algorithms to generate mappings between two trees, showing which nodes in one tree correspond to nodes in the other
  • Algorithms to create "edit scripts", or the sequence of actions to transform one tree into the other.
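To make the idea of an edit script concrete, here is a minimal sketch that applies a short list of actions to one tree and checks that the result equals the other. The nested-dict tree layout, the action tuples, and the helper names (`node_at`, `apply_script`) are assumptions made for illustration; they are not sequoia-diff's data model or API.

```python
import copy

# Trees as nested dicts; this layout is an assumption for illustration only.
old_tree = {"type": "root", "label": None, "children": [
    {"type": "a", "label": "a", "children": []},
    {"type": "b", "label": "b", "children": []},
]}
new_tree = {"type": "root", "label": None, "children": [
    {"type": "a", "label": "ayyy", "children": []},
    {"type": "c", "label": "c", "children": []},
]}

# A hypothetical edit script. Paths are lists of child indexes from the root.
edit_script = [
    ("update", [0], "ayyy"),  # relabel node "a"
    ("delete", [1]),          # drop node "b"
    ("insert", [], 1, {"type": "c", "label": "c", "children": []}),  # add "c" under the root
]


def node_at(tree, path):
    """Follow child indexes from the root down to a node."""
    for index in path:
        tree = tree["children"][index]
    return tree


def apply_script(tree, script):
    """Apply the actions in order and return the transformed copy."""
    result = copy.deepcopy(tree)  # leave the input tree untouched
    for action in script:
        kind = action[0]
        if kind == "update":
            _, path, label = action
            node_at(result, path)["label"] = label
        elif kind == "delete":
            _, path = action
            node_at(result, path[:-1])["children"].pop(path[-1])
        elif kind == "insert":
            _, parent_path, index, node = action
            node_at(result, parent_path)["children"].insert(index, copy.deepcopy(node))
        else:
            raise ValueError(f"unknown action: {kind}")
    return result


print(apply_script(old_tree, edit_script) == new_tree)  # True
```

The order of actions matters: the insert at index 1 only lands in the right place because the delete has already run.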

The name refers to the giant sequoias and implies strength, resilience, and the ability to handle large, complex structures, much like a diff tool managing complex code structures.

  • sequoia (/sɪˈkwɔɪ.ə/) - Either of two huge coniferous California trees of the bald cypress family that may reach a height of over 300 feet ^1.
  • diff (/ dɪf /) - An operation that computes and displays the data difference or differences between two files ^2.

Getting Started

Warning: This project is under active development. As a result, things might severely break between versions. Use it at your own risk!

It's recommended that this library be installed in a virtual environment.

python -m venv .venv
source .venv/bin/activate
pip install sequoia-diff

Nodes

The core data structure of sequoia-diff is the Node. Each Node has a "type" (the kind of structural element, such as "if_statement") and a "label" (the text attached to the node, such as an identifier name).

You can construct Nodes either manually or using loaders like so:

# Building nodes using loaders. For example, the tree_sitter loader

from sequoia_diff.models import Node
from sequoia_diff.loaders import from_tree_sitter_tree
import tree_sitter_java
import tree_sitter as ts

parser = ts.Parser(ts.Language(tree_sitter_java.language()))
ts_tree = parser.parse(b"public class Test { }")
loader_root = from_tree_sitter_tree(ts_tree, "java")

print(loader_root.pretty_str())
"""
Node(type="program", subtree_hash=0xd19e33244ca...)
  Node(type="class_declaration", subtree_hash=0x38bb1992e23...)
    Node(type="modifiers", subtree_hash=0xa89cb5ba69f...)
      Node(type="public", label="public", subtree_hash=0x7c1f47c6fb9...)
    Node(type="class", label="class", subtree_hash=0x53ba79b0932...)
    Node(type="identifier", label="Test", subtree_hash=0x1759f434dde...)
    Node(type="class_body", subtree_hash=0x2e39dfad18f...)
"""
# Building nodes manually

from sequoia_diff.models import Node

manual_root = Node(type="root", label=None, children=[
  Node(type="mid_level", label="a"),
  Node(type="mid_level", label="b"),
  Node(type="another_mid_level", label="c"),
])

print(manual_root.pretty_str())
"""
Node(type="root", subtree_hash=0x691693519f9...)
  Node(type="mid_level", label="a", subtree_hash=0xf4c1d8f2e8a...)
  Node(type="mid_level", label="b", subtree_hash=0x1b09b1156a8...)
  Node(type="another_mid_level", label="c", subtree_hash=0x1d6921ca9ee...)
"""

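The subtree_hash values in the output suggest that each Node carries a hash summarizing its whole subtree. How sequoia-diff computes that value is not described here. As a rough intuition, the sketch below shows one common approach, a Merkle-style hash that combines a node's type and label with the hashes of its children. `SketchNode` and this `subtree_hash` function are hypothetical stand-ins, not the library's implementation.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchNode:
    """Hypothetical stand-in for a tree node; not sequoia-diff's Node class."""
    type: str
    label: Optional[str] = None
    children: List["SketchNode"] = field(default_factory=list)


def subtree_hash(node: SketchNode) -> str:
    """Hash a node's type, label, and the hashes of its children, in order."""
    h = hashlib.sha256()
    h.update(repr((node.type, node.label)).encode("utf-8"))
    for child in node.children:
        h.update(subtree_hash(child).encode("ascii"))
    return h.hexdigest()


tree_1 = SketchNode("root", None, [SketchNode("mid_level", "a"), SketchNode("mid_level", "b")])
tree_2 = SketchNode("root", None, [SketchNode("mid_level", "a"), SketchNode("mid_level", "b")])
tree_3 = SketchNode("root", None, [SketchNode("mid_level", "a"), SketchNode("mid_level", "B")])

# Identical subtrees hash the same; changing one leaf changes the root hash.
print(subtree_hash(tree_1) == subtree_hash(tree_2))  # True
print(subtree_hash(tree_1) == subtree_hash(tree_3))  # False
```

With a hash like this, two subtrees can be compared for equality in a single step, which is useful when looking for unchanged regions between two versions of a tree.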
You can also modify the Nodes like so:

# Using convenience methods, which correctly set the parent-child relationship
child1 = Node(type="child", label="Child1")
child2 = Node(type="child", label="Child2")

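# Append child2 under the third child ("another_mid_level")
manual_root.children[2].children_append(child2)
# Insert child1 at index 1, then remove it again
manual_root.children_insert(1, child1)
manual_root.children_remove(manual_root.children[1])
# Attach the whole tree under a new parent node
manual_root.set_parent(Node(type="new_root", label=None))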

# Deep copy the node
copy_of_root = manual_root.parent.deep_copy()

print(copy_of_root.pretty_str())
"""
Node(type="new_root", subtree_hash=0x42e6081b4af...)
  Node(type="root", subtree_hash=0xd33e85a2b21...)
    Node(type="mid_level", label="a", subtree_hash=0xf4c1d8f2e8a...)
    Node(type="mid_level", label="b", subtree_hash=0x1b09b1156a8...)
    Node(type="another_mid_level", label="c", subtree_hash=0xdf4d5ccb84c...)
      Node(type="child", label="Child2", subtree_hash=0x5e4d9f60de3...)
"""

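To show why helper methods are useful for keeping parent and child links in sync, here is a minimal self-contained sketch. `LinkedNode` is a hypothetical class that borrows the method names children_append and children_remove from the example above; its behavior is an assumption and may differ from sequoia-diff's Node.

```python
from typing import List, Optional


class LinkedNode:
    """Hypothetical node with parent links; not sequoia-diff's Node class."""

    def __init__(self, type: str, label: Optional[str] = None):
        self.type = type
        self.label = label
        self.parent: Optional["LinkedNode"] = None
        self.children: List["LinkedNode"] = []

    def children_append(self, child: "LinkedNode") -> None:
        # Detach from any previous parent so the child lives in one place only.
        if child.parent is not None:
            child.parent.children_remove(child)
        self.children.append(child)
        child.parent = self

    def children_remove(self, child: "LinkedNode") -> None:
        self.children.remove(child)
        child.parent = None


root = LinkedNode("root")
other = LinkedNode("other_root")
leaf = LinkedNode("child", "Child1")

root.children_append(leaf)
other.children_append(leaf)  # moves the leaf; root no longer lists it

print(leaf.parent is other, len(root.children), len(other.children))  # True 0 1
```

Appending to the children list directly would skip this bookkeeping and leave the child's parent pointer stale.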
Mappings and Edit Script (Tree Diff)

You can generate mappings between two trees to see which Nodes in one tree correspond to which Nodes in the other. Several matching algorithms are supported; the default is match_greedy_top_down followed by match_greedy_bottom_up.

from sequoia_diff import get_tree_diff
from sequoia_diff.matching import generate_mappings
from sequoia_diff.models import Node
from tests.util import dictize_action, dictize_mapping
import yaml

# Building nodes manually
old_root = Node(type="root", label=None, children=[
    Node(type="a", label="a"),
    Node(type="b", label="b", children=[
        Node(type="b-1", label="b-1"),
        Node(type="b-2", label="b-2"),
    ]),
    Node(type="c", label="c", children=[
        Node(type="c-1", label="c-1"),
        Node(type="c-2", label="c-2"),
    ]),
])

new_root = Node(type="root", label=None, children=[
    Node(type="a", label="ayyy", children=[
        Node(type="a-1", label="a-1"),
        Node(type="a-2", label="a-2"),
    ]),
    Node(type="c", label="c", children=[
        Node(type="c-1", label="c-1"),
        Node(type="c-2", label="c-2"),
    ]),
    Node(type="b", label="b", children=[
        Node(type="b-1", label="b-1"),
        Node(type="b-2", label="b-2"),
    ]),
])

mappings = generate_mappings(old_root, new_root)

print(yaml.dump([dictize_mapping(m) for m in mappings]))

actions = get_tree_diff(old_root, new_root)

print(yaml.dump([dictize_action(a) for a in actions]))
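The README names match_greedy_top_down and match_greedy_bottom_up but does not describe how they work. As a rough intuition for what a top-down phase might do, the sketch below walks the new tree from the root and greedily pairs each subtree with an identical, still-unmapped subtree of the old tree. `SketchNode`, `shape`, and `match_identical_subtrees` are hypothetical names, and this is a simplified illustration, not the library's algorithm.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass(eq=False)  # identity-based equality, so nodes can live in sets
class SketchNode:
    """Hypothetical stand-in for a tree node; not sequoia-diff's Node class."""
    type: str
    label: Optional[str] = None
    children: List["SketchNode"] = field(default_factory=list)


def shape(node: SketchNode) -> tuple:
    """Structural key: equal for subtrees with the same types, labels, and layout."""
    return (node.type, node.label, tuple(shape(c) for c in node.children))


def preorder(node: SketchNode) -> Iterator[SketchNode]:
    yield node
    for child in node.children:
        yield from preorder(child)


def match_identical_subtrees(old_root: SketchNode, new_root: SketchNode):
    """Walk the new tree from the top, greedily pairing identical subtrees."""
    candidates: Dict[tuple, List[SketchNode]] = {}
    for old in preorder(old_root):
        candidates.setdefault(shape(old), []).append(old)

    mappings = []
    used_old = set()
    queue = deque([new_root])
    while queue:
        new = queue.popleft()
        # Take the first identical old subtree that is still entirely unmapped.
        old = next(
            (o for o in candidates.get(shape(new), [])
             if not any(d in used_old for d in preorder(o))),
            None,
        )
        if old is None:
            queue.extend(new.children)  # no match here, so try the children
            continue
        # Identical shapes mean the two pre-order walks line up node by node.
        for o, n in zip(preorder(old), preorder(new)):
            used_old.add(o)
            mappings.append((o, n))
    return mappings


N = SketchNode
old_root = N("root", None, [
    N("a", "a"),
    N("b", "b", [N("b-1", "b-1"), N("b-2", "b-2")]),
    N("c", "c", [N("c-1", "c-1"), N("c-2", "c-2")]),
])
new_root = N("root", None, [
    N("a", "ayyy", [N("a-1", "a-1"), N("a-2", "a-2")]),
    N("c", "c", [N("c-1", "c-1"), N("c-2", "c-2")]),
    N("b", "b", [N("b-1", "b-1"), N("b-2", "b-2")]),
])

pairs = match_identical_subtrees(old_root, new_root)
print([old.type for old, _ in pairs])
# ['c', 'c-1', 'c-2', 'b', 'b-1', 'b-2']
```

In this sketch the unchanged "b" and "c" subtrees are paired even though they swapped positions, while the root and the relabeled "a" node stay unmapped. A later phase, such as a bottom-up pass, would presumably be responsible for pairing nodes like those.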

Development

The package is built with setuptools using a flat layout.

To install all development dependencies, run:

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

References
