Python bindings for the Tree-Sitter parsing library

Python Tree-sitter

This module provides Python bindings to the tree-sitter parsing library.

Installation

The package has no library dependencies and provides pre-compiled wheels for all major platforms.

[!NOTE] If your platform is not currently supported, please submit an issue on GitHub.

pip install tree-sitter

Usage

Setup

Install languages

Tree-sitter language implementations also provide pre-compiled binary wheels. Let's take Python as an example.

pip install tree-sitter-python

Then, you can load it as a Language object:

import tree_sitter_python as tspython
from tree_sitter import Language, Parser

PY_LANGUAGE = Language(tspython.language(), "python")

Build from source

[!WARNING] This method of loading languages is deprecated and will be removed in v0.22.0. You should only use it if you need languages that have not updated their bindings. Keep in mind that you will need a C compiler in this case.

First you'll need a Tree-sitter language implementation for each language that you want to parse.

git clone https://github.com/tree-sitter/tree-sitter-go vendor/tree-sitter-go
git clone https://github.com/tree-sitter/tree-sitter-javascript vendor/tree-sitter-javascript
git clone https://github.com/tree-sitter/tree-sitter-python vendor/tree-sitter-python

Use the Language.build_library method to compile these into a library that's usable from Python. This function will return immediately if the library has already been compiled since the last time its source code was modified:

from tree_sitter import Language, Parser

Language.build_library(
    # Store the library in the `build` directory
    "build/my-languages.so",
    # Include one or more languages
    ["vendor/tree-sitter-go", "vendor/tree-sitter-javascript", "vendor/tree-sitter-python"],
)
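
The "return immediately" behaviour described above amounts to a staleness check. The sketch below illustrates the idea with a modification-time comparison. Note that `needs_rebuild` is a hypothetical helper written for this example, not part of the tree_sitter API, and the library's own check may differ in detail.

```python
import os


def needs_rebuild(output_path, source_paths):
    """Return True if output_path is missing or older than any source file."""
    if not os.path.exists(output_path):
        return True
    output_mtime = os.path.getmtime(output_path)
    # Rebuild when at least one source was modified after the library was written.
    return any(os.path.getmtime(path) > output_mtime for path in source_paths)
```

A caller could run a check like this before invoking a compiler and skip the build when it returns False.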

Load the languages into your app as Language objects:

GO_LANGUAGE = Language("build/my-languages.so", "go")
JS_LANGUAGE = Language("build/my-languages.so", "javascript")
PY_LANGUAGE = Language("build/my-languages.so", "python")

Basic parsing

Create a Parser and configure it to use a language:

parser = Parser()
parser.set_language(PY_LANGUAGE)

Parse some source code:

tree = parser.parse(
    bytes(
        """
def foo():
    if bar:
        baz()
""",
        "utf8",
    )
)

If you have your source code in some data structure other than a bytes object, you can pass a "read" callable to the parse function.

The read callable receives a byte offset and a (row, column) point tuple. It can use either one to locate the requested position in the buffer, and it must return the source code starting at that position as a bytes object. Returning an empty bytes object or None terminates parsing for that line. The bytes must encode the source as UTF-8.

For example, to use the byte offset:

src = bytes(
    """
def foo():
    if bar:
        baz()
""",
    "utf8",
)


def read_callable_byte_offset(byte_offset, point):
    return src[byte_offset : byte_offset + 1]


tree = parser.parse(read_callable_byte_offset)

And to use the point:

src_lines = ["\n", "def foo():\n", "    if bar:\n", "        baz()\n"]


def read_callable_point(byte_offset, point):
    row, column = point
    if row >= len(src_lines) or column >= len(src_lines[row]):
        return None
    return src_lines[row][column:].encode("utf8")


tree = parser.parse(read_callable_point)
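
A read callable can be checked without a parser by calling it in a loop and confirming that it reproduces the original bytes. The following is a minimal sketch: `make_chunked_reader` and `drain` are hypothetical helpers, and `drain` only approximates how a parser consumes the callable by assuming it advances by the length of each returned chunk.

```python
def make_chunked_reader(data, chunk_size=4):
    """Build a read callable that serves data in chunks of at most chunk_size bytes."""

    def read(byte_offset, point):
        # Slicing past the end yields b"", which signals the end of input.
        return data[byte_offset : byte_offset + chunk_size]

    return read


def drain(read, max_calls=100_000):
    """Call read repeatedly, advancing by each chunk's length, and join the chunks."""
    out = bytearray()
    row = column = 0
    for _ in range(max_calls):
        chunk = read(len(out), (row, column))
        if not chunk:  # b"" or None ends the input
            break
        out += chunk
        # Keep the (row, column) point in step with the byte offset.
        newlines = chunk.count(b"\n")
        if newlines:
            row += newlines
            column = len(chunk) - chunk.rfind(b"\n") - 1
        else:
            column += len(chunk)
    return bytes(out)


source = b"\ndef foo():\n    if bar:\n        baz()\n"
assert drain(make_chunked_reader(source)) == source
```

Returning larger chunks than the single byte used in the byte-offset example above reduces the number of calls the parser has to make into Python.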

Inspect the resulting Tree:

root_node = tree.root_node
assert root_node.type == 'module'
assert root_node.start_point == (1, 0)
assert root_node.end_point == (4, 0)

function_node = root_node.children[0]
assert function_node.type == 'function_definition'
assert function_node.child_by_field_name('name').type == 'identifier'

function_name_node = function_node.children[1]
assert function_name_node.type == 'identifier'
assert function_name_node.start_point == (1, 4)
assert function_name_node.end_point == (1, 7)

function_body_node = function_node.child_by_field_name("body")

if_statement_node = function_body_node.child(0)
assert if_statement_node.type == "if_statement"

function_call_node = if_statement_node.child_by_field_name("consequence").child(0).child(0)
assert function_call_node.type == "call"

function_call_name_node = function_call_node.child_by_field_name("function")
assert function_call_name_node.type == "identifier"

function_call_args_node = function_call_node.child_by_field_name("arguments")
assert function_call_args_node.type == "argument_list"


assert root_node.sexp() == (
    "(module "
        "(function_definition "
            "name: (identifier) "
            "parameters: (parameters) "
            "body: (block "
                "(if_statement "
                    "condition: (identifier) "
                    "consequence: (block "
                        "(expression_statement (call "
                            "function: (identifier) "
                            "arguments: (argument_list))))))))"
)
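
When inspecting a tree it is often handy to see the text a node covers and the shape of its subtree. The sketch below uses only the `type`, `children`, `start_byte` and `end_byte` attributes of a node; `node_text` and `outline` are hypothetical helper names introduced for this example.

```python
def node_text(source, node):
    """Return the source text covered by a node, decoded as UTF-8."""
    return source[node.start_byte : node.end_byte].decode("utf8")


def outline(node, depth=0):
    """Return one indented line per node, showing its type, in pre-order."""
    lines = ["  " * depth + node.type]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines
```

For instance, `node_text(src, function_name_node)` should give the function's name, and `print("\n".join(outline(root_node)))` prints an indented view of the whole tree.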

Walking syntax trees

If you need to traverse a large number of nodes efficiently, you can use a TreeCursor:

cursor = tree.walk()

assert cursor.node.type == "module"

assert cursor.goto_first_child()
assert cursor.node.type == "function_definition"

assert cursor.goto_first_child()
assert cursor.node.type == "def"

# Returns `False` because the `def` node has no children
assert not cursor.goto_first_child()

assert cursor.goto_next_sibling()
assert cursor.node.type == "identifier"

assert cursor.goto_next_sibling()
assert cursor.node.type == "parameters"

assert cursor.goto_parent()
assert cursor.node.type == "function_definition"

[!IMPORTANT] Keep in mind that the cursor can only walk into children of the node that it started from.

See examples/walk_tree.py for a complete example of iterating over every node in a tree.
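
As a rough illustration of such a traversal, the generator below visits every node in pre-order using only the three cursor methods shown above. It is a sketch rather than a copy of the bundled example, and `iter_nodes` is a hypothetical name.

```python
def iter_nodes(cursor):
    """Yield (depth, node) for every node under the cursor's start node, in pre-order."""
    depth = 0
    while True:
        yield depth, cursor.node
        if cursor.goto_first_child():
            depth += 1
            continue
        # No children: step to the next sibling, climbing towards the start node as needed.
        while True:
            if depth == 0:
                return  # back at the start node, so the traversal is complete
            if cursor.goto_next_sibling():
                break
            cursor.goto_parent()
            depth -= 1
```

With a parsed tree, `for depth, node in iter_nodes(tree.walk()): print("  " * depth + node.type)` prints one line per node. Tracking the depth keeps the loop from trying to move past the node the cursor started from.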

Editing

When a source file is edited, you can edit the syntax tree to keep it in sync with the source:

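# Replace the two bytes at offsets 5-6 ("fo" in "foo") with their uppercase form.
new_src = src[:5] + src[5 : 5 + 2].upper() + src[5 + 2 :]

# Byte 5 is row 1, column 4 (row 0 is the leading blank line).
# The edit swaps 2 old bytes for 2 new bytes on the same row.
tree.edit(
    start_byte=5,
    old_end_byte=5 + 2,
    new_end_byte=5 + 2,
    start_point=(1, 4),
    old_end_point=(1, 4 + 2),
    new_end_point=(1, 4 + 2),
)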

Then, when you're ready to incorporate the changes into a new syntax tree, you can call Parser.parse again, but pass in the old tree:

new_tree = parser.parse(new_src, tree)

This will run much faster than if you were parsing from scratch.
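
The byte offsets and points passed to Tree.edit have to describe the same positions, which is easy to get wrong by hand. The sketch below derives the points from the byte offsets. `byte_to_point` and `replacement_edit` are hypothetical helpers, and they assume that columns are counted in bytes and rows are separated by "\n".

```python
def byte_to_point(source, byte_offset):
    """Convert a byte offset into a (row, column) point; columns count bytes."""
    row = source.count(b"\n", 0, byte_offset)
    line_start = source.rfind(b"\n", 0, byte_offset) + 1  # 0 when on the first row
    return (row, byte_offset - line_start)


def replacement_edit(source, start_byte, old_end_byte, new_text):
    """Replace source[start_byte:old_end_byte] with new_text.

    Returns the new source and a dict of keyword arguments for Tree.edit.
    """
    new_source = source[:start_byte] + new_text + source[old_end_byte:]
    new_end_byte = start_byte + len(new_text)
    edit = {
        "start_byte": start_byte,
        "old_end_byte": old_end_byte,
        "new_end_byte": new_end_byte,
        "start_point": byte_to_point(source, start_byte),
        "old_end_point": byte_to_point(source, old_end_byte),
        "new_end_point": byte_to_point(new_source, new_end_byte),
    }
    return new_source, edit
```

The returned dictionary can be unpacked into the call, for example `new_src, edit = replacement_edit(src, 5, 7, b"FO")` followed by `tree.edit(**edit)`.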

The Tree.changed_ranges method can be called on the old tree to return the list of ranges whose syntactic structure has been changed:

for changed_range in tree.changed_ranges(new_tree):
    print("Changed range:")
    print(f"  Start point {changed_range.start_point}")
    print(f"  Start byte {changed_range.start_byte}")
    print(f"  End point {changed_range.end_point}")
    print(f"  End byte {changed_range.end_byte}")
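
A common use of the changed ranges is deciding which lines to re-highlight or re-analyse. The following sketch collects the affected row numbers, assuming each range exposes `start_point` and `end_point` as (row, column) tuples as printed above; `changed_rows` is a hypothetical helper name.

```python
def changed_rows(ranges):
    """Return the sorted row numbers touched by any of the given ranges."""
    rows = set()
    for changed_range in ranges:
        first_row = changed_range.start_point[0]
        last_row = changed_range.end_point[0]
        rows.update(range(first_row, last_row + 1))
    return sorted(rows)
```

It could be called as `changed_rows(tree.changed_ranges(new_tree))`.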

Pattern-matching

You can search for patterns in a syntax tree using a tree query:

query = PY_LANGUAGE.query(
    """
(function_definition
  name: (identifier) @function.def
  body: (block) @function.block)

(call
  function: (identifier) @function.call
  arguments: (argument_list) @function.args)
"""
)

Captures

captures = query.captures(tree.root_node)
# The query defines four captures, and each one matches once in this tree
assert len(captures) == 4
assert captures[0][0] == function_name_node
assert captures[0][1] == "function.def"

The Query.captures() method takes optional start_point, end_point, start_byte and end_byte keyword arguments, which can be used to restrict the query's range. Only one of the ..._byte or ..._point pairs needs to be given to restrict the range. If all are omitted, the entire range of the passed node is used.
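
Because the result is a flat list of (node, capture name) tuples, as the assertions above index it, it can help to group the nodes by capture name. This is a minimal sketch that assumes that list-of-tuples shape; `group_captures` is a hypothetical helper.

```python
from collections import defaultdict


def group_captures(captures):
    """Group a list of (node, capture_name) tuples into {capture_name: [nodes]}."""
    grouped = defaultdict(list)
    for node, name in captures:
        grouped[name].append(node)
    return dict(grouped)
```

With the query above, `group_captures(captures)["function.call"]` would hold the identifier nodes of every call that was captured.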

Matches

matches = query.matches(tree.root_node)
assert len(matches) == 2

# first match
assert matches[0][1]["function.def"] == function_name_node
assert matches[0][1]["function.block"] == function_body_node

# second match
assert matches[1][1]["function.call"] == function_call_name_node
assert matches[1][1]["function.args"] == function_call_args_node

The Query.matches() method takes the same optional arguments as Query.captures(). The difference is that Query.matches() groups captures into matches, which is more useful when the captures within a query relate to each other. Each match is a tuple of the pattern index and a dictionary that maps each capture name to the node it captured.
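
To see what each match actually covers, the captured nodes can be turned back into source text. The sketch below assumes the (pattern index, dictionary) shape used in the assertions above, with one node per capture name; `match_texts` is a hypothetical helper.

```python
def match_texts(source, matches):
    """Turn each (pattern_index, captures) match into (pattern_index, {name: text})."""
    result = []
    for pattern_index, captured in matches:
        texts = {
            name: source[node.start_byte : node.end_byte].decode("utf8")
            for name, node in captured.items()
        }
        result.append((pattern_index, texts))
    return result
```

Calling `match_texts(src, matches)` keeps related captures together, so a function's name and its body text stay in the same dictionary.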

To try out and explore the code referenced in this README, check out examples/usage.py.
