Python bindings to the Tree-sitter parsing library
py-tree-sitter
This module provides Python bindings to the Tree-sitter parsing library.
Installation
This package currently only works with Python 3. There are no library dependencies, but you do need to have a C compiler installed.
pip3 install tree_sitter
Usage
Setup
First you'll need a Tree-sitter language implementation for each language that you want to parse. You can clone some of the existing language repos or create your own. The commands below clone into a vendor directory, matching the paths passed to the build step:
git clone https://github.com/tree-sitter/tree-sitter-go vendor/tree-sitter-go
git clone https://github.com/tree-sitter/tree-sitter-javascript vendor/tree-sitter-javascript
git clone https://github.com/tree-sitter/tree-sitter-python vendor/tree-sitter-python
Use the Language.build_library method to compile these into a library that's usable from Python. This function will return immediately if the library has already been compiled since the last time its source code was modified:
from tree_sitter import Language, Parser

Language.build_library(
    # Store the library in the `build` directory
    'build/my-languages.so',

    # Include one or more languages
    [
        'vendor/tree-sitter-go',
        'vendor/tree-sitter-javascript',
        'vendor/tree-sitter-python'
    ]
)
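The "return immediately if already compiled" behaviour is a timestamp comparison in spirit. The sketch below illustrates that idea in plain Python; it is not the library's implementation. The function name needs_rebuild is hypothetical, and the assumption that every file under each language directory counts as a source file is mine, since the text above does not say which files are inspected.

```python
from pathlib import Path


def needs_rebuild(output_path, source_dirs):
    """Return True if output_path is missing or older than any source file.

    Hypothetical helper: it treats every file under each directory in
    source_dirs as a source file, which is an assumption of this sketch.
    """
    output = Path(output_path)
    if not output.exists():
        return True
    built_at = output.stat().st_mtime
    for directory in source_dirs:
        for path in Path(directory).rglob("*"):
            # A source file modified after the last build makes the build stale.
            if path.is_file() and path.stat().st_mtime > built_at:
                return True
    return False
```

A caller could use such a check to decide whether invoking the compiler is worthwhile, for example needs_rebuild('build/my-languages.so', ['vendor/tree-sitter-python']).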
Load the languages into your app as Language objects:
GO_LANGUAGE = Language('build/my-languages.so', 'go')
JS_LANGUAGE = Language('build/my-languages.so', 'javascript')
PY_LANGUAGE = Language('build/my-languages.so', 'python')
Basic Parsing
Create a Parser and configure it to use one of the languages:
parser = Parser()
parser.set_language(PY_LANGUAGE)
Parse some source code:
tree = parser.parse(bytes("""
def foo():
    if bar:
        baz()
""", "utf8"))
Inspect the resulting Tree:
root_node = tree.root_node
assert root_node.type == 'module'
assert root_node.start_point == (1, 0)
assert root_node.end_point == (3, 13)
function_node = root_node.children[0]
assert function_node.type == 'function_definition'
assert function_node.child_by_field_name('name').type == 'identifier'
function_name_node = function_node.children[1]
assert function_name_node.type == 'identifier'
assert function_name_node.start_point == (1, 4)
assert function_name_node.end_point == (1, 7)
# The parentheses let the adjacent string literals span several lines
assert root_node.sexp() == (
    "(module "
        "(function_definition "
            "name: (identifier) "
            "parameters: (parameters) "
            "body: (block "
                "(if_statement "
                    "condition: (identifier) "
                    "consequence: (block "
                        "(expression_statement (call "
                            "function: (identifier) "
                            "arguments: (argument_list))))))))"
)
Walking Syntax Trees
If you need to traverse a large number of nodes efficiently, you can use
a TreeCursor:
cursor = tree.walk()
assert cursor.node.type == 'module'
assert cursor.goto_first_child()
assert cursor.node.type == 'function_definition'
assert cursor.goto_first_child()
assert cursor.node.type == 'def'
# Returns `False` because the `def` node has no children
assert not cursor.goto_first_child()
assert cursor.goto_next_sibling()
assert cursor.node.type == 'identifier'
assert cursor.goto_next_sibling()
assert cursor.node.type == 'parameters'
assert cursor.goto_parent()
assert cursor.node.type == 'function_definition'
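The three cursor movements shown above are enough to visit a whole tree without recursion. Below is a minimal sketch of a pre-order traversal built only from goto_first_child, goto_next_sibling, goto_parent, and the node attribute. The generator name iter_nodes is hypothetical, and the depth counter is my own addition to make sure the walk never leaves the subtree it started in.

```python
def iter_nodes(cursor):
    """Yield every node under the cursor's starting node, in pre-order.

    `cursor` is any object with the TreeCursor methods used in this
    section; `iter_nodes` itself is a hypothetical helper name.
    """
    depth = 0
    while True:
        yield cursor.node
        # Descend first, so parents are yielded before their children.
        if cursor.goto_first_child():
            depth += 1
            continue
        # Leaf reached: climb until a next sibling exists, stopping once
        # we are back at the starting node.
        while True:
            if depth == 0:
                return
            if cursor.goto_next_sibling():
                break
            cursor.goto_parent()
            depth -= 1
```

With the tree parsed earlier, [node.type for node in iter_nodes(tree.walk())] would list the node types starting with module and function_definition. Because the cursor is reused for the whole walk, this avoids building a children list for every node.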
Editing
When a source file is edited, you can edit the syntax tree to keep it in sync with the source:
tree.edit(
    start_byte=5,
    old_end_byte=5,
    new_end_byte=5 + 2,
    start_point=(0, 5),
    old_end_point=(0, 5),
    new_end_point=(0, 5 + 2),
)
Then, when you're ready to incorporate the changes into a new syntax tree,
you can call Parser.parse again, but pass in the old tree:
new_tree = parser.parse(new_source, tree)
This will run much faster than if you were parsing from scratch.
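The six arguments to tree.edit describe one changed region as byte offsets plus (row, column) points. If you only have the old and new source, one way to derive them is to strip the common prefix and suffix. The sketch below does that in plain Python; the helper names byte_to_point and compute_edit are hypothetical, columns are counted in bytes, and a prefix/suffix diff is an assumption of this example rather than something the library requires.

```python
def byte_to_point(source: bytes, offset: int):
    """Convert a byte offset into a (row, column) point, column in bytes."""
    row = source.count(b"\n", 0, offset)
    line_start = source.rfind(b"\n", 0, offset) + 1  # 0 when on the first line
    return (row, offset - line_start)


def compute_edit(old: bytes, new: bytes) -> dict:
    """Describe the single region that differs between old and new.

    Hypothetical helper: returns keyword arguments in the shape that
    tree.edit takes in the example above.
    """
    # Length of the common prefix.
    start = 0
    limit = min(len(old), len(new))
    while start < limit and old[start] == new[start]:
        start += 1
    # Shrink both ends over the common suffix, never crossing the prefix.
    old_end, new_end = len(old), len(new)
    while old_end > start and new_end > start and old[old_end - 1] == new[new_end - 1]:
        old_end -= 1
        new_end -= 1
    return {
        "start_byte": start,
        "old_end_byte": old_end,
        "new_end_byte": new_end,
        "start_point": byte_to_point(old, start),
        "old_end_point": byte_to_point(old, old_end),
        "new_end_point": byte_to_point(new, new_end),
    }
```

The result can be passed along as tree.edit(**compute_edit(old_source, new_source)) before reparsing with the old tree. When an editor already reports the exact change range, using that directly is simpler than diffing.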
Pattern-matching
You can search for patterns in a syntax tree using a tree query:
query = PY_LANGUAGE.query("""
(function_definition
  name: (identifier) @function.def)

(call
  function: (identifier) @function.call)
""")
captures = query.captures(tree.root_node)
assert len(captures) == 2
assert captures[0][0] == function_name_node
assert captures[0][1] == "function.def"
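As the assertions above indicate, each capture is a (node, name) pair. When a query has several capture names, it can be convenient to group the nodes by name. The following is a small sketch of that; group_captures is a hypothetical helper, and it assumes only the pair shape shown above.

```python
from collections import defaultdict


def group_captures(captures):
    """Group (node, capture_name) pairs into {capture_name: [nodes]}.

    Hypothetical helper; nodes keep the order in which they were captured.
    """
    grouped = defaultdict(list)
    for node, name in captures:
        grouped[name].append(node)
    return dict(grouped)
```

For the query above, group_captures(captures)["function.def"] would hold the single identifier node for foo, and "function.call" the identifier node for baz.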