tree-hugger

A light-weight, high level, universal code parser built on top of tree-sitter

Browse the doc

  1. What is it?

  2. Why do I need it?

  3. Design Goals

  4. Installation

  5. Building the .so Files

  6. A Quick Example

  7. Roadmap


What is it?

tree-hugger is a lightweight wrapper around the excellent tree-sitter library and its Python binding.

Why do I need it?

tree-sitter is a great library: it does its job reliably and very fast. But it is also pretty low-level. The Python binding makes you write raw S-expression (sexp) queries and unpack the results yourself. It also does not offer the NodeVisitor kind of traversal that is available in Python's native ast module.
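For readers who have not used it, here is a minimal sketch of the NodeVisitor style mentioned above, using only Python's standard ast module. It illustrates the kind of convenience in question and is not tree-hugger code; the class name FunctionNameCollector and the sample source are made up for this example.

```python
import ast


class FunctionNameCollector(ast.NodeVisitor):
    """Collect the name of every function definition in a module."""

    def __init__(self):
        self.names = []

    def visit_FunctionDef(self, node):
        # Called automatically for each `def` statement in the tree.
        self.names.append(node.name)
        # Keep walking so that nested functions are visited too.
        self.generic_visit(node)


# Illustrative sample source, not taken from the tree-hugger test assets.
source = '''
def parent():
    def first_child():
        pass
    return first_child
'''

collector = FunctionNameCollector()
collector.visit(ast.parse(source))
print(collector.names)  # ['parent', 'first_child']
```

The visitor only names the node type it cares about; the traversal itself is handled by the base class.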

At CodistAI we have been using tree-sitter for some time to build a language-independent layer for our code analysis and code intelligence platform. While building it, we felt this pain ourselves, so we wrote some code that makes it easy to extend our platform to different languages. We believe others may need the same kind of higher-level library to parse code files and gain insight into them.

Design Goals

  • Light-weight
  • Extendable
  • Provides easy higher-level abstractions
  • (Should) offer some kind of normalization across languages

Installation

From pip:

Just do

pip install tree-hugger

From Source:

git clone https://github.com/autosoft-dev/tree-hugger.git

cd tree-hugger

pip install -e .

The installation process has been tested on macOS Mojave. We have a separate Docker binding for compiling the libraries for Linux, and this library will soon be integrated into it as well.

You may need to install libgit2. On a Mac, just use brew install libgit2

Building the .so files

Please note that building the libraries has been tested on macOS Mojave with Apple LLVM version 10.0.1 (clang-1001.0.46.4).

Please check out our Linux-specific instructions here

Once this library is installed, it gives you a command-line utility to download and compile tree-sitter .so files with ease. For example:

create_libs python

Here is the full usage guide of the command

usage: create_libs [-h] [-c] [-l LIB_NAME] langs [langs ...]

positional arguments:
  langs                 Give the name of languages for tree-sitter (php,
                        python, go ...)

optional arguments:
  -h, --help            show this help message and exit
  -c, --copy-to-workspace
                        Shall we copy the created libs to the present dir?
                        (default: False)
  -l LIB_NAME, --lib-name LIB_NAME
                        The name of the generated .so file
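If you want to drive this command from a build script, the following is a minimal sketch that assembles the argument list from the flags shown in the usage text above. The helper names build_create_libs_command and run_create_libs are hypothetical and not part of tree-hugger; only the create_libs command and its -c and -l flags come from the usage guide.

```python
import shutil
import subprocess


def build_create_libs_command(langs, copy_to_workspace=False, lib_name=None):
    """Return the argv list for create_libs, using only documented flags."""
    if not langs:
        raise ValueError("at least one language is required")
    command = ["create_libs"]
    if copy_to_workspace:
        command.append("-c")  # copy the created libs to the present dir
    if lib_name is not None:
        command.extend(["-l", lib_name])  # name of the generated .so file
    command.extend(langs)  # positional language names go last
    return command


def run_create_libs(langs, **options):
    """Run create_libs, failing early if the tool is not on PATH."""
    if shutil.which("create_libs") is None:
        raise RuntimeError("create_libs not found; is tree-hugger installed?")
    subprocess.run(build_create_libs_command(langs, **options), check=True)


print(build_create_libs_command(["python", "php"], copy_to_workspace=True))
# ['create_libs', '-c', 'python', 'php']
```

Keeping the command construction separate from the subprocess call makes it possible to inspect the arguments without compiling anything.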

A Quick Example

First run the above command to generate the libraries.

In our setup, we use the -c flag to copy the generated tree-sitter library's .so file to our workspace. Once it is copied, we place it under a directory called tslibs (which is listed in .gitignore).
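The move into tslibs can be done by hand, or with a small script such as the sketch below. The function name stash_libs is hypothetical, and the sketch assumes the generated .so files sit directly in the workspace directory after running the command with -c.

```python
import shutil
from pathlib import Path


def stash_libs(workspace, target_dir_name="tslibs"):
    """Move every .so file found directly in `workspace` into a subdirectory."""
    workspace = Path(workspace)
    target = workspace / target_dir_name
    target.mkdir(exist_ok=True)

    moved = []
    # glob("*.so") is not recursive, so files already in tslibs are left alone.
    for lib in sorted(workspace.glob("*.so")):
        destination = target / lib.name
        shutil.move(str(lib), str(destination))
        moved.append(destination)
    return moved
```

Calling stash_libs(".") from the workspace root returns the new paths of the files it moved, or an empty list if there was nothing to move.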

Another thing we need before we can analyze any code file is a YAML file with queries. We have supplied an example query file under the queries directory.

Please note that you can set two environment variables, QUERY_FILE_PATH and TS_LIB_PATH, for the query file path and the tree-sitter lib path; the library will then use them automatically. Alternatively, you can pass the paths when creating any *Parser object.
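As a minimal sketch, the two variables can also be set from Python before the parser is created. The variable names come from the paragraph above; the helper name configure_tree_hugger_env and both file names are assumptions for illustration, so substitute the paths from your own workspace.

```python
import os
from pathlib import Path


def configure_tree_hugger_env(lib_path, query_path):
    """Export the two variables named in this README and return their values."""
    os.environ["TS_LIB_PATH"] = str(Path(lib_path))
    os.environ["QUERY_FILE_PATH"] = str(Path(query_path))
    return {name: os.environ[name] for name in ("TS_LIB_PATH", "QUERY_FILE_PATH")}


# Hypothetical file names; use whatever create_libs produced and
# whichever query file you actually want to load.
settings = configure_tree_hugger_env(
    "tslibs/python.so",
    "queries/example_queries.yml",
)
print(settings)
```

To be safe, run this before importing tree_hugger, since this README does not say at which point the library reads the variables.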

Assuming that you have the necessary environment variables set up, the following lines of code will create a PythonParser object:

from tree_hugger.core import PythonParser

pp = PythonParser()

And then you can pass in any Python file that you want to analyze, like so:

pp.parse_file("tests/assets/file_with_different_functions.py")
Out[3]: True

parse_file returns True on success.

You are then free to use the methods exposed by that particular Parser object. For example:

pp.get_all_function_names()
Out[4]:
['first_child',
 'second_child',
 'say_whee',
 'wrapper',
 'my_decorator',
 'parent']

Or:

pp.get_all_function_docstrings()
Out[5]:
{'parent': '"""This is the parent function\n    \n    There are other lines in the doc string\n    This is the third line\n\n    And this is the fourth\n    """',
 'first_child': "'''\n        This is first child\n        '''",
 'second_child': '"""\n        This is second child\n        """',
 'my_decorator': '"""\n    Outer decorator function\n    """',
 'say_whee': '"""\n    Hellooooooooo\n\n    This is a function with decorators\n    """'}

(Notice that the last call only returns the functions that have a docstring.)

Roadmap

  • Finish PythonParser

  • Create pypi packages and make it installable via pip

  • Write more documentation

  • Write *Parser class for other languages

Language      Status (finished)   Author
Python        40%                 Shubhadeep
PHP           0%                  NULL
Java          0%                  NULL
JavaScript    0%                  NULL
