
A query tool for Python abstract syntax trees



A command-line utility for grepping Python files using XPath syntax against the Python AST (Abstract Syntax Tree).

In other words, this allows you to search Python code for specific syntax elements (function definitions, arguments, assignments, variables, etc.), instead of grepping for string matches.

The interface and behaviour are designed to match grep and ripgrep as far as it makes sense to do so.

Installation

Python 3.7+ required.

We recommend pipx to install it conveniently in an isolated environment:

pipx install pyastgrep

You can also use pip:

pip install pyastgrep

Understanding the XML structure

To get started, you’ll need some understanding of how Python AST is structured, and how that is mapped to XML. Some methods for doing that are below:

  1. Use Python AST Explorer to play around with what AST looks like.

  2. Dump out the AST and/or XML structure of the top-level statements in a Python file. The top-level XML elements are <Module><body>, and don’t correspond to actual source lines. To get the statements within the body, you can use an XPath expression /Module/body/* or ./*/*:

    $ pyastgrep --xml --ast './*/*' myfile.py
    myfile.py:1:1:import os
    Import(
        lineno=1,
        col_offset=0,
        end_lineno=1,
        end_col_offset=9,
        names=[alias(lineno=1, col_offset=7, end_lineno=1, end_col_offset=9, name='os', asname=None)],
    )
    ...
    <Import lineno="1" col_offset="0">
      <names>
        <alias lineno="1" col_offset="7" type="str" name="os"/>
      </names>
    </Import>
    ...

Note that the XML format is a very direct translation of the Python AST as produced by the ast module (with some small additions made to improve usability for a few cases). This AST is not stable across Python versions, so the XML is not stable either. Normally changes in the AST correspond to new syntax that is added to Python, but in some cases a new Python version will make significant changes to the AST generated for the same code.
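If you want to inspect the underlying AST without pyastgrep, the standard library ast module can dump it directly. The following is a minimal sketch; the variable names are illustrative only, and the exact dump text varies between Python versions:

```python
import ast

source = "import os\n"
tree = ast.parse(source)

# The Module body is a list of top-level statements.
statement = tree.body[0]
print(type(statement).__name__)                 # Import
print(statement.lineno, statement.col_offset)   # 1 0

# include_attributes=True adds position fields such as lineno and col_offset.
print(ast.dump(statement, include_attributes=True))
```

Note that col_offset is 0-based here, while the match location pyastgrep printed above (myfile.py:1:1) uses a 1-based column.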

You can also pipe specific Python fragments using - to specify stdin as the input file:

$ echo 'a + b' | pyastgrep --xml './*/*' -
<stdin>:1:1:a + b
<Expr lineno="1" col_offset="0">
  <value>
    <BinOp lineno="1" col_offset="0">
      <left>
        <Name lineno="1" col_offset="0" type="str" id="a">
          <ctx>
            <Load/>
          </ctx>
        </Name>
      </left>
      <op>
        <Add/>
      </op>
  ...
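To make the AST-to-XML mapping more concrete, here is a rough sketch of how such a translation could work, using only the standard library. This is a simplified illustration, not pyastgrep's actual implementation: the to_xml helper is hypothetical, position attributes such as lineno are omitted, and xml.etree.ElementTree supports only a subset of XPath.

```python
import ast
import xml.etree.ElementTree as ET


def to_xml(node: ast.AST) -> ET.Element:
    """Convert an ast node to an XML element (simplified, illustrative mapping)."""
    element = ET.Element(type(node).__name__)
    for field, value in ast.iter_fields(node):
        if isinstance(value, ast.AST):
            # A single child node becomes <field><ChildType/></field>.
            ET.SubElement(element, field).append(to_xml(value))
        elif isinstance(value, list):
            # A list of child nodes becomes <field> with one element per item.
            container = ET.SubElement(element, field)
            for item in value:
                if isinstance(item, ast.AST):
                    container.append(to_xml(item))
        elif value is not None:
            # Simple values (names, constants) become XML attributes.
            element.set(field, str(value))
    return element


root = to_xml(ast.parse("with open(path) as f:\n    data = f.read()\n"))
matches = root.findall(".//Call/func/Name[@id='open']")
print([m.get("id") for m in matches])  # ['open']
```

The call f.read() is not matched, because its func child is an Attribute element rather than a Name.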

You’ll also need some understanding of how to write XPath expressions (see links at the bottom), but the examples below should get you started.

Examples

Find all usages of a function called open:

$ pyastgrep ".//Call/func/Name[@id='open']"
src/pyastgrep/search.py:88:18:            with open(path) as f:

Find all literal numbers (Python 3.8+):

$ pyastgrep './/Constant[@type="int" or @type="float"]'
tests/examples/test_xml/everything.py:5:20:    assigned_int = 123
tests/examples/test_xml/everything.py:6:22:    assigned_float = 3.14
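For comparison, a similar search can be sketched directly against the ast module. This is an illustrative standard-library equivalent, not how pyastgrep works internally. It compares exact types because bool is a subclass of int in Python, so an isinstance check would also pick up True and False:

```python
import ast

source = "assigned_int = 123\nassigned_float = 3.14\nflag = True\nname = 'x'\n"

numbers = [
    node.value
    for node in ast.walk(ast.parse(source))
    # Exact type comparison keeps int and float but excludes bool and str.
    if isinstance(node, ast.Constant) and type(node.value) in (int, float)
]
print(numbers)  # [123, 3.14]
```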

Names longer than 42 characters:

$ pyastgrep './/Name[string-length(@id) > 42]'

except clauses that raise a different exception class than they catch:

$ pyastgrep "//ExceptHandler[body//Raise/exc//Name and not(contains(body//Raise/exc//Name/@id, type/Name/@id))]"
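To show the kind of code this expression targets, here is a rough approximation written against the ast module. It is a sketch only: the helper name is hypothetical, and its logic does not reproduce XPath semantics exactly (for instance, it flags a handler if any raise inside it qualifies).

```python
import ast

SOURCE = '''
try:
    value = int(text)
except ValueError:
    raise RuntimeError("bad input")

try:
    value = int(text)
except ValueError:
    raise ValueError("bad input")
'''


def reraises_different_class(handler: ast.ExceptHandler) -> bool:
    """Return True if the handler raises a name that does not contain the caught name."""
    if not isinstance(handler.type, ast.Name):
        return False
    caught = handler.type.id
    for stmt in handler.body:
        for node in ast.walk(stmt):
            if isinstance(node, ast.Raise) and node.exc is not None:
                raised = [n.id for n in ast.walk(node.exc) if isinstance(n, ast.Name)]
                # Substring test, mirroring the contains() call in the XPath.
                if raised and caught not in raised[0]:
                    return True
    return False


flagged = [
    node.lineno
    for node in ast.walk(ast.parse(SOURCE))
    if isinstance(node, ast.ExceptHandler) and reraises_different_class(node)
]
print(flagged)  # [4]
```

Only the first handler is reported, since the second one re-raises the same class it catches.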

Functions whose names contain a certain substring:

$ pyastgrep './/FunctionDef[contains(@name, "something")]'

Classes whose name matches a regular expression:

$ pyastgrep ".//ClassDef[re:match('M.*', @name)]"

This uses the Python re.match method. You can also use re:search to use the Python re.search method.
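The difference between the two matters for patterns like the one above: re.match only matches at the start of the string, while re.search matches anywhere in it. A small illustration with made-up class names:

```python
import re

names = ["Model", "MyView", "BaseModel"]

# re.match anchors at the beginning, so "BaseModel" is excluded.
matched = [n for n in names if re.match("M.*", n)]
# re.search finds "M" anywhere in the name, so all three qualify.
searched = [n for n in names if re.search("M.*", n)]

print(matched)   # ['Model', 'MyView']
print(searched)  # ['Model', 'MyView', 'BaseModel']
```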

Docstrings of functions/methods whose value contains “hello”:

$ pyastgrep './/FunctionDef/body/Expr[1]/value/Constant[@type="str"][contains(@value, "hello")]'
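A comparable check can be sketched with the standard library helper ast.get_docstring. This is an alternative technique rather than a translation of the XPath above: get_docstring only considers the first statement of the body and cleans up indentation, so results may differ in edge cases.

```python
import ast

source = '''
def greet():
    """Say hello to the user."""

def other():
    return 1
'''

hits = [
    node.name
    for node in ast.walk(ast.parse(source))
    # get_docstring returns None when there is no docstring.
    if isinstance(node, ast.FunctionDef) and "hello" in (ast.get_docstring(node) or "")
]
print(hits)  # ['greet']
```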

For-loop variables called i or j (including those created by tuple unpacking):

$ pyastgrep './/For/target//Name[@id="i" or @id="j"]'
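The double slash after target is what picks up names nested inside tuple unpacking. The same idea can be sketched with the ast module by walking each loop target; this is an illustrative equivalent, not pyastgrep's implementation:

```python
import ast

source = "for i, (j, k) in pairs:\n    pass\nfor item in items:\n    pass\n"

found = []
for node in ast.walk(ast.parse(source)):
    if isinstance(node, ast.For):
        # Walking the target descends into nested tuples such as (j, k).
        for name in ast.walk(node.target):
            if isinstance(name, ast.Name) and name.id in {"i", "j"}:
                found.append(name.id)

print(found)  # ['i', 'j']
```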

Ignoring files

Files/directories matching .gitignore entries (global and local) are automatically ignored, unless specified as paths on the command line.

Currently there are no other methods to add or remove this ignoring logic. Please open a ticket if you want this feature. Most likely we should try to make it work like ripgrep filtering if that makes sense.

Tips

To get pyastgrep to print absolute paths in results, pass the current absolute path as the directory to search:

pyastgrep "..." "$(pwd)"

Limitations

pyastgrep is useful for grepping Python code at a fairly low level. It can be used for various refactoring or linting tasks. Some linting tasks require higher level understanding of a code base. For example, to detect use of a certain function, you need to cope with various ways that the function may be imported and used, and avoid detecting a function with the same name but from a different module. For these kinds of tasks, you might be interested in:

If you are using this as a library, you should note that while AST works well for linting, it’s not as good for rewriting code, because AST does not contain or preserve things like formatting and comments. For a better approach, have a look at libCST.
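The loss of comments and formatting is easy to demonstrate with the standard library. The sketch below assumes Python 3.9 or later, where ast.unparse is available:

```python
import ast

source = "x  =  1  # the answer\n"

# Parsing and unparsing keeps the meaning but drops the comment and spacing.
round_tripped = ast.unparse(ast.parse(source))
print(round_tripped)  # x = 1
```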

Use as a library

pyastgrep is structured internally to make it easy to use as a library as well as a CLI. However, while we will try not to break things without good reason, at this point we are not documenting or guaranteeing API stability for these functions.

Contributing

Get test suite running:

pip install -r requirements-test.txt
pytest

Run tests against all versions:

pip install tox
tox

Please install pre-commit in the repo:

pre-commit install

This will add Git hooks that run linters when committing, which enforces our code style (black) and other checks.

You can manually run these linters using:

pre-commit run --all-files

Run mypy (we only expect it to pass on Python 3.10):

mypy .

Bug fixes and other changes can be submitted using pull requests on GitHub. For large changes, it’s worth opening an issue first to discuss the approach.

History

This project was forked from https://github.com/hchasestevens/astpath by H. Chase Stevens. Main changes:

  • Added a test suite

  • Many bugs fixed

  • Significant rewrite of parts of code

  • Changes to match grep/ripgrep, including formatting and automatic filtering.
