Find the Python code for specified symbols

symbex

Read "symbex: search Python code for functions and classes, then pipe them into a LLM" for background on this project.

Installation

Install this tool using pip:

pip install symbex

Or using Homebrew:

brew install simonw/llm/symbex

Usage

symbex can search for names of functions and classes that occur at the top level of a Python file.

To search every .py file in your current directory and all subdirectories, run symbex like this:

symbex my_function

You can search for more than one symbol at a time:

symbex my_function MyClass
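
To give a feel for what "top level" means here, this is a minimal sketch built on Python's standard ast module. It is not symbex's own implementation, and find_top_level is a hypothetical helper name used only for illustration:

```python
import ast

SOURCE = '''
def my_function():
    return 1

class MyClass:
    def method(self):
        return 2
'''


def find_top_level(source: str, names: set[str]) -> dict[str, str]:
    """Map each requested top-level function or class name to its source text."""
    tree = ast.parse(source)
    found = {}
    # tree.body holds only the direct children of the module,
    # so methods nested inside classes are not visited here
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if node.name in names:
                found[node.name] = ast.get_source_segment(source, node)
    return found


results = find_top_level(SOURCE, {"my_function", "MyClass", "method"})
for code in results.values():
    print(code)
    print()
```

Because method is defined inside MyClass rather than at the top level, asking for it by its bare name finds nothing in this sketch.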

Wildcards are supported - to search for every test_ function run this (note the single quotes to avoid the shell interpreting the * as a wildcard):

symbex 'test_*'

To search for methods within classes, use class.method notation:

symbex Entry.get_absolute_url

Wildcards are supported here as well:

symbex 'Entry.*'
symbex '*.get_absolute_url'
symbex '*.get_*'

Or to view every method of every class:

symbex '*.*'
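
The matching rules above can be sketched with the standard library's fnmatch module. This is an illustration under my own assumptions, not the code symbex uses: matches is a hypothetical helper, and it assumes that a pattern without a dot only ever matches top-level names.

```python
from fnmatch import fnmatchcase


def matches(qualified_name: str, patterns: list[str]) -> bool:
    """Return True if a symbol name matches any of the patterns.

    qualified_name is "name" for a top-level symbol
    or "Class.method" for a method.
    """
    for pattern in patterns:
        if "." in pattern:
            # class.method pattern: only methods can match it
            if "." not in qualified_name:
                continue
            class_pattern, method_pattern = pattern.split(".", 1)
            class_name, method_name = qualified_name.split(".", 1)
            if fnmatchcase(class_name, class_pattern) and fnmatchcase(
                method_name, method_pattern
            ):
                return True
        elif "." not in qualified_name and fnmatchcase(qualified_name, pattern):
            # Assumption: a plain pattern only matches top-level names
            return True
    return False


print(matches("test_login", ["test_*"]))
print(matches("Entry.get_absolute_url", ["*.get_*"]))
print(matches("my_function", ["*.*"]))
```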

To search within a specific file, pass that file using the -f option. You can pass this more than once to search multiple files.

symbex MyClass -f my_file.py

To search within a specific directory and all of its subdirectories, use the -d/--directory option:

symbex Database -d ~/projects/datasette

If you know that you want to inspect one or more modules that can be imported by Python, you can use the -m/--module name option. This example shows the signatures for every symbol available in the asyncio package:

symbex -m asyncio -s --imports

You can search the directory containing the Python standard library using --stdlib. This can be useful for quickly looking up the source code for specific Python library functions:

symbex --stdlib -in to_thread

-in is explained below. If you provide --stdlib without any -d or -f options then --silent will be turned on automatically, since the standard library otherwise produces a number of different warnings.

The output starts like this:

# from asyncio.threads import to_thread
async def to_thread(func, /, *args, **kwargs):
    """Asynchronously run function *func* in a separate thread.
    # ...

You can exclude files in specified directories using the -x/--exclude option:

symbex Database -d ~/projects/datasette -x ~/projects/datasette/tests

If symbex encounters any Python code that it cannot parse, it will print a warning message and continue searching:

# Syntax error in path/badcode.py: expected ':' (<unknown>, line 1)

Pass --silent to suppress these warnings:

symbex MyClass --silent
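
The warn-and-continue behavior can be sketched like this. It is a rough illustration rather than symbex's actual code; parse_or_warn and the silent parameter are hypothetical names, and the exact wording of the error message depends on your Python version:

```python
import ast


def parse_or_warn(code: str, path: str, silent: bool = False):
    """Return the parsed module, or None if the code has a syntax error."""
    try:
        return ast.parse(code)
    except SyntaxError as ex:
        if not silent:
            # Printed as a comment so the rest of the output stays valid Python
            print(f"# Syntax error in {path}: {ex}")
        return None


files = {
    "good.py": "def ok():\n    pass\n",
    "path/badcode.py": "def broken(\n",
}
# The broken file produces a warning but does not stop the loop
parsed = {path: parse_or_warn(code, path) for path, code in files.items()}
```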

Filters

In addition to searching for symbols, you can apply filters to the results.

The following filters are available:

  • --function - only functions
  • --class - only classes
  • --async - only async def functions
  • --documented - functions/classes that have a docstring
  • --undocumented - functions/classes that do not have a docstring
  • --typed - functions that have at least one type annotation
  • --untyped - functions that have no type annotations
  • --partially-typed - functions that have some type annotations but not all
  • --fully-typed - functions that have type annotations for every argument and the return value
  • --no-init - Exclude __init__(self) methods. This is useful when combined with --fully-typed '*.*' to avoid returning __init__(self) methods that would otherwise be classified as fully typed, since __init__ doesn't need argument or return type annotations.

For example, to see the signatures of every async def function in your project that doesn't have any type annotations:

symbex -s --async --untyped
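
Here is one way the typed/untyped distinction could be computed from the AST. Treat it as a sketch under stated assumptions: type_status is a hypothetical helper, it only considers plain functions, and it makes no special allowance for self or __init__ the way the --no-init note above implies symbex does.

```python
import ast


def type_status(func) -> str:
    """Classify a function node as untyped, partially-typed or fully-typed."""
    args = func.args
    params = args.posonlyargs + args.args + args.kwonlyargs
    if args.vararg:
        params.append(args.vararg)
    if args.kwarg:
        params.append(args.kwarg)
    # One slot per argument, plus one for the return value
    slots = [param.annotation for param in params] + [func.returns]
    annotated = sum(slot is not None for slot in slots)
    if annotated == 0:
        return "untyped"
    if annotated == len(slots):
        return "fully-typed"
    return "partially-typed"


SOURCE = '''
def a(x, y):
    pass

def b(x: int, y) -> int:
    return x

async def c(x: int, *, y: str) -> bool:
    return True
'''

statuses = {node.name: type_status(node) for node in ast.parse(SOURCE).body}
print(statuses)
```

In these terms, --typed corresponds to anything that is not "untyped".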

For class methods instead of functions, you can combine filters with a symbol search argument of *.*.

This example shows the full source code of every class method in the Python standard library that has type annotations for all of the arguments and the return value:

symbex --fully-typed --no-init '*.*' --stdlib

Example output

In a fresh checkout of Datasette I ran this command:

symbex PatternPortfolioView get_long_description

Here's the output of the command:

# File: setup.py Line: 5
def get_long_description():
    with open(
        os.path.join(os.path.dirname(os.path.abspath(__file__)), "README.md"),
        encoding="utf8",
    ) as fp:
        return fp.read()

# File: datasette/views/special.py Line: 60
class PatternPortfolioView(View):
    async def get(self, request, datasette):
        await datasette.ensure_permissions(request.actor, ["view-instance"])
        return Response.html(
            await datasette.render_template(
                "patterns.html",
                request=request,
                view_name="patterns",
            )
        )

Just the signatures

The -s/--signatures option will list just the signatures of the functions and classes, for example:

symbex -s -f symbex/lib.py
# File: symbex/lib.py Line: 107
def function_definition(function_node: AST)

# File: symbex/lib.py Line: 13
def find_symbol_nodes(code: str, filename: str, symbols: Iterable[str]) -> List[Tuple[(AST, Optional[str])]]

# File: symbex/lib.py Line: 175
def class_definition(class_def)

# File: symbex/lib.py Line: 209
def annotation_definition(annotation: AST) -> str

# File: symbex/lib.py Line: 227
def read_file(path)

# File: symbex/lib.py Line: 253
class TypeSummary

# File: symbex/lib.py Line: 258
def type_summary(node: AST) -> Optional[TypeSummary]

# File: symbex/lib.py Line: 304
def quoted_string(s)

# File: symbex/lib.py Line: 315
def import_line_for_function(function_name: str, filepath: str, possible_root_dirs: List[str]) -> str

# File: symbex/lib.py Line: 37
def code_for_node(code: str, node: AST, class_name: str, signatures: bool, docstrings: bool) -> Tuple[(str, int)]

# File: symbex/lib.py Line: 71
def add_docstring(definition: str, node: AST, docstrings: bool, is_method: bool) -> str

# File: symbex/lib.py Line: 82
def match(name: str, symbols: Iterable[str]) -> bool

This can be combined with other options, or you can run symbex -s to see every symbol in the current directory and its subdirectories.
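
If you are curious how a signature line like the ones above can be rebuilt from parsed code, here is a minimal sketch using ast.unparse (Python 3.9+). It is not symbex's own function_definition or class_definition code; signature is a hypothetical helper and its output formatting may differ from symbex in edge cases.

```python
import ast


def signature(node) -> str:
    """Build a one-line signature for a function or class node."""
    if isinstance(node, ast.ClassDef):
        bases = ", ".join(ast.unparse(base) for base in node.bases)
        return f"class {node.name}({bases})" if bases else f"class {node.name}"
    prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
    # ast.unparse turns the arguments node back into source text
    line = f"{prefix} {node.name}({ast.unparse(node.args)})"
    if node.returns is not None:
        line += f" -> {ast.unparse(node.returns)}"
    return line


SOURCE = '''
def match(name: str, symbols: Iterable[str]) -> bool:
    return True

class TypeSummary:
    pass
'''

signatures = [signature(node) for node in ast.parse(SOURCE).body]
for line in signatures:
    print(line)
```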

To include estimated import paths, such as # from symbex.lib import match, use --imports. These will be calculated relative to the directory you specified, or you can pass one or more --sys-path options to request that imports are calculated relative to those directories as if they were on sys.path:

~/dev/symbex/symbex match --imports -s --sys-path ~/dev/symbex

Example output:

# File: symbex/lib.py Line: 82
# from symbex.lib import match
def match(name: str, symbols: Iterable[str]) -> bool
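
The idea of an "estimated" import path can be sketched by turning a file path into a dotted module name relative to a root directory. This is only an illustration: guess_import_line is a hypothetical helper, the paths are made up, and it does not handle cases such as __init__.py files.

```python
from pathlib import PurePosixPath


def guess_import_line(symbol: str, filepath: str, root: str) -> str:
    """Guess a 'from x import y' comment for a symbol defined in filepath."""
    # Raises ValueError if filepath is not inside root
    relative = PurePosixPath(filepath).relative_to(root)
    # symbex/lib.py -> symbex.lib
    module = ".".join(relative.with_suffix("").parts)
    return f"# from {module} import {symbol}"


line = guess_import_line(
    "match",
    "/home/me/dev/symbex/symbex/lib.py",
    "/home/me/dev/symbex",
)
print(line)
```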

To suppress the # File: ... comments, use --no-file or -n.

So to both show import paths and suppress File comments, use -in as a shortcut:

symbex -in match

Output:

# from symbex.lib import match
def match(name: str, symbols: Iterable[str]) -> bool

To include docstrings in those signatures, use --docstrings:

symbex match --docstrings -f symbex/lib.py

Example output:

# File: symbex/lib.py Line: 82
def match(name: str, symbols: Iterable[str]) -> bool
    "Returns True if name matches any of the symbols, resolving wildcards"

Counting symbols

If you just want to count the number of functions and classes that match your filters, use the --count option. Here's how to count your classes:

symbex --class --count

Or to count every async test function:

symbex --async 'test_*' --count

Using with LLM

This tool is primarily designed to be used with LLM, a CLI tool for working with Large Language Models.

symbex makes it easy to grab a specific class or function and pass it to the llm command.

For example, I ran this in the Datasette repository root:

symbex Response | llm --system 'Explain this code, succinctly'

And got back this:

This code defines a custom Response class with methods for returning HTTP responses. It includes methods for setting cookies, returning HTML, text, and JSON responses, and redirecting to a different URL. The asgi_send method sends the response to the client using the ASGI (Asynchronous Server Gateway Interface) protocol.

Replacing a matched symbol

The --replace option can be used to replace a single matched symbol with content piped in to standard input.

Given a file called my_code.py with the following content:

def first_function():
    # This will be ignored
    pass

def second_function():
    # This will be replaced
    pass

Run the following:

echo "def second_function(a, b):
    # This is a replacement implementation
    return a + b + 3
" | symbex second_function --replace

The result will be an updated-in-place my_code.py containing the following:

def first_function():
    # This will be ignored
    pass

def second_function(a, b):
    # This is a replacement implementation
    return a + b + 3

This feature should be used with care! I recommend only using this feature against code that is already checked into Git, so you can review changes it makes using git diff and revert them using git checkout my_code.py.
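
To show the general idea behind an in-place replacement, here is a sketch that swaps a top-level function by line range using the line numbers recorded on AST nodes. It works on a string rather than a file, replace_symbol is a hypothetical helper, and it ignores details such as decorators. It is not how symbex itself is necessarily implemented.

```python
import ast

MY_CODE = '''def first_function():
    # This will be ignored
    pass

def second_function():
    # This will be replaced
    pass
'''

REPLACEMENT = '''def second_function(a, b):
    # This is a replacement implementation
    return a + b + 3
'''


def replace_symbol(source: str, name: str, replacement: str) -> str:
    """Return source with the named top-level symbol replaced."""
    for node in ast.parse(source).body:
        if (
            isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
            and node.name == name
        ):
            lines = source.splitlines(keepends=True)
            start = node.lineno - 1  # ast line numbers are 1-based
            end = node.end_lineno  # inclusive last line, used as slice bound
            new_text = replacement.rstrip("\n") + "\n"
            return "".join(lines[:start]) + new_text + "".join(lines[end:])
    raise ValueError(f"No top-level symbol named {name}")


updated = replace_symbol(MY_CODE, "second_function", REPLACEMENT)
print(updated, end="")
```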

Replacing a matched symbol by running a command

The --rexec COMMAND option can be used to replace a single matched symbol by running a command and using its output.

The command will be run with the matched symbol's definition piped to its standard input. The output of that command will be used as the replacement text.

Here's an example that uses sed to add a # to the beginning of each matching line, effectively commenting out the matched function:

symbex first_function --rexec "sed 's/^/# /'"

This modified the first function in place to look like this:

# def first_function():
#     # This will be ignored
#     pass
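
The pipe-through-a-command mechanism can be sketched with subprocess. This is an illustration only, assuming a POSIX shell with sed available; run_filter is a hypothetical helper and not part of symbex.

```python
import subprocess


def run_filter(command: str, definition: str) -> str:
    """Pipe a symbol's definition to a shell command and return its output."""
    result = subprocess.run(
        command,
        shell=True,  # the command is a shell string, like the --rexec value
        input=definition,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout


definition = "def first_function():\n    # This will be ignored\n    pass\n"
commented = run_filter("sed 's/^/# /'", definition)
print(commented, end="")
```

Using check=True means a failing command raises an exception instead of returning partial output, which seems like the safer default when the output is about to overwrite code.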

A much more exciting example uses LLM. This example will use the gpt-3.5-turbo model to add type hints and generate a docstring:

symbex second_function \
  --rexec "llm --system 'add type hints and a docstring'"

I ran this against this code:

def first_function():
    # This will be ignored
    pass

def second_function(a, b):
    return a + b + 3

And the second function was updated in place to look like this:

def second_function(a: int, b: int) -> int:
    """
    Returns the sum of two integers (a and b) plus 3.

    Parameters:
    a (int): The first integer.
    b (int): The second integer.

    Returns:
    int: The sum of a and b plus 3.
    """
    return a + b + 3

Using in CI

The --check option causes symbex to return a non-zero exit code if any matches are found for your query.

You can use this in CI to guard against things like functions being added without documentation:

symbex --function --undocumented --check

This will fail silently, setting an exit code of 1, if there are any undocumented functions.
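
The same kind of guard can be sketched in plain Python with ast.get_docstring. This is a simplified illustration that only looks at top-level functions in a single string of source code; undocumented_functions and check are hypothetical helpers, not part of symbex.

```python
import ast

SOURCE = '''
def documented():
    "Has a docstring"
    return 1

def undocumented():
    return 2
'''


def undocumented_functions(source: str) -> list[str]:
    """Names of top-level functions that have no docstring."""
    return [
        node.name
        for node in ast.parse(source).body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and ast.get_docstring(node) is None
    ]


def check(source: str) -> int:
    """Exit code for CI: 1 if anything matched, otherwise 0."""
    return 1 if undocumented_functions(source) else 0


exit_code = check(SOURCE)
print(exit_code)
```

In a real CI script you would pass the result to sys.exit() so the build fails when the code is non-zero.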

Similar tools

  • pyastgrep by Luke Plant offers advanced capabilities for viewing and searching through Python ASTs using XPath.
  • cq is a tool that lets you "extract code snippets using CSS-like selectors", built using Tree-sitter and primarily targeting JavaScript and TypeScript.

symbex --help

Usage: symbex [OPTIONS] [SYMBOLS]...

  Find symbols in Python code and print the code for them.

  Example usage:

      # Search current directory and subdirectories
      symbex my_function MyClass

      # Search using a wildcard
      symbex 'test_*'

      # Find a specific class method
      symbex 'MyClass.my_method'

      # Find class methods using wildcards
      symbex '*View.handle_*'

      # Search a specific file
      symbex MyClass -f my_file.py

      # Search within a specific directory and its subdirectories
      symbex Database -d ~/projects/datasette

      # View signatures for all symbols in current directory and subdirectories
      symbex -s

      # View signatures for all test functions
      symbex 'test_*' -s

      # View signatures for all async functions with type definitions
      symbex --async --typed -s

      # Count the number of --async functions in the project
      symbex --async --count

      # Replace my_function with a new implementation:
      echo "def my_function(a, b):
          # This is a replacement implementation
          return a + b + 3
      " | symbex my_function --replace

Options:
  --version                  Show the version and exit.
  -f, --file FILE            Files to search
  -d, --directory DIRECTORY  Directories to search
  --stdlib                   Search the Python standard library
  -x, --exclude DIRECTORY    Directories to exclude
  -s, --signatures           Show just function and class signatures
  -n, --no-file              Don't include the # File: comments in the output
  -i, --imports              Show 'from x import y' lines for imported symbols
  -m, --module TEXT          Modules to search within
  --sys-path TEXT            Calculate imports relative to these on sys.path
  --docs, --docstrings       Show function and class signatures plus docstrings
  --count                    Show count of matching symbols
  --silent                   Silently ignore Python files with parse errors
  --async                    Filter async functions
  --function                 Filter functions
  --class                    Filter classes
  --documented               Filter functions with docstrings
  --undocumented             Filter functions without docstrings
  --typed                    Filter functions with type annotations
  --untyped                  Filter functions without type annotations
  --partially-typed          Filter functions with partial type annotations
  --fully-typed              Filter functions with full type annotations
  --no-init                  Filter to exclude any __init__ methods
  --check                    Exit with non-zero code if any matches found
  --replace                  Replace matching symbol with text from stdin
  --rexec TEXT               Replace with the result of piping to this tool
  --help                     Show this message and exit.

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd symbex
python -m venv venv
source venv/bin/activate

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest

just

You can also install just and use it to run the tests and linters like this:

just

Or to list commands:

just -l
Available recipes:
    black         # Apply Black
    cog           # Rebuild docs with cog
    default       # Run tests and linters
    lint          # Run linters
    test *options # Run pytest with supplied options
