Bicycle Repair Man - Rewrite Python Sources

Bicycle Repair Man

BRM is a Python source modification library that performs lossless modifications with a guarantee of full round-tripping. It is generally used for unstructured parts of the source, where the modification can be done directly on the tokens.

A simple example would be the TokenTransformer, where we change each + (plus) operator to a - (minus) operator.

from brm import TokenTransformer

class DestroyAllOfThem(TokenTransformer):

    # Replace each PLUS token with a MINUS
    def visit_plus(self, token):
        return token._replace(string="-")

transformer = DestroyAllOfThem()
assert transformer.transform("(2p) + 2 # with my precious comment") == "(2p) - 2 # with my precious comment"
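
For comparison, the same plus-to-minus rewrite can be sketched with nothing but the standard library's tokenize module. This only illustrates the token-level idea and is not BRM's implementation; the helper name `swap_plus_for_minus` is made up for this example.

```python
import io
import token
import tokenize


def swap_plus_for_minus(source: str) -> str:
    """Replace every `+` operator token with `-`, leaving everything else alone."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # exact_type distinguishes PLUS from the other OP tokens (e.g. `+=`)
        if tok.exact_type == token.PLUS:
            tok = tok._replace(string="-")
        tokens.append(tok)
    # Full 5-tuples carry the original positions, so spacing and comments survive
    return tokenize.untokenize(tokens)


result = swap_plus_for_minus("(a) + 2  # with my precious comment\n")
print(result, end="")
```

Because the replacement has the same width as the original token, the recorded column positions stay valid. A replacement of a different width would need the following tokens on that line to be shifted.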

One advantage of token-based refactoring over any form of structured tree representation is that you have much more freedom in what you can do. Do you want to prototype a new syntax idea, for example a new operator? Here you go:

from brm import TokenTransformer, pattern

class SquareRoot(TokenTransformer):

    # Register a new token called `squareroot`
    def register_squareroot(self):
        return "√"

    # Match a squareroot followed by a number
    @pattern("squareroot", "number")
    def remove_varprefix(self, operator, token):
        return self.quick_tokenize(f"int({token.string} ** 0.5)")

sqr = SquareRoot()
assert eval(sqr.transform("√9")) == 3

Why BRM

  • BRM is an extremely simple, dependency-free, pure-Python tool of 500 LoC that you can easily vendor.
  • BRM supports each new Python syntax out of the box; there is no need to wait for upstream changes.
  • BRM supports incomplete files (and files that contain invalid Python syntax).
  • BRM supports introducing new syntax and making it permanent for prototypes.

If you need any of these, BRM might be the right fit. But I would warn against using it for complex refactoring tasks, since that is not a problem we intend to tackle. If you need such a tool, take a look at refactor or parso.

Permanency

If you like the concept of transformers and want to use them in real-world code, BRM exposes a custom encoding that runs your transformers automatically on every file that declares it.

  • Write a transformer
  • Copy it to the ~/.brm folder, or simply use cp <file>.py $(python -m brm)
  • Add a # coding: brm declaration to the top of each file

Example:

from brm import TokenTransformer, pattern

class AlwaysTrue(TokenTransformer):

    STRICT = False

    # Make every if/elif statement `True`
    @pattern("name", "*any", "colon")
    def always_true_if(self, *tokens):
        statement, *_, colon = tokens
        if statement.string not in {"if", "elif"}:
            return
        true, = self.quick_tokenize("True")
        return (statement, true, colon)

Let's put our transformer into BRM's transformer folder and run our example.

$ cat r.py
# coding: brm

a = 2
if a > 2:
    print("LOL")
$ cp test.py $(python -m brm)
$ python r.py
LOL

TA-DA!
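
A source encoding like this plausibly builds on Python's codec registry, where a custom codec can rewrite text while decoding it. The following is a minimal sketch of that general mechanism under that assumption, not BRM's actual code; the codec name `demo_rewrite` and the `rewrite` function are hypothetical stand-ins.

```python
import codecs


def rewrite(source: str) -> str:
    # Stand-in for running a transformer over the decoded source
    return source.replace("+", "-")


def decode(data, errors="strict"):
    # Decode as UTF-8 first, then rewrite the resulting text
    text = bytes(data).decode("utf-8", errors)
    return rewrite(text), len(data)


def search(name):
    # The registry calls this with the normalized codec name
    if name != "demo_rewrite":
        return None
    return codecs.CodecInfo(
        name="demo_rewrite",
        encode=codecs.utf_8_encode,
        decode=decode,
    )


codecs.register(search)
decoded = b"1 + 1".decode("demo_rewrite")
print(decoded)
```

This sketch only covers `bytes.decode`. A codec meant to be used in a `# coding:` declaration would also need an incremental decoder and stream reader, which are omitted here.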

BRM Pattern Syntax

For BRM, Python source code is just a sequence of tokens. It doesn't create any relationships between them, or even verify that the file is syntactically correct. For example, take a look at the following file:

if a == x:
    2 + 2 # lol

For BRM, in an abstract fashion, the file is just the following text:

NAME NAME EQEQUAL NAME COLON NEWLINE INDENT NUMBER PLUS NUMBER COMMENT NEWLINE DEDENT ENDMARKER
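
The stream above can be reproduced with the standard library's tokenize module. This is only a sketch for inspection; `token_names` is a hypothetical helper and not part of BRM.

```python
import io
import token
import tokenize

SOURCE = "if a == x:\n    2 + 2 # lol\n"


def token_names(source: str) -> list:
    """Return the exact token type name of every token in `source`."""
    readline = io.StringIO(source).readline
    return [
        token.tok_name[tok.exact_type]
        for tok in tokenize.generate_tokens(readline)
    ]


print(" ".join(token_names(SOURCE)))
```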

And internally it is processed like this:

[Animated figure: BRM pattern demonstration]

If you want to match the binary plus operation here (2 + 2), you can create a pattern with number, plus, number.

Note: If you want to visualize your patterns and see what they match, give examples/visualize.py a shot.
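
To make the idea of matching a pattern against a flat token stream concrete, here is a rough sliding-window sketch built on the standard library. It is not how BRM implements patterns (it has no wildcard support such as `*any`, for instance), and `find_matches` is a hypothetical name used only for illustration.

```python
import io
import token
import tokenize


def find_matches(source, pattern):
    """Return every run of consecutive tokens whose type names equal `pattern`."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Lower-cased exact type names, e.g. "number", "plus", "name"
    names = [token.tok_name[tok.exact_type].lower() for tok in tokens]
    width = len(pattern)
    return [
        tokens[i:i + width]
        for i in range(len(tokens) - width + 1)
        if names[i:i + width] == list(pattern)
    ]


source = "if a == x:\n    2 + 2 # lol\n"
matches = find_matches(source, ("number", "plus", "number"))
for match in matches:
    print([tok.string for tok in match])
```

In this file, `2 + 2` tokenizes as NUMBER PLUS NUMBER, so a window of those three type names finds it exactly once.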

Extras

If you are using the TokenTransformer, there are a few handy functions that you might want to check out:

| Function | Returns | Description |
|---|---|---|
| quick_tokenize(source: str, *, strip: bool = True) | List[TokenInfo] | Break the given source text into a list of tokens. If strip is True, then the last 2 tokens (NEWLINE, EOF) will be omitted. |
| quick_untokenize(tokens: List[TokenInfo]) | str | Convert the given sequence of tokens back to a representation which would yield the same tokens back when tokenized (a lossy conversion). If you want a full round-trip / lossless conversion, use tokenize.untokenize. |
| directional_length(tokens: List[TokenInfo]) | int | Calculate the linear distance between the first and the last token of the sequence. |
| shift_all(tokens: List[TokenInfo], x_offset: int, y_offset: int) | List[TokenInfo] | Shift each token in the given sequence by x_offset in the column offsets, and by y_offset in the line numbers. Return the new list of tokens. |
| until(toktype: int, stream: List[TokenInfo]) | Iterator[TokenInfo] | Yield all tokens until a token of toktype is seen. If no such token is seen, it will raise a ValueError. |
| _get_type(token: TokenInfo) | int | Return the type of the given token. Useful with until(). (internal) |
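
To give a feel for what two of these helpers do, here is a rough standalone approximation of `until` and `shift_all` written against the standard library's TokenInfo. These are sketches based only on the descriptions above, not BRM's source. In particular, it is an assumption here that `until` stops before the matching token and compares exact token types.

```python
import io
import token
import tokenize


def until(toktype, stream):
    """Yield tokens until one of `toktype` is seen; raise ValueError if none is."""
    for tok in stream:
        if tok.exact_type == toktype:
            return
        yield tok
    raise ValueError(f"no token of type {token.tok_name[toktype]} found")


def shift_all(tokens, x_offset=0, y_offset=0):
    """Return copies of `tokens` moved by x_offset columns and y_offset lines."""
    return [
        tok._replace(
            start=(tok.start[0] + y_offset, tok.start[1] + x_offset),
            end=(tok.end[0] + y_offset, tok.end[1] + x_offset),
        )
        for tok in tokens
    ]


tokens = list(tokenize.generate_tokens(io.StringIO("a = 1\n").readline))
head = list(until(token.NUMBER, tokens))
print([tok.string for tok in head])
print(shift_all(head, x_offset=4, y_offset=2)[0].start)
```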
