Bicycle Repair Man - Rewrite Python Sources
Bicycle Repair Man
BRM is a Python source modification library that performs lossless modifications with a guarantee of full round-tripping. It is generally used for unstructured parts of the source, where the modification can be done directly on the tokens.
A simple example would be the `TokenTransformer`, where we change each `+` (plus) operator to a `-` (minus) operator.
from brm import TokenTransformer

class DestroyAllOfThem(TokenTransformer):
    # Replace each PLUS token with a MINUS
    def visit_plus(self, token):
        return token._replace(string="-")

transformer = DestroyAllOfThem()
assert transformer.transform("(2p) + 2 # with my precious comment") == "(2p) - 2 # with my precious comment"
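For readers curious how a token-level edit can keep comments and spacing intact, here is a minimal sketch that uses only the standard library's `tokenize` module. It is not BRM's implementation; the helper name `swap_plus_for_minus` is made up for this illustration, and the input is assumed to be ordinary, valid Python ending in a newline.

```python
import io
import tokenize


def swap_plus_for_minus(source: str) -> str:
    """Replace every `+` operator token with `-`, leaving everything else alone."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only real operator tokens are touched; a "+" inside a string or a
        # comment belongs to a STRING or COMMENT token and is skipped.
        if tok.type == tokenize.OP and tok.string == "+":
            tok = tok._replace(string="-")
        tokens.append(tok)
    # With full 5-tuples, untokenize rebuilds the text from the recorded
    # positions, so the original spacing and comments survive.
    return tokenize.untokenize(tokens)


result = swap_plus_for_minus("(2 * p) + 2  # with my precious comment\n")
print(result, end="")
```

Because `+` and `-` have the same width, the recorded token positions stay valid. A replacement of a different length would also require shifting the positions of the tokens that follow it.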
One advantage of token-based refactoring over any structured tree representation is that you have far more freedom in what you can do. Want to prototype a new syntax idea, for example a `√` operator? Here you go:
from brm import TokenTransformer, pattern

class SquareRoot(TokenTransformer):
    # Register a new token called `squareroot`
    def register_squareroot(self):
        return "√"

    # Match a squareroot followed by a number
    @pattern("squareroot", "number")
    def remove_varprefix(self, operator, token):
        return self.quick_tokenize(f"int({token.string} ** 0.5)")

sqr = SquareRoot()
assert eval(sqr.transform("√9")) == 3
Why BRM
- BRM is an extremely simple, dependency-free, pure-Python tool of about 500 lines of code that you can easily vendor.
- BRM supports each new Python syntax out of the box; there is no need to wait for changes upstream.
- BRM supports incomplete files (and files that contain invalid Python syntax).
- BRM supports introducing new syntax and making it permanent for prototypes.
If you need any of these, BRM might be the right fit. But I would warn against using it for complex refactoring tasks, since that is not a problem we intend to tackle. If you need such a tool, take a look at refactor or parso.
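The point about incomplete or invalid files can be illustrated with the standard library alone: a tokenizer can still split text into tokens where a parser gives up. The snippet is a sketch under that assumption and does not use BRM; the sample string `broken` is invented for the example.

```python
import ast
import io
import tokenize

broken = "x = = 1 +\n"  # lexically fine, but not valid Python syntax

# A tree-based tool has nothing to work with here: parsing fails outright.
try:
    ast.parse(broken)
    parses = True
except SyntaxError:
    parses = False

# Tokenizing only splits the text into tokens, so it still succeeds.
token_strings = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(broken).readline)
    if tok.type not in (tokenize.NEWLINE, tokenize.ENDMARKER)
]

print(parses)
print(token_strings)
```

Input with lexical problems, such as an unterminated string or an unclosed bracket, can still make the standard tokenizer raise, so this sketch only covers errors at the grammar level.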
Permanency
If you like the concept of transformers and want to use them in real-world code, BRM exposes a custom encoding that runs your transformers automatically on every file that declares it.
- Write a transformer
- Copy it to the `~/.brm` folder, or simply use `cp <file>.py $(python -m brm)`
- Specify `# coding: brm` on each file
Example:
from brm import TokenTransformer, pattern

class AlwaysTrue(TokenTransformer):

    STRICT = False

    # Make every if/elif statement `True`
    @pattern("name", "*any", "colon")
    def always_true_if(self, *tokens):
        statement, *_, colon = tokens
        if statement.string not in {"if", "elif"}:
            return
        true, = self.quick_tokenize("True")
        return (statement, true, colon)
Let's put our transformer into BRM's transformer folder and run our example.
$ cat -n r.py
     1  # coding: brm
     2
     3  a = 2
     4  if a > 2:
     5      print("LOL")
$ cp test.py $(python -m brm)
$ python r.py
LOL
TA-DA!
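A source encoding such as `# coding: brm` can be built on Python's codec registry. The sketch below shows that general mechanism with `codecs.register`; it is an assumption about how such a hook can work, not a copy of BRM's code. The codec name `demo_transform` and the placeholder `transform` function are hypothetical.

```python
import codecs

CODEC_NAME = "demo_transform"  # hypothetical name, not the one BRM registers
_utf8 = codecs.lookup("utf-8")


def transform(source: str) -> str:
    # Placeholder for a real token transformer.
    return source.replace("+", "-")


def decode(data, errors="strict"):
    # Decode the bytes as UTF-8 first, then rewrite the resulting text.
    text, consumed = _utf8.decode(data, errors)
    return transform(text), consumed


def search(name):
    # The codec registry calls this with the normalized encoding name.
    if name == CODEC_NAME:
        return codecs.CodecInfo(name=CODEC_NAME, encode=_utf8.encode, decode=decode)
    return None


codecs.register(search)

rewritten = b"total = 1 + 2\n".decode(CODEC_NAME)
print(rewritten, end="")
```

A codec meant for real source files would likely also need incremental or stream decoders, and it has to be registered before the interpreter reads the file. Both are left out of this sketch.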
BRM Pattern Syntax
For BRM, Python source code is just a sequence of tokens. It doesn't create any relationships between them, or even verify that the file is syntactically correct. For example, take a look at the following file:
if a == x:
    2 + 2 # lol
For BRM, in an abstract fashion, the file is just the following text:
NAME NAME EQEQUAL NAME COLON NEWLINE INDENT NUMBER PLUS NUMBER COMMENT NEWLINE DEDENT ENDMARKER
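The listing above can be reproduced with the standard library's `tokenize` module, which is a convenient way to check which token names a piece of source produces. This sketch does not use BRM, and the variable names are chosen for the example.

```python
import io
import tokenize

source = "if a == x:\n    2 + 2 # lol\n"

# exact_type distinguishes individual operators (EQEQUAL, COLON, PLUS)
# instead of reporting all of them as the generic OP type.
names = [
    tokenize.tok_name[tok.exact_type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

print(" ".join(names))
```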
And internally it is processed like this:
If you want to match the binary plus operation here (`2 + 2`), you can create a pattern with `number, plus, number`.
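To make the idea of matching on token names concrete, here is a naive sliding-window matcher written with the standard library. It is only a sketch of the concept: the function `find_matches` is hypothetical, it supports none of BRM's wildcard patterns such as `*any`, and it says nothing about how BRM matches internally.

```python
import io
import tokenize


def find_matches(source, pattern):
    """Return every run of tokens whose lowercase type names equal `pattern`."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    names = [tokenize.tok_name[tok.exact_type].lower() for tok in tokens]
    width = len(pattern)
    return [
        tokens[i:i + width]
        for i in range(len(tokens) - width + 1)
        if names[i:i + width] == list(pattern)
    ]


source = "if a == x:\n    2 + 2 # lol\n"
matches = find_matches(source, ("number", "plus", "number"))
matched_text = [" ".join(tok.string for tok in match) for match in matches]
print(matched_text)
```

A pattern of `number, plus, name` finds nothing in this file, because both operands of the addition are `NUMBER` tokens.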
Note: If you want to visualize your patterns and see what they match, give `examples/visualize.py` a shot.
Extras
If you are using the `TokenTransformer`, there are a few handy functions that you might want to check out:
Function | Returns | Description
---|---|---
`quick_tokenize(source: str, *, strip: bool = True)` | `List[TokenInfo]` | Break the given source text into a list of tokens. If `strip` is `True`, then the last 2 tokens (`NEWLINE`, `EOF`) will be omitted.
`quick_untokenize(tokens: List[TokenInfo])` | `str` | Convert the given sequence of tokens back to a representation which would yield the same tokens back when tokenized (a lossy conversion). If you want a full round-trip / lossless conversion, use `tokenize.untokenize`.
`directional_length(tokens: List[TokenInfo])` | `int` | Calculate the linear distance between the first and the last token of the sequence.
`shift_all(tokens: List[TokenInfo], x_offset: int, y_offset: int)` | `List[TokenInfo]` | Shift each token in the given sequence by `x_offset` in the column offsets, and by `y_offset` in the line numbers. Return the new list of tokens.
`until(toktype: int, stream: List[TokenInfo])` | `Iterator[TokenInfo]` | Yield all tokens until a token of `toktype` is seen. If no such token is seen, it will raise a `ValueError`.
`_get_type(token: TokenInfo)` | `int` | Return the type of the given token. Useful with `until()`. (internal)
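To give a feel for what two of these helpers describe, the sketch below re-creates `shift_all` and `until` from their table descriptions using plain `tokenize.TokenInfo` tuples. These are illustrative stand-ins, not BRM's code: the `sketch_` names are hypothetical, and details such as whether the stopping token is yielded are assumptions (here it is not).

```python
import io
import tokenize
from typing import Iterable, Iterator, List


def sketch_shift_all(
    tokens: List[tokenize.TokenInfo], x_offset: int = 0, y_offset: int = 0
) -> List[tokenize.TokenInfo]:
    """Move every token by x_offset columns and y_offset lines."""
    shifted = []
    for tok in tokens:
        (start_row, start_col), (end_row, end_col) = tok.start, tok.end
        shifted.append(
            tok._replace(
                start=(start_row + y_offset, start_col + x_offset),
                end=(end_row + y_offset, end_col + x_offset),
            )
        )
    return shifted


def sketch_until(
    toktype: int, stream: Iterable[tokenize.TokenInfo]
) -> Iterator[tokenize.TokenInfo]:
    """Yield tokens until one of type `toktype` appears (assumed exclusive)."""
    for tok in stream:
        if tok.type == toktype:
            return
        yield tok
    # The stream ran out without ever seeing the requested type.
    raise ValueError(f"no token of type {tokenize.tok_name[toktype]} found")


tokens = list(tokenize.generate_tokens(io.StringIO("a + b\n").readline))
shifted = sketch_shift_all(tokens, x_offset=4, y_offset=1)
before_newline = [tok.string for tok in sketch_until(tokenize.NEWLINE, tokens)]
print(shifted[0].start, before_newline)
```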